Despite 2020 being a mess so far, and after a very calm period in terms of events, it’s time to get back on stage. July 2020 is going to be…

In this article, you will find a list of hardware for this project, where to go to download it, and how to customize it to fit your needs and/or your…

Apache Spark v3.0.0 hits the road, let’s celebrate! Apache Spark v3.0.0 was released on June 18th, 2020, just before Spark + AI Summit 2020, which is being held virtually…

In this episode, you will learn about doing a basic ETL (extract, transform, and load) operation using Apache Spark. You will load a basic CSV file with Apache Spark, make…

Starting today, I will host a weekly live show about data. You may join, attend “live,” and ask questions as I go through a data-oriented topic. For now, the topic…

I just wanted to share with you the latest update on Spark in Action, second edition. What’s new? Chapter 12, “Transforming your data”; Chapter 13, “Transforming entire documents”; Appendix K,…

This article follows my previous post on my trip to NASA in Langley, Va., in early March 2019. This is the second and last part of this article. Before going…