In this episode, you will learn about doing a basic ETL (extract, transform, and load) operation using Apache Spark. You will load a basic CSV file with Apache Spark, make…

Starting today, I will host a weekly live show about data. You may join, attend “live,” and ask questions as I go through a data-oriented topic. For now, the topic…

I just wanted to share with you the latest update on Spark in Action, second edition. What’s new? Chapter 12, “Transforming your data”; Chapter 13, “Transforming entire documents”; Appendix K,…

This article follows my previous post on my trip to NASA in Langley, Va. in early March 2019. This is the second and last part of this article. Before going…

I like to use π day to remember a few things about science and technology that influence who I am. This year, π day is perfect for that. Let me tell…

A new chapter of Spark in Action, 2e (formerly known as Spark with Java), is available. Chapter 11 is titled “Working with SQL”. In chapter 11, you will explore how…

As you may know, I started writing Apache Spark with Java (now renamed Spark in Action, 2nd edition). Usually, as the book develops, authors share a few excerpts of the book…