Starting today, I will host a weekly live show about data. You may join, attend “live,” and ask questions as I go through a data-oriented topic. For now, the topic […]
Spark in Action, Second Edition MEAP Update
I just wanted to share with you the latest update on Spark in Action, second edition. What’s new? Chapter 12, “Transforming your data”; Chapter 13, “Transforming entire documents”; Appendix K, […]
My trip to the Moon… and Mars (part 2)
This article follows my previous post on my trip to NASA in Langley, Va. in early March 2019. This is the second and last part of this article. Before going […]
My trip to the Moon… and Mars (part 1)
I like to use π day to remember a few things about science and technology that influence who I am. This year, π day is perfect for that. Let me tell […]
How I built the perfect data science team
When I assembled my first data science team, the term was barely getting printed in the Harvard Business Review. I had no clue that I was building a team pioneering […]
Eleven key elements of data science outcome
Before thinking about what the outcome of data science is, maybe I should take the two seconds I think it takes to define it. As for how to define data science, […]
Spark in Action’s Chapter Eleven on Working with SQL is in MEAP
A new chapter of Spark in Action, 2e (formerly known as Spark with Java), is available. Chapter 11 is titled “Working with SQL.” In chapter 11, you will explore how […]
(Almost) All you need to know about file ingestion in Apache Spark
As you may know, I started writing Apache Spark with Java (now renamed Spark in Action, 2nd edition). Usually, as the book develops, authors share a few excerpts of the book […]
Eight very hot data trends for 2019
Read about eight very hot predictions for data management in 2019, in usages, shapes, governance, and people.
What is Apache Spark, the podcast
A couple of weeks ago, I chatted with Tobias Macey about data engineering and, more specifically, Apache Spark. Tobias Macey runs the Data Engineering Podcast, which you can directly […]