After a (too long) hiatus, DataFriday is back. The first episode of the new season was released last Friday, January 14, 2022. It defines Enterprise Architects, looking at how they are perceived and what they really bring to the enterprise.

On September 15th, 2021, after more than 18 months, I was finally able to give a talk in person. My conference schedule did not really go down during the pandemic, […]

In this episode, you will learn about doing a basic ETL (extract, transform, and load) operation using Apache Spark. You will load a basic CSV file with Apache Spark, make […]

When I assembled my first data science team, the term was barely getting printed in the Harvard Business Review. I had no clue that I was building a team pioneering […]

Before thinking about what the outcome of data science is, maybe I should take the two seconds I think it takes to define it. As for how to define data science, […]

As you may know, I started writing Apache Spark with Java (now renamed Spark in Action, 2nd edition). Usually, as the book develops, authors share a few excerpts of the book […]