This new chapter, chapter 4, of Spark with Java (https://www.manning.com/books/spark-with-java) is not only about celebrating laziness, it also teaches, through examples and experiments, the fundamental differences in building a data […]

Chapter 3 of Spark with Java is focusing on the dataframe. There is something majestic with Apache Spark’s dataframe, like those mountains of Montana. Apache Spark revolves around the concept of […]

Chapter 9 still covers Spark ingestion (like chapter 7 and chapter 8), but this time, it’s about  “anything can become a Spark datasource.” When I was working for Zaloni, we […]

Chapter 8 of Spark with Java is out and it covers ingestion, as did chapter 7. However, as chapter 7 was focusing on ingestion from files, chapter 8 focus on […]

Apache Spark has been a game changer for distributed data processing, thanks to an easy to understand API, a focus on simplicity, and an adoption of modern infrastructure. However, rumors […]

Summer has been busy and it’s now behind us. I won’t annoy you with all the details of what happened but I wanted to come back on a project I […]

A Little History On August 18, 1227, the well-known Mongolian emperor Genghis Khan passed. Despite numerous criticisms, based on rumors of genocide and brutality, he united Mongolia. One of his […]

Earlier this month, I was in San Francisco, CA, to attend Spark Summit 2017. I gave a talk on the phase before you can apply Machine Learning on data, using […]

A quick flashback on a few articles I published recently. You Are Not a Machine, So Learn Machine Learning published by Database Trends and Applications on February 21st, 2017. What Are Spark […]