As you may know, I started writing Apache Spark with Java (now renamed Spark in Action, 2nd edition). Usually, as the book develops, authors share a few excerpts of the book […]
The majestic dataframe in Apache Spark
Chapter 3 of Spark with Java focuses on the dataframe. There is something majestic about Apache Spark’s dataframe, like those mountains of Montana. Apache Spark revolves around the concept of […]
Recent Publications
A quick look back at a few articles I published recently. You Are Not a Machine, So Learn Machine Learning, published by Database Trends and Applications on February 21st, 2017. What Are Spark […]
Let’s Start Slack’ing
To help foster the Apache Spark community in the (Research) Triangle region (Raleigh, Durham, and Chapel Hill in North Carolina), some friends and I decided to create a Slack team […]
Spark Recipes Updated
This week has seen the release of Apache Spark v2.0.0. As with every major release, you can expect some changes. My Java recipes for Apache Spark have been affected, but […]
The new Apache Spark is out (2.0.0)
Unlike the new iPhone, the release of Apache Spark v2.0.0 did not gather thousands of people in a room, but it is a very important event in the small world of […]
Spark Java Recipes
Here are a few quick recipes to solve some common issues with Apache Spark. All examples are based on Java 8 (although I do not consciously use any of the […]