To help foster the Apache Spark community in the (Research) Triangle region (Raleigh, Durham, and Chapel Hill in North Carolina), some friends and I decided to create a Slack team […]

This week has seen the release of Apache Spark v2.0.0. As with every major release, you can expect some changes. My Java recipes for Apache Spark have been affected, but […]

Unlike the new iPhone, the release of Apache Spark v2.0.0 did not gather 1,000s of people in a room, but it is a very important event in the small world of […]

When you start an application, you need to think about where it’s going to run, and also how it’s going to run. Basically, the way I use Spark is in […]

Here are a few quick recipes to solve some common issues with Apache Spark. All examples are based on Java 8 (although I do not consciously use any of the […]

Successful first deployment of Apache Spark on a production server. Yep… I could add that line to my resume. Right now, we have set 24 cores, 72 GB of memory […]

Of course, nobody will tell you I am right. At least officially. But what was the goal of Hadoop? Perform analytics over a wide range of servers. Of course, […]

Here is your very first Apache Spark program using Java: the equivalent of Kernighan and Ritchie’s “Hello, World”. You can download it from GitHub. Basically, the key is to […]

The Apache Software Foundation (ASF) offers a wide range of tools, libraries, frameworks, and data stores for building enterprise applications. The purpose of this list is to keep track of […]