Earlier this month, I was in San Francisco, CA, to attend Spark Summit 2017. I gave a talk on the phase that comes before you can apply Machine Learning to data, using […]
Recent Publications
A quick look back at a few articles I published recently. You Are Not a Machine, So Learn Machine Learning, published by Database Trends and Applications on February 21st, 2017. What Are Spark […]
What are Spark Checkpoints on Dataframes?
Let’s understand what checkpoints can do for your Spark dataframes and go through a Java example of how we can use them. Checkpoint on Dataframe In v2.1.0, Apache Spark introduced […]
Ways to Run your Apps with Apache Spark
When you start an application, you need to think about where it’s going to run, and also how it’s going to run. Basically, the way I use Spark is in […]
Managing Java UDF with Apache Spark
UDF stands for user-defined function. With UDFs, you can easily extend Apache Spark with your own routines and business logic. Let’s see how we can build them and deploy […]
Spark Java Recipes
Here are a few quick recipes to solve some common issues with Apache Spark. All examples are based on Java 8 (although I do not consciously use any of the […]
Your Very First Apache Spark Application
Here is your very first Apache Spark program using Java: the equivalent of Kernighan and Ritchie’s “Hello, World”. You can download it from GitHub: Basically, the key is to […]
The Apache Ecosystem for Enterprise Applications
The Apache Software Foundation (ASF) offers a wide range of tools, libraries, frameworks, and data stores for building enterprise applications. The purpose of this list is to keep track of […]
Failure to Update Raspbian Using apt-get
My Raspberry Pi crashed during an update. As a result, the disk was corrupted and I needed to fix the trusted GPG keys as […]
Performance of the Raspberry Pi in Programmez! 195
Following the benchmarks I ran on the Raspberry Pi, Programmez! Magazine has published more of my benchmarks, covering both CPU and storage. I more extensively compare the CPU performance of the […]