Hortonworks Data Platform (HDP) v2.6 has been released and you can download the platform from their website. The sandbox is not yet available in v2.6. New Versions of Key Components […]

A quick flashback on a few articles I published recently. You Are Not a Machine, So Learn Machine Learning published by Database Trends and Applications on February 21st, 2017. What Are Spark […]

Following President Trump’s election, some European countries have started reacting through their humorists in a very original way, mixing apprehension, gratitude, and (a little bit of) fear. It all started […]

Let’s understand what checkpoints can do for your Spark dataframes and go through a Java example of how we can use them. Checkpoint on Dataframe In v2.1.0, Apache Spark introduced […]

A quick post to share the next Spark event that we will run in the NC Triangle (RTP – Chapel Hill, Durham, Raleigh). This event will be held on December […]

Right before Halloween, from October 24th to October 27th, I went to WoW. Of course, when I told that to my kids they assumed I was going to play World […]

This is really hot off the teleprinter. Leading hosting company OVH is continuing its development in the United States and is coming to Virginia. OVH announced today its decision […]

Zaloni’s CEO Ben Sharma is speaking about managing data lakes. What has happened is that IT departments start by installing Hadoop and jump into Big Data. Not a lot of companies […]

Mica is a data preparation tool that can be used by anyone… That makes it self-service data preparation software. Data scientists and engineers can use it for discovery, curation, and […]

To help foster the Apache Spark community in the (Research) Triangle region (Raleigh, Durham, and Chapel Hill in North Carolina), with some friends, we decided to create a Slack team […]