Apache Spark has been a game changer for distributed data processing, thanks to an easy to understand API, a focus on simplicity, and an adoption of modern infrastructure. However, rumors […]

A Little History On August 18, 1227, the well-known Mongolian emperor Genghis Khan passed. Despite numerous criticisms, based on rumors of genocide and brutality, he united Mongolia. One of his […]

Hortonworks Data Platform (HDP) v2.6 has been released and you can download the platform from their website. The sandbox is not yet available in v2.6. New Versions of Key Components […]

A quick flashback on a few articles I published recently. You Are Not a Machine, So Learn Machine Learning published by Database Trends and Applications on February 21st, 2017. What Are Spark […]

Right before Halloween, from October 24th to October 27th, I went to WoW. Of course, when I told that to my kids they assumed I was going to play World […]

Mica is data preparation tool, which can be used by anyone… It makes it a self-service data preparation software. Data scientists and engineers can use it for discovery, curation, and […]