File Ingestion in Apache Spark

In a typical Big Data analytics scenario, you will probably be tempted to ingest files. You know, those pesky CSV files where the comma is sometimes a semicolon or a…

Apache Spark with Java

Apache Spark has been a game changer for distributed data processing, thanks to an easy-to-understand API, a focus on simplicity, and the adoption of modern infrastructure. However, rumors…

Apache Spark Maturity on the Rise

Spark Summit Europe 2017 just concluded here in Dublin. More than 102 speakers, 1,200 attendees, and an impressive Databricks team attended the three-day celebration. Spark is reaching a new…

Spark is Making Big Data Easy at NCDevCon

NCDevCon is a yearly event in the Triangle, aimed at developers of all breeds, from front-end to back-end. Its origins lie in the ol’ days of Adobe ColdFusion, and thus…

Loading CSV in Spark

Loading CSV in Apache Spark has been a standard feature since version 2.0; previously, you needed a free plugin (provided by Databricks). Although it starts with a basic value proposition: Comma…

A New Dimension for Apache Spark Clusters

Summer has been busy and it’s now behind us. I won’t annoy you with all the details of what happened, but I wanted to come back to a project I…

A Deep-Dive Introduction to Spark for RDBMS Users

Earlier in the summer, I started a series of articles for IBM developerWorks. Those articles focus on Apache Spark from an RDBMS user's perspective; of course, the database of choice…

Getting Ready for This Pint of Guinness

Next month, I’ll be heading to Dublin, the capital of Ireland. I have been to Ireland quite a few times – I was 3 the first time. However, this time,…

Meet Cactar, the Ancient Mongolian Warlord of Data Quality

A Little History: On August 18, 1227, the well-known Mongolian emperor Genghis Khan passed away. Despite numerous criticisms, based on rumors of genocide and brutality, he united Mongolia. One of his…

Spark Boosts IBM Event Store

IBM just announced Event Store, a hybrid datastore to store events. The originality? Events can be streamed in and it is based on Apache Spark. IBM claims to be able…

Let's be social

jgperrin.substack

/in/jgperrin

/jgperrin
