In this episode, you will learn how to perform a basic ETL (extract, transform, and load) operation using Apache Spark. You will load a basic CSV file with Apache Spark, make […]
DataFriday: load a CSV file with Apache Spark
Starting today, I will host a weekly live show about data. You may join, attend “live,” and ask questions as I go through a data-oriented topic. For now, the topic […]
(Almost) All you need to know about file ingestion in Apache Spark
As you may know, I started writing Apache Spark with Java (now renamed Spark in Action, 2nd edition). Usually, as the book develops, authors share a few excerpts of the book […]
File Ingestion in Apache Spark
In a typical Big Data analytics scenario, you will probably be tempted to ingest files. You know, those pesky CSV files where the comma is sometimes a semicolon or a […]
Spark is Making Big Data Easy at NCDevCon
NCDevCon is a yearly event in the Triangle, aimed at developers of all breeds, from front-end to back-end. Its origins go back to the ol’ days of Adobe ColdFusion, and thus […]
Loading CSV in Spark
Loading CSV in Apache Spark has been a standard feature since version 2.0; previously, you needed a free plugin (provided by Databricks). Although it starts with a basic value proposition: Comma […]