As AI ramifies in our society, as citizens, we need to demand explainability to guarantee trust.
Bringing vision to Apache Spark
Drsti (pronounced drishti) is an effortless data visualization that interfaces easily with Apache Spark.
Do your best work ever for Call for Code
I have been a mentor and judge for Call for Code. However, this year, I have other projects limiting my time to contribute to this world-changing initiative. That’s why I […]
Having a Smart Summer Thanks to IBM
As I am a rising senior in high school with a lot of time on my hands, I am able to look into different areas of engineering to see what […]
Spark in Action, Second Edition MEAP Update
I just wanted to share with you the latest update on Spark in Action, second edition. What’s new? Chapter 12, “Transforming your data”; Chapter 13, “Transforming entire documents”; Appendix K, […]
How I built the perfect data science team
When I assembled my first data science team, the term was barely getting printed in the Harvard Business Review. I had no clue that I was building a team pioneering […]
(Almost) All you need to know about file ingestion in Apache Spark
As you may know, I started writing Apache Spark with Java (now renamed Spark in Action, 2nd edition). Usually, as the book develops, authors share a few excerpts of the book […]
Eight very hot data trends for 2019
Read about eight very hot predictions for data management in 2019, in usages, shapes, governance, and people.
What is Apache Spark, the podcast
A couple of weeks ago, I chatted with Tobias Macey about data engineering and, more specifically, Apache Spark. Tobias Macey runs the Data Engineering Podcast, which you can directly […]
Microsoft SQL Server 2019 gets a Spark
Yesterday, during Ignite 2018, Microsoft announced that they will integrate Apache Spark more tightly with SQL Server 2019. If you missed previous announcements around SQL Server, it now runs on […]