Like every science, data science requires methodology and knowledge

Giving talks and writing about tech is something I am passionate about. Why learn something if you are going to keep it to yourself? This article is meant for all the aspiring data scientists, data engineers, and AI practitioners who want to get started in data science. They are often lost, keep asking the same questions, and have the same needs. This article is not about ousting them from a conversation but about easing their learning experience.

If you are a learner in this field, then you have come to the right place. If you are not, I am sure a learner has come to you with these questions in the past (or will); feel free to share this article.

I have summarized five essential and free resources for learning data science and becoming a data scientist, from hardcore “hard” science to softer human skills. Add anything I may have missed in the comments.

Resource #1: Learn SQL

SQL remains the lingua franca of data queries. Yeah, nobody likes SQL. When I had to learn it in college, I was part of a rebellious group trying to understand what the point was. I remember asking my professor, “Do you expect a secretary to type in SQL and get a report for their boss?” His answer was “yes.” Sure, a little over 25 years later, secretaries have been replaced by data scientists. But SQL is still there. In all its splendor.

A great way to learn it: W3Schools. They have been around since 1999 or so, first teaching web stuff (duh) and still doing it. They have some interactive tools that will save you from installing a shitload of stuff.

With W3Schools, you can run your SQL queries in a sandbox, without having to install a database. Of course, you are limited to the data they provide.
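Once you outgrow their data, you can get a similar zero-install sandbox on your own machine. Here is a minimal sketch, assuming you have Python, whose standard library ships with SQLite. The `orders` table, its rows, and the `top_customers` function are made up for illustration; swap in whatever you want to practice on.

```python
import sqlite3


def top_customers(min_total):
    """Return (customer, total) rows for customers who spent at least min_total."""
    # ":memory:" creates a throwaway database that lives only in RAM.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")

    # Hypothetical sample data, just enough to practice GROUP BY and HAVING.
    conn.executemany(
        "INSERT INTO orders (customer, amount) VALUES (?, ?)",
        [
            ("Ada", 120.0),
            ("Grace", 80.0),
            ("Ada", 45.5),
            ("Linus", 20.0),
            ("Grace", 30.0),
        ],
    )

    # Total spend per customer, keeping only those at or above the threshold.
    rows = conn.execute(
        """
        SELECT customer, SUM(amount) AS total
        FROM orders
        GROUP BY customer
        HAVING SUM(amount) >= ?
        ORDER BY total DESC
        """,
        (min_total,),
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for customer, total in top_customers(100):
        print(customer, total)
```

Nothing is saved to disk, so you can break things and start over as often as you like.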

Resource #2: MOOC yourself!

A MOOC is a massive open online course, offered on sites like edX, Coursera, or Cognitive Class, to name the most famous ones. Understand “open” as in open enrollment: you do not need to get accepted to a university, pass tests, etc., to get in. Do not misinterpret “open” as “free.” Some are free (like Cognitive Class), some are free without certificates, some are not free at all. Some, like Manning’s LiveProject, are more hands-on with optional mentoring. Content is usually pretty great and allows you to get started.

My favorite is Cognitive Class (it used to be called Big Data University when Big Data was a thing by itself). There is a lot of material, it’s all free, and sometimes they even give you computing credits.

Cognitive Class is a fantastic resource for learning cutting-edge technologies from Big Data to Data Science, via Kubernetes and other cool stuff. Icing on the cake: you get badges. All free.

Resource #3: Join a community

After learning stuff, you may end up asking yourself, “so what?” That’s where it’s great to join a community, and a community like AIDAUG, in its infancy, is perfect for getting noticed, getting support, and exchanging ideas. Forums are all over the place and probably dying. That’s why AIDAUG decided to be only on Slack. As you join AIDAUG, you’ll gain access to their Slack and be able to interact with data scientists, engineers, thought leaders, and more great people like you. AIDAUG also sends a curated newsletter every month to remind us what happened, what’s happening, and more. Check it out and join for free.

AIDAUG is a community focusing on artificial intelligence, data, and analytics: a good melting pot for all data scientists.

Resource #4: Get free computing credits

Many of you are still struggling to get decent hardware (and sometimes even power) to get stuff done. Yeah, data science needs hardware. But why buy it when you can have it for free? The good thing about cloud computing is that vendors want you, so they are ready to give you credits to use their computers to build cool stuff (and hopefully for them, your ventures are going to be so cool that you are going to make money and pay them for more power). Check out the different players: Azure, IBM Cloud, AWS, OVHcloud, Alibaba Cloud, and many more… If you find a cool deal, share it in the comments!

IBM lets you get started on IBM Cloud for free, as do most cloud providers.

Resource #5: Learn to speak

A data scientist is a storyteller. When I was leading a team of scientists and engineers, we were building tools to tell a story. Remember who your audience is. Train yourself. The most challenging talk I ever gave was my TEDx talk, because I put my guts into it: this is how a presentation should be, not a bunch of figures on a boring PowerPoint. Watch killer talks and start reading about how you can deliver your content.

A starting point could be this article in Harvard Business Review by Chris Anderson, the curator of TED.

In this brief article, I wanted to share five resources I deem essential for a data scientist:

  • Learn SQL.
  • Learn through a MOOC.
  • Join a community.
  • Get a decent work environment for cheap or free.
  • Learn storytelling.

There is more out there: great resources for finding data, manipulating data, governing data, learning even more. Get a solid foundation, then go deeper on specific topics. Share your resources in the comments; other readers (and I) will appreciate it!