
How to get started with Machine Learning? Read this

To use AI, you no longer need to have a lab coat or a PhD in Mathematics – there is an easier and more approachable way to get started, especially if you have a developer background.

What do you need to know to start using Machine Learning? Usually, the answer to this question is that you need a good understanding of the following topics: Statistics, Probability, Linear Algebra, Multivariate Statistics, Calculus, etc.

This can seem really frustrating, and you might think that you need to go back to school and get a PhD in mathematics or similar before you can even start. This way of learning can be called the Bottom-Up Approach: you are first taught all the little details of the math and the algorithms, but no real implementation methods.

Luckily, there is an easier and more approachable way to get started, especially if you have a developer background. Simply put, all you need to do is focus on the actual results you want, working through real machine learning problems end to end with modern tools and platforms. This can be called the Top-Down Approach. It is not exactly the reverse of the previous one, but it is a much better way to get started.

Jason Brownlee has a really good explanation of these approaches. You can read more in his article Leap From Developer To Machine Learning Practitioner.

Focusing on the results

As a developer myself, I like to get my hands dirty by experimenting with readily available datasets and algorithms, to familiarize myself with the tooling and the selected algorithms.

Two of the most used starting points are the MNIST database, for handwritten digit recognition, and the Iris Data Set, for data pattern recognition. You can read more about these in my blog post What is Machine Learning and why should you care.

There is a huge number of existing algorithms available for you to use, and selecting the right one can be a daunting task. Luckily, there are some broader categories you should familiarize yourself with to ease this selection process.

Supervised Learning

With Supervised Learning, you have input data paired with labeled example outputs. Through training iterations, the machine comes up with a general rule for mapping the inputs to the outputs.

Imagine how good we are at recognizing objects in the real world. When we look at something, we can easily label the things we see, e.g. Apple, Orange, Dog, etc. We weren’t born with this ability. Instead, we were trained as kids by our parents pointing at things in the real world or in picture books and giving us labels for those objects. After the training, our brains quickly start to build the relationships between the labels and the objects.

In Supervised Learning, this is categorized as a Classification problem. There is also the Regression problem, where the goal is to predict a continuous target value, such as a price, instead of a label.
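To make the two problem types a bit more concrete, here is a minimal sketch using only the Python standard library. Everything in it is made up for illustration: the fruit features (a colour score and a roughness score between 0 and 1), the size and price numbers, and the function names classify, fit_line and predict. The classifier is a simple 1-nearest-neighbour lookup and the regression is an ordinary least-squares line, picked because they are short, not because they are the methods you would normally reach for.

```python
import math

# Classification: labeled examples, like the picture book from childhood.
# Hypothetical features: (colour score, surface roughness), both 0..1.
training = [
    ((0.10, 0.20), "apple"),
    ((0.20, 0.10), "apple"),
    ((0.90, 0.80), "orange"),
    ((0.80, 0.90), "orange"),
]

def classify(sample):
    """1-nearest-neighbour: return the label of the closest training example."""
    nearest = min(training, key=lambda pair: math.dist(pair[0], sample))
    return nearest[1]

# Regression: predict a continuous value instead of a label.
# Hypothetical data: x could be a size, y a price.
sizes = [1.0, 2.0, 3.0, 4.0]
prices = [3.0, 5.0, 7.0, 9.0]

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x

slope, intercept = fit_line(sizes, prices)

def predict(x):
    """Use the fitted line to estimate y for an unseen x."""
    return slope * x + intercept

print(classify((0.15, 0.25)))  # label for a new, unlabeled fruit
print(predict(5.0))            # estimated price for an unseen size
```

In both cases the "training" is just looking at labeled examples. The difference is only in the output: a label for classification, a number for regression.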

Unsupervised Learning

Unlike in Supervised Learning, here we do not necessarily know what the outcome should be. Instead, we can use Unsupervised Learning to discover previously unknown structure and patterns in the given data.

Clustering is one way to get a better understanding of the input data: we let the machine find common relations in the data set and group, or cluster, those data points together. We can also utilize Anomaly Detection to find data points that are out of place.

Imagine you have thousands of articles and you want to group them into smaller sets that are somehow similar or related by different attributes, such as word frequency, article length, page count, etc.

Or, if you have a bunch of server logs, you can use Unsupervised Learning to uncover misbehaviour or even malicious traffic automatically.
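Both examples can be sketched in a few lines of standard-library Python. The numbers are invented: word_counts stands in for article lengths and requests_per_minute for something you might extract from server logs. The clustering part is a stripped-down k-means with exactly two clusters, and the anomaly detection is a plain z-score check with an assumed threshold of 2 standard deviations. Real tooling offers far more robust versions of both, so treat this as an illustration of the idea only.

```python
import statistics

def two_means(values, iterations=10):
    """Tiny k-means with k=2: split numbers into a 'low' and a 'high' group."""
    centres = [min(values), max(values)]  # deterministic starting points
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            # Assign each value to the nearest centre.
            idx = 0 if abs(v - centres[0]) <= abs(v - centres[1]) else 1
            groups[idx].append(v)
        # Move each centre to the mean of its group (keep it if the group is empty).
        centres = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
    return groups

def find_anomalies(values, threshold=2.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mean) / spread > threshold]

# Hypothetical article lengths in words: no labels, just raw data.
word_counts = [300, 350, 320, 2900, 3100, 3000]
short_articles, long_articles = two_means(word_counts)

# Hypothetical requests per minute taken from a server log.
requests_per_minute = [100, 102, 98, 101, 99, 100, 500]
suspicious = find_anomalies(requests_per_minute)

print(short_articles, long_articles)
print(suspicious)
```

Note that nobody told the code which articles are "short" or which traffic is "bad". The groups and the outlier come from the data alone.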

Reinforcement Learning

In Reinforcement Learning, the machine tries to map situations to the actions that yield the biggest reward.

When you are teaching a dog a new trick, you either reward it or not, based on the action it performed on your command. The dog has to figure out what it did to deserve the reward, and what it did wrong when it did not get one.
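The dog example maps quite directly to code. Below is a minimal sketch, assuming a made-up setup: two commands, three possible actions, and a reward of 1 only when the action matches the command. The learner is a simple epsilon-greedy agent that keeps an average reward per (command, action) pair. The names train and reward, the 500 trials and the exploration rate of 0.2 are all my own choices for illustration.

```python
import random

commands = ["sit", "roll"]          # the situations
actions = ["bark", "roll", "sit"]   # what the dog can do

def reward(command, action):
    """A treat (1.0) only when the action matches the command."""
    return 1.0 if action == command else 0.0

def train(trials=500, epsilon=0.2, seed=0):
    rng = random.Random(seed)  # seeded so runs are repeatable
    value = {(c, a): 0.0 for c in commands for a in actions}
    count = {key: 0 for key in value}
    for _ in range(trials):
        command = rng.choice(commands)
        if rng.random() < epsilon:
            # Explore: try something random now and then.
            action = rng.choice(actions)
        else:
            # Exploit: pick the action with the best average reward so far.
            action = max(actions, key=lambda a: value[(command, a)])
        key = (command, action)
        count[key] += 1
        # Update the running average reward for this (command, action) pair.
        value[key] += (reward(command, action) - value[key]) / count[key]
    # The learned policy: the best known action for each command.
    return {c: max(actions, key=lambda a: value[(c, a)]) for c in commands}

policy = train()
print(policy)
```

The agent is never told which action is correct. It only sees the reward, and the occasional random exploration is what lets it stumble on the right trick in the first place.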

Stay tuned for more hands-on tutorials! Good luck, and naturally, feel free to comment below if there’s anything you want to ask or discuss.

Only a few months ago, Google opened beta versions of its Machine Learning developer APIs, which enable the use of AI products, and progress has been accelerating rapidly since. By now, AI is already used in practice in many ways. To name a few concrete examples, Qvik has already built proofs of concept for Machine Learning, Chatbots and Content Analysis.