Broadly speaking, machine learning models fall into two camps: supervised and unsupervised. Supervised models are probably the type you’re most familiar with, and they reflect a style of learning that’s common in the real world.
What is supervised learning?
In supervised learning, a model is presented with examples from a training data set. Such sets consist of a sequence of ordered pairs, (x, y), where x is the input and y is the output (sometimes called a label). The goal of the machine learning model is to learn to replicate the function mapping each x to its y. In other words, the model learns the underlying relationship in the data. Such algorithms are called supervised because the labels act as a sort of teacher, telling the model what the correct output is for each input.
How supervised learning works
During training, the model compares its own output with the label provided in the training set. If the two disagree, the model modifies itself so as to reduce that error on the next prediction pass. Through repeated iterations of this process, the model learns the function underlying the training data.
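The compare-and-correct loop described here can be sketched in a few lines of Python. This is a minimal illustration rather than a production training routine: it assumes a one-variable linear model trained by stochastic gradient descent on squared error, and the data set, `learning_rate`, and `epochs` values are made up for the example.

```python
# Toy labeled data: pairs (x, y) generated by y = 2x + 1 (made up for illustration).
training_set = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]


def train(pairs, learning_rate=0.05, epochs=2000):
    """Fit y ~ w * x + b by repeatedly correcting prediction errors."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in pairs:
            prediction = w * x + b
            error = prediction - y  # compare the model's output with the label
            # Nudge each parameter in the direction that shrinks the squared error.
            w -= learning_rate * error * x
            b -= learning_rate * error
    return w, b


w, b = train(training_set)
print(f"learned w={w:.3f}, b={b:.3f}")
```

Because the toy labels follow y = 2x + 1 exactly, the learned parameters should land very close to w = 2 and b = 1. Real data is noisier, so the fit would only be approximate.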
What is unsupervised learning?
In contrast with supervised learning, unsupervised learning models learn from unlabeled data. In other words, the training data consists solely of points x, with no associated y’s. The idea is that by looking at different mathematical relationships between the x’s, the model can learn useful properties of the data that can be applied to future, unseen points.
How unsupervised learning works
Because the data defines no implicit function for the model to compute, the developer needs to build some structure into the training algorithm. In other words, the model creator must somehow specify which useful properties are to be extracted from the data, either directly in the problem setup or indirectly through the mathematics of the algorithm.
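To make this concrete, here is a rough sketch of k-means, written from scratch in Python. The structure supplied by the developer shows up as explicit choices: the number of clusters `k`, Euclidean distance as the notion of similarity, and a simple farthest-point rule for picking starting centroids. All three are assumptions of this example, and the sample points are made up.

```python
def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def k_means(points, k, iterations=20):
    """Cluster points into k groups by alternating assignment and averaging."""
    # Deterministic seeding (an assumption of this sketch): start from the first
    # point, then repeatedly add the point farthest from the chosen centroids.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [squared_distance(p, c) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[i] = tuple(
                    sum(coords) / len(members) for coords in zip(*members)
                )
    return centroids, clusters


# Made-up, unlabeled 2-D points forming two loose groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
centroids, clusters = k_means(points, k=2)
print(centroids)
```

No labels are involved anywhere. The algorithm finds two groups only because the developer asked for two and defined closeness as Euclidean distance.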
Types of models in supervised and unsupervised learning
The vast majority of popular machine learning algorithms perform supervised learning. A short, non-exhaustive list includes random forests, logistic regression, k-nearest neighbors, convolutional neural networks, LSTMs, naive Bayes, and many more. In contrast, unsupervised learning includes methods such as k-means, PCA, mixture models and the EM algorithm, SVD, and variational autoencoders.
Types of projects
To see the kinds of problems the two methods might tackle, consider a typical supervised learning task: you have a sample of used cars, each described by its features along with the price at which it was sold. The goal would be to predict the price of any car from the details of its features. A typical unsupervised learning problem might instead focus on clustering a group of fauna into different species supertypes. As you can see, one involves learning a mapping, whereas the other centers on uncovering hidden structure.
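The used-car task can be sketched with k-nearest neighbors, one of the supervised algorithms named earlier. Treat this as one possible approach under simple assumptions: each car is reduced to two hypothetical features (age in years, mileage in units of 10,000 km), and every listing and price below is invented for illustration.

```python
def predict_price(car, sold_cars, k=3):
    """Average the sale prices of the k most similar previously sold cars."""

    def distance(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5

    nearest = sorted(sold_cars, key=lambda pair: distance(car, pair[0]))[:k]
    return sum(price for _, price in nearest) / k


# Invented training pairs: ((age_years, mileage_10k_km), sale_price).
sold_cars = [
    ((1, 2), 24000),
    ((2, 3), 21000),
    ((3, 5), 18000),
    ((6, 9), 11000),
    ((8, 12), 8000),
    ((10, 15), 5000),
]

estimate = predict_price((2, 4), sold_cars)
print(estimate)  # 21000.0
```

The three closest listings to a two-year-old car with 40,000 km are the first three rows, so the estimate is the mean of their prices. In practice the features would need scaling so that no single one dominates the distance.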
Know which model type to use
The main determinants in deciding whether to use supervised or unsupervised learning are usually whether you have access to labeled data and how the problem you’re trying to solve is set up. However, it’s not uncommon to combine the two modes into a paradigm called semi-supervised learning. The idea here is that most of the data you have access to will be unlabeled, but you might be able to obtain a small amount of labeled data, for example, by paying humans to label it. This small amount of labeled data can provide weak supervision and allow you to improve the accuracy of the model on the unlabeled points.
Examples of semi-supervised learning include transductive support vector machines, HMRF K-Means, graph kernels, and more.
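One of the simplest ways to see weak supervision in action is self-training (also called pseudo-labeling). Note that this is a different technique from the methods listed above, chosen here only because it fits in a few lines. The sketch assumes one-dimensional data, a nearest-centroid classifier, and a hypothetical `margin` threshold for deciding when a prediction is confident enough to be treated as a label.

```python
def centroids_of(labeled):
    """Mean of the points in each class."""
    groups = {}
    for x, label in labeled:
        groups.setdefault(label, []).append(x)
    return {label: sum(xs) / len(xs) for label, xs in groups.items()}


def self_train(labeled, unlabeled, rounds=5, margin=2.0):
    """Grow the labeled set by pseudo-labeling points the model is confident about."""
    labeled = list(labeled)
    remaining = list(unlabeled)
    for _ in range(rounds):
        centers = centroids_of(labeled)
        confident, uncertain = [], []
        for x in remaining:
            ranked = sorted(centers, key=lambda label: abs(x - centers[label]))
            best, runner_up = ranked[0], ranked[1]
            # Confidence here is the gap between the two nearest class centers.
            gap = abs(x - centers[runner_up]) - abs(x - centers[best])
            if gap >= margin:
                confident.append((x, best))
            else:
                uncertain.append(x)
        if not confident:
            break  # nothing new to learn from
        labeled.extend(confident)
        remaining = uncertain
    return centroids_of(labeled), remaining


# Two hand-labeled points plus a larger pool of unlabeled ones (all made up).
labeled = [(1.0, "low"), (9.0, "high")]
unlabeled = [1.5, 2.0, 3.0, 4.5, 5.5, 7.0, 8.0, 8.5]
centers, still_unlabeled = self_train(labeled, unlabeled)
print(centers, still_unlabeled)
```

With this toy data, the six clear-cut points are absorbed in the first round and refine the class centers, while the two ambiguous points near the middle stay unlabeled because they never clear the margin.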
What is transfer learning?
Another form of supervised learning that’s been quickly gaining popularity within the past few years, especially within computer vision and NLP, is called transfer learning. Transfer learning in some sense solves the inverse problem of the one that semi-supervised learning works to solve.
In transfer learning, the problem is often that a practitioner has so much labeled data that training a full model would be too expensive or require more compute power than they have access to. This is the case with datasets such as ImageNet, which contains 14 million images, or the Common Crawl corpus, which at the most recent count stood at 220 terabytes, representing over 2.6 billion pages of web text.
Training complex models on such large datasets can take weeks even on the best hardware and cost thousands of dollars. As a result, training at this scale is practical only for institutions with huge amounts of resources.
Transfer learning use case
Another use case for transfer learning arises when the practitioner has only a small amount of labeled data but wants to apply the power of supervised learning models trained on huge datasets to the novel data they’ve acquired as part of their research or business. With transfer learning, a practitioner can take a model trained by someone with more resources (usually a behemoth tech company or academic powerhouse) and fine-tune the model’s weights on their own dataset. This allows them to leverage the inferential power of large models but apply it to their own data.
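The fine-tuning idea can be sketched in plain Python. This example shows one common variant, in which the pretrained layer is frozen and only a new output layer (a logistic-regression "head") is trained on the small dataset. Everything here is a stand-in: `PRETRAINED_WEIGHTS` is hard-coded to play the role of weights learned elsewhere, and the four training examples are invented.

```python
import math

# Stand-in for weights learned elsewhere on a large dataset (hard-coded here
# purely for illustration; a real project would load them from a saved model).
PRETRAINED_WEIGHTS = [[1.0, -1.0], [0.5, 0.5]]


def pretrained_features(x):
    """Frozen layer: a linear map followed by ReLU. Never updated below."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in PRETRAINED_WEIGHTS]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(x, head, bias):
    features = pretrained_features(x)
    return sigmoid(sum(h * f for h, f in zip(head, features)) + bias)


def fine_tune_head(dataset, learning_rate=0.5, epochs=500):
    """Train only a new logistic-regression head on top of the frozen features."""
    head, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in dataset:
            features = pretrained_features(x)
            error = predict(x, head, bias) - y  # log-loss gradient w.r.t. the logit
            head = [h - learning_rate * error * f for h, f in zip(head, features)]
            bias -= learning_rate * error
    return head, bias


# A small, invented labeled dataset: label 1 when the first input is larger.
small_dataset = [([2.0, 0.0], 1), ([3.0, 1.0], 1), ([0.0, 2.0], 0), ([1.0, 3.0], 0)]
head, bias = fine_tune_head(small_dataset)
print(predict([4.0, 0.0], head, bias), predict([0.0, 4.0], head, bias))
```

Only `head` and `bias` change during training, so the cost scales with the small dataset rather than with whatever was used to produce the frozen weights. Full fine-tuning would also update the pretrained layer, typically with a smaller learning rate.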
Supervised learning is a broad field with many different model choices and applications. It can be used basically anywhere that labeled data is available in some quantity. Diverse fields from medicine, to genomics, to marketing, to finance, to operations research, to agriculture, and so many more have all benefited from supervised learning in some way, and the growth of the use of these models doesn’t look to be slowing any time soon.