What Is Supervised Learning?

Supervised learning is a machine learning approach that uses labeled data sets to train algorithms to classify data or predict outcomes.

Written by Anthony Corbo
UPDATED BY
Brennan Whitfield | Jun 17, 2025
Summary: Supervised learning is a type of machine learning that trains models using labeled data sets, where inputs are paired with known outputs. This approach enables algorithms to classify data or predict outcomes by learning the relationship between inputs and outputs.

Supervised learning is a type of machine learning that uses labeled data sets, in which each input is paired with a known output, to train artificial intelligence (AI) models. Labeled data lets a machine learning model learn the mapping from inputs to outputs and improve its prediction accuracy as training progresses.

 


Types of Supervised Learning Algorithms

Supervised learning typically involves two main task types: regression and classification. These tasks are carried out using algorithms such as naive Bayes, decision trees, random forests and neural networks.

Regression Algorithms

Regression algorithms predict continuous numeric output values, such as prices or temperatures, based on the relationships between input variables.
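As a rough illustration of regression, the sketch below fits a straight line to a handful of labeled points with ordinary least squares and then predicts a continuous value for an unseen input. The data, the variable names and the helper `fit_line` are made up for this example and are not taken from any particular library.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for one input feature: y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical labeled data: hour of day (input) -> temperature in degrees Celsius (label).
hours = [6, 9, 12, 15]
temps = [10.0, 14.5, 19.0, 23.5]

slope, intercept = fit_line(hours, temps)
predicted = slope * 10 + intercept  # continuous prediction for an unseen input (10:00)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, prediction={predicted:.2f}")
```

The output is a number on a continuous scale rather than a category, which is what distinguishes regression from classification.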

Classification Algorithms

Classification algorithms assign inputs to predefined categories or classes, such as spam vs. not spam.
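A minimal sketch of classification, assuming each email has already been reduced to two hypothetical numeric features. It uses a nearest-centroid rule, chosen here only because it is short; it is one of many possible classifiers and is not otherwise discussed in this article.

```python
from collections import defaultdict
from math import dist

def fit_centroids(samples, labels):
    """Average the feature vectors of each class into one centroid per class."""
    grouped = defaultdict(list)
    for features, label in zip(samples, labels):
        grouped[label].append(features)
    return {
        label: tuple(sum(col) / len(rows) for col in zip(*rows))
        for label, rows in grouped.items()
    }

def classify(centroids, features):
    """Assign the class whose centroid is closest to the input."""
    return min(centroids, key=lambda label: dist(centroids[label], features))

# Hypothetical features per email: (number of links, number of exclamation marks).
emails = [(8, 5), (6, 7), (7, 6), (0, 1), (1, 0), (0, 0)]
labels = ["spam", "spam", "spam", "not spam", "not spam", "not spam"]

centroids = fit_centroids(emails, labels)
print(classify(centroids, (5, 4)))
print(classify(centroids, (1, 1)))
```

Whatever the input, the result is always one of the labels seen during training.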

Naive Bayes

Naive Bayes is a classification algorithm based on Bayes’ theorem. It assumes that the input features are independent of one another given the class, and it is effective for high-dimensional data sets like text.
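The idea can be sketched for text as follows. This is a simplified word-count (multinomial-style) variant with add-one smoothing, written from scratch for illustration; the function names and the tiny training set are assumptions made for this example.

```python
from collections import Counter, defaultdict
from math import log

def train_naive_bayes(documents, labels):
    """Count how often each class occurs and how often each word occurs per class."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in zip(documents, labels):
        words = text.lower().split()
        word_counts[label].update(words)
        vocabulary.update(words)
    return class_counts, word_counts, vocabulary

def predict_naive_bayes(model, text):
    """Pick the class with the highest log posterior, treating words as independent."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = log(n_docs / total_docs)  # log prior of the class
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += log(likelihood)
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical training messages and labels.
docs = ["win free prize now", "free money offer",
        "meeting agenda for monday", "project status report"]
doc_labels = ["spam", "spam", "not spam", "not spam"]

model = train_naive_bayes(docs, doc_labels)
print(predict_naive_bayes(model, "free prize offer"))
print(predict_naive_bayes(model, "monday project meeting"))
```

Summing log probabilities word by word is where the independence assumption enters: each word contributes to the score without regard to the others.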

Decision Trees

Decision trees are flowchart-like structures where each internal node represents a decision on a feature, and each leaf node represents an output label.
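To make the structure concrete, here is a small sketch of a tree and the loop that walks it. The tree is written by hand purely for illustration; in practice the splits would be learned from labeled data, and the feature names and thresholds below are hypothetical.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Leaf:
    label: str  # output label returned when this leaf is reached

@dataclass
class Node:
    feature: str      # feature tested at this internal node
    threshold: float  # go left if value <= threshold, otherwise go right
    left: Union["Node", Leaf]
    right: Union["Node", Leaf]

def predict(tree, sample):
    """Follow the decisions from the root until a leaf is reached."""
    while isinstance(tree, Node):
        tree = tree.left if sample[tree.feature] <= tree.threshold else tree.right
    return tree.label

# Hand-built example tree (not learned from data).
tree = Node(
    "links", 3,
    left=Node("exclamations", 4, left=Leaf("not spam"), right=Leaf("spam")),
    right=Leaf("spam"),
)

print(predict(tree, {"links": 1, "exclamations": 2}))
print(predict(tree, {"links": 5, "exclamations": 0}))
```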

Random Forests

Random forests are ensembles of decision trees that aggregate the predictions of multiple trees to improve accuracy and reduce overfitting.
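The aggregation step can be sketched as below. To keep the example short, each "tree" is a one-split stump trained on a bootstrap sample and a randomly chosen feature, and the forest predicts by majority vote. Real random forests grow deeper trees, and every name and data point here is an assumption made for illustration.

```python
import random
from collections import Counter

def fit_stump(samples, labels, feature):
    """Fit a one-split tree on one feature by trying each midpoint threshold."""
    majority = Counter(labels).most_common(1)[0][0]
    values = sorted({s[feature] for s in samples})
    best = (len(labels) + 1, None)  # (training errors, stump)
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [y for s, y in zip(samples, labels) if s[feature] <= threshold]
        right = [y for s, y in zip(samples, labels) if s[feature] > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0]
        errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
        if errors < best[0]:
            best = (errors, (feature, threshold, left_label, right_label))
    # With a single distinct value there is no split; fall back to the majority label.
    return best[1] or (feature, float("inf"), majority, majority)

def predict_stump(stump, sample):
    feature, threshold, left_label, right_label = stump
    return left_label if sample[feature] <= threshold else right_label

def fit_forest(samples, labels, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample and a randomly chosen feature."""
    rng = random.Random(seed)
    n_features = len(samples[0])
    forest = []
    for _ in range(n_trees):
        picks = [rng.randrange(len(samples)) for _ in samples]  # sample with replacement
        boot_x = [samples[i] for i in picks]
        boot_y = [labels[i] for i in picks]
        forest.append(fit_stump(boot_x, boot_y, rng.randrange(n_features)))
    return forest

def predict_forest(forest, sample):
    """Aggregate the individual predictions by majority vote."""
    votes = Counter(predict_stump(stump, sample) for stump in forest)
    return votes.most_common(1)[0][0]

# Hypothetical two-feature data with two well-separated classes.
points = [(1.0, 1.0), (2.0, 2.0), (3.0, 1.5), (8.0, 9.0), (9.0, 8.0), (10.0, 10.0)]
classes = ["A", "A", "A", "B", "B", "B"]

forest = fit_forest(points, classes)
print(predict_forest(forest, (2.0, 2.0)))
print(predict_forest(forest, (9.0, 9.0)))
```

Because each stump sees a different resample of the data, an occasional poor stump is outvoted by the rest, which is the intuition behind the reduced overfitting.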

Neural Networks

Neural networks consist of layers of interconnected nodes that learn to recognize complex patterns in data. They are used in both classification and regression tasks.
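As a sketch of what "layers of interconnected nodes" means in code, the example below runs a forward pass through a tiny network with one hidden layer. The weights are hand-picked so that the network computes XOR, a pattern no single straight-line boundary can capture; in a real model these weights would be learned from labeled data rather than written by hand.

```python
def relu(x):
    """Rectified linear activation: pass positive values, clip negatives to zero."""
    return max(0.0, x)

def dense(inputs, weights, biases, activation):
    """One fully connected layer: weighted sum plus bias per node, then activation."""
    return [
        activation(sum(w * x for w, x in zip(node_weights, inputs)) + bias)
        for node_weights, bias in zip(weights, biases)
    ]

# Hand-picked (not learned) parameters for a 2-input, 2-hidden-node, 1-output network.
HIDDEN_W = [[1.0, 1.0], [1.0, 1.0]]
HIDDEN_B = [0.0, -1.0]
OUTPUT_W = [[1.0, -2.0]]
OUTPUT_B = [0.0]

def xor_network(x1, x2):
    hidden = dense([x1, x2], HIDDEN_W, HIDDEN_B, relu)
    (output,) = dense(hidden, OUTPUT_W, OUTPUT_B, lambda v: v)  # linear output node
    return output

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_network(a, b))
```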


Supervised vs. Unsupervised Learning

The biggest difference between supervised and unsupervised learning is the use of labeled data sets.

Supervised Learning

Supervised learning involves training a model on labeled data so that it can predict outputs for new inputs. Given labeled data sets, the model iteratively adjusts its internal parameters to minimize the difference between its predictions and the true outputs.

Measuring the difference between each prediction and the correct answer, and reducing it during training, is what allows the model to make accurate predictions on new data. Like most varieties of machine learning, supervised learning is typically used to predict outcomes from data. Supervised learning models are typically implemented in programming languages like Python or R, which provide robust machine learning libraries such as scikit-learn and TensorFlow for Python and caret for R.
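The iterative adjustment of parameters can be sketched with gradient descent on a one-feature linear model. This is only one common way to minimize prediction error, shown in plain Python rather than with any of the libraries named above; the learning rate, epoch count and data are arbitrary choices made for illustration.

```python
def mean_squared_error(w, b, xs, ys):
    """Average squared difference between predictions and true outputs."""
    return sum(((w * x + b) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def train_linear_model(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2 / n * sum(errors)
        # Step each parameter against its gradient to reduce the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Hypothetical labeled data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

loss_before = mean_squared_error(0.0, 0.0, xs, ys)
w, b = train_linear_model(xs, ys)
loss_after = mean_squared_error(w, b, xs, ys)
print(f"w={w:.3f}, b={b:.3f}, loss {loss_before:.3f} -> {loss_after:.6f}")
```

Each pass computes how far the predictions are from the labels and nudges the parameters in the direction that shrinks that gap.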

Unsupervised Learning

Unsupervised learning does not use labeled data sets, so the models work on their own to uncover the inherent structure of the unlabeled data. Human review is still needed to validate the results, depending on how the data will be used and whether the discovered structure makes sense for that purpose.

Unsupervised learning is typically used to uncover insights from massive volumes of data, detect anomalies or make recommendations. Without human validation, its results may be inaccurate.
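For contrast with the supervised case, the sketch below groups unlabeled numbers with a simplified one-dimensional k-means. The algorithm choice, the way the starting centroids are picked and the data are all assumptions made for illustration. No labels are supplied, so the clusters it returns still have to be interpreted by a person.

```python
def k_means_1d(values, k=2, iterations=10):
    """Group unlabeled numbers into k clusters by alternating assignment and averaging."""
    ordered = sorted(values)
    # Assumption: start with centroids spread evenly across the sorted data.
    centroids = [ordered[round(i * (len(ordered) - 1) / max(k - 1, 1))] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids, clusters

# Unlabeled measurements: no target output is given for any value.
measurements = [1.0, 1.2, 0.8, 9.0, 9.5, 10.0]
centers, groups = k_means_1d(measurements)
print(centers)
print(groups)
```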

 

Supervised Learning Example: Home Price Prediction

Supervised learning can be used to make accurate predictions using data, such as predicting a new home’s price.

Before any prediction can be made, input data must be gathered. To determine a new home’s price, for example, we need to know factors like location, square footage, outdoor space, number of floors, number of rooms and more. In this case, the home’s price is the labeled output, while features like location, square footage and number of rooms are inputs used to predict that price.

Once the features and labels are defined, data from thousands of other homes with known prices can be gathered to train the model. By learning how the features of these homes relate to their prices, the model can apply the same relationship to the home in question to estimate its price.
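A minimal sketch of this example, assuming just two numeric features (square footage and number of rooms) and a linear model fitted by least squares through the normal equations. The five homes and their prices are invented for illustration, and a real model would use far more data and features such as location.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_price_model(homes, prices):
    """Least-squares weights for price = w0 + w1 * sqft + w2 * rooms."""
    rows = [[1.0, *features] for features in homes]  # leading 1.0 is the intercept term
    n = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    xty = [sum(r[i] * p for r, p in zip(rows, prices)) for i in range(n)]
    return solve_linear_system(xtx, xty)

def predict_price(weights, features):
    """Apply the learned weights to the features of a new home."""
    return weights[0] + sum(w * f for w, f in zip(weights[1:], features))

# Hypothetical training data: (square footage, rooms) -> known price (the label).
homes = [(1000, 2), (1500, 3), (2000, 3), (2500, 4), (1200, 3)]
prices = [220_000, 305_000, 380_000, 465_000, 260_000]

weights = fit_price_model(homes, prices)
estimate = predict_price(weights, (1800, 3))  # a home the model has not seen
print(f"estimated price: {estimate:,.0f}")
```

The weights summarize how much each feature contributed to the known prices, and the same weights are then applied to the new home.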

Frequently Asked Questions

Supervised learning is a type of machine learning that uses labeled data to train models to classify data or predict outcomes.

Supervised learning includes two main task types:

  1. Regression (predicts continuous values)
  2. Classification (assigns inputs to predefined categories)

Common supervised learning algorithms include:

  • Decision tree
  • Naive Bayes
  • Random forest
  • Neural network
  • Regression model