ML Objectives - Answer Key
K-Nearest Neighbors (KNN) is classified as what type of machine learning algorithm?
a) Instance-based learning
b) Parametric learning
c) Non-parametric learning
d) Model-based learning
Answer: a) Instance-based learning
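A minimal sketch of why KNN counts as instance-based: nothing is fitted ahead of time, and the stored training points are consulted directly at prediction time. The function name knn_predict and the toy data are assumptions made for illustration only.

```python
from collections import Counter
import math

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Distance from the query to every stored training instance.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest instances and count their labels.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two well-separated groups in 2-D.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(points, labels, (1.1, 1.0)))
print(knn_predict(points, labels, (5.1, 5.0)))
```

The whole "model" here is the training set itself, which is also why KNN is commonly described as non-parametric.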
Which of the following is not a supervised machine learning algorithm?
a) K-means
b) Naïve Bayes
c) SVM for classification problems
d) Decision tree
Answer: a) K-means
Which algorithm is least suited for a binary classification problem?
a) K-nearest Neighbors
b) Decision Trees
c) Random Forest
d) Linear Regression
Answer: d) Linear Regression
What is the key difference between supervised and unsupervised learning?
a) Supervised learning requires labeled data, while unsupervised learning does not.
b) Supervised learning predicts labels, while unsupervised learning discovers patterns.
c) Supervised learning is used for classification, while unsupervised learning is used for
regression.
d) Supervised learning is always more accurate than unsupervised learning.
Answer: a) Supervised learning requires labeled data, while unsupervised learning
does not.
In supervised learning, the training dataset consists of:
a) Input features only
b) Output labels only
c) Input features and output labels
d) None of the above
Answer: c) Input features and output labels
Which supervised learning algorithm is known for its ability to handle both classification
and regression tasks?
a) Support Vector Machines (SVM)
b) Random Forest
c) K-nearest neighbors (KNN)
d) Linear Regression
Answer: b) Random Forest
The main objective of a classification algorithm in supervised learning is to:
a) Predict continuous values
b) Determine the optimal number of clusters
c) Assign input data to predefined categories or classes
d) Identify patterns in unlabeled data
Answer: c) Assign input data to predefined categories or classes
Which algorithm is used to minimize the errors between predicted and actual outputs in
supervised learning?
a) Decision tree
b) Gradient Boosting
c) K-means clustering
d) Principal Component Analysis (PCA)
Answer: b) Gradient Boosting
Which algorithm is prone to overfitting in supervised learning?
a) Logistic Regression
b) Support Vector Machines (SVM)
c) K-means clustering
d) Linear Regression
Answer: b) Support Vector Machines (SVM)
Which supervised learning algorithm is an ensemble method that combines multiple weak
learners to make predictions?
a) K-means clustering
b) Random Forest
c) Naive Bayes
d) K-nearest neighbors (KNN)
Answer: b) Random Forest
A model performs well on the training data but poorly on new, unseen data, indicating:
a) Under-fitting
b) Over-fitting
c) Validated
d) Optimally balanced
Answer: b) Over-fitting
Which of the following is considered in designing a machine learning system?
a) Choosing the training experience
b) Choosing a function approximation algorithm
c) Choosing the target function
d) All of the above
Answer: d) All of the above
Identify the type of unsupervised learning algorithm.
a) Naïve Bayes Classifier
b) Linear Regression
c) Decision Tree algorithm
d) K-Means Clustering algorithm
Answer: d) K-Means Clustering algorithm
What is the bias-variance tradeoff in machine learning?
a) A concept related to feature selection.
b) Finding the right balance between bias and variance in a model.
c) The tradeoff between model complexity and computational cost.
d) The tradeoff between precision and recall.
Answer: b) Finding the right balance between bias and variance in a model.
In SVM, what is the kernel trick used for?
a) To increase the bias of the model
b) To transform data into a higher-dimensional space
c) To reduce the number of support vectors
d) To replace decision trees with kernels
Answer: b) To transform data into a higher-dimensional space
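One way to picture the kernel trick: a kernel function returns the inner product the data would have in a higher-dimensional space, without ever building that space. The sketch below assumes 2-D inputs and a degree-2 polynomial kernel; the helper names are illustrative, not taken from any library.

```python
import math

def poly2_kernel(x, y):
    """Degree-2 polynomial kernel (x . y) ** 2, computed in the original 2-D space."""
    return (x[0] * y[0] + x[1] * y[1]) ** 2

def explicit_feature_map(x):
    """Explicit 3-D feature map whose inner product equals poly2_kernel."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)

def dot(u, v):
    """Plain inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

x, y = (1.0, 2.0), (3.0, 0.5)
kernel_value = poly2_kernel(x, y)
mapped_value = dot(explicit_feature_map(x), explicit_feature_map(y))

# Both routes give the same number, up to floating-point rounding.
print(kernel_value, mapped_value)
```

The kernel route never leaves two dimensions, which is the practical benefit when the implied feature space is very large.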
Which algorithm is used to determine whether an employee will get a promotion based on
their performance?
a) K-Means Clustering
b) Logistic Regression
c) DBSCAN algorithm
d) KNN algorithm
Answer: b) Logistic Regression
Support Vector Machine is:
a) Logical model
b) Probabilistic model
c) Geometric model
d) None of the above
Answer: c) Geometric model
In a simple linear regression model (one independent variable), if we change the input
variable by 1 unit, how much will the output variable change?
a) By 1
b) No change
c) By intercept
d) By its slope
Answer: d) By its slope
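The slope answer can be checked numerically. The sketch below fits y = intercept + slope * x by ordinary least squares on made-up data and compares predictions one unit apart; the function names and the data are assumptions for illustration.

```python
def fit_simple_linear_regression(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data lying exactly on y = 2 + 3x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0, 5.0, 8.0, 11.0, 14.0]
intercept, slope = fit_simple_linear_regression(xs, ys)

def predict(x):
    """Prediction from the fitted line."""
    return intercept + slope * x

# Raising x by one unit changes the prediction by the slope.
print(slope, predict(6.0) - predict(5.0))
```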
What is the fundamental idea behind the Random Forest model?
a) It constructs multiple decision trees and combines their predictions.
b) It uses a single decision tree for classification.
c) It employs a linear discriminant function.
d) It performs k-Nearest Neighbors classification.
Answer: a) It constructs multiple decision trees and combines their predictions.
Which of the following clustering algorithms follows a top-down approach?
a) K-means
b) Divisive
c) Agglomerative
d) None
Answer: b) Divisive
What are the typical steps in the machine learning process?
a) Data collection, data preprocessing, feature engineering, model selection, training,
evaluation, and deployment.
b) Data collection, data analysis, feature extraction, model validation, and testing.
c) Data preprocessing, model selection, and deployment.
d) Data collection, model training, and testing.
Answer: a) Data collection, data preprocessing, feature engineering, model selection,
training, evaluation, and deployment.
What is the name of the diagram that represents the tree structure of hierarchical
clustering?
a) Cluster plot
b) Decision tree
c) Dendrogram
d) Scatter plot
Answer: c) Dendrogram
In the context of linear classification, what is a discriminant function?
a) A function that discriminates against certain data points
b) A function that transforms data to a higher-dimensional space
c) A function that defines a decision boundary between classes
d) A function that adds noise to the data
Answer: c) A function that defines a decision boundary between classes
In which category does linear regression belong?
a) Neither supervised nor unsupervised learning
b) Both supervised and unsupervised learning
c) Unsupervised learning
d) Supervised learning
Answer: d) Supervised learning
The learner is trying to predict housing prices based on the size of each house. What type of
regression is this?
a) Multivariate Logistic Regression
b) Logistic Regression
c) Linear Regression
d) Multivariate Linear Regression
Answer: c) Linear Regression
How many variables are required to represent a linear regression model?
a) 3
b) 2
c) 1
d) 4
Answer: b) 2
Logistic regression and linear regression use the same cost function.
a) True
b) False
Answer: b) False
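To make the difference concrete, here is a sketch of the two costs usually paired with these models: mean squared error for linear regression and binary cross-entropy (log loss) for logistic regression. The function names and sample values are assumptions for illustration.

```python
import math

def mean_squared_error(y_true, y_pred):
    """Cost typically used for linear regression."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def log_loss(y_true, p_pred, eps=1e-12):
    """Binary cross-entropy, the cost typically used for logistic regression."""
    total = 0.0
    for t, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

labels = [1, 0, 1, 1]
probabilities = [0.9, 0.2, 0.6, 0.99]  # hypothetical model outputs

print(mean_squared_error(labels, probabilities))
print(log_loss(labels, probabilities))
```

Log loss penalizes confident wrong predictions far more heavily than squared error does, which is one reason it is preferred for probability outputs.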
Which of the following statements is not true about the Decision Tree?
a) It can be applied to binary classification problems only.
b) It is a predictor that predicts the label associated with an instance by traveling from a
root node of a tree to a leaf.
c) At each node, the successor child is chosen based on a splitting of the input space.
d) The splitting is based on one of the features or a predefined set of splitting rules.
Answer: a) It can be applied to binary classification problems only.
Which is not true about clustering?
a) A collection of objects based on similarity and dissimilarity between them.
b) Dividing the population or data points into a number of clusters.
c) An unsupervised learning method.
d) Identifies the category of new observations based on training data.
Answer: d) Identifies the category of new observations based on training data.
Which is conclusively produced by Hierarchical Clustering?
a) Final estimation of cluster centroids
b) Tree showing how nearby things are to each other
c) Assignment of each point to clusters
d) All of these
Answer: b) Tree showing how nearby things are to each other
Which of the following is a good technique to evaluate the performance of a machine
learning model?
a) Sampling
b) Parameter Tuning
c) Cross-validation
d) Stratification
Answer: c) Cross-validation
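A rough sketch of how k-fold cross-validation splits the data: each sample lands in the test fold exactly once and in the training set for every other fold. The helper k_fold_indices is hypothetical, and it uses contiguous folds without shuffling, which real workflows usually add.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

for train, test in k_fold_indices(10, 3):
    print("train:", train, "test:", test)
```

A model would be trained and scored once per fold, and the fold scores averaged to estimate performance on unseen data.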
Which of the following is a widely used and effective machine learning algorithm based on
the idea of bagging?
a) Random Forest
b) Regression
c) Classification
d) Decision Tree
Answer: a) Random Forest
Fill in the blanks
Answer: independent variable
The learner is trying to predict housing prices based on the size of each house. The variable
“size” is ___________
Answer: Y-axis
The target variable is represented along ____________
Answer: dependent variable
The learner is trying to predict the cost of papaya based on its size. The variable “cost” is
__________
Answer: X-axis
The independent variable is represented along _________
Answer: unsupervised
A telecommunications company wants to segment its customers into distinct groups;
this is an example of ________________ learning.
Answer: Reinforcement Learning
______________________ is a machine learning training method based on rewarding desired
behaviors and punishing undesired ones.
Answer: Sigmoid
___________________ function transforms the raw output scores into a probability distribution
over two classes, ensuring that the probabilities range between 0 and 1.
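As a small illustration, the sigmoid can be written in a few lines. The sketch assumes a single raw score z for the positive class, with the other class receiving the complementary probability.

```python
import math

def sigmoid(z):
    """Map a raw score z to a value between 0 and 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, this form avoids overflow in exp(-z).
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)

for score in (-4.0, 0.0, 4.0):
    p_positive = sigmoid(score)
    # The two class probabilities always sum to 1.
    print(score, p_positive, 1.0 - p_positive)
```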
Answer: Hyperparameters
____________________ are parameters that are set before the machine learning model is trained
and remain fixed during training.
Answer: Maximum Likelihood Estimation (MLE)
______________________________ is a widely used method for estimating the parameters of a
probability distribution from observed data.
Answer: Hypothesis space
________________________ is the space of all possible values that the weights of a machine
learning model can take.
Answer: Laplace Smoothing
______________________ is a smoothing technique that helps tackle the problem of zero
probability in the Naïve Bayes machine learning algorithm.
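A sketch of the idea with made-up word counts: a word never seen in a class gets probability zero without smoothing, which would wipe out the whole Naïve Bayes product. Adding a small count alpha to every word fixes that. The function name and the counts are assumptions for illustration.

```python
from collections import Counter

def smoothed_word_probability(word, class_word_counts, vocabulary_size, alpha=1.0):
    """P(word | class) with add-alpha (Laplace) smoothing."""
    total_words = sum(class_word_counts.values())
    return (class_word_counts.get(word, 0) + alpha) / (total_words + alpha * vocabulary_size)

# Hypothetical word counts observed in the "spam" class.
spam_counts = Counter({"free": 3, "offer": 2, "meeting": 0})
vocabulary = {"free", "offer", "meeting", "report"}

# "report" was never seen in spam, so the raw estimate is zero.
unsmoothed = spam_counts["report"] / sum(spam_counts.values())
smoothed = smoothed_word_probability("report", spam_counts, len(vocabulary))
print(unsmoothed, smoothed)
```

With alpha = 1 the smoothed probabilities over the whole vocabulary still sum to 1.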
Answer: Confusion Matrix
______________________ is a table for defining the performance of a classification algorithm.
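For a binary task, the table has four cells. The sketch below counts them from two label lists and derives accuracy; the function name and the sample labels are assumptions for illustration.

```python
def confusion_matrix(y_true, y_pred, positive=1):
    """Return counts of true/false positives and negatives for a binary task."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            counts["TP" if actual == positive else "FP"] += 1
        else:
            counts["FN" if actual == positive else "TN"] += 1
    return counts

# Hypothetical ground truth and model predictions.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

cm = confusion_matrix(actual, predicted)
accuracy = (cm["TP"] + cm["TN"]) / len(actual)
print(cm, accuracy)
```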
Answer: Agglomerative clustering
______________________________ is a bottom-up hierarchical clustering approach where each data
point starts as its own cluster and merges iteratively.
Answer: Multiple Linear Regression
____________________ is a statistical approach that represents the linear relationship between
two or more variables, either dependent or independent.
Answer: KNN
The __________________ machine learning algorithm can be used for imputing missing values of both
categorical and continuous variables.
Answer: Entropy
_______________ is the measurement of disorder/randomness in a dataset or impurities in the
information processed in machine learning.
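A short sketch of the entropy calculation over class labels, in bits. It uses the form sum(p * log2(1 / p)), which is algebraically the same as the usual −Σ p_i log2(p_i); the function name and sample labels are assumptions for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy in bits over the class proportions in `labels`."""
    total = len(labels)
    proportions = [count / total for count in Counter(labels).values()]
    # p * log2(1 / p) equals -p * log2(p) for every p > 0.
    return sum(p * math.log2(1 / p) for p in proportions)

print(entropy(["yes", "yes", "no", "no"]))    # evenly mixed labels
print(entropy(["yes", "yes", "yes", "yes"]))  # a pure set of labels
```

An evenly mixed two-class set has the highest entropy, and a pure set has none, which is why decision trees prefer splits that lower it.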
Answer: Discriminative Models
___________________ aim to model the conditional distribution of the output variable given
the input variable. They learn a decision boundary that separates the different classes of the
output variable.
Answer: K-Means
____________________ clustering aims to partition n observations into k clusters in which each
observation belongs to the cluster with the nearest centroid.
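The two alternating steps, assignment to the nearest centroid and centroid update, can be sketched in plain Python. This is a simplified version that assumes the first k points serve as the initial centroids; practical implementations usually pick starting centroids more carefully. The function name and data are illustrative.

```python
import math

def k_means(points, k, iterations=10):
    """Lloyd's algorithm with the first k points as initial centroids (a simplification)."""
    centroids = [tuple(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, assignments

# Hypothetical 2-D data with two visible groups.
data = [(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (5.2, 4.9), (0.9, 1.1), (4.8, 5.1)]
centroids, assignments = k_means(data, k=2)
print(centroids, assignments)
```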
Answer: Generative Models
In ________________________________ models, we assume that the data is generated by an
underlying probability distribution.
Answer: Expectation-Maximization (EM)
_________________________ is a clustering algorithm that relies on maximizing the likelihood to
find the statistical parameters of the underlying sub-populations in the dataset.
Answer: Clustering
The ____________ type of machine learning algorithm falls under the category of “unsupervised
learning.”
Answer: Decision Tree
______________ uses the inductive learning machine learning approach.
Answer: Decision Tree
A _________ is a decision support tool that uses a tree-like graph or model of decisions and
their possible consequences, including chance event outcomes, resource costs, and utility.
Answer: −Σ p_i log2(p_i)
In a decision tree, the equation of entropy measure is ______________________________ .
Answer: A supervised learning method that uses a tree-like structure
Decision tree means ___________________________
Answer: Squares
Decision Nodes are represented by ____________
Answer: Prediction
_______________ is the goal of supervised learning.
Answer: Spam Detection
_____________ is an example of a classification problem.
Answer: Predicting continuous values
Regression means _________________
Answer: Multiple Linear Regression
MLR full form is ________________
Answer: Hierarchical Cluster Analysis
HCA full form is ___________________