Machine Learning
Accuracy and Confusion Matrix
Portland Data Science Group
Created by Andrew Ferlitsch
Community Outreach Officer
July, 2017
Classifier vs. Regressor
• A model may be either a classifier or a regressor.
• A classifier outputs a discrete value.
• Finite set of values.
• May be an enumeration (e.g., types of fruit).
• May be a fixed set of numerical values (e.g., 10, 20, 30).
• A regressor outputs a continuous (real) value.
• Infinite set of values.
• May be a probability (i.e., between 0 and 1).
• May be an unbounded value (e.g., income).
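The distinction can be sketched with two toy functions (both hypothetical models, with made-up thresholds and coefficients, purely for illustration):

```python
# Hypothetical classifier: output comes from a finite set of labels.
def fruit_classifier(weight_g: float) -> str:
    """Returns one value from an enumeration (apple or pear)."""
    return "apple" if weight_g < 160 else "pear"

# Hypothetical regressor: output is a continuous, unbounded real value.
def income_regressor(years_experience: float) -> float:
    """Returns a real-valued estimate (assumed linear model)."""
    return 30000.0 + 4500.0 * years_experience

print(fruit_classifier(150))    # discrete label: "apple"
print(income_regressor(5.0))    # continuous value: 52500.0
```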
Accuracy in a Classification Model
• After a classification model has been trained (e.g., apple
vs. pear), a test data set is run against the model to
determine accuracy.
• The test data set has labels indicating the expected
result (y). E.g., an apple or pear.
• The model outputs predictions (ŷ). E.g., is it an apple or a
pear?
• Accuracy is measured as the percentage of predicted
results that match the expected results.
Ex. If there are 1000 results, and 850 predicted results
match the expected results, then the accuracy is 85%.
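This accuracy calculation is just a matching count over the test set; a minimal sketch mirroring the 850-of-1000 example:

```python
# Accuracy: fraction of predictions (y_hat) that match expected labels (y).
def accuracy(expected, predicted):
    matches = sum(1 for y, y_hat in zip(expected, predicted) if y == y_hat)
    return matches / len(expected)

# Toy data mirroring the example: 850 of 1000 predictions are correct.
expected  = ["apple"] * 850 + ["pear"] * 150
predicted = ["apple"] * 850 + ["apple"] * 150  # last 150 are wrong
print(accuracy(expected, predicted))  # 0.85
```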
Problem with Accuracy
• If the training and test data are skewed towards one
classification, then the model will predict everything
as being that class.
Ex. In Titanic training data, 68% of people died. If
one trained a model to predict everybody died,
it would be 68% accurate.
• The data may be fitted against a feature that is not
relevant.
Ex. In image classification, if all images of one class have
small/similar background, the model may match based
on the background, not the object in the image.
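The skew problem can be demonstrated with a majority-class baseline: a "model" that learns nothing except which class is most common, yet still scores well on imbalanced data (the 68/32 split below is taken from the Titanic example above):

```python
from collections import Counter

# Baseline "model" that always predicts the majority class of the
# training labels, ignoring all features.
def majority_baseline(train_labels):
    majority = Counter(train_labels).most_common(1)[0][0]
    return lambda _features: majority

# Labels skewed as in the Titanic example: 68% "died", 32% "survived".
labels = ["died"] * 68 + ["survived"] * 32
predict = majority_baseline(labels)

test_accuracy = sum(predict(None) == y for y in labels) / len(labels)
print(test_accuracy)  # 0.68 - looks respectable, but nothing was learned
```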
Confusion Matrix
• Four-quadrant measurement of the “performance” of a
model.

                   Predicted (False)      Predicted (True)
Actual (False)     True Negative (TN)     False Positive (FP)
Actual (True)      False Negative (FN)    True Positive (TP)

• TP: number correctly predicted as the class (e.g., dog).
• TN: number correctly predicted as not the class (e.g., not a dog).
• FP: number incorrectly predicted as the class (e.g., dog), when it is
not that class.
• FN: number incorrectly predicted as not the class (e.g., not a dog).

Example Measurements:
• Accuracy = ( TP + TN ) / N
• Misclassification = ( FP + FN ) / N
• Precision = TP / ( TP + FP )
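The four counts and the three measurements above can be computed directly from paired labels; a minimal sketch using booleans (True = the class, e.g. "dog"), with made-up toy data:

```python
# Confusion-matrix counts and derived metrics from (actual, predicted)
# boolean label pairs, where True means "is the class" (e.g., dog).
def confusion_metrics(actual, predicted):
    tp = sum(a and p for a, p in zip(actual, predicted))          # hits
    tn = sum(not a and not p for a, p in zip(actual, predicted))  # correct rejections
    fp = sum(not a and p for a, p in zip(actual, predicted))      # false alarms
    fn = sum(a and not p for a, p in zip(actual, predicted))      # misses
    n = len(actual)
    return {
        "accuracy": (tp + tn) / n,
        "misclassification": (fp + fn) / n,
        "precision": tp / (tp + fp),
    }

# Toy "dog vs. not-dog" labels: 2 TP, 1 FN, 1 FP, 4 TN.
actual    = [True, True, True, False, False, False, False, False]
predicted = [True, True, False, True, False, False, False, False]
print(confusion_metrics(actual, predicted))
```

Note that accuracy and misclassification always sum to 1, since every prediction is either correct (TP, TN) or incorrect (FP, FN).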
Accuracy in a Regression Model
• After a regression model has been trained, a test data
set is run against the model to determine accuracy.
• The test data set has labels indicating the expected
result (y). E.g., expected spending level.
• The model outputs predictions (ŷ). E.g., amount of spending.
• Accuracy is measured as a cost (or loss) function
between the expected result and predicted result.
• Ex. Mean Square Error
Loss Function
Minimize Loss (Estimated Error) when Fitting a Line

[Figure: a fitted line with actual values y1…y6 plotted against
predicted values (yhat); each error is the difference (y − yhat).]

Mean Square Error:

MSE = (1/n) Σⱼ₌₁ⁿ (yⱼ − yhatⱼ)²

• Sum the square of the differences (y − yhat).
• Divide by the number of samples.
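The MSE formula translates directly into code: square each difference, sum, and divide by the sample count (the values below are made up for illustration):

```python
# Mean Square Error: average of the squared differences between
# actual values (y) and predicted values (y_hat).
def mse(y, y_hat):
    return sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat)) / len(y)

# Toy actual vs. predicted values; each squared difference is 0.25.
y     = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.5, 1.5, 3.5, 3.5]
print(mse(y, y_hat))  # 0.25
```

Squaring makes all errors positive (so over- and under-predictions cannot cancel) and penalizes large errors more heavily than small ones.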
