DSML Classification

Classification

INTRODUCTION

Classification
§ Introduction
§ Logistic Regression
§ Classification Process
§ Naïve Bayes
§ Decision Trees
§ KNN

Classification
Classification is the problem of identifying to which of a set of categories (labels) a new observation belongs.

Classification of the new observation is based on a training set of data containing observations (or
instances) whose category membership is known.

Regression                 Classification
X1   X2   Y                X1   X2   Y
10   20   100              10   20   A
15   30   150              15   30   A
5    10   75               5    10   B
Logistic Regression
Logistic regression is a technique used for binary classification problems, where the goal is to predict one
of two possible outcomes.

Unlike linear regression, which predicts a continuous outcome, logistic regression predicts the probability
that a given input belongs to a certain class.

Logistic Regression
• What is the output of the logistic regression model?
• Is the output range bound?
• How is the output constrained to a range?

Logistic Regression
Logistic regression uses a logistic function (or sigmoid function) to model the probability of a particular
outcome. The logistic function maps any real-valued number into the range (0, 1):

sigmoid(z) = 1 / (1 + e^(-z))

z     output
-5    0.01
-2    0.12
0     0.50
1     0.73
2     0.88
Logistic Regression

What do you notice in the picture? [Figure: the S-shaped sigmoid curve]
Logistic Regression

The model expresses the probability of class 1 through the sigmoid of a linear combination of the predictors:

p = P(y = 1 | X) = 1 / (1 + e^-(β0 + β1x1 + β2x2 + ... + βkxk))
Logistic Regression
The model is fit by minimizing the log loss (binary cross-entropy) over all instances:

Loss = - (yi * log p(yi) + (1 - yi) * log(1 - p(yi)))

Instance | Class (yi) | Probability p(yi) | log p(yi) | Class (1-yi) | Probability (1-p(yi)) | log(1-p(yi)) | Loss
1        | 1          | 0.8               | -0.22     | 0            | 0.2                   | -1.61        | 0.223
2        | 1          | 0.9               | -0.11     | 0            | 0.1                   | -2.30        | 0.105
3        | 0          | 0.1               | -2.30     | 1            | 0.9                   | -0.11        | 0.105
4        | 0          | 0.2               | -1.61     | 1            | 0.8                   | -0.22        | 0.223
5        | 1          | 0.9               | -0.11     | 0            | 0.1                   | -2.30        | 0.105
6        | 0          | 0.3               | -1.20     | 1            | 0.7                   | -0.36        | 0.357
8        | 0          | 0.4               | -0.92     | 1            | 0.6                   | -0.51        | 0.511
9        | 1          | 0.6               | -0.51     | 0            | 0.4                   | -0.92        | 0.511
10       | 0          | 0.1               | -2.30     | 1            | 0.9                   | -0.11        | 0.105

Objective Function Value (mean loss): 0.250
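A minimal NumPy sketch (not from the original slides) of the per-instance losses and the objective value:

```python
import numpy as np

y = np.array([1, 1, 0, 0, 1, 0, 0, 1, 0])                     # actual classes
p = np.array([0.8, 0.9, 0.1, 0.2, 0.9, 0.3, 0.4, 0.6, 0.1])   # predicted P(class = 1)

loss = -(y * np.log(p) + (1 - y) * np.log(1 - p))  # per-instance log loss
print(loss.round(3))         # [0.223 0.105 0.105 0.223 0.105 0.357 0.511 0.511 0.105]
print(loss.mean().round(3))  # 0.25
```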

Logistic Regression

Are the betas significant? If so, the feature has a significant impact on the outcome.
Interpretation
Positive Beta: Indicates that as the predictor increases, the probability of the outcome increases.

Negative Beta: Indicates that as the predictor increases, the probability of the outcome decreases.

Magnitude of Beta: The larger the absolute value of β, the stronger the association between the predictor and the outcome.

If β = 0.5, then e^0.5 ≈ 1.65. This means the odds of the outcome are 65% higher for each one-unit increase in the predictor.
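A one-line check of this odds-ratio interpretation (an illustration, not from the slides):

```python
import numpy as np

beta = 0.5
odds_ratio = np.exp(beta)   # e^0.5 ≈ 1.65: odds multiplier per one-unit increase
print(f"odds ratio: {odds_ratio:.2f} -> {100 * (odds_ratio - 1):.0f}% higher odds")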

Classification Process
Evaluation Measures
• AUC-ROC
• Obtain the Receiver-Operating Characteristic
Curve and determine the AUC value
• Determine the overall goodness of the model(s)
• Confusion Matrix
• Obtain Predicted Probabilities
• Examine Probability Distribution and Identify
Threshold for Classification
• Assess accuracy, sensitivity, specificity,
precision, F1-score, etc.
• If needed, determine the threshold for classification
using other methods
• Youden's Index
• Cost-Benefit Approach

Model Performance

Can I get different predictions based on threshold? Can I get TPR and FPR for different thresholds?
Yes: we can get 99 different TPR and FPR values by obtaining predicted classes each time, setting the
probability threshold anywhere between 0.01 and 0.99.

Case | Actual Class | Pred Prob | Predicted Class | Type
1    | 1            | 0.9       | 1               | True Positive
2    | 0            | 0.1       | 0               | True Negative
3    | 1            | 0.8       | 0               | False Negative
4    | 0            | 0.3       | 1               | False Positive
5    | 1            | 0.7       | 1               | True Positive
.    | .            | .         | .               | .
N    | 0            | 0.4       | 0               | True Negative

TPR: Ratio of True Positives to all actual positive (Class 1) observations
FPR: Ratio of False Positives to all actual negative (Class 0) observations
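A sketch of the threshold sweep described above; the toy arrays are assumptions for illustration:

```python
import numpy as np

y_true = np.array([1, 0, 1, 0, 1, 0])              # assumed actual labels
y_prob = np.array([0.9, 0.1, 0.8, 0.3, 0.7, 0.4])  # assumed predicted P(class = 1)

points = []
for t in np.arange(0.01, 1.00, 0.01):              # 99 candidate thresholds
    y_pred = (y_prob > t).astype(int)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    tpr = tp / np.sum(y_true == 1)                 # True Positive Rate
    fpr = fp / np.sum(y_true == 0)                 # False Positive Rate
    points.append((round(t, 2), tpr, fpr))

print(points[24])   # TPR and FPR at threshold 0.25, e.g. (0.25, 1.0, 0.67)
```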

Classification Process
AUC-ROC

AUC quantifies the overall ability of the model to discriminate between positive and negative classes,
with values closer to 1 indicating better performance. [Figure: ROC curves labeled Good and Bad, with AUC = 0.8]

To find the AUC-ROC:
1. Obtain the predicted probabilities for instances in the test dataset
2. Use the actual target labels and their predicted probabilities to plot the ROC curve and obtain AUC

TPR: Ratio of True Positives to all actual positive (Class 1) observations
FPR: Ratio of False Positives to all actual negative (Class 0) observations
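A sketch using scikit-learn's roc_curve and roc_auc_score; the toy arrays are assumptions, not slide data:

```python
from sklearn.metrics import roc_auc_score, roc_curve

y_test = [1, 0, 1, 1, 0, 0, 1]                   # assumed actual labels
y_prob = [0.8, 0.7, 0.1, 0.9, 0.2, 0.1, 0.9]     # assumed predicted P(class = 1)

fpr, tpr, thresholds = roc_curve(y_test, y_prob)  # points of the ROC curve
print(roc_auc_score(y_test, y_prob))              # area under that curve, ≈ 0.79 here
```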
Classification Process
A confusion matrix is a fundamental tool used to evaluate the performance of a classification
model. It provides a detailed breakdown of the model's predictions compared to the actual
outcomes.

Creating the Confusion Matrix

1. Obtain the predicted probabilities for the instances in the test data
2. Plot the probability distribution of predicted probabilities for classes “1” and “0” of test data
3. Select a suitable threshold from the probability distribution for classifying an instance as “1”
4. Obtain the predicted classes using the threshold
5. Create the confusion matrix
6. Calculate Accuracy, Sensitivity, Specificity, Precision and F1-Score

Classification Process
Creating the Confusion Matrix

Step 1: Obtain the predicted probabilities (the probability of being class "1") for the instances in the test data.

The test data contains the instances with their features and actual class labels. The model is applied
on the test data to obtain the predicted probabilities for the instances.

# | Actual Class | Predicted Probability
1 | 1            | 0.8
2 | 0            | 0.7
3 | 1            | 0.1
4 | 1            | 0.9
5 | 0            | 0.2
6 | 0            | 0.1
7 | 1            | 0.9
Classification Process
Creating the Confusion Matrix
Step 2: Plot the probability distribution of the predicted probabilities for classes "1" and "0" of the test data.

Try to select a threshold that reduces the margin of overlap between the two distributions
(the overlap is responsible for incorrect predictions).
Classification Process
Creating the Confusion Matrix
Step 3: Select a suitable threshold from the probability distribution for classifying an instance as "1".

Try to select a threshold that reduces the margin of overlap (overlap is responsible for incorrect
predictions). [Figure: overlapping predicted-probability distributions with TN, TP, FN and FP regions marked]

Let's say we select 0.25 to be the threshold. All instances in the test data with predicted probabilities
> 0.25 will be labeled as Class 1.
Classification Process
Creating the Confusion Matrix
Step 4: Obtain the predicted classes using the threshold.
Predicted Class = 1 if Predicted Probability > 0.25 (for example)

# | Actual Class | Predicted Probability | Predicted Class | Type
1 | 1            | 0.8                   | 1               | TP
2 | 0            | 0.7                   | 1               | FP
3 | 1            | 0.1                   | 0               | FN
4 | 1            | 0.9                   | 1               | TP
5 | 0            | 0.2                   | 0               | TN
6 | 0            | 0.1                   | 0               | TN
7 | 1            | 0.9                   | 1               | TP
Classification Process
Creating the Confusion Matrix
Step 5: Create the confusion matrix

                      Actual Class
                      Class (1)   Class (0)
Predicted Class (1)   3 [TP]      1 [FP]
Predicted Class (0)   1 [FN]      2 [TN]
Classification Process
Creating the Confusion Matrix
Step 6: Calculate Accuracy, Sensitivity, Specificity, Precision and F1-Score

Accuracy             = (TP + TN) / (TP + TN + FP + FN) = 5/7 ≈ 0.71
Sensitivity (Recall) = TP / (TP + FN) = 3/4 = 0.75
Specificity          = TN / (TN + FP) = 2/3 ≈ 0.67
Precision            = TP / (TP + FP) = 3/4 = 0.75
F1-Score             = 2 * Precision * Recall / (Precision + Recall) = 0.75
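A scikit-learn sketch reproducing the metrics of this worked example at threshold 0.25; a sketch, not a definitive pipeline:

```python
from sklearn.metrics import (confusion_matrix, accuracy_score,
                             precision_score, recall_score, f1_score)

y_true = [1, 0, 1, 1, 0, 0, 1]                    # actual classes from the example
y_prob = [0.8, 0.7, 0.1, 0.9, 0.2, 0.1, 0.9]      # predicted probabilities from Step 1
y_pred = [1 if p > 0.25 else 0 for p in y_prob]   # Step 4: apply the threshold

print(confusion_matrix(y_true, y_pred))  # [[TN FP], [FN TP]] = [[2 1], [1 3]]
print(accuracy_score(y_true, y_pred))    # ≈ 0.71
print(recall_score(y_true, y_pred))      # sensitivity = 0.75
print(precision_score(y_true, y_pred))   # 0.75
print(f1_score(y_true, y_pred))          # 0.75
```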
Classification Process
Gain

Gain Tables and Curves show the proportion of targets captured by the model up to a chosen
percentile of the predicted probabilities. Higher gain values indicate better model performance in
capturing the target class compared to random selection.

Classification Process
Lift

Lift measures how much better the model is at identifying positive cases compared to a random
model. A lift greater than 1 indicates that the model is effective at identifying positive outcomes
better than random selection. A lift of 1 means that the model performs no better than random
guessing.

Gain and Lift - Example
Let's consider an insurance website where visitors explore various insurance products. In this context, a conversion
occurs when a visitor responds positively to an offer. We mark visitors who convert as "1" and those who do not
convert as "0". Assume we have developed a classification model designed to predict which visitors are likely to
convert. How do we calculate the gain and lift in this case?

Number of visitors: 1000
Number of converts in the entire dataset: 200
Conversion percentage: 200/1000 = 20%
Number of actual converts in the top 20% of leads (200 predictions) by predicted probability: 80
Gain: 80/200 = 40% [the model captures 40% of all converted leads within these two deciles]
Number of converts identifiable by random guessing: 20% of 200 = 40
Lift: 80/40 = 2 [the model is 2 times better than random guessing at capturing leads that convert in the top two deciles]
Gain and Lift - Example

Without Machine Learning                       With Machine Learning

Decile | Visitors | Converts | % Converts     Decile | Visitors | Converts | % Converts
1      | 100      | 20       | 20             1      | 100      | 50       | 50
2      | 100      | 20       | 20             2      | 100      | 30       | 30
3      | 100      | 20       | 20             3      | 100      | 25       | 25
4      | 100      | 20       | 20             4      | 100      | 20       | 20
5      | 100      | 20       | 20             5      | 100      | 17       | 17
6      | 100      | 20       | 20             6      | 100      | 15       | 15
7      | 100      | 20       | 20             7      | 100      | 13       | 13
8      | 100      | 20       | 20             8      | 100      | 12       | 12
9      | 100      | 20       | 20             9      | 100      | 10       | 10
10     | 100      | 20       | 20             10     | 100      | 8        | 8
Total  | 1000     | 200      | 20             Total  | 1000     | 200      | 20
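A NumPy sketch (not from the slides) computing cumulative gain and lift from the "With Machine Learning" decile counts:

```python
import numpy as np

converts_per_decile = np.array([50, 30, 25, 20, 17, 15, 13, 12, 10, 8])  # "With ML" column
total_converts = converts_per_decile.sum()                               # 200

cum_gain = converts_per_decile.cumsum() / total_converts  # fraction of converts captured so far
cum_lift = cum_gain / (np.arange(1, 11) / 10)             # vs. random selection of the same share
print(cum_gain[1])   # 0.4 -> gain of 40% within the top two deciles
print(cum_lift[1])   # 2.0 -> lift of 2 in the top two deciles
```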
Naïve Bayes
Naïve Bayes is a probabilistic classifier based on Bayes' theorem. Bayes' theorem describes the
probability of an event, based on prior knowledge of conditions that might be related to the event.

The conditional probability that an object belongs to class Ck given the feature set X is given by

P(Ck | X) = P(X | Ck) * P(Ck) / P(X)
Naïve Bayes

[Training data table with features Humidity, Temp and Outlook, and class Play (n = 12: 7 Yes, 5 No)]
Naïve Bayes
We have to find the probability of playing when Humidity = Medium, Temp = Low and Outlook = Overcast.
X = {Humidity = Medium, Temp = Low, Outlook = Overcast}

P(X | C_Yes) * P(C_Yes)
= P(Humidity = Medium | Play = Yes) * P(Temp = Low | Play = Yes) * P(Outlook = Overcast | Play = Yes) * P(Yes)
= (2/7) * (2/7) * (2/7) * (7/12) = 0.0136

P(X | C_No) * P(C_No)
= P(Humidity = Medium | Play = No) * P(Temp = Low | Play = No) * P(Outlook = Overcast | Play = No) * P(No)
= (2/5) * (2/5) * (1/5) * (5/12) = 0.0133

P(X) = P(Humidity = Medium) * P(Temp = Low) * P(Outlook = Overcast)
     = (4/12) * (4/12) * (3/12) = 0.0278

P(Cricket = Y | Medium Humidity, Low Temp, Overcast Outlook) = 0.0136/0.0278 = 0.490
P(Cricket = N | Medium Humidity, Low Temp, Overcast Outlook) = 0.0133/0.0278 = 0.480
Naïve Bayes
We know that P(Yes) + P(No) = 1, therefore normalizing the results:

P(Cricket = Y | Medium Humidity, Low Temp, Overcast Outlook) = 0.49/(0.49 + 0.48) = 0.505
P(Cricket = N | Medium Humidity, Low Temp, Overcast Outlook) = 0.48/(0.49 + 0.48) = 0.495
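A small Python sketch (not from the slides) reproducing the arithmetic above:

```python
# Likelihoods of each feature value given the class, from the training counts
p_x_given_yes = (2/7) * (2/7) * (2/7)   # given Play = Yes
p_x_given_no  = (2/5) * (2/5) * (1/5)   # given Play = No
p_yes, p_no   = 7/12, 5/12              # class priors

score_yes = p_x_given_yes * p_yes       # ≈ 0.0136
score_no  = p_x_given_no * p_no         # ≈ 0.0133

# Normalize so the two posteriors sum to 1
total = score_yes + score_no
print(score_yes / total, score_no / total)   # ≈ 0.505, 0.495
```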
Decision Trees
A decision tree (classification tree) is an algorithm that constructs rules based on the independent variables
(predictors) by recursively partitioning the data, in order to split the data into given classes (class labels)
that are as homogeneous as possible.

Structure of a decision tree:
Root Node: contains all data
Decision Node: applies a rule using one of the IVs; branches on whether the condition is satisfied (Yes / No)
Terminal Node: a leaf where no further rule is applied and a class is assigned
Decision Trees - Example

[Figure: two datasets of red, blue and green circles and squares labeled with classes 0 and 1]

Left tree:   IF Colour (IV) = Red, Class (DV) = 1
             IF Colour (IV) = Blue, Class (DV) = 0

Right tree:  IF Colour (IV) = Red & Shape (IV) = Circle, Class (DV) = 1
             IF Colour (IV) = Red & Shape (IV) = Square, Class (DV) = 0
             IF Colour (IV) = Green & Shape (IV) = Square, Class (DV) = 1
             IF Colour (IV) = Green & Shape (IV) = Circle, Class (DV) = 0
Decision Trees - Example

n = 14, Y = 9, N = 5

Rules learned:
IF Overcast, Play Golf (4Y)
IF Sunny and Not Windy, Play Golf (3Y)
IF Sunny and Windy, Not Play Golf (2N)
IF Rainy and High Humidity, Not Play Golf (3N)
IF Rainy and Normal Humidity, Play Golf (2Y)

Tree:
Outlook = Overcast?
├─ Yes: n=4, Y=4, N=0 (leaf: Play)
└─ No:  n=10, Y=5, N=5
   Outlook = Sunny?
   ├─ Yes: n=5, Y=3, N=2
   │  Windy = True?
   │  ├─ Yes: n=2, Y=0, N=2 (leaf: Not Play)
   │  └─ No:  n=3, Y=3, N=0 (leaf: Play)
   └─ No:  n=5, Y=2, N=3
      Humidity = High?
      ├─ Yes: n=3, Y=0, N=3 (leaf: Not Play)
      └─ No:  n=2, Y=2, N=0 (leaf: Play)
Decision Trees - Challenge
How do we find the feature (IV) to be used for determining the split?

We select the feature which results in the most pure or homogeneous subsets.

So how do we measure homogeneity or purity?

There are various measures of purity or homogeneity. If we have two classes, a and b, with probabilities
P(a) and P(b), then:

(1) Gini Impurity
    Gini_Impurity_Node = 1 - P(a)^2 - P(b)^2
(2) Information Gain
    Entropy_Node = -P(a)*log2(P(a)) - P(b)*log2(P(b))
    Information Gain = Entropy of original set - weighted entropy of the sets resulting after the split
(3) Variance Reduction (usually for regression trees)
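A small Python sketch (not from the slides) of these purity measures for a two-class node:

```python
import numpy as np

def gini_impurity(p_a):
    p_b = 1 - p_a
    return 1 - p_a**2 - p_b**2

def entropy(p_a):
    p_b = 1 - p_a
    # Convention: 0 * log2(0) = 0 for pure nodes
    return -sum(p * np.log2(p) for p in (p_a, p_b) if p > 0)

print(gini_impurity(0.5))        # 0.5  (most impure node)
print(gini_impurity(1.0))        # 0.0  (pure node)
print(entropy(0.5))              # 1.0
print(round(entropy(9/14), 3))   # 0.940 (root node of the golf example)
```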
Decision Trees
1. Calculate the entropy / Gini impurity of the root node
2. Choose the attribute which results in the highest information gain or reduction in impurity
3. Repeat the procedure until no more splits are possible

The attribute that gives the highest resulting homogeneity is said to have the highest information gain.
Decision Trees

Comparing two candidate splits of the root node (n=10, Y=5, N=5, Gini = 0.5):

Feature D1: splits into (n=5, Y=0, N=5, Gini = 0.0) and (n=5, Y=5, N=0, Gini = 0.0)
            Avg Gini = 0.0
Feature D2: splits into (n=5, Y=3, N=2, Gini = 0.480) and (n=5, Y=2, N=3, Gini = 0.480)
            Avg Gini = 0.480

Feature D1 produces the purer subsets, so it is preferred for the split.
Decision Trees

The same comparison using entropy (root node: n=10, Y=5, N=5, Entropy = 1):

Feature D1: splits into (n=5, Y=0, N=5, Ent = 0.0) and (n=5, Y=5, N=0, Ent = 0.0)
            Avg Entropy = 0.0
Feature D2: splits into (n=5, Y=3, N=2, Ent = 0.97) and (n=5, Y=2, N=3, Ent = 0.97)
            Avg Entropy = 0.97
Decision Trees - GINI
(Gini = node Gini; A_Gini = weighted Gini of a split)

Root: n=14, Y=9, N=5, Gini = 0.459
Split on Outlook = Overcast (A_Gini = 0.357):
├─ Yes: n=4, Y=4, N=0, Gini = 0.0
└─ No:  n=10, Y=5, N=5, Gini = 0.500
   Split on Outlook = Sunny (A_Gini = 0.480):
   ├─ Yes: n=5, Y=3, N=2, Gini = 0.480
   │  Split on Windy = True (A_Gini = 0.0):
   │  ├─ Yes: n=2, Y=0, N=2, Gini = 0.0
   │  └─ No:  n=3, Y=3, N=0, Gini = 0.0
   └─ No:  n=5, Y=2, N=3, Gini = 0.480
      Split on Humidity = High (A_Gini = 0.0):
      ├─ Yes: n=3, Y=0, N=3, Gini = 0.0
      └─ No:  n=2, Y=2, N=0, Gini = 0.0
Decision Trees - Entropy
Example [Entropy Calculation]
(E = node entropy; A_Ent = weighted entropy of a split)

Root: n=14, Y=9, N=5, E = 0.940
Split on Outlook = Overcast (A_Ent = 0.714):
├─ Yes: n=4, Y=4, N=0, E = 0.0
└─ No:  n=10, Y=5, N=5, E = 1.000
   Split on Outlook = Sunny (A_Ent = 0.971):
   ├─ Yes: n=5, Y=3, N=2, E = 0.971
   │  Split on Windy = True (A_Ent = 0.0):
   │  ├─ Yes: n=2, Y=0, N=2, E = 0.0
   │  └─ No:  n=3, Y=3, N=0, E = 0.0
   └─ No:  n=5, Y=2, N=3, E = 0.971
      Split on Humidity = High (A_Ent = 0.0):
      ├─ Yes: n=3, Y=0, N=3, E = 0.0
      └─ No:  n=2, Y=2, N=0, E = 0.0
k-NN
In k-NN classification, an object is assigned to the class most common among its k nearest neighbors.

[Figure: a green test point surrounded by blue squares and red triangles]

The test sample (green dot) should be classified either to the blue squares or to the red triangles.

If k = 3 (solid line circle) it is assigned to the red triangles because there are 2 triangles and only 1 square inside the
inner circle.

If k = 5 (dashed line circle) it is assigned to the blue squares (3 squares vs. 2 triangles inside the outer circle).
k-NN
In k-NN classification, an object is assigned to the class most common among its k nearest neighbors.
• The algorithm only stores the training examples during the learning phase
• The algorithm is executed during the classification phase. The unlabeled observation is assigned the label
  which is the most frequent among its k nearest neighbours.
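A minimal scikit-learn sketch of k-NN classification; the training points are assumed toy data:

```python
from sklearn.neighbors import KNeighborsClassifier

X_train = [[1, 2], [2, 3], [3, 1], [6, 5], [7, 7], [8, 6]]  # assumed toy features
y_train = [0, 0, 0, 1, 1, 1]                                # assumed toy labels

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)        # "learning" = storing the training examples
print(knn.predict([[7, 6]]))     # majority label among the 3 nearest neighbours -> [1]
```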
Comparison: Logistic Regression | Naïve Bayes | CART | KNN

Logistic Regression
+ White-box model; gives a very nice probabilistic estimate
+ Robust enough against overfitting, especially when regularized
+ Efficient, no assumption of distributions
- Useful only in linear models

Naïve Bayes
+ Preferred approach with a large # of categorical IVs
+ Useful in non-linear patterns; computationally efficient
- Does not give good results when assumptions of independence are violated
- Continuous IVs must hold a normal distribution

CART
+ White-box model; useful in non-linear patterns
+ Ensembles perform very well
- Prone to overfitting; sensitive to small changes in values in the data
- Large trees are difficult to interpret

KNN
+ Useful in non-linear patterns
+ Robust results in large sample size
- Choosing k is difficult; memory intensive
- Poor performance on high-dimension data
- More susceptible to noise in small sample size
Model Evaluation – Training and Testing

Data: Fold1 | Fold2 | Fold3 | Fold4

Use a k-fold validation strategy:

Train Model A on Fold1, Fold2, Fold3 and test it on Fold4
Train Model B on Fold2, Fold3, Fold4 and test it on Fold1
Train Model C on Fold3, Fold4, Fold1 and test it on Fold2
Train Model D on Fold4, Fold1, Fold2 and test it on Fold3

Confusion matrix measures on Fold1, Fold2, Fold3 and Fold4 should be similar.
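A scikit-learn sketch of this 4-fold strategy with synthetic data (an illustration, not the course's exact setup):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=200, n_features=5, random_state=0)  # synthetic data
model = LogisticRegression()

scores = cross_val_score(model, X, y, cv=4)   # one accuracy score per held-out fold
print(scores)   # the four fold scores should be similar for a stable model
```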
Thank You
