
Classification:

A machine learning perspective


Emily Fox & Carlos Guestrin
Machine Learning Specialization
University of Washington
©2015-2016 Emily Fox & Carlos Guestrin Machine Learning Specialization
Part of a specialization



This course is a part of the Machine Learning Specialization:

1. Foundations
2. Regression
3. Classification
4. Clustering & Retrieval
5. Recommender Systems
6. Capstone


What is the course about?



What is classification?
From features to predictions

Data → ML Method (Classifier) → Intelligence

Input x: features derived from data
Learn the x → y relationship
Predict y: categorical "output", class or label
Sentiment classifier
Input x: sentence from a review, e.g., "Easily best sushi in Seattle."

Sentence → Sentiment Classifier → Output y: sentiment


Classifier

Input x: sentence from review → Classifier (MODEL) → Output y: predicted class


Example multiclass classifier
Output y has more than 2 categories

Input x: webpage → Output y: category (Education, Finance, Technology, …)
Spam filtering
Input x: text of email, sender, IP, … → Output y: spam or not spam
Image classification

Input x: image pixels → Output y: predicted object
Personalized medical diagnosis
Input x → Disease Classifier (MODEL) → Output y: healthy, cold, flu, pneumonia, …


Reading your mind
Input x: brain region intensities → Output y: "Hammer" or "House"
Impact of classification



Course overview



Course philosophy: Always use case studies & …

- Core concept
- Visual
- Algorithm
- Practical
- Implement
- Advanced topics (OPTIONAL)
Overview of content

Models               Algorithms            Core ML
Linear classifiers   Gradient              Alleviating overfitting
Logistic regression  Stochastic gradient   Handling missing data
Decision trees       Recursive greedy      Precision-recall
Ensembles            Boosting              Online learning


Course outline



Overview of modules

Models                                 Algorithms                       Core ML
Linear classifiers (Module 1)          Gradient (Modules 2 & 3)         Alleviating overfitting (Modules 3 & 5)
Logistic regression (Modules 1, 2, 3)  Stochastic gradient (Module 9)   Handling missing data (Module 6)
Decision trees (Modules 4 & 5)         Recursive greedy (Module 4)      Precision-recall (Module 8)
Ensembles (Module 7)                   Boosting (Module 8)              Online learning (Module 9)


Module 1: Linear classifiers
Word        Coefficient
#awesome    1.0
#awful      -1.5

Score(x) = 1.0 #awesome – 1.5 #awful

[Plot: decision boundary in the (#awesome, #awful) plane; Score(x) > 0 on one side, Score(x) < 0 on the other.]
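A minimal sketch of this score computation in Python, assuming the sentence has already been reduced to word counts (the word_counts dictionary and the example below are illustrative, not from the slides):

    # Coefficients from the slide: +1.0 per "awesome", -1.5 per "awful".
    coefficients = {"awesome": 1.0, "awful": -1.5}

    def score(word_counts):
        # Score(x) = 1.0 * #awesome - 1.5 * #awful
        return sum(coefficients.get(word, 0.0) * count
                   for word, count in word_counts.items())

    def predict_class(word_counts):
        # Predict +1 (positive) when Score(x) > 0, otherwise -1 (negative).
        return +1 if score(word_counts) > 0 else -1

    print(predict_class({"awesome": 3, "awful": 1}))   # Score = 1.5, so prediction is +1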
Module 1: Logistic regression represents probabilities

P(y = +1 | x, ŵ) = 1 / (1 + exp(-ŵᵀ h(x)))
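A small sketch of turning the score from the previous slide into a probability with the logistic (sigmoid) function; here the score argument stands in for ŵᵀ h(x):

    import math

    def probability_positive(score):
        # P(y = +1 | x, w) = 1 / (1 + exp(-score)), where score = w^T h(x)
        return 1.0 / (1.0 + math.exp(-score))

    print(probability_positive(1.5))   # about 0.82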


Module 2: Learning “best” classifier
Maximize likelihood over all possible w0, w1, w2:

ℓ(w0=0, w1=1,   w2=-1.5) = 10^-6
ℓ(w0=1, w1=1,   w2=-1.5) = 10^-5
ℓ(w0=1, w1=0.5, w2=-1.5) = 10^-4
…

Best model found with gradient ascent (highest likelihood ℓ(w)):
ŵ = (w0=1, w1=0.5, w2=-1.5)

[Plot: candidate decision boundaries in the (#awesome, #awful) plane, one per setting of w.]
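A rough sketch of maximizing the likelihood with gradient ascent for logistic regression, using NumPy; the toy feature matrix, labels, and step size below are invented for illustration and are not the course assignment code:

    import numpy as np

    def predict_probability(H, w):
        # P(y = +1 | x, w) for each row of the feature matrix H (columns = h(x)).
        return 1.0 / (1.0 + np.exp(-H.dot(w)))

    def gradient_ascent(H, y, step_size=0.1, n_iterations=500):
        # y holds +1/-1 labels; repeatedly step uphill on the log likelihood ell(w).
        w = np.zeros(H.shape[1])
        indicator = (y == +1).astype(float)
        for _ in range(n_iterations):
            errors = indicator - predict_probability(H, w)
            w = w + step_size * H.T.dot(errors)   # gradient of the log likelihood
        return w

    # Toy data: columns are [intercept, #awesome, #awful].
    H = np.array([[1, 3, 0], [1, 0, 2], [1, 2, 1], [1, 0, 1]], dtype=float)
    y = np.array([+1, -1, +1, -1])
    print(gradient_ascent(H, y))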
Module 3: Overfitting & regularization
[Plot: classification error vs. model complexity, showing the training error and true error curves.]

Use a regularization penalty to mitigate overfitting:

ℓ(w) - λ ||w||²
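A sketch of the corresponding L2-regularized update, reusing the setup from the previous sketch; l2_penalty plays the role of λ and trades data fit against ||w||²:

    import numpy as np

    def regularized_gradient_step(w, H, indicator, step_size, l2_penalty):
        # Step uphill on ell(w) - lambda * ||w||^2; the penalty adds -2 * lambda * w to the gradient.
        probabilities = 1.0 / (1.0 + np.exp(-H.dot(w)))
        gradient = H.T.dot(indicator - probabilities) - 2.0 * l2_penalty * w
        return w + step_size * gradient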
Module 4: Decision trees
Start: Credit?
- excellent → Safe
- fair → Term?
    - 3 years → Risky
    - 5 years → Safe
- poor → Income?
    - high → Term?
        - 3 years → Risky
        - 5 years → Safe
    - low → Risky
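The same tree written as nested conditionals, a direct transcription of the branches as laid out above (not learned code):

    def predict_loan(credit, term, income):
        # Follow the tree: Credit? then Term? or Income? down to a Safe/Risky leaf.
        if credit == "excellent":
            return "Safe"
        if credit == "fair":
            return "Risky" if term == "3 years" else "Safe"
        # credit == "poor"
        if income == "low":
            return "Risky"
        return "Risky" if term == "3 years" else "Safe"

    print(predict_loan("fair", "5 years", "high"))   # Safe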


Module 5: Overfitting in decision trees
[Decision boundaries: decision trees of depth 1, 3, and 10; logistic regression with degree 1, 2, and 6 features.]


Module 5: Alleviate overfitting by learning simpler trees
Occam's Razor: "Among competing hypotheses, the one with fewest assumptions
should be selected." (William of Occam, 13th century)

Complex tree → Simplify → Simpler tree


Module 6: Handling missing data
Credit     Term   Income  y
excellent  3 yrs  high    safe
fair       ?      low     risky
fair       3 yrs  high    safe
poor       5 yrs  high    risky
excellent  3 yrs  low     risky
fair       5 yrs  high    safe
poor       ?      high    risky
poor       5 yrs  low     safe
fair       ?      high    safe

[Decision tree from Module 4, with "or unknown" added to selected branches so that rows with missing values ("?") still reach a leaf.]
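A sketch of one simple strategy consistent with the "or unknown" branch labels: when a value is missing, send the example down a designated branch instead of failing. Which branches absorb unknowns is a modeling choice; the ones below are illustrative:

    def predict_with_missing(credit, term, income):
        # Missing values (None) follow a branch labeled "or unknown".
        if credit == "excellent":
            return "Safe"
        if credit == "fair" or credit is None:                 # fair or unknown
            return "Risky" if term == "3 years" else "Safe"    # unknown term follows the "5 years" branch
        # credit == "poor"
        if income == "low":
            return "Risky"
        return "Risky" if term == "3 years" else "Safe"        # high or unknown income

    print(predict_with_missing("fair", None, "low"))   # Safe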


Module 7: Boosting question
“Can a set of weak learners be combined to
create a stronger learner?” Kearns and Valiant (1988)

Yes! Schapire (1990)

Boosting

Amazing impact: simple approach, widely used in industry, wins most Kaggle competitions.
Module 7: Boosting using AdaBoost
Four simple classifiers (decision stumps):

f1: Income > $100K?     Yes → Safe,  No → Risky      f1(xi) = +1
f2: Credit history?     Bad → Risky, Good → Safe     f2(xi) = -1
f3: Savings > $100K?    Yes → Safe,  No → Risky      f3(xi) = -1
f4: Market conditions?  Bad → Risky, Good → Safe     f4(xi) = +1

Ensemble: combine votes from many simple classifiers to learn complex classifiers.
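A sketch of the ensemble vote: each simple classifier f_t returns +1 (safe) or -1 (risky), and the ensemble takes the sign of a weighted sum. The weights below are placeholders; in AdaBoost they would be learned from each classifier's weighted accuracy:

    def ensemble_predict(x, classifiers, weights):
        # Weighted vote: sign( sum_t w_t * f_t(x) ); +1 = safe, -1 = risky.
        total = sum(w * f(x) for f, w in zip(classifiers, weights))
        return +1 if total > 0 else -1

    # The four stumps above (feature names are illustrative).
    f1 = lambda x: +1 if x["income"] > 100_000 else -1
    f2 = lambda x: -1 if x["credit_history"] == "bad" else +1
    f3 = lambda x: +1 if x["savings"] > 100_000 else -1
    f4 = lambda x: -1 if x["market_conditions"] == "bad" else +1

    x = {"income": 120_000, "credit_history": "bad",
         "savings": 20_000, "market_conditions": "good"}
    print(ensemble_predict(x, [f1, f2, f3, f4], weights=[2.0, 1.5, 1.5, 0.5]))   # -1 (risky)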


Module 8: Precision-recall
Goal: increase # of guests by 30%.
Need an automated, "authentic" marketing campaign: turn reviews into
great quotes for spokespeople, e.g., "Easily best sushi in Seattle."

Accuracy is not the most important metric here:

PRECISION: Did I (mistakenly) show a negative sentence?
RECALL: Did I fail to show a (great) positive sentence?
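A sketch of the two metrics in code, treating "show the sentence in the campaign" as predicting +1; the toy labels and predictions are invented:

    def precision_recall(true_labels, predicted_labels):
        # Precision: of the sentences I showed (+1), how many were truly positive?
        # Recall: of the truly positive sentences, how many did I show?
        shown = [(t, p) for t, p in zip(true_labels, predicted_labels) if p == +1]
        true_positives = sum(1 for t, _ in shown if t == +1)
        actual_positives = sum(1 for t in true_labels if t == +1)
        precision = true_positives / len(shown) if shown else 1.0
        recall = true_positives / actual_positives if actual_positives else 1.0
        return precision, recall

    print(precision_recall([+1, +1, -1, +1, -1], [+1, -1, -1, +1, +1]))   # roughly (0.67, 0.67)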
Module 9: Scaling to huge datasets & online learning

Scale: 4.8B webpages, 500M tweets/day, 5B video views/day.

Stochastic gradient: a tiny modification to gradient ascent; a lot faster, but annoying in practice.

[Plot: avg. log likelihood for gradient vs. stochastic gradient ("better" = higher).]
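A sketch of that "tiny modification": update w from one data point at a time instead of summing over the whole dataset. It reuses the same toy setup as the Module 2 sketch; the step size and number of passes are illustrative:

    import numpy as np

    def stochastic_gradient_ascent(H, y, step_size=0.05, n_passes=10):
        # Same log-likelihood objective, but each update touches a single example.
        w = np.zeros(H.shape[1])
        indicator = (y == +1).astype(float)
        for _ in range(n_passes):
            for i in np.random.permutation(len(y)):      # shuffle examples each pass
                p_i = 1.0 / (1.0 + np.exp(-H[i].dot(w)))
                w = w + step_size * (indicator[i] - p_i) * H[i]
        return w

    H = np.array([[1, 3, 0], [1, 0, 2], [1, 2, 1], [1, 0, 1]], dtype=float)
    y = np.array([+1, -1, +1, -1])
    print(stochastic_gradient_ascent(H, y))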


Assumed background



Courses 1 & 2 in this ML Specialization
• Course 1: Foundations
- Overview of ML case studies
- Black-box view of ML tasks
- Programming & data manipulation skills

• Course 2: Regression
- Data representation (input, output, features)
- Linear regression model
- Basic ML concepts:
• ML algorithm
• Gradient descent
• Overfitting
• Validation set and cross-validation
• Bias-variance tradeoff
• Regularization



Math background
• Basic calculus
- Concept of derivatives
• Basic vectors
• Basic functions
- Exponentiation e^x
- Logarithm



Programming experience
• Basic Python used
- Can be picked up along the way if you know another programming language



Reliance on GraphLab Create
• SFrames will be used, though not required
- an open-source project of Dato (creators of GraphLab Create)
- you can use pandas and NumPy instead
• Assignments will:
1. Use GraphLab Create to explore high-level concepts
2. Ask you to implement all algorithms without GraphLab Create
• Net result: learn how to code methods in Python
Computing needs
• Basic 64-bit desktop or laptop
• Access to internet
• Ability to:
- Install and run Python (and GraphLab Create)
- Store a few GB of data



Let’s get started!

