
Lecture Slides for
INTRODUCTION TO MACHINE LEARNING, 2ND EDITION
ETHEM ALPAYDIN
© The MIT Press, 2010
[email protected]
http://www.cmpe.boun.edu.tr/~ethem/i2ml2e
Outline
Last class: Chapter 13, Kernel Machines
- Non-separable case: Soft Margin Hyperplane
- Kernel Trick
- Vectorial Kernels
- Multiple Kernel Learning
- Multiclass Kernel Machines
Today: finish Chapter 13 (Kernel Machines), then Chapter 16, Hidden Markov Models

SVM for Regression
Use a linear model (possibly kernelized)
$$f(x) = \mathbf{w}^T x + w_0$$

Use the ε-sensitive error function:

$$e_\varepsilon\!\left(r^t, f(x^t)\right) =
\begin{cases}
0 & \text{if } \left|r^t - f(x^t)\right| < \varepsilon \\
\left|r^t - f(x^t)\right| - \varepsilon & \text{otherwise}
\end{cases}$$

The primal problem is

$$\min \; \tfrac{1}{2}\|\mathbf{w}\|^2 + C \sum_t \left(\xi_+^t + \xi_-^t\right)$$

subject to

$$r^t - \left(\mathbf{w}^T x^t + w_0\right) \le \varepsilon + \xi_+^t$$
$$\left(\mathbf{w}^T x^t + w_0\right) - r^t \le \varepsilon + \xi_-^t$$
$$\xi_+^t,\; \xi_-^t \ge 0$$
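As an aside (not part of the original slides), the ε-insensitive loss and a kernelized regression fit can be sketched with scikit-learn's SVR; the toy data, kernel choice, and hyperparameter values below are illustrative assumptions.

```python
import numpy as np
from sklearn.svm import SVR

def eps_insensitive_loss(r, f_x, eps=0.1):
    """e_eps(r, f(x)): zero inside the epsilon-tube, linear outside it."""
    return np.maximum(np.abs(r - f_x) - eps, 0.0)

# Toy 1-D regression data (illustrative only)
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(-3, 3, size=(50, 1)), axis=0)
r = np.sin(X).ravel() + 0.1 * rng.standard_normal(50)

# Kernelized SVM regression; C and epsilon play the same roles as in the primal above
svr = SVR(kernel="rbf", C=1.0, epsilon=0.1)
svr.fit(X, r)
print(eps_insensitive_loss(r, svr.predict(X)).mean())
```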
Kernel Regression
Polynomial kernel Gaussian kernel

One-Class Kernel Machines
Consider a sphere with center a and radius R
$$\min \; R^2 + C \sum_t \xi^t$$

subject to

$$\left\|x^t - a\right\|^2 \le R^2 + \xi^t, \qquad \xi^t \ge 0$$

The dual is

$$L_d = \sum_t \alpha^t \left(x^t\right)^T x^t - \sum_t \sum_s \alpha^t \alpha^s \left(x^t\right)^T x^s$$

subject to

$$0 \le \alpha^t \le C, \qquad \sum_t \alpha^t = 1$$
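A minimal sketch of one-class estimation on toy data; it uses scikit-learn's OneClassSVM, which solves the closely related ν-parameterized formulation rather than the C-parameterized one above, so the correspondence is only approximate.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X_train = rng.standard_normal((200, 2))        # "typical" data used to fit the boundary
X_test = np.array([[0.1, -0.2], [4.0, 4.0]])   # one inlier, one clear outlier

# nu roughly bounds the fraction of training points allowed outside the boundary
oc = OneClassSVM(kernel="rbf", gamma=0.5, nu=0.05)
oc.fit(X_train)
print(oc.predict(X_test))                      # +1 for inliers, -1 for outliers
```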

Kernel Dimensionality Reduction
Kernel PCA does PCA on the kernel matrix (equal to canonical PCA with a linear kernel).
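A brief sketch using scikit-learn's KernelPCA (the RBF kernel, gamma, and number of components are arbitrary choices for illustration):

```python
import numpy as np
from sklearn.decomposition import KernelPCA, PCA

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))

# Kernel PCA: eigendecomposition of the (centered) kernel matrix
Z_rbf = KernelPCA(n_components=2, kernel="rbf", gamma=0.1).fit_transform(X)

# With a linear kernel it spans the same subspace as canonical PCA
Z_lin = KernelPCA(n_components=2, kernel="linear").fit_transform(X)
Z_pca = PCA(n_components=2).fit_transform(X)
```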

Introduction
- Assumption: modeling dependencies in the input; the samples are no longer iid (independent and identically distributed)
- Sequences
  - Temporal:
    - In speech: phonemes in a word (dictionary), words in a sentence (syntax, semantics of the language)
    - In handwriting: pen movements
  - Spatial:
    - In a DNA sequence: base pairs
- Base pairs in a DNA sequence cannot be modeled as a simple probability distribution.
Discrete Markov Process
 N states: S1, S2, ..., SN
 State at “time” t, qt = Si
 First-order Markov
P(qt+1=Sj | qt=Si, qt-1=Sk ,...) = P(qt+1=Sj | qt=Si)

 Transition probabilities
aij ≡ P(qt+1=Sj | qt=Si) aij ≥ 0 and Σj=1N aij=1

 Initial probabilities
πi ≡ P(q1=Si) and Σi=1N πi=1
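As a small illustration (not part of the slides), a first-order Markov chain can be simulated directly from Π and A; the parameter values below are taken from the balls-and-urns example that follows.

```python
import numpy as np

Pi = np.array([0.5, 0.2, 0.3])        # pi_i = P(q1 = S_i)
A = np.array([[0.4, 0.3, 0.3],        # a_ij = P(q_{t+1} = S_j | q_t = S_i)
              [0.2, 0.6, 0.2],
              [0.1, 0.1, 0.8]])

def sample_chain(Pi, A, T, rng=np.random.default_rng(0)):
    """Draw a state sequence q_1, ..., q_T (0-based indices) from a first-order Markov chain."""
    q = [rng.choice(len(Pi), p=Pi)]
    for _ in range(T - 1):
        q.append(rng.choice(len(Pi), p=A[q[-1]]))
    return q

print(sample_chain(Pi, A, T=10))
```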

Stochastic Automaton

$$P(O = Q \mid A, \Pi) = P(q_1) \prod_{t=2}^{T} P(q_t \mid q_{t-1}) = \pi_{q_1}\, a_{q_1 q_2} \cdots a_{q_{T-1} q_T}$$

For example, the state sequence Q = 3 1 2 2 3 2 1 ... has probability

$$\pi_3\, a_{31}\, a_{12}\, a_{22}\, a_{23}\, a_{32}\, a_{21} \cdots$$
Example: Balls and Urns
Three urns, each full of balls of one color:
S1: red, S2: blue, S3: green

$$\Pi = [0.5,\; 0.2,\; 0.3]^T \qquad
A = \begin{bmatrix} 0.4 & 0.3 & 0.3 \\ 0.2 & 0.6 & 0.2 \\ 0.1 & 0.1 & 0.8 \end{bmatrix}$$

O = {S1, S1, S3, S3} = {red, red, green, green}

$$P(O \mid A, \Pi) = P(S_1)\, P(S_1 \mid S_1)\, P(S_3 \mid S_1)\, P(S_3 \mid S_3)
= \pi_1\, a_{11}\, a_{13}\, a_{33}
= 0.5 \times 0.4 \times 0.3 \times 0.8 = 0.048$$
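The same computation in code (a sketch; states are 0-based indices, so S1 → 0 and S3 → 2):

```python
import numpy as np

Pi = np.array([0.5, 0.2, 0.3])
A = np.array([[0.4, 0.3, 0.3],
              [0.2, 0.6, 0.2],
              [0.1, 0.1, 0.8]])

O = [0, 0, 2, 2]                 # observed state sequence S1, S1, S3, S3

p = Pi[O[0]]
for prev, cur in zip(O, O[1:]):  # multiply transition probabilities along the path
    p *= A[prev, cur]
print(p)                         # 0.5 * 0.4 * 0.3 * 0.8 = 0.048
```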
Balls and Urns: Learning
Observable Markov Model
Given K example sequences of length T
How to estimate the parameters?


$$\hat{\pi}_i = \frac{\#\{\text{sequences starting with } S_i\}}{\#\{\text{sequences}\}}
= \frac{\sum_k \mathbf{1}\!\left(q_1^k = S_i\right)}{K}$$

$$\hat{a}_{ij} = \frac{\#\{\text{transitions from } S_i \text{ to } S_j\}}{\#\{\text{transitions from } S_i\}}
= \frac{\sum_k \sum_{t=1}^{T-1} \mathbf{1}\!\left(q_t^k = S_i \text{ and } q_{t+1}^k = S_j\right)}{\sum_k \sum_{t=1}^{T-1} \mathbf{1}\!\left(q_t^k = S_i\right)}$$

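A minimal counting implementation of these estimators (a sketch; each sequence is assumed to be a list of 0-based state indices, and every state is assumed to be visited at least once so no row of counts is zero):

```python
import numpy as np

def estimate_markov_params(sequences, N):
    """Maximum-likelihood estimates of pi and A from fully observed state sequences."""
    pi_counts = np.zeros(N)
    trans_counts = np.zeros((N, N))
    for q in sequences:
        pi_counts[q[0]] += 1.0
        for prev, cur in zip(q, q[1:]):
            trans_counts[prev, cur] += 1.0
    pi_hat = pi_counts / len(sequences)                             # fraction of sequences starting in S_i
    A_hat = trans_counts / trans_counts.sum(axis=1, keepdims=True)  # row-normalized transition counts
    return pi_hat, A_hat

sequences = [[0, 0, 2, 2], [1, 1, 2, 2], [0, 2, 2, 2]]   # K = 3 toy sequences
print(estimate_markov_params(sequences, N=3))
```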
Hidden Markov Models
- States are not observable.
- Discrete observations {v1, v2, ..., vM} are recorded; they are a probabilistic function of the state.
- Emission probabilities:
  bj(m) ≡ P(Ot=vm | qt=Sj)
- Example:
  - In each urn, there are balls of different colors, but with different probabilities.
  - For each observation sequence, there are multiple possible state sequences that could have generated it.

Another Example
A colored ball-choosing example:

Urn 1: 30 red, 50 green, 20 blue
Urn 2: 10 red, 40 green, 50 blue
Urn 3: 60 red, 10 green, 30 blue

Probability of transition to another urn after picking a ball:

       U1    U2    U3
U1    0.1   0.4   0.5
U2    0.6   0.2   0.2
U3    0.3   0.4   0.3
Example (contd.)
Given the transition probabilities A and the emission probabilities B:

A =        U1    U2    U3          B =         R     G     B
     U1   0.1   0.4   0.5               U1   0.3   0.5   0.2
     U2   0.6   0.2   0.2               U2   0.1   0.4   0.5
     U3   0.3   0.4   0.3               U3   0.6   0.1   0.3

Observation: R R G G B R G R
State sequence: ??
Not so easily computable.
Example (contd.)
Here:
S = {U1, U2, U3}
V = {R, G, B}
For an observation sequence O = {o1 … on} and state sequence Q = {q1 … qn}:

A =        U1    U2    U3          B =         R     G     B
     U1   0.1   0.4   0.5               U1   0.3   0.5   0.2
     U2   0.6   0.2   0.2               U2   0.1   0.4   0.5
     U3   0.3   0.4   0.3               U3   0.6   0.1   0.3

π is given by πi ≡ P(q1 = Ui)
Elements of an HMM
- N: number of states
  S = {S1, S2, ..., SN}
- M: number of observation symbols
  V = {v1, v2, ..., vM}
- A = [aij]: N by N state transition probability matrix
  aij ≡ P(qt+1=Sj | qt=Si)
- B = [bj(m)]: N by M observation probability matrix
  bj(m) ≡ P(Ot=vm | qt=Sj)
- Π = [πi]: N by 1 initial state probability vector
  πi ≡ P(q1=Si)

λ = (A, B, Π) is the parameter set of the HMM.
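For concreteness, λ = (A, B, Π) for the three-urn example can be written as plain arrays; the initial distribution Π is not given on the slides, so a uniform one is assumed here.

```python
import numpy as np

A = np.array([[0.1, 0.4, 0.5],     # a_ij = P(q_{t+1} = U_j | q_t = U_i)
              [0.6, 0.2, 0.2],
              [0.3, 0.4, 0.3]])
B = np.array([[0.3, 0.5, 0.2],     # b_j(m) = P(O_t = v_m | q_t = U_j), columns R, G, B
              [0.1, 0.4, 0.5],
              [0.6, 0.1, 0.3]])
Pi = np.array([1/3, 1/3, 1/3])     # assumed uniform initial state distribution

# Sanity checks: rows of A and B, and Pi itself, must be probability distributions
assert np.allclose(A.sum(axis=1), 1.0)
assert np.allclose(B.sum(axis=1), 1.0)
assert np.isclose(Pi.sum(), 1.0)
```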
Examples
• Gene regulation: O = {A, C, G, T}, S = {gene, transcription factor binding site, junk DNA, ...}
• Speech processing: O = speech signal, S = word or phoneme being uttered
• Text understanding: O = words, S = topic (e.g. sports, weather)
• Robot localization: O = sensor readings, S = discretized position of the robot

Three Basic Problems of HMMs
1. Evaluation: given λ and O, calculate P(O | λ)
2. State sequence: given λ and O, find Q* such that
   P(Q* | O, λ) = maxQ P(Q | O, λ)
3. Learning: given X = {Ok}k, find λ* such that
   P(X | λ*) = maxλ P(X | λ)
(Rabiner, 1989)
Evaluation: Naïve solution
State sequence Q = {q1, …, qT}
Assume independent observations:

$$P(O \mid Q, \lambda) = \prod_{t=1}^{T} P(O_t \mid q_t, \lambda) = b_{q_1}(O_1)\, b_{q_2}(O_2) \cdots b_{q_T}(O_T)$$

Observations are mutually independent, given the hidden states.
Evaluation: Naïve solution
Observe that:

$$P(Q \mid \lambda) = \pi_{q_1}\, a_{q_1 q_2}\, a_{q_2 q_3} \cdots a_{q_{T-1} q_T}$$

And that:

$$P(O \mid \lambda) = \sum_Q P(O \mid Q, \lambda)\, P(Q \mid \lambda)$$
Evaluation: Naïve solution
Finally:

$$P(O \mid \lambda) = \sum_Q P(O \mid Q, \lambda)\, P(Q \mid \lambda)$$

- The above sum is over all state paths.
- There are N^T state paths, each "costing" O(T) calculations, leading to O(T N^T) time complexity.
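The naïve evaluation can be written as an explicit enumeration over all N^T paths (a sketch that is only feasible for tiny T; `A`, `B`, `Pi` are the arrays defined above, and observations are 0-based symbol indices):

```python
from itertools import product
import numpy as np

def naive_evaluate(O, A, B, Pi):
    """P(O | lambda) by summing P(O | Q, lambda) * P(Q | lambda) over all N**T state paths."""
    N, T = A.shape[0], len(O)
    total = 0.0
    for Q in product(range(N), repeat=T):                     # all N**T state paths
        p_q = Pi[Q[0]] * np.prod([A[Q[t - 1], Q[t]] for t in range(1, T)])
        p_o_given_q = np.prod([B[Q[t], O[t]] for t in range(T)])
        total += p_q * p_o_given_q
    return total

# Observation R R G G encoded as 0 0 1 1 (R=0, G=1, B=2)
print(naive_evaluate([0, 0, 1, 1], A, B, Pi))
```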
Evaluation
- Forward variable:
  $$\alpha_t(i) \equiv P(O_1 \cdots O_t,\; q_t = S_i \mid \lambda)$$
  The probability of observing the partial sequence {O1, …, Ot} up to time t and being in Si at time t, given the model λ.
- Initialization:
  $$\alpha_1(i) = \pi_i\, b_i(O_1)$$
- Recursion:
  $$\alpha_{t+1}(j) = \left[\sum_{i=1}^{N} \alpha_t(i)\, a_{ij}\right] b_j(O_{t+1})$$
- Evaluation:
  $$P(O \mid \lambda) = \sum_{i=1}^{N} \alpha_T(i)$$
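A direct translation of the forward recursion into code (a sketch; `A`, `B`, `Pi` as defined for the urn example, observations as 0-based symbol indices):

```python
import numpy as np

def forward(O, A, B, Pi):
    """Forward algorithm: returns alpha (T x N) and P(O | lambda) in O(T * N^2) time."""
    T, N = len(O), A.shape[0]
    alpha = np.zeros((T, N))
    alpha[0] = Pi * B[:, O[0]]                          # alpha_1(i) = pi_i * b_i(O_1)
    for t in range(T - 1):
        alpha[t + 1] = (alpha[t] @ A) * B[:, O[t + 1]]  # [sum_i alpha_t(i) a_ij] * b_j(O_{t+1})
    return alpha, alpha[-1].sum()                       # P(O | lambda) = sum_i alpha_T(i)

alpha, p_obs = forward([0, 0, 1, 1], A, B, Pi)
print(p_obs)   # should agree with the naive enumeration above
```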

Evaluation
- Backward variable:
  $$\beta_t(i) \equiv P(O_{t+1} \cdots O_T \mid q_t = S_i, \lambda)$$
  The probability of observing the remaining partial sequence {Ot+1, …, OT}, given that we are in Si at time t.
- Initialization:
  $$\beta_T(i) = 1 \quad \left(= P(O_{T+1} \mid q_T = S_i, \lambda)\right)$$
- Recursion:
  $$\beta_t(i) = \sum_{j=1}^{N} a_{ij}\, b_j(O_{t+1})\, \beta_{t+1}(j)$$
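And the corresponding backward pass (same assumptions as the forward sketch):

```python
import numpy as np

def backward(O, A, B):
    """Backward algorithm: beta_t(i) = P(O_{t+1} ... O_T | q_t = S_i, lambda)."""
    T, N = len(O), A.shape[0]
    beta = np.zeros((T, N))
    beta[-1] = 1.0                                       # beta_T(i) = 1
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, O[t + 1]] * beta[t + 1])     # sum_j a_ij b_j(O_{t+1}) beta_{t+1}(j)
    return beta

O = [0, 0, 1, 1]
beta = backward(O, A, B)
# Consistency check: sum_i pi_i * b_i(O_1) * beta_1(i) also equals P(O | lambda)
print((Pi * B[:, O[0]] * beta[0]).sum())
```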

