Markov Model &
Hidden Markov Model
Markov model
• In probability theory, a Markov model is a stochastic model used to model randomly changing systems.
• It is assumed that future states depend only on the current state, not on the events that occurred before it (that is, it assumes the Markov property; see the sampling sketch after this list).
• Very useful in bioinformatics, NLP, and speech recognition.
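To make the Markov property concrete, here is a minimal sampling sketch. The transition probabilities below are made-up placeholders, not values from these slides:

```python
import random

# Hypothetical 3-state weather chain; the probabilities are placeholders.
transition = {
    "Rain":   {"Rain": 0.5, "Cloudy": 0.3, "Sunny": 0.2},
    "Cloudy": {"Rain": 0.4, "Cloudy": 0.2, "Sunny": 0.4},
    "Sunny":  {"Rain": 0.1, "Cloudy": 0.3, "Sunny": 0.6},
}

def next_state(today):
    # Markov property: tomorrow's distribution depends only on today's state.
    outcomes = list(transition[today])
    weights = [transition[today][s] for s in outcomes]
    return random.choices(outcomes, weights=weights)[0]

weather = "Sunny"
for day in range(1, 6):
    weather = next_state(weather)
    print(f"day {day}: {weather}")
```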
HMM Example
• There are 3 states of weather: Rain, Cloudy, and Sunny.
• According to the Markov rule, the weather tomorrow depends only on the weather today.
• On any given day, Jack's mood depends on the weather. Assume Jack has the following two moods: sad and happy.
• We don't live in Jack's town, so we don't know the weather on a particular day; however, we can contact Jack over the internet and learn his mood.
• So the states of the Markov chain are unknown and hidden from us, but we can observe some variables that depend on those states.
• This is called a Hidden Markov Model.
• It consists of (a hidden Markov chain) & (a set of observed variables).
• Note: in this example the observed mood depends on today's weather.
HMM: Example
• Transition matrix and emission matrix (given in the slide's figure, with rows/columns ordered R, C, S).
• Stationary distribution vector (initial vector of states):
  Π = [0.218, 0.273, 0.509] for Rain, Cloudy, and Sunny respectively.
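The actual matrices appear only as a figure in the slides, so the transition matrix below is an assumed placeholder; the snippet just shows how a stationary vector like Π can be obtained, as the left eigenvector of the transition matrix for eigenvalue 1:

```python
import numpy as np

# Placeholder transition matrix over (Rain, Cloudy, Sunny); assumed values.
A = np.array([[0.5, 0.3, 0.2],
              [0.4, 0.2, 0.4],
              [0.1, 0.3, 0.6]])

# The stationary distribution pi satisfies pi @ A = pi: a left eigenvector
# of A with eigenvalue 1, normalised so its entries sum to 1.
eigvals, eigvecs = np.linalg.eig(A.T)
pi = np.real(eigvecs[:, np.argmax(np.isclose(eigvals, 1.0))])
pi /= pi.sum()
print(pi)  # the slides report Pi = [0.218, 0.273, 0.509] for their own matrix
```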
Cont..
• Let's consider this scenario: we can't observe the hidden states, but assume that the scenario shown in the figure (a particular three-day weather sequence together with Jack's moods) happened.
• Analysing this will help us understand the mathematics behind HMMs.
• What is the probability of this scenario occurring?
Cont..
• In this framework, the joint probability of the observed sequence y1, y2, …, yn and the hidden sequence x1, x2, …, xn can be factored as:

P(y1, y2, …, yn, x1, x2, …, xn) = p(x1) · ∏(i=2 to n) p(xi | xi−1) · ∏(i=1 to n) p(yi | xi)

p(x1): the initial probability of the hidden state variable.
∏(i=2 to n) p(xi | xi−1): the first-order Markov process governing the hidden variables.
∏(i=1 to n) p(yi | xi): the conditional probability of each observed variable given its corresponding hidden state variable.

Therefore the joint probability for the example scenario above is:
P(y1, y2, y3, x1, x2, x3) = p(x1) · p(y1|x1) · p(x2|x1) · p(y2|x2) · p(x3|x2) · p(y3|x3) = 0.00391.
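As a sketch, this scenario probability is just a product of matrix entries picked up along the sequence. The matrices and example sequences below are assumed placeholders (the slides' actual numbers live in the figure), so the printed value will not be the slides' 0.00391:

```python
import numpy as np

states, moods = ["Rain", "Cloudy", "Sunny"], ["Sad", "Happy"]
A = np.array([[0.5, 0.3, 0.2],      # placeholder transition p(x_i | x_{i-1})
              [0.4, 0.2, 0.4],
              [0.1, 0.3, 0.6]])
B = np.array([[0.8, 0.2],           # placeholder emission p(y_i | x_i)
              [0.5, 0.5],
              [0.1, 0.9]])
pi = np.array([0.218, 0.273, 0.509])  # initial vector from the slides

def joint_probability(hidden, observed):
    """p(x1) * prod_i p(xi|xi-1) * prod_i p(yi|xi) for one scenario."""
    x = [states.index(s) for s in hidden]
    y = [moods.index(m) for m in observed]
    p = pi[x[0]] * B[x[0], y[0]]
    for i in range(1, len(x)):
        p *= A[x[i - 1], x[i]] * B[x[i], y[i]]
    return p

print(joint_probability(["Rain", "Cloudy", "Sunny"], ["Sad", "Sad", "Happy"]))
```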
Cont..
• Now let us hide the states; we have only the observed variables.
• What is the most likely weather sequence for the observed mood sequence?
• There are many possible sequences: r|c|s, c|r|s, r|c|r, and so on (3³ = 27 for three days).
• To find the most likely weather sequence, we need to compute the probability of each sequence and find the one with the maximum probability. We can calculate each probability just as in the previous case.
Cont..
• With the help of a Python script (a sketch is given below) we computed the probabilities of all candidate sequences and found that the weather sequence shown in the slide's figure maximises the joint probability.
• The joint probability value is 0.04105.
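Here is one possible sketch of such a brute-force script. Again, the matrices and the observed mood sequence are assumed placeholders, so the maximiser and the slides' value 0.04105 will not be reproduced exactly:

```python
from itertools import product

import numpy as np

states, moods = ["Rain", "Cloudy", "Sunny"], ["Sad", "Happy"]
A = np.array([[0.5, 0.3, 0.2],      # placeholder transition matrix
              [0.4, 0.2, 0.4],
              [0.1, 0.3, 0.6]])
B = np.array([[0.8, 0.2],           # placeholder emission matrix
              [0.5, 0.5],
              [0.1, 0.9]])
pi = np.array([0.218, 0.273, 0.509])

observed = ["Sad", "Sad", "Happy"]  # hypothetical mood sequence
y = [moods.index(m) for m in observed]

best_p, best_x = -1.0, None
# Enumerate all 3**3 = 27 candidate hidden sequences and keep the best.
for x in product(range(len(states)), repeat=len(y)):
    p = pi[x[0]] * B[x[0], y[0]]
    for i in range(1, len(x)):
        p *= A[x[i - 1], x[i]] * B[x[i], y[i]]
    if p > best_p:
        best_p, best_x = p, x

print([states[i] for i in best_x], best_p)
```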
Next, we represent the hidden states by X and the observed variables by Y.
• We can rewrite our problem like this: find the particular sequence of X for which the probability of X given Y is maximum, i.e. X* = argmax over X of P(X|Y).
• Note that in a Hidden Markov Model we observe Y; that is why we have written P(X|Y).
• By using Bayes' theorem we can rewrite this as:
  P(X|Y) = P(Y|X) · P(X) / P(Y).
  Since P(Y) does not depend on X, maximising P(X|Y) is equivalent to maximising P(Y|X) · P(X).
• For p(x1) we must use the prior distribution vector on the initial state (the Π above).
• Replacing P(Y|X) and P(X) in the above expression, we get:
  P(Y|X) · P(X) = p(x1) · ∏(i=2 to n) p(xi | xi−1) · ∏(i=1 to n) p(yi | xi)
• Now we have the expression that we need to maximise. It is a product of the prior-distribution term, the n−1 transition-matrix terms, and the n emission-matrix terms; it is restated compactly below.
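Restating the slides' derivation in standard notation:

```latex
\begin{aligned}
X^{*} &= \arg\max_{X} P(X \mid Y)
       = \arg\max_{X} \frac{P(Y \mid X)\,P(X)}{P(Y)}
        && \text{(Bayes' theorem)} \\
      &= \arg\max_{X} P(Y \mid X)\,P(X)
        && \text{($P(Y)$ does not depend on $X$)} \\
      &= \arg\max_{x_1,\dots,x_n} \; p(x_1)\prod_{i=2}^{n} p(x_i \mid x_{i-1})
         \prod_{i=1}^{n} p(y_i \mid x_i)
\end{aligned}
```

Maximising this product over all candidate sequences X is exactly the brute-force search sketched earlier.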