Markov chains and HMMs
evergreen#foundations#sequences
Up: 10_Foundations-MOC
Bar: write a transition matrix, solve for its stationary distribution by hand, and explain the
forward and Viterbi recursions in one line each.
Markov chains
Markov property: the future depends on the present only — P(xt+1∣xt,xt−1,…)=P(xt+1∣xt).
Encode it as a transition matrix P where Pij=P(next=j∣now=i); rows sum to 1.
- n-step transitions: P(n)=Pn (matrix power).
- Distribution after one step: πt+1=πtP (row vector convention).
- Stationary distribution π solves π=πP with ∑iπi=1.
If the chain is irreducible + aperiodic, π is unique and πt→π
from any start.
Worked: stationary distribution (by hand)
P=[0.90.50.10.5]Solve π1=0.9π1+0.5π2 with π2=1−π1:
π1=0.9π1+0.5(1−π1) ⇒ 0.6π1=0.5 ⇒ π1=65≈0.833, π2=61.
Long-run, the system spends 5/6 of its time in state 1. The matrix is the ground truth you plant when
generating synthetic page-transition data (Generating background traffic).
Hidden Markov Models
Now the state is hidden; you see only emissions. Parameters λ=(A,B,π):
- A — state transition matrix (Markov over hidden states),
- B — emission probabilities bj(o)=P(observe o∣state j),
- π — initial state distribution.
Three canonical problems:
| Problem |
Question |
Algorithm |
| Likelihood |
P(O∣λ) of an observation sequence? |
Forward (sum) |
| Decoding |
most likely hidden path? |
Viterbi (max) |
| Learning |
best λ from data? |
Baum–Welch (EM) |
Forward vs Viterbi (the only difference is sum vs max)
Forward: αt(j)=[i∑αt−1(i)Aij]bj(ot)Viterbi: δt(j)=[imaxδt−1(i)Aij]bj(ot) (+ backpointers)Forward sums over all paths to get total probability; Viterbi maximizes to recover the single best path
(then traces backpointers). Both are O(T⋅S2) dynamic programming — linear in sequence length, the
reason HMMs are practical.
Worked: one forward step
Hidden {R,S}, π=(0.6,0.4), emission of "walk": bR(walk)=0.1, bS(walk)=0.6.
α1(R)=0.6⋅0.1=0.06,α1(S)=0.4⋅0.6=0.24.
P(o1=walk)=0.06+0.24=0.30. Extend with the recursion for α2, etc.
Why this matters here
By-hand exercise (meets the bar)
- Find the stationary distribution of P=[0.70.40.30.6].
(Answer: π=(4/7,3/7).)
- Add a second emission and compute α2 for the HMM above for the sequence (walk, shop).
Links