These are Doubly Stochastic Models in the following sense:
Suppose that the observations in the plain Markov Model case were not deterministically output by the states. For example, assume S1 did not always mean ``Rainy'', but rather ``90% chance of rain, 7% chance of cloud and 3% chance of sun''. Both the state sequence and the symbol emitted from each state are then random. A model that takes this into account is called a Hidden Markov Model (HMM).
It is completely specified by $\lambda = (A, B, \pi)$, where $A$ and $\pi$ are the same as before and $B$ is the observation symbol probability distribution given by $B = \{b_j(k)\}$.
We assume that there are $M$ distinct output symbols, $V = \{v_1, v_2, \ldots, v_M\}$.
(Again assume that $N$ is implicit in $A$, $M$ is implicit in $V$ and $V$ is implicit in $B$).
The probability that the $k$th symbol is produced by the $j$th state is $b_j(k)$.
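As a rough illustration of what such a model holds, the transition probabilities, the observation probabilities and the initial state distribution can be written as row-stochastic tables. This is only a sketch: the three-state weather model, the variable names, and every number except the first row of `B` (taken from the ``Rainy'' example above) are assumptions made for illustration.

```python
import math

# Output symbols v_1..v_M (assumed weather vocabulary).
symbols = ["Rainy", "Cloudy", "Sunny"]

# A[i][j]: probability of moving from state i to state j (illustrative values).
A = [
    [0.6, 0.3, 0.1],
    [0.3, 0.4, 0.3],
    [0.1, 0.3, 0.6],
]

# B[j][k] = b_j(k): probability that state j produces symbol k.
B = [
    [0.90, 0.07, 0.03],  # S1: the 90% / 7% / 3% example from the text
    [0.20, 0.60, 0.20],  # S2: assumed
    [0.05, 0.15, 0.80],  # S3: assumed
]

# Initial state distribution (assumed values).
pi = [0.5, 0.3, 0.2]

N = len(A)        # number of states, implicit in A
M = len(symbols)  # number of output symbols, implicit in V


def is_stochastic(rows):
    """True if every row is non-negative and sums to 1."""
    return all(min(r) >= 0 and math.isclose(sum(r), 1.0) for r in rows)
```

Each row of `A` and `B`, and `pi` itself, must sum to 1; `is_stochastic(A)`, `is_stochastic(B)` and `is_stochastic([pi])` check exactly that.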
Hereafter, let $O = O_1 O_2 \cdots O_T$ be an observation sequence, $Q = q_1 q_2 \cdots q_T$ be a state sequence and $\lambda = (A, B, \pi)$ be an HMM.
Now, discuss how a given $O$ can be generated given $\lambda$.
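One plausible reading of the generation procedure is: draw the first state from the initial state distribution, emit a symbol from that state's row of B, then repeatedly move to a next state according to A and emit again. The sketch below follows that reading. The function name `generate`, the two-state model and all of its numbers are assumptions for illustration, not part of the lecture.

```python
import random


def generate(A, B, pi, T, rng=None):
    """Sample a state sequence Q and an observation sequence O of length T.

    A[i][j] is the transition probability from state i to state j,
    B[j][k] is the probability that state j emits symbol k,
    pi[i] is the probability of starting in state i.
    States and symbols are returned as 0-based indices.
    """
    rng = rng or random.Random()
    states = list(range(len(A)))
    symbols = list(range(len(B[0])))
    Q, O = [], []
    q = None
    for t in range(T):
        if t == 0:
            # First state comes from the initial distribution.
            q = rng.choices(states, weights=pi)[0]
        else:
            # Later states depend only on the previous state.
            q = rng.choices(states, weights=A[q])[0]
        Q.append(q)
        # The observation depends only on the current state.
        O.append(rng.choices(symbols, weights=B[q])[0])
    return Q, O


# Hypothetical two-state, three-symbol model.
A = [[0.7, 0.3],
     [0.4, 0.6]]
B = [[0.90, 0.07, 0.03],
     [0.10, 0.30, 0.60]]
pi = [0.5, 0.5]

Q, O = generate(A, B, pi, T=10, rng=random.Random(0))
```

An observer sees only `O`; the state sequence `Q` that produced it stays hidden, which is where the model gets its name.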
Three interesting questions about HMMs:
We will only look at answering questions (1) and (2) in these two lectures. Question (3) is reserved as an illustration of the EM algorithm in lectures 3/4 on Bayesian Learning.