$\begingroup$

I was reading the book of Jurafsky about HMM and came along this graphic:

[Figure: forward-algorithm trellis for the ice-cream HMM, with observations 3, 1, 1]

The problem that I have is in the interpretation of the graph. According to the problem the hidden states are the weather conditions (Hot and Cold), and the observations are the number of ice creams a person has eaten (in the figure 3, 1, 1). I know that the forward algorithm allows us to determine the likelihood of the observations given the transition probability matrix and the emission probabilities, but how do I interpret the values of, for example, $\alpha_{2}(2)$?

Does the calculated value of 0.0404 mean that the probability of the observations 3 and 1 along the hidden-state paths start,H,H and start,C,H is 0.0404? Is that right, or should it be described another way? Similarly, I suppose the interpretation of $\alpha_{2}(1)$ is that the forward probability of being in state 1 (Cold) at time 2, with partial observations 3, 1, is 0.069. Is that correct?

Also, can the values of $\alpha_{2}$ be summed up at any point?

Thanks

$\endgroup$

$\begingroup$

From page 6 of that same reference: "$\alpha_t(j)$ represents the probability of being in state $j$ after seeing the first $t$ observations, given the automaton $\lambda$. ... Formally, each cell expresses the following probability: $$ \alpha_t(j) = P(o_1, o_2, \dots, o_t, q_t = j|\lambda)." $$

So to answer your question, $\alpha_2(2)$ is the probability of being in the 2nd state (Hot) at time $t=2$. Your interpretation of the value 0.0404 sounds right to me: 0.0404 is the probability of ending up in the Hot hidden state after 2 steps and observing the sequence $(o_1, o_2) = (3,1)$, summing over all possible paths that terminate in $H$ (in this case, just $HH$ and $CH$).
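To make the recursion concrete, here is a small sketch of the forward algorithm for this example. The transition and emission probabilities below are the ones I read off the figure in the book (e.g. $P(H|\text{start})=0.8$, $P(1|H)=0.2$); treat them as an assumption if your edition differs.

```python
# Forward algorithm for the ice-cream HMM.
# Parameters assumed from the book's figure, not guaranteed for every edition.
states = ["H", "C"]
pi = {"H": 0.8, "C": 0.2}                       # P(q_1 = j | start)
A = {"H": {"H": 0.6, "C": 0.4},                 # transition probs a_ij
     "C": {"H": 0.5, "C": 0.5}}
B = {"H": {1: 0.2, 2: 0.4, 3: 0.4},             # emission probs b_j(o)
     "C": {1: 0.5, 2: 0.4, 3: 0.1}}

def forward(obs):
    """Return the alpha trellis: one {state: alpha_t(state)} dict per time step."""
    # initialization: alpha_1(j) = pi_j * b_j(o_1)
    alpha = [{j: pi[j] * B[j][obs[0]] for j in states}]
    # recursion: alpha_t(j) = sum_i alpha_{t-1}(i) * a_ij * b_j(o_t)
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append({j: sum(prev[i] * A[i][j] for i in states) * B[j][o]
                      for j in states})
    return alpha

alpha = forward([3, 1, 1])
print(alpha[1]["H"])   # alpha_2(Hot)  ≈ 0.0404
print(alpha[1]["C"])   # alpha_2(Cold) ≈ 0.069
```

With these parameters the sketch reproduces exactly the two cells you asked about: the sum over `i` inside the recursion is the "summing over all paths ending in $j$" step.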

And yes, the $\alpha_2$'s will appear in a weighted sum in the next Forward iteration to compute the $\alpha_3$'s.

Alternatively, if you were asking whether it makes sense to sum up the $\alpha_2$'s, the answer is still yes. Summing up the $\alpha_t$'s gives the marginal probability of observing the sequence $(o_1, \dots, o_t)$: $$ \sum_{j=1}^N \alpha_t(j) = \sum_{j=1}^N P(o_1, o_2, \dots, o_t, q_t = j|\lambda) = P(o_1, o_2, \dots, o_t | \lambda) $$ In fact, it's easy to check this for $t=1$. Plugging in 1, 2, and 3 for $o_1$ and summing up, you'll see that you get 1, which confirms that $\sum_{j=1}^N \alpha_1(j) = P(o_1|\lambda)$ is a valid probability distribution over the number of ice creams eaten on the first day.
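You can verify that $t=1$ check numerically. Again, the $\pi$ and emission values below are my reading of the figure's parameters, so treat them as assumptions:

```python
# Sanity check at t=1: sum_j alpha_1(j) = P(o_1), and summing that
# marginal over all possible observations o_1 in {1, 2, 3} gives 1.
# (pi and B assumed from the book's figure.)
pi = {"H": 0.8, "C": 0.2}
B = {"H": {1: 0.2, 2: 0.4, 3: 0.4},
     "C": {1: 0.5, 2: 0.4, 3: 0.1}}

# P(o_1) = sum_j pi_j * b_j(o_1), for each possible o_1
p_o1 = {o: sum(pi[j] * B[j][o] for j in pi) for o in (1, 2, 3)}
print(p_o1)                # ≈ {1: 0.26, 2: 0.40, 3: 0.34}
print(sum(p_o1.values()))  # ≈ 1.0
```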

$\endgroup$
