
While studying the forward-backward algorithm, I came across the question below. I was unable to solve it despite trying for a long time.

Consider a two-bit register. The register has four possible states: 00, 01, 10 and 11. Initially, at time 0, the contents of the register is chosen at random to be one of these four states, each with equal probability. At each time step, beginning at time 1, the register is randomly manipulated as follows: with probability 0.5, the register is left unchanged; with probability 0.25, the two bits of the register are exchanged (e.g., 01 becomes 10); and with probability 0.25, the right bit is flipped (e.g., 01 becomes 00). After the register has been manipulated in this fashion, the left bit is observed. Suppose that on the first three time steps, we observe 0, 0, 1.

Q) Use the forward-backward algorithm to determine the probability of being in each state at time $k$, given all three observed bits, for $k = 0, 1, 2, 3$.

  • What type of model are you considering? Is it a Markov chain? Commented Apr 12, 2017 at 15:10
  • It's a Hidden Markov Model. Commented Apr 12, 2017 at 15:11
  • This is homework...
    – SmallChess
    Commented Apr 12, 2017 at 16:00
  • hint: $\left[ \begin{array}{cccc} .75 & .25 & 0 & 0\\ .25 & .5 & .25 & 0\\ 0 & .25 & .5 & .25 \\ 0 & 0 & .25 & .75 \end{array}\right]$ is the transition matrix if you use the order of the states in your question (note that from 00 both "unchanged" and "swap" return 00, so the first row is $.75, .25, 0, 0$), and $g(O_t=0|X_t=(0,0)) = g(O_t=0|X_t=(0,1))=1$ and $g(O_t=1|X_t=(1,0)) = g(O_t=1|X_t=(1,1))=1$ are the observation distributions
    – Taylor
    Commented Apr 12, 2017 at 16:03
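
The transition matrix can be derived mechanically by enumerating the three manipulations from each state; a minimal sketch in Python (state order 00, 01, 10, 11 assumed, as in the question):

```python
import numpy as np

# States indexed 0..3 in the order 00, 01, 10, 11;
# for state s, the left bit is s // 2 and the right bit is s % 2.
A = np.zeros((4, 4))
for s in range(4):
    left, right = s // 2, s % 2
    A[s, s] += 0.5                        # register left unchanged
    A[s, 2 * right + left] += 0.25        # the two bits are exchanged
    A[s, 2 * left + (1 - right)] += 0.25  # the right bit is flipped
print(A)
```

Because a swap leaves 00 (and 11) unchanged, the corner entries come out as 0.75 rather than 0.5, and each row sums to 1.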

1 Answer


Assume that we have computed the forward and backward probabilities $\alpha(z_t)$ and $\beta(z_t)$ for each of the $K$ hidden states $z_t$ at time $t$. Then the probability of any quantity $x_t$ that depends on the hidden state at time $t$ is

$$p(x_t|D)=\sum_{z_t=1}^K p(x_t|z_t)p(z_t|D)$$

where $D$ is the observed data. In your problem the quantity of interest is the hidden state itself (the register contents), so all you need is the smoothed posterior $p(z_t|D)$. From equations 13.32 and 13.33 in Bishop (2006) we see that

$$ p(z_t|D)=\frac{p(D|z_t)p(z_t)}{p(D)}=\frac{\alpha(z_t) \beta(z_t)}{\sum_{z_t=1}^K \alpha(z_t) \beta(z_t)}$$
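
Applied to the register problem, the two recursions and the normalization can be sketched as follows (state order 00, 01, 10, 11 assumed; the observations 0, 0, 1 arrive at times 1, 2, 3, and there is no observation at time 0):

```python
import numpy as np

# Transition matrix for the three manipulations, states ordered 00, 01, 10, 11.
A = np.array([[0.75, 0.25, 0.00, 0.00],
              [0.25, 0.50, 0.25, 0.00],
              [0.00, 0.25, 0.50, 0.25],
              [0.00, 0.00, 0.25, 0.75]])
# Deterministic emission of the left bit: g[s, o] = p(observe o | state s).
g = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
obs = [0, 0, 1]            # observed left bits at times 1, 2, 3
pi = np.full(4, 0.25)      # uniform initial distribution at time 0

# Forward pass: alpha[t, s] = p(o_1..o_t, z_t = s)
alpha = np.zeros((4, 4))
alpha[0] = pi              # no observation at time 0
for t, o in enumerate(obs, start=1):
    alpha[t] = g[:, o] * (alpha[t - 1] @ A)

# Backward pass: beta[t, s] = p(o_{t+1}..o_3 | z_t = s)
beta = np.ones((4, 4))     # beta at the final time is 1
for t in range(2, -1, -1):
    beta[t] = A @ (g[:, obs[t]] * beta[t + 1])   # obs[t] is o_{t+1}

# Smoothed marginals: gamma[t, s] = p(z_t = s | o_1, o_2, o_3)
gamma = alpha * beta
gamma /= gamma.sum(axis=1, keepdims=True)
print(np.round(gamma, 4))
```

For these observations the posterior is degenerate at the last two steps: observing left bit 1 at time 3 after left bit 0 at time 2 forces the state sequence through 01 at time 2 and 10 at time 3, since 01 is the only left-bit-0 state that can reach a left-bit-1 state in one manipulation.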

Literature: Bishop, C. M. (2006). Pattern Recognition and Machine Learning. Springer, New York.

