Suppose we have a simple Bayesian network with two rows of nodes: $x_1, x_2, \ldots, x_n$ and $y_1, y_2, \ldots, y_n$. Each node $x_k$ takes state 0 or 1 with equal probability. Each node $y_k$ takes state 1 with probability $p_{k,0}$ if $x_k$ is in state 0, and with probability $p_{k,1}$ if $x_k$ is in state 1.

Is exponential time required to compute the probability that all $y_k$ are 1? Please provide a proof either way.

  • It's probably best if you first tried to solve it on your own. – Commented Apr 8, 2019 at 21:28
  • @SapereAude instead of deleting the question you could answer the question yourself, expanding on the order of the factoring and marginalizing. This way if anyone else comes across this problem or a similar one, they can learn from your example and about factoring orders in general. – ryan, Apr 9, 2019 at 0:35
  • @ryan A meaningful suggestion. – SapereAude, Apr 9, 2019 at 4:14

1 Answer


Exponential time is not required for finding the requested probability. In fact, only linear time is needed, as we will see below.

First let's define some additional notation. Let $X = (x_1, x_2, \ldots, x_n)$ and $Y = (y_1, y_2, \ldots, y_n)$ -- that is, two $n$-tuples of our variables. Let $\mathbf{1}$ denote an $n$-tuple of 1s, $ (1, 1, \ldots, 1)$. And let $S_n$ denote the set of all possible $n$-tuples of $0$ and $1$. In CS literature, the elements of $S_n$ are sometimes called "strings". If $\sigma$ is one such element, we write $\sigma(k)$ for its $k$th component.

In this new notation, the task is to compute $\Pr(Y = \mathbf{1})$. We can marginalize out the configuration of 0s and 1s on $X$ as follows: $$\Pr(Y = \mathbf{1}) = \sum_{\sigma \in S_n} \Pr(Y = \mathbf{1} | X = \sigma) \Pr (X = \sigma).$$

From the prompt we know that each $x_k$ is 0 or 1 with equal probability, so $\Pr (X = \sigma) = 2^{-n}$ for every $\sigma \in S_n$. Additionally, for any given $\sigma \in S_n$, we have $\Pr(Y = \mathbf{1} | X = \sigma) = \prod_{k=1}^n p_{k,\sigma(k)}$, since, conditioned on $X = \sigma$, the nodes $y_k$ are independent and each takes state 1 with probability $p_{k,\sigma(k)}$. Combining these observations, we obtain $$\Pr(Y = \mathbf{1}) = 2^{-n} \sum_{\sigma \in S_n} \prod_{k=1}^n p_{k,\sigma(k)}.$$
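
To make this concrete, here is a minimal brute-force sketch in Python (the function name and the representation of the probabilities as a list of pairs are my own choices, not part of the question). It evaluates the expression above by enumerating all $2^n$ strings, so it runs in exponential time and serves only as a baseline:

```python
from itertools import product

def prob_all_ones_bruteforce(p):
    """p[k] = (p_{k,0}, p_{k,1}); returns Pr(Y = (1, ..., 1)) by enumeration."""
    n = len(p)
    total = 0.0
    for sigma in product((0, 1), repeat=n):      # all 2^n strings in S_n
        term = 1.0
        for k in range(n):
            term *= p[k][sigma[k]]               # Pr(y_k = 1 | x_k = sigma(k))
        total += term
    return total / 2 ** n                        # Pr(X = sigma) = 2^{-n} for every sigma
```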

Our next step is to factor the sum appearing on the right-hand side of this equation: $$\sum_{\sigma \in S_n} \prod_{k=1}^n p_{k,\sigma(k)} = \prod_{k=1}^n (p_{k,0}+p_{k,1}).$$ We prove this by induction on $n$. When $n = 1$, both sides equal $p_{1,0}+p_{1,1}$, so the base case holds. For the inductive step, note that every $\sigma \in S_n$ is a string in $S_{n-1}$ extended by a final component $\sigma(n) \in \{0, 1\}$, so splitting the sum according to that final component gives $$\sum_{\sigma \in S_n} \prod_{k=1}^{n} p_{k,\sigma(k)} = (p_{n,0}+p_{n,1}) \sum_{\sigma \in S_{n-1}} \prod_{k=1}^{n-1} p_{k,\sigma(k)}.$$ By the inductive hypothesis, the remaining sum equals $\prod_{k=1}^{n-1} (p_{k,0}+p_{k,1})$, and multiplying through by $(p_{n,0}+p_{n,1})$ yields the right-hand side of the identity, completing the proof.
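
As a quick numerical sanity check of this identity (again just a sketch, with names of my choosing), one can compare the expanded sum of products against the factored product of sums for random probabilities:

```python
import random
from itertools import product
from math import prod, isclose

n = 6
p = [(random.random(), random.random()) for _ in range(n)]

# Left-hand side: sum over all 2^n strings of the product of chosen entries.
lhs = sum(prod(p[k][sigma[k]] for k in range(n))
          for sigma in product((0, 1), repeat=n))
# Right-hand side: product of the per-index sums.
rhs = prod(p[k][0] + p[k][1] for k in range(n))

assert isclose(lhs, rhs)
```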

This leaves us with the result $\Pr(Y = \mathbf{1}) = 2^{-n} \prod_{k=1}^n (p_{k,0}+p_{k,1})$. If the probabilities $p_{k,0}$ and $p_{k,1}$ are stored in an array or hash table, each look-up takes $O(1)$ time, so the whole product can be computed with a single for-loop in $O(n)$ time.
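
Putting this together, here is a minimal sketch of the linear-time computation (the function name and input format are assumptions, matching the brute-force sketch above):

```python
def prob_all_ones(p):
    """p[k] = (p_{k,0}, p_{k,1}); returns Pr(Y = (1, ..., 1)) in O(n) time."""
    result = 1.0
    for p0, p1 in p:              # one O(1) look-up and multiply per k
        result *= (p0 + p1) / 2
    return result
```

For example, with $p = [(0.3, 0.9), (0.5, 0.2)]$ this returns $2^{-2}(0.3+0.9)(0.5+0.2) = 0.21$, matching the brute-force enumeration above.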
