
Suppose there are $n$ bags, each one contains various numbers of balls of differing sizes.

A white ball of a certain size, distinct from all other balls in the bags, is thrown into one of the bags; bag $i$ is selected with probability $p_{i}$ (so that $\sum_{i=1}^{n} p_{i} = 1$).

The probability of picking the white ball from bag $i$, conditioned on its having been thrown into bag $i$, is $c_{i} \in [0,1]$. The probabilities $c_{i}$ may vary from bag to bag, depending on the size and number of the other balls there. Note that the closed interval $[0,1]$ is used for $c_{i}$ because the white ball could be so small compared to the balls already in bag $i$ that the chance of picking it out is essentially zero, while at the other extreme bag $i$ could have been empty beforehand.

I have $k$ tries to pick out the white ball from the bags, where $k \geq n$. I can pick a ball from any of the bags, in any order, and I can pick from the same bag as many times as I want (up to the maximum of $k$ tries in total). I stop as soon as I pick out the white ball; after each unsuccessful try, I replace the drawn ball in the same bag.

For example, if $n=3$ and $k=6$, a possible sequence of bags to pick from, assuming that I only get the white ball on the last attempt (or that I don't get it at all), could be $3, 1, 3, 2, 2, 3$. Another possibility is $1,1,1,1,1,2$, etc.

My question is: how do I devise a strategy for the number of tries for each bag, so that the probability of picking out the white ball in $k$ attempts or fewer is maximised?

  • Interesting problem. Is this homework? I doubt that there is a closed-form general solution. BTW: if I understood it right, what matters here is not the sequence order of the search, but only how many tries I assign to each bag, right?
    – leonbloy
    Commented Oct 10, 2019 at 17:09
  • I changed the notation and statement a little, hope that's ok with you. BTW, "picking out a white ball by the $k$-th attempt" means in $k$ or fewer attempts, no?
    – leonbloy
    Commented Oct 10, 2019 at 22:06
  • You know the $b_i$, is that correct?
    – saulspatz
    Commented Oct 10, 2019 at 22:10
  • My immediate guess would be to start with the smallest bag and, in case of failure, recompute the probability of the ball's being in each bag. Then always try the bag with the greatest probability. Not only does that give you the best chance of success on the current draw, it should give you the most information in case of failure. That is, it should allow you to update the probabilities most effectively.
    – saulspatz
    Commented Oct 10, 2019 at 22:20
  • Suppose $b_1=1$ and you draw from bag one ten times without finding the white ball. Don't you start to believe that the ball is most likely in another bag? This is what Bayes' rule is all about.
    – saulspatz
    Commented Oct 11, 2019 at 0:12

1 Answer

The problem is stated as a global optimization (maximize the probability of finding the ball in at most $k$ attempts), but it seems true (assumed here, not proved) that this is equivalent to iteratively (greedily) maximizing the probability of finding the ball on the next attempt.

Let $c_i$ be the probability of finding the white ball when attempting an extraction from bag $i$, conditioned on the white ball being there. This doesn't change.

Let $X$ be the position of the ball, and let $p_i=P(X=i)$ be the probability of the white ball being in bag $i$. Initially this is the given prior; after each (failed) attempt, we update it to the corresponding posterior probability.

Let $F_i$ be the event that we fail to get the white ball when we try bag $i$.

Maximizing the probability of success on the next attempt means minimizing $P(F_i)$ over the choice of bag. But $P(F_i) = 1 - d_i$, where $d_i = p_i \, c_i$. Hence, denoting by $s$ the selected bag, we have $s= \mathrm{argmax}_i \, d_i$, and the probability of success is $d_s=\max_i d_i$.

Given that we failed, the probabilities $p_i$ are updated via Bayes' rule as follows:

For $i=s$:

$$ \begin{align} p'_s=P(X=s \mid F_s)&=\frac{P(F_s\mid X=s)P(X=s)}{P(F_s\mid X=s)P(X=s)+P(F_s\mid X\ne s)P(X\ne s)}\\ &=\frac{(1-c_s)p_s}{(1-c_s)p_s + (1-p_s)}\\ &=\frac{p_s-d_s}{1-d_s} \tag 1 \end{align}$$

For $i\ne s$ we first compute

$$ \begin{align} P(F_s \mid X \ne i) &= P(F_s \mid X=s, X \ne i ) P(X=s\mid X \ne i) + P(F_s \mid X \ne s, X \ne i)P(X\ne s\mid X \ne i) \\ &= (1-c_s) \frac{p_s}{1-p_i} + 1\times \left(1 - \frac{p_s}{1-p_i}\right) \\ &= \frac{1}{1-p_i}(1- p_i - c_s p_s) \end{align} $$

And then

$$ \begin{align} p'_i =P(X=i \mid F_s)&= \frac{P(F_s\mid X=i)P(X=i)}{P(F_s\mid X=i)P(X=i)+P(F_s\mid X\ne i)P(X\ne i)} \\ &= \frac{ 1\times p_i }{1\times p_i + \frac{1}{1-p_i}(1- p_i - c_s p_s) (1-p_i) }\\ &=\frac{p_i}{1 -c_s p_s} = \frac{p_i}{1 - d_s} \tag{2} \end{align}$$

Using $(1)$ and $(2)$ together, we update the probabilities $p_i$ and $d_i$ and repeat.
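
For concreteness, here is a minimal Python sketch of this greedy loop. The values of `p` and `c` are placeholders I made up (not the numbers of the example that follows), so only the structure, i.e. the selection rule and the updates $(1)$–$(2)$, is meant to match.

```python
import numpy as np

# Placeholder inputs (assumed values, not those of the example below)
p = np.array([0.4, 0.3, 0.2, 0.1])   # p_i = P(X = i); must sum to 1
c = np.array([0.5, 0.6, 0.3, 0.9])   # c_i = P(pick white ball | X = i)
k = 5                                 # total number of attempts

prob_found = 0.0        # P(white ball found by the current attempt)
prob_all_failed = 1.0   # P(all attempts so far have failed)

for t in range(1, k + 1):
    d = p * c                         # d_i = p_i c_i
    s = int(np.argmax(d))             # greedy: bag maximizing P(success now)
    ds = d[s]
    prob_found += prob_all_failed * ds
    prob_all_failed *= 1.0 - ds
    # Bayes updates after a failed attempt on bag s: (2) for i != s, (1) for s
    ps = p[s]
    p = p / (1.0 - ds)
    p[s] = (ps - ds) / (1.0 - ds)
    print(f"attempt {t}: bag {s + 1}, P(found so far) = {prob_found:.4f}")
```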

An example with $n=4$ bags and $k=5$:

[Image: a table listing, for each of the 5 attempts, the updated $p_i$ and $d_i$ for the $n=4$ bags.]

In yellow, the selected bags in each attempt. In orange, the probability of having found the white ball at that point.


Update: For large $k$, we could attempt the maximization using (abusing) Lagrange multipliers. I'll spare the details (ask if you want them), but the result is the following.

Let $k_i$ denote how many times bag $i$ is selected in the $k$ extractions (hence $\sum_{i=1}^n k_i=k$). Then the probability of failure is

$$\begin{align} P(F; k_1,k_2 \cdots k_n) &= \sum_{i=1}^n P(X=i) P(F \mid X=i; k_1,k_2 \cdots k_n) \\ &= \sum_{i=1}^n p_i (1 - c_i)^{k_i} = \sum_{i=1}^n p_i e^{ -a_i \,k_i} \end{align}$$

where $a_i = - \log(1-c_i)$.

We wish to minimize this function of the variables $(k_1,k_2, \dots, k_n)$ subject to the constraint $C(k_1,k_2, \dots, k_n) = \sum_{i=1}^n k_i-k =0$.
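
For small $n$ and $k$, this minimization can also be done by brute force, which is useful as a check; a sketch (same placeholder `p`, `c` as in the code above — the enumeration blows up for large $k$, which is where the approach below comes in):

```python
from itertools import product

def prob_failure(p, c, ks):
    """P(F; k_1..k_n) = sum_i p_i (1 - c_i)^{k_i}."""
    return sum(pi * (1.0 - ci) ** ki for pi, ci, ki in zip(p, c, ks))

def best_allocation(p, c, k):
    """Brute-force the integer allocation minimizing P(F)."""
    candidates = (ks for ks in product(range(k + 1), repeat=len(p))
                  if sum(ks) == k)
    return min(candidates, key=lambda ks: prob_failure(p, c, ks))

# Placeholder inputs, as above
print(best_allocation([0.4, 0.3, 0.2, 0.1], [0.5, 0.6, 0.3, 0.9], 10))
```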

Lagrange multipliers would be apt for this... except that our variables are integers, not reals. Anyway, abusing the method and setting the partial derivatives to zero gives:

$$ - p_i a_i e^{ -a_i \,k_i} + \lambda = 0 \hskip{1 cm} i=1,2 \dots n$$

which can be written as

$$k_i = \frac{\log(\, p_i a_i) + \beta }{a_i} \tag 3$$

where $\beta$ is a number obtained by solving (numerically, it seems) the constraint $\sum_i k_i = k$ together with $(3)$.

Of course, $(3)$ gives non-integer values; we should expect that rounding to the nearest integers (while still respecting the sum constraint) gives at least a good approximation to the optimum.

For example, with $k=10$ and same $p_i$, $c_i$ as in the example above, I get $k_i=(4.198, 2.941, 1.538, 1.330)$ while the optimal values are $(4, 3, 2 ,1)$.
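
A sketch of how $\beta$ can be found numerically: each $k_i$ in $(3)$ is increasing in $\beta$, so bisection on the constraint $\sum_i k_i = k$ works. Again the inputs are placeholders, and the sketch ignores the implicit $k_i \ge 0$ constraint (bags with very small $p_i a_i$ would need to be clamped to zero and the rest re-solved):

```python
import numpy as np

def lagrange_allocation(p, c, k, lo=-50.0, hi=50.0, tol=1e-10):
    """Solve (3) by bisection on beta; sum_i k_i(beta) is increasing in beta.
    Assumes c_i < 1 and that all resulting k_i come out nonnegative."""
    p = np.asarray(p, dtype=float)
    a = -np.log(1.0 - np.asarray(c, dtype=float))  # a_i = -log(1 - c_i)

    def ks(beta):
        return (np.log(p * a) + beta) / a          # equation (3)

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ks(mid).sum() < k:
            lo = mid
        else:
            hi = mid
    return ks(0.5 * (lo + hi))

# Real-valued k_i, to be rounded subject to sum k_i = k (placeholder inputs)
print(lagrange_allocation([0.4, 0.3, 0.2, 0.1], [0.5, 0.6, 0.3, 0.9], 10))
```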

  • +1 This is what I meant. I see now that I sloppily said "the bag with the greatest probability", when I meant to say "the bag that gives the greatest probability of success on the next draw".
    – saulspatz
    Commented Oct 11, 2019 at 16:07
  • Ok, so I see from your answer that the crucial point is that the events "failure on the $i$-th try" and "success on the $(i+1)$-th try" are NOT independent, right? @leonbloy
    – Alex
    Commented Oct 11, 2019 at 22:38
  • I wouldn't say that. The crucial point is that "$F_s=$ failure when choosing bag $s$" and the event that bag $i$ has the white ball are not independent. Hence $P(X=i \mid F_s) \ne P(X = i)$. @saulspatz explained this in the comments.
    – leonbloy
    Commented Oct 11, 2019 at 23:08
  • @Alex Done.....
    – leonbloy
    Commented Oct 12, 2019 at 2:02
  • Thanks - I verified the calculation before you posted the updated answer (hence my deleted comment). I am wondering: would there be any further way to optimise the solution in terms of $k$? Initially I thought it might be some variant of the Knapsack problem... I'd be very interested to see more details of your Lagrange multiplier approach. @leonbloy
    – Alex
    Commented Oct 12, 2019 at 2:17
