6
$\begingroup$

We have $3$ urns. At each round a ball is placed is placed into one of them, at random, with uniform probability. The game stops when some urn has $100$ balls.

What is expected duration of the game (number of rounds)?

Results from a simulation:

enter image description here

$\endgroup$
21
  • $\begingroup$ each round we pick one enemy and we hit him once $\endgroup$
    – quester
    Commented Sep 24, 2019 at 15:18
  • 1
    $\begingroup$ @callculus yes. $\endgroup$
    – quester
    Commented Sep 24, 2019 at 15:40
  • 1
    $\begingroup$ @quester I don´t know what the mean at the graph is. But it does not clearly matches my result where I used your formula. Do you think your formula is wrong? $\endgroup$ Commented Sep 24, 2019 at 17:32
  • 1
    $\begingroup$ @Sudix problem with formula is that there should be $\sum p = 1$, but unfortunately it is $\sum p < 1$ it should be summed up and then multiplied by inverse $\sum p = c => p_{new} = \frac{p}{c}$ then $EX = \sum x p_{new}$ $\endgroup$
    – quester
    Commented Sep 28, 2019 at 16:21
  • 2
    $\begingroup$ @sudix Those are the probabilities for the state (100,i,j). However, that does not mean that this 100 was reached in the last placement of the ball. You can reach the state (100,i,j) via (99,i,j) but also via (100,i-1,j) or (100,i,j-1). See in my answer where I compute the case that a an urn has 100 in it in the last ball placement by computing the probability that the urn has 99 in it (and no then you have 1/3 probability to reach 100 in the next step). $\endgroup$ Commented Sep 28, 2019 at 21:16

2 Answers 2

4
$\begingroup$

Computation of the distribution

Let $n$ be the number of balls to draw. Let $m$ be the number of urns. Let $k$ be the target number of balls when the game stops.

You could express the probability to reach a stop in $n$ balls in terms of the probability that the number of balls in each urn is $k-1$ or less (the cumulative distribution).

  • The number of ways to put $n$ balls into $m$ urns is $m^n$ (with or without reaching the stop condition).

  • The number of ways to put $n$ balls into $m$ urns but not having reached the stop condition (that is having at most $k-1$ in each of them) can be found by enumerating over the set $S$ of vectors $\vec{k}$ (the numbers $(k_i)$ depicting the the number of balls in each $i$-th urn) that satisfy the condition $$\sum_i k_i = n \quad \text{and} \quad \forall i:0 \leq k_i < k$$ And for each of the vector $\vec{k}$ (a set of numbers $k_1,k_2,k_3$) that satisfies these conditions we compute the the number of ways to distribute the balls into the urns with those numbers which is $$\text{number of ways to put $k_i$ balls in urn $i$} = \frac{n!}{\prod_i{k_i!}}$$ Then we take the sum over all of this $$P(N \leq n) = \frac{1}{m^n}\sum_{\vec{k} \in S} \frac{n!}{\prod_{k_i\in \vec{k}}{k_i!}} $$ where the sum over $\vec{k} \in S$ means the summation over all vectors with numbers $k_i$ that satisfy the conditions and the product over $k_i \in \vec{k}$ means the product with all $k_i$ in the $\vec{k}$.

See below for an implementation in R code:

output image from code

# computation
n <- 99
sum <- rep(0,3*n+1)
for (k1 in 0:n) {
  for (k2 in 0:n) {
    for (k3 in 0:n) {
      t = (k1+k2+k3)
      sum[t+1] = sum[t+1]+exp(lfactorial(t)-lfactorial(k1)-lfactorial(k2)-lfactorial(k3))
    }
  }
}
x <- c(0:(3*n))
Xcum <- c(sum/3^x,0)

# simulation
set.seed(1)

draw <- function() {
  s <- sample(c(1:3),size = 300, replace=TRUE)
  min(which((cumsum(s==1)==100) | (cumsum(s==2)==100) | (cumsum(s==3)==100)))
}
q <- replicate(10^5,draw())

# computation using beta function

drn <- function(n,k) {
  a <- max(0,n-2*k+1)
  b <- min(k-1,n-k)
  choose(n-1,k-1) * 2^(n-k) / 3^(n-1) *
      ( zipfR::Ibeta(0.5,n-k-b+1,b+1)/beta(n-k-b+1,b+1) - 
        zipfR::Ibeta(0.5,n-(k-1)-(a-1),(a-1)+1)/beta(n-(k-1)-(a-1),(a-1)+1) )
  #choose(n-1,k-1) * 2^(n-k) / 3^(n-1) * (pbinom(b,n-k,0.5)-pbinom(a-1,n-k,0.5))
}
drn <- Vectorize(drn)


#plotting both together

h <- hist(q, breaks=c(0:298)+0.5, xlim=c(200,300),
          xlab = "N", ylab = "probability", freq = FALSE, main="")
lines(1:298,-diff(Xcum),col=2)
lines(c(100:298),drn(c(100:298),100),col=3)

multinomial distribution

You can view this as related to the multinomial distribution which has the pdf

$$\frac {n!}{k_1! k_2! ... k_m!} p_1^{k_1} p_2^{k_2} ... p_m^{k_m} $$

which becomes for equal $p_i = 1/m $ the following

$$\frac {1}{m^n}\frac {n!}{k_1! k_2! ... k_m!} $$

which shows similarity with the expresssion before. Then the probability that for $n$ draws you did not reach 100 yet is equal to the probability that in 100 draws each $k_i<100$. And you can see the computation of you probability density as the computation of the CDF for the multinomial distribution.


Expression in terms of regularized incomplete beta function

For the case of three urns we may write an explicit expression for the probability in terms of the regularized incomplete beta function.

The probability that there are in the $n$-th draw $k$ balls in the first urn and less than $k$ in the others is, is equal to the 1/3 of the probability that there are in the $n-1$ draw $l= k-1$ balls in the first urn and equal or less than $l$ in the others is :

$$\begin{array}{rcrl} P_{k_1=l=k-1,k_2 \leq l,k_3 \leq l \vert n-1} &=& &\sum_{a \leq k_2 \leq b} \frac {1}{3^{n-1}}\frac {(n-1)!}{l! k_2! (n-1-l-k_2)!} \\ & = & \frac{(n-1)!}{l! 3^{n-1}} &\sum_{a \leq k_2 \leq b} \frac {1}{k_2! (n-1-l-k_2)!} \\ & = & {{n-1}\choose{l}} \frac{2^{n-1-l}}{3^{n-1}}& \sum_{a \leq k_2 \leq b} \underbrace{{n-1-l\choose{k_2}} \frac{1}{2^{n-1-l}}}_{\text{this is a binomial distribution}} \\ & = & {{n-1}\choose{k-1}} \frac{2^{n-k}}{3^{n-1}} & \left( I_{1/2}(n-k-b+1,b+1) - I_{1/2}(n-k-a+2,a) \right) \end{array}$$

with $a = max(0,n-2k+1)$ and $b = min(k-1,n-k)$


Computation of the expectation value

In the first part we computed $P(n>k) = 1-P(n\leq k)$. To obtain the mean you can sum over all of these. $\mu = \sum 1-P(n\leq k)$. This will give:

$$\sum_{k_1=0}^{99}\sum_{k_2=0}^{99}\sum_{k_3=0}^{99} \frac{1}{3^{k_1+k_2+k_3}} \frac{(k_1+k_2+k_3)!}{k_1!k_2!k_3!} = 274.9186 $$

$\endgroup$
1
  • $\begingroup$ nice solution, specially in the first part. But $(k_1,k_2,k_3)$ is not a set, it is a triple (vector) $\endgroup$
    – G Cab
    Commented Sep 27, 2019 at 18:07
3
$\begingroup$

The expected time can be expressed in terms of the incomplete gamma function as follows (inspired by this paper and comments here):

In general: We want the expected value of the time to wait $T$ until one of the $3$ urns contains $n$ ($=100$) balls. Then

$$E_{n}[T] = \sum_{t=1}^\infty P(T\ge t) = \sum_{t=0}^\infty p_{n}(t) \tag1$$

where $p_{n}(t) $ is the probability that after $t$ rounds ($t$ balls) all $3$ urns have less than $n$ balls. But this is equivalent to

$$ \sum_{x=0}^{n-1}\sum_{y=0}^{n-1}\sum_{z=0}^{n-1} \frac{1}{3^{x+y+z}} \frac{(x+y+z)!}{x! \, y! \, z!} \tag2$$

Further, we use a property of the (upper) incomplete gamma function:

$$\begin{align} \left( \frac{\Gamma(n,a)}{\Gamma(n)} \right)^3 &= \left( e^{-a} \sum_{r=0}^{n-1}\frac{a^r}{r!} \right)^3 \\&= e^{-3a} \sum_{x=0}^{n-1}\sum_{y=0}^{n-1}\sum_{z=0}^{n-1}\frac{a^{x+y+z}}{x! \, y! \, z!} \tag3 \end{align}$$

Integrating and using $\int_0^\infty \exp(-3a) a^p da = p!/3^{p+1}$ we get

$$ \int_0^\infty \left( \frac{\Gamma(n,a)}{\Gamma(n)} \right)^3 da= \sum_{x=0}^{n-1}\sum_{y=0}^{n-1}\sum_{z=0}^{n-1} \frac{1}{3^{x+y+z+1}} \frac{(x+y+z)!}{x! \, y! \, z!} \tag4$$

and finally

$$E_{n}[T] = 3 \int_0^\infty \left( \frac{\Gamma(n,a)}{\Gamma(n)} \right)^3 da \tag5$$

More in general, if we have $d$ urns:

$$E_{n,d}[T] = d \int_0^\infty \left( \frac{\Gamma(n,a)}{\Gamma(n)} \right)^d da \tag6$$

This can be evaluated numerically, I don't know about asymptotics (asked here).

Empirically , it seems that $E = 3 n - \beta \sqrt{n} +O(1)$ where $\beta \approx 2.5$


And here's a numerical recursive computation (in Java):

public class MSE3368225 {

    static Double[] cache = new Double[(1<<21)];

    static double ex(int x, int y, int z) {
        if (x == 0 || y == 0 || z == 0)
            return 0;
        if (x > 127 || y > 127 || z > 127) 
            throw new RuntimeException("Out of range");
        int k = (x << 14) | (y << 7) | z; // packs three variables in one integer
        Double d = cache[k];
        if (d == null) {
            d = 1 + (ex(x - 1, y, z) + ex(x, y - 1, z) + ex(x, y, z - 1)) / 3.0;
            cache[k] = d;
        }
        return d;
    }

    public static void main(String[] args) {
        System.out.println(ex(100, 100, 100));
    }
}

This solves the recursion

$$g(x,y,z)=\begin{cases} 0 & \text {if $x=0$ or $y=0$ or $z=0$}\\ 1+ \frac13\left(g(x-1,y,z)+g(x,y-1,z)+g(x,y,z-1)\right) & \text{elsewhere} \end{cases} $$

where $g(x,y,z)$ is the expected remaining time, when there remains $(x,y,z)$ balls for each urn.

The result is $E_{100}[T]=274.9186440$


Some values

  n     E
  2  2.888889 
  3  5.049383 
  4  7.348270 
  5  9.734204
 10  22.34468
 20  48.99126
 50  132.3676
100  274.9186
$\endgroup$
6
  • $\begingroup$ noticed that if it would be optimized like (x,y,z) always would be sorted then number of steps would be greatly reduced $\endgroup$
    – quester
    Commented Sep 25, 2019 at 12:15
  • $\begingroup$ and how did you computed this integral $E(L, N)$? $\endgroup$
    – quester
    Commented Sep 27, 2019 at 10:22
  • $\begingroup$ You mean how I evaluated it or how I got it? If the former: numerically - and I checked that it agreed with the code above. $\endgroup$
    – leonbloy
    Commented Sep 27, 2019 at 18:07
  • $\begingroup$ I mean how did you figured out this formula with integral and fraction with gamma functions, I wish I worked more with gamma functions and I would like to develop intuition behind gamma functions in probability theory $\endgroup$
    – quester
    Commented Sep 28, 2019 at 8:25
  • $\begingroup$ @quester I added an explanation $\endgroup$
    – leonbloy
    Commented Sep 28, 2019 at 17:06

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged .