2
$\begingroup$

Problem abstraction

A standard deck of $52$ cards has $26$ red cards: it has $13$ hearts, $13$ diamonds, as well as $26$ black cards ($13$ spades, as well as $13$ clubs). Let us draw $5$ cards from the deck at once, and return those cards to the deck afterward.

What is the expected number of draws before we see all $26$ red cards?

Use case

Let us say that there is a set of $N = 100$ cards in a game. $M = 30$ cards are of rare rarity and $N - M = 70$ cards are of common rarity. We buy booster packs of size $= 10$. The question is: how many booster packs need to be bought to collect all $M = 30$ cards?

Attempted solution

I have managed to calculate the approximate number of booster packs necessary to get $M = 30$ rare cards by calculating the expectation of the above hypergeometric distribution ($\mu$) and then calculating $M/\mu$. However, this is not the correct solution since it does not take into account the possibility of collecting duplicates.

Regarding the Coupon collector's problem, I'm not sure if it is applicable since we always draw a single coupon, whereas in my use case a booster pack contains more than a single card.

Related: https://stats.stackexchange.com/questions/198915/in-the-coupon-collectors-problem-with-group-drawings-why-does-the-probability

Simulations

Problem Abstraction

$10^6$ trials were conducted, AVG: $38.947$, STDEV: $12.3653$ draws

Use case

$10^6$ trials were conducted, AVG: $38.535$, STDEV: $11.962$ draws

$\endgroup$
4
  • $\begingroup$ Related: coupon collector's problem $\endgroup$
    – jvdhooft
    Commented Jan 31, 2019 at 10:18
  • $\begingroup$ @james - thank you for the feedback, I have edited my question to provide additional context. $\endgroup$ Commented Jan 31, 2019 at 11:23
  • $\begingroup$ close to $\frac25\cdot26H_{26}=40.085965$ draws. $\endgroup$
    – robjohn
    Commented Jan 31, 2019 at 15:31
  • $\begingroup$ Could you please explain how you got to the number? $\endgroup$ Commented Jan 31, 2019 at 21:19

1 Answer 1

3
$\begingroup$

Here is a solution of the "problem abstraction" by way of the Principle of Inclusion / Exclusion (PIE).

Let $T$ be the number of the first draw in which we have seen all the red cards. We would like to find $P(T>k)$ for some $k>0$, i.e. the probability that we have not seen all $26$ red cards in $k$ draws. To that end, let's say a sequence of $k$ draws has "Property $i$" if red card $i$ has not been drawn, for $i = 1,2,3,\dots,26$. Let $S_j$ be the sum of the probabilities of all the sequences with $j$ of the properties, for $j = 1,2,3,\dots,26$. For $S_j$, there are $\binom{26}{j}$ ways to select the $j$ cards which are missing. The probability that those cards are missing in a single draw is $\binom{52-j}{5} / \binom{52}{5}$, so the probability that the cards are missing in all $k$ draws is $[\binom{52-j}{5} / \binom{52}{5}]^k$. Therefore $$S_j = \binom{26}{j} \left( \frac{\binom{52-j}{5}}{ \binom{52}{5}} \right) ^k$$ By PIE, the probability of a sequence of draws with at least one of the properties, i.e. a sequence with at least one red card not seen, is $$P(T>k) = \sum_{j=1}^{26} (-1)^{j+1} S_j$$ so $$\begin{align} E(T) &= \sum_{k=0}^{\infty} P(T>k) \\ &= \sum_{k=0}^{\infty} \sum_{j=1}^{26} (-1)^{j+1} S_j \\ &= \sum_{k=0}^{\infty} \sum_{j=1}^{26} (-1)^{j+1} \binom{26}{j} \left( \frac{\binom{52-j}{5}}{ \binom{52}{5}} \right) ^k \\ &= \sum_{j=1}^{26} (-1)^{j+1} \binom{26}{j} \sum_{k=0}^{\infty} \left( \frac{\binom{52-j}{5}}{ \binom{52}{5}} \right) ^k \\ &= \sum_{j=1}^{26} (-1)^{j+1} \binom{26}{j} \frac{1}{1- \binom{52-j}{5}/ \binom{52}{5}} \\ &= 38.9133 \end{align}$$


Added February 4, 2019:

The following Monte Carlo simulation of $10^6$ trials in the R language is consistent with the result above. The average number of draws of 5-card hands necessary to see all 26 red cards was 38.973, with a 95% confidence interval of 38.91305 to 38.96158. The analytical result 38.9133 falls in the confidence interval, although just barely.

> # ndraws: return the number of draws of 5-card hands required 
> # to see all red cards at least once
> # We consider the red cards to be the cards numbered 1-26.
> ndraws <- function() {
+   seen <- rep(0, 52)
+   n <- 0
+   while (TRUE) {
+     n <- n+1
+     hand <- sample(1:52, 5)
+     seen[hand] <- 1
+     if (sum(seen[1:26]) >= 26)
+       return (n)
+   }
+ }
> nreps <- 1e6
> set.seed(1234)  # for reproducibility
> t <- replicate(nreps, ndraws())
> t.test(t)

        One Sample t-test

data:  t
t = 3145.3, df = 1e+06, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 38.91305 38.96158
sample estimates:
mean of x 
 38.93731 
$\endgroup$
5
  • $\begingroup$ Thank you for your answer! After some digging, in a paper The Coupon Subset Collection Problem by Adler, Ilan and Ross, Sheldon M I have found equation 7 that solves the issue, and link this question as well. What keeps me confused is that the expectation depends only on the number of distinct balls in the set and the size of the drawn subset, but not on the size of the entire set. Am I misunderstanding something? $\endgroup$ Commented Feb 4, 2019 at 11:11
  • $\begingroup$ @user3223162 I haven't read the paper you refer to, but in the question you link to, the assumption is that the urn contains one each of $n$ distinct balls, so the number of distinct balls is the same as the number of balls in the urn. $\endgroup$
    – awkward
    Commented Feb 4, 2019 at 12:56
  • $\begingroup$ In that case, that equation does not solve my issue. Furthermore, simulations I ran disagree with your attempt. Still, I appreciate the effort $\endgroup$ Commented Feb 4, 2019 at 15:46
  • $\begingroup$ @user3223162 There must be something amiss, then, because my simulation agrees with the analytical result I posted. I have added the simulation to the solution, above. $\endgroup$
    – awkward
    Commented Feb 4, 2019 at 20:20
  • $\begingroup$ You are correct, my tests contained an error and after correction confirm your results. I have ran the formula alongside with some simulations and the results correspond. For now, I am marking this as solved, I will try and confirm my findings in some existing literature on the subject. Thanks! $\endgroup$ Commented Feb 5, 2019 at 12:59

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged .