11
$\begingroup$

You have a standard 52 card deck, with 13 cards of each of the 4 suits (Hearts, Diamonds, Spades, Clubs). What is the expected number of cards you have to draw from the deck until you have all 4 suits represented in your hand?

I couldn't think of how to get the negative binomial to work, since this is sampling without replacement and has 4 suits instead of just 2. I imagine a distribution that could solve this might be called the Negative Hypergeometric Multivariate. Anyone have any ideas? Thanks very much.

$\endgroup$
2
  • $\begingroup$ Nice question! Running some simulations, the average seems to be about $7.66$, but I don't know which distribution is appropriate here. $\endgroup$
    – TMM
    Commented Jan 1, 2013 at 2:03
  • $\begingroup$ Yep, you need the Multivariate Hypergeometric Distribution, see for example in wikipedia: en.wikipedia.org/wiki/… $\endgroup$
    – Bitwise
    Commented Jan 1, 2013 at 2:11

2 Answers 2

13
$\begingroup$

Let $s=13$ denote the number of cards in each suit and $N$ denote the (random) number of cards drawn when all $k=4$ suits are represented for the first time. Hence, $k\leqslant N\leqslant(k-1)s+1$ with full probability.

For each $n\geqslant1$, the event $[N\gt n]$ depends on the $n$ first cards drawn only. There are ${ks\choose n}$ collections of $n$ cards from a full deck of $ks$ cards and each such collection has the same probability ${ks\choose n}^{-1}$ to be drawn. The event $[N\gt n]$ means that one avoids at least one suit. Using inclusion-exclusion principle, there are $A_n$ ways to do so, where $$ A_n={k\choose1}{(k-1)s\choose n}-{k\choose2}{(k-2)s\choose n}+\cdots\pm{k\choose k-1}{s\choose n}\mp{k\choose 0}{0\choose n}. $$ This yields the expectation $$ \mathbb E(N)=\sum_{n\geqslant0}\mathbb P(N\gt n)=\sum_{n\geqslant0}{ks\choose n}^{-1}A_n. $$ In the case at hand, $$ \mathbb E(N)=\sum_{n\geqslant0}{52\choose n}^{-1}\left(4{39\choose n}-6{26\choose n}+4{13\choose n}-{0\choose n}\right), $$ that is, $$ \mathbb E(N)=4+\sum_{n=4}^{39}{52\choose n}^{-1}\left(4{39\choose n}-6{26\choose n}+4{13\choose n}\right), $$ or, $$ \mathbb E(N)=4+4B_{39}-6B_{26}+4B_{13},\qquad B_i=\sum_{n=4}^{i}{52\choose n}^{-1}{i\choose n}. $$ Numerically, $$ \mathbb E(N)=\frac{4829}{630}=7+\frac23-\frac1{630}\approx7.66508. $$

$\endgroup$
3
  • $\begingroup$ Another approach giving the same answer is that in the answer to math.stackexchange.com/questions/1660538/… so $\mathbb{E}(N)= 1+{4 \choose 1} \frac{39}{14} - {4 \choose 2} \frac{26}{27} + {4 \choose 3} \frac{13}{40} = \frac{4829}{630}$ $\endgroup$
    – Henry
    Commented Feb 18, 2016 at 23:08
  • $\begingroup$ @Henry: Could you explain the "comment" formula a bit ? I can see that the denominators are one more than cards in suit(s), but not getting the full rationale. $\endgroup$ Commented Oct 26, 2023 at 8:15
  • $\begingroup$ @trueblueanil It was seven years ago, but in comments in the linked question, the result was based on $\sum\limits_{n=1}^{a_1} \dfrac{a_1 \choose n}{a_1+a_2 \choose n}=\dfrac{a_1}{1+a_2}$ based on $\sum\limits_{m=0}^b {c+m \choose c} ={b+c+1 \choose c+1} = \frac{b+c+1}{c+1} {b+c \choose c}$ $\endgroup$
    – Henry
    Commented Oct 26, 2023 at 8:47
8
$\begingroup$

Suppose cards are drawn without replacement from a well-shuffled $52$-card deck until at least one card of every suit $\left\{ \spadesuit,\heartsuit,\clubsuit,\diamondsuit\right\}$ has been drawn. Let $X$ count the number of cards drawn. We would like to find $\mathbb{E}\left(X\right)$. To this end, it is not necessary to use the multivariate negative hypergeometric distribution. The (univariate) negative hypergeometric will suffice. Why? (1) you are only after $\mathbb{E}\left(X\right)$, and (2) because the four suits are equally proportioned (13 each) in the deck.

Since texts disagree slightly on the definition of negative hypergeometric, here is the definition I'll be using: Suppose a population of size $N$ contains exactly $K\ge1$ members having the “success” trait. We draw randomly and without replacement from the population until we have obtained one success. Let $Y$ count the number of draws. Then $Y$ is said to have a negative hypergeometric distribution with parameters $N$ and $K$.

The pmf of $Y$ will not be needed for our purposes. The expected value of $Y$ is$$\mathbb{E}\left(Y\right)=\frac{N+1}{K+1}\mbox{.}$$I'll take this formula as granted.

Now, getting back to $\mathbb{E}\left(X\right)$. Define the random variables$$\begin{array}{rcl} Y_{1} & = & \mbox{The number of cards drawn to get a first suit}\\ Y_{2} & = & \mbox{The number of additional cards drawn to get a second suit}\\ Y_{3} & = & \mbox{The number of additional cards drawn to get a third suit}\\ Y_{4} & = & \mbox{The number of additional cards drawn to get the final suit} \end{array}$$Then$$X=Y_{1}+Y_{2}+Y_{3}+Y_{4}$$and hence$$\mathbb{E}\left(X\right)=\mathbb{E}\left(Y_{1}\right)+\mathbb{E}\left(Y_{2}\right)+\mathbb{E}\left(Y_{3}\right)+\mathbb{E}\left(Y_{4}\right)\mbox{.}$$Start with $Y_{1}$. We always get a new suit on the first draw. So $Y_{1}$ is the constant random variable that takes the value $1$ with probability $1$, and $\mathbb{E}\left(Y_{1}\right)=1$.

Now consider $Y_{2}$. Note that the second suit drawn can be any of the three suits not yet drawn. There are $39$ such cards still in the deck following the first draw, while there are $12$ cards remaining of the first suit (Notice that the equal proportions of the suits in the initial deck is important here). Thus, $Y_{2}$ is negative hypergeometric with parameters $N=51$ and $K=39$. Its mean is$$\mathbb{E}\left(Y_{2}\right)=\frac{51+1}{39+1}=\frac{13}{10}\mbox{.}$$ Next consider $Y_{3}$. The third suit drawn can be either of the two suits not yet drawn. There are $26$ such cards still in the deck. However, the number of cards still in the deck having a suit already drawn depends on the value of $Y_{2}$. For instance, if $Y_{2}=4$, then there are $21$ cards remaining in the deck having either of the first two suits, so that $Y_{3}$ is negative hypergeometric with parameters $N=47$ and $K=26$. In general, if $Y_{2}=y_{2}$, then $Y_{3}$ is negative hypergeometric with parameters $N=51-y_{2}$ and $K=26$. The mean of $Y_{3}$, conditional on $Y_{2}$, is$$\mathbb{E}\left(Y_{3}\,\middle|\, Y_{2}=y_{2}\right)=\frac{51-y_{2}+1}{26+1}=\frac{52-y_{2}}{27}\mbox{ .}$$Using the iterative property of conditional expectation, the mean of $Y_{3}$ is$$\begin{array}{rcl} \mathbb{E}\left(Y_{3}\right) & = & {\displaystyle \mathbb{E}\left[\mathbb{E}\left(Y_{3}\,\big|\, Y_{2}\right)\right]=\mathbb{E}\left[\frac{52-Y_{2}}{27}\right]}\\ & = & {\displaystyle \frac{1}{27}\left[52-\mathbb{E}\left(Y_{2}\right)\right]=\frac{1}{27}\left[52-\frac{13}{10}\right]=\frac{169}{90}\mbox{ .}} \end{array}$$Now consider $Y_{4}$. There is only one suit not yet drawn, represented $13$ times in the deck. However, the number of cards still in the deck having a suit already drawn depends on the value of both $Y_{2}$ and $Y_{3}$. More precisely, it depends on the value of $Y_{2}+Y_{3}$. Letting $Z=Y_{2}+Y_{3}$, if $Z=z$, then $Y_{4}$ is negative hypergeometric with parameters $N=51-z$ and $K=13$. The mean of $Y_{4}$, conditional on $Z$, is$$\mathbb{E}\left(Y_{4}\,\middle|\, Z=z\right)\hspace{2mm}=\hspace{2mm}\frac{51-z+1}{13+1}\hspace{2mm}=\hspace{2mm}\frac{52-z}{14}\mbox{.}$$The unconditional mean of $Y_{4}$ is$$\begin{array}{rcl} \mathbb{E}\left(Y_{4}\right) & = & {\displaystyle \mathbb{E}\left[\mathbb{E}\left(Y_{4}\,\big|\, Z\right)\right]=\mathbb{E}\left[\frac{52-Z}{14}\right]=\frac{1}{14}\left[52-\mathbb{E}\left(Z\right)\right]}\\ & = & {\displaystyle \frac{1}{14}\left[52-\mathbb{E}\left(Y_{2}+Y_{3}\right)\right]=\frac{1}{14}\left[52-\mathbb{E}\left(Y_{2}\right)-\mathbb{E}\left(Y_{3}\right)\right]}\\ & = & {\displaystyle \frac{1}{14}\left[52-\frac{13}{10}-\frac{169}{90}\right]=\frac{2197}{630}\mbox{.}} \end{array}$$Putting the results together,$$\mathbb{E}\left(X\right)=1+\frac{13}{10}+\frac{169}{90}+\frac{2197}{630}=\frac{4829}{630}\mbox{.}$$

$\endgroup$
2
  • 1
    $\begingroup$ Sorry to comment on an old post but do u mind explaining why u did N+1/K+1 instead of just doing N/K to calculate E(Y2) and similarly E(Y3),E(Y4) (new to hypergeometrics). $\endgroup$ Commented Mar 5, 2023 at 9:57
  • 1
    $\begingroup$ @quantrader23 It's because the definition of negative hypergeometric RV that I'm using here counts the success draw in addition to the fail draws that precede the success draw. $\endgroup$
    – Marcus Nye
    Commented Jan 17 at 22:06

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged .