21
$\begingroup$

I heard an interesting question recently:

What is the minimum number of people required to make it more likely than not that all 365 possible birthdays are covered?

Monte Carlo simulation suggests 2287 ($\pm 1$, I think). More generally, with $p$ people, what is the probability that for each of the 365 days of the year, there is at least one person in the group with that birthday? (Yes, ignoring the leap-day.)

$\endgroup$

3 Answers 3

27
$\begingroup$

For the coupon collector's problem with $n$ objects, let $T$ be the number of trials needed to get a complete set. Then we have the formula $$P(T\leq k)=n^{-k}\ n!\ \left\lbrace {k\atop n}\right\rbrace.$$ Here the braces indicate Stirling numbers of the second kind.

With $n=365$, Maple gives me $P(T\leq 2286)=.4994$ while $P(T\leq 2287)=.5003$, so that $2287$ is the smallest number to give a 50% chance to get all 365 birthdays.

$\endgroup$
9
  • 3
    $\begingroup$ Thanks, but I admit that approximations like those you provide are often more useful. $\endgroup$
    – user940
    Commented Mar 13, 2011 at 20:40
  • 3
    $\begingroup$ It's always good to see both. It can provide intuition in both directions. $\endgroup$
    – cardinal
    Commented Mar 13, 2011 at 20:42
  • 1
    $\begingroup$ Curiously, I produced the same formula for a recent similar question on the probability that with $m$ balls and $n$ boxes each box had a ball but a different formula producing different results was much more popular. $\endgroup$
    – Henry
    Commented Mar 13, 2011 at 21:34
  • 1
    $\begingroup$ @Henry That is a nice answer! $\endgroup$
    – user940
    Commented Mar 13, 2011 at 21:52
  • 6
    $\begingroup$ @Henry Just in case, I mention you should not worry about the relative popularities of answers on MSE and the like. To ponder on the ways votes accumulate on some answers and not on others is simply a waste of time, believe me. $\endgroup$
    – Did
    Commented Mar 14, 2011 at 6:50
27
$\begingroup$

As Ross also states, this can be framed as a coupon collector problem where birthdays correspond to coupons and individuals correspond to trials. Then, good bounds and a strong asymptotic result are both known for the time (number of trials) needed to collect at least one coupon of each unique type.

Let $T$ be the number of trials at which all coupons have been collected. Let there be $n$ coupons (so $n = 365$ in our case). For each $n$, and any $c > 0$,

$$ \mathbb{P}(T > n \log n + c n) < e^{-c} $$

and, also,

$$ \mathbb{P}(T < n \log n - c n) < e^{-c} \> . $$

From the first inequality we get that no more than 2407 trials are needed so that there is greater than 50% chance all coupons are collected, and from the second we know that at least 1900 are needed.

For more details, see the recent question and answer here.


The following classical asymptotic result holds for $c \in \mathbb{R}$,

$$ \mathbb{P}(T > n \log n + c n ) \to 1 - e^{-e^{-c}} . $$

See, e.g., Motwani and Raghavan, Randomized Algorithms, pp. 60--63 for a proof.

Note that if you plug in $c = -\log \log 2$ here, one gets 2287.240 which matches very closely to Byron's exact answer and the Monte Carlo estimate reported in the question.

$\endgroup$
2
  • 5
    $\begingroup$ I would be interested to know about the reason for the downvote. I don't post answers very often, preferring to comment where possible, instead. Hence, I try to take great care when answering---even continually revisiting what I've written in an effort to improve. Suggestions for improvement are welcome. $\endgroup$
    – cardinal
    Commented Mar 14, 2011 at 12:19
  • 1
    $\begingroup$ I dunno who or why about the downvote, but I've upvoted your answer to acknowledge its merit. $\endgroup$
    – Mike Jones
    Commented Aug 25, 2011 at 20:02
7
$\begingroup$

This is the Coupon collector problem The expected value to cover them all (not quite what you asked) is $365 \sum_{i=1}^{365}\frac{1}{i}\approx 365n \ln 365 + 365\gamma + \frac{1}{2}$

$\endgroup$
2
  • $\begingroup$ +1 Ahh, right, of course it's the coupon collector problem. $365\sum_{i=1}^{365}\frac{1}{i}\approx 2364.65$, so I'm pretty sure the expected number of people is not the same as the number of people to have a better-than-50% chance, though. $\endgroup$
    – Isaac
    Commented Mar 13, 2011 at 20:23
  • 2
    $\begingroup$ @Isaac: I would think the expected number is higher than the number to have 50% chance as the tail can go on a long way. The numerics quoted support this. $\endgroup$ Commented Mar 14, 2011 at 1:05

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged .