$\begingroup$

Suppose we have created an army of $n$ clones that are completely identical (except that they may have different birthdays). The cloning happened at different times, such that all 365 birthdays (disregarding the 366th day) are equally likely.

What is the probability that at least two of the clones share a birthday in this indistinguishable setting?

In the original birthday problem the solution is $$1-\frac{{}^{365}P_n}{365^n}$$ but that solution assumes the people are distinguishable.

Note also that, for the indistinguishable case, the solution $$1-\frac{\binom{365}{n}}{\binom{365+n-1}{n}}$$ is incorrect, because it ignores the probability weighting of the outcomes, which are not equally likely. For example, the probability of two people both having Sep 1 as a birthday is less than the probability of one having Sep 1 and the other Sep 2, since the latter can happen in two ways.

$\endgroup$
  • 2
    $\begingroup$ I would say that "distinguishability" is a moderately useful and quite often abused didactic device. If it helps you choose the correct way of counting, good; if not, don't use it. The solution has to be $1-\frac{365\cdot 364\cdots(365-n+1)}{365^n}$, as you said. If you arrive at it by distinguishing the people, so be it. If they are indistinguishable, print out labels saying "Clone $1$" ... "Clone $n$", stick them on the clones, and make them distinguishable. (TBC) $\endgroup$
    – user700480
    Commented Sep 17, 2022 at 6:04
  • $\begingroup$ (Cont'd) The second ("indistinguishable case") solution is not just incorrect but unjustified as well: you cannot just say "let me use the same formula as in the distinguishable case but replace any permutations with combinations, and it will automatically work." It won't necessarily. It may sometimes, but not here. $\endgroup$
    – user700480
    Commented Sep 17, 2022 at 6:05
  • 3
    $\begingroup$ This question doesn't make sense. The answer can't possibly depend on whether you distinguish the clones or not; you ask for a probability, not a count, and your description of choosing the birthdays uniformly at random already determines the answer (because it's the same as the answer to the ordinary birthday paradox problem). $\endgroup$ Commented Sep 17, 2022 at 6:05
  • $\begingroup$ @Stinking Bishop can you please elaborate why the "distinguishability" does not make a difference. Doesn't it affect the Sample and Event spaces? $\endgroup$
    – John Man.
    Commented Sep 17, 2022 at 6:17
  • $\begingroup$ Sample and event spaces are of course affected, exactly as you said in the question. By choosing one formula over the other you have picked one sample space over the other. In one of them, as you said, all $365^n$ possible assignments of birthdays are equally probable; in the other, the outcomes are not equally probable. As the condition of the problem is that they must be equally probable, you have to use the formula that comes out of the "distinguishing" step. Perhaps it will help if you try it out with a hypothetical alien year that has only two days, set $n=2$, and see what exactly happens. $\endgroup$
    – user700480
    Commented Sep 17, 2022 at 6:22

1 Answer

$\begingroup$

Since we are not told whether the clones' birthdays are independent, there is not enough information to determine the probability that no two share a birthday.

Suppose you were given that the clones' birthdays are independent. Let us number the clones in an arbitrary order. Independence, together with each birthday being uniform over the 365 days, implies that for every sequence $(b_1,b_2,\dots,b_n)$, where $b_i$ is a day of the year for each $i\in \{1,\dots,n\}$, the probability of that sequence occurring (meaning the $i^\text{th}$ clone has birthday $b_i$ for all $i$) is $(1/365)^n$. In particular, all of these sequences are equally likely. Therefore, we can find the probability of all birthdays being different by counting the number of sequences where all entries are different, and dividing by $365^n$. The result is $$ \frac{365\cdot 364\cdots (365-n+1)}{365^n}, $$ which is the same result as in the distinguishable-person case. The probability that at least two clones share a birthday is one minus this quantity.
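The counting argument can be checked by brute force for small cases. The following is a minimal Python sketch, assuming independent and uniform birthdays. The function names and the small values of `days` and `n` are hypothetical choices for illustration, not part of the original problem. It lists every ordered sequence, counts those with a repeated entry, and compares the result with both the permutation formula and the multiset-based formula from the question.

```python
from fractions import Fraction
from itertools import product
from math import comb, perm


def p_shared_enumeration(days, n):
    """Exact P(at least two share a birthday), found by listing all
    days**n ordered sequences, each of which is equally likely."""
    total = 0
    shared = 0
    for seq in product(range(days), repeat=n):
        total += 1
        if len(set(seq)) < n:  # some birthday appears more than once
            shared += 1
    return Fraction(shared, total)


def p_shared_formula(days, n):
    """1 - days*(days-1)*...*(days-n+1) / days**n."""
    return 1 - Fraction(perm(days, n), days ** n)


def p_shared_multiset(days, n):
    """The formula from the question that (wrongly) treats every
    unordered outcome as equally likely."""
    return 1 - Fraction(comb(days, n), comb(days + n - 1, n))


# Small hypothetical "years" so that full enumeration stays cheap.
for days, n in [(2, 2), (5, 3), (7, 4)]:
    print(days, n,
          p_shared_enumeration(days, n),
          p_shared_formula(days, n),
          p_shared_multiset(days, n))
```

In each of these cases the enumeration agrees with the permutation formula, while the multiset-based value differs.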

The takeaway message here is that independence implies that the underlying sample space is most conveniently thought of as distinguishable, at least for the purpose of counting cases.
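
As a small worked case, take a hypothetical year with only two days and $n=2$ independent, uniform birthdays (a toy setup for illustration, not part of the original problem). The four ordered outcomes each have probability $1/4$: $$ P(1,1)=P(1,2)=P(2,1)=P(2,2)=\tfrac14. $$ Forgetting the labels merges $(1,2)$ and $(2,1)$ into a single unordered outcome, so $$ P(\{1,1\})=\tfrac14,\quad P(\{1,2\})=\tfrac12,\quad P(\{2,2\})=\tfrac14. $$ The probability of a shared birthday is therefore $\tfrac14+\tfrac14=\tfrac12=1-\frac{2\cdot 1}{2^2}$, whereas treating the three unordered outcomes as equally likely would give $1-\binom{2}{2}/\binom{3}{2}=\tfrac23$.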

$\endgroup$
