3
$\begingroup$

You initially start with all the integers between $1$ and $M$. You then pick $N$ of them randomly, without replacement, to generate a new set of $N$ non-repeating numbers. The maximum of this set is $Z_N$. Say you then add another number from the remaining integers between $1$ and $M$. What is the probability that the maximum of the new set, $Z_{N+1}$, equals $Z_N$?

I believe I am approaching this problem in entirely the wrong way. I have coded up the above scenario to get an empirical estimate of the solution, I have been told the answer is $\frac{N}{N+1}$. What follows is my attempt at solving the problem.

We are trying to calculate $Pr(Z_{N+1}=Z_{N})$. The following is true: $$ \begin{align} Pr(Z_{N+1}=Z_{N})&=\sum_{i=1}^{M}Pr((Z_N=i)\cap(Z_{N+1}=i)) \\ &=\sum_{i=1}^{M}Pr(Z_{N+1}=i|Z_N=i)Pr(Z_N=i) \tag{1}\label{eq1} \end{align} $$ We know that $Z_N$ cannot be smaller than $N$, therefore $$ Pr(Z_{N+1}=Z_{N})=\sum_{i=N+1}^{M}Pr(Z_{N+1}=i|Z_N=i)Pr(Z_N=i) \tag{2}\label{eq2} $$ Now, if we refer to the $(N+1)$-th number we draw as $X_{N+1}$, then we can rewrite \eqref{eq2} as $$ Pr(Z_{N+1}=Z_{N})=\sum_{i=N+1}^{M}Pr(X_{N+1}<i|Z_N=i)Pr(Z_N=i) \tag{3}\label{eq3} $$ For every $i>N+1$, given that $Z_N=i$, we know there are $i-N$ integers below $i$ available to $X_{N+1}$. Given that $X_{N+1}$ is chosen randomly as well, $$ Pr(X_{N+1}<i|Z_N=i)=\frac{i-N}{M-N} \tag{4}\label{eq4} $$ Now we need to find the probability mass function for $Z_N$. I will do this by the typical route of getting the cumulative mass function and then getting the pmf from that. $$ \begin{align} Pr(Z_N<i)&=\prod_{k=1}^NPr(X_k<i) \\ &=\left(\frac{i-1}{M}\right)^N \tag{5}\label{eq5} \\ Pr(Z_N=i)&=Pr(Z_N<i+1)-Pr(Z_N<i)\\ &=\left(\frac{i}{M}\right)^N-\left(\frac{i-1}{M}\right)^N \tag{6}\label{eq6} \end{align} $$ We can now substitute \eqref{eq4} and \eqref{eq6} into \eqref{eq3} to get the following $$ \begin{align} Pr(Z_{N+1}=Z_{N})&=\sum_{i=N+1}^{M}\frac{i-N}{M-N}\left(\left(\frac{i}{M}\right)^N-\left(\frac{i-1}{M}\right)^N\right) \tag{7}\label{eq7} \\ &=\sum_{i=N+1}^{M}\frac{i-N}{M-N}\left(\frac{i}{M}\right)^N-\sum_{i=N}^{M-1}\frac{i+1-N}{M-N}\left(\frac{i}{M}\right)^N \\ &=1-\frac{1}{M-N}\left(\frac{N}{M}\right)^N+\sum_{i=N+1}^{M-1}\left(\frac{i}{M}\right)^N\left(\frac{i-N-i-1+N}{M-N}\right) \\ &=1-\frac{1}{M-N}\left(\frac{N}{M}\right)^N-\frac{1}{M-N}\sum_{i=N+1}^{M-1}\left(\frac{i}{M}\right)^N \\ &=1-\frac{1}{M-N}\sum_{i=N}^{M-1}\left(\frac{i}{M}\right)^N \tag{8}\label{eq8} \end{align} $$

At this point, I am stuck. I'm not sure if \eqref{eq8} is right, and if it is, I have no idea how to get from that to $\frac{N}{N+1}$.

Three questions:

  1. What is the best way to solve this problem?
  2. Is my answer, \eqref{eq8}, correct?
  3. If so, how do I simplify my result? If not, where did I go wrong?

Thank you in advance for any help.

$\endgroup$
5
  • $\begingroup$ If you add another number less than or equal to the original maximum, then the maximum remains unchanged. Otherwise, the newly added number becomes the new maximum. $\endgroup$ Commented Apr 21 at 18:37
  • $\begingroup$ @GeoffreyTrang, yes, I agree with that. I try to make use of that in (8). Am I missing a more direct way to come to the answer using that fact? $\endgroup$ Commented Apr 21 at 18:50
  • 6
    $\begingroup$ I might be completely wrong but this problem is saying that if you choose $N+1$ different numbers from $1$ to $N$, then what is the probability that the last one you have chosen is not the largest? Then clearly the probability would be $1-\frac1{N+1}=\frac{N}{N+1}$. $\endgroup$
    – Sai Mehta
    Commented Apr 21 at 18:51
  • $\begingroup$ @SaiMehta Aaah that makes sense. Thank you! I hadn't thought of it that way. $\endgroup$ Commented Apr 21 at 18:54
  • $\begingroup$ @SaiMehta it is correct. You may write it as an answer. $\endgroup$
    – Amir
    Commented Apr 21 at 20:00

1 Answer 1

3
$\begingroup$

Your work is not correct, for the following reason. You erred when you said $$ P(Z_N<i)\stackrel{\text{error}}=\prod_{k=1}^NP(X_k<i). $$ The problem is that since we are sampling without replacement, the variables $X_1,\dots,X_N$ are dependent, so you cannot just multiply the probabilities. Instead, $$ P(Z_N<i)=\binom{i-1}{N}\Big/\binom{M}{N}. $$ Indeed, in order to have a maximum of less than $i$, then you need to choose the $N$ numbers from the set $\{1,\dots,i-1\}$, so there are $\binom{i-1}{N}$ favorable cases.

If you wanted, you could redo your calculations with the above correct formula for $P(Z_n<i)$, and then show algebraically that the result is $N/(N+1)$. However, it is easier to argue from symmetry. We know that the maximum among $X_1,\dots,X_N,X_{N+1}$ occurs at a unique entry. By symmetry, each entry is equally likely to be the largest, so each entry has probability $1/(N+1)$ of being the largest. The event $Z_N=Z_{N+1}$ occurs if and only if the maximum is among the first $N$ entries, hence the probability of $N/(N+1)$.

$\endgroup$

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged .