18
$\begingroup$

My question concerns the solution Professor Mosteller gives for the Locomotive Problem in his book, Fifty Challenging Problems in Probability. The problem is as follows:

A railroad numbers its locomotives in order 1, 2, ..., N. One day you see a locomotive and its number is 60. Guess how many locomotives the company has.

Mosteller's solution uses the "symmetry principle". That is, if you select a point at random on a line, on average the point you select will be halfway between the two ends. Based on this, Mosteller argues that the best guess for the number of locomotives is 119 (locomotive #60, plus an equal number on either "side" of 60, gives 59 + 59 + 1 = 119).

While I feel a bit nervous about challenging the judgment of a mathematician of Mosteller's stature, his answer doesn't seem right to me. I've picked a locomotive at random and it happens to be number 60. Given this datum, what number of locomotives has the maximum likelihood?

It seems to me that the best answer (if you have to choose a single value) is that there are 60 locomotives. If there are 60 locomotives, then the probability of my selecting the 60th locomotive at random is 1/60. Every other total number of locomotives gives a lower probability for selecting #60. For example, if there are 70 locomotives, I have only a 1/70 probability of selecting #60 (and similarly, the probability is 1/n for any n >= 60). Thus, while it's not particularly likely that there are exactly 60 locomotives, this conclusion is more likely than any other.
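The likelihood argument above can be checked with a minimal sketch. It assumes the locomotive seen is drawn uniformly from 1..n; the function name `likelihood` and the candidate range 1..200 are illustrative choices, not part of the original problem.

```python
from fractions import Fraction

def likelihood(n, observed=60):
    """P(seeing `observed` | fleet size n), assuming a uniform draw from 1..n."""
    # A fleet smaller than the observed number cannot contain that locomotive.
    return Fraction(1, n) if n >= observed else Fraction(0)

# Illustrative candidate fleet sizes; the upper bound 200 is arbitrary.
candidates = range(1, 201)

# The likelihood 1/n is strictly decreasing for n >= 60, so the maximum is at n = 60.
mle = max(candidates, key=likelihood)
print(mle)
```

This only confirms that 60 maximizes the likelihood; it says nothing about whether that estimate is right on average.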

Have I missed something, or is my analysis correct?

$\endgroup$
4
  • $\begingroup$ Here are my first thoughts: you seem to be analyzing the question from the perspective of "If there are N locomotives, what is the chance of seeing Locomotive 60?", while the correct way to analyze it is "If I see Locomotive 60, what is the chance that there are N locomotives?" $\endgroup$
    – PhiNotPi
    Commented Feb 20, 2012 at 19:25
  • 4
    $\begingroup$ This is a specific subcase of the German tank problem. $\endgroup$
    – cardinal
    Commented Feb 20, 2012 at 19:31
  • 1
    $\begingroup$ In a sense, it depends a little bit on how you define "best", but in any event, there are generally better estimators than just choosing the number you've seen. Indeed, consider that your estimate is a fortiori less than or equal to the true number. In statistical parlance, your estimate is (negatively) biased. (Bias is not wholly a bad thing, but in this case, you can remove the bias and reduce the variance simultaneously.) $\endgroup$
    – cardinal
    Commented Feb 20, 2012 at 19:33
  • $\begingroup$ I tagged this as a recreational problem, but the German Tank Problem, of which this is a special case, couldn't be further from recreational! Thanks to the commenters and answerer for setting me straight on the bias issue. $\endgroup$
    – eipi10
    Commented Feb 22, 2012 at 19:16

1 Answer

13
$\begingroup$

Choosing $2\times 60 - 1$ gives an unbiased estimate of $N$.

Choosing $60$ gives a maximum likelihood estimate of $N$.

But these two types of estimator are often different, and indeed this example is the one used by Wikipedia to show that the bias of maximum-likelihood estimators can be substantial.
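The gap between the two estimators can be made concrete with a small exact calculation. This is a sketch under the assumption that the observed number $X$ is uniform on $1,\dots,N$; the helper names `expected_value`, `mle`, and `unbiased`, and the sample fleet sizes, are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def expected_value(estimator, n):
    """Exact E[estimator(X)] when X is uniform on 1..n (assumed sampling model)."""
    return sum(Fraction(estimator(x), n) for x in range(1, n + 1))

def mle(x):
    # Maximum likelihood estimate: the number actually seen.
    return x

def unbiased(x):
    # The 2X - 1 estimate from Mosteller's symmetry argument.
    return 2 * x - 1

# Illustrative true fleet sizes; any positive integers would do.
for n in (60, 119, 500):
    print(n, expected_value(mle, n), expected_value(unbiased, n))
```

Under this model the maximum likelihood estimate averages $(N+1)/2$, roughly half the true fleet size, while $2X-1$ averages exactly $N$.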

$\endgroup$
3
  • 2
    $\begingroup$ (+1) Indeed, it is not hard to show that $2 X - 1$ is the only unbiased estimator of $N$ (hence it is also, trivially, the unique uniformly minimum variance unbiased estimator). $\endgroup$
    – cardinal
    Commented Feb 20, 2012 at 23:26
  • $\begingroup$ How can we show that 2N-1 is an unbiased estimator (and the only one)? $\endgroup$
    – Shreyans
    Commented Dec 17, 2022 at 10:49
  • 1
    $\begingroup$ @Shreyans $2X-1$ (not $2N-1$) is unbiased because $\mathbb E[X\mid N]=\frac{N+1}{2}$ so $\mathbb E[2X-1 \mid N]=N$ $\endgroup$
    – Henry
    Commented Dec 17, 2022 at 11:29
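
For readers who want the steps behind the expectation and the uniqueness claim in these comments, here is a short derivation. It assumes $X$ is uniform on $1,\dots,N$; the symbol $g$ for a generic estimator is introduced here for illustration.

$$\mathbb E[X\mid N]=\sum_{k=1}^{N} k\cdot\frac{1}{N}=\frac{1}{N}\cdot\frac{N(N+1)}{2}=\frac{N+1}{2},\qquad \mathbb E[2X-1\mid N]=2\cdot\frac{N+1}{2}-1=N.$$

For uniqueness, suppose $g(X)$ is unbiased. Then $\frac{1}{N}\sum_{k=1}^{N} g(k)=N$ for every $N\ge 1$, so $\sum_{k=1}^{N} g(k)=N^2$. Subtracting the same identity for $N-1$ gives

$$g(N)=N^2-(N-1)^2=2N-1,$$

so $g(x)=2x-1$ is forced at every value.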
