16
$\begingroup$

There are three horses in the race. You know the following information about them:

  • Horse A will finish the race in 50 or 60 seconds with both events being equally likely.
  • Horse B will always finish the race in 55 seconds.
  • Horse C will finish the race in 53 or 57 seconds with both events being equally likely.

Which horse is most likely to win?

$\endgroup$
2
  • 2
    $\begingroup$ Nothing is stated about the probabilities of A and C finishing in the two timings. So what does "most likely" mean in this case? $\endgroup$
    – WhatsUp
    Commented Dec 14, 2021 at 13:13
  • 2
    $\begingroup$ Added this information. $\endgroup$ Commented Dec 14, 2021 at 13:37

6 Answers 6

28
$\begingroup$

A slightly different approach

50 seconds is the fastest time listed, so Horse A has at least a 1/2 chance of winning the race. Given that there are scenarios where Horse B or Horse C could win, neither of their chances can be as high as 1/2, so Horse A must be the most likely to win.
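
Spelled out as an inequality (a sketch that takes for granted, as the argument does, that B and C each have a nonzero chance of winning, and that uses the fact that no two listed times coincide, so ties cannot occur):

$$P(A\text{ wins}) \ge P(A=50) = \tfrac12 \quad\Longrightarrow\quad P(B\text{ wins}) + P(C\text{ wins}) = 1 - P(A\text{ wins}) \le \tfrac12$$

If both terms in that sum are positive, each of them is strictly less than $\tfrac12$.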

$\endgroup$
17
  • 2
    $\begingroup$ @Stef Because the sum of the probabilities will be 1, the sum of the probabilities of B or C winning is <= 1/2 and since they are both non-zero, they must also be both less than 1/2 $\endgroup$
    – hexomino
    Commented Dec 16, 2021 at 11:14
  • 5
    $\begingroup$ Where did you get "and since they are both non-zero" from? $\endgroup$
    – Stef
    Commented Dec 16, 2021 at 11:21
  • 4
    $\begingroup$ But how do you know that it's possible? That's not stated in the problem. You're making an assumption there. $\endgroup$
    – Stef
    Commented Dec 16, 2021 at 11:32
  • 3
    $\begingroup$ @hexomino Because then you're answering a different problem. The problem statement describes a situation, and asks a question. Solving the problem means answering in all generality, so that the answer applies to any situation that fits the description. Placing a prior on the possible scenarios is really interesting and leads to an interesting answer, but it's an answer to a slightly different problem. Statisticians who choose their own priors are not solving a well-stated formal problem: they're modelling a problem themselves, then solving the problem under that model. $\endgroup$
    – Stef
    Commented Dec 16, 2021 at 15:23
  • 3
    $\begingroup$ @hexomino Still, you are saying that there is a distribution of probabilities on the scenarios, even if you don't know the exact probabilities of each scenario. So you're answering a different problem where the scenarios are random events. In the given problem, the only random events are the finish-times of the horses. $\endgroup$
    – Stef
    Commented Dec 16, 2021 at 15:32
24
$\begingroup$

It is

not possible to tell.

The following scenarios are compatible with the information given:

Scenario 1:

A and C are 100% correlated, i.e. whenever A finishes in 50, C will finish in 53 and likewise with 60 and 57. Then A and B are tied with both expected to win half the races.

Scenario 2:

A and C are 100% anti-correlated, i.e. whenever A finishes in 50, C will finish in 57 and likewise with 60 and 53. Then A and C are tied with both expected to win half the races.

Scenario 3:

A and C are neither perfectly correlated nor anti-correlated. Then A wins half the races and B and C both win less than half.
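
The three scenarios can be checked by exact enumeration. The following is a minimal Python sketch, not part of the original argument: the helper name `win_probabilities` and the dictionary layout are illustrative choices, and independence is used only as one example of Scenario 3, which covers many joint distributions.

```python
from fractions import Fraction

B_TIME = 55  # Horse B always finishes in 55 seconds


def win_probabilities(joint):
    """Return P(win) per horse for a joint distribution over (A time, C time)."""
    wins = {"A": Fraction(0), "B": Fraction(0), "C": Fraction(0)}
    for (a_time, c_time), prob in joint.items():
        times = {"A": a_time, "B": B_TIME, "C": c_time}
        winner = min(times, key=times.get)  # lowest time wins; no ties occur here
        wins[winner] += prob
    return wins


half, quarter = Fraction(1, 2), Fraction(1, 4)
scenarios = {
    "1 (perfectly correlated)": {(50, 53): half, (60, 57): half},
    "2 (perfectly anti-correlated)": {(50, 57): half, (60, 53): half},
    # Scenario 3 covers many joint distributions; independence is one example.
    "3 (independent, as one example)": {
        (50, 53): quarter, (50, 57): quarter,
        (60, 53): quarter, (60, 57): quarter,
    },
}

for name, joint in scenarios.items():
    print(name, win_probabilities(joint))
```

In every scenario the marginals stay at 50/50 for both A and C; only the way the two horses' times are paired changes.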

$\endgroup$
38
  • 6
    $\begingroup$ That's why in probability puzzles people always repeat "independently uniformly random". $\endgroup$
    – WhatsUp
    Commented Dec 14, 2021 at 14:28
  • 6
    $\begingroup$ But could you deduce from those three scenarios that A is most likely to win if you combine the results? A is always expected to win half the races in every scenario, whereas B and C are expected to win less often (Once half the races and once less than half) $\endgroup$
    – QBrute
    Commented Dec 15, 2021 at 11:38
  • 4
    $\begingroup$ @TheRubberDuck You are confounding probability and statistics, a common mistake. Whatever your reasons to believe scenario 3 is more likely I'd be willing to bet that the same reasons render a horse that finishes a track in either 50 or 60 seconds (and never in, say 55 seconds or 51.62 seconds) utterly implausible. As neither of us has any experience of a world where racing times are distributed binary there is no point in arguing what joint probabilities might be more reasonable to assume. $\endgroup$
    – loopy walt
    Commented Dec 15, 2021 at 17:03
  • 4
    $\begingroup$ Exactly, there is no reason to prefer any scenario over the others: all are possible, and regardless of what (non-zero) probabilities you put on them, A is overall the most likely to win. The only way that A is not uniquely the most likely to win is if you are 100% sure that either Scenario 1 or Scenario 2 is the case, and as you point out, there are no grounds to assume that. If you admit the possibility of all of these scenarios, A is most likely to win. $\endgroup$ Commented Dec 15, 2021 at 17:48
  • 4
    $\begingroup$ @NuclearHoagie Nope, you cannot average over scenarios. Only one of them can be correct, we just do not know which. $\endgroup$
    – loopy walt
    Commented Dec 15, 2021 at 18:30
13
$\begingroup$

Assuming that, for each of A and C, the two possible times are equally likely, and that the two horses' times are independent, the winner in each of the four equally likely cases is:

$$\begin{array}{c|cc}C,A&50&60\\\hline53&A&C\\57&A&B\end{array}$$

Therefore

A is most likely to win.
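
The table can be reproduced by counting the four equally likely combinations. This is a small Python sketch under the same independence assumption; the constant names are illustrative.

```python
from collections import Counter
from itertools import product

A_TIMES = (50, 60)
B_TIME = 55
C_TIMES = (53, 57)

# Under independence, the four (A, C) combinations are equally likely.
winners = Counter()
for a_time, c_time in product(A_TIMES, C_TIMES):
    times = {"A": a_time, "B": B_TIME, "C": c_time}
    winners[min(times, key=times.get)] += 1  # lowest time wins

total = sum(winners.values())
for horse in "ABC":
    print(f"{horse}: {winners[horse]}/{total}")
```

A wins two of the four cells, while B and C win one each.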

$\endgroup$
1
  • $\begingroup$ Assuming multiple races, A will win twice as often as each of the other horses. $\endgroup$
    – aslum
    Commented Dec 14, 2021 at 22:04
6
$\begingroup$

Horse A

Horse A will win 1/2 the time, horses B and C will each win 1/4 the time.

Because

1/2 the time Horse A will run the race in 50 seconds, and neither Horse B nor Horse C is capable of running that fast. Therefore Horse A will win those races.

In the other 1/2 of the races Horse A will run the race in 60 seconds and place third, because no other horse will run the race that slowly. In 1/2 of those races Horse C will run in 53 seconds and beat Horse B; in the other 1/2 Horse C will run in 57 seconds and lose to Horse B.
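
Written out, with the independence of A's and C's times made explicit (that assumption is what justifies multiplying the probabilities):

$$P(B\text{ wins}) = P(A=60)\,P(C=57) = \tfrac12\cdot\tfrac12 = \tfrac14, \qquad P(C\text{ wins}) = P(A=60)\,P(C=53) = \tfrac12\cdot\tfrac12 = \tfrac14$$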

$\endgroup$
12
  • 1
    $\begingroup$ This answer makes the extra assumption that the events listed in the problem are independent, which wasn't stated in the question; yet the word "independent" doesn't even appear in this answer. So, you're basing your calculations on an unspoken extra assumption. $\endgroup$
    – Stef
    Commented Dec 16, 2021 at 11:19
  • $\begingroup$ When two times are given with 'equally likely' specified, doesn't that mean there are no dependencies? $\endgroup$ Commented Dec 16, 2021 at 14:05
  • 1
    $\begingroup$ No, it doesn't. Imagine the following situation: everyday the horses are fed the same menu. When horse A is fed broccoli, it runs in 50 seconds. When horse C is fed broccoli, it runs in 53 seconds. When horse A is fed lasagna, it runs in 60 seconds. When horse C is fed lasagna, it runs in 57 seconds. Now, I tell you that the probability that today's menu is broccoli is 50%, and the probability that today's menu is lasagna is 50%. Then you can safely conclude: $\endgroup$
    – Stef
    Commented Dec 16, 2021 at 14:11
  • 2
    $\begingroup$ "I think it's contrived to assume some dependency that has to itself be an equal probability for both statements to be true" I agree; it's more than just contrived, it's wrong. We can't assume anything that isn't spelled out in the problem statement. We cannot assume that the events are not independent, and we cannot assume that the events are independent. We have to solve the problem with only the information given in the problem statement. And the answer is: P(A wins) = 50%, and P(B wins or C wins) = 50%, but P(C wins) and P(B wins) cannot be calculated because we lack information. $\endgroup$
    – Stef
    Commented Dec 16, 2021 at 14:47
  • 1
    $\begingroup$ Yes, you're absolutely right. Horse C's result is completely random, with 50% chance of finishing in 53 seconds and 50% chance of finishing in 57 seconds. And we have absolutely no information about how correlated this is with Horse A's finishing times. And since we have no information, we can't just assume. Perhaps Horse C wins 25% of the time, or perhaps Horse C wins 0% of the time, or perhaps 50% of the time, or perhaps some other number. There is no way to know. $\endgroup$
    – Stef
    Commented Dec 17, 2021 at 8:06
4
$\begingroup$

I found the existing answers insightful, but I was still confused about the difference between scenarios and probabilities. I found another formulation that helped me, and it has not been posted here, so here we go.

Let's denote the joint probability that A finishes in 50 seconds and C finishes in 53 seconds as $p$ (formally, $p(A=50 \wedge C=53) = p$). Also note that the marginal probability of A finishing in 50 seconds is 0.5, since it is as likely as A finishing in 60 seconds; likewise, the marginal probability of C finishing in 53 seconds is 0.5. Then we have the following:

$$\begin{aligned} p(A=50 \wedge C=53) &= p\\ p(A=50 \wedge C=57) &= 0.5-p\\ p(A=60 \wedge C=53) &= 0.5-p\\ p(A=60 \wedge C=57) &= p \end{aligned}$$

A wins when A finishes in 50 seconds, regardless of others, so $p(A\text{ wins}) = p + (0.5-p) = 0.5$.

B wins when A finishes in 60 seconds, and C in 57 seconds, so $p(B\text{ wins}) = p$

C wins when A finishes in 60 seconds, and C in 53 seconds, so $p(C\text{ wins}) = 0.5-p$

Now, the question doesn't specify $p$. So we can only rely on the information above with one unknown ($p$).

If $p=0.5$, then A and B have the same probability of winning; if $p=0$, then A and C have the same probability of winning; otherwise ($0 < p < 0.5$), A is the most likely to win.
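
The case analysis can be sketched in Python. The function names `win_probabilities_from_p` and `most_likely` are illustrative, and exact fractions are used so that ties are detected reliably.

```python
from fractions import Fraction


def win_probabilities_from_p(p):
    """Win probabilities as a function of p = P(A=50 and C=53), 0 <= p <= 1/2."""
    p = Fraction(p)
    half = Fraction(1, 2)
    if not 0 <= p <= half:
        raise ValueError("p must lie in [0, 1/2]")
    # A wins whenever A=50; B wins when A=60 and C=57; C wins when A=60 and C=53.
    return {"A": half, "B": p, "C": half - p}


def most_likely(p):
    """Return the set of horses sharing the highest win probability."""
    probs = win_probabilities_from_p(p)
    best = max(probs.values())
    return {horse for horse, prob in probs.items() if prob == best}


for p in (Fraction(0), Fraction(1, 4), Fraction(1, 2)):
    print(p, sorted(most_likely(p)))
```

A is in the returned set for every admissible $p$; B or C joins it only at the endpoints $p=0.5$ and $p=0$ respectively.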

Also, "the most likely to win" is not defined in the question for the case where two candidates have the same probability. If the question intends that, in such a tie, neither is most likely to win, then we don't have an answer: we don't know $p$, so we don't know whether a single candidate has the highest probability of winning.

However, if we consider two candidates having equal probability as both most likely to win, then A is most likely to win in all possible values of $p$. In this case we can say that A is most likely to win in all scenarios, although we don't know whether B or C shares that title as well.

For me this thought process is helpful, since I couldn't see the "scenarios" in loopy's answer as "something that we cannot assign a probability to", but I can understand it when I use a variable $p$ to represent the scenario. (To be clear, I'm basically saying that this answer is the same as loopy's answer, but I came to understand the situation better through this formulation than through loopy's.)

$\endgroup$
14
  • $\begingroup$ I like this reformulation of the answer to clarify things. In the context of this answer, my point from before is that $p$ will have some prior which is neither concentrated at 0 nor at 0.5 - note we do not need to know what the prior is just that it is not concentrated on a specific value, which seems fair since the intention is not specified. Then averaging over this prior gives A the highest probability of winning. Stef argued that you can't do this but I don't see why not. $\endgroup$
    – hexomino
    Commented Dec 24, 2021 at 22:11
  • $\begingroup$ I don't think we can do that. Consider the following statement: "This box contains a ball with probability $p$, otherwise it's empty." Can you say that it has non-zero probability of containing a ball, since if we put a prior over $p$, it's not concentrated on specific value 0? $\endgroup$
    – justhalf
    Commented Dec 25, 2021 at 4:16
  • 1
    $\begingroup$ Hm, yea, I guess our difference is whether we view the process that generates $p$ as part of the probability space or not. For me, if $p$ is unspecified, then the process that generated $p$ doesn't matter, since I consider it to already be in the past. If instead the question specifies how to generate $p$, then I would include that in the probability space. So we differ in where we put our baseline for probability. I start it after $p$ is generated, and you start it before $p$ is generated. So in the multiverse of all values of $p$, you are outside any of them, while I'm inside one. $\endgroup$
    – justhalf
    Commented Dec 27, 2021 at 17:40
  • 1
    $\begingroup$ Yea, so you're factoring the puzzle writer's intent into the prior. As in, "there is a possibility that the puzzle writer intends $p$ to be non-zero, so we can say that there is a non-zero probability for A and C to not be perfectly correlated". However, I would say that this isn't valid here, since we are already given a question, and the puzzle writer already has a fixed intention. So if we answer from the PoV of the question, there is no more assigning probability to the prior of $p$. But if we answer from the PoV of external observer ... $\endgroup$
    – justhalf
    Commented Dec 28, 2021 at 4:10
  • 1
    $\begingroup$ Interesting discussion overall. I don't know how to resolve our differences. But to respond to your remark "and then not ask for an answer in terms of $p$", I think usually, if not mentioned, you present the answer in terms of the unknowns. E.g., "I have some dogs and chickens, totaling 10 animals. Do I own more than 30 animal legs?". Then your answer will basically be "well, that depends on how many dogs you have. I can make a formula that is based on the number of dogs $N$, but I can't do better than that." It's the same here, Stef's answer is saying "this is the formula based on $p$." $\endgroup$
    – justhalf
    Commented Dec 28, 2021 at 4:16
1
$\begingroup$

Test it out yourself, and see who is most likely to win.

import random


class Horse:
    """A horse whose finish time is picked uniformly from the given options."""

    def __init__(self, finish):
        # Each horse is drawn on its own, so this simulation treats the
        # horses as independent (an assumption the question does not state).
        self.finish = random.choice(finish)


if __name__ == '__main__':
    count_a = 0
    count_b = 0
    count_c = 0

    for _ in range(100000):
        horse_a = Horse([50, 60])
        horse_b = Horse([55])
        horse_c = Horse([53, 57])

        # The lowest finish time wins; ties are impossible with these times.
        if horse_a.finish < horse_b.finish and horse_a.finish < horse_c.finish:
            count_a += 1

        if horse_b.finish < horse_a.finish and horse_b.finish < horse_c.finish:
            count_b += 1

        if horse_c.finish < horse_b.finish and horse_c.finish < horse_a.finish:
            count_c += 1

    print("HORSE A: " + str(count_a))
    print("HORSE B: " + str(count_b))
    print("HORSE C: " + str(count_c))
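
The simulation above draws each horse independently. As a hedged variant, the sketch below adds a hypothetical parameter `p` for the probability that A finishes in 50 seconds and C finishes in 53 seconds, so that correlated cases can be tried as well. The function name `simulate`, the `seed` argument, and the choice of weights are illustrative, not part of the original answer.

```python
import random


def simulate(p, races=100_000, seed=0):
    """Count wins when P(A=50 and C=53) = p (hypothetical parameter, 0 <= p <= 0.5)."""
    rng = random.Random(seed)
    # Joint outcomes (A time, C time), weighted so both marginals stay at 50/50.
    outcomes = [(50, 53), (50, 57), (60, 53), (60, 57)]
    weights = [p, 0.5 - p, 0.5 - p, p]
    counts = {"A": 0, "B": 0, "C": 0}
    for a_time, c_time in rng.choices(outcomes, weights=weights, k=races):
        times = {"A": a_time, "B": 55, "C": c_time}
        counts[min(times, key=times.get)] += 1  # lowest time wins
    return counts


for p in (0.0, 0.25, 0.5):
    print(p, simulate(p))
```

With `p = 0.25` this matches the independent case; with `p = 0.0` Horse B never wins, and with `p = 0.5` Horse C never wins, while Horse A wins about half the races throughout.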
$\endgroup$
5
  • 1
    $\begingroup$ Nice work. What is the result of running this? $\endgroup$ Commented Dec 16, 2021 at 4:28
  • 1
    $\begingroup$ @DmitryKamenetsky Horse A has a 50% chance of winning, while Horses B & C both have a 25% chance of winning. $\endgroup$
    – lala
    Commented Dec 16, 2021 at 4:58
  • 2
    $\begingroup$ This answer makes the extra assumption that the events listed in the problem are independent, which wasn't stated in the question; yet the word "independent" doesn't even appear in this answer. So, you're basing your simulations on an unspoken extra assumption. $\endgroup$
    – Stef
    Commented Dec 16, 2021 at 11:20
  • $\begingroup$ @Stef Even worse than assuming "independent". This answer assumes that the events come from a pseudo random number generator ... $\endgroup$
    – WhatsUp
    Commented Dec 18, 2021 at 0:51
  • 3
    $\begingroup$ @WhatsUp It's called a simulation. Being able to work out the result in advance by performing a 'hack' to gain knowledge of the seed number makes no difference to the overall outcome of the result, whether it came from a pseudo random number generator or an actual random number generator. (-‸ლ) $\endgroup$
    – lala
    Commented Dec 20, 2021 at 4:36
