What are the probabilities of rolling doubles in a pool of d6s?

Question

I am thinking of designing a ttrpg, and as a dice rolling system I was thinking of a system where players roll a pool of d6s. Any roll that results in multiples of the same number is a success, and is more successful the more of the number rolled. To design the system fairly, I need to know probabilities of rolling doubles.

How many d6s would need to be rolled for the probability of rolling at least two of the same number to be 50%? What would the odds be of rolling at least two of the same number for a pool one or two sizes smaller or larger? What would the likelihood of rolling at least three of the same number in these pools? And finally, what formula or process could I use to calculate these probabilities on my own?

You might find some of the answers to this question helpful. — bowdens, Commented Oct 5, 2021 at 0:39
You might be interested in the One Roll Engine. en.m.wikipedia.org/wiki/One-Roll_Engine — Kyyshak, Commented Oct 5, 2021 at 7:46
How are the cases with two doubles (or more) handled? I mean, for example, if you roll 5d6 you may have (2,2,6,6,1). — Eddymage, Commented Oct 5, 2021 at 12:49
@Eddymage I would say it just counts as one success. However, I am beginning to see this may not be the best idea for a dice rolling system. — TheDragonOfFlame, Commented Oct 5, 2021 at 14:04
I'm trying to remember which system had this and it was basically the whole reason my group decided not to even try it.... maybe the Mistborn RPG? — gatherer818, Commented Oct 7, 2021 at 1:22

Dale M · Accepted Answer · 2021-10-05 22:34:37Z

The pigeonhole principle

The probabilistic generalisation of the pigeonhole principle tells us that for \$n\$ pigeons randomly assigned to \$m\$ pigeonholes with equal probability, the chance that one hole will contain at least 2 pigeons is:

$$ 1-{{(m)_n}\over{m^n}} $$

where \$(m)_n\$ is the falling factorial.

For us, our pigeons are the number of dice and the holes are the numbers they can display. For 0 & 1 dice, the answer is obviously 0 - you need at least 2 dice to have a pair. For 7 or more dice it's one due to the non-generalised pigeonhole principle - 7 pigeons can't fit in 6 holes without at least 2 sharing.

For 2 to 6 dice, the chances are \$1\over 6\$, \$4\over 9\$, \$13\over 18\$, \$49\over 54\$ and \$319\over 324\$ respectively.

Balls into bins

Now, this is a subset of the balls into bins problem which has been extensively studied because its a fundamental resource allocation problem in computers - if we have \$n\$ requests per second directed at \$m\$ servers/routers/whatevers, how much capacity does each need to cope with the maximum expected load. Or, more commonly, for a given capacity, how many do we need. In that area, they are looking at big numbers where it's not practical (or possible) to actually do the calculations since they involve factorials and the factorial of a big number is a f&%$ing big number (which is a technical computer science term meaning bigger than we can work out).

However, for real dice at a real table, we can work it out exactly. and by "we", I mean "you" but I'll get you started.

For the reasons above, for less than 3 dice the probability is 0 and for 13 or more the probability is 1 - it's impossible to roll 13 dice and not have at least 3 matching.

The total number of ways of putting \$n\$ balls into \$k\$ bins (where \$n\$ is the number of dice and \$k\$ is the number of sides = 6) is:

$$\begin{align} D&={{n+k+1}\choose{k-1}}\\ &={{n+7}\choose{5}}\\ \end{align} $$

To find the number of ways where there are at least 3 matching numbers, we first find the complement: how many ways are there where the most number of matches is 2? To do this we use a version of the balls-in-bins problem where we assign each bin a maximum capacity of 2. The number of ways we can arrange \$n\$ balls into \$k\$ bins subject to the contains that each bin can have at most \$m\$ balls is:

$$\begin{align} N&=\sum_i{(-1)}^i{k\choose i}{{n+k-1-i(m+1)}\choose {k-1}}\\ &=\sum_i{(-1)}^i{6\choose i}{{n+5-i(3)}\choose {5}}\\ \end{align} $$

Your probability of at least one triple is, therefore, \$1-{N\over D}\$. Plug in the numbers from 3 to 12 and crank the handle.

ShadowRanger · Accepted Answer · 2021-10-05 01:11:55Z

You're going to have a fairly limited range where success (of some level) is possible, but isn't guaranteed.

For 1d6, it's obviously impossible to roll doubles. 7d6 guarantees at least one double by the pigeonhole principle. So only from 2d6 to 6d6 do you have any uncertainty at all. Because my conditional probability skills have atrophied, I just brute-forced the chance of success from 2d6 to 6d6 with Python:

>>> from collections import Counter; from itertools import product
>>> def prob_success(n):
...     c = Counter(len(x) == len(set(x)) for x in product(range(1, 7), repeat=n))
...     return c[False] / sum(c.values())
...

>>> prob_success(2)
0.16666666666666666

>>> prob_success(3)
0.4444444444444444

>>> prob_success(4)
0.7222222222222222

>>> prob_success(5)
0.9074074074074074

>>> prob_success(6)
0.9845679012345679

You'll note the odds of success rise quickly; the birthday problem (or "paradox") means your odds of a repeat increase faster than a mere one-sixth for each additional die, so even within your range with a possibility of failure, three of five pools are heavily weighted towards success, and only one is heavily weighted towards failure. You'll have basically no ability to provide granular, predictably increasing levels of difficulty.

As it happens, the generalized birthday problem (for a "year" with six "days", looking for two or more "people" having the same birthday) describes your exact situation, so if you want to look for a general solution, that's the mathematical tool you'd use to solve for arbitrary numbers of dice (it's adaptable to different die sizes too), with arbitrary degrees of success (doubles, triples, etc.).

For triples, you have the same obvious end points (impossible below three dice, guaranteed to get a triple with 13 or more dice). Because I don't trust myself to apply the birthday problem, a generalized Python function that solves (incredibly slowly for large inputs) for arbitrary numbers of repetitions on arbitrary numbers of dice would be:

def prob_success(n, rep=2):
    c = Counter(max(x.values()) >= rep for x in map(Counter, product(range(1, 7), repeat=n)))
    return c[True] / sum(c.values())

# Or the much faster probabilistic one that gets the similar results for all the cases the exhaustive one can compute in reasonable time:
def prob_success_p(n, rep=2):
    d6_opts = tuple(range(1,7))
    c = Counter(max(Counter(random.choices(d6_opts, k=n)).values()) >= rep for _ in range(100_000))
    return c[True] / sum(c.values())

from 3 to 10 dice (stopped at 10 because exhaustive search is slow), the probabilities go:

3d6: 0.0278
4d6: 0.0972
5d6: 0.2130
6d6: 0.3673
7d6: 0.5409
8d6: 0.7074
9d6: 0.8425
10d6: 0.9325
11d6: 0.9788 (from probabilistic solver)
12d6: 0.9966 (from probabilistic solver)

which shows a less extreme version of the pattern looking for doubles; the probability of triples starts low (2.78% with three dice; basically, the odds that the other two dice happen to come up the same as whatever the first die rolled), but grows quickly; adding just one die makes a big change in the odds of "super success".

@TheDragonOfFlame: Gave data on triples. It's a less extreme form of the doubles case, but still pretty unpleasant. And when you're getting into larger numbers of dice, you're going to have a harder time determining when you have triples at a glance. I don't think this is a good idea for a game mechanic. — ShadowRanger, Commented Oct 5, 2021 at 1:01
I agree. I played a game with a similar mechanic, but that was just ‘whichever number that you rolled two of is your result I.e. 3,5,2,5, your result is 5. And never more than 6 dice. — TheDragonOfFlame, Commented Oct 5, 2021 at 3:07

Ilmari Karonen · Accepted Answer · 2021-10-04 23:49:24Z

For a numerical answer, we can use the AnyDice code I just recently wrote for another answer, which calculates the highest number of identical values in a roll of dice:

function: highest number of matches in ROLL:s {
  MAX: 0
  loop I over ROLL {
    COUNT: ROLL = I
    if MAX < COUNT { MAX: COUNT }
  }
  result: MAX
}

output [highest number of matches in 5d6]

The trick is to plot the results of this program in "at least" mode:

From this plot, we can read that the probability of rolling at least two identical values on 5d6 is 90.74%, while the probability of rolling at least 3 identical values is 21.30%, and so on.

Ps. If you want, it's also quite easy to tweak the program to calculate these probabilities for several different dice pool sizes and to graph the results:

HighDiceRoller · Accepted Answer · 2021-11-03 02:24:44Z

1

Stacked area plot up to 20 dice

This was generated using the algorithm from my CthulhuTech answer. Though for this specific problem Dale M's balls-into-bins formulation is probably easier to understand.

answered Nov 3, 2021 at 2:24

HighDiceRoller

4,8401 gold badge16 silver badges37 bronze badges

Add a comment |

Stack Exchange Network

What are the probabilities of rolling doubles in a pool of d6s?

4 Answers 4

The pigeonhole principle

Balls into bins

You're going to have a fairly limited range where success (of some level) is possible, but isn't guaranteed.

Stacked area plot up to 20 dice

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged
statistics
dice
.

Linked

Hot Network Questions

4 Answers 4

You're going to have a fairly limited range where success (of some level) is possible, but isn't guaranteed.

Stacked area plot up to 20 dice

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged statisticsdice.

Linked

Related

Not the answer you're looking for? Browse other questions tagged
statistics
dice
.