
You are given a biased coin. With probability $p$ it produces heads, where $p$ is in the range $(0,1)$. How can you use this coin to simulate an unbiased fair coin, i.e., a coin that produces heads with probability $0.5$?


7 Answers


One possibility:

Toss twice. If the outcomes are the same discard and start over. If they are different forget the first outcome and take the second as the "fair" one.

This works because:

The outcomes tails-heads and heads-tails have the same probability, $p(1-p)$. The "deeper" reason for this is that the coin doesn't change over time and the outcome of a single toss does not depend on history.
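For concreteness, here is a minimal Python sketch of the procedure. The `biased_flip` helper is my own stand-in for the physical coin, not something from the question:

```python
import random

def biased_flip(p: float) -> str:
    """One toss of the biased coin: 'H' with probability p, else 'T'."""
    return 'H' if random.random() < p else 'T'

def fair_flip(p: float) -> str:
    """Von Neumann's trick: toss twice, retry on HH/TT, else keep the second toss."""
    while True:
        first, second = biased_flip(p), biased_flip(p)
        if first != second:
            return second  # HT and TH each occur with probability p(1-p)
```

Each round of two tosses succeeds with probability $2p(1-p)$, so the expected number of tosses is $\frac{2}{2p(1-p)} = \frac{1}{p(1-p)}$.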

EDIT: Inspired by @trolley813's answer, here is a way to recycle the rejected entropy:

Remember the rejected pairs and treat HH as the H of a new virtual coin and TT as the T. The new probabilities will be $\frac{p^2}{p^2+(1-p)^2}$ for H and $\frac{(1-p)^2}{p^2+(1-p)^2}$ for T. To this virtual coin apply the same procedure as above. Note that this procedure can be stacked ad lib. (You'd need an infinite sheet of paper, but returns diminish exponentially, meaning you can reap most of the benefits on a single sheet of A4 (or letter, if you happen to be American) paper.)
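Here's a rough Python sketch of the stacking, reusing the `biased_flip` helper from the sketch above; the `max_levels` cap is my stand-in for the finite sheet of paper:

```python
def fair_flip_recycling(p: float, max_levels: int = 32) -> str:
    """Von Neumann with recycling: each rejected HH/TT pair becomes one toss
    of a more-biased virtual coin at the next level, and so on."""
    buffers = [[] for _ in range(max_levels)]  # pending tosses at each level
    while True:
        buffers[0].append(biased_flip(p))
        for level in range(max_levels - 1):
            if len(buffers[level]) < 2:
                break
            a, b = buffers[level]
            buffers[level] = []
            if a != b:
                return b  # a differing pair at any level yields a fair bit
            buffers[level + 1].append(a)  # HH -> H, TT -> T for the next level
```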

  • For reference, this is known as Von Neumann's randomness extractor. – Commented May 17, 2021 at 5:09
  • HT and TH both have the same probability, namely $p(1-p)$. This means that our final output is truly unbiased. – Commented May 17, 2021 at 5:46
  • Of course, the closer $p$ is to exactly 0 or exactly 1, the longer this will take. A purely biased coin that lands on the same side 100% of the time will take an infinite number of tosses to give a usable result. – Commented May 17, 2021 at 15:11
  • +1 for correctly nesting parentheses in text. – Jeffrey, Commented May 17, 2021 at 22:10
  • @DarrelHoffman 0 and 1 are explicitly ruled out in the question. That a more biased coin will take more tosses is a necessary consequence of the more biased coin's lower entropy. After all, to simulate a fair coin toss you need at least one bit of entropy. – Commented May 18, 2021 at 8:29

A straightforward answer (actually, a generalisation of loopy walt's answer):

Toss the coin $n=2^k$ times, where $n$ is the total number of throws and $h$ is the number of heads. The binomial coefficient ${n \choose h}=\frac{n!}{h!\,(n-h)!}$ is necessarily even, except when all throws produce the same result (i.e. $h=0$ or $h=n$; the probability of that is $p^n+(1-p)^n$, which tends to zero as $n\to\infty$ when $0<p<1$). Since all ${n \choose h}$ ways of throwing $h$ heads in $n$ throws are equally probable (the order does not matter), you can split the possible outcomes into two halves (as ${n \choose h}$ is even), e.g. lexicographically, and assign the result according to which half the actual outcome belongs to.

Example:

We got two heads in four throws. The number ${4 \choose 2}=6$ is (of course) even, and we list the 6 (lexicographically ordered) possible outcomes (the positions of the throws which gave heads): $(1, 2)$, $(1, 3)$, $(1, 4)$, $(2, 3)$, $(2, 4)$ and $(3, 4)$. If the actual outcome is among the first 3, we get a "head", otherwise a "tail".
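A Python sketch of the scheme (reusing the hypothetical `biased_flip` helper from the accepted answer's sketch; enumerating all ${n \choose h}$ arrangements is exponential in $n$, which is fine for small $k$):

```python
from itertools import combinations
from math import comb

def fair_flip_batch(p: float, k: int = 2) -> str:
    """Toss n = 2^k times; unless all tosses agree, C(n, h) is even, so the
    equally likely arrangements split into two lexicographic halves."""
    n = 2 ** k
    while True:
        tosses = [biased_flip(p) for _ in range(n)]
        h = tosses.count('H')
        if 0 < h < n:  # reject only the all-heads and all-tails batches
            heads_at = tuple(i for i, t in enumerate(tosses) if t == 'H')
            ranked = list(combinations(range(n), h))  # lexicographic order
            return 'H' if ranked.index(heads_at) < comb(n, h) // 2 else 'T'
```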

  • Straightforward? More seriously, is it ever worthwhile, i.e. can you find an example where this method is better (in terms of expected number of tosses) than the simple one? – loopy walt, Commented May 17, 2021 at 6:01
  • @loopywalt If you got HHTT, this method would give you a result, while the simple one wouldn't. – Deusovi, Commented May 17, 2021 at 6:33
  • It's not that simple: once you make your method dependent on a stopping criterion, you can no longer assume the outcomes to have equal probability. For example, all outcomes which would have allowed you to stop earlier now have probability zero. – loopy walt, Commented May 17, 2021 at 7:25
  • @trolley813 I don't think you can do "if 2 tosses didn't work, throw 2 more times". Given a coin with 90% heads, you would get TT=1%, TH=9%, HT=9%, HH=81%. If you then throw 2 more times after HH, you will end up with HHxx, or HHHHxxxx, etc., all of which convert to "head" given your lexicographical ordering. So in the end, the results would be TT=1% (T), TH=9% (T), HT=9% (H), HH=81% (H), for a final of 10% T and 90% H. There may be an ordering other than lexicographical that makes "extending the tosses" work, but you'd have to prove it. – JS1, Commented May 17, 2021 at 7:28
  • @JS1 (134), (234) have already been considered, so the answer needs to explicitly state that these are no longer under consideration at the new decision point, and use a lexicographical ordering of the remaining outcomes. – Steve, Commented May 17, 2021 at 7:48

Combining the ideas from loopy walt's and trolley813's answers, start by:

tossing the coin twice. If they are different forget the first outcome and take the second as the "fair" one.

After that:

continue making pairs of tosses. As soon as any pair differs, take the second as the "fair" one.

If this does not make a decision,

All pairs of tosses are necessarily HH or TT, which can be used to simulate an even more biased coin (HH mapping to H, and TT mapping to T), with the exact same procedure run on these results - e.g. HHTT maps to HT, which is read as tails, and TTHH maps to TH, which is read as heads.

Generalising this to $n$ levels:

any sequence except a continuous run of H or T will be resolved within $2^n$ tosses. The point at which it is resolved will map to a final sequence of HT or TH either for the original coin, or for one of the layers of "virtual coins".

There is then also the possibility of further shortening the sequence, for example:

After 6 tosses resulting in HHHHTT, the next 2 tosses cannot fail to make a decision:
HHHHTTHH => HHTH (level 1 virtual coin) => "heads" (final toss of non-matching final pair)
HHHHTTHT => "tails" (final toss of non-matching final pair)
HHHHTTTH => "heads" (final toss of non-matching final pair)
HHHHTTTT => HHTT (level 1 virtual coin) => HT (level 2 virtual coin) => "tails"(final toss of non-matching final pair)

As such the 7th toss in this case is completely irrelevant, and could be skipped, allowing a decision with 1 fewer toss. In effect, we can immediately compress the HHHHTT sequence to HHT, upgrading level 1 virtual coins to "real" past tosses.

A simplified procedure for this seemed to suggest itself:

  1. Toss the coin twice.

  2. If the two tosses are different, ignore the first one, and the second is the final result.

  3. While all tosses are identical, keep tossing until you get the opposite outcome.

  4. If you've done an EVEN number of tosses at the point you get a difference, the final coin toss is your result.

  5. Otherwise, toss once more to make the number of tosses even. If it differs from the last toss, that final coin toss is your result.

  6. Otherwise, all tosses now form pairs of identical tosses. Compress to a sequence of half the length, which has itself just had a toss of the "opposite outcome", and go to step 4.

However, it seems the "compression" is over-aggressive, and may not work in certain cases:

Consider a sequence HHHHHHHHTT, which the above procedure prematurely compresses to HHHHT.
Rather than terminating early, we roll 6 more times, creating:
HHHHHHHHTTabcdef
The final result will necessarily be one of b, d, f, and the procedure will certainly terminate by then.
HHHHHHHHTTHHxxxx => HHHHTH => "heads" (b)
HHHHHHHHTTHTxxxx => "tails" (b)
HHHHHHHHTTTHxxxx => "heads" (b)
HHHHHHHHTTTTHHHH => HHHHTTHH => HHTH => "heads" (f)
HHHHHHHHTTTTHHHT => "tails" (f)
HHHHHHHHTTTTHHTH => "heads" (f)
HHHHHHHHTTTTHHTT => HHHHTTHT => "tails" (f)
HHHHHHHHTTTTHTxx => "tails" (d)
HHHHHHHHTTTTTHxx => "heads" (d)
HHHHHHHHTTTTTTHH => HHHHTTTH => "heads" (f)
HHHHHHHHTTTTTTHT => "tails" (f)
HHHHHHHHTTTTTTTH => "heads" (f)
HHHHHHHHTTTTTTTT => HHHHTTTT => HHTT => HT => "tails" (f)
Toss 'e' is completely irrelevant, but tosses 'a' and 'c' seem to be needed to determine whether we take 'b', 'd', or 'f' as the final result.

As such, a more complicated compression function seems to be required (which I'd thought of before, and prematurely over-simplified to the one above, missing a critical step in the logic):

  1. Keep tossing the coin until the first point you've seen BOTH outcomes. WLOG, assume that's a long sequence of 'H' followed by a single 'T'.

  2. If the total number of tosses is even, stop, with the result as the final toss (WLOG that's tails).

  3. Otherwise, toss once more to get an even number. WLOG, if the extra toss is 'H', matching the sequence, the final result is "heads".

  4. If total number of tosses is divisible by 4, stop with the result as the final toss (either of the two identical tosses which differ from the rest of the sequence).

  5. If total number of tosses is 6 (mod 8), toss ONCE more, and return that as the final result (either due to not-matching the skipped coin, or not-matching one of the preceding runs)

  6. Total number of tosses is now 2 (mod 8). Toss TWICE more, if the two results differ, return the second as the final result. If they are both 'H' (matching the long sequence), return THAT as the final result.

  7. Total number of tosses is now 4 (mod 8) with the final 4 all 'T' and at least 8 'H' preceding it. Toss the coin twice more, and if the two results differ, return that as the final result.

  8. Now the total number of tosses is 6 (mod 8), with 6 identical tosses preceded by at least 8 opposite tosses. If the total is 14 (mod 16), toss ONCE more and return that as the final result.

  9. Otherwise, WLOG we have a sequence of 16 H followed by 8 T. This can be treated as equivalent to a long sequence of T for the next few flips. It will remain similarly unresolved for any sequence of (e.g.) $2^{n+1}$ H followed by $2^n$ T, then perhaps $2^{n-1}$ H, etc. in an alternating sequence.

However, note that the worst case to reach before this "compression function" can be invoked is

$2^k - 2$ tosses, where the first $2^{k-1}$ are identical, and the remainder contain at least one precisely-aligned run of $2^j$ of the opposite face (with $1 \le j < k$)

This does indeed then create almost as much potential book-keeping as the uncompressed version, although it's possible that a more efficient compression function (or possibly a better mathematical justification for the original one, if it was "right by coincidence") could be devised.

  • Nice. I wonder what the average number of tosses needed to reach an outcome is. Is this the optimal we can do? – Commented May 17, 2021 at 8:49
  • @DmitryKamenetsky there will be more optimal procedures for cases where the value of $p$ is known precisely. For example, if $p^2$ or $(1-p)^2$ is precisely 0.5, a decision can be made within 2 tosses (and may be decided in 1 toss if the less likely outcome occurs first). – Steve, Commented May 17, 2021 at 8:54
  • It's a bit more complicated than the steps make it seem, since the virtual coins in the compressed sequence do not have the same bias as a single coin. So if you've compressed the sequence in step 6, and step 4 gives no immediate result, in step 5 you have to toss an extra virtual coin, i.e. toss the real coin twice. It's tricky to write down the recursion correctly, especially when there can be several compression steps. – Commented May 17, 2021 at 9:06
  • @JaapScherphuis as I read it, the shortcut of "skipping the last-but-one toss" is meant to be used here. I'm unsure, though, whether that works over multiple recursion levels. – loopy walt, Commented May 17, 2021 at 9:16
  • @DmitryKamenetsky I added another answer covering that case. – Steve, Commented May 17, 2021 at 15:02

Similar to the previous answer and the accepted one:

keep track of
- the number of throws (N)
- the length of the current run of identical heads/tails (L)

Start throwing the coin:
- if N is a multiple of 2 and L = 1: use the last result
- if N is a multiple of 4 and L = 2: use the last result
- if N is a multiple of 8 and L = 4: use the last result
- if N is a multiple of 16 and L = 8: use the last result
- ... (add similar lines for as long as you can calculate faster than you can throw)
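If I've read the rules correctly, they amount to "stop as soon as L is a power of two and N is a multiple of 2L". A Python transcription under that assumption (reusing the `biased_flip` helper from the accepted answer's sketch):

```python
def fair_flip_runs(p: float) -> str:
    """Stop once the current run length L is a power of two and the toss
    count N is a multiple of 2L; the latest toss is then the result."""
    n, run, last = 0, 0, None
    while True:
        toss = biased_flip(p)
        n += 1
        run = run + 1 if toss == last else 1  # length of the current run
        last = toss
        if run & (run - 1) == 0 and n % (2 * run) == 0:  # L in {1, 2, 4, ...}
            return toss
```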


[An earlier version of this answer had the wrong formulae - these are now corrected]

Assuming that $p$ is known, we can proceed as follows:

  1. Initialise $t_1 = 0.5$ (target probability)

  2. Flip the coin for the $n$th time.

  3. If the coin landed H, set $t_{n+1} = t_n/p$; otherwise, set $t_{n+1} = (t_n-p)/(1-p)$.

  4. If $0 < t_{n+1} < 1$, go back to step 2.

  5. The coin shows the final result, as does the sign of the value of $t_{n+1}$ (0 counts as negative).

This procedure is optimal where $p$ is known, and in particular will always resolve in a single toss when $p = 0.5$.
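A short Python rendering of the steps (with the same hypothetical `biased_flip` helper as in the earlier sketches). Note that $t_{n+1} \ge 1$ can only follow a head and $t_{n+1} \le 0$ can only follow a tail, so returning the last toss is exactly step 5:

```python
def fair_flip_known_p(p: float) -> str:
    """Renormalise the target t = 0.5 against the probability of each
    observed toss, arithmetic-coding style, until t leaves (0, 1)."""
    t = 0.5
    while True:
        toss = biased_flip(p)
        t = t / p if toss == 'H' else (t - p) / (1 - p)
        if t >= 1 or t <= 0:  # t >= 1 only after H, t <= 0 only after T,
            return toss       # so the coin itself shows the result
```

For $p = 0.5$ the first toss drives $t$ to exactly 1 or 0, matching the claim above.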

Worked example for $p = 0.2$:

1:H => $t_2 = 0.5 / 0.2 = 2.5 \ge 1$ : immediate result
1:T => $t_2 = (0.5 - 0.2) / 0.8 = 0.375 \not\le 0$ : continue to 2
2:H => $t_3 = 0.375 / 0.2 = 1.875 \ge 1$ : immediate result
2:T => $t_3 = (0.375 - 0.2) / 0.8 = 0.21875 \not\le 0$ : continue to 3
3:H => $t_4 = 0.21875 / 0.2 = 1.09375 \ge 1$ : immediate result
3:T => $t_4 = (0.21875 - 0.2) / 0.8 \approx 0.02344 \not\le 0$ : continue to 4
4:H => $t_5 = 0.02344 / 0.2 \approx 0.11719 \not\ge 1$ : continue to 5
4:T => $t_5 = (0.02344 - 0.2) / 0.8 \approx -0.2207 \le 0$ : immediate result
5:H => $t_6 = 0.11719 / 0.2 \approx 0.5859 \not\ge 1$ : continue to 6
5:T => $t_6 = (0.11719-0.2)/ 0.8 \approx -0.1035 \le 0$ : immediate result
...
(In this table, a head on any of the first 3 flips is kept as the result: $p$ is so low that even the cumulative probability of seeing a head within the first 3 flips, $0.2 + 0.8\cdot0.2 + 0.8^2\cdot0.2 = 0.488$, doesn't reach 0.5. After 3 tails, the next flip is kept only if it's a tail, etc.)
The procedure will terminate in finite time with probability 1, and for a countably infinite number of values of $p$ there is an upper bound on the number of flips. I think the "average" number of flips required would be inversely proportional to $p(1-p)$, approaching 2 for an "almost precisely fair" coin (where each flip has a $(0.5 \pm \epsilon)$ chance of terminating the procedure).

For a known $p$

a table can be pre-generated up to any finite number of flips, stating for each flip whether H or T should be kept.

This can in principle be inverted,

by generating such a table (in effect a binary sequence), and then calculating what value of $p$ it corresponds to. The maths for that is left as an exercise for the reader... there would also be a 1:1 correspondence between any possible value of $p$ and the (potentially infinite) binary sequence representing this table.
As an example of other special values where the sequence of rolls is always finite: for $p = \frac1{\sqrt2}$ or $p = 1 - \frac1{\sqrt2}$, the fair coin can be simulated precisely within 2 tosses, corresponding to the solutions of $\frac{\frac{0.5}p-p}{1-p}=0$ and $\frac{\frac{0.5-p}{1-p}-p}{1-p}=0$. There are 4 special values of $p$ for which the fair coin would be simulated in no more than 3 tosses, etc.

Similar sequences would more commonly be used

when you have a "fair coin", and want to simulate some other probability, or for the mathematical basis of Huffman coding or Arithmetic coding.

It can be generalised too, for example to simulate the roll of a fair die

use an array of starting thresholds from 1/6 to 5/6 instead of a single threshold at 0.5, and iterate until all thresholds are either $\le 0$ or $\ge 1$. Similar techniques can convert any known-probability event to any other.
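A sketch of the die version under the same assumptions; reading off the face as one plus the number of thresholds driven below zero is my interpretation of the construction:

```python
def fair_die_roll(p: float) -> int:
    """Five thresholds k/6 instead of a single one at 0.5; iterate until
    every threshold has left the open interval (0, 1)."""
    ts = [k / 6 for k in range(1, 6)]
    while any(0 < t < 1 for t in ts):
        toss = biased_flip(p)
        ts = [t / p if toss == 'H' else (t - p) / (1 - p) for t in ts]
    # Both updates are increasing maps, so a resolved threshold stays
    # resolved; thresholds driven below 0 lie under the sampled point.
    return 1 + sum(t <= 0 for t in ts)
```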

If you want to guarantee that an arbitrary (but known) $p$ will still generate a result within a finite number of rolls, given the practical constraints (e.g. of floating point numbers, or of the required precision for the particular simulation):

Replace the calculations involving $t_1 = 0.5$ with (e.g.) $t_{1h} = 0.500000000001$ and $t_{1l} = 0.499999999999$. The difference between the "upper" and "lower" values of $t_n$ grows with each iteration, guaranteeing a result within a given tolerance of a fair coin and within a well-defined, bounded number of tosses.
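A sketch of that variant (the tolerance and the toss cap are arbitrary placeholders of mine): run the same update on both targets, and stop either when they resolve the same way (an exact result) or when they have diverged so far that, per the argument above, either answer is within the chosen tolerance of fair:

```python
def fair_flip_bounded(p: float, eps: float = 1e-12, max_tosses: int = 64) -> str:
    """Track a lower and an upper target around 0.5; their gap grows with
    every toss, so the loop ends quickly for any p bounded away from 0 and 1."""
    lo, hi = 0.5 - eps, 0.5 + eps
    for _ in range(max_tosses):
        toss = biased_flip(p)
        lo = lo / p if toss == 'H' else (lo - p) / (1 - p)
        hi = hi / p if toss == 'H' else (hi - p) / (1 - p)
        if hi <= 0 or lo >= 1:   # both targets resolved the same way: exact
            return toss
        if lo <= 0 and hi >= 1:  # targets disagree: either answer is
            return toss          # within ~eps of fair, keep the toss
    return toss                  # cap reached: accept the residual bias
```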

  • Wow, that's very cool! – Commented May 17, 2021 at 16:21
  • Steve, there seems to be a problem with this method. I have implemented it in Java and compared it to the original method that doesn't know the value of $p$. For some $p$ your method does not give an unbiased result, and sometimes the simple method takes fewer steps. I am not sure what the problem is. Here is my code: ghostbin.co/paste/ax6f5 – Commented May 18, 2021 at 6:24
  • @DmitryKamenetsky I found an issue with the formulae - I'd got them wrong at first, and then hurriedly made a couple of incorrect "corrections" that happened to have almost the right effect for the first 5 steps or so of the worked example! Re-checking the working, it should be that the new value of $t$ goes below 0 for "tails" rather than above 1. – Steve, Commented May 18, 2021 at 7:46
  • Yes, that works now! And it takes fewer steps than the simple method. I also had a bug in my code. Here is the new one: ghostbin.co/paste/8zzj57 – Commented May 18, 2021 at 8:38
  • @DmitryKamenetsky it looks like your code is using something like $p = 0.499999999...$ for the final check, due to accumulated floating-point rounding errors in the for loop. (If it were calling with $p = 0.5$, which is represented precisely in binary, that's a special case and the loop within generateKnowingP would always terminate in the first iteration.) – Steve, Commented May 18, 2021 at 9:23

My friend Adam came up with a new solution while we were hiking up a steep mountain. I want to post it here, because it attacks the problem from a different angle.

Toss the biased coin 100 times and count how many heads came up; call it $a$. Now toss it another 100 times and count how many heads came up; call it $b$. If $a>b$ then output "H", if $a<b$ then output "T", and if $a=b$ do nothing and start over. The two outcomes $a>b$ and $a<b$ are equally likely, so "H" and "T" are generated with the same probability. If instead we toss the coin just once per batch, this method becomes identical to the accepted answer: "HH" and "TT" are skipped since the counts are equal, while "HT" produces "H" and "TH" produces "T" with equal probability. I thought it was quite neat how a method that initially looks completely different ends up being the same.
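A Python sketch of Adam's method (same hypothetical `biased_flip` helper as in the accepted answer's sketch):

```python
def fair_flip_counts(p: float, batch: int = 100) -> str:
    """Compare the head counts of two equal-sized batches of biased tosses;
    by symmetry a > b and a < b are equally likely, and ties just retry."""
    while True:
        a = sum(biased_flip(p) == 'H' for _ in range(batch))
        b = sum(biased_flip(p) == 'H' for _ in range(batch))
        if a != b:
            return 'H' if a > b else 'T'
```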


I'm pretty simple; the other answers seem similar but don't make much sense to me.

I believe my solution is simple and should work:

Flip the biased coin; if it's heads, add one to heads. Flip the biased coin again; if it's heads, add one to tails. Repeat. So just invert the result of every second toss.

  • Welcome to PSE! Your answer is not clear enough to me: it seems like you are just counting the outcomes of the biased coin rather than simulating an unbiased one. In your solution, what are the outcomes of the simulated coin? – melfnt, Commented May 19, 2021 at 7:41
  • This answer generates heads with average probability 0.5, but the different coins aren't identically distributed - if the biased coin has probability 0.01 for H you will likely generate THTHTHTH. It's fast and has the correct average, but it is not random. – Commented May 19, 2021 at 9:11
  • @melfnt no, I meant that on every second throw you count the opposite of whatever the coin shows, therefore I think it should negate the bias. – Aequitas, Commented May 19, 2021 at 14:00
  • @HansOlsson that was just an example; I meant you just invert every second throw. If you roll heads then heads again, you would count one head and one tail; if you rolled heads and then tails, you would count two heads. So assuming the biased coin is 0.1 heads and 0.9 tails: the first throw gives 0.1 H / 0.9 T, the 2nd throw gives 0.9 H / 0.1 T, so (0.9+0.1)/2 = 0.5 for each outcome. – Aequitas, Commented May 19, 2021 at 14:03
  • Ah, thanks everyone - I think I now understand the flaw in my solution, and it also helped me understand why the accepted solution works. It's similar to mine in that it takes two opposite results; the difference is that I force the 2nd result to be opposite, whereas the accepted one just waits until two different results happen to occur in a row. That also means the biased and unbiased results can be in either order, so you just take the 2nd (or I guess the 1st) and you get your random generation. – Aequitas, Commented May 19, 2021 at 23:07
