32
$\begingroup$

Inspired by Peter Donnelly's talk at TED, in which he discusses how long it would take for a certain pattern to appear in a series of coin tosses, I created the following script in R. Given two patterns 'hth' and 'htt', it calculates how long it takes (i.e. how many coin tosses) on average before you hit one of these patterns.

coin <- c('h','t')

hit <- function(seq) {
    miss <- TRUE
    fail <- 3
    trp  <- sample(coin,3,replace=T)
    while (miss) {
        if (all(seq == trp)) {
            miss <- FALSE
        }
        else {
            trp <- c(trp[2],trp[3],sample(coin,1,T))
            fail <- fail + 1
        }
    }
    return(fail)
}

n <- 5000
trials <- data.frame("hth"=rep(NA,n),"htt"=rep(NA,n))

hth <- c('h','t','h')
htt <- c('h','t','t')

set.seed(4321)
for (i in 1:n) {
    trials[i,] <- c(hit(hth),hit(htt))    
}
summary(trials)

The summary statistics are as follows,

      hth             htt        
 Min.   : 3.00   Min.   : 3.000  
 1st Qu.: 4.00   1st Qu.: 5.000  
 Median : 8.00   Median : 7.000  
 Mean   :10.08   Mean   : 8.014  
 3rd Qu.:13.00   3rd Qu.:10.000  
 Max.   :70.00   Max.   :42.000 

In the talk it is explained that the average number of coin tosses would be different for the two patterns; as can be seen from my simulation. Despite watching the talk a few times I'm still not quite getting why this would be the case. I understand that 'hth' overlaps itself and intuitively I would think that you would hit 'hth' sooner than 'htt', but this is not the case. I would really appreciate it if someone could explain this to me.

$\endgroup$

5 Answers 5

35
$\begingroup$

Think about what happens the first time you get an H followed by a T.

Case 1: you're looking for H-T-H, and you've seen H-T for the first time. If the next toss is H, you're done. If it's T, you're back to square one: since the last two tosses were T-T you now need the full H-T-H.

Case 2: you're looking for H-T-T, and you've seen H-T for the first time. If the next toss is T, you're done. If it's H, this is clearly a setback; however, it's a minor one since you now have the H and only need -T-T. If the next toss is H, this makes your situation no worse, whereas T makes it better, and so on.

Put another way, in case 2 the first H that you see takes you 1/3 of the way, and from that point on you never have to start from scratch. This is not true in case 1, where a T-T erases all progress you've made.

$\endgroup$
1
  • $\begingroup$ Ohh, so in this scenario the coin flipping doesn't stop when one pattern wins! That makes sense. This confused me for a while (haven't watched the TED talk) so I figured I'd comment to help others who might have been thinking the same thing. $\endgroup$
    – user136692
    Commented Nov 2, 2016 at 12:54
19
$\begingroup$

I like to draw pictures.

enter image description here

These diagrams are finite state automata (FSAs). They are tiny children's games (like Chutes and Ladders) that "recognize" or "accept" the HTT and HTH sequences, respectively, by moving a token from one node to another in response to the coin flips. The token begins at the top node, pointed to by an arrow (line i). After each toss of the coin, the token is moved along the edge labeled with that coin's outcome (either H or T) to another node (which I will call the "H node" and "T node," respectively). When the token lands on a terminal node (no outgoing arrows, indicated in green) the game is over and the FSA has accepted the sequence.

Think of each FSA as progressing vertically down a linear track. Tossing the "right" sequence of heads and tails causes the token to progress towards its destination. Tossing a "wrong" value causes the token to back up (or at least stand still). The token backs up to the most advanced state corresponding to the most recent tosses. For instance, the HTT FSA at line ii stays put at line ii upon seeing a head, because that head could be the initial sequence of an eventual HTH. It does not go all the way back to the beginning, because that would effectively ignore this last head altogether.

After verifying these two games indeed correspond to HTT and HTH as claimed, and comparing them line by line, and it should now be obvious that HTH is harder to win. They differ in their graphical structure only on line iii, where an H takes HTT back to line ii (and a T accepts) but, in HTH, a T takes us all the way back to line i (and an H accepts). The penalty at line iii in playing HTH is more severe than the penalty in playing HTT.

This can be quantified. I have labeled the nodes of these two FSAs with the expected number of tosses needed for acceptance. Let us call these the node "values." The labeling begins by

(1) writing the obvious value of 0 at the accepting nodes.

Let the probability of heads be p(H) and the probability of tails be 1 - p(H) = p(T). (For a fair coin, both probabilities equal 1/2.) Because each coin flip adds one to the number of tosses,

(2) the value of a node equals one plus p(H) times the value of the H node plus p(T) times the value of the T node.

These rules determine the values. It's a quick and informative exercise to verify that the labeled values (assuming a fair coin) are correct. As an example, consider the value for HTH on line ii. The rule says 8 must be 1 more than the average of 8 (the value of the H node on line i) and 6 (the value of the T node on line iii): sure enough, 8 = 1 + (1/2)*8 + (1/2)*6. You can just as readily check the remaining five values in the illustration.

$\endgroup$
3
  • $\begingroup$ The FSA approach is a great way to analyze Penney's Game (in the reply by @Henry). The values are labeled a little differently: the FSA now has one accepting node per pattern. To find the chance of your pattern winning, label its accepting node with 1 and all other accepting nodes with 0. The value at any other node equals the average of the values of its H and T nodes. The value of the (unique) start node is the chance of winning. $\endgroup$
    – whuber
    Commented Jun 21, 2011 at 23:24
  • $\begingroup$ I find your picture helpful & intuitive, @whuber, but I don't quite follow the rules for assigning the numbers to the nodes. Eg, your example uses 1 + (1/2)*10 + (1/2)*6. Shouldn't this be 9? Since I gather you fill these out by starting w/ $0$ at the terminal node, it might be easier to follow if your example was how to get the # for node iii, given iv=0. $\endgroup$ Commented Jul 26, 2013 at 15:46
  • $\begingroup$ @gung Thanks for catching that. I fixed the example. However, there is a typo in the figure: it looks like the value of HTT at line iii should be 4, rather than 2. $\endgroup$
    – whuber
    Commented Jul 26, 2013 at 16:03
18
$\begingroup$

Suppose you toss the coin $8n+2$ times and count the number of times you see a "HTH" pattern (including overlaps). The expected number is $n$. But it is also $n$ for "HTT". Since $HTH$ can overlap itself and "HTT" cannot, you would expect more clumping with "HTH", which increases the expected time for the first appearance of $HTH$.

Another way of looking at it is that after reaching "HT", a "T" will send "HTH" back to the start, while an "H" will start progress to a possible "HTT".

You can work out the two expected times using Conway's algorithm [I think], by looking at the overlaps: if the first $k$ tosses of the pattern match the last $k$, then add $2^k$. So for "HTH" you get $2+0+8=10$ as the expectation and for "HTT" you get $0+0+8=8$, confirming your simulation.

The oddness does not stop there. If you have a race between the two patterns, they have an equal probability of appearing first, and the expected time until one of them appears is $5$ (one more than expected time to get "HT", after which one of them must appear).

It gets worse: in Penney's game you choose a pattern to race and then I choose another. If you choose "HTH" then I will choose "HHT" and have 2:1 odds of winning; if you choose "HTT" then I will choose "HHT" again and still have 2:1 odds in my favour. But if you choose "HHT" then I will choose "THH" and have 3:1 odds. The second player can always bias the odds, and the best choices are not transitive.

$\endgroup$
2
  • $\begingroup$ +1 Thanks for the link to Penney's game; more sleepless nights :) $\endgroup$
    – lafrasu
    Commented Jun 21, 2011 at 18:53
  • $\begingroup$ Dear Henry, I asked a similar question on this site, and got told to look for an answer here. I looked at Penney's game, but still can't work out my problem. Any help will be appreciated. $\endgroup$ Commented Apr 3, 2013 at 20:09
4
$\begingroup$

Some great answers. I'd like to take a slightly different tack, and address the question of counter-intuitivity. (I quite agree, BTW)

Here's how I make sense of it. Imagine a column of random sequential coin-toss results printed on a paper tape, consisting of the letters "H" and "T".

Arbitrarily tear off a section of this tape, and make an identical copy.

On a given tape, the sequence HTH and the sequence HTT will each occur as often, if the tape is long enough.

But occasionally the HTH instances will run together, ie HTHTH. (or even very occasionally HTHTHTH)

This overlap cannot happen with HTT instances.

Use a highlighter to pick out the "stripes" of successful outcomes, HTH on one tape and HTT on the other. A few of the HTH stripes will be shorter due to the overlap. Consequently the gaps between them, on average, will be slightly longer than on the other tape.

It's a bit like waiting for a bus, when averagely there's one every five minutes. If the buses are allowed to overlap each other, the interval will be slightly longer than five minutes, on average, because sometime two will go past together.

If you arrive at an arbitrary time, you'll be waiting slightly longer for the next (to you, first) bus, on average, if they're allowed to overlap.

$\endgroup$
3
$\begingroup$

I was looking for the intuition to this in the integer case (as I'm slogging through Ross' Intro. to Probability Models). So I was thinking about integer cases. I found this helped:

Let $A$ be the symbol needed to begin the pattern I'm waiting for.

Let $B$ be the symbol needed to complete the pattern I'm waiting for.

In the case of an overlap roughly $A=B$ and so $P(A \cap \tilde{B})=0$.

Whereas in the case of no overlap $A \ne B$ and so $P(A \cap \tilde{B}) \ge 0$.

So, let me imagine that I have a chance to finish the pattern on the next draw. I draw the next symbol and it doesn't finish the pattern. In the case my pattern doesn't overlap, the symbol drawn might still allow me to begin building the pattern from the beginning again.

In the case of an overlap, the symbol I needed to finish my partial pattern was the same as the symbol that I would need to start rebuilding. So I can't do either, and therefore will definitely need to wait until the next draw for a chance to start building again.

$\endgroup$

Not the answer you're looking for? Browse other questions tagged or ask your own question.