3
$\begingroup$

How many binary strings of length 10 are there that don't contain either of the substrings 101 or 010?

I've tried doing some casework, thinking that there wouldn't be too many cases, but it didn't work. I split my cases up into 1) string begins with 1 and 2) string begins with 0.

Casework might not be the best option here. I think an ideal method to solve this would involve some sort of recursion, but I'm not sure where I should start.

$\endgroup$
13
  • 1
    $\begingroup$ I suggest computing it for several small values. If nothing else, that gives you a lot of test cases for the eventual result $\endgroup$
    – lulu
    Commented Nov 21, 2023 at 17:07
  • $\begingroup$ What @lulu wrote. Plus, the sequence of those small values itself may suggest a general form that motivates a hypothesis for you to then attempt to prove or disprove. By “small values” here we mean small values of 10; in other words, replace 10 with some smaller string lengths. $\endgroup$ Commented Nov 21, 2023 at 17:16
  • $\begingroup$ Further note: I believe that if you follow this suggestion the pattern will emerge quickly, and that, once you see it, it will be easy for you to prove. $\endgroup$
    – lulu
    Commented Nov 21, 2023 at 17:28
  • $\begingroup$ @PaulTanenbaum: Do you mean "Solve the problem by explicitly checking all permutations for binary strings of length 1, 2, 3, 4,... and it will be easy to see a pattern so you can do it for length 10 using a formula, rather than counting the occurrences"? $\endgroup$ Commented Nov 21, 2023 at 17:28
  • $\begingroup$ ...I think while I was composing my comment, @lulu just confirmed that's exactly what's meant! :) $\endgroup$ Commented Nov 21, 2023 at 17:29

3 Answers 3

3
$\begingroup$

This answer is based upon the Goulden-Jackson Cluster Method, a generating function method encoding PIE, the principle of inclusion-exclusion. We consider the set of words of length $n\geq 0$ built from an alphabet $$\mathcal{V}=\{0,1\}$$ and the set $B=\{010,101\}$ of bad words which are not allowed to be part of the words we are looking for. We derive a generating function $f(s)$ with the coefficient of $s^n$ being the number of wanted words of length $n$.

According to the paper (p.7) the generating function $f(s)$ is \begin{align*} \color{blue}{f(s)=\frac{1}{1-ds-\text{weight}(\mathcal{C})}}\tag{1} \end{align*} with $d=|\mathcal{V}|=2$, the size of the alphabet and $\mathcal{C}$ being the weight-numerator of bad words with \begin{align*} \text{weight}(\mathcal{C})=\text{weight}(\mathcal{C}[010])+\text{weight}(\mathcal{C}[101])\tag{2} \end{align*}

We calculate according to the paper \begin{align*} \text{weight}(\mathcal{C}[010])&=-s^3-s\cdot\text{weight}(\mathcal{C}[010])-s^2\cdot\text{weight}(\mathcal{C}[101])\tag{3}\\ \text{weight}(\mathcal{C}[101])&=-s^3-s\cdot\text{weight}(\mathcal{C}[101])-s^2\cdot\text{weight}(\mathcal{C}[010])\tag{4}\\ \end{align*}

The additional terms on the right-hand side of (3) take account of the overlapping of $01\color{blue}{0}$ with $\color{blue}{0}10$ and of $0\color{blue}{10}$ with $\color{blue}{10}1$ and similarly in (4). Solving the linear equations in the unknowns $\text{weight}(\mathcal{C}[010])$ and $\text{weight}(\mathcal{C}[101])$ in (3) and (4) gives \begin{align*} \text{weight}(\mathcal{C}[010])=\text{weight}(\mathcal{C}[101])=-s^3\frac{1+s-s^2}{1+2s+s^2-s^4} \end{align*} from which according to (1) and (2) \begin{align*} \color{blue}{f(s)}&=\frac{1}{1-ds-\text{weight}(\mathcal{C})}\\ &=\frac{1}{1-2s+2s^3\frac{1+s-s^2}{1+2s+s^2-s^4}}\\ &\,\,\color{blue}{=\frac{1+s+s^2}{1-s-s^2}} \end{align*} follows. Series expansion of $f$ at $s=0$ gives \begin{align*} f(s)&=\frac{1+s+s^2}{1-s-s^2}\\ &=1 + 2 s + 4 s^2 + 6 s^3 + 10 s^4 + 16s^5+26s^6\\ &\qquad + 42 s^7+68s^8+110s^9 + \color{blue}{178} s^{10} + 288s^{11} +\cdots \end{align*} where the last line was calculated with the help of Wolfram Alpha.

Result: The blue marked coefficient of $s^{10}$ shows there are $\color{blue}{178}$ words of length $10$ over the alphabet $\mathcal{V}$ which do not contain $010$ or $101$.

$\endgroup$
5
  • $\begingroup$ The actual logic / algebra here mostly goes over my head (can I take comfort from the fact that even you used Wolfram Alpha to get to the end result? :) but I'm more than happy to accept it's all bulletproof, so that final 178 is definitely the right answer. Now I'm kinda interested in why my crude attempt to get to that answer by manually summing up "valid" strings of shorter lengths in search of a pattern didn't work. $\endgroup$ Commented Nov 22, 2023 at 12:03
  • $\begingroup$ @FumbleFingers: I've also checked the result with a bit of R code before I answered the problem. It is not too cumbersome to do the series expansion manually, but here the focus is on the PIE approach. :-) It is not that easy to manually keep track of all the overlapping cases. Some kind of bookkeeping is helpful. If no other nice answer regarding PIE will be presented, I can add an elementary answer at the weekend. $\endgroup$ Commented Nov 22, 2023 at 12:21
  • 1
    $\begingroup$ I'd be embarrassed to present my actual C++ code (three levels of nested subroutines! :), but printing the number of "non-excluded" strings for lengths 3-10 I got Len:03 count:6, Len:04 count:10, Len:05 count:16, Len:06 count:26, Len:07 count:42, Len:08 count:68, Len:09 count:110, Len:10 count:178. So at least my code works, but my extrapolation from a supposed pattern based on string lengths 3,4,5 was spurious. If there is a "formula" for generating the next number in the series, it's way too complicated for me to see it! Thanks to all for giving me something interesting. $\endgroup$ Commented Nov 22, 2023 at 13:46
  • $\begingroup$ Yes, 178 is the correct answer. I'm having a little bit of a hard time trying to understand this explanation, though - I think you implemented something like "weight-numerators", which is a technique I haven't learned yet. $\endgroup$
    – vic100
    Commented Nov 22, 2023 at 16:15
  • $\begingroup$ @vic100: Correct, here we use some kind of weight-numerators. You can find more information about this technique here. $\endgroup$ Commented Nov 22, 2023 at 21:48
2
$\begingroup$

Given a binary string, $b$, whose length is $n$, define a new binary string $b'$, as follows. The length of $b'$ is $n-1$, and for all $k\in \{1,\dots,n-1\}$, $$ b'_k=\begin{cases} 1 & \text{if }b_k\neq b_{k+1} \\ 0 & \text{if }b_k=b_{k+1} \end{cases} $$ $b'$ is like the "derivative" of $b$, because $b'$ records the change between adjacent entries of $b$ (0 for no change, 1 for a change).

Lemma 1: Let $n\ge 1$. Let $B$ be a binary string of length $n-1$. There exist exactly two binary strings, $b$, of length $n$, for which $b'=B$.

Lemma 2: Let $b$ be a binary string of length $n$, where $n\ge 1$. Then $b$ contains one of the substrings $101$ or $010$ if and only if $b'$ contains the substring $11$.

Let $a_n$ be the number of binary strings of length $n$ which avoid $101$ and $010$.
Let $a_n'$ be the number of binary strings of length $n$ which avoid $11$.
These two lemmas imply $$ a_n=2a_{n-1}',\qquad (n\ge 1) $$ All you need to do is compute what $a_{n-1}$ is. Counting binary strings without two adjacent ones is a well-known problem.

$\endgroup$
3
  • $\begingroup$ Thank you! I'll try to practice more problems related to recursion. The solution printed in the textbook used something similar to yours. $\endgroup$
    – vic100
    Commented Nov 21, 2023 at 23:55
  • $\begingroup$ @vic100 I also agree that recursion is the way to go here. However, I think that you will also need Inclusion-Exclusion, which is also discussed here. The reason is that you have to enumerate the number of strings in the union of those strings that contain either 101 or 010. $\endgroup$ Commented Nov 22, 2023 at 5:43
  • $\begingroup$ @user2661923 That was my original idea, but something weird happened to my calculations, so I decided to resort to casework, which didn't work out either. $\endgroup$
    – vic100
    Commented Nov 22, 2023 at 16:16
1
$\begingroup$

Here we derive a recurrence relation. We count the number $a_n$ of wanted strings of length $n$ from the set $\mathcal{V}=\{0,1\}$, i.e. binary strings of length $n$ which do not contain a bad word from $\mathcal{B}=\{010,101\}$.

We do so by partition them according to their matching length with the initial parts of a string of length $2$. We consider for $n\geq 2$: \begin{align*} \color{blue}{a_n=a^{[00]}_n+a^{[01]}_n+a^{[10]}_n+a^{[11]}_n}\tag{1} \end{align*}

  • The number $a^{[00]}_n$ counts the number of valid strings of length $n$ which start with $\color{blue}{00}$ from the left.

  • The number $a^{[01]}_n$ counts the number of valid strings of length $n$ which start with $\color{blue}{01}$ from the left.

  • The number $a^{[10]}_n$ counts the number of valid strings of length $n$ which start with $\color{blue}{10}$ from the left.

  • The number $a^{[11]}_n$ counts the number of valid strings of length $n$ which start with $\color{blue}{11}$ from the left.

We derive a relationship between valid strings of length $n$ with those of length $n+1$ as follows:

  • If a string counted by $a^{[00]}_n$ is appended

    • by $0$ from the left it contributes to $a^{[00]}_{n+1}$.

    • If it is appended by $1$ from the left it contributes to $a^{[10]}_{n+1}$.

  • If a string counted by $a^{[01]}_n$ is appended

    • by $0$ from the left it contributes to $a^{[00]}_{n+1}$.

    • Appending from the left by $1$ is not allowed since then we have an invalid string starting with $\color{blue}{101}$.

  • If a string counted by $a^{[10]}_n$ is appended

    • by $1$ from the left it contributes to $a^{[11]}_{n+1}$.

    • Appending from the left by $0$ is not allowed since then we have an invalid string starting with $\color{blue}{010}$.

  • If a string counted by $a^{[11]}_n$ is appended

    • by $0$ from the left it contributes to $a^{[01]}_{n+1}$.

    • If it is appended by $1$ from the left it contributes to $a^{[11]}_{n+1}$.

These relationships can be written as \begin{align*} \color{blue}{a^{[00]}_{n+1}}&\color{blue}{=a^{[00]}_{n}+a^{[01]}_{n}}\tag{2}\\ \color{blue}{a^{[01]}_{n+1}}&\color{blue}{=a^{[11]}_n}\tag{3}\\ \color{blue}{a^{[10]}_{n+1}}&\color{blue}{=a^{[00]}_n}\tag{4}\\ \color{blue}{a^{[11]}_{n+1}}&\color{blue}{=a^{[10]}_n+a^{[11]}_n}\tag{5}\\ \end{align*}

We can now derive a recurrence relation from (1) - (5). We obtain for $n\geq 2$: \begin{align*} \color{blue}{a_{n+1}}&=a^{[00]}_{n+1}+a^{[01]}_{n+1}+a^{[10]}_{n+1}+a^{[11]}_{n+1}\tag{$ \to (1)$}\\ &=\left(a^{[00]}_{n}+a^{[01]}_{n}\right)+a^{[11]}_{n}+a^{[00]}_{n} +\left(a^{[10]}_{n}+a^{[11]}_{n}\right)\tag{$\to (2),(3),(4),(5)$}\\ &=2a^{[00]}_{n}+a^{[01]}_{n}+2a^{[11]}_{n}+a^{[10]}_{n}\\ &=a_n+a^{[00]}_{n}+a^{[11]}_{n}\tag{$ \to (1)$}\\ &=a_n+\left(a^{[00]}_{n-1}+a^{[01]}_{n-1}\right)+\left(a^{[10]}_{n-1}+a^{[11]}_{n-1}\right)\tag{$ \to (2),(5)$}\\ &\,\,\color{blue}{=a_n+a_{n-1}}\tag{$\to (1)$} \end{align*}

We derive the wanted recurrence relation for $a_n$ as \begin{align*} \color{blue}{a_{n+1}}&\color{blue}{=a_{n}+a_{n-1}\qquad\qquad n\geq 2}\tag{6}\\ \color{blue}{a_0}&\color{blue}{=1}\\ \color{blue}{a_1}&\color{blue}{=2}\\ \color{blue}{a_2}&\color{blue}{=4}\\ \end{align*} where we set $a_0=1$ for the empty string and $a_1=2$ and $a_2=4$ according to the number of valid strings of length $1$ and length $2$.

We can now find the wanted number $a_{10}$ iteratively by calculating \begin{align*} a_3&=a_2+a_1=6\\ a_4&=a_3+a_2=10\\ a_5&=a_4+a_3=16\\ a_6&=a_5+a_4=26\\ a_7&=a_6+a_5=42\\ a_8&=a_7+a_6=68\\ a_9&=a_8+a_7=110\\ \color{blue}{a_{10}}&\,\,\color{blue}{ =a_9+a_8=178} \end{align*}

in accordance with the coefficients of the series expansion of $f$ in another answer.

$\endgroup$
1
  • 1
    $\begingroup$ Thanks! It's nice to see a variety of approaches for this problem. $\endgroup$
    – vic100
    Commented Nov 23, 2023 at 23:27

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged .