2
$\begingroup$

The following induction is flawed because the result admits a counter-example, but I can't find where is the flaw. Please advise. [Edit: As pointed out in the answer, the error was in the base case. The base case can further be boiled down to $H_2$ alone.]

Let $A_n$ be the average number of comparisons to sort $n$ keys by merging them in a top-down fashion (see any algorithm textbook). It can he shown that $$ A_0 = A_1 = 0;\quad A_n = A_{\lfloor{n/2}\rfloor} + A_{\lceil{n/2}\rceil} + n - \frac{\lfloor{n/2}\rfloor}{\lceil{n/2}\rceil+1} - \frac{\lceil{n/2}\rceil}{\lfloor{n/2}\rfloor+1}. $$ I am interested in a lower bounds on $A_n$ of the form $n\lg n + an + b$, where $\lg n$ is the binary logarithm.

By distinguishing on the parity of $n$, we get $$ A_{2p} = 2 A_{p} + 2p - 2 + \frac{2}{p+1};\quad A_{2p+1} = A_{p} + A_{p+1} + 2p - 1 + \frac{2}{p+2}. $$ Let $H_n$ be the property to prove $n\lg n + an + b \leqslant A_n$.

Let us assume $H_n$ for $n < 2p$, in particular, $H_p \colon (2p\lg p + 2ap + 2b) + 2p - 2 + 2/(p+1) \leqslant A_{2p}$. We want $H_{2p} \colon 2p\lg(2p) + 2ap + b = 2p\lg p + 2ap + 2p + b \leqslant A_{2p}$, which holds if $$ 2 - \frac{2}{p+1} = \frac{2p}{p+1} \leqslant b. $$

Let $g(p) := 2p/(p+1)$. This function is strictly increasing for $p > 0$ and $g(p) \rightarrow 2^{-}$, as $p \rightarrow +\infty$.

We assume $H_n$ for $n < 2p+1$, in particular, $H_p$ and $H_{p+1}$, which imply $p\lg p + (p+1)\lg(p+1) + a(2p+1) + 2b + 2p - 1 + 2/(p+2) \leqslant A_{2p+1}$ We want to prove $H_{2p+1} \colon (2p+1)\lg(2p+1) + a(2p+1) + b \leqslant A_{2p+1}$, which holds if $$ (2p+1)\lg(2p+1) - (p+1)\lg(p+1) - p\lg p - 2p + 1 - \frac{2}{p+2} \leqslant b. $$ Let $f(p) := (2p+1)\lg(2p+1) - (p+1)\lg(p+1) - p\lg p - 2p + 1 - \frac{2}{p+2}$. We then have the condition $f(p) \leqslant b$. This function is strictly increasing for $p > 0$ and $f(p) \rightarrow 2^{-}$, as $p \rightarrow +\infty$.

Because we need to satisfy the conditions $f(p) \leqslant b$ and $g(p) \leqslant b$ for both inductive steps to hold, we have to compare $f$ and $g$, when $p$ is a natural number: $g(1) < f(1)$, $g(2) < f(2)$, but $f(p) < g(p)$ if $p \geqslant 3$.

We have $A_{6} = 59/6$ and $A_{7} = 191/15$. Let us choose $p=3$ for the base case:

  • $H_{6} \colon 6\lg 6 + 6a + b \leqslant A_{6} \Leftrightarrow a \leqslant 59/36 - \lg 6 - b/6$,
  • $H_{7} \colon 7\lg 7 + 7a + b \leqslant A_{7} \Leftrightarrow a \leqslant 191/105 - \lg 7 - b/7$.

Since we want to maximise $a$, we need to minimise $b$. We know that $f(p) < g(p) \leqslant b$, so the only choice for $b$ not to depend on $p$ is the smallest upper bound of $g(p)$, that is, $$ b_{\min} = 2, \quad \text{therefore}\quad a_{\max} = \frac{47}{36} - \lg 6 \simeq -1.2794. $$

By the principle of complete induction, we just established, for $n > 5$, $$ n\lg n - \left(\lg 6 - \frac{47}{36}\right)n + 2 < A_{n}. $$

But this is wrong! The first counter-example is $n=2^3$, for which the bound is approximatively $15.7647$ and the exact value is $A_8 = 236/15 \simeq 15.733$. Where is the flaw in my argument? Does the problem happens only at $n=2^p$?

$\endgroup$
2
  • 1
    $\begingroup$ I've no time to go through the details, but in general if you have a proof and a counterexample, you can locate the error by testing claims made in the proof on the concrete counterexample: they should be true at the start and false at the end, so there is a (at least one) point where the change from true to false, which is where there is an error in the reasoning. Using binary search you can locate the error in time logarithmic in the length of the proof ;-) $\endgroup$ Commented Aug 28, 2012 at 12:33
  • $\begingroup$ What you say is wise indeed, but I am afraid I still don't understand why $H_8$ fails. I did some trimming on the text, though;-) $\endgroup$ Commented Aug 30, 2012 at 14:26

1 Answer 1

1
$\begingroup$

Up to the point where you show that no $b\lt2$ can work, you are fine. After that, fix $b=2$ and find some $a$ such that the base case of the induction $A_n\geqslant n\lg n+an+2$ holds.

That is, one asks that $A_2\geqslant2\lg2+2a+2$ and $A_3\geqslant3\lg3+3a+2$. Hence, $a=-3/2$ works (to be checked).

Edit: Let $H_n$ denote the property $A_n\geqslant n\lg n+an+2$ for some $a$. Then $H_8$ is based on $H_4$, not on $H_6$ and $H_7$. For $a=-1.2794$, $H_8$ fails (as the OP notes) because $H_4$ fails too and $H_4$ fails because $H_2$ fails too. In conclusion, $(H_6,H_7)$ is not the base case for $(H_n)_{n\geqslant8}$, rather, the base case is $(H_n)_{n\leqslant7}$, which is implied by $(H_2,H_3)$. In particular, one needs $(H_2,H_3)$ to hold, even only to perform the recursion on $H_n$ for $n\geqslant8$, hence one needs that $a\leqslant-3/2$.

$\endgroup$
3
  • $\begingroup$ But why, if I start with $H_6$ and $H_7$, then $H_8$ fails? $\endgroup$ Commented Aug 30, 2012 at 14:24
  • $\begingroup$ See Edit. $ $ $ $ $\endgroup$
    – Did
    Commented Aug 30, 2012 at 15:00
  • $\begingroup$ Thank you very much for your explanation! $\endgroup$ Commented Aug 30, 2012 at 15:03

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged .