The following induction is flawed because the result admits a counter-example, but I can't find where is the flaw. Please advise. [Edit: As pointed out in the answer, the error was in the base case. The base case can further be boiled down to $H_2$ alone.]
Let $A_n$ be the average number of comparisons to sort $n$ keys by merging them in a top-down fashion (see any algorithm textbook). It can he shown that $$ A_0 = A_1 = 0;\quad A_n = A_{\lfloor{n/2}\rfloor} + A_{\lceil{n/2}\rceil} + n - \frac{\lfloor{n/2}\rfloor}{\lceil{n/2}\rceil+1} - \frac{\lceil{n/2}\rceil}{\lfloor{n/2}\rfloor+1}. $$ I am interested in a lower bounds on $A_n$ of the form $n\lg n + an + b$, where $\lg n$ is the binary logarithm.
By distinguishing on the parity of $n$, we get $$ A_{2p} = 2 A_{p} + 2p - 2 + \frac{2}{p+1};\quad A_{2p+1} = A_{p} + A_{p+1} + 2p - 1 + \frac{2}{p+2}. $$ Let $H_n$ be the property to prove $n\lg n + an + b \leqslant A_n$.
Let us assume $H_n$ for $n < 2p$, in particular, $H_p \colon (2p\lg p + 2ap + 2b) + 2p - 2 + 2/(p+1) \leqslant A_{2p}$. We want $H_{2p} \colon 2p\lg(2p) + 2ap + b = 2p\lg p + 2ap + 2p + b \leqslant A_{2p}$, which holds if $$ 2 - \frac{2}{p+1} = \frac{2p}{p+1} \leqslant b. $$
Let $g(p) := 2p/(p+1)$. This function is strictly increasing for $p > 0$ and $g(p) \rightarrow 2^{-}$, as $p \rightarrow +\infty$.
We assume $H_n$ for $n < 2p+1$, in particular, $H_p$ and $H_{p+1}$, which imply $p\lg p + (p+1)\lg(p+1) + a(2p+1) + 2b + 2p - 1 + 2/(p+2) \leqslant A_{2p+1}$ We want to prove $H_{2p+1} \colon (2p+1)\lg(2p+1) + a(2p+1) + b \leqslant A_{2p+1}$, which holds if $$ (2p+1)\lg(2p+1) - (p+1)\lg(p+1) - p\lg p - 2p + 1 - \frac{2}{p+2} \leqslant b. $$ Let $f(p) := (2p+1)\lg(2p+1) - (p+1)\lg(p+1) - p\lg p - 2p + 1 - \frac{2}{p+2}$. We then have the condition $f(p) \leqslant b$. This function is strictly increasing for $p > 0$ and $f(p) \rightarrow 2^{-}$, as $p \rightarrow +\infty$.
Because we need to satisfy the conditions $f(p) \leqslant b$ and $g(p) \leqslant b$ for both inductive steps to hold, we have to compare $f$ and $g$, when $p$ is a natural number: $g(1) < f(1)$, $g(2) < f(2)$, but $f(p) < g(p)$ if $p \geqslant 3$.
We have $A_{6} = 59/6$ and $A_{7} = 191/15$. Let us choose $p=3$ for the base case:
- $H_{6} \colon 6\lg 6 + 6a + b \leqslant A_{6} \Leftrightarrow a \leqslant 59/36 - \lg 6 - b/6$,
- $H_{7} \colon 7\lg 7 + 7a + b \leqslant A_{7} \Leftrightarrow a \leqslant 191/105 - \lg 7 - b/7$.
Since we want to maximise $a$, we need to minimise $b$. We know that $f(p) < g(p) \leqslant b$, so the only choice for $b$ not to depend on $p$ is the smallest upper bound of $g(p)$, that is, $$ b_{\min} = 2, \quad \text{therefore}\quad a_{\max} = \frac{47}{36} - \lg 6 \simeq -1.2794. $$
By the principle of complete induction, we just established, for $n > 5$, $$ n\lg n - \left(\lg 6 - \frac{47}{36}\right)n + 2 < A_{n}. $$
But this is wrong! The first counter-example is $n=2^3$, for which the bound is approximatively $15.7647$ and the exact value is $A_8 = 236/15 \simeq 15.733$. Where is the flaw in my argument? Does the problem happens only at $n=2^p$?