The prime number theorem can be expressed as the assertion

\displaystyle  \sum_{n \leq x} \Lambda(n) = x + o(x) \ \ \ \ \ (1)

as {x \rightarrow \infty}, where

\displaystyle  \Lambda(n) := \sum_{d|n} \mu(d) \log \frac{n}{d}

is the von Mangoldt function. It is a basic result in analytic number theory, but requires a bit of effort to prove. One “elementary” proof of this theorem proceeds through the Selberg symmetry formula

\displaystyle  \sum_{n \leq x} \Lambda_2(n) = 2 x \log x + O(x) \ \ \ \ \ (2)

where the second von Mangoldt function {\Lambda_2} is defined by the formula

\displaystyle  \Lambda_2(n) := \sum_{d|n} \mu(d) \log^2 \frac{n}{d} \ \ \ \ \ (3)

or equivalently

\displaystyle  \Lambda_2(n) = \Lambda(n) \log n + \sum_{d|n} \Lambda(d) \Lambda(\frac{n}{d}). \ \ \ \ \ (4)

(We are avoiding the use of the {*} symbol here to denote Dirichlet convolution, as we will need this symbol to denote ordinary convolution shortly.) For the convenience of the reader, we give a proof of the Selberg symmetry formula below the fold. Actually, for the purposes of proving the prime number theorem, the weaker estimate

\displaystyle  \sum_{n \leq x} \Lambda_2(n) = 2 x \log x + o(x \log x) \ \ \ \ \ (5)

suffices.
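As a sanity check on the definitions above, the following sketch computes {\Lambda_2} both from (3) and from (4) and compares the two, then looks at the normalised error in (2). It is an illustration only, not part of the proof: the helper names and the cutoff `LIMIT` are our own choices, and a finite computation can only suggest, not establish, an {O(x)} bound.

```python
import math

def mobius_and_mangoldt(limit):
    """Return (mu, lam) with mu[n] the Mobius function and lam[n] = Lambda(n)."""
    mu = [1] * (limit + 1)
    mu[0] = 0
    lam = [0.0] * (limit + 1)
    is_composite = [False] * (limit + 1)
    for p in range(2, limit + 1):
        if is_composite[p]:
            continue
        for k in range(2 * p, limit + 1, p):
            is_composite[k] = True
        for k in range(p, limit + 1, p):
            mu[k] = -mu[k]          # one sign flip per prime divisor
        for k in range(p * p, limit + 1, p * p):
            mu[k] = 0               # not squarefree
        q = p
        while q <= limit:
            lam[q] = math.log(p)    # Lambda(p^k) = log p
            q *= p
    return mu, lam

def lambda2_by_mobius(limit, mu):
    """Formula (3): sum over d | n of mu(d) log^2(n/d)."""
    out = [0.0] * (limit + 1)
    for d in range(1, limit + 1):
        if mu[d] == 0:
            continue
        for m in range(1, limit // d + 1):
            out[d * m] += mu[d] * math.log(m) ** 2
    return out

def lambda2_by_convolution(limit, lam):
    """Formula (4): Lambda(n) log n + sum over d | n of Lambda(d) Lambda(n/d)."""
    out = [lam[n] * math.log(n) if n > 0 else 0.0 for n in range(limit + 1)]
    for d in range(2, limit + 1):
        if lam[d] == 0.0:
            continue
        for m in range(2, limit // d + 1):
            out[d * m] += lam[d] * lam[m]
    return out

def selberg_ratio(x, lam2):
    """(sum_{n <= x} Lambda_2(n) - 2 x log x) / x, which (2) asserts is O(1)."""
    return (sum(lam2[1:x + 1]) - 2 * x * math.log(x)) / x

LIMIT = 20000  # arbitrary cutoff chosen for this illustration
mu, lam = mobius_and_mangoldt(LIMIT)
via_3 = lambda2_by_mobius(LIMIT, mu)
via_4 = lambda2_by_convolution(LIMIT, lam)
print("max |(3) - (4)| =", max(abs(u - v) for u, v in zip(via_3, via_4)))
for x in (1000, 5000, 20000):
    print(x, selberg_ratio(x, via_3))
```

The first printed number should be at the level of floating-point rounding, while the ratios in the loop should remain bounded as {x} grows.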

In this post I would like to record a somewhat “soft analysis” reformulation of the elementary proof of the prime number theorem in terms of Banach algebras, and specifically in Banach algebra structures on (completions of) the space {C_c({\bf R})} of compactly supported continuous functions {f: {\bf R} \rightarrow {\bf C}} equipped with the convolution operation

\displaystyle  f*g(t) := \int_{\bf R} f(u) g(t-u)\ du.

This soft argument does not easily give any quantitative decay rate in the prime number theorem, but by the same token it avoids many of the quantitative calculations in the traditional proofs of this theorem. Ultimately, the key “soft analysis” fact used is the spectral radius formula

\displaystyle  \lim_{n \rightarrow \infty} \|f^n\|^{1/n} = \sup_{\lambda \in \hat B} |\lambda(f)| \ \ \ \ \ (6)

for any element {f} of a unital commutative Banach algebra {B}, where {\hat B} is the space of characters (i.e., continuous unital algebra homomorphisms from {B} to {{\bf C}}) of {B}. This formula is due to Gelfand and may be found in any text on Banach algebras; for the sake of completeness we prove it below the fold.
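To see (6) in a concrete case, one can take {B} to be the functions on the cyclic group {{\bf Z}/N{\bf Z}} with cyclic convolution and the {\ell^1} norm; this toy algebra is our own choice for illustration and is not used elsewhere in the post. Its characters are the {N} discrete Fourier coefficients, so the right-hand side of (6) is the largest Fourier coefficient in magnitude. The sketch below compares this with {\|f^n\|^{1/n}} for one arbitrarily chosen {f}.

```python
import cmath

def cyclic_convolve(f, g):
    """Convolution on Z/N: (f*g)(k) = sum_j f(j) g(k - j mod N)."""
    n = len(f)
    return [sum(f[j] * g[(k - j) % n] for j in range(n)) for k in range(n)]

def l1_norm(f):
    return sum(abs(v) for v in f)

def largest_character_value(f):
    """sup over characters |lambda(f)|; here the characters are the DFT coefficients."""
    n = len(f)
    return max(
        abs(sum(f[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n)))
        for k in range(n)
    )

def power_norm_root(f, power):
    """||f^power||^(1/power), with f^power the iterated convolution."""
    g = f
    for _ in range(power - 1):
        g = cyclic_convolve(g, f)
    return l1_norm(g) ** (1.0 / power)

f = [0.5, 0.3, -0.2, 0.1, 0.0, 0.0, 0.4, -0.1]  # arbitrary test element, N = 8
rho = largest_character_value(f)
for power in (1, 4, 16, 64):
    print(power, power_norm_root(f, power), rho)
```

In this finite-dimensional setting one has the two-sided bound {\rho^n \leq \|f^n\|_{\ell^1} \leq \sqrt{N} \rho^n} (by the triangle inequality, Cauchy-Schwarz and Plancherel), so the printed roots should approach {\rho} from above at rate {N^{1/2n}}.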

The connection between prime numbers and Banach algebras is given by the following consequence of the Selberg symmetry formula.

Theorem 1 (Construction of a Banach algebra norm) For any {G \in C_c({\bf R})}, let {\|G\|} denote the quantity

\displaystyle  \|G\| := \limsup_{x \rightarrow \infty} |\sum_n \frac{\Lambda(n)}{n} G( \log \frac{x}{n} ) - \int_{\bf R} G(t)\ dt|.

Then {\| \|} is a seminorm on {C_c({\bf R})} with the bound

\displaystyle  \|G\| \leq \|G\|_{L^1({\bf R})} := \int_{\bf R} |G(t)|\ dt \ \ \ \ \ (7)

for all {G \in C_c({\bf R})}. Furthermore, we have the Banach algebra bound

\displaystyle  \| G * H \| \leq \|G\| \|H\| \ \ \ \ \ (8)

for all {G,H \in C_c({\bf R})}.
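To get a feel for the quantity inside the limit superior, the sketch below evaluates {\sum_n \frac{\Lambda(n)}{n} G( \log \frac{x}{n} ) - \int_{\bf R} G(t)\ dt} for the tent function {G(t) = \max(1-|t|,0)} at a few values of {x = e^h}. The choice of {G}, the sample values of {h} and the helper names are assumptions made for this illustration; a finite computation cannot evaluate a limit superior, so this only shows the deviation being small at the sampled heights.

```python
import math

def von_mangoldt_table(limit):
    """lam[n] = Lambda(n) for 0 <= n <= limit, via a smallest-prime-factor sieve."""
    spf = list(range(limit + 1))
    for p in range(2, math.isqrt(limit) + 1):
        if spf[p] == p:
            for k in range(p * p, limit + 1, p):
                if spf[k] == k:
                    spf[k] = p
    lam = [0.0] * (limit + 1)
    for n in range(2, limit + 1):
        p = spf[n]
        m = n
        while m % p == 0:
            m //= p
        if m == 1:                  # n is a power of the prime p
            lam[n] = math.log(p)
    return lam

def tent(t):
    """A test function in C_c(R), supported on [-1, 1], with integral 1."""
    return max(0.0, 1.0 - abs(t))

def deviation(h):
    """sum_n Lambda(n)/n * G(log(x/n)) - integral of G, with x = e^h and G = tent."""
    x = math.exp(h)
    limit = int(x * math.e) + 1     # G(log(x/n)) vanishes once n > e x
    lam = von_mangoldt_table(limit)
    total = sum(lam[n] / n * tent(math.log(x / n)) for n in range(2, limit + 1))
    return total - 1.0

for h in (4.0, 7.0, 10.0):
    print(h, deviation(h))
```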

We prove this theorem below the fold. The prime number theorem then follows from Theorem 1 and the following two assertions. The first is an application of the spectral radius formula (6) and some basic Fourier analysis (in particular, the observation that {C_c({\bf R})} contains a plentiful supply of local units):

Theorem 2 (Non-trivial Banach algebras with many local units have non-trivial spectrum) Let {\| \|} be a seminorm on {C_c({\bf R})} obeying (7), (8). Suppose that {\| \|} is not identically zero. Then there exists {\xi \in {\bf R}} such that

\displaystyle  |\int_{\bf R} G(t) e^{-it\xi}\ dt| \leq \|G\|

for all {G \in C_c({\bf R})}. In particular, by (7), one has

\displaystyle  \|G\| = \| G \|_{L^1({\bf R})}

whenever {G(t) e^{-it\xi}} is a non-negative function.

The second is a consequence of the Selberg symmetry formula and the fact that {\Lambda} is real (as well as Mertens’ theorem, in the {\xi=0} case), and is closely related to the non-vanishing of the Riemann zeta function {\zeta} on the line {\{ 1+i\xi: \xi \in {\bf R}\}}:

Theorem 3 (Breaking the parity barrier) Let {\xi \in {\bf R}}. Then there exists {G \in C_c({\bf R})} such that {G(t) e^{-it\xi}} is non-negative, and

\displaystyle  \|G\| < \|G\|_{L^1({\bf R})}.

Assuming Theorems 1, 2, 3, we may now quickly establish the prime number theorem as follows. Theorem 2 and Theorem 3 imply that the seminorm {\| \|} constructed in Theorem 1 is trivial, and thus

\displaystyle  \sum_n \frac{\Lambda(n)}{n} G( \log \frac{x}{n} ) = \int_{\bf R} G(t)\ dt + o(1)

as {x \rightarrow \infty} for any Schwartz function {G} (the decay rate in {o(1)} may depend on {G}). Specialising to functions of the form {G(t) = e^{-t} \eta( e^{-t} )} for some smooth compactly supported {\eta} on {(0,+\infty)}, we conclude that

\displaystyle  \sum_n \Lambda(n) \eta(\frac{n}{x}) = x \int_{\bf R} \eta(u)\ du + o(x)

as {x \rightarrow \infty}; by the smooth Urysohn lemma this implies that

\displaystyle  \sum_{\varepsilon x \leq n \leq x} \Lambda(n) = x - \varepsilon x + o(x)

as {x \rightarrow \infty} for any fixed {\varepsilon>0}, and the prime number theorem then follows by a telescoping series argument.

The same argument also yields the prime number theorem in arithmetic progressions, or equivalently that

\displaystyle  \sum_{n \leq x} \Lambda(n) \chi(n) = o(x)

for any fixed Dirichlet character {\chi}; the one difference is that the use of Mertens’ theorem is replaced by the basic fact that the quantity {L(1,\chi) = \sum_n \frac{\chi(n)}{n}} is non-vanishing.

— 1. Proof of Selberg symmetry formula —

We now prove (2). From (3) we have

\displaystyle  \frac{1}{x} \sum_{n \leq x} \Lambda_2(n) = \sum_{d \leq x} \frac{\mu(d)}{d} \frac{1}{x/d} \sum_{m \leq x/d} \log^2 m. \ \ \ \ \ (9)

From the integral test we have the estimates

\displaystyle  \frac{1}{x} \sum_{n \leq x} \log^2 n = \log^2 x - 2\log x + 2 + O( \frac{\log^2 (2 + x)}{x})

\displaystyle  \sum_{n \leq x} \frac{1}{x/n} \frac{1}{n} = 1 + O(\frac{1}{x})

\displaystyle  \sum_{n \leq x} \frac{1}{n} = \log x + c_1 + O(\frac{1}{x})

\displaystyle  \sum_{n \leq x} \frac{\log(x/n)}{n} = \frac{1}{2} \log^2 x + c_2 \log x + c_3 + O( \frac{\log(2+x)}{x} )

for some absolute constants {c_1,c_2,c_3} whose exact value is unimportant for us, and for any {x \geq 1}. We conclude that

\displaystyle  \frac{1}{x} \sum_{n \leq x} \log^2 n = \sum_{n \leq x} \frac{2\log(x/n)+ c_4 + c_5/(x/n)}{n} + O( \frac{\log^2 (2 + x)}{x})

for some further absolute constants {c_4,c_5}. Replacing {x} by {x/d} and inserting this into (9), one obtains

\displaystyle  \frac{1}{x} \sum_{n \leq x} \Lambda_2(n) = \sum_{d \leq x} \frac{\mu(d)}{d} \sum_{m \leq x/d} \frac{2 \log(x/dm) + c_4 + c_5/(x/dm)}{m}

\displaystyle  + O( \sum_{d \leq x} \frac{1}{d} \frac{\log^2( 2 + x/d )}{x/d} ).

The error term can be computed to be {O(1)}. The main term simplifies by Möbius inversion to {2 \log x + c_4 + c_5/x}, and the claim follows.
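Two steps of this argument are easy to test numerically: the first integral-test estimate, and the Möbius inversion that collapses the main term to {2 \log x + c_4 + c_5/x}. In the sketch below the values of {c_4, c_5} are arbitrary placeholders (the inversion identity holds for any constants), and the function names are ours.

```python
import math

def mobius_table(limit):
    """mu[n] for 0 <= n <= limit."""
    mu = [1] * (limit + 1)
    mu[0] = 0
    is_composite = [False] * (limit + 1)
    for p in range(2, limit + 1):
        if is_composite[p]:
            continue
        for k in range(2 * p, limit + 1, p):
            is_composite[k] = True
        for k in range(p, limit + 1, p):
            mu[k] = -mu[k]
        for k in range(p * p, limit + 1, p * p):
            mu[k] = 0
    return mu

def log_square_gap(x):
    """|(1/x) sum_{n<=x} log^2 n - (log^2 x - 2 log x + 2)|, divided by log^2(2+x)/x."""
    total = sum(math.log(n) ** 2 for n in range(1, int(x) + 1))
    main = math.log(x) ** 2 - 2 * math.log(x) + 2
    return abs(total / x - main) * x / math.log(2 + x) ** 2

def inverted_main_term(x, c4, c5):
    """sum_{d<=x} mu(d)/d sum_{m<=x/d} F(x/(d m))/m with F(y) = 2 log y + c4 + c5/y."""
    def F(y):
        return 2 * math.log(y) + c4 + c5 / y
    mu = mobius_table(int(x))
    total = 0.0
    for d in range(1, int(x) + 1):
        if mu[d] == 0:
            continue
        inner = sum(F(x / (d * m)) / m for m in range(1, int(x / d) + 1))
        total += mu[d] / d * inner
    return total, F(x)

for x in (10, 100, 1000, 10000):
    print(x, log_square_gap(x))          # should stay bounded
print(inverted_main_term(1000.5, 0.3, -1.7))  # the two entries should agree
```

Grouping the double sum by {n = dm} shows why the second check is exact: the inner Möbius sum {\sum_{d|n} \mu(d)} vanishes unless {n=1}.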

— 2. Constructing the Banach algebra —

We now prove Theorem 1. It is convenient to transform the situation from the classical context of arithmetic functions on {{\bf N}} (such as {\Lambda} or {\Lambda_2}) to the more Fourier-analytic context of Radon measures on the real line {{\bf R}}. Define the discrete Radon measure

\displaystyle  \mu := \sum_{n=1}^\infty \frac{\Lambda(n)}{n} \delta_{\log n}

and for any {h \in {\bf R}}, let {\tau_h \mu} denote the left translate of the measure {\mu} by {h}, thus

\displaystyle  \int_{\bf R} G(t)\ d\tau_h \mu(t) = \int_{\bf R} G(t+h)\ d\mu(t)

for any continuous compactly supported {G \in C_c({\bf R})}. We note in passing that the prime number theorem (1) is equivalent to the assertion that the translates {\tau_h \mu} converge in the vague topology to Lebesgue measure {m} as {h \rightarrow +\infty}.

Let {\nu} denote the measure

\displaystyle  \nu := \mu + \frac{1}{t} \mu * \mu \ \ \ \ \ (10)

where {\mu * \mu} is the convolution of the Radon measure {\mu} with itself, and {\frac{1}{t} \mu * \mu} is the measure {\mu * \mu} multiplied by the function {t \mapsto \frac{1}{t}} (this is well-defined since {\mu * \mu} assigns no mass to a neighbourhood of the origin). From (4) one has

\displaystyle  \nu = \sum_{n=2}^\infty \frac{\Lambda_2(n)}{n \log n} \delta_{\log n}.

We claim that the Selberg symmetry formula (5) implies (in fact, it is equivalent to) the assertion that the translates {\tau_h \nu} converge in the vague topology to {2m}. Indeed, (5) implies for any fixed {0 < a < b} that

\displaystyle  \sum_{ax \leq n \leq bx} \Lambda_2(n) = 2 (b-a) x \log x + o(x \log x)

or equivalently that

\displaystyle  \sum_{ax \leq n \leq bx} \frac{\Lambda_2(n)}{n \log n} \frac{n}{x} (1 + \frac{\log \frac{n}{x}}{\log x}) = 2 (b-a) + o(1),

which we rewrite as

\displaystyle  \int_{\log a}^{\log b} e^t (1 + \frac{t}{\log x}) \ d\tau_{\log x} \nu(t) = 2 \int_{\log a}^{\log b} e^t\ dt + o(1).

Since {\frac{t}{\log x} = o(1)} for {\log a \leq t \leq \log b}, we thus have

\displaystyle  \int_{\log a}^{\log b} e^t\ d\tau_{\log x} \nu(t) = 2 \int_{\log a}^{\log b} e^t\ dt + o(1),

which implies that {e^t \tau_{\log x} \nu} converges vaguely to {2 e^t m}, and the claim follows.

Now we begin the proof of Theorem 1. Observe that the quantity {\|G\|} can be rewritten as

\displaystyle  \|G\| = \limsup_{h \rightarrow +\infty} |\int_{\bf R} G(t)\ d\tau_h \mu(t) - \int_{\bf R} G(t)\ dt|. \ \ \ \ \ (11)

Since

\displaystyle  0 \leq \tau_h \mu \leq \tau_h \nu \ \ \ \ \ (12)

and {\tau_h \nu} converges vaguely to {2m}, we see that the measures {\tau_h \mu} are precompact in the vague topology, thanks to the Helly selection principle or Prokhorov theorem. In particular, we have

\displaystyle  \|G\| = |\int_{\bf R} G(t)\ d\lambda - \int_{\bf R} G(t)\ dt| \ \ \ \ \ (13)

for some limit point {\lambda = \lambda_G} of the translates {\tau_h\mu} in the vague topology. From (12) we have

\displaystyle  0 \leq \lambda \leq 2m \ \ \ \ \ (14)

and (7) follows from (13), since by (14) the real measure {\lambda - m} lies between {-m} and {m}.

Finally, we prove (8). By (11), it suffices to show that

\displaystyle  |\int_{\bf R} G*H(t) d\tau_h\mu(t) - \int_{\bf R} G*H(t)\ dt| \leq \|G\| \|H\| + o(1)

for any {G,H \in C_c({\bf R})}, where the {o(1)} decay errors are allowed to depend on {G,H}. Since {\tau_h\nu} converges vaguely to {2m}, we already have from (10) that

\displaystyle  \int_{\bf R} G*H(t)\ d\tau_h\mu(t) + \int_{\bf R} G*H(t)\ d\tau_h(\frac{1}{t} \mu*\mu)(t)

\displaystyle  = 2 \int_{\bf R} G*H(t)\ dt + o(1)

so it suffices to show that

\displaystyle  |\int_{\bf R} G*H(t)\ d\tau_h(\frac{1}{t} \mu*\mu)(t) - \int_{\bf R} G*H(t)\ dt| \leq \|G\| \|H\| + o(1).

Let {m_+} be Lebesgue measure on the half-line {[0,+\infty)}. Then {m_+ * m_+ = t m_+}, so {\tau_h( \frac{1}{t} (m_+ * m_+) )} converges vaguely to {m}. The measure {\frac{1}{t} (m_+ * \mu)} is equal to {m_+} times the function {t \mapsto \frac{1}{t} \sum_{n \leq e^t} \frac{\Lambda(n)}{n}}, which converges to {1} as {t \rightarrow +\infty} by Mertens’ theorem, so the translates {\tau_h( \frac{1}{t} (m_+ * \mu) )} also converge vaguely to {m}. We conclude that

\displaystyle  \tau_h(\frac{1}{t} \mu*\mu) - \tau_h(\frac{1}{t} (\mu-m_+)*(\mu-m_+))

converges vaguely to {m}, and so it suffices to show that

\displaystyle  |\int_{\bf R} G*H(t)\ d\tau_h(\frac{1}{t} (\mu-m_+)*(\mu-m_+))(t)| \leq \|G\| \|H\| + o(1).

We rewrite this as

\displaystyle  |\int_{\bf R} \frac{1}{t+h} G*H(t+h)\ d((\mu-m_+)*(\mu-m_+))(t)| \leq \|G\| \|H\| + o(1).

On the support of {G*H(t+h)}, we have {\frac{1}{t+h} = \frac{1+o(1)}{h}}, so it suffices to show that

\displaystyle  |\frac{1}{h} \int_{\bf R} G*H(t+h)\ d((\mu-m_+)*(\mu-m_+))(t)| \ \ \ \ \ (15)

\displaystyle  \leq \|G\| \|H\| + o(1).

(The error term in {\frac{1+o(1)}{h}} can be controlled by using (15) with {G,H} replaced by {|G|, |H|}, and modifying the preceding arguments to replace {(\mu-m_+)*(\mu-m_+)} by {(\mu+m_+)*(\mu+m_+)}.)

From Fubini’s theorem we have

\displaystyle  \int_{\bf R} G*H(t+h)\ d((\mu-m_+)*(\mu-m_+))(t)

\displaystyle  = \int_{\bf R} \int_{\bf R} G*H(t+s+h)\ d(\mu-m_+)(t) d(\mu-m_+)(s)

\displaystyle  = \int_{\bf R} \int_{\bf R} \int_{\bf R} G(t+k) H(s+h-k)\ d(\mu-m_+)(t) d(\mu-m_+)(s) dk

\displaystyle  = \int_{\bf R} (\int_{\bf R} G(t)\ d\tau_k (\mu-m_+)(t)) (\int_{\bf R} H(s)\ d\tau_{h-k} (\mu-m_+)(s))\ dk.

The {k} integrand vanishes unless {-O(1) \leq k \leq h+O(1)}. By (11), we have

\displaystyle  \limsup_{k \rightarrow +\infty} |\int_{\bf R} G(t)\ d\tau_k (\mu-m_+)(t)| = \|G\|

and

\displaystyle  \limsup_{l \rightarrow +\infty} |\int_{\bf R} H(t)\ d\tau_l (\mu-m_+)(t)| = \|H\|,

and the claim (15) follows.

— 3. Non-trivial algebras with many local units have non-trivial spectrum —

We now prove Theorem 2. Let {B} be the Banach algebra completion of {C_c({\bf R})} under the seminorm {\| \|} (thus {B} is the space of Cauchy sequences in {C_c({\bf R})}, quotiented out by the sequences that go to zero in the seminorm {\| \|}). Since {\| \|} is not identically zero, {B} is a non-trivial commutative Banach algebra (but it is not necessarily unital).

It is convenient to adjoin a unit {1} to {B} to create a unital commutative Banach algebra {B' := {\bf C} 1 + B} with the extended norm

\displaystyle  \| t 1 + f \| := |t| + \|f\|

for {t \in {\bf C}} and {f \in B}; one easily verifies that {B'} is a unital commutative Banach algebra.

Suppose that all elements of {B} have zero spectral radius (as defined in (6)). Let {f} be a Schwartz function with compactly supported Fourier transform. Then we can find another Schwartz function {g} with compactly supported Fourier transform such that {f * g = f} (by ensuring that {\hat g=1} on the support of {\hat f}; thus {g} is a “local unit” on the Fourier support of {f}). Thus {f * g^{*n} = f} for all {n}, and hence {\|f\| \leq \|f\| \|g^{*n}\|} by (8). But {g} has spectral radius zero, so {\|g^{*n}\| \rightarrow 0} as {n \rightarrow \infty}, and thus {f} is zero in {B}. By density this implies that {B} is trivial, a contradiction.

Thus there is an element of {B} with positive spectral radius. Then by (6), there is a character {\lambda: B' \rightarrow {\bf C}} that does not vanish identically on {B}. Suppose that for each {\xi \in {\bf R}} there exists {f \in C_c({\bf R})} in the kernel of {\lambda} whose Fourier coefficient {\hat f(\xi) := \int_{\bf R} f(t) e^{-it\xi}\ dt} is non-vanishing. Since the kernel of {\lambda} is a space closed with respect to convolutions by {C_c({\bf R})} functions, some Fourier analysis and a smooth partition of unity then show that the kernel of {\lambda} contains any Schwartz function with compactly supported Fourier transform, and thus by density {\lambda} is trivial, a contradiction. Thus there must exist {\xi \in {\bf R}} such that {\hbox{ker}(\lambda)} contains all test functions with Fourier coefficient vanishing at {\xi}. From this we conclude that {\lambda} on {B} is a constant multiple of the Fourier coefficient map {f \mapsto \hat f(\xi)}; being a non-trivial algebra homomorphism on {B}, we thus have

\displaystyle  \lambda(f) = \hat f(\xi)

for all {f \in C_c({\bf R})}. Since characters have norm at most {1} (as can be seen for instance from (6)), we obtain the claim.

— 4. Breaking the parity barrier —

We now prove Theorem 3. We divide into two cases, depending on whether {\xi=0} or {\xi \neq 0}. If {\xi=0}, we let {G = G_0: {\bf R} \rightarrow [0,1]} be a continuous function that equals {1} on {[-N,0]} and is supported on {[-N-1,1]} for some large {N}. From Mertens’ theorem we have

\displaystyle  \sum_n \frac{\Lambda(n)}{n} G_0( \log \frac{x}{n} ) = N + O(1)

for {x} sufficiently large depending on {N}, and thus

\displaystyle  \|G_0\| = O(1).

The claim then follows by taking {N} sufficiently large.
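The only arithmetic input in this case is Mertens’ theorem in the form {\sum_{n \leq x} \frac{\Lambda(n)}{n} = \log x + O(1)}. The sketch below tabulates the difference {\sum_{n \leq x} \frac{\Lambda(n)}{n} - \log x} at a few sample points; the function names and sample points are our own, and the computation is only meant to make the boundedness plausible.

```python
import math

def von_mangoldt_table(limit):
    """lam[n] = Lambda(n) for 0 <= n <= limit, via a smallest-prime-factor sieve."""
    spf = list(range(limit + 1))
    for p in range(2, math.isqrt(limit) + 1):
        if spf[p] == p:
            for k in range(p * p, limit + 1, p):
                if spf[k] == k:
                    spf[k] = p
    lam = [0.0] * (limit + 1)
    for n in range(2, limit + 1):
        p = spf[n]
        m = n
        while m % p == 0:
            m //= p
        if m == 1:                  # n is a power of the prime p
            lam[n] = math.log(p)
    return lam

def mertens_gap(x):
    """sum_{n <= x} Lambda(n)/n - log x, which Mertens' theorem says is O(1)."""
    lam = von_mangoldt_table(x)
    return sum(lam[n] / n for n in range(2, x + 1)) - math.log(x)

for x in (10, 1000, 100000):
    print(x, mertens_gap(x))
```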

Now suppose {\xi \neq 0}. In the language of Section 2, we have

\displaystyle  \|G\| = |\int_{\bf R} G(t)\ d(\lambda - m)(t)|

for some limit point {\lambda} of the {\tau_h \mu}. We can write the right-hand side as

\displaystyle  \int_{\bf R} \hbox{Re}( e^{i\theta} G(t) )\ d(\lambda-m)(t)

for some phase {e^{i\theta}}. From (14), {\lambda-m} is a real measure between {-m} and {m}, so by the triangle inequality we have

\displaystyle  \|G\| \leq \int_{\bf R} |\hbox{Re}( e^{i\theta} G(t) )|\ dt.

Now we set {G(t) := G_0(t) e^{it\xi}}, where {G_0} is as before. Then

\displaystyle  \int_{\bf R} |\hbox{Re}( e^{i\theta} G(t) )|\ dt = \int_{\bf R} |\cos(t \xi + \theta)| G_0(t)\ dt.

Since {t \mapsto |\cos(t \xi + \theta)|} is periodic with period {2\pi/|\xi|} and has mean value strictly less than {1} (in fact, it has mean {\frac{2}{\pi}}), we thus have

\displaystyle  \int_{\bf R} |\hbox{Re}( e^{i\theta} G(t) )|\ dt = \int_{\bf R} |\cos(t \xi + \theta)| G_0(t)\ dt < \int_{\bf R} G_0(t)\ dt

if {N} is sufficiently large depending on {\xi}. The claim follows.
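The gain here comes entirely from the mean value {\frac{2}{\pi}} of {|\cos|}. As a rough numerical illustration, the sketch below uses one concrete choice of {G_0} (equal to {1} on {[-N,0]} with linear ramps, which is our assumption; any continuous choice would do) and computes the ratio of {\int_{\bf R} |\cos(t \xi + \theta)| G_0(t)\ dt} to {\int_{\bf R} G_0(t)\ dt} by a midpoint rule.

```python
import math

def plateau(t, width):
    """A continuous G_0: equal to 1 on [-width, 0], with linear ramps to 0 at -width-1 and 1."""
    if t < -width:
        return max(0.0, t + width + 1.0)
    if t > 0.0:
        return max(0.0, 1.0 - t)
    return 1.0

def cosine_weighted_ratio(xi, theta, width, steps=20000):
    """(integral of |cos(t xi + theta)| G_0(t) dt) / (integral of G_0(t) dt), midpoint rule."""
    a, b = -width - 1.0, 1.0
    dt = (b - a) / steps
    num = den = 0.0
    for k in range(steps):
        t = a + (k + 0.5) * dt
        g = plateau(t, width)
        num += abs(math.cos(t * xi + theta)) * g
        den += g
    return num / den

for theta in (0.0, 0.7, 1.5):
    print(theta, cosine_weighted_ratio(1.0, theta, 50.0), 2 / math.pi)
```

For a long plateau the ratio should be close to {\frac{2}{\pi}} regardless of the phase {\theta}, which is the uniformity in {\theta} that the argument needs.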

— 5. The prime number theorem in arithmetic progressions —

Let {\chi} be a non-principal Dirichlet character of some period {q}. We allow all implied constants in the {O()} notation to depend on {\chi}. In this section we sketch the changes to the above arguments needed to establish

\displaystyle  \sum_{n \leq x} \chi(n) \Lambda(n) = o(x),

which gives the prime number theorem in arithmetic progressions by the usual Fourier expansion into Dirichlet characters.

We have the twisted versions

\displaystyle  \chi(n)\Lambda_2(n) = \sum_{d|n} \chi(d)\mu(d) \chi(\frac{n}{d}) \log^2 \frac{n}{d}

and

\displaystyle  \chi(n)\Lambda_2(n) = \chi(n)\Lambda(n) \log n + \sum_{d|n} \chi(d)\Lambda(d) \chi(\frac{n}{d})\Lambda(\frac{n}{d})

of (3), (4). Since {\chi} has mean zero, a decomposition into intervals of length {q} reveals that

\displaystyle  \frac{1}{x} \sum_{n \leq x} \chi(n) \log^2 n = O( \frac{\log^2 (2 + x)}{x})

from which we obtain the twisted Selberg symmetry formula

\displaystyle  \sum_{n \leq x} \chi(n) \Lambda_2(n) = O(x).
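For a concrete instance, take {\chi} to be the non-principal character of period {4} (our choice of example). The sketch below checks the mean-zero cancellation, in the form that {\sum_{n \leq x} \chi(n) \log^2 n} is bounded by a multiple of {\log^2(2+x)}, and also tabulates the normalised target sum {\frac{1}{x} \sum_{n \leq x} \chi(n) \Lambda(n)}. The function names are hypothetical, and the sample points are arbitrary.

```python
import math

def chi4(n):
    """The non-principal Dirichlet character of period 4."""
    return (0, 1, 0, -1)[n % 4]

def von_mangoldt_table(limit):
    """lam[n] = Lambda(n) for 0 <= n <= limit, via a smallest-prime-factor sieve."""
    spf = list(range(limit + 1))
    for p in range(2, math.isqrt(limit) + 1):
        if spf[p] == p:
            for k in range(p * p, limit + 1, p):
                if spf[k] == k:
                    spf[k] = p
    lam = [0.0] * (limit + 1)
    for n in range(2, limit + 1):
        p = spf[n]
        m = n
        while m % p == 0:
            m //= p
        if m == 1:                  # n is a power of the prime p
            lam[n] = math.log(p)
    return lam

def twisted_log_square_ratio(x):
    """|sum_{n <= x} chi(n) log^2 n| / log^2(2 + x); bounded if the mean-zero estimate holds."""
    total = sum(chi4(n) * math.log(n) ** 2 for n in range(1, x + 1))
    return abs(total) / math.log(2 + x) ** 2

def twisted_psi_over_x(x):
    """(1/x) sum_{n <= x} chi(n) Lambda(n), the quantity claimed to be o(1)."""
    lam = von_mangoldt_table(x)
    return sum(chi4(n) * lam[n] for n in range(1, x + 1)) / x

for x in (10, 1000, 100000):
    print(x, twisted_log_square_ratio(x), twisted_psi_over_x(x))
```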

If we define the twisted measures

\displaystyle  \mu_\chi := \sum_{n=1}^\infty \frac{\chi(n) \Lambda(n)}{n} \delta_{\log n}

and

\displaystyle  \nu_\chi := \mu_\chi + \frac{1}{t} \mu_\chi * \mu_\chi

then

\displaystyle  \nu_\chi = \sum_{n=2}^\infty \frac{\chi(n) \Lambda_2(n)}{n \log n} \delta_{\log n}

and hence {\tau_h \nu_\chi} converges vaguely to zero as {h \rightarrow +\infty}. Introducing the twisted norms

\displaystyle  \|G\|_\chi := \limsup_{x \rightarrow \infty} |\sum_n \frac{\chi(n) \Lambda(n)}{n} G( \log \frac{x}{n} )|

we may verify that {\| \|_\chi} obeys the conclusions of Theorem 1.

By repeating the previous arguments, it will suffice to show that the analogue of Theorem 3 for {\| \|_\chi} holds. When {\xi = 0}, we can argue as in Section 4, where the role of Mertens’ theorem is replaced by Dirichlet’s theorem

\displaystyle  \sum_{n \leq x} \frac{\chi(n) \Lambda(n)}{n} = O(1)

which is ultimately a consequence of the non-vanishing of {L(1,\chi)}.

For {\xi \neq 0}, the argument in Section 4 works with minimal changes if {\chi} is real-valued. If {\chi} is complex valued, it still takes only a finite number of values {S} in the unit disk. Then the limit measures {\lambda} appearing in Section 4 are equal to Lebesgue measure {m} times a density taking values in the convex hull {\overline{S}} of this finite set of values, which is a polygon in the unit disk. One can then modify the arguments in Section 4 to bound

\displaystyle  \|G\|_\chi \leq \int_{\bf R} \sup_{z \in \overline{S}} |\Re( z e^{i\theta} G(t) )|\ dt

for some phase {e^{i\theta}}. If we set {G(t) = e^{it\xi} G_0(t)} as before, we again observe that the function {t \mapsto \sup_{z \in \overline{S}} |\Re( z e^{i\theta} e^{it\xi} )|} is periodic and has mean strictly less than one, and so we can again establish the required bound {\|G\|_\chi < \|G\|_{L^1({\bf R})}} if {N} is large enough.
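As a worked instance, suppose {\chi} takes values in {S = \{0, 1, i, -1, -i\}} (as happens for a character of order {4}, for example of period {5}; this specific case is our own illustration). Since {z \mapsto |\Re( z w )|} is convex, its supremum over the polygon {\overline{S}} is attained at a vertex, and since {e^{i\theta} e^{it\xi}} moves around the unit circle at constant speed, the mean over {t} equals the mean over the phase. The sketch below computes that mean numerically.

```python
import cmath
import math

def polygon_sup(vertices, w):
    """sup over the convex hull of |Re(z w)|; attained at a vertex by convexity."""
    return max(abs((z * w).real) for z in vertices)

def mean_over_phase(vertices, steps=20000):
    """Mean of polygon_sup(vertices, e^{i phi}) over one full turn of phi (midpoint rule)."""
    total = 0.0
    for k in range(steps):
        phi = 2 * math.pi * (k + 0.5) / steps
        total += polygon_sup(vertices, cmath.exp(1j * phi))
    return total / steps

order_four_values = [0, 1, 1j, -1, -1j]   # values of a character of order 4
real_values = [0, 1, -1]                  # values of a real character
print(mean_over_phase(order_four_values), 2 * math.sqrt(2) / math.pi)
print(mean_over_phase(real_values), 2 / math.pi)
```

For the square with vertices {\pm 1, \pm i} the supremum is {\max(|\cos \phi|, |\sin \phi|)}, whose mean is {\frac{2\sqrt{2}}{\pi}}; this is larger than the value {\frac{2}{\pi}} from the real case but still strictly less than one.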

— 6. Proof of Gelfand formula —

We now prove (6).

If {\lambda} is a character, then it has an operator norm:

\displaystyle  |\lambda(f)| \leq \|\lambda\|_{op} \|f\|.

But we may eliminate this norm by using the “tensor power trick”: replacing {f} with {f^n} and then taking {n^{th}} roots we conclude that

\displaystyle  |\lambda(f)| \leq \|\lambda\|_{op}^{1/n} \|f\|

and then on sending {n \rightarrow \infty} we have

\displaystyle  |\lambda(f)| \leq \|f\|.

Replacing {f} by {f^n} again, taking {n^{th}} roots, and sending {n \rightarrow \infty} we conclude that

\displaystyle  |\lambda(f)| \leq \lim_{n \rightarrow \infty} \|f^n\|^{1/n}.

(The limit exists because {n \mapsto \|f^n\|} is submultiplicative.) This gives one direction of (6). To give the other direction, suppose for sake of contradiction that we could find an {f \in B} such that

\displaystyle  \lim_{n \rightarrow \infty} \|f^n\|^{1/n} > R > \sup_{\lambda \in \hat B} |\lambda(f)| \ \ \ \ \ (16)

for some real number {R>0}.

There are two cases, depending on whether we can find a complex number {z} with {|z| \geq R} and {f-z} non-invertible. First suppose that such a {z} exists; then {f-z} generates a proper ideal of {B}, which by Zorn’s lemma is contained in a maximal ideal {I}, whose quotient is then a field. By Neumann series, any element of {B} sufficiently close to the identity is invertible and thus not in {I}; since {B/I} is a field, we conclude that the complement of {I} is open, and so {I} is closed. This makes {B/I} a Banach algebra as well as a field. If {f \in B/I} is not a multiple of the identity, then {f-z} is invertible for every {z} and so (by Neumann series) {(f-z)^{-1}} is an analytic function from {{\bf C}} to {B/I} which goes to zero at infinity, contradicting Liouville’s theorem. Thus {B/I} is one-dimensional (this is the Gelfand-Mazur theorem) and thus isomorphic to {{\bf C}}; this gives a continuous unital algebra homomorphism {\lambda: B \rightarrow {\bf C}} with {f-z} in the kernel, thus {\lambda(f)=z}, contradicting the second inequality in (16).

Now suppose that {f-z} is invertible for all {|z| \geq R}. Then, as in the preceding argument, {z \mapsto (f-z)^{-1}} is an analytic function from {\{ z: |z| \geq R \}} to {B} which decays to zero at infinity, so we have the Cauchy integral formula

\displaystyle  f^n = -\frac{1}{2\pi i} \int_{|z|=R} z^n (f-z)^{-1}\ dz

for any natural number {n}. From the triangle inequality we conclude in particular that

\displaystyle  \|f^n \| \ll_{R,f} R^n

which contradicts the first inequality in (16).
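The Cauchy integral formula used in this last step, with the normalisation {-\frac{1}{2\pi i}}, can be tested numerically in the commutative algebra generated by a single {2 \times 2} matrix. The matrix, the radius {R} and the number of quadrature nodes below are arbitrary choices of ours; parametrising {z = R e^{i\phi}} gives {dz = i z\ d\phi}, so the trapezoid rule on the circle becomes an average of {-z^{n+1} (f-z)^{-1}} over the nodes.

```python
import cmath

def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def mat_power(a, n):
    """a^n for a non-negative integer n, by repeated multiplication."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        result = mat_mul(result, a)
    return result

def resolvent(f, z):
    """(f - z)^{-1} for a 2x2 matrix f and a scalar z outside its spectrum."""
    p, q = f[0][0] - z, f[0][1]
    r, s = f[1][0], f[1][1] - z
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

def contour_power(f, n, radius, nodes=256):
    """Approximate -(1/(2 pi i)) * integral over |z| = radius of z^n (f - z)^{-1} dz."""
    total = [[0j, 0j], [0j, 0j]]
    for k in range(nodes):
        z = radius * cmath.exp(2j * cmath.pi * k / nodes)
        res = resolvent(f, z)
        weight = z ** (n + 1)       # z^n times the factor z coming from dz = i z d(phi)
        for i in range(2):
            for j in range(2):
                total[i][j] += weight * res[i][j]
    return [[-total[i][j] / nodes for j in range(2)] for i in range(2)]

f = [[0.5, 1.0], [0.0, -0.3]]       # spectrum {0.5, -0.3}, inside the circle |z| = 2
print(mat_power(f, 5))
print(contour_power(f, 5, 2.0))
```

The two printed matrices should agree up to rounding, since the quadrature error of the trapezoid rule on a circle decays geometrically in the number of nodes when the spectrum lies strictly inside the contour.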