Previous set of notes: Notes 3. Next set of notes: 246C Notes 1.

One of the great classical triumphs of complex analysis was in providing the first complete proof (by Hadamard and de la Vallée Poussin in 1896) of arguably the most important theorem in analytic number theory, the prime number theorem:

Theorem 1 (Prime number theorem) Let {\pi(x)} denote the number of primes less than a given real number {x}. Then

\displaystyle  \lim_{x \rightarrow \infty} \frac{\pi(x)}{x/\ln x} = 1

(or in asymptotic notation, {\pi(x) = (1+o(1)) \frac{x}{\ln x}} as {x \rightarrow \infty}).

(Actually, it turns out to be slightly more natural to replace the approximation {\frac{x}{\ln x}} in the prime number theorem by the logarithmic integral {\int_2^x \frac{dt}{\ln t}}, which happens to be a more precise approximation, but we will not stress this point here.)

The complex-analytic proof of this theorem hinges on the study of a key meromorphic function related to the prime numbers, the Riemann zeta function {\zeta}. Initially, it is only defined on the half-plane {\{ s \in {\bf C}: \mathrm{Re} s > 1 \}}:

Definition 2 (Riemann zeta function, preliminary definition) Let {s \in {\bf C}} be such that {\mathrm{Re} s > 1}. Then we define

\displaystyle  \zeta(s) := \sum_{n=1}^\infty \frac{1}{n^s}. \ \ \ \ \ (1)

Note that the series is locally uniformly convergent in the half-plane {\{ s \in {\bf C}: \mathrm{Re} s > 1 \}}, so in particular {\zeta} is holomorphic on this region. In previous notes we have already evaluated some special values of this function:

\displaystyle  \zeta(2) = \frac{\pi^2}{6}; \quad \zeta(4) = \frac{\pi^4}{90}; \quad \zeta(6) = \frac{\pi^6}{945}. \ \ \ \ \ (2)

However, it turns out that the zeroes (and pole) of this function are of far greater importance to analytic number theory, particularly with regards to the study of the prime numbers.

The Riemann zeta function has several remarkable properties, some of which we summarise here:

Theorem 3 (Basic properties of the Riemann zeta function)
  • (i) (Euler product formula) For any {s \in {\bf C}} with {\mathrm{Re} s > 1}, we have

    \displaystyle  \zeta(s) = \prod_p (1 - \frac{1}{p^s})^{-1} \ \ \ \ \ (3)

    where the product is absolutely convergent (and locally uniform in {s}) and is over the prime numbers {p = 2, 3, 5, \dots}.
  • (ii) (Trivial zero-free region) {\zeta(s)} has no zeroes in the region {\{s: \mathrm{Re}(s) > 1 \}}.
  • (iii) (Meromorphic continuation) {\zeta} has a unique meromorphic continuation to the complex plane (which by abuse of notation we also call {\zeta}), with a simple pole at {s=1} and no other poles. Furthermore, the Riemann xi function

    \displaystyle  \xi(s) := \frac{1}{2} s(s-1) \pi^{-s/2} \Gamma(s/2) \zeta(s) \ \ \ \ \ (4)

    is an entire function of order {1} (after removing all singularities). The function {(s-1) \zeta(s)} is an entire function of order one after removing the singularity at {s=1}.
  • (iv) (Functional equation) After applying the meromorphic continuation from (iii), we have

    \displaystyle  \zeta(s) = 2^s \pi^{s-1} \sin(\frac{\pi s}{2}) \Gamma(1-s) \zeta(1-s) \ \ \ \ \ (5)

    for all {s \in {\bf C}} (excluding poles). Equivalently, we have

    \displaystyle  \xi(s) = \xi(1-s) \ \ \ \ \ (6)

    for all {s \in {\bf C}}. (The equivalence between (5) and (6) is a routine consequence of the Euler reflection formula and the Legendre duplication formula; see Exercises 26 and 31 of Notes 1.)

Proof: We just prove (i) and (ii) for now, leaving (iii) and (iv) for later sections.

The claim (i) is an encoding of the fundamental theorem of arithmetic, which asserts that every natural number {n} is uniquely representable as a product {n = \prod_p p^{a_p}} over primes, where the {a_p} are natural numbers, all but finitely many of which are zero. Writing this representation as {\frac{1}{n^s} = \prod_p \frac{1}{p^{a_p s}}}, we see that

\displaystyle  \sum_{n \in S_{x,m}} \frac{1}{n^s} = \prod_{p \leq x} \sum_{a=0}^m \frac{1}{p^{as}}

whenever {x \geq 1}, {m \geq 0}, and {S_{x,m}} consists of all the natural numbers of the form {n = \prod_{p \leq x} p^{a_p}} for some {a_p \leq m}. Sending {m} and {x} to infinity, we conclude from monotone convergence and the geometric series formula that

\displaystyle  \sum_{n=1}^\infty \frac{1}{n^s} = \prod_{p} \sum_{a=0}^\infty \frac{1}{p^{as}} =\prod_p (1 - \frac{1}{p^s})^{-1}

whenever {s>1} is real, and then from dominated convergence we see that the same formula holds for complex {s} with {\mathrm{Re} s > 1} as well. Local uniform convergence then follows from the product form of the Weierstrass {M}-test (Exercise 19 of Notes 1).

The claim (ii) is immediate from (i) since the Euler product {\prod_p (1-\frac{1}{p^s})^{-1}} is absolutely convergent and all terms are non-zero. \Box
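
The Euler product formula (3) can be sanity-checked numerically. The following is a minimal sketch, not part of the proof: the helper names `primes_up_to`, `zeta_partial_sum` and `euler_partial_product`, as well as the truncation points, are illustrative choices, and the two truncations are only expected to agree up to the size of the discarded tails.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: list of all primes p <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            # Cross out p^2, p^2 + p, ... (all proper multiples not yet removed).
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]

def zeta_partial_sum(s, N):
    """Truncation of the Dirichlet series (1): sum of n^{-s} over 1 <= n <= N."""
    return sum(n ** (-s) for n in range(1, N + 1))

def euler_partial_product(s, x):
    """Truncation of the Euler product (3): product of (1 - p^{-s})^{-1} over p <= x."""
    prod = 1.0
    for p in primes_up_to(x):
        prod *= 1.0 / (1.0 - p ** (-s))
    return prod

# Compare both truncations at a real and at a complex point with Re s > 1.
for s in (2, 2 + 1j):
    print(s, zeta_partial_sum(s, 10**5), euler_partial_product(s, 10**4))
print("pi^2/6 =", math.pi ** 2 / 6)
```

At {s=2} both truncations should be close to the value {\frac{\pi^2}{6}} from (2); the agreement degrades as {\mathrm{Re} s} approaches {1}, where both the series and the product converge slowly.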

We remark that by sending {s} to {1} in Theorem 3(i) we conclude that

\displaystyle  \sum_{n=1}^\infty \frac{1}{n} = \prod_p (1-\frac{1}{p})^{-1}

and from the divergence of the harmonic series we then conclude Euler’s theorem {\sum_p \frac{1}{p} = \infty}. This can be viewed as a weak version of the prime number theorem, and already illustrates the potential applicability of the Riemann zeta function to control the distribution of the prime numbers.

The meromorphic continuation (iii) of the zeta function is initially surprising, but can be interpreted either as a manifestation of the extremely regular spacing of the natural numbers {n} occurring in the sum (1), or as a consequence of various integral representations of {\zeta} (or slight modifications thereof). We will focus in this set of notes on a particular representation of {\zeta} as essentially the Mellin transform of the theta function {\theta} that briefly appeared in previous notes, and the functional equation (iv) can then be viewed as a consequence of the modularity of that theta function. This in turn was established using the Poisson summation formula, so one can view the functional equation as ultimately being a manifestation of Poisson summation. (For a direct proof of the functional equation via Poisson summation, see these notes.)

Henceforth we work with the meromorphic continuation of {\zeta}. The functional equation (iv), when combined with special values of {\zeta} such as (2), gives some additional values of {\zeta} outside of its initial domain {\{s: \mathrm{Re} s > 1\}}, most famously

\displaystyle  \zeta(-1) = -\frac{1}{12}.

If one formally compares this formula with (1), one arrives at the infamous identity

\displaystyle  1 + 2 + 3 + \dots = -\frac{1}{12}

although this identity has to be interpreted in a suitable non-classical sense in order for it to be rigorous (see this previous blog post for further discussion).
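
As a worked check of how the functional equation (5) produces such values, one can substitute {s=-1} and the value {\zeta(2) = \frac{\pi^2}{6}} from (2) into the right-hand side of (5). The sketch below does this in floating point; the function name `zeta_via_functional_equation` is a hypothetical helper, and it simply evaluates the right-hand side of (5) given a known value of {\zeta(1-s)}.

```python
import math

def zeta_via_functional_equation(s, zeta_at_one_minus_s):
    """Right-hand side of (5): 2^s pi^(s-1) sin(pi s/2) Gamma(1-s) zeta(1-s),
    for real s, given the value of zeta(1-s)."""
    return (2.0 ** s) * (math.pi ** (s - 1)) * math.sin(math.pi * s / 2) \
        * math.gamma(1 - s) * zeta_at_one_minus_s

# s = -1 uses zeta(2) = pi^2/6; s = -3 uses zeta(4) = pi^4/90.
zeta_minus_one = zeta_via_functional_equation(-1, math.pi ** 2 / 6)
zeta_minus_three = zeta_via_functional_equation(-3, math.pi ** 4 / 90)
print(zeta_minus_one, -1 / 12)
print(zeta_minus_three, 1 / 120)
```

By hand, the first computation reads {2^{-1} \pi^{-2} \sin(-\frac{\pi}{2}) \Gamma(2) \frac{\pi^2}{6} = -\frac{1}{12}}.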

From Theorem 3 and the non-vanishing nature of {\Gamma}, we see that {\zeta} has simple zeroes (known as trivial zeroes) at the negative even integers {-2, -4, \dots}, and all other zeroes (the non-trivial zeroes) inside the critical strip {\{ s \in {\bf C}: 0 \leq \mathrm{Re} s \leq 1 \}}. (The non-trivial zeroes are conjectured to all be simple, but this is hopelessly far from being proven at present.) As we shall see shortly, these latter zeroes turn out to be closely related to the distribution of the primes. The functional equation tells us that if {\rho} is a non-trivial zero then so is {1-\rho}; also, we have the identity

\displaystyle  \zeta(s) = \overline{\zeta(\overline{s})} \ \ \ \ \ (7)

for all {s>1} by (1), hence for all {s} (except the pole at {s=1}) by meromorphic continuation. Thus if {\rho} is a non-trivial zero then so is {\overline{\rho}}. We conclude that the set of non-trivial zeroes is symmetric under reflection across both the real axis and the critical line {\{ s \in {\bf C}: \mathrm{Re} s = \frac{1}{2} \}}. We have the following infamous conjecture:

Conjecture 4 (Riemann hypothesis) All the non-trivial zeroes of {\zeta} lie on the critical line {\{ s \in {\bf C}: \mathrm{Re} s = \frac{1}{2} \}}.

This conjecture would have many implications in analytic number theory, particularly with regard to the distribution of the primes. Of course, it is far from proven at present, but the partial results we have towards this conjecture are still sufficient to establish results such as the prime number theorem.

Return now to the original region where {\mathrm{Re} s > 1}. To take more advantage of the Euler product formula (3), we take complex logarithms to conclude that

\displaystyle  -\log \zeta(s) = \sum_p \log(1 - \frac{1}{p^s})

for suitable branches of the complex logarithm, and then on taking derivatives (using for instance the generalised Cauchy integral formula and Fubini’s theorem to justify the interchange of summation and derivative) we see that

\displaystyle  -\frac{\zeta'(s)}{\zeta(s)} = \sum_p \frac{\ln p/p^s}{1 - \frac{1}{p^s}}.

From the geometric series formula we have

\displaystyle  \frac{\ln p/p^s}{1 - \frac{1}{p^s}} = \sum_{j=1}^\infty \frac{\ln p}{p^{js}}

and so (by another application of Fubini’s theorem) we have the identity

\displaystyle  -\frac{\zeta'(s)}{\zeta(s)} = \sum_{n=1}^\infty \frac{\Lambda(n)}{n^s}, \ \ \ \ \ (8)

for {\mathrm{Re} s > 1}, where the von Mangoldt function {\Lambda(n)} is defined to equal {\ln p} whenever {n = p^j} is a power of a prime {p} for some {j=1,2,\dots}, and {0} otherwise. The contribution of the higher prime powers {p^2, p^3, \dots} is negligible in practice, and as a first approximation one can think of the von Mangoldt function as the indicator function of the primes, weighted by the logarithm function.

The series {\sum_{n=1}^\infty \frac{1}{n^s}} and {\sum_{n=1}^\infty \frac{\Lambda(n)}{n^s}} that show up in the above formulae are examples of Dirichlet series, which are a convenient device to transform various sequences of arithmetic interest into holomorphic or meromorphic functions. Here are some more examples:

Exercise 5 (Standard Dirichlet series) Let {s} be a complex number with {\mathrm{Re} s > 1}.
  • (i) Show that {-\zeta'(s) = \sum_{n=1}^\infty \frac{\ln n}{n^s}}.
  • (ii) Show that {\zeta^2(s) = \sum_{n=1}^\infty \frac{\tau(n)}{n^s}}, where {\tau(n) := \sum_{d|n} 1} is the divisor function of {n} (the number of divisors of {n}).
  • (iii) Show that {\frac{1}{\zeta(s)} = \sum_{n=1}^\infty \frac{\mu(n)}{n^s}}, where {\mu(n)} is the Möbius function, defined to equal {(-1)^k} when {n} is the product of {k} distinct primes for some {k \geq 0}, and {0} otherwise.
  • (iv) Show that {\frac{\zeta(2s)}{\zeta(s)} = \sum_{n=1}^\infty \frac{\lambda(n)}{n^s}}, where {\lambda(n)} is the Liouville function, defined to equal {(-1)^k} when {n} is the product of {k} (not necessarily distinct) primes for some {k \geq 0}.
  • (v) Show that {\log \zeta(s) = \sum_{n=1}^\infty \frac{\Lambda(n)/\ln n}{n^s}}, where {\log \zeta} is the holomorphic branch of the logarithm that is real for {s>1}, and with the convention that {\Lambda(n)/\ln n} vanishes for {n=1}.
  • (vi) Use the fundamental theorem of arithmetic to show that the von Mangoldt function is the unique function {\Lambda: {\bf N} \rightarrow {\bf R}} such that

    \displaystyle  \ln n = \sum_{d|n} \Lambda(d)

    for every positive integer {n}. Use this and (i) to provide an alternate proof of the identity (8). Thus we see that (8) is really just another encoding of the fundamental theorem of arithmetic.
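
The divisor-sum identity in part (vi) is easy to test numerically (which of course is not a substitute for the proof requested in the exercise). The sketch below uses trial division; the names `von_mangoldt` and `divisor_sum_of_lambda` are illustrative.

```python
import math

def von_mangoldt(n):
    """Lambda(n) = ln p if n = p^j for a prime p and some j >= 1, and 0 otherwise."""
    if n < 2:
        return 0.0
    # Smallest prime factor of n (n itself if n is prime).
    p = next((d for d in range(2, math.isqrt(n) + 1) if n % d == 0), n)
    # Strip all factors of p; n is a power of p exactly when nothing is left over.
    while n % p == 0:
        n //= p
    return math.log(p) if n == 1 else 0.0

def divisor_sum_of_lambda(n):
    """Sum of Lambda(d) over all divisors d of n."""
    return sum(von_mangoldt(d) for d in range(1, n + 1) if n % d == 0)

# The identity ln n = sum_{d | n} Lambda(d), checked for a few values of n.
for n in (1, 12, 97, 360, 1024):
    print(n, math.log(n), divisor_sum_of_lambda(n))
```

For instance, for {n = 12} the divisors contributing are {2, 3, 4}, giving {\ln 2 + \ln 3 + \ln 2 = \ln 12}.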

Given the appearance of the von Mangoldt function {\Lambda}, it is natural to reformulate the prime number theorem in terms of this function:

Theorem 6 (Prime number theorem, von Mangoldt form) One has

\displaystyle  \lim_{x \rightarrow \infty} \frac{1}{x} \sum_{n \leq x} \Lambda(n) = 1

(or in asymptotic notation, {\sum_{n\leq x} \Lambda(n) = x + o(x)} as {x \rightarrow \infty}).

Let us see how Theorem 6 implies Theorem 1. Firstly, for any {x \geq 2}, we can write

\displaystyle  \sum_{n \leq x} \Lambda(n) = \sum_{p \leq x} \ln p + \sum_{j=2}^\infty \sum_{p \leq x^{1/j}} \ln p.

The sum {\sum_{p \leq x^{1/j}} \ln p} is non-zero for only {O(\ln x)} values of {j}, and is of size {O( x^{1/2} \ln x )}, thus

\displaystyle  \sum_{n \leq x} \Lambda(n) = \sum_{p \leq x} \ln p + O( x^{1/2} \ln^2 x ).

Since {x^{1/2} \ln^2 x = o(x)}, we conclude from Theorem 6 that

\displaystyle  \sum_{p \leq x} \ln p = x + o(x)

as {x \rightarrow \infty}. Next, observe from the fundamental theorem of calculus that

\displaystyle  \frac{1}{\ln p} - \frac{1}{\ln x} = \int_p^x \frac{1}{\ln^2 y} \frac{dy}{y}.

Multiplying by {\ln p} and summing over all primes {p \leq x}, we conclude that

\displaystyle  \pi(x) - \frac{\sum_{p \leq x} \ln p}{\ln x} = \int_2^x \sum_{p \leq y} \ln p \frac{1}{\ln^2 y} \frac{dy}{y}.

From Theorem 6 we certainly have {\sum_{p \leq y} \ln p = O(y)}, thus

\displaystyle  \pi(x) - \frac{x + o(x)}{\ln x} = O( \int_2^x \frac{dy}{\ln^2 y} ).

By splitting the integral into the ranges {2 \leq y \leq \sqrt{x}} and {\sqrt{x} < y \leq x} we see that the right-hand side is {o(x/\ln x)}, and Theorem 1 follows.
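
To get a feel for the quantities in this argument, one can tabulate {\pi(x)}, {\sum_{p \leq x} \ln p} and {\sum_{n \leq x} \Lambda(n)} for moderate {x}. The following is a rough numerical sketch only: the function name `prime_statistics` and the sample values of {x} are arbitrary choices, and no finite computation can of course establish the asymptotics.

```python
import math

def prime_statistics(x):
    """Return (pi(x), theta(x), psi(x)) for an integer x >= 2, where
    theta(x) is the sum of ln p over primes p <= x and
    psi(x) is the sum of Lambda(n) over n <= x."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(x) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    pi_x, theta, psi = 0, 0.0, 0.0
    for p in range(2, x + 1):
        if sieve[p]:
            log_p = math.log(p)
            pi_x += 1
            theta += log_p
            # Each prime power p, p^2, p^3, ... <= x contributes ln p to psi.
            q = p
            while q <= x:
                psi += log_p
                q *= p
    return pi_x, theta, psi

for x in (10**3, 10**4, 10**5, 10**6):
    pi_x, theta, psi = prime_statistics(x)
    print(x, psi / x, theta / x, pi_x * math.log(x) / x)
```

In this range one should see {\frac{1}{x} \sum_{n \leq x} \Lambda(n)} settle near {1} noticeably faster than {\pi(x) \ln x / x} does, which is consistent with the earlier remark that {\frac{x}{\ln x}} is a cruder approximation to {\pi(x)} than the logarithmic integral.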

Exercise 7 Show that Theorem 1 conversely implies Theorem 6.

The alternate form (8) of the Euler product identity connects the primes (represented here via proxy by the von Mangoldt function) with the logarithmic derivative of the zeta function, and can be used as a starting point for describing further relationships between {\zeta} and the primes. Most famously, we shall see later in these notes that it leads to the remarkably precise Riemann-von Mangoldt explicit formula:

Theorem 8 (Riemann-von Mangoldt explicit formula) For any non-integer {x > 1}, we have

\displaystyle  \sum_{n \leq x} \Lambda(n) = x - \lim_{T \rightarrow \infty} \sum_{\rho: |\hbox{Im}(\rho)| \leq T} \frac{x^\rho}{\rho} - \ln(2\pi) - \frac{1}{2} \ln( 1 - x^{-2} )

where {\rho} ranges over the non-trivial zeroes of {\zeta} with imaginary part in {[-T,T]}. Furthermore, the convergence of the limit is locally uniform in {x}.

Actually, it turns out that this formula is in some sense too precise; in applications it is often more convenient to work with smoothed variants of this formula in which the sum on the left-hand side is smoothed out, but the contribution of zeroes with large imaginary part is damped; see Exercise 22. Nevertheless, this formula clearly illustrates how the non-trivial zeroes {\rho} of the zeta function influence the primes. Indeed, if one formally differentiates the above formula in {x}, one is led to the (quite nonrigorous) approximation

\displaystyle  \Lambda(n) \approx 1 - \sum_\rho n^{\rho-1} \ \ \ \ \ (9)

or (writing {\rho = \sigma+i\gamma})

\displaystyle  \Lambda(n) \approx 1 - \sum_{\sigma+i\gamma} \frac{n^{i\gamma}}{n^{1-\sigma}}.

Thus we see that each zero {\rho = \sigma + i\gamma} induces an oscillation in the von Mangoldt function, with {\gamma} controlling the frequency of the oscillation and {\sigma} the rate at which the oscillation dies out as {n \rightarrow \infty}. This relationship is sometimes known informally as “the music of the primes”.

Comparing Theorem 8 with Theorem 6, it is natural to suspect that the key step in the proof of the latter is to establish the following slight but important extension of Theorem 3(ii), which can be viewed as a very small step towards the Riemann hypothesis:

Theorem 9 (Slight enlargement of zero-free region) There are no zeroes of {\zeta} on the line {\{ 1+it: t \in {\bf R} \}}.

It is not quite immediate to see how Theorem 6 follows from Theorem 8 and Theorem 9, but we will demonstrate it below the fold.

Although Theorem 9 only seems like a slight improvement of Theorem 3(ii), proving it is surprisingly non-trivial. The basic idea is the following: if there was a zero at {1+it}, then there would also be a different zero at {1-it} (note {t} cannot vanish due to the pole at {s=1}), and then the approximation (9) becomes

\displaystyle  \Lambda(n) \approx 1 - n^{it} - n^{-it} + \dots = 1 - 2 \cos(t \ln n) + \dots.

But the expression {1 - 2 \cos(t \ln n)} can be negative for large regions of the variable {n}, whereas {\Lambda(n)} is always non-negative. This conflict eventually leads to a contradiction, but it is not immediately obvious how to make this argument rigorous. We will present here the classical approach to doing so using a trigonometric identity of Mertens.

In fact, Theorem 9 is basically equivalent to the prime number theorem:

Exercise 10 For the purposes of this exercise, assume Theorem 6, but do not assume Theorem 9. For any non-zero real {t}, show that

\displaystyle  -\frac{\zeta'(\sigma+it)}{\zeta(\sigma+it)} = o( \frac{1}{\sigma-1})

as {\sigma \rightarrow 1^+}, where {o( \frac{1}{\sigma-1})} denotes a quantity that goes to zero as {\sigma \rightarrow 1^+} after being multiplied by {\sigma-1}. Use this to derive Theorem 9.

This equivalence can help explain why the prime number theorem is remarkably non-trivial to prove, and why the Riemann zeta function has to be either explicitly or implicitly involved in the proof.

This post is only intended as the briefest of introductions to complex-analytic methods in analytic number theory; also, we have not chosen the shortest route to the prime number theorem, electing instead to travel in directions that particularly showcase the complex-analytic results introduced in this course. For some further discussion see this previous set of lecture notes, particularly Notes 2 and Supplement 3 (with much of the material in this post drawn from the latter).

Previous set of notes: Notes 2. Next set of notes: Notes 4.

On the real line, the quintessential examples of a periodic function are the (normalised) sine and cosine functions {\sin(2\pi x)}, {\cos(2\pi x)}, which are {1}-periodic in the sense that

\displaystyle  \sin(2\pi(x+1)) = \sin(2\pi x); \quad \cos(2\pi (x+1)) = \cos(2\pi x).

By taking various polynomial combinations of {\sin(2\pi x)} and {\cos(2\pi x)} we obtain more general trigonometric polynomials that are {1}-periodic; and the theory of Fourier series tells us that all other {1}-periodic functions (with reasonable integrability conditions) can be approximated in various senses by such polynomial combinations. Using Euler’s identity, one can use {e^{2\pi ix}} and {e^{-2\pi ix}} in place of {\sin(2\pi x)} and {\cos(2\pi x)} as the basic generating functions here, provided of course one is willing to use complex coefficients instead of real ones. Of course, by rescaling one can also make similar statements for other periods than {1}.

{1}-periodic functions {f: {\bf R} \rightarrow {\bf C}} can also be identified (by abuse of notation) with functions {f: {\bf R}/{\bf Z} \rightarrow {\bf C}} on the quotient space {{\bf R}/{\bf Z}} (known as the additive {1}-torus or additive unit circle), or with functions {f: [0,1] \rightarrow {\bf C}} on the fundamental domain (up to boundary) {[0,1]} of that quotient space with the periodic boundary condition {f(0)=f(1)}. The map {x \mapsto (\cos(2\pi x), \sin(2\pi x))} also identifies the additive unit circle {{\bf R}/{\bf Z}} with the geometric unit circle {S^1 = \{ (x,y) \in {\bf R}^2: x^2+y^2=1\} \subset {\bf R}^2}, thanks in large part to the fundamental trigonometric identity {\cos^2 x + \sin^2 x = 1}; this can also be identified with the multiplicative unit circle {S^1 = \{ z \in {\bf C}: |z|=1 \}}. (Usually by abuse of notation we refer to all of these three sets simultaneously as the “unit circle”.) Trigonometric polynomials on the additive unit circle then correspond to ordinary polynomials of the real coordinates {x,y} of the geometric unit circle, or Laurent polynomials of the complex variable {z}.

What about periodic functions on the complex plane? We can start with singly periodic functions {f: {\bf C} \rightarrow {\bf C}} which obey a periodicity relationship {f(z+\omega)=f(z)} for all {z} in the domain and some period {\omega \in {\bf C} \backslash \{0\}}; such functions can also be viewed as functions on the “additive cylinder” {\omega {\bf Z} \backslash {\bf C}} (or equivalently {{\bf C} / \omega {\bf Z}}). We can rescale {\omega=1} as before. For holomorphic functions, we have the following characterisations:

Proposition 1 (Description of singly periodic holomorphic functions)
  • (i) Every {1}-periodic entire function {f: {\bf C} \rightarrow {\bf C}} has an absolutely convergent expansion

    \displaystyle  f(z) = \sum_{n=-\infty}^\infty a_n e^{2\pi i nz} = \sum_{n=-\infty}^\infty a_n q^n \ \ \ \ \ (1)

    where {q} is the nome {q := e^{2\pi i z}}, and the {a_n} are complex coefficients such that

    \displaystyle  \limsup_{n \rightarrow +\infty} |a_n|^{1/n} = \limsup_{n \rightarrow +\infty} |a_{-n}|^{1/n} = 0. \ \ \ \ \ (2)

    Conversely, every doubly infinite sequence {(a_n)_{n \in {\bf Z}}} of coefficients obeying (2) gives rise to a {1}-periodic entire function {f: {\bf C} \rightarrow {\bf C}} via the formula (1).
  • (ii) Every bounded {1}-periodic holomorphic function {f: {\bf H} \rightarrow {\bf C}} on the upper half-plane {\{ z: \mathrm{Im}(z) > 0\}} has an expansion

    \displaystyle  f(z) = \sum_{n=0}^\infty a_n e^{2\pi i nz} = \sum_{n=0}^\infty a_n q^n \ \ \ \ \ (3)

    where the {a_n} are complex coefficients such that

    \displaystyle  \limsup_{n \rightarrow +\infty} |a_n|^{1/n} \leq 1. \ \ \ \ \ (4)

    Conversely, every infinite sequence {(a_n)_{n=0}^\infty} obeying (4) gives rise via the formula (3) to a {1}-periodic holomorphic function {f: {\bf H} \rightarrow {\bf C}} which is bounded away from the real axis (i.e., bounded on {\{ z: \mathrm{Im}(z) \geq \varepsilon\}} for every {\varepsilon > 0}).
In both cases, the coefficients {a_n} can be recovered from {f} by the Fourier inversion formula

\displaystyle  a_n = \int_{\gamma_{z_0 \rightarrow z_0+1}} f(z) e^{-2\pi i nz}\ dz \ \ \ \ \ (5)

for any {z_0} in {{\bf C}} (in case (i)) or {{\bf H}} (in case (ii)).

Proof: If {f: {\bf C} \rightarrow {\bf C}} is {1}-periodic, then it can be expressed as {f(z) = F(q) = F(e^{2\pi i z})} for some function {F: {\bf C} \backslash \{0\} \rightarrow {\bf C}} on the “multiplicative cylinder” {{\bf C} \backslash \{0\}}, since the fibres of the map {z \mapsto e^{2\pi i z}} are cosets of the integers {{\bf Z}}, on which {f} is constant by hypothesis. As the map {z \mapsto e^{2\pi i z}} is a covering map from {{\bf C}} to {{\bf C} \backslash \{0\}}, we see that {F} will be holomorphic if and only if {f} is. Thus {F} must have a Laurent series expansion {F(q) = \sum_{n=-\infty}^\infty a_n q^n} with coefficients {a_n} obeying (2), which gives (1), and the inversion formula (5) follows from the usual contour integration formula for Laurent series coefficients. The converse direction to (i) also follows by reversing the above arguments.

For part (ii), we observe that the map {z \mapsto e^{2\pi i z}} is also a covering map from {{\bf H}} to the punctured disk {D(0,1) \backslash \{0\}}, so we can argue as before except that now {F} is a bounded holomorphic function on the punctured disk. By the Riemann singularity removal theorem (Exercise 35 of 246A Notes 3) {F} extends to be holomorphic on all of {D(0,1)}, and thus has a Taylor expansion {F(q) = \sum_{n=0}^\infty a_n q^n} for some coefficients {a_n} obeying (4). The argument now proceeds as with part (i). \Box
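
The inversion formula (5) can be illustrated numerically on a concrete example. The sketch below takes the {1}-periodic entire function {f(z) = \exp(e^{2\pi i z}) = \sum_{n=0}^\infty \frac{q^n}{n!}} (an example chosen here for illustration, so that {a_n = \frac{1}{n!}} for {n \geq 0} and {a_n = 0} for {n < 0}), and approximates the contour integral in (5) along the straight segment from {z_0} to {z_0+1} by an equally spaced Riemann sum. The function names and the number `M` of sample points are assumptions of this sketch.

```python
import cmath
import math

def fourier_coefficient(f, n, z0, M=256):
    """Approximate a_n = int_{z0 -> z0+1} f(z) exp(-2 pi i n z) dz by a Riemann sum
    with M equally spaced points on the horizontal segment from z0 to z0 + 1.
    For a 1-periodic integrand this is the trapezoid rule."""
    total = 0j
    for k in range(M):
        z = z0 + k / M
        total += f(z) * cmath.exp(-2j * math.pi * n * z)
    return total / M

def f(z):
    """Example 1-periodic entire function exp(q) with q = exp(2 pi i z)."""
    return cmath.exp(cmath.exp(2j * math.pi * z))

# The recovered coefficients should not depend on the base point z0.
for z0 in (0.3j, 0.1 - 0.2j):
    for n in range(-1, 5):
        a_n = fourier_coefficient(f, n, z0)
        expected = 1 / math.factorial(n) if n >= 0 else 0.0
        print(z0, n, a_n, expected)
```

The independence of the answer from {z_0} reflects the fact that the integrand in (5) is itself {1}-periodic and holomorphic, so that the contour can be shifted by Cauchy's theorem.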

The additive cylinder {{\bf Z} \backslash {\bf C}} and the multiplicative cylinder {{\bf C} \backslash \{0\}} can both be identified (on the level of smooth manifolds, at least) with the geometric cylinder {\{ (x,y,z) \in {\bf R}^3: x^2+y^2=1\}}, but we will not use this identification here.

Now let us turn attention to doubly periodic functions of a complex variable {z}, that is to say functions {f} that obey two periodicity relations

\displaystyle  f(z+\omega_1) = f(z); \quad f(z+\omega_2) = f(z)

for all {z \in {\bf C}} and some periods {\omega_1,\omega_2 \in {\bf C}}, which to avoid degeneracies we will assume to be linearly independent over the reals (thus {\omega_1,\omega_2} are non-zero and the ratio {\omega_2/\omega_1} is not real). One can rescale {\omega_1,\omega_2} by a common scaling factor {\lambda \in {\bf C} \backslash \{0\}} to normalise either {\omega_1=1} or {\omega_2=1}, but one of course cannot simultaneously normalise both parameters in this fashion. As in the singly periodic case, such functions can also be identified with functions on the additive {2}-torus {\Lambda \backslash {\bf C}}, where {\Lambda} is the lattice {\Lambda := \omega_1 {\bf Z} + \omega_2 {\bf Z}}, or with functions {f} on the solid parallelogram bounded by the contour {\gamma_{0 \rightarrow \omega_1 \rightarrow \omega_1+\omega_2 \rightarrow \omega_2 \rightarrow 0}} (a fundamental domain up to boundary for that torus), obeying the boundary periodicity conditions

\displaystyle  f(z+\omega_1) = f(z)

for {z} in the edge {\gamma_{\omega_2 \rightarrow 0}}, and

\displaystyle  f(z+\omega_2) = f(z)

for {z} in the edge {\gamma_{0 \rightarrow \omega_1}}.

Within the world of holomorphic functions, the collection of doubly periodic functions is boring:

Proposition 2 Let {f: {\bf C} \rightarrow {\bf C}} be an entire doubly periodic function (with periods {\omega_1,\omega_2} linearly independent over {{\bf R}}). Then {f} is constant.

In the language of Riemann surfaces, this proposition asserts that the torus {\Lambda \backslash {\bf C}} is a non-hyperbolic Riemann surface; it cannot be holomorphically mapped non-trivially into a bounded subset of the complex plane.

Proof: The fundamental domain (up to boundary) enclosed by {\gamma_{0 \rightarrow \omega_1 \rightarrow \omega_1+\omega_2 \rightarrow \omega_2 \rightarrow 0}} is compact, hence {f} is bounded on this domain, hence bounded on all of {{\bf C}} by double periodicity. The claim now follows from Liouville’s theorem. (One could alternatively have argued here using the compactness of the torus {(\omega_1 {\bf Z} + \omega_2 {\bf Z}) \backslash {\bf C}}.) \Box

To obtain more interesting examples of doubly periodic functions, one must therefore turn to the world of meromorphic functions – or equivalently, holomorphic functions into the Riemann sphere {{\bf C} \cup \{\infty\}}. As it turns out, a particularly fundamental example of such a function is the Weierstrass elliptic function

\displaystyle  \wp(z) := \frac{1}{z^2} + \sum_{z_0 \in \Lambda \backslash 0} \left( \frac{1}{(z-z_0)^2} - \frac{1}{z_0^2} \right) \ \ \ \ \ (6)

which plays a role in doubly periodic functions analogous to the role of {x \mapsto \cos(2\pi x)} for {1}-periodic real functions. This function will have a double pole at the origin {0}, and more generally at all other points on the lattice {\Lambda}, but no other poles. The derivative

\displaystyle  \wp'(z) = -2 \sum_{z_0 \in \Lambda} \frac{1}{(z-z_0)^3} \ \ \ \ \ (7)

of the Weierstrass function is another doubly periodic meromorphic function, now with a triple pole at every point of {\Lambda}, and plays a role analogous to {x \mapsto \sin(2\pi x)}. Remarkably, all the other doubly periodic meromorphic functions with these periods will turn out to be rational combinations of {\wp} and {\wp'}; furthermore, in analogy with the identity {\cos^2 x+ \sin^2 x = 1}, one has an identity of the form

\displaystyle  \wp'(z)^2 = 4 \wp(z)^3 - g_2 \wp(z) - g_3 \ \ \ \ \ (8)

for all {z \in {\bf C}} (avoiding poles) and some complex numbers {g_2,g_3} that depend on the lattice {\Lambda}. Indeed, much as the map {x \mapsto (\cos 2\pi x, \sin 2\pi x)} creates a diffeomorphism between the additive unit circle {{\bf R}/{\bf Z}} and the geometric unit circle {\{ (x,y) \in{\bf R}^2: x^2+y^2=1\}}, the map {z \mapsto (\wp(z), \wp'(z))} turns out to be a complex diffeomorphism between the torus {(\omega_1 {\bf Z} + \omega_2 {\bf Z}) \backslash {\bf C}} and the elliptic curve

\displaystyle  \{ (z, w) \in {\bf C}^2: w^2 = 4z^3 - g_2 z - g_3 \} \cup \{\infty\}

with the convention that {(\wp,\wp')} maps the origin {\omega_1 {\bf Z} + \omega_2 {\bf Z}} of the torus to the point {\infty} at infinity. (Indeed, one can view elliptic curves as “multiplicative tori”, and both the additive and multiplicative tori can be identified as smooth manifolds with the more familiar geometric torus, but we will not use such an identification here.) This fundamental identification between elliptic curves and tori motivates many of the further remarkable properties of elliptic curves; for instance, the fact that tori are obviously an abelian group gives rise to an abelian group law on elliptic curves (and this law can be interpreted as an analogue of the trigonometric sum identities for {\wp, \wp'}). The description of the various meromorphic functions on the torus also helps motivate the more general Riemann-Roch theorem that is a fundamental law governing meromorphic functions on other compact Riemann surfaces (and is discussed further in these 246C notes).

So far we have focused on studying a single torus {\Lambda \backslash {\bf C}}. However, another important mathematical object of study is the space of all such tori, modulo isomorphism; this is a basic example of a moduli space, known as the (classical, level one) modular curve {X_0(1)}. This curve can be described in a number of ways. On the one hand, it can be viewed as the upper half-plane {{\bf H} = \{ z: \mathrm{Im}(z) > 0 \}} quotiented out by the discrete group {SL_2({\bf Z})}; on the other hand, by using the {j}-invariant, it can be identified with the complex plane {{\bf C}}; alternatively, one can compactify the modular curve and identify this compactification with the Riemann sphere {{\bf C} \cup \{\infty\}}. (This identification, by the way, produces a very short proof of the little and great Picard theorems, which we proved in 246A Notes 4.)

Functions on the modular curve (such as the {j}-invariant) can be viewed as {SL_2({\bf Z})}-invariant functions on {{\bf H}}, and include the important class of modular functions; they naturally generalise to the larger class of (weakly) modular forms, which are functions on {{\bf H}} which transform in a very specific way under {SL_2({\bf Z})}-action, and which are ubiquitous throughout mathematics, and particularly in number theory. Basic examples of modular forms include the Eisenstein series, which are also the Laurent coefficients of the Weierstrass elliptic functions {\wp}. More number theoretic examples of modular forms include (suitable powers of) theta functions {\theta}, and the modular discriminant {\Delta}. Modular forms are {1}-periodic functions on the half-plane, and hence by Proposition 1 come with Fourier coefficients {a_n}; these coefficients often turn out to encode a surprising amount of number-theoretic information; a dramatic example of this is the famous modularity theorem, (a special case of which was) used amongst other things to establish Fermat’s last theorem. Modular forms can be generalised to other discrete groups than {SL_2({\bf Z})} (such as congruence groups) and to other domains than the half-plane {{\bf H}}, leading to the important larger class of automorphic forms, which are of major importance in number theory and representation theory, but which are well outside the scope of this course to discuss.
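
Returning to the Weierstrass function, the differential equation (8) can be checked numerically by truncating the lattice sums (6) and (7). The sketch below makes several assumptions that are not taken from the text above: it uses the standard normalisations {g_2 = 60 \sum_{z_0 \in \Lambda \backslash 0} z_0^{-4}} and {g_3 = 140 \sum_{z_0 \in \Lambda \backslash 0} z_0^{-6}}, an arbitrarily chosen lattice and test point, and a symmetric truncation parameter `N`; because the truncated sums converge only polynomially fast in `N`, the two sides of (8) are only expected to agree approximately.

```python
def lattice_points(omega1, omega2, N):
    """Non-zero lattice points m*omega1 + n*omega2 with |m|, |n| <= N."""
    return [m * omega1 + n * omega2
            for m in range(-N, N + 1) for n in range(-N, N + 1)
            if (m, n) != (0, 0)]

def weierstrass_p(z, pts):
    """Truncation of the series (6) for the Weierstrass elliptic function."""
    return 1 / z**2 + sum(1 / (z - w)**2 - 1 / w**2 for w in pts)

def weierstrass_p_prime(z, pts):
    """Truncation of the series (7); the z_0 = 0 term is written separately."""
    return -2 / z**3 - 2 * sum(1 / (z - w)**3 for w in pts)

def invariants(pts):
    """Truncated lattice invariants g_2, g_3 (standard normalisation, assumed here)."""
    g2 = 60 * sum(1 / w**4 for w in pts)
    g3 = 140 * sum(1 / w**6 for w in pts)
    return g2, g3

# Illustrative lattice and test point (arbitrary choices).
pts = lattice_points(1, 0.3 + 1.1j, 40)
z = 0.31 + 0.17j
p, dp = weierstrass_p(z, pts), weierstrass_p_prime(z, pts)
g2, g3 = invariants(pts)
lhs = dp**2
rhs = 4 * p**3 - g2 * p - g3
print(lhs, rhs, abs(lhs - rhs) / abs(lhs))
```

The symmetric truncation {|m|, |n| \leq N} is used so that the contributions of {z_0} and {-z_0} are always included together, which improves the cancellation in the tails of the sums.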

Previous set of notes: Notes 1. Next set of notes: Notes 3.

In Exercise 5 (and Lemma 1) of 246A Notes 4 we already observed some links between complex analysis on the disk (or annulus) and Fourier series on the unit circle:

  • (i) Functions {f} that are holomorphic on a disk {\{ |z| < R \}} are expressed by a convergent Fourier series (and also Taylor series) {f(re^{i\theta}) = \sum_{n=0}^\infty r^n a_n e^{in\theta}} for {0 \leq r < R} (so in particular {a_n = \frac{1}{n!} f^{(n)}(0)}), where

    \displaystyle  \limsup_{n \rightarrow +\infty} |a_n|^{1/n} \leq \frac{1}{R}; \ \ \ \ \ (1)

    conversely, every infinite sequence {(a_n)_{n=0}^\infty} of coefficients obeying (1) arises from such a function {f}.
  • (ii) Functions {f} that are holomorphic on an annulus {\{ r_- < |z| < r_+ \}} are expressed by a convergent Fourier series (and also Laurent series) {f(re^{i\theta}) = \sum_{n=-\infty}^\infty r^n a_n e^{in\theta}}, where

    \displaystyle  \limsup_{n \rightarrow +\infty} |a_n|^{1/n} \leq \frac{1}{r_+}; \limsup_{n \rightarrow -\infty} |a_n|^{1/|n|} \leq \frac{1}{r_-}; \ \ \ \ \ (2)

    conversely, every doubly infinite sequence {(a_n)_{n=-\infty}^\infty} of coefficients obeying (2) arises from such a function {f}.
  • (iii) In the situation of (ii), there is a unique decomposition {f = f_1 + f_2} where {f_1} extends holomorphically to {\{ z: |z| < r_+\}}, and {f_2} extends holomorphically to {\{ z: |z| > r_-\}} and goes to zero at infinity, and are given by the formulae

    \displaystyle  f_1(z) = \sum_{n=0}^\infty a_n z^n = \frac{1}{2\pi i} \int_\gamma \frac{f(w)}{w-z}\ dw

    where {\gamma} is any anticlockwise contour in {\{ z: |z| < r_+\}} enclosing {z}, and

    \displaystyle  f_2(z) = \sum_{n=-\infty}^{-1} a_n z^n = - \frac{1}{2\pi i} \int_\gamma \frac{f(w)}{w-z}\ dw

    where {\gamma} is any anticlockwise contour in {\{ z: |z| > r_-\}} enclosing {0} but not {z}.

This connection lets us interpret various facts about Fourier series through the lens of complex analysis, at least for some special classes of Fourier series. For instance, the Fourier inversion formula {a_n = \frac{1}{2\pi} \int_0^{2\pi} f(e^{i\theta}) e^{-in\theta}\ d\theta} becomes the Cauchy-type formula for the Laurent or Taylor coefficients of {f}, in the event that the coefficients are doubly infinite and obey (2) for some {r_- < 1 < r_+}, or singly infinite and obey (1) for some {R > 1}.
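As a quick numerical illustration of this Cauchy-type formula, here is a minimal sketch that recovers Taylor coefficients from samples on the unit circle. The test function {f(z) = 1/(2-z)}, the sample count and the helper name `fourier_coefficient` are illustrative assumptions, not taken from the notes; this {f} is holomorphic on {\{|z| < 2\}} with {a_n = 2^{-(n+1)}} for {n \geq 0}.

```python
import cmath
import math

def fourier_coefficient(f, n, samples=256):
    """Approximate a_n = (1/2pi) int_0^{2pi} f(e^{i theta}) e^{-i n theta} d theta
    by the trapezoid rule on equally spaced points of the unit circle."""
    total = 0j
    for k in range(samples):
        theta = 2 * math.pi * k / samples
        total += f(cmath.exp(1j * theta)) * cmath.exp(-1j * n * theta)
    return total / samples

def f(z):
    # Illustrative choice: holomorphic on |z| < 2, so (1) holds with R = 2 > 1.
    return 1 / (2 - z)

# Coefficients for n = -3, ..., 5; those with n < 0 should vanish.
coeffs = {n: fourier_coefficient(f, n) for n in range(-3, 6)}
for n, a_n in coeffs.items():
    expected = 2.0 ** -(n + 1) if n >= 0 else 0.0
    print(n, abs(a_n - expected))
```

The trapezoid rule is used here because it converges geometrically fast for functions holomorphic in an annulus around the circle, so a modest number of samples already suffices.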

It turns out that there are similar links between complex analysis on a half-plane (or strip) and Fourier integrals on the real line, which we will explore in these notes.

We first fix a normalisation for the Fourier transform. If {f \in L^1({\bf R})} is an absolutely integrable function on the real line, we define its Fourier transform {\hat f: {\bf R} \rightarrow {\bf C}} by the formula

\displaystyle  \hat f(\xi) := \int_{\bf R} f(x) e^{-2\pi i x \xi}\ dx. \ \ \ \ \ (3)

From the dominated convergence theorem {\hat f} will be a bounded continuous function; from the Riemann-Lebesgue lemma it also decays to zero as {\xi \rightarrow \pm \infty}. My choice to place the {2\pi} in the exponent is a personal preference (it is slightly more convenient for some harmonic analysis formulae such as the identities (4), (5), (6) below), though in the complex analysis and PDE literature there are also some slight advantages in omitting this factor. In any event it is not difficult to adapt the discussion in these notes to other choices of normalisation. It is of interest to extend the Fourier transform beyond the {L^1({\bf R})} class into other function spaces, such as {L^2({\bf R})} or the space of tempered distributions, but we will not pursue this direction here; see for instance these lecture notes of mine for a treatment.

Exercise 1 (Fourier transform of Gaussian) If {a} is a complex number with {\mathrm{Re} a>0} and {f} is the Gaussian function {f(x) := e^{-\pi a x^2}}, show that the Fourier transform {\hat f} is given by the Gaussian {\hat f(\xi) = a^{-1/2} e^{-\pi \xi^2/a}}, where we use the standard branch for {a^{-1/2}}.
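The claimed formula can be sanity-checked numerically before attempting a proof. The following is only a sketch: the midpoint quadrature, the truncation window and the sample values of {a} and {\xi} are assumptions chosen for illustration, and the check is restricted to real {\xi}.

```python
import cmath
import math

def gaussian_hat_numeric(a, xi, half_width=12.0, steps=24000):
    """Midpoint-rule approximation of int f(x) e^{-2 pi i x xi} dx
    for f(x) = exp(-pi a x^2), truncated to [-half_width, half_width]."""
    h = 2 * half_width / steps
    total = 0j
    for k in range(steps):
        x = -half_width + (k + 0.5) * h
        total += cmath.exp(-math.pi * a * x * x - 2j * math.pi * x * xi)
    return total * h

def gaussian_hat_exact(a, xi):
    """Claimed closed form a^{-1/2} exp(-pi xi^2 / a).
    Python's complex power uses the principal (standard) branch."""
    return a ** -0.5 * cmath.exp(-math.pi * xi * xi / a)

a, xi = 1 + 2j, 0.7   # illustrative values with Re a > 0
print(abs(gaussian_hat_numeric(a, xi) - gaussian_hat_exact(a, xi)))
```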

The Fourier transform has many remarkable properties. On the one hand, as long as the function {f} is sufficiently “reasonable”, the Fourier transform enjoys a number of very useful identities, such as the Fourier inversion formula

\displaystyle  f(x) = \int_{\bf R} \hat f(\xi) e^{2\pi i x \xi} d\xi, \ \ \ \ \ (4)

the Plancherel identity

\displaystyle  \int_{\bf R} |f(x)|^2\ dx = \int_{\bf R} |\hat f(\xi)|^2\ d\xi, \ \ \ \ \ (5)

and the Poisson summation formula

\displaystyle  \sum_{n \in {\bf Z}} f(n) = \sum_{k \in {\bf Z}} \hat f(k). \ \ \ \ \ (6)
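For a concrete instance of (6), one can take the Gaussian {f(x) = e^{-\pi a x^2}} from Exercise 1 with real {a > 0}, for which (6) reads {\sum_n e^{-\pi a n^2} = a^{-1/2} \sum_k e^{-\pi k^2/a}}. The sketch below checks this numerically; the value {a = 0.3} and the truncation of the sums are illustrative assumptions.

```python
import math

def theta_sum(a, terms=50):
    """Sum of exp(-pi a n^2) over integers n with |n| <= terms, for real a > 0.
    The omitted tail is far below double precision for the values used here."""
    return sum(math.exp(-math.pi * a * n * n) for n in range(-terms, terms + 1))

a = 0.3
lhs = theta_sum(a)                     # sum_n f(n)
rhs = a ** -0.5 * theta_sum(1 / a)     # sum_k hat f(k), using the formula of Exercise 1
print(lhs, rhs)
```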

On the other hand, the Fourier transform also intertwines various qualitative properties of a function {f} with “dual” qualitative properties of its Fourier transform {\hat f}; in particular, “decay” properties of {f} tend to be associated with “regularity” properties of {\hat f}, and vice versa. For instance, Fourier transforms of rapidly decreasing functions tend to be smooth. There are complex analysis counterparts of this Fourier dictionary, in which “decay” properties are described in terms of exponentially decaying pointwise bounds, and “regularity” properties are expressed using holomorphicity on various strips, half-planes, or the entire complex plane. The following exercise gives some examples of this:

Exercise 2 (Decay of {f} implies regularity of {\hat f}) Let {f \in L^1({\bf R})} be an absolutely integrable function.
  • (i) If {f} has super-exponential decay in the sense that {f(x) \lesssim_{f,M} e^{-M|x|}} for all {x \in {\bf R}} and {M>0} (that is to say one has {|f(x)| \leq C_{f,M} e^{-M|x|}} for some finite quantity {C_{f,M}} depending only on {f,M}), then {\hat f} extends uniquely to an entire function {\hat f : {\bf C} \rightarrow {\bf C}}. Furthermore, this function continues to be defined by (3).
  • (ii) If {f} is supported on a compact interval {[a,b]} then the entire function {\hat f} from (i) obeys the bounds {\hat f(\xi) \lesssim_f \max( e^{2\pi a \mathrm{Im} \xi}, e^{2\pi b \mathrm{Im} \xi} )} for {\xi \in {\bf C}}. In particular, if {f} is supported in {[-M,M]} then {\hat f(\xi) \lesssim_f e^{2\pi M |\mathrm{Im}(\xi)|}}.
  • (iii) If {f} obeys the bound {f(x) \lesssim_{f,a} e^{-2\pi a|x|}} for all {x \in {\bf R}} and some {a>0}, then {\hat f} extends uniquely to a holomorphic function {\hat f} on the horizontal strip {\{ \xi: |\mathrm{Im} \xi| < a \}}, and obeys the bound {\hat f(\xi) \lesssim_{f,a} \frac{1}{a - |\mathrm{Im}(\xi)|}} in this strip. Furthermore, this function continues to be defined by (3).
  • (iv) If {f} is supported on {[0,+\infty)} (resp. {(-\infty,0]}), then there is a unique continuous extension of {\hat f} to the lower half-plane {\{ \xi: \mathrm{Im} \xi \leq 0\}} (resp. the upper half-plane {\{ \xi: \mathrm{Im} \xi \geq 0 \}}) which is holomorphic in the interior of this half-plane, and such that {\hat f(\xi) \rightarrow 0} uniformly as {\mathrm{Im} \xi \rightarrow -\infty} (resp. {\mathrm{Im} \xi \rightarrow +\infty}). Furthermore, this function continues to be defined by (3).
Hint: to establish holomorphicity in each of these cases, use Morera’s theorem and the Fubini-Tonelli theorem. For uniqueness, use analytic continuation, or (for part (iv)) the Schwarz reflection principle.
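To get a feel for part (ii), one can test it on the simplest compactly supported example, the indicator function of {[a,b]}, whose Fourier transform has a closed form that is visibly entire. The sketch below is an illustration under assumptions: the interval, the sample points and the explicit constant {b-a} (which is just the {L^1} norm of the indicator) are choices made here, not part of the exercise.

```python
import cmath
import math

def indicator_hat(a, b, xi):
    """Closed form of int_a^b e^{-2 pi i x xi} dx for complex xi != 0."""
    return (cmath.exp(-2j * math.pi * a * xi)
            - cmath.exp(-2j * math.pi * b * xi)) / (2j * math.pi * xi)

def paley_wiener_bound(a, b, xi):
    """(b - a) * max(e^{2 pi a Im xi}, e^{2 pi b Im xi}), from |e^{-2 pi i x xi}| = e^{2 pi x Im xi}."""
    im = xi.imag
    return (b - a) * max(math.exp(2 * math.pi * a * im), math.exp(2 * math.pi * b * im))

a, b = -1.0, 2.0
points = [1 + 2j, -0.5 - 1.5j, 3 + 0.1j, 0.2 - 3j]
ratios = [abs(indicator_hat(a, b, xi)) / paley_wiener_bound(a, b, xi) for xi in points]
print(ratios)   # each ratio should be at most 1
```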

Later in these notes we will give a partial converse to part (ii) of this exercise, known as the Paley-Wiener theorem; there are also partial converses to the other parts of this exercise.

From (3) we observe the following intertwining property between multiplication by an exponential and complex translation: if {\xi_0} is a complex number and {f: {\bf R} \rightarrow {\bf C}} is an absolutely integrable function such that the modulated function {f_{\xi_0}(x) := e^{2\pi i \xi_0 x} f(x)} is also absolutely integrable, then we have the identity

\displaystyle  \widehat{f_{\xi_0}}(\xi) = \hat f(\xi - \xi_0) \ \ \ \ \ (7)

whenever {\xi} is a complex number such that at least one of the two sides of the equation in (7) is well defined. Thus, multiplication of a function by an exponential weight corresponds (formally, at least) to translation of its Fourier transform. By using contour shifting, we will also obtain a dual relationship: under suitable holomorphicity and decay conditions on {f}, translation by a complex shift will correspond to multiplication of the Fourier transform by an exponential weight. It turns out to be possible to exploit this property to derive many Fourier-analytic identities, such as the inversion formula (4) and the Poisson summation formula (6), which we do later in these notes. (The Plancherel theorem can also be established by complex analytic methods, but this requires a little more effort; see Exercise 8.)

The material in these notes is loosely adapted from Chapter 4 of Stein-Shakarchi’s “Complex Analysis”.


Previous set of notes: 246A Notes 5. Next set of notes: Notes 2.

— 1. Jensen’s formula —

Suppose {f} is a non-zero rational function {f = P/Q}. Then by the fundamental theorem of algebra one can write

\displaystyle  f(z) = c \frac{\prod_\rho (z-\rho)}{\prod_\zeta (z-\zeta)}

for some non-zero constant {c}, where {\rho} ranges over the zeroes of {P} (counting multiplicity) and {\zeta} ranges over the zeroes of {Q} (counting multiplicity), and assuming {z} avoids the zeroes of {Q}. Taking absolute values and then logarithms, we arrive at the formula

\displaystyle  \log |f(z)| = \log |c| + \sum_\rho \log|z-\rho| - \sum_\zeta \log |z-\zeta|, \ \ \ \ \ (1)

as long as {z} avoids the zeroes of both {P} and {Q}. (In this set of notes we use {\log} for the natural logarithm when applied to a positive real number, and {\mathrm{Log}} for the standard branch of the complex logarithm (which extends {\log}); the multi-valued complex logarithm {\log} will only be used in passing.) Alternatively, taking logarithmic derivatives, we arrive at the closely related formula

\displaystyle  \frac{f'(z)}{f(z)} = \sum_\rho \frac{1}{z-\rho} - \sum_\zeta \frac{1}{z-\zeta}, \ \ \ \ \ (2)

again for {z} avoiding the zeroes of both {P} and {Q}. Thus we see that the zeroes and poles of a rational function {f} describe the behaviour of that rational function, as well as close relatives of that function such as the log-magnitude {\log|f|} and log-derivative {\frac{f'}{f}}. We have already seen these sorts of formulae arise in our treatment of the argument principle in 246A Notes 4.
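The two formulae (1) and (2) are easy to test numerically on a small example. In the sketch below, the particular zeroes, poles, constant {c}, evaluation point and finite-difference step are all hypothetical choices made for illustration; the log-derivative is approximated by a central difference so that the comparison with (2) is not circular.

```python
import math

# Illustrative data (not from the notes): zeroes of P, zeroes of Q, and the constant c.
zeros = [1 + 1j, -2, 0.5j, 0.5j]      # a repeated entry models a double zero
poles = [3, -1 - 1j]
c = 2 - 1j

def f(z):
    """The rational function c * prod(z - rho) / prod(z - zeta)."""
    value = c
    for rho in zeros:
        value *= z - rho
    for zeta in poles:
        value /= z - zeta
    return value

def log_magnitude_from_roots(z):
    """Right-hand side of (1)."""
    return (math.log(abs(c))
            + sum(math.log(abs(z - rho)) for rho in zeros)
            - sum(math.log(abs(z - zeta)) for zeta in poles))

def log_derivative_from_roots(z):
    """Right-hand side of (2)."""
    return sum(1 / (z - rho) for rho in zeros) - sum(1 / (z - zeta) for zeta in poles)

def log_derivative_numeric(z, h=1e-5):
    """Central-difference approximation to f'(z)/f(z)."""
    return (f(z + h) - f(z - h)) / (2 * h * f(z))

z = 0.3 - 0.8j
gap_1 = abs(math.log(abs(f(z))) - log_magnitude_from_roots(z))
gap_2 = abs(log_derivative_numeric(z) - log_derivative_from_roots(z))
print(gap_1, gap_2)
```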

Exercise 1 Let {P(z)} be a complex polynomial of degree {n \geq 1}.
  • (i) (Gauss-Lucas theorem) Show that the complex roots of {P'(z)} are contained in the closed convex hull of the complex roots of {P(z)}.
  • (ii) (Laguerre separation theorem) If all the complex roots of {P(z)} are contained in a disk {D(z_0,r)}, and {\zeta \not \in D(z_0,r)}, then all the complex roots of {nP(z) + (\zeta - z) P'(z)} are also contained in {D(z_0,r)}. (Hint: apply a suitable Möbius transformation to move {\zeta} to infinity, and then apply part (i) to a polynomial that emerges after applying this transformation.)

There are a number of useful ways to extend these formulae to more general meromorphic functions than rational functions. Firstly there is a very handy “local” variant of (1) known as Jensen’s formula:

Theorem 2 (Jensen’s formula) Let {f} be a meromorphic function on an open neighbourhood of a disk {\overline{D(z_0,r)} = \{ z: |z-z_0| \leq r \}}, with all removable singularities removed. Then, if {z_0} is neither a zero nor a pole of {f}, we have

\displaystyle  \log |f(z_0)| = \int_0^1 \log |f(z_0+re^{2\pi i t})|\ dt + \sum_{\rho: |\rho-z_0| \leq r} \log \frac{|\rho-z_0|}{r} \ \ \ \ \ (3)

\displaystyle  - \sum_{\zeta: |\zeta-z_0| \leq r} \log \frac{|\zeta-z_0|}{r}

where {\rho} and {\zeta} range over the zeroes and poles of {f} respectively (counting multiplicity) in the disk {\overline{D(z_0,r)}}.

One can view (3) as a truncated (or localised) variant of (1). Note also that the summands {\log \frac{|\rho-z_0|}{r}, \log \frac{|\zeta-z_0|}{r}} are always non-positive.
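Jensen’s formula (3) can also be checked numerically. The following sketch uses a hypothetical rational function with some zeroes and poles inside the disk and some outside, and approximates the circle average by the trapezoid rule; the specific data, centre {z_0} and radius {r} are assumptions for illustration, chosen so that no zero or pole lies on or near the boundary circle.

```python
import cmath
import math

# Illustrative data (not from the notes).
zeros = [1 + 1j, 0.5j, 0.5j, 3]       # 0.5j is a double zero; 3 lies outside the disk used below
poles = [-1, 4j]                      # 4j lies outside the disk used below
c = 2 - 1j

def f(z):
    value = c
    for rho in zeros:
        value *= z - rho
    for zeta in poles:
        value /= z - zeta
    return value

def jensen_sides(z0, r, samples=4000):
    """Return (left-hand side, right-hand side) of Jensen's formula (3)."""
    lhs = math.log(abs(f(z0)))
    circle_mean = sum(
        math.log(abs(f(z0 + r * cmath.exp(2j * math.pi * k / samples))))
        for k in range(samples)
    ) / samples
    zero_terms = sum(math.log(abs(rho - z0) / r) for rho in zeros if abs(rho - z0) <= r)
    pole_terms = sum(math.log(abs(zeta - z0) / r) for zeta in poles if abs(zeta - z0) <= r)
    return lhs, circle_mean + zero_terms - pole_terms

lhs, rhs = jensen_sides(0.2 + 0.1j, 2.0)
print(lhs, rhs)
```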

Proof: By perturbing {r} slightly if necessary, we may assume that none of the zeroes or poles of {f} (which form a discrete set) lie on the boundary circle {\{ z: |z-z_0| = r \}}. By translating and rescaling, we may then normalise {z_0=0} and {r=1}, thus our task is now to show that

\displaystyle  \log |f(0)| = \int_0^1 \log |f(e^{2\pi i t})|\ dt + \sum_{\rho: |\rho| < 1} \log |\rho| - \sum_{\zeta: |\zeta| < 1} \log |\zeta|. \ \ \ \ \ (4)

We may remove the poles and zeroes inside the disk {D(0,1)} by the useful device of Blaschke products. Suppose for instance that {f} has a zero {\rho} inside the disk {D(0,1)}. Observe that the function

\displaystyle  B_\rho(z) := \frac{\rho - z}{1 - \overline{\rho} z} \ \ \ \ \ (5)

has magnitude {1} on the unit circle {\{ z: |z| = 1\}}, equals {\rho} at the origin, has a simple zero at {\rho}, but has no other zeroes or poles inside the disk. Thus Jensen’s formula (4) already holds if {f} is replaced by {B_\rho}. To prove (4) for {f}, it thus suffices to prove it for {f/B_\rho}, which effectively deletes a zero {\rho} inside the disk {D(0,1)} from {f} (and replaces it instead with its inversion {1/\overline{\rho}}). Similarly we may remove all the poles inside the disk. As a meromorphic function only has finitely many poles and zeroes inside a compact set, we may thus reduce to the case when {f} has no poles or zeroes on or inside the disk {D(0,1)}, at which point our goal is simply to show that

\displaystyle  \log |f(0)| = \int_0^1 \log |f(e^{2\pi i t})|\ dt.

Since {f} has no zeroes or poles inside the disk, it has a holomorphic logarithm {F} (Exercise 46 of 246A Notes 4). In particular, {\log |f|} is the real part of {F}. The claim now follows by applying the mean value property (Exercise 17 of 246A Notes 3) to {\log |f|}. \Box
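The three properties of the Blaschke factor (5) used in this proof (unit magnitude on the circle, value {\rho} at the origin, zero at {\rho}) can be confirmed numerically. The sketch below uses an arbitrary illustrative value of {\rho}; the helper name `blaschke` is ours.

```python
import cmath
import math

def blaschke(rho, z):
    """The Blaschke factor B_rho(z) = (rho - z) / (1 - conj(rho) z) from (5)."""
    return (rho - z) / (1 - rho.conjugate() * z)

rho = 0.3 - 0.6j   # illustrative point inside the unit disk

# Largest deviation of |B_rho| from 1 over 360 sample points of the unit circle.
max_deviation = max(
    abs(abs(blaschke(rho, cmath.exp(2j * math.pi * k / 360))) - 1)
    for k in range(360)
)
value_at_origin = blaschke(rho, 0)
value_at_rho = blaschke(rho, rho)
print(max_deviation, value_at_origin, value_at_rho)
```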

An important special case of Jensen’s formula arises when {f} is holomorphic in a neighborhood of {\overline{D(z_0,r)}}, in which case there are no contributions from poles and one simply has

\displaystyle  \int_0^1 \log |f(z_0+re^{2\pi i t})|\ dt = \log |f(z_0)| + \sum_{\rho: |\rho-z_0| \leq r} \log \frac{r}{|\rho-z_0|}. \ \ \ \ \ (6)

This is quite a useful formula, mainly because the summands {\log \frac{r}{|\rho-z_0|}} are non-negative; it can be viewed as a more precise assertion of the subharmonicity of {\log |f|} (see Exercises 60(ix) and 61 of 246A Notes 5). Here are some quick applications of this formula:

Exercise 3 Use (6) to give another proof of Liouville’s theorem: a bounded holomorphic function {f} on the entire complex plane is necessarily constant.

Exercise 4 Use Jensen’s formula to prove the fundamental theorem of algebra: a complex polynomial {P(z)} of degree {n} has exactly {n} complex zeroes (counting multiplicity), and can thus be factored as {P(z) = c (z-z_1) \dots (z-z_n)} for some complex numbers {c,z_1,\dots,z_n} with {c \neq 0}. (Note that the fundamental theorem was invoked previously in this section, but only for motivational purposes, so the proof here is non-circular.)

Exercise 5 (Shifted Jensen’s formula) Let {f} be a meromorphic function on an open neighbourhood of a disk {\{ z: |z-z_0| \leq r \}}, with all removable singularities removed. Show that

\displaystyle  \log |f(z)| = \int_0^1 \log |f(z_0+re^{2\pi i t})| \mathrm{Re} \frac{r e^{2\pi i t} + (z-z_0)}{r e^{2\pi i t} - (z-z_0)}\ dt \ \ \ \ \ (7)

\displaystyle  + \sum_{\rho: |\rho-z_0| \leq r} \log \frac{|\rho-z|}{|r - \rho^* (z-z_0)|}

\displaystyle - \sum_{\zeta: |\zeta-z_0| \leq r} \log \frac{|\zeta-z|}{|r - \zeta^* (z-z_0)|}

for all {z} in the open disk {\{ z: |z-z_0| < r\}} that are not zeroes or poles of {f}, where {\rho^* = \frac{\overline{\rho-z_0}}{r}} and {\zeta^* = \frac{\overline{\zeta-z_0}}{r}}. (The function {\mathrm{Re} \frac{r e^{2\pi i t} + (z-z_0)}{r e^{2\pi i t} - (z-z_0)}} appearing in the integrand is sometimes known as the Poisson kernel, particularly if one normalises so that {z_0=0} and {r=1}.)

Exercise 6 (Bounded type)
  • (i) If {f} is a holomorphic function on {D(0,1)} that is not identically zero, show that {\liminf_{r \rightarrow 1^-} \int_0^{2\pi} \log |f(re^{i\theta})|\ d\theta > -\infty}.
  • (ii) If {f} is a meromorphic function on {D(0,1)} that is the ratio of two bounded holomorphic functions that are not identically zero, show that {\limsup_{r \rightarrow 1^-} \int_0^{2\pi} |\log |f(re^{i\theta})||\ d\theta < \infty}. (Functions {f} of this form are said to be of bounded type and lie in the Nevanlinna class for the unit disk {D(0,1)}.)

Exercise 7 (Smoothed out Jensen formula) Let {f} be a meromorphic function on an open set {U}, and let {\phi: U \rightarrow {\bf C}} be a smooth compactly supported function. Show that

\displaystyle \sum_\rho \phi(\rho) - \sum_\zeta \phi(\zeta)

\displaystyle  = \frac{-1}{2\pi} \int\int_U ((\frac{\partial}{\partial x} + i \frac{\partial}{\partial y}) \phi(x+iy)) \frac{f'}{f}(x+iy)\ dx dy

\displaystyle  = \frac{1}{2\pi} \int\int_U ((\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}) \phi(x+iy)) \log |f(x+iy)|\ dx dy

where {\rho, \zeta} range over the zeroes and poles of {f} (respectively) in the support of {\phi}. Informally argue why this identity is consistent with Jensen’s formula. (Note: as many of the functions involved here are not holomorphic, complex analysis tools are of limited use. Try using real variable tools such as Stokes’ theorem, Green’s theorem, or integration by parts.)

When applied to entire functions {f}, Jensen’s formula relates the order of growth of {f} near infinity with the density of zeroes of {f}. Here is a typical result:

Proposition 8 Let {f: {\bf C} \rightarrow {\bf C}} be an entire function, not identically zero, that obeys a growth bound {|f(z)| \leq C \exp( C|z|^\alpha)} for some {C, \alpha > 0} and all {z}. Then there exists a constant {C'>0} such that {D(0,R)} has at most {C' R^\alpha} zeroes (counting multiplicity) for any {R \geq 1}.

Entire functions that obey a growth bound of the form {|f(z)| \leq C_\varepsilon \exp( C_\varepsilon |z|^{\rho+\varepsilon})} for every {\varepsilon>0} and {z} (where {C_\varepsilon} depends on {\varepsilon}) are said to be of order at most {\rho}. The above proposition shows that for such functions that are not identically zero, the number of zeroes in a disk of radius {R} does not grow much faster than {R^\rho}. This is often a useful preliminary upper bound on the zeroes of entire functions, as the order of an entire function tends to be relatively easy to compute in practice.

Proof: First suppose that {f(0)} is non-zero. From (6) applied with {r=2R} and {z_0=0} one has

\displaystyle  \int_0^1 \log(C \exp( C (2R)^\alpha ) )\ dt \geq \log |f(0)| + \sum_{\rho: |\rho| \leq 2R} \log \frac{2R}{|\rho|}.

Every zero in {D(0,R)} contributes a summand of at least {\log 2} to the right-hand side, while all other zeroes contribute a non-negative quantity, thus

\displaystyle  \log C + C (2R)^\alpha \geq \log |f(0)| + N_R \log 2

where {N_R} denotes the number of zeroes in {D(0,R)}. This gives the claim for {f(0) \neq 0}. When {f(0)=0}, one can shift {f} by a small amount to make {f} non-zero at the origin (using the fact that zeroes of holomorphic functions not identically zero are isolated), modifying {C} in the process, and then repeating the previous arguments. \Box
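To see what the inequality {\log C + C (2R)^\alpha \geq \log |f(0)| + N_R \log 2} gives in practice, consider the illustrative example {f(z) = \cos(\pi z)} (our choice, not from the notes). It obeys {|f(z)| \leq e^{\pi |z|}}, so one may take {C = \pi} and {\alpha = 1}, it has {f(0) = 1}, and its zeroes are the half-integers. The sketch below compares the true zero count with the resulting bound.

```python
import math

def zeros_of_cos_pi_z_in_disk(R):
    """Count the zeroes n + 1/2 (n an integer) of cos(pi z) with |n + 1/2| < R."""
    count = 0
    n = 0
    while n + 0.5 < R:
        count += 2          # the zero n + 1/2 and its mirror image -(n + 1/2)
        n += 1
    return count

def jensen_zero_bound(R, C=math.pi, alpha=1.0, abs_f0=1.0):
    """Upper bound (log C + C (2R)^alpha - log|f(0)|) / log 2 for N_R from the proof above."""
    return (math.log(C) + C * (2 * R) ** alpha - math.log(abs_f0)) / math.log(2)

for R in (1, 2.5, 10, 100):
    print(R, zeros_of_cos_pi_z_in_disk(R), jensen_zero_bound(R))
```

In this example both quantities grow linearly in {R}, consistent with the proposition, though the constant coming from the proof is far from sharp.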

Just as (3) and (7) give truncated variants of (1), we can create truncated versions of (2). The following crude truncation is adequate for many applications:

Theorem 9 (Truncated formula for log-derivative) Let {f} be a holomorphic function on an open neighbourhood of a disk {\{ z: |z-z_0| \leq r \}} that is not identically zero on this disk. Suppose that one has a bound of the form {|f(z)| \leq M^{O_{c_1,c_2}(1)} |f(z_0)|} for some {M \geq 1} and all {z} on the circle {\{ z: |z-z_0| = r\}}. Let {0 < c_2 < c_1 < 1} be constants. Then one has the approximate formula

\displaystyle  \frac{f'(z)}{f(z)} = \sum_{\rho: |\rho - z_0| \leq c_1 r} \frac{1}{z-\rho} + O_{c_1,c_2}( \frac{\log M}{r} )

for all {z} in the disk {\{ z: |z-z_0| < c_2 r \}} other than zeroes of {f}. Furthermore, the number of zeroes {\rho} in the above sum is {O_{c_1,c_2}(\log M)}.

Proof: To abbreviate notation, we allow all implied constants in this proof to depend on {c_1,c_2}.

We mimic the proof of Jensen’s formula. Firstly, we may translate and rescale so that {z_0=0} and {r=1}, so we have {|f(z)| \leq M^{O(1)} |f(0)|} when {|z|=1}, and our main task is to show that

\displaystyle  \frac{f'(z)}{f(z)} - \sum_{\rho: |\rho| \leq c_1} \frac{1}{z-\rho} = O( \log M ) \ \ \ \ \ (8)

for {|z| \leq c_2}. Note that if {f(0)=0} then {f} vanishes on the unit circle and hence (by the maximum principle) vanishes identically on the disk, a contradiction, so we may assume {f(0) \neq 0}. From hypothesis we then have

\displaystyle  \log |f(z)| \leq \log |f(0)| + O(\log M)

on the unit circle, and so from Jensen’s formula (3) we see that

\displaystyle  \sum_{\rho: |\rho| \leq 1} \log \frac{1}{|\rho|} = O(\log M). \ \ \ \ \ (9)

In particular we see that the number of zeroes with {|\rho| \leq c_1} is {O(\log M)}, as claimed.

Suppose {f} has a zero {\rho} with {c_1 < |\rho| \leq 1}. If we factor {f = B_\rho g}, where {B_\rho} is the Blaschke product (5), then

\displaystyle  \frac{f'}{f} = \frac{B'_\rho}{B_\rho} + \frac{g'}{g}

\displaystyle  = \frac{g'}{g} + \frac{1}{z-\rho} - \frac{1}{z-1/\overline{\rho}}.

Observe from Taylor expansion that the distance between {\rho} and {1/\overline{\rho}} is {O( \log \frac{1}{|\rho|} )}, and hence {\frac{1}{z-\rho} - \frac{1}{z-1/\overline{\rho}} = O( \log \frac{1}{|\rho|} )} for {|z| \leq c_2}. Thus we see from (9) that we may use Blaschke products to remove all the zeroes in the annulus {c_1 < |\rho| \leq 1} while only affecting the left-hand side of (8) by {O( \log M)}; also, removing the Blaschke products does not affect {|f(z)|} on the unit circle, and only affects {\log |f(0)|} by {O(\log M)} thanks to (9). Thus we may assume without loss of generality that there are no zeroes in this annulus.

Similarly, given a zero {\rho} with {|\rho| \leq c_1}, we have {\frac{1}{z-1/\overline{\rho}} = O(1)}, so using Blaschke products to remove all of these zeroes also only affects the left-hand side of (8) by {O(\log M)} (since the number of zeroes here is {O(\log M)}), with {\log |f(0)|} also modified by at most {O(\log M)}. Thus we may assume in fact that {f} has no zeroes whatsoever within the unit disk. We may then also normalise {f(0) = 1}, then {\log |f(e^{2\pi i t})| \leq O(\log M)} for all {t \in [0,1]}. By Jensen’s formula again, we have

\displaystyle  \int_0^1 \log |f(e^{2\pi i t})|\ dt = 0

and thus (by using the identity {|x| = 2 \max(x,0) - x} for any real {x})

\displaystyle  \int_0^1 \left|\log |f(e^{2\pi i t})|\right|\ dt \ll \log M. \ \ \ \ \ (10)

On the other hand, from (7) we have

\displaystyle  \log |f(z)| = \int_0^1 \log |f(e^{2\pi i t})| \mathrm{Re} \frac{e^{2\pi i t} + z}{e^{2\pi i t} - z}\ dt

which implies from (10) that {\log |f(z)|} and its first derivatives are {O( \log M )} on the disk {\{ z: |z| \leq c_2 \}}. But recall from the proof of Jensen’s formula that {\frac{f'}{f}} is the derivative of a logarithm {\log f} of {f}, whose real part is {\log |f|}. By the Cauchy-Riemann equations for {\log f}, we conclude that {\frac{f'}{f} = O(\log M)} on the disk {\{ z: |z| \leq c_2 \}}, as required. \Box

Exercise 10
  • (i) (Borel-Carathéodory theorem) If {f: U \rightarrow {\bf C}} is analytic on an open neighborhood of a disk {\overline{D(z_0,R)}} and {0 < r < R}, show that

    \displaystyle  \sup_{z \in D(z_0,r)} |f(z)| \leq \frac{2r}{R-r} \sup_{z \in \overline{D(z_0,R)}} \mathrm{Re} f(z) + \frac{R+r}{R-r} |f(z_0)|.

    (Hint: one can normalise {z_0=0}, {R=1}, {f(0)=0}, and {\sup_{|z-z_0| \leq R} \mathrm{Re} f(z)=1}. Now {f} maps the unit disk to the half-plane {\{ \mathrm{Re} z \leq 1 \}}. Use a Möbius transformation to map the half-plane to the unit disk and then use the Schwarz lemma.)
  • (ii) Use (i) to give an alternate way to conclude the proof of Theorem 9.

A variant of the above argument allows one to make precise the heuristic that holomorphic functions locally look like polynomials:

Exercise 11 (Local Weierstrass factorisation) Let the notation and hypotheses be as in Theorem 9. Then show that

\displaystyle  f(z) = P(z) \exp( g(z) )

for all {z} in the disk {\{ z: |z-z_0| < c_2 r \}}, where {P} is a polynomial whose zeroes are precisely the zeroes of {f} in {\{ z: |z-z_0| \leq c_1r \}} (counting multiplicity) and {g} is a holomorphic function on {\{ z: |z-z_0| < c_2 r \}} of magnitude {O_{c_1,c_2}( \log M )} and first derivative {O_{c_1,c_2}( \log M / r )} on this disk. Furthermore, show that the degree of {P} is {O_{c_1,c_2}(\log M)}.

Exercise 12 (Preliminary Beurling factorisation) Let {H^\infty(D(0,1))} denote the space of bounded analytic functions {f: D(0,1) \rightarrow {\bf C}} on the unit disk; this is a normed vector space with norm

\displaystyle  \|f\|_{H^\infty(D(0,1))} := \sup_{z \in D(0,1)} |f(z)|.

  • (i) If {f \in H^\infty(D(0,1))} is not identically zero, and {z_n} denote the zeroes of {f} in {D(0,1)} counting multiplicity, show that

    \displaystyle  \sum_n (1-|z_n|) < \infty

    and

    \displaystyle  \sup_{1/2 < r < 1} \int_0^{2\pi} | \log |f(re^{i\theta})| |\ d\theta < \infty.

  • (ii) Let the notation be as in (i). If we define the Blaschke product

    \displaystyle  B(z) := z^m \prod_{|z_n| \neq 0} \frac{|z_n|}{z_n} \frac{z_n-z}{1-\overline{z_n} z}

    where {m} is the order of vanishing of {f} at zero, show that this product converges absolutely to a holomorphic function on {D(0,1)}, and that {|f(z)| \leq \|f\|_{H^\infty(D(0,1))} |B(z)|} for all {z \in D(0,1)}. (It may be easier to work with finite Blaschke products first to obtain this bound.)
  • (iii) Continuing the notation from (i), establish a factorisation {f(z) = B(z) \exp(g(z))} for some holomorphic function {g: D(0,1) \rightarrow {\bf C}} with {\mathrm{Re}(g(z)) \leq \log \|f\|_{H^\infty(D(0,1))}} for all {z\in D(0,1)}.
  • (iv) (Theorem of F. and M. Riesz, special case) If {f \in H^\infty(D(0,1))} extends continuously to the boundary {\{e^{i\theta}: 0 \leq \theta < 2\pi\}}, show that the set {\{ 0 \leq \theta < 2\pi: f(e^{i\theta})=0 \}} has zero measure.

Remark 13 The factorisation (iii) can be refined further, with {g} being the Poisson integral of some finite measure on the unit circle. Using the Lebesgue decomposition of this finite measure into absolutely continuous and singular parts one ends up factorising {H^\infty(D(0,1))} functions into “outer functions” and “inner functions”, giving the Beurling factorisation of {H^\infty}. There are also extensions to larger spaces {H^p(D(0,1))} than {H^\infty(D(0,1))} (which are to {H^\infty} as {L^p} is to {L^\infty}), known as Hardy spaces. We will not discuss this topic further here, but see for instance this text of Garnett for a treatment.

Exercise 14 (Littlewood’s lemma) Let {f} be holomorphic on an open neighbourhood of a rectangle {R = \{ \sigma+it: \sigma_0 \leq \sigma \leq \sigma_1; 0 \leq t \leq T \}} for some {\sigma_0 < \sigma_1} and {T>0}, with {f} non-vanishing on the boundary of the rectangle. Show that

\displaystyle  2\pi \sum_\rho (\mathrm{Re}(\rho)-\sigma_0) = \int_0^T \log |f(\sigma_0+it)|\ dt - \int_0^T \log |f(\sigma_1+it)|\ dt

\displaystyle  + \int_{\sigma_0}^{\sigma_1} \mathrm{arg} f(\sigma+iT)\ d\sigma - \int_{\sigma_0}^{\sigma_1} \mathrm{arg} f(\sigma)\ d\sigma

where {\rho} ranges over the zeroes of {f} inside {R} (counting multiplicity) and one uses a branch of {\mathrm{arg} f} which is continuous on the upper, lower, and right edges of {R}. (This lemma is a popular tool to explore the zeroes of Dirichlet series such as the Riemann zeta function.)


Just a short announcement that next quarter I will be continuing the recently concluded 246A complex analysis class as 246B. Topics I plan to cover:

Notes for the later material will appear on this blog in due course.

Consider a disk {D(z_0,r) := \{ z: |z-z_0| < r \}} in the complex plane. If one applies an affine-linear map {f(z) = az+b} to this disk, one obtains

\displaystyle  f(D(z_0,r)) = D(f(z_0), |f'(z_0)| r).

For maps that are merely holomorphic instead of affine-linear, one has some variants of this assertion, which I am recording here mostly for my own reference:

Theorem 1 (Holomorphic images of disks) Let {D(z_0,r)} be a disk in the complex plane, and {f: D(z_0,r) \rightarrow {\bf C}} be a holomorphic function with {f'(z_0) \neq 0}.
  • (i) (Open mapping theorem or inverse function theorem) {f(D(z_0,r))} contains a disk {D(f(z_0),\varepsilon)} for some {\varepsilon>0}. (In fact there is even a holomorphic right inverse of {f} from {D(f(z_0), \varepsilon)} to {D(z_0,r)}.)
  • (ii) (Bloch theorem) {f(D(z_0,r))} contains a disk {D(w, c |f'(z_0)| r)} for some absolute constant {c>0} and some {w \in {\bf C}}. (In fact there is even a holomorphic right inverse of {f} from {D(w, c |f'(z_0)| r)} to {D(z_0,r)}.)
  • (iii) (Koebe quarter theorem) If {f} is injective, then {f(D(z_0,r))} contains the disk {D(f(z_0), \frac{1}{4} |f'(z_0)| r)}.
  • (iv) If {f} is a polynomial of degree {n}, then {f(D(z_0,r))} contains the disk {D(f(z_0), \frac{1}{n} |f'(z_0)| r)}.
  • (v) If one has a bound of the form {|f'(z)| \leq A |f'(z_0)|} for all {z \in D(z_0,r)} and some {A>1}, then {f(D(z_0,r))} contains the disk {D(f(z_0), \frac{c}{A} |f'(z_0)| r)} for some absolute constant {c>0}. (In fact there is even a holomorphic right inverse of {f} from {D(f(z_0), \frac{c}{A} |f'(z_0)| r)} to {D(z_0,r)}.)

Parts (i), (ii), (iii) of this theorem are standard, as indicated by the given links. I found part (iv) as (a consequence of) Theorem 2 of this paper of Degot, who remarks that it “seems not already known in spite of its simplicity”; an equivalent form of this result also appears in Lemma 4 of this paper of Miller. The proof is simple:

Proof: (Proof of (iv)) Let {w \in D(f(z_0), \frac{1}{n} |f'(z_0)| r)}, then we have a lower bound for the log-derivative of {f(z)-w} at {z_0}:

\displaystyle  \frac{|f'(z_0)|}{|f(z_0)-w|} > \frac{n}{r}

(with the convention that the left-hand side is infinite when {f(z_0)=w}). But by the fundamental theorem of algebra we have

\displaystyle  \frac{f'(z_0)}{f(z_0)-w} = \sum_{j=1}^n \frac{1}{z_0-\zeta_j}

where {\zeta_1,\dots,\zeta_n} are the roots of the polynomial {f(z)-w} (counting multiplicity). By the pigeonhole principle, there must therefore exist a root {\zeta_j} of {f(z) - w} such that

\displaystyle  \frac{1}{|z_0-\zeta_j|} > \frac{1}{r}

and hence {\zeta_j \in D(z_0,r)}. Thus {f(D(z_0,r))} contains {w}, and the claim follows. \Box

The constant {\frac{1}{n}} in (iv) is completely sharp: if {f(z) = z^n} and {z_0} is non-zero then {f(D(z_0,|z_0|))} contains the disk

\displaystyle D(f(z_0), \frac{1}{n} |f'(z_0)| r) = D( z_0^n, |z_0|^n)

but avoids the origin, thus does not contain any disk of the form {D( z_0^n, |z_0|^n+\varepsilon)}. This example also shows that despite parts (ii), (iii) of the theorem, one cannot hope for a general inclusion of the form

\displaystyle  f(D(z_0,r)) \supset D(f(z_0), c |f'(z_0)| r )

for an absolute constant {c>0}.
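Part (iv) and its sharpness can be illustrated numerically for {f(z) = z^n}, where the preimages of any {w} are explicit {n}-th roots. The sketch below takes {z_0 = 1}, {r = |z_0| = 1} and {n = 5} (illustrative assumptions), samples points {w} of the disk {D(z_0^n, |z_0|^n)} on a polar grid, and checks that each is attained in {D(z_0,r)} while the origin is not.

```python
import cmath
import math

def nth_roots(w, n):
    """All n complex n-th roots of a non-zero complex number w."""
    modulus = abs(w) ** (1.0 / n)
    angle = cmath.phase(w)
    return [modulus * cmath.exp(1j * (angle + 2 * math.pi * k) / n) for k in range(n)]

def image_contains(w, n, z0, r):
    """True if f(z) = z^n takes the value w somewhere in the open disk D(z0, r)."""
    if w == 0:
        return abs(z0) < r          # the only preimage of 0 is z = 0
    return any(abs(z - z0) < r for z in nth_roots(w, n))

n = 5
z0 = 1.0 + 0.0j
r = abs(z0)
centre = z0 ** n                            # f(z0)
radius = abs(n * z0 ** (n - 1)) * r / n     # (1/n) |f'(z0)| r, which equals |z0|^n

# Sample points of D(f(z0), radius) on a polar grid, staying just inside the boundary.
samples = [
    centre + 0.999 * radius * (j / 20) * cmath.exp(2j * math.pi * k / 48)
    for j in range(1, 21)
    for k in range(48)
]
all_hit = all(image_contains(w, n, z0, r) for w in samples)
origin_hit = image_contains(0, n, z0, r)
print(all_hit, origin_hit)
```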

Part (v) is implicit in the standard proof of Bloch’s theorem (part (ii)), and is easy to establish:

Proof: (Proof of (v)) From the Cauchy inequalities one has {f''(z) = O(\frac{A}{r} |f'(z_0)|)} for {z \in D(z_0,r/2)}, hence by Taylor’s theorem with remainder {f(z) = f(z_0) + f'(z_0) (z-z_0) (1 + O( A \frac{|z-z_0|}{r} ) )} for {z \in D(z_0, r/2)}. By Rouche’s theorem, this implies that the function {f(z)-w} has a unique zero in {D(z_0, 2cr/A)} for any {w \in D(f(z_0), cr|f'(z_0)|/A)}, if {c>0} is a sufficiently small absolute constant. The claim follows. \Box

Note that part (v) implies part (i). A standard point picking argument also lets one deduce part (ii) from part (v):

Proof: (Proof of (ii)) By shrinking {r} slightly if necessary we may assume that {f} extends analytically to the closure of the disk {D(z_0,r)}. Let {c} be the constant in (v) with {A=2}; we will prove (ii) with {c} replaced by {c/4}. If we have {|f'(z)| \leq 2 |f'(z_0)|} for all {z \in D(z_0,r/2)} then we are done by (v) (applied to the disk {D(z_0,r/2)}), so we may assume without loss of generality that there is {z_1 \in D(z_0,r/2)} such that {|f'(z_1)| > 2 |f'(z_0)|}. If {|f'(z)| \leq 2 |f'(z_1)|} for all {z \in D(z_1,r/4)} then by (v) we have

\displaystyle  f( D(z_0, r) ) \supset f( D(z_1,r/4) ) \supset D( f(z_1), \frac{c}{2} |f'(z_1)| \frac{r}{4} )

\displaystyle \supset D( f(z_1), \frac{c}{4} |f'(z_0)| r )

and we are again done. Hence we may assume without loss of generality that there is {z_2 \in D(z_1,r/4)} such that {|f'(z_2)| > 2 |f'(z_1)|}. Iterating this procedure in the obvious fashion we either are done, or obtain a Cauchy sequence {z_0, z_1, \dots} in {D(z_0,r)} such that {f'(z_j)} goes to infinity as {j \rightarrow \infty}, which contradicts the analytic nature of {f} (and hence continuous nature of {f'}) on the closure of {D(z_0,r)}. This gives the claim. \Box

Here is another classical result stated by Alexander (and then proven by Kakeya and by Szegő, but also implied by a classical theorem of Grace and Heawood) that is broadly compatible with parts (iii), (iv) of the above theorem:

Proposition 2 Let {D(z_0,r)} be a disk in the complex plane, and {f: D(z_0,r) \rightarrow {\bf C}} be a polynomial of degree {n \geq 1} with {f'(z) \neq 0} for all {z \in D(z_0,r)}. Then {f} is injective on {D(z_0, r\sin\frac{\pi}{n})}.

The radius {\sin \frac{\pi}{n}} is best possible, for the polynomial {f(z) = z^n} has {f'} non-vanishing on {D(1,1)}, but one has {f(\cos(\pi/n) e^{i \pi/n}) = f(\cos(\pi/n) e^{-i\pi/n})}, and {\cos(\pi/n) e^{i \pi/n}, \cos(\pi/n) e^{-i\pi/n}} lie on the boundary of {D(1,\sin \frac{\pi}{n})}.
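The sharpness example can be checked numerically. The following is only a small sanity check, not part of the argument; the helper name `extremal_points` is ours, and we restrict to {n \geq 3} so that the two points are distinct.

```python
import cmath
import math


def extremal_points(n):
    """Return the two points cos(pi/n) e^{+-i pi/n} from the sharpness example."""
    theta = math.pi / n
    radius = math.cos(theta)
    return radius * cmath.exp(1j * theta), radius * cmath.exp(-1j * theta)


for n in range(3, 8):
    z_plus, z_minus = extremal_points(n)
    # f(z) = z^n should take the same value at both points ...
    gap_in_values = abs(z_plus ** n - z_minus ** n)
    # ... and both points should sit at distance sin(pi/n) from the centre 1.
    gap_in_radius = abs(abs(z_plus - 1) - math.sin(math.pi / n))
    print(n, gap_in_values, gap_in_radius)
```

Both printed gaps should be zero up to floating point rounding.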

If one narrows {\sin \frac{\pi}{n}} slightly to {\sin \frac{\pi}{2n}} then one can quickly prove this proposition as follows. Suppose for contradiction that there exist distinct {z_1, z_2 \in D(z_0, r\sin\frac{\pi}{2n})} with {f(z_1)=f(z_2)}, thus if we let {\gamma} be the line segment contour from {z_1} to {z_2} then {\int_\gamma f'(z)\ dz = 0}. However, by assumption we may factor {f'(z) = c (z-\zeta_1) \dots (z-\zeta_{n-1})} where all the {\zeta_j} lie outside of {D(z_0,r)}. Elementary trigonometry then tells us that the argument of {z-\zeta_j} only varies by less than {\frac{\pi}{n}} as {z} traverses {\gamma}, hence the argument of {f'(z)} only varies by less than {\pi}. Thus {f'(z)} takes values in an open half-plane avoiding the origin and so it is not possible for {\int_\gamma f'(z)\ dz} to vanish.

To recover the best constant of {\sin \frac{\pi}{n}} requires some effort. By taking contrapositives and applying an affine rescaling and some trigonometry, the proposition can be deduced from the following result, known variously as the Grace-Heawood theorem or the complex Rolle theorem.

Proposition 3 (Grace-Heawood theorem) Let {f: {\bf C} \rightarrow {\bf C}} be a polynomial of degree {n \geq 1} such that {f(1)=f(-1)}. Then {f'} contains a zero in the closure of {D( 0, \cot \frac{\pi}{n} )}.

This is in turn implied by a remarkable and powerful theorem of Grace (which we shall prove shortly). Given two polynomials {f,g} of degree at most {n}, define the apolar form {(f,g)_n} by

\displaystyle  (f,g)_n := \sum_{k=0}^n (-1)^k f^{(k)}(0) g^{(n-k)}(0). \ \ \ \ \ (1)

Theorem 4 (Grace’s theorem) Let {C} be a circle or line in {{\bf C}}, dividing {{\bf C} \backslash C} into two open connected regions {\Omega_1, \Omega_2}. Let {f,g} be two polynomials of degree at most {n \geq 1}, with all the zeroes of {f} lying in {\Omega_1} and all the zeroes of {g} lying in {\Omega_2}. Then {(f,g)_n \neq 0}.

(Contrapositively: if {(f,g)_n=0}, then the zeroes of {f} cannot be separated from the zeroes of {g} by a circle or line.)

Indeed, a brief calculation reveals the identity

\displaystyle  f(1) - f(-1) = (f', g)_{n-1}

where {g} is the degree {n-1} polynomial

\displaystyle  g(z) := \frac{1}{n!} ((z+1)^n - (z-1)^n).

The zeroes of {g} are {i \cot \frac{\pi j}{n}} for {j=1,\dots,n-1}, so the Grace-Heawood theorem follows by applying Grace’s theorem with {C} equal to the boundary of {D(0, \cot \frac{\pi}{n})}.
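The "brief calculation" and the location of the zeroes of {g} can both be sanity checked by machine. The sketch below uses exact rational arithmetic on coefficient lists; the function names (`apolar`, `grace_heawood_g`, and so on) and the sample quintic are our own choices for illustration.

```python
from fractions import Fraction
from math import comb, factorial, pi, tan


def derivative(coeffs):
    """Coefficients of f' given the coefficients [a_0, a_1, ...] of f."""
    return [k * a for k, a in enumerate(coeffs)][1:]


def apolar(f, g, n):
    """Apolar form (f,g)_n from (1), using f^(k)(0) = k! a_k."""
    def d(coeffs, k):
        return factorial(k) * coeffs[k] if k < len(coeffs) else 0
    return sum((-1) ** k * d(f, k) * d(g, n - k) for k in range(n + 1))


def grace_heawood_g(n):
    """Coefficients of g(z) = ((z+1)^n - (z-1)^n) / n!."""
    return [Fraction(comb(n, j) * (1 - (-1) ** (n - j)), factorial(n))
            for j in range(n + 1)]


def evaluate(coeffs, z):
    return sum(a * z ** k for k, a in enumerate(coeffs))


f = [Fraction(c) for c in (3, -1, 4, 1, -5, 9)]  # an arbitrary quintic, n = 5
n = len(f) - 1
lhs = evaluate(f, 1) - evaluate(f, -1)
rhs = apolar(derivative(f), grace_heawood_g(n), n - 1)
print(lhs, rhs)  # the two sides of the identity should agree

# The zeroes of g should be i cot(pi j / n) for j = 1, ..., n-1.
for j in range(1, n):
    root = 1j / tan(pi * j / n)
    residual = abs(sum(float(a) * root ** k
                       for k, a in enumerate(grace_heawood_g(n))))
    print(j, residual)
```

The residuals in the second loop should vanish up to rounding error.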

The same method of proof gives the following nice consequence:

Theorem 5 (Perpendicular bisector theorem) Let {f: {\bf C} \rightarrow {\bf C}} be a polynomial such that {f(z_1)=f(z_2)} for some distinct {z_1,z_2}. Then the zeroes of {f'} cannot all lie on one side of the perpendicular bisector of {z_1,z_2}. For instance, if {f(1)=f(-1)}, then the zeroes of {f'} cannot all lie in the halfplane {\{ z: \mathrm{Re} z > 0 \}} or the halfplane {\{ z: \mathrm{Re} z < 0 \}}.

I’d be interested in seeing a proof of this latter theorem that did not proceed via Grace’s theorem.

Now we give a proof of Grace’s theorem. The case {n=1} can be established by direct computation, so suppose inductively that {n>1} and that the claim has already been established for {n-1}. Given the involvement of circles and lines it is natural to suspect that a Möbius transformation symmetry is involved. This is indeed the case and can be made precise as follows. Let {V_n} denote the vector space of polynomials {f} of degree at most {n}, then the apolar form is a bilinear form {(,)_n: V_n \times V_n \rightarrow {\bf C}}. Each translation {z \mapsto z+a} on the complex plane induces a corresponding map on {V_n}, mapping each polynomial {f} to its shift {\tau_a f(z) := f(z-a)}. We claim that the apolar form is invariant with respect to these translations:

\displaystyle  ( \tau_a f, \tau_a g )_n = (f,g)_n.

Taking derivatives in {a}, it suffices to establish the skew-adjointness relation

\displaystyle  (f', g)_n + (f,g')_n = 0

but this is clear from the alternating form of (1).
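To spell this out (a routine verification, recorded here for convenience): since {f,g} have degree at most {n}, we have {f^{(n+1)}(0) = g^{(n+1)}(0) = 0}, so the {k=n} term of {(f',g)_n} and the {k=0} term of {(f,g')_n} vanish. Shifting the summation index in the second sum by one then gives

\displaystyle  (f',g)_n = \sum_{k=0}^{n-1} (-1)^k f^{(k+1)}(0) g^{(n-k)}(0), \quad (f,g')_n = \sum_{k=1}^{n} (-1)^k f^{(k)}(0) g^{(n-k+1)}(0) = - \sum_{k=0}^{n-1} (-1)^{k} f^{(k+1)}(0) g^{(n-k)}(0),

and the two expressions cancel.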

Next, we see that the inversion map {z \mapsto 1/z} also induces a corresponding map on {V_n}, mapping each polynomial {f \in V_n} to its inversion {\iota f(z) := z^n f(1/z)}. From (1) we see that this map also (projectively) preserves the apolar form:

\displaystyle  (\iota f, \iota g)_n = (-1)^n (f,g)_n.
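In more detail (again a routine check): writing {f(z) = \sum_{j=0}^n a_j z^j} and {g(z) = \sum_{j=0}^n b_j z^j}, one has {f^{(k)}(0) = k! a_k} and {g^{(k)}(0) = k! b_k}, so that (1) becomes

\displaystyle  (f,g)_n = \sum_{k=0}^n (-1)^k k! (n-k)! a_k b_{n-k}.

The inversion {\iota} replaces each coefficient {a_k} by {a_{n-k}} and each {b_k} by {b_{n-k}}, hence on substituting {k \mapsto n-k} we obtain

\displaystyle  (\iota f, \iota g)_n = \sum_{k=0}^n (-1)^k k! (n-k)! a_{n-k} b_k = (-1)^n \sum_{k=0}^n (-1)^{k} k! (n-k)! a_k b_{n-k} = (-1)^n (f,g)_n.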

More generally, the group of Möbius transformations on the Riemann sphere acts projectively on {V_n}, with each Möbius transformation {T: {\bf C} \rightarrow {\bf C}} mapping each {f \in V_n} to {Tf(z) := g_T(z) f(T^{-1} z)}, where {g_T} is the unique (up to constants) rational function that makes this a map from {V_n} to {V_n} (its divisor is {n(T \infty) - n(\infty)}). Since the Möbius transformations are generated by translations and inversion, we see that the action of Möbius transformations projectively preserves the apolar form; also, we see this action of {T} on {V_n} also moves the zeroes of each {f \in V_n} by {T} (viewing polynomials of degree less than {n} in {V_n} as having zeroes at infinity). In particular, the hypotheses and conclusions of Grace’s theorem are preserved by this Möbius action. We can then apply such a transformation to move one of the zeroes of {f} to infinity (thus making {f} a polynomial of degree {n-1}), so that {C} must now be a circle, with the zeroes of {g} inside the circle and the remaining zeroes of {f} outside the circle. But then

\displaystyle  (f,g)_n = (f, g')_{n-1}.

By the Gauss-Lucas theorem, the zeroes of {g'} are also inside {C}. The claim now follows from the induction hypothesis.

Starting on Oct 2, I will be teaching Math 246A, the first course in the three-quarter graduate complex analysis sequence at the math department here at UCLA.  This first course covers much of the same ground as an honours undergraduate complex analysis course, in particular focusing on the basic properties of holomorphic functions such as the Cauchy and residue theorems, the classification of singularities, and the maximum principle, but there will be more of an emphasis on rigour, generalisation and abstraction, and connections with other parts of mathematics.  The main text I will be using for this course is Stein-Shakarchi (with Ahlfors as a secondary text), but I will also be using the blog lecture notes I wrote the last time I taught this course in 2016. At this time I do not expect to significantly deviate from my past lecture notes, though I do not know at present how different the pace will be this quarter when the course is taught remotely. As with my 247B course last spring, the lectures will be open to the public, though other coursework components will be restricted to enrolled students.

This set of notes discusses aspects of one of the oldest questions in Fourier analysis, namely the nature of convergence of Fourier series.

If {f: {\bf R}/{\bf Z} \rightarrow {\bf C}} is an absolutely integrable function, its Fourier coefficients {\hat f: {\bf Z} \rightarrow {\bf C}} are defined by the formula

\displaystyle  \hat f(n) := \int_{{\bf R}/{\bf Z}} f(x) e^{-2\pi i nx}\ dx.

If {f} is smooth, then the Fourier coefficients {\hat f} are absolutely summable, and we have the Fourier inversion formula

\displaystyle  f(x) = \sum_{n \in {\bf Z}} \hat f(n) e^{2\pi i nx}

where the series here is uniformly convergent. In particular, if we define the partial summation operators

\displaystyle  S_N f(x) := \sum_{|n| \leq N} \hat f(n) e^{2\pi i nx}

then {S_N f} converges uniformly to {f} when {f} is smooth.
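As a quick numerical illustration of this uniform convergence (a sketch only; the test function {e^{\cos(2\pi x)}}, the quadrature by equally spaced samples, and all function names are our own choices), one can compute the sup norm of {S_N f - f} on a grid for increasing {N}:

```python
import cmath
import math


def fourier_coefficient(f, n, samples=256):
    """Approximate \\hat f(n) by a Riemann sum over equally spaced points of R/Z."""
    return sum(f(k / samples) * cmath.exp(-2j * math.pi * n * k / samples)
               for k in range(samples)) / samples


def partial_sum(f, N, samples=256):
    """Return the function x -> S_N f(x), built from approximate coefficients."""
    coeffs = {n: fourier_coefficient(f, n, samples) for n in range(-N, N + 1)}

    def s_n(x):
        return sum(c * cmath.exp(2j * math.pi * n * x) for n, c in coeffs.items())

    return s_n


def sup_error(f, N, grid=200):
    """Largest value of |S_N f - f| over a uniform grid on [0, 1)."""
    s_n = partial_sum(f, N)
    return max(abs(s_n(j / grid) - f(j / grid)) for j in range(grid))


def smooth_example(x):
    return math.exp(math.cos(2 * math.pi * x))


for N in (2, 4, 8):
    print(N, sup_error(smooth_example, N))
```

For this smooth (indeed analytic) example the errors should decay very rapidly in {N}, in line with the absolute summability of the Fourier coefficients.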

What if {f} is not smooth, but merely lies in an {L^p({\bf R}/{\bf Z})} class for some {1 \leq p \leq \infty}? The Fourier coefficients {\hat f} remain well-defined, as do the partial summation operators {S_N}. The question of convergence in norm is relatively easy to settle:

Exercise 1
  • (i) If {1 < p < \infty} and {f \in L^p({\bf R}/{\bf Z})}, show that {S_N f} converges in {L^p({\bf R}/{\bf Z})} norm to {f}. (Hint: first use the boundedness of the Hilbert transform to show that {S_N} is bounded in {L^p({\bf R}/{\bf Z})} uniformly in {N}.)
  • (ii) If {p=1} or {p=\infty}, show that there exists {f \in L^p({\bf R}/{\bf Z})} such that the sequence {S_N f} is unbounded in {L^p({\bf R}/{\bf Z})} (so in particular it certainly does not converge in {L^p({\bf R}/{\bf Z})} norm to {f}). (Hint: first show that {S_N} is not bounded in {L^p({\bf R}/{\bf Z})} uniformly in {N}, then apply the uniform boundedness principle in the contrapositive.)

The question of pointwise almost everywhere convergence turned out to be a significantly harder problem:

Theorem 2 (Pointwise almost everywhere convergence)
  • (i) (Kolmogorov, 1923) There exists {f \in L^1({\bf R}/{\bf Z})} such that {S_N f(x)} is unbounded in {N} for almost every {x}.
  • (ii) (Carleson, 1966; conjectured by Lusin, 1913) For every {f \in L^2({\bf R}/{\bf Z})}, {S_N f(x)} converges to {f(x)} as {N \rightarrow \infty} for almost every {x}.
  • (iii) (Hunt, 1967) For every {1 < p \leq \infty} and {f \in L^p({\bf R}/{\bf Z})}, {S_N f(x)} converges to {f(x)} as {N \rightarrow \infty} for almost every {x}.

Note from Hölder’s inequality that {L^2({\bf R}/{\bf Z})} contains {L^p({\bf R}/{\bf Z})} for all {p\geq 2}, so Carleson’s theorem covers the {p \geq 2} case of Hunt’s theorem. We remark that the precise threshold near {L^1} between Kolmogorov-type divergence results and Carleson-Hunt pointwise convergence results, in the category of Orlicz spaces, is still an active area of research; see this paper of Lie for further discussion.

Carleson’s theorem in particular was a surprisingly difficult result, lying just out of reach of classical methods (as we shall see later, the result is much easier if we smooth either the function {f} or the summation method {S_N} by a tiny bit). Nowadays we realise that the reason for this is that Carleson’s theorem essentially contains a frequency modulation symmetry in addition to the more familiar translation symmetry and dilation symmetry. This basically rules out the possibility of attacking Carleson’s theorem with tools such as Calderón-Zygmund theory or Littlewood-Paley theory, which respect the latter two symmetries but not the former. Instead, tools from “time-frequency analysis” that essentially respect all three symmetries should be employed. We will illustrate this by giving a relatively short proof of Carleson’s theorem due to Lacey and Thiele. (There are other proofs of Carleson’s theorem, including Carleson’s original proof, its modification by Hunt, and a later time-frequency proof by Fefferman; see Remark 18 below.)


In contrast to previous notes, in this set of notes we shall focus exclusively on Fourier analysis in the one-dimensional setting {d=1} for simplicity of notation, although all of the results here have natural extensions to higher dimensions. Depending on the physical context, one can view the physical domain {{\bf R}} as representing either space or time; we will mostly think in terms of the former interpretation, even though the standard terminology of “time-frequency analysis”, which we will make more prominent use of in later notes, clearly originates from the latter.

In previous notes we have often performed various localisations in either physical space or Fourier space {{\bf R}}, for instance in order to take advantage of the uncertainty principle. One can formalise these operations in terms of the functional calculus of two basic operations on Schwartz functions {{\mathcal S}({\bf R})}, the position operator {X: {\mathcal S}({\bf R}) \rightarrow {\mathcal S}({\bf R})} defined by

\displaystyle  (Xf)(x) := x f(x)

and the momentum operator {D: {\mathcal S}({\bf R}) \rightarrow {\mathcal S}({\bf R})}, defined by

\displaystyle  (Df)(x) := \frac{1}{2\pi i} \frac{d}{dx} f(x). \ \ \ \ \ (1)

(The terminology comes from quantum mechanics, where it is customary to also insert a small constant {h} on the right-hand side of (1) in accordance with de Broglie’s law. Such a normalisation is also used in several branches of mathematics, most notably semiclassical analysis and microlocal analysis, where it becomes profitable to consider the semiclassical limit {h \rightarrow 0}, but we will not emphasise this perspective here.) The momentum operator can be viewed as the counterpart to the position operator, but in frequency space instead of physical space, since we have the standard identity

\displaystyle  \widehat{Df}(\xi) = \xi \hat f(\xi)

for any {\xi \in {\bf R}} and {f \in {\mathcal S}({\bf R})}. We observe that both operators {X,D} are formally self-adjoint in the sense that

\displaystyle  \langle Xf, g \rangle = \langle f, Xg \rangle; \quad \langle Df, g \rangle = \langle f, Dg \rangle

for all {f,g \in {\mathcal S}({\bf R})}, where we use the {L^2({\bf R})} Hermitian inner product

\displaystyle  \langle f, g\rangle := \int_{\bf R} f(x) \overline{g(x)}\ dx.

Clearly, for any polynomial {P(x)} of one real variable {x} (with complex coefficients), the operator {P(X): {\mathcal S}({\bf R}) \rightarrow {\mathcal S}({\bf R})} is given by the spatial multiplier operator

\displaystyle  (P(X) f)(x) = P(x) f(x)

and similarly the operator {P(D): {\mathcal S}({\bf R}) \rightarrow {\mathcal S}({\bf R})} is given by the Fourier multiplier operator

\displaystyle  \widehat{P(D) f}(\xi) = P(\xi) \hat f(\xi).

Inspired by this, if {m: {\bf R} \rightarrow {\bf C}} is any smooth function that obeys the derivative bounds

\displaystyle  \frac{d^j}{dx^j} m(x) \lesssim_{m,j} \langle x \rangle^{O_{m,j}(1)} \ \ \ \ \ (2)

for all {j \geq 0} and {x \in {\bf R}} (that is to say, all derivatives of {m} grow at most polynomially), then we can define the spatial multiplier operator {m(X): {\mathcal S}({\bf R}) \rightarrow {\mathcal S}({\bf R})} by the formula

\displaystyle  (m(X) f)(x) := m(x) f(x);

one can easily verify from several applications of the Leibniz rule that {m(X)} maps Schwartz functions to Schwartz functions. We refer to {m(x)} as the symbol of this spatial multiplier operator. In a similar fashion, we define the Fourier multiplier operator {m(D)} associated to the symbol {m(\xi)} by the formula

\displaystyle  \widehat{m(D) f}(\xi) := m(\xi) \hat f(\xi).

For instance, any constant coefficient linear differential operator {\sum_{k=0}^n c_k \frac{d^k}{dx^k}} can be written in this notation as

\displaystyle \sum_{k=0}^n c_k \frac{d^k}{dx^k} =\sum_{k=0}^n c_k (2\pi i D)^k;

however there are many Fourier multiplier operators that are not of this form, such as fractional derivative operators {\langle D \rangle^s = (1- \frac{1}{4\pi^2} \frac{d^2}{dx^2})^{s/2}} for non-integer values of {s}, which is a Fourier multiplier operator with symbol {\langle \xi \rangle^s}. It is also very common to use spatial cutoffs {\psi(X)} and Fourier cutoffs {\psi(D)} for various bump functions {\psi} to localise functions in either space or frequency; we have seen several examples of such cutoffs in action in previous notes (often in the higher dimensional setting {d>1}).

We observe that the maps {m \mapsto m(X)} and {m \mapsto m(D)} are ring homomorphisms, thus for instance

\displaystyle  (m_1 + m_2)(D) = m_1(D) + m_2(D)

and

\displaystyle  (m_1 m_2)(D) = m_1(D) m_2(D)

for any {m_1,m_2} obeying the derivative bounds (2); also {m(D)} is formally adjoint to {\overline{m}(D)} in the sense that

\displaystyle  \langle m(D) f, g \rangle = \langle f, \overline{m}(D) g \rangle

for {f,g \in {\mathcal S}({\bf R})}, and similarly for {m(X)} and {\overline{m}(X)}. One can interpret these facts as part of the functional calculus of the operators {X,D}, which can be interpreted as densely defined self-adjoint operators on {L^2({\bf R})}. However, in this set of notes we will not develop the spectral theory necessary in order to fully set out this functional calculus rigorously.

In the field of PDE and ODE, it is also very common to study variable coefficient linear differential operators

\displaystyle  \sum_{k=0}^n c_k(x) \frac{d^k}{dx^k} \ \ \ \ \ (3)

where the {c_0,\dots,c_n} are now functions of the spatial variable {x} obeying the derivative bounds (2). A simple example is the quantum harmonic oscillator Hamiltonian {-\frac{d^2}{dx^2} + x^2}. One can rewrite this operator in our notation as

\displaystyle  \sum_{k=0}^n c_k(X) (2\pi i D)^k

and so it is natural to interpret this operator as a combination {a(X,D)} of both the position operator {X} and the momentum operator {D}, where the symbol {a: {\bf R} \times {\bf R} \rightarrow {\bf C}} of this operator is the function

\displaystyle  a(x,\xi) := \sum_{k=0}^n c_k(x) (2\pi i \xi)^k. \ \ \ \ \ (4)

Indeed, from the Fourier inversion formula

\displaystyle  f(x) = \int_{\bf R} \hat f(\xi) e^{2\pi i x \xi}\ d\xi

for any {f \in {\mathcal S}({\bf R})} we have

\displaystyle  (2\pi i D)^k f(x) = \int_{\bf R} (2\pi i \xi)^k \hat f(\xi) e^{2\pi i x \xi}\ d\xi

and hence on multiplying by {c_k(x)} and summing we have

\displaystyle (\sum_{k=0}^n c_k(X) (2\pi i D)^k) f(x) = \int_{\bf R} a(x,\xi) \hat f(\xi) e^{2\pi i x \xi}\ d\xi.

Inspired by this, we can introduce the Kohn-Nirenberg quantisation by defining the operator {a(X,D) = a_{KN}(X,D): {\mathcal S}({\bf R}) \rightarrow {\mathcal S}({\bf R})} by the formula

\displaystyle  a(X,D) f(x) = \int_{\bf R} a(x,\xi) \hat f(\xi) e^{2\pi i x \xi}\ d\xi \ \ \ \ \ (5)

whenever {f \in {\mathcal S}({\bf R})} and {a: {\bf R} \times {\bf R} \rightarrow {\bf C}} is any smooth function obeying the derivative bounds

\displaystyle  \frac{\partial^j}{\partial x^j} \frac{\partial^l}{\partial \xi^l} a(x,\xi) \lesssim_{a,j,l} \langle x \rangle^{O_{a,j}(1)} \langle \xi \rangle^{O_{a,j,l}(1)} \ \ \ \ \ (6)

for all {j,l \geq 0} and {x,\xi \in {\bf R}} (note carefully that the exponent in {x} on the right-hand side is required to be uniform in {l}). This quantisation clearly generalises both the spatial multiplier operators {m(X)} and the Fourier multiplier operators {m(D)} defined earlier, which correspond to the cases when the symbol {a(x,\xi)} is a function of {x} only or {\xi} only respectively. Thus we have combined the physical space {{\bf R} = \{ x: x \in {\bf R}\}} and the frequency space {{\bf R} = \{ \xi: \xi \in {\bf R}\}} into a single domain, known as phase space {{\bf R} \times {\bf R} = \{ (x,\xi): x,\xi \in {\bf R} \}}. The term “time-frequency analysis” encompasses analysis based on decompositions and other manipulations of phase space, in much the same way that “Fourier analysis” encompasses analysis based on decompositions and other manipulations of frequency space. We remark that the Kohn-Nirenberg quantisation is not the only choice of quantisation one could use; see Remark 19 below.
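As a simple worked instance of (4) (our own illustration), consider the quantum harmonic oscillator Hamiltonian {-\frac{d^2}{dx^2} + x^2} mentioned above. In the form (3) it has {n=2} with {c_0(x) = x^2}, {c_1(x) = 0} and {c_2(x) = -1}, so its symbol is

\displaystyle  a(x,\xi) = x^2 - (2\pi i \xi)^2 = x^2 + 4\pi^2 \xi^2.

This symbol is a polynomial in {x} and {\xi}, so it obeys the derivative bounds (6), and the formula (5) for {a(X,D)} recovers the original differential operator.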

Exercise 1

  • (i) Show that for {a} obeying (6), {a(X,D)} does indeed map {{\mathcal S}({\bf R})} to {{\mathcal S}({\bf R})}.
  • (ii) Show that the symbol {a} is uniquely determined by the operator {a(X,D)}. That is to say, if {a,b} are two functions obeying (6) with {a(X,D) f = b(X,D) f} for all {f \in {\mathcal S}({\bf R})}, then {a=b}. (Hint: apply {a(X,D)-b(X,D)} to a suitable truncation of a plane wave {x \mapsto e^{2\pi i x \xi}} and then take limits.)

In principle, the quantisations {a(X,D)} are potentially very useful for such tasks as inverting variable coefficient linear operators, or localising a function simultaneously in physical and Fourier space. However, a fundamental difficulty arises: the map from symbols {a} to operators {a(X,D)} is now no longer a ring homomorphism, in particular

\displaystyle  (a_1 a_2)(X,D) \neq a_1(X,D) a_2(X,D) \ \ \ \ \ (7)

in general. Fundamentally, this is due to the fact that pointwise multiplication of symbols is a commutative operation, whereas the composition of operators such as {X} and {D} does not necessarily commute. This lack of commutativity can be measured by introducing the commutator

\displaystyle  [A,B] := AB - BA

of two operators {A,B}, and noting from the product rule that

\displaystyle  [X,D] = -\frac{1}{2\pi i} \neq 0.
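To spell out the product rule computation: for any {f \in {\mathcal S}({\bf R})}, the definitions of {X} and {D} give

\displaystyle  [X,D] f(x) = x \cdot \frac{1}{2\pi i} f'(x) - \frac{1}{2\pi i} \frac{d}{dx} ( x f(x) ) = \frac{1}{2\pi i} ( x f'(x) - f(x) - x f'(x) ) = -\frac{1}{2\pi i} f(x).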

(In the language of Lie groups and Lie algebras, this tells us that {X,D} are (up to complex constants) the standard Lie algebra generators of the Heisenberg group.) From a quantum mechanical perspective, this lack of commutativity is the root cause of the uncertainty principle that prevents one from simultaneously localizing in both position and momentum past a certain point. Here is one basic way of formalising this principle:

Exercise 2 (Heisenberg uncertainty principle) For any {x_0, \xi_0 \in {\bf R}} and {f \in \mathcal{S}({\bf R})}, show that

\displaystyle  \| (X-x_0) f \|_{L^2({\bf R})} \| (D-\xi_0) f\|_{L^2({\bf R})} \geq \frac{1}{4\pi} \|f\|_{L^2({\bf R})}^2.

(Hint: evaluate the expression {\langle [X-x_0, D - \xi_0] f, f \rangle} in two different ways and apply the Cauchy-Schwarz inequality.) Informally, this exercise asserts that the spatial uncertainty {\Delta x} and the frequency uncertainty {\Delta \xi} of a function obey the Heisenberg uncertainty relation {\Delta x \Delta \xi \gtrsim 1}.
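The inequality can be tested numerically without giving away the proof. The sketch below (our own illustration; the function names, the Riemann sum quadrature on {[-8,8]}, and the two test functions are assumptions made for the example) computes the ratio of the left-hand side to the right-hand side, which should be at least {1}. It is supplied with the derivative of {f} in closed form rather than differentiating numerically.

```python
import math


def l2_norm(values, step):
    """Riemann-sum approximation to the L^2(R) norm of sampled values."""
    return math.sqrt(sum(abs(v) ** 2 for v in values) * step)


def uncertainty_ratio(f, f_prime, x0=0.0, xi0=0.0, half_width=8.0, steps=16000):
    """Ratio of ||(X-x0)f|| ||(D-xi0)f|| to ||f||^2 / (4 pi)."""
    step = 2 * half_width / steps
    xs = [-half_width + k * step for k in range(steps + 1)]
    f_vals = [f(x) for x in xs]
    xf_vals = [(x - x0) * f(x) for x in xs]
    # (D - xi0) f = f' / (2 pi i) - xi0 f, with D the momentum operator.
    df_vals = [f_prime(x) / (2j * math.pi) - xi0 * f(x) for x in xs]
    lhs = l2_norm(xf_vals, step) * l2_norm(df_vals, step)
    rhs = l2_norm(f_vals, step) ** 2 / (4 * math.pi)
    return lhs / rhs


def gaussian(x):
    return math.exp(-math.pi * x * x)


def gaussian_prime(x):
    return -2 * math.pi * x * gaussian(x)


def hermite1(x):
    return x * gaussian(x)


def hermite1_prime(x):
    return (1 - 2 * math.pi * x * x) * gaussian(x)


print(uncertainty_ratio(gaussian, gaussian_prime))
print(uncertainty_ratio(hermite1, hermite1_prime))
print(uncertainty_ratio(gaussian, gaussian_prime, x0=0.5, xi0=0.25))
```

For the Gaussian {e^{-\pi x^2}} centred at {x_0=\xi_0=0} the ratio should come out essentially equal to {1}, reflecting the well-known fact that Gaussians are the extremisers of the uncertainty principle; the other two ratios should be strictly larger.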

Nevertheless, one still has the correspondence principle, which asserts that in certain regimes (which, with our choice of normalisations, corresponds to the high-frequency regime), quantum mechanics continues to behave like a commutative theory, and one can sometimes proceed as if the operators {X,D} (and the various operators {a(X,D)} constructed from them) commute up to “lower order” errors. This can be formalised using the pseudodifferential calculus, which we give below the fold, in which we restrict the symbol {a} to certain “symbol classes” of various orders (which then restricts {a(X,D)} to be pseudodifferential operators of various orders), and obtain approximate identities such as

\displaystyle  (a_1 a_2)(X,D) \approx a_1(X,D) a_2(X,D)

where the error between the left and right-hand sides is of “lower order” and in fact enjoys a useful asymptotic expansion. As a first approximation to this calculus, one can think of functions {f \in {\mathcal S}({\bf R})} as having some sort of “phase space portrait” {\tilde f(x,\xi)} which somehow combines the physical space representation {x \mapsto f(x)} with its Fourier representation {\xi \mapsto \hat f(\xi)}, and pseudodifferential operators {a(X,D)} behave approximately like “phase space multiplier operators” in this representation in the sense that

\displaystyle  \widetilde{a(X,D) f}(x,\xi) \approx a(x,\xi) \tilde f(x,\xi).

Unfortunately the uncertainty principle (or the non-commutativity of {X} and {D}) prevents us from making these approximations perfectly precise, and it is not always clear how to even define a phase space portrait {\tilde f} of a function {f} precisely (although there are certain popular candidates for such a portrait, such as the FBI transform (also known as the Gabor transform in signal processing literature), or the Wigner quasiprobability distribution, each of which has some advantages and disadvantages). Nevertheless even if the concept of a phase space portrait is somewhat fuzzy, it is of great conceptual benefit both within mathematics and outside of it. For instance, the musical score one assigns a piece of music can be viewed as a phase space portrait of the sound waves generated by that music.

To complement the pseudodifferential calculus we have the basic Calderón-Vaillancourt theorem, which asserts that pseudodifferential operators of order zero are Calderón-Zygmund operators and thus bounded on {L^p({\bf R})} for {1 < p < \infty}. The standard proof of this theorem is a classic application of one of the basic techniques in harmonic analysis, namely the exploitation of almost orthogonality; the proof we will give here will achieve this through the elegant device of the Cotlar-Stein lemma.

Pseudodifferential operators (especially when generalised to higher dimensions {d \geq 1}) are a fundamental tool in the theory of linear PDE, as well as related fields such as semiclassical analysis, microlocal analysis, and geometric quantisation. There is an even wider class of operators that is also of interest, namely the Fourier integral operators, which roughly speaking not only approximately multiply the phase space portrait {\tilde f(x,\xi)} of a function by some multiplier {a(x,\xi)}, but also move the portrait around by a canonical transformation. However, the development of the theory of these operators is beyond the scope of these notes; see for instance the texts of Hörmander or Eskin.

This set of notes is only the briefest introduction to the theory of pseudodifferential operators. Many texts are available that cover the theory in more detail, for instance this text of Taylor.


The square root cancellation heuristic, briefly mentioned in the preceding set of notes, predicts that if a collection {z_1,\dots,z_n} of complex numbers have phases that are sufficiently “independent” of each other, then

\displaystyle |\sum_{j=1}^n z_j| \approx (\sum_{j=1}^n |z_j|^2)^{1/2};

similarly, if {f_1,\dots,f_n} are a collection of functions in a Lebesgue space {L^p(X,\mu)} that oscillate “independently” of each other, then we expect

\displaystyle \| \sum_{j=1}^n f_j \|_{L^p(X,\mu)} \approx \| (\sum_{j=1}^n |f_j|^2)^{1/2} \|_{L^p(X,\mu)}.
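One elementary way to see the scalar form of this heuristic in action is to average over all possible choices of signs: the mean of {|\sum_j \epsilon_j z_j|^2} over all sign patterns {\epsilon_j \in \{-1,+1\}} is exactly {\sum_j |z_j|^2}, because the cross terms cancel. The following sketch (our own illustration, with an arbitrary choice of the {z_j}) verifies this by brute-force enumeration.

```python
from itertools import product


def mean_square_over_signs(zs):
    """Average of |sum_j eps_j z_j|^2 over all 2^n sign patterns eps."""
    total = 0.0
    for signs in product((1, -1), repeat=len(zs)):
        total += abs(sum(s * z for s, z in zip(signs, zs))) ** 2
    return total / 2 ** len(zs)


zs = [1 + 2j, -0.5j, 3, 0.25 - 1j, 2j, -1.5 + 0.5j, 0.75, -2 - 2j]
sum_of_squares = sum(abs(z) ** 2 for z in zs)
print(mean_square_over_signs(zs), sum_of_squares)
```

The two printed numbers should agree up to rounding, whereas the trivial triangle inequality bound {(\sum_j |z_j|)^2} is typically much larger.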

We have already seen one instance in which this heuristic can be made precise, namely when the phases of {z_j,f_j} are randomised by a random sign, so that Khintchine’s inequality (Lemma 4 from Notes 1) can be applied. There are other contexts in which a square function estimate

\displaystyle \| (\sum_{j=1}^n |f_j|^2)^{1/2} \|_{L^p(X,\mu)} \lesssim \| \sum_{j=1}^n f_j \|_{L^p(X,\mu)}

or a reverse square function estimate

\displaystyle \| \sum_{j=1}^n f_j \|_{L^p(X,\mu)} \lesssim \| (\sum_{j=1}^n |f_j|^2)^{1/2} \|_{L^p(X,\mu)}

(or both) are known or conjectured to hold. For instance, the useful Littlewood-Paley inequality implies (among other things) that for any {1 < p < \infty}, we have the reverse square function estimate

\displaystyle \| \sum_{j=1}^n f_j \|_{L^p({\bf R}^d)} \lesssim_{p,d} \| (\sum_{j=1}^n |f_j|^2)^{1/2} \|_{L^p({\bf R}^d)}, \ \ \ \ \ (1)

whenever the Fourier transforms {\hat f_j} of the {f_j} are supported on disjoint annuli {\{ \xi \in {\bf R}^d: 2^{k_j} \leq |\xi| < 2^{k_j+1} \}}, and we also have the matching square function estimate

\displaystyle \| (\sum_{j=1}^n |f_j|^2)^{1/2} \|_{L^p({\bf R}^d)} \lesssim_{p,d} \| \sum_{j=1}^n f_j \|_{L^p({\bf R}^d)}

if there is some separation between the annuli (for instance if the {k_j} are {2}-separated). We recall the proofs of these facts below the fold. In the {p=2} case, we of course have Pythagoras’ theorem, which tells us that if the {f_j} are all orthogonal elements of {L^2(X,\mu)}, then

\displaystyle \| \sum_{j=1}^n f_j \|_{L^2(X,\mu)} = (\sum_{j=1}^n \| f_j \|_{L^2(X,\mu)}^2)^{1/2} = \| (\sum_{j=1}^n |f_j|^2)^{1/2} \|_{L^2(X,\mu)}.

In particular, this identity holds if the {f_j \in L^2({\bf R}^d)} have disjoint Fourier supports in the sense that their Fourier transforms {\hat f_j} are supported on disjoint sets. For {p=4}, the technique of bi-orthogonality can also give square function and reverse square function estimates in some cases, as we shall also see below the fold.

In recent years, it has begun to be realised that in the regime {p > 2}, a variant of reverse square function estimates such as (1) is also useful, namely decoupling estimates such as

\displaystyle \| \sum_{j=1}^n f_j \|_{L^p({\bf R}^d)} \lesssim_{p,d} (\sum_{j=1}^n \|f_j\|_{L^p({\bf R}^d)}^2)^{1/2} \ \ \ \ \ (2)

(actually in practice we often permit small losses such as {n^\varepsilon} on the right-hand side). An estimate such as (2) is weaker than (1) when {p\geq 2} (or equal when {p=2}), as can be seen by starting with the triangle inequality

\displaystyle \| \sum_{j=1}^n |f_j|^2 \|_{L^{p/2}({\bf R}^d)} \leq \sum_{j=1}^n \| |f_j|^2 \|_{L^{p/2}({\bf R}^d)},

and taking the square root of both sides to conclude that

\displaystyle \| (\sum_{j=1}^n |f_j|^2)^{1/2} \|_{L^p({\bf R}^d)} \leq (\sum_{j=1}^n \|f_j\|_{L^p({\bf R}^d)}^2)^{1/2}. \ \ \ \ \ (3)
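For clarity, the passage from the triangle inequality to (3) uses the two identities

\displaystyle \| |f_j|^2 \|_{L^{p/2}({\bf R}^d)} = \|f_j\|_{L^p({\bf R}^d)}^2, \quad \| \sum_{j=1}^n |f_j|^2 \|_{L^{p/2}({\bf R}^d)} = \| (\sum_{j=1}^n |f_j|^2)^{1/2} \|_{L^p({\bf R}^d)}^2,

both of which are immediate from the definition of the Lebesgue norms; the hypothesis {p \geq 2} ensures that {p/2 \geq 1}, so that the triangle inequality in {L^{p/2}({\bf R}^d)} is available. Inserting (3) into the right-hand side of (1) then yields (2).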

However, the flip side of this weakness is that (2) can be easier to prove. One key reason for this is the ability to iterate decoupling estimates such as (2), in a way that does not seem to be possible with reverse square function estimates such as (1). For instance, suppose that one has a decoupling inequality such as (2), and furthermore each {f_j} can be split further into components {f_j= \sum_{k=1}^m f_{j,k}} for which one has the decoupling inequalities

\displaystyle \| \sum_{k=1}^m f_{j,k} \|_{L^p({\bf R}^d)} \lesssim_{p,d} (\sum_{k=1}^m \|f_{j,k}\|_{L^p({\bf R}^d)}^2)^{1/2}.

Then by inserting these bounds back into (2) we see that we have the combined decoupling inequality

\displaystyle \| \sum_{j=1}^n\sum_{k=1}^m f_{j,k} \|_{L^p({\bf R}^d)} \lesssim_{p,d} (\sum_{j=1}^n \sum_{k=1}^m \|f_{j,k}\|_{L^p({\bf R}^d)}^2)^{1/2}.
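Written out as a single chain (using only the two hypotheses just stated), the argument reads

\displaystyle \| \sum_{j=1}^n\sum_{k=1}^m f_{j,k} \|_{L^p({\bf R}^d)} = \| \sum_{j=1}^n f_j \|_{L^p({\bf R}^d)} \lesssim_{p,d} (\sum_{j=1}^n \|f_j\|_{L^p({\bf R}^d)}^2)^{1/2} = (\sum_{j=1}^n \| \sum_{k=1}^m f_{j,k} \|_{L^p({\bf R}^d)}^2)^{1/2} \lesssim_{p,d} (\sum_{j=1}^n \sum_{k=1}^m \|f_{j,k}\|_{L^p({\bf R}^d)}^2)^{1/2}.

Note that the implied constant in the combined inequality is the product of the implied constants in the two inequalities being composed, so these constants need to be tracked when the procedure is iterated many times.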

This iterative feature of decoupling inequalities means that such inequalities work well with the method of induction on scales, that we introduced in the previous set of notes.

In fact, decoupling estimates share many features in common with restriction theorems; in addition to induction on scales, there are several other techniques that first emerged in the restriction theory literature, such as wave packet decompositions, rescaling, and bilinear or multilinear reductions, that turned out to also be well suited to proving decoupling estimates. As with restriction, the curvature or transversality of the different Fourier supports of the {f_j} will be crucial in obtaining non-trivial estimates.

Strikingly, in many important model cases, the optimal decoupling inequalities (except possibly for epsilon losses in the exponents) are now known. These estimates have in turn had a number of important applications, such as establishing certain discrete analogues of the restriction conjecture, or the first proof of the main conjecture for Vinogradov mean value theorems in analytic number theory.

These notes only serve as a brief introduction to decoupling. A systematic exploration of this topic can be found in this recent text of Demeter.
