Previous set of notes: Notes 3. Next set of notes: 246C Notes 1.
One of the great classical triumphs of complex analysis was in providing the first complete proof (by Hadamard and de la Vallée Poussin in 1896) of arguably the most important theorem in analytic number theory, the prime number theorem:
Theorem 1 (Prime number theorem) Let $\pi(x)$ denote the number of primes less than a given real number $x$. Then $\lim_{x \to \infty} \frac{\pi(x)}{x/\log x} = 1$ (or in asymptotic notation, $\pi(x) = (1+o(1)) \frac{x}{\log x}$ as $x \to \infty$).
(Actually, it turns out to be slightly more natural to replace the approximation $\frac{x}{\log x}$ in the prime number theorem by the logarithmic integral $\int_2^x \frac{dt}{\log t}$, which happens to be a more precise approximation, but we will not stress this point here.)
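As a rough numerical illustration of Theorem 1, the following sketch counts primes with a sieve of Eratosthenes and prints the ratio of the prime count to $x/\log x$. The helper names `prime_sieve` and `pi_ratio` are hypothetical, introduced only for this example; the sieve counts primes up to and including $x$, which does not affect the asymptotics.

```python
import math

def prime_sieve(limit):
    """Return a list of all primes <= limit (sieve of Eratosthenes), for limit >= 1."""
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0:2] = b"\x00\x00"  # 0 and 1 are not prime
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            # Cross out the multiples of p starting at p*p.
            is_prime[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(2, limit + 1) if is_prime[n]]

def pi_ratio(x):
    """Return (number of primes <= x) divided by x / log x."""
    return len(prime_sieve(x)) / (x / math.log(x))

for x in (10**3, 10**4, 10**5, 10**6):
    print(x, round(pi_ratio(x), 4))
```

The printed ratios drift towards $1$ only slowly, which is consistent with the remark that the logarithmic integral is the more precise approximation.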
The complex-analytic proof of this theorem hinges on the study of a key meromorphic function related to the prime numbers, the Riemann zeta function $\zeta$. Initially, it is only defined on the half-plane $\{ s \in \mathbb{C}: \mathrm{Re}(s) > 1 \}$:
Definition 2 (Riemann zeta function, preliminary definition) Let $s \in \mathbb{C}$ be such that $\mathrm{Re}(s) > 1$. Then we define $\zeta(s) := \sum_{n=1}^\infty \frac{1}{n^s}. \quad (1)$
Note that the series is locally uniformly convergent in the half-plane $\{ s: \mathrm{Re}(s) > 1 \}$, so in particular $\zeta$ is holomorphic on this region. In previous notes we have already evaluated some special values of this function: $\zeta(2) = \frac{\pi^2}{6}; \quad \zeta(4) = \frac{\pi^4}{90}. \quad (2)$
However, it turns out that the zeroes (and pole) of this function are of far greater importance to analytic number theory, particularly with regards to the study of the prime numbers.
The Riemann zeta function has several remarkable properties, some of which we summarise here:
Theorem 3 (Basic properties of the Riemann zeta function)
- (i) (Euler product formula) For any $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 1$, we have $\zeta(s) = \prod_p \left(1 - \frac{1}{p^s}\right)^{-1} \quad (3)$ where the product is absolutely convergent (and locally uniform in $s$) and is over the prime numbers $p = 2, 3, 5, \dots$.
- (ii) (Trivial zero-free region) $\zeta$ has no zeroes in the region $\{ s: \mathrm{Re}(s) > 1 \}$.
- (iii) (Meromorphic continuation) $\zeta$ has a unique meromorphic continuation to the complex plane (which by abuse of notation we also call $\zeta$), with a simple pole at $s=1$ and no other poles. Furthermore, the Riemann xi function $\xi(s) := \frac{1}{2} s(s-1) \pi^{-s/2} \Gamma(s/2) \zeta(s) \quad (4)$ is an entire function of order $1$ (after removing all singularities). The function $(s-1)\zeta(s)$ is an entire function of order one after removing the singularity at $s=1$.
- (iv) (Functional equation) After applying the meromorphic continuation from (iii), we have $\zeta(s) = 2^s \pi^{s-1} \sin\left(\frac{\pi s}{2}\right) \Gamma(1-s) \zeta(1-s) \quad (5)$ for all $s \in \mathbb{C}$ (excluding poles). Equivalently, we have $\xi(s) = \xi(1-s) \quad (6)$ for all $s \in \mathbb{C}$. (The equivalence between (5) and (6) is a routine consequence of the Euler reflection formula and the Legendre duplication formula, see Exercises 26 and 31 of Notes 1.)
Proof: We just prove (i) and (ii) for now, leaving (iii) and (iv) for later sections.
The claim (i) is an encoding of the fundamental theorem of arithmetic, which asserts that every natural number $n$ is uniquely representable as a product $\prod_p p^{a_p}$ over primes, where the $a_p$ are natural numbers, all but finitely many of which are zero. Writing this representation as $n = \prod_p p^{a_p}$, we see that $\sum_{n \in S_{x,m}} \frac{1}{n^s} = \prod_{p \leq x} \sum_{a=0}^m \frac{1}{p^{as}}$
whenever $x \geq 1$, $m \geq 0$, and $S_{x,m}$ consists of all the natural numbers of the form $n = \prod_{p \leq x} p^{a_p}$ for some $a_p \leq m$. Sending $m$ and $x$ to infinity, we conclude from monotone convergence and the geometric series formula that $\sum_{n=1}^\infty \frac{1}{n^s} = \prod_p \left(1 - \frac{1}{p^s}\right)^{-1}$ whenever $s > 1$ is real, and then from dominated convergence we see that the same formula holds for complex $s$ with $\mathrm{Re}(s) > 1$ as well. Local uniform convergence then follows from the product form of the Weierstrass $M$-test (Exercise 19 of Notes 1).
The claim (ii) is immediate from (i) since the Euler product is absolutely convergent and all terms are non-zero.
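The Euler product formula of Theorem 3(i) can be sanity-checked numerically by comparing a truncated Dirichlet series $\sum_{n \leq N} n^{-s}$ with a truncated product $\prod_{p \leq x} (1 - p^{-s})^{-1}$. The sketch below is only an illustration under the assumption $\mathrm{Re}(s) > 1$; the function names and truncation points are hypothetical choices made for this example.

```python
import math

def primes_up_to(limit):
    """Return all primes <= limit using the sieve of Eratosthenes (limit >= 1)."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(2, limit + 1) if sieve[n]]

def dirichlet_partial_sum(s, terms):
    """Truncated series: sum of n^(-s) for 1 <= n <= terms."""
    return sum(n ** -s for n in range(1, terms + 1))

def euler_partial_product(s, prime_limit):
    """Truncated Euler product over primes p <= prime_limit."""
    product = 1.0
    for p in primes_up_to(prime_limit):
        product /= 1 - p ** -s
    return product

for s in (2, 2 + 1j):
    print(s, dirichlet_partial_sum(s, 10**5), euler_partial_product(s, 10**4))
```

For $s = 2$ both truncations should land close to $\pi^2/6$; for complex $s$ the two truncations should agree with each other to several decimal places.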
We remark that by sending $s$ to $1$ in Theorem 3(i) we conclude that $\prod_p \left(1 - \frac{1}{p}\right)^{-1} = \sum_{n=1}^\infty \frac{1}{n}$
and from the divergence of the harmonic series we then conclude Euler’s theorem $\sum_p \frac{1}{p} = \infty$. This can be viewed as a weak version of the prime number theorem, and already illustrates the potential applicability of the Riemann zeta function to control the distribution of the prime numbers.
The meromorphic continuation (iii) of the zeta function is initially surprising, but can be interpreted either as a manifestation of the extremely regular spacing of the natural numbers $n$ occurring in the sum (1), or as a consequence of various integral representations of $\zeta$ (or slight modifications thereof). We will focus in this set of notes on a particular representation of $\zeta$ as essentially the Mellin transform of the theta function $\theta$ that briefly appeared in previous notes, and the functional equation (iv) can then be viewed as a consequence of the modularity of that theta function. This in turn was established using the Poisson summation formula, so one can view the functional equation as ultimately being a manifestation of Poisson summation. (For a direct proof of the functional equation via Poisson summation, see these notes.)
Henceforth we work with the meromorphic continuation of $\zeta$. The functional equation (iv), when combined with special values of $\zeta$ such as (2), gives some additional values of $\zeta$ outside of its initial domain $\{ s: \mathrm{Re}(s) > 1 \}$, most famously $\zeta(-1) = -\frac{1}{12}.$
If one formally compares this formula with (1), one arrives at the infamous identity $1 + 2 + 3 + \dots = -\frac{1}{12}$, although this identity has to be interpreted in a suitable non-classical sense in order for it to be rigorous (see this previous blog post for further discussion).
From Theorem 3 and the non-vanishing nature of $\Gamma$, we see that $\zeta$ has simple zeroes (known as trivial zeroes) at the negative even integers $-2, -4, \dots$, and all other zeroes (the non-trivial zeroes) inside the critical strip $\{ s: 0 \leq \mathrm{Re}(s) \leq 1 \}$. (The non-trivial zeroes are conjectured to all be simple, but this is hopelessly far from being proven at present.) As we shall see shortly, these latter zeroes turn out to be closely related to the distribution of the primes. The functional equation tells us that if $\rho$ is a non-trivial zero then so is $1-\rho$; also, we have the identity $\zeta(\overline{s}) = \overline{\zeta(s)}$
for all $s$ with $\mathrm{Re}(s) > 1$ by (1), hence for all $s$ (except the pole at $s=1$) by meromorphic continuation. Thus if $\rho$ is a non-trivial zero then so is $\overline{\rho}$. We conclude that the set of non-trivial zeroes is symmetric under reflection across both the real axis and the critical line $\{ s: \mathrm{Re}(s) = \frac{1}{2} \}$. We have the following infamous conjecture:
Conjecture 4 (Riemann hypothesis) All the non-trivial zeroes of $\zeta$ lie on the critical line $\{ s: \mathrm{Re}(s) = \frac{1}{2} \}$.
This conjecture would have many implications in analytic number theory, particularly with regard to the distribution of the primes. Of course, it is far from proven at present, but the partial results we have towards this conjecture are still sufficient to establish results such as the prime number theorem.
Return now to the original region where $\mathrm{Re}(s) > 1$. To take more advantage of the Euler product formula (3), we take complex logarithms to conclude that $-\log \zeta(s) = \sum_p \log\left(1 - \frac{1}{p^s}\right)$
for suitable branches of the complex logarithm, and then on taking derivatives (using for instance the generalised Cauchy integral formula and Fubini’s theorem to justify the interchange of summation and derivative) we see that $-\frac{\zeta'(s)}{\zeta(s)} = \sum_p \frac{\log p / p^s}{1 - 1/p^s}.$ From the geometric series formula we have $\frac{\log p / p^s}{1 - 1/p^s} = \sum_{j=1}^\infty \frac{\log p}{p^{js}}$ and so (by another application of Fubini’s theorem) we have the identity $-\frac{\zeta'(s)}{\zeta(s)} = \sum_{n=1}^\infty \frac{\Lambda(n)}{n^s} \quad (8)$ for $\mathrm{Re}(s) > 1$, where the von Mangoldt function $\Lambda(n)$ is defined to equal $\log p$ whenever $n = p^j$ is a power of a prime $p$ for some $j = 1, 2, \dots$, and $0$ otherwise. The contribution of the higher prime powers $p^2, p^3, \dots$ is negligible in practice, and as a first approximation one can think of the von Mangoldt function as the indicator function of the primes, weighted by the logarithm function.
The series $\sum_{n=1}^\infty \frac{1}{n^s}$ and $\sum_{n=1}^\infty \frac{\Lambda(n)}{n^s}$ that show up in the above formulae are examples of Dirichlet series, which are a convenient device to transform various sequences of arithmetic interest into holomorphic or meromorphic functions. Here are some more examples:
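Exercise 5 (Standard Dirichlet series) Let $s$ be a complex number with $\mathrm{Re}(s) > 1$.
- (i) Show that $-\zeta'(s) = \sum_{n=1}^\infty \frac{\log n}{n^s}$.
- (ii) Show that $\zeta^2(s) = \sum_{n=1}^\infty \frac{\tau(n)}{n^s}$, where $\tau(n) := \sum_{d|n} 1$ is the divisor function of $n$ (the number of divisors of $n$).
- (iii) Show that $\frac{1}{\zeta(s)} = \sum_{n=1}^\infty \frac{\mu(n)}{n^s}$, where $\mu(n)$ is the Möbius function, defined to equal $(-1)^k$ when $n$ is the product of $k$ distinct primes for some $k \geq 0$, and $0$ otherwise.
- (iv) Show that $\frac{\zeta(2s)}{\zeta(s)} = \sum_{n=1}^\infty \frac{\lambda(n)}{n^s}$, where $\lambda(n)$ is the Liouville function, defined to equal $(-1)^k$ when $n$ is the product of $k$ (not necessarily distinct) primes for some $k \geq 0$.
- (v) Show that $\log \zeta(s) = \sum_{n=1}^\infty \frac{\Lambda(n)/\log n}{n^s}$, where $\log \zeta$ is the holomorphic branch of the logarithm that is real for $s > 1$, and with the convention that $\Lambda(n)/\log n$ vanishes for $n = 1$.
- (vi) Use the fundamental theorem of arithmetic to show that the von Mangoldt function is the unique function $\Lambda: \mathbb{N} \to \mathbb{R}$ such that $\log n = \sum_{d|n} \Lambda(d)$ for every positive integer $n$. Use this and (i) to provide an alternate proof of the identity (8). Thus we see that (8) is really just another encoding of the fundamental theorem of arithmetic.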
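The divisor-sum identity in Exercise 5(vi), namely $\log n = \sum_{d|n} \Lambda(d)$, is easy to test numerically before attempting a proof. The sketch below uses trial division; `von_mangoldt` and `divisor_sum_of_lambda` are hypothetical helper names chosen for this illustration, and a numerical check of course does not replace the argument via unique factorisation.

```python
import math

def von_mangoldt(n):
    """Return Lambda(n): log p if n = p^j for a prime p and some j >= 1, else 0."""
    if n < 2:
        return 0.0
    # The smallest divisor >= 2 of n is automatically prime.
    p = next((d for d in range(2, math.isqrt(n) + 1) if n % d == 0), n)
    # n is a power of p exactly when dividing out p repeatedly leaves 1.
    m = n
    while m % p == 0:
        m //= p
    return math.log(p) if m == 1 else 0.0

def divisor_sum_of_lambda(n):
    """Return the sum of Lambda(d) over all divisors d of n."""
    return sum(von_mangoldt(d) for d in range(1, n + 1) if n % d == 0)

for n in (1, 12, 97, 360, 1024):
    print(n, divisor_sum_of_lambda(n), math.log(n))
```

Each printed pair should agree up to floating-point rounding.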
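Given the appearance of the von Mangoldt function $\Lambda$, it is natural to reformulate the prime number theorem in terms of this function:
Theorem 6 (Prime number theorem, von Mangoldt form) One has $\lim_{x \to \infty} \frac{1}{x} \sum_{n \leq x} \Lambda(n) = 1$ (or in asymptotic notation, $\sum_{n \leq x} \Lambda(n) = x + o(x)$ as $x \to \infty$).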
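To get a feel for Theorem 6, one can compute the summatory function $\psi(x) := \sum_{n \leq x} \Lambda(n)$ directly and watch $\psi(x)/x$ approach $1$. The following is a minimal sketch using a sieve; the name `chebyshev_psi` is a label chosen for this example (the notation $\psi$ is the customary one but is not used elsewhere in these notes).

```python
import math

def chebyshev_psi(x):
    """Return the sum of Lambda(n) over 1 <= n <= x, for an integer x >= 1."""
    is_prime = bytearray([1]) * (x + 1)
    is_prime[0:2] = b"\x00\x00"
    total = 0.0
    for p in range(2, x + 1):
        if is_prime[p]:
            # Cross out multiples of p (an empty slice when p*p > x).
            is_prime[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
            # Every prime power p^j <= x contributes log p to the sum.
            power = p
            while power <= x:
                total += math.log(p)
                power *= p
    return total

for x in (10**2, 10**4, 10**6):
    print(x, chebyshev_psi(x) / x)
```

The ratios converge to $1$ noticeably faster than the ratios $\pi(x) / (x/\log x)$ do, which is one reason the von Mangoldt form is the more natural one to prove.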
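Let us see how Theorem 6 implies Theorem 1. Firstly, for any $x \geq 2$, we can write $\sum_{n \leq x} \Lambda(n) = \sum_{p \leq x} \log p + \sum_{j=2}^\infty \sum_{p \leq x^{1/j}} \log p.$
The sum $\sum_{p \leq x^{1/j}} \log p$ is non-zero for only $O(\log x)$ values of $j$, and is of size $O(x^{1/2} \log x)$, thus $\sum_{n \leq x} \Lambda(n) = \sum_{p \leq x} \log p + O(x^{1/2} \log^2 x).$ Since $x^{1/2} \log^2 x = o(x)$, we conclude from Theorem 6 that $\sum_{p \leq x} \log p = x + o(x)$ as $x \to \infty$. Next, observe from the fundamental theorem of calculus that $\frac{1}{\log p} - \frac{1}{\log x} = \int_p^x \frac{dy}{y \log^2 y}.$ Multiplying by $\log p$ and summing over all primes $p \leq x$, we conclude that $\pi(x) - \frac{\sum_{p \leq x} \log p}{\log x} = \int_2^x \left( \sum_{p \leq y} \log p \right) \frac{dy}{y \log^2 y}.$ From Theorem 6 we certainly have $\sum_{p \leq y} \log p = O(y)$, thus $\pi(x) - \frac{x + o(x)}{\log x} = O\left( \int_2^x \frac{dy}{\log^2 y} \right).$ By splitting the integral into the ranges $2 \leq y \leq \sqrt{x}$ and $\sqrt{x} < y \leq x$ we see that the right-hand side is $o(x/\log x)$, and Theorem 1 follows.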
Exercise 7 Show that Theorem 1 conversely implies Theorem 6.
The alternate form (8) of the Euler product identity connects the primes (represented here via proxy by the von Mangoldt function) with the logarithmic derivative of the zeta function, and can be used as a starting point for describing further relationships between $\zeta$ and the primes. Most famously, we shall see later in these notes that it leads to the remarkably precise Riemann-von Mangoldt explicit formula:
Theorem 8 (Riemann-von Mangoldt explicit formula) For any non-integer $x > 1$, we have $\sum_{n \leq x} \Lambda(n) = x - \lim_{T \to \infty} \sum_{\rho: |\mathrm{Im}(\rho)| \leq T} \frac{x^\rho}{\rho} - \log(2\pi) - \frac{1}{2} \log(1 - x^{-2})$ where $\rho$ ranges over the non-trivial zeroes of $\zeta$ with imaginary part in $[-T,T]$. Furthermore, the convergence of the limit is locally uniform in $x$.
Actually, it turns out that this formula is in some sense too precise; in applications it is often more convenient to work with smoothed variants of this formula in which the sum on the left-hand side is smoothed out, but the contribution of zeroes with large imaginary part is damped; see Exercise 22. Nevertheless, this formula clearly illustrates how the non-trivial zeroes $\rho$ of the zeta function influence the primes. Indeed, if one formally differentiates the above formula in $x$, one is led to the (quite nonrigorous) approximation $\Lambda(n) \approx 1 - \sum_\rho n^{\rho-1} \quad (9)$
or (writing $\rho = \sigma + i\gamma$) $\Lambda(n) \approx 1 - \sum_{\sigma + i\gamma} \frac{e^{i\gamma \log n}}{n^{1-\sigma}}.$ Thus we see that each zero $\rho = \sigma + i\gamma$ induces an oscillation in the von Mangoldt function, with $\gamma$ controlling the frequency of the oscillation and $\sigma$ the rate at which the oscillation dies out as $n \to \infty$. This relationship is sometimes known informally as “the music of the primes”.
Comparing Theorem 8 with Theorem 6, it is natural to suspect that the key step in the proof of the latter is to establish the following slight but important extension of Theorem 3(ii), which can be viewed as a very small step towards the Riemann hypothesis:
Theorem 9 (Slight enlargement of zero-free region) There are no zeroes of $\zeta$ on the line $\{ s: \mathrm{Re}(s) = 1 \}$.
It is not quite immediate to see how Theorem 6 follows from Theorem 8 and Theorem 9, but we will demonstrate it below the fold.
Although Theorem 9 only seems like a slight improvement of Theorem 3(ii), proving it is surprisingly non-trivial. The basic idea is the following: if there were a zero at $1+it$, then there would also be a different zero at $1-it$ (note $t$ cannot vanish due to the pole at $s=1$), and then the approximation (9) becomes $\Lambda(n) \approx 1 - n^{it} - n^{-it} + \dots = 1 - 2\cos(t \log n) + \dots.$
But the expression $1 - 2\cos(t \log n)$ can be negative for large regions of the variable $n$, whereas $\Lambda(n)$ is always non-negative. This conflict eventually leads to a contradiction, but it is not immediately obvious how to make this argument rigorous. We will present here the classical approach to doing so using a trigonometric identity of Mertens.
In fact, Theorem 9 is basically equivalent to the prime number theorem:
Exercise 10 For the purposes of this exercise, assume Theorem 6, but do not assume Theorem 9. For any non-zero real $t$, show that $-\frac{\zeta'(\sigma+it)}{\zeta(\sigma+it)} = o\left( \frac{1}{\sigma-1} \right)$ as $\sigma \to 1^+$, where $o\left(\frac{1}{\sigma-1}\right)$ denotes a quantity that goes to zero as $\sigma \to 1^+$ after being multiplied by $\sigma - 1$. Use this to derive Theorem 9.
This equivalence can help explain why the prime number theorem is remarkably non-trivial to prove, and why the Riemann zeta function has to be either explicitly or implicitly involved in the proof.
This post is only intended as the briefest of introductions to complex-analytic methods in analytic number theory; also, we have not chosen the shortest route to the prime number theorem, electing instead to travel in directions that particularly showcase the complex-analytic results introduced in this course. For some further discussion see this previous set of lecture notes, particularly Notes 2 and Supplement 3 (with much of the material in this post drawn from the latter).
Previous set of notes: Notes 2. Next set of notes: Notes 4.
On the real line, the quintessential examples of a periodic function are the (normalised) sine and cosine functions $\sin(2\pi x)$, $\cos(2\pi x)$, which are $1$-periodic in the sense that $\sin(2\pi(x+1)) = \sin(2\pi x); \quad \cos(2\pi(x+1)) = \cos(2\pi x).$
By taking various polynomial combinations of $\sin(2\pi x)$ and $\cos(2\pi x)$ we obtain more general trigonometric polynomials that are $1$-periodic; and the theory of Fourier series tells us that all other $1$-periodic functions (with reasonable integrability conditions) can be approximated in various senses by such polynomial combinations. Using Euler’s identity, one can use $e^{2\pi i x}$ and $e^{-2\pi i x}$ in place of $\sin(2\pi x)$ and $\cos(2\pi x)$ as the basic generating functions here, provided of course one is willing to use complex coefficients instead of real ones. Of course, by rescaling one can also make similar statements for other periods than $1$. $1$-periodic functions $f: \mathbb{R} \to \mathbb{C}$ can also be identified (by abuse of notation) with functions $f: \mathbb{R}/\mathbb{Z} \to \mathbb{C}$ on the quotient space $\mathbb{R}/\mathbb{Z}$ (known as the additive $1$-torus or additive unit circle), or with functions $f: [0,1] \to \mathbb{C}$ on the fundamental domain (up to boundary) $[0,1]$ of that quotient space with the periodic boundary condition $f(0) = f(1)$. The map $x \mapsto (\cos(2\pi x), \sin(2\pi x))$ also identifies the additive unit circle $\mathbb{R}/\mathbb{Z}$ with the geometric unit circle $S^1 = \{ (x,y) \in \mathbb{R}^2: x^2+y^2 = 1 \}$, thanks in large part to the fundamental trigonometric identity $\cos^2 x + \sin^2 x = 1$; this can also be identified with the multiplicative unit circle $\{ z \in \mathbb{C}: |z| = 1 \}$. (Usually by abuse of notation we refer to all of these three sets simultaneously as the “unit circle”.) Trigonometric polynomials on the additive unit circle then correspond to ordinary polynomials of the real coordinates $x, y$ of the geometric unit circle, or Laurent polynomials of the complex variable $z$.
What about periodic functions on the complex plane? We can start with singly periodic functions $f: \mathbb{C} \to \mathbb{C}$ which obey a periodicity relationship $f(z + \omega) = f(z)$ for all $z$ in the domain and some period $\omega \in \mathbb{C} \backslash \{0\}$; such functions can also be viewed as functions on the “additive cylinder” $\omega \mathbb{Z} \backslash \mathbb{C}$ (or equivalently $\mathbb{C} / \omega \mathbb{Z}$). We can rescale $\omega = 1$ as before. For holomorphic functions, we have the following characterisations:
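Proposition 1 (Description of singly periodic holomorphic functions)
- (i) Every $1$-periodic entire function $f: \mathbb{C} \to \mathbb{C}$ has an absolutely convergent expansion $f(z) = \sum_{n=-\infty}^\infty a_n e^{2\pi i n z} = \sum_{n=-\infty}^\infty a_n q^n \quad (1)$ where $q$ is the nome $q := e^{2\pi i z}$, and the $a_n$ are complex coefficients such that $\limsup_{n \to +\infty} |a_n|^{1/n} = \limsup_{n \to +\infty} |a_{-n}|^{1/n} = 0. \quad (2)$ Conversely, every doubly infinite sequence $(a_n)_{n \in \mathbb{Z}}$ of coefficients obeying (2) gives rise to a $1$-periodic entire function $f$ via the formula (1).
- (ii) Every bounded $1$-periodic holomorphic function $f: \mathbb{H} \to \mathbb{C}$ on the upper half-plane $\mathbb{H} := \{ z: \mathrm{Im}(z) > 0 \}$ has an expansion $f(z) = \sum_{n=0}^\infty a_n e^{2\pi i n z} = \sum_{n=0}^\infty a_n q^n \quad (3)$ where the $a_n$ are complex coefficients such that $\limsup_{n \to +\infty} |a_n|^{1/n} \leq 1. \quad (4)$ Conversely, every infinite sequence $(a_n)_{n \geq 0}$ obeying (4) gives rise to a $1$-periodic holomorphic function $f$ which is bounded away from the real axis (i.e., bounded on $\{ z: \mathrm{Im}(z) \geq \varepsilon \}$ for every $\varepsilon > 0$).
In both cases, the coefficients $a_n$ can be recovered from $f$ by the Fourier inversion formula $a_n = \int_{\gamma_{z_0 \to z_0+1}} f(z) e^{-2\pi i n z}\ dz \quad (5)$ for any $z_0$ in $\mathbb{C}$ (in case (i)) or $\mathbb{H}$ (in case (ii)).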
Proof: If $f: \mathbb{C} \to \mathbb{C}$ is $1$-periodic, then it can be expressed as $f(z) = F(q) = F(e^{2\pi i z})$ for some function $F: \mathbb{C} \backslash \{0\} \to \mathbb{C}$ on the “multiplicative cylinder” $\mathbb{C} \backslash \{0\}$, since the fibres of the map $z \mapsto e^{2\pi i z}$ are cosets of the integers $\mathbb{Z}$, on which $f$ is constant by hypothesis. As the map $z \mapsto e^{2\pi i z}$ is a covering map from $\mathbb{C}$ to $\mathbb{C} \backslash \{0\}$, we see that $F$ will be holomorphic if and only if $f$ is. Thus $F$ must have a Laurent series expansion $F(q) = \sum_{n=-\infty}^\infty a_n q^n$ with coefficients $a_n$ obeying (2), which gives (1), and the inversion formula (5) follows from the usual contour integration formula for Laurent series coefficients. The converse direction to (i) also follows by reversing the above arguments.
For part (ii), we observe that the map $z \mapsto e^{2\pi i z}$ is also a covering map from $\mathbb{H}$ to the punctured disk $D(0,1) \backslash \{0\}$, so we can argue as before except that now $F$ is a bounded holomorphic function on the punctured disk. By the Riemann singularity removal theorem (Exercise 35 of 246A Notes 3) $F$ extends to be holomorphic on all of $D(0,1)$, and thus has a Taylor expansion $F(q) = \sum_{n=0}^\infty a_n q^n$ for some coefficients $a_n$ obeying (4). The argument now proceeds as with part (i).
The additive cylinder $\mathbb{Z} \backslash \mathbb{C}$ and the multiplicative cylinder $\mathbb{C} \backslash \{0\}$ can both be identified (on the level of smooth manifolds, at least) with the geometric cylinder $\{ (x,y,z) \in \mathbb{R}^3: x^2 + y^2 = 1 \}$, but we will not use this identification here.
Now let us turn attention to doubly periodic functions $f: \mathbb{C} \to \mathbb{C}$ of a complex variable $z$, that is to say functions $f$ that obey two periodicity relations $f(z + \omega_1) = f(z); \quad f(z + \omega_2) = f(z)$
for all $z \in \mathbb{C}$ and some periods $\omega_1, \omega_2 \in \mathbb{C}$, which to avoid degeneracies we will assume to be linearly independent over the reals (thus $\omega_1, \omega_2$ are non-zero and the ratio $\omega_2/\omega_1$ is not real). One can rescale $\omega_1, \omega_2$ by a common scaling factor $\lambda \in \mathbb{C} \backslash \{0\}$ to normalise either $\omega_1 = 1$ or $\omega_2 = 1$, but one of course cannot simultaneously normalise both parameters in this fashion. As in the singly periodic case, such functions can also be identified with functions on the additive $2$-torus $\mathbb{C}/\Lambda$, where $\Lambda$ is the lattice $\Lambda := \omega_1 \mathbb{Z} + \omega_2 \mathbb{Z}$, or with functions $f$ on the solid parallelogram bounded by the contour $\gamma_{0 \to \omega_1 \to \omega_1+\omega_2 \to \omega_2 \to 0}$ (a fundamental domain up to boundary for that torus), obeying the boundary periodicity conditions $f(z + \omega_1) = f(z)$ for $z$ in the edge $\gamma_{0 \to \omega_2}$, and $f(z + \omega_2) = f(z)$ for $z$ in the edge $\gamma_{0 \to \omega_1}$.
Within the world of holomorphic functions, the collection of doubly periodic functions is boring:
Proposition 2 Let $f: \mathbb{C} \to \mathbb{C}$ be an entire doubly periodic function (with periods $\omega_1, \omega_2$ linearly independent over $\mathbb{R}$). Then $f$ is constant.
In the language of Riemann surfaces, this proposition asserts that the torus $\mathbb{C}/\Lambda$ is a non-hyperbolic Riemann surface; it cannot be holomorphically mapped non-trivially into a bounded subset of the complex plane.
Proof: The fundamental domain (up to boundary) enclosed by $\gamma_{0 \to \omega_1 \to \omega_1+\omega_2 \to \omega_2 \to 0}$ is compact, hence $f$ is bounded on this domain, hence bounded on all of $\mathbb{C}$ by double periodicity. The claim now follows from Liouville’s theorem. (One could alternatively have argued here using the compactness of the torus $\mathbb{C}/\Lambda$.)
To obtain more interesting examples of doubly periodic functions, one must therefore turn to the world of meromorphic functions – or equivalently, holomorphic functions into the Riemann sphere $\mathbb{C} \cup \{\infty\}$. As it turns out, a particularly fundamental example of such a function is the Weierstrass elliptic function $\wp(z) := \frac{1}{z^2} + \sum_{z_0 \in \Lambda \backslash \{0\}} \left( \frac{1}{(z-z_0)^2} - \frac{1}{z_0^2} \right)$
which plays a role in doubly periodic functions analogous to the role of $x \mapsto \cos(2\pi x)$ for $1$-periodic real functions. This function will have a double pole at the origin $0$, and more generally at all other points on the lattice $\Lambda$, but no other poles. The derivative $\wp'(z) = -2 \sum_{z_0 \in \Lambda} \frac{1}{(z-z_0)^3}$ of the Weierstrass function is another doubly periodic meromorphic function, now with a triple pole at every point of $\Lambda$, and plays a role analogous to $x \mapsto \sin(2\pi x)$. Remarkably, all the other doubly periodic meromorphic functions with these periods will turn out to be rational combinations of $\wp$ and $\wp'$; furthermore, in analogy with the identity $\cos^2 x + \sin^2 x = 1$, one has an identity of the form $\wp'(z)^2 = 4 \wp(z)^3 - g_2 \wp(z) - g_3$ for all $z \in \mathbb{C}$ (avoiding poles) and some complex numbers $g_2, g_3$ that depend on the lattice $\Lambda$. Indeed, much as the map $x \mapsto (\cos(2\pi x), \sin(2\pi x))$ creates a diffeomorphism between the additive unit circle $\mathbb{R}/\mathbb{Z}$ and the geometric unit circle $\{ (x,y) \in \mathbb{R}^2: x^2+y^2 = 1 \}$, the map $z \mapsto (\wp(z), \wp'(z))$ turns out to be a complex diffeomorphism between the torus $\mathbb{C}/\Lambda$ and the elliptic curve $\{ (z,w) \in \mathbb{C}^2: w^2 = 4z^3 - g_2 z - g_3 \} \cup \{\infty\}$ with the convention that $(\wp, \wp')$ maps the origin of the torus to the point $\infty$ at infinity. (Indeed, one can view elliptic curves as “multiplicative tori”, and both the additive and multiplicative tori can be identified as smooth manifolds with the more familiar geometric torus, but we will not use such an identification here.) This fundamental identification with elliptic curves and tori motivates many of the further remarkable properties of elliptic curves; for instance, the fact that tori are obviously an abelian group gives rise to an abelian group law on elliptic curves (and this law can be interpreted as an analogue of the trigonometric sum identities for $\wp, \wp'$). The description of the various meromorphic functions on the torus also helps motivate the more general Riemann-Roch theorem that is a fundamental law governing meromorphic functions on other compact Riemann surfaces (and is discussed further in these 246C notes).
So far we have focused on studying a single torus $\mathbb{C}/\Lambda$. However, another important mathematical object of study is the space of all such tori, modulo isomorphism; this is a basic example of a moduli space, known as the (classical, level one) modular curve $X_0(1)$. This curve can be described in a number of ways. On the one hand, it can be viewed as the upper half-plane $\mathbb{H} = \{ z: \mathrm{Im}(z) > 0 \}$ quotiented out by the discrete group $SL_2(\mathbb{Z})$; on the other hand, by using the $j$-invariant, it can be identified with the complex plane $\mathbb{C}$; alternatively, one can compactify the modular curve and identify this compactification with the Riemann sphere $\mathbb{C} \cup \{\infty\}$. (This identification, by the way, produces a very short proof of the little and great Picard theorems, which we proved in 246A Notes 4.) Functions on the modular curve (such as the $j$-invariant) can be viewed as $SL_2(\mathbb{Z})$-invariant functions on $\mathbb{H}$, and include the important class of modular functions; they naturally generalise to the larger class of (weakly) modular forms, which are functions on $\mathbb{H}$ which transform in a very specific way under $SL_2(\mathbb{Z})$-action, and which are ubiquitous throughout mathematics, and particularly in number theory. Basic examples of modular forms include the Eisenstein series, which are also the Laurent coefficients of the Weierstrass elliptic functions $\wp$. More number theoretic examples of modular forms include (suitable powers of) theta functions $\theta$, and the modular discriminant $\Delta$. Modular forms are $1$-periodic functions on the half-plane, and hence by Proposition 1 come with Fourier coefficients $a_n$; these coefficients often turn out to encode a surprising amount of number-theoretic information; a dramatic example of this is the famous modularity theorem, (a special case of which was) used amongst other things to establish Fermat’s last theorem.
Modular forms can be generalised to other discrete groups than $SL_2(\mathbb{Z})$ (such as congruence groups) and to other domains than the half-plane $\mathbb{H}$, leading to the important larger class of automorphic forms, which are of major importance in number theory and representation theory, but which are well outside the scope of this course to discuss.
Previous set of notes: Notes 1. Next set of notes: Notes 3.
In Exercise 5 (and Lemma 1) of 246A Notes 4 we already observed some links between complex analysis on the disk (or annulus) and Fourier series on the unit circle:
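- (i) Functions $f$ that are holomorphic on a disk $\{ z: |z| < R \}$ are expressed by a convergent Fourier series (and also Taylor series) $f(re^{i\theta}) = \sum_{n=0}^\infty r^n a_n e^{in\theta}$ for $0 \leq r < R$ (so in particular $a_n = \frac{1}{n!} f^{(n)}(0)$), where $\limsup_{n \to +\infty} |a_n|^{1/n} \leq \frac{1}{R}; \quad (1)$ conversely, every infinite sequence $(a_n)_{n=0}^\infty$ of coefficients obeying (1) arises from such a function $f$.
- (ii) Functions $f$ that are holomorphic on an annulus $\{ z: r_- < |z| < r_+ \}$ are expressed by a convergent Fourier series (and also Laurent series) $f(re^{i\theta}) = \sum_{n=-\infty}^\infty r^n a_n e^{in\theta}$, where $\limsup_{n \to +\infty} |a_n|^{1/n} \leq \frac{1}{r_+}; \quad \limsup_{n \to +\infty} |a_{-n}|^{1/n} \leq r_-; \quad (2)$ conversely, every doubly infinite sequence $(a_n)_{n=-\infty}^\infty$ of coefficients obeying (2) arises from such a function $f$.
- (iii) In the situation of (ii), there is a unique decomposition $f = f_1 + f_2$ where $f_1$ extends holomorphically to $\{ z: |z| < r_+ \}$, and $f_2$ extends holomorphically to $\{ z: |z| > r_- \}$ and goes to zero at infinity, and are given by the formulae $f_1(z) = \sum_{n=0}^\infty a_n z^n = \frac{1}{2\pi i} \int_\gamma \frac{f(w)}{w-z}\ dw$ where $\gamma$ is any anticlockwise contour in $\{ z: |z| < r_+ \}$ enclosing $z$, and $f_2(z) = \sum_{n=-\infty}^{-1} a_n z^n = -\frac{1}{2\pi i} \int_\gamma \frac{f(w)}{w-z}\ dw$ where $\gamma$ is any anticlockwise contour in $\{ z: |z| > r_- \}$ enclosing $0$ but not $z$.
This connection lets us interpret various facts about Fourier series through the lens of complex analysis, at least for some special classes of Fourier series. For instance, the Fourier inversion formula $a_n = \frac{1}{2\pi} \int_0^{2\pi} f(e^{i\theta}) e^{-in\theta}\ d\theta$ becomes the Cauchy-type formula for the Laurent or Taylor coefficients of $f$, in the event that the coefficients are doubly infinite and obey (2) for some $r_- < 1 < r_+$, or singly infinite and obey (1) for some $R > 1$.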
It turns out that there are similar links between complex analysis on a half-plane (or strip) and Fourier integrals on the real line, which we will explore in these notes.
We first fix a normalisation for the Fourier transform. If $f: \mathbb{R} \to \mathbb{C}$ is an absolutely integrable function on the real line, we define its Fourier transform $\hat f: \mathbb{R} \to \mathbb{C}$ by the formula $\hat f(\xi) := \int_{\mathbb{R}} f(x) e^{-2\pi i x \xi}\ dx. \quad (3)$
From the dominated convergence theorem $\hat f$ will be a bounded continuous function; from the Riemann-Lebesgue lemma it also decays to zero as $\xi \to \pm \infty$. My choice to place the $2\pi$ in the exponent is a personal preference (it is slightly more convenient for some harmonic analysis formulae such as the identities (4), (5), (6) below), though in the complex analysis and PDE literature there are also some slight advantages in omitting this factor. In any event it is not difficult to adapt the discussion in these notes for other choices of normalisation. It is of interest to extend the Fourier transform beyond the $L^1(\mathbb{R})$ class into other function spaces, such as $L^2(\mathbb{R})$ or the space of tempered distributions, but we will not pursue this direction here; see for instance these lecture notes of mine for a treatment.
Exercise 1 (Fourier transform of Gaussian) If $a$ is a complex number with $\mathrm{Re}(a) > 0$ and $f$ is the Gaussian function $f(x) := e^{-\pi a x^2}$, show that the Fourier transform $\hat f$ is given by the Gaussian $\hat f(\xi) = a^{-1/2} e^{-\pi \xi^2/a}$, where we use the standard branch for $a^{-1/2}$.
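The claim of Exercise 1 can be checked numerically before proving it. The sketch below approximates $\int_{\mathbb{R}} f(x) e^{-2\pi i x \xi}\ dx$ by the trapezoid rule on a truncated interval and compares it with $a^{-1/2} e^{-\pi \xi^2 / a}$. The function names, the sample value $a = 1 + 0.5i$, and the truncation parameters are assumptions made for this illustration; Python's complex power uses the principal branch, which matches the standard branch of $a^{-1/2}$ when $\mathrm{Re}(a) > 0$.

```python
import cmath
import math

def fourier_transform(f, xi, half_width=10.0, steps=20000):
    """Trapezoid-rule approximation of the integral of f(x) e^{-2 pi i x xi}
    over [-half_width, half_width]; adequate when f decays rapidly."""
    h = 2 * half_width / steps
    total = 0j
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0  # endpoint weights
        total += weight * f(x) * cmath.exp(-2j * math.pi * x * xi)
    return total * h

a = 1 + 0.5j  # any complex number with positive real part

def f(x):
    """The Gaussian e^{-pi a x^2}."""
    return cmath.exp(-math.pi * a * x * x)

def gaussian_hat(xi):
    """The conjectured transform a^{-1/2} e^{-pi xi^2 / a} (principal branch)."""
    return a ** -0.5 * cmath.exp(-math.pi * xi * xi / a)

for xi in (0.0, 0.7, -1.3):
    print(xi, fourier_transform(f, xi), gaussian_hat(xi))
```

The trapezoid rule is unusually accurate for rapidly decaying analytic integrands such as this one, so the two columns should agree to many digits.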
The Fourier transform has many remarkable properties. On the one hand, as long as the function $f$ is sufficiently “reasonable”, the Fourier transform enjoys a number of very useful identities, such as the Fourier inversion formula $f(x) = \int_{\mathbb{R}} \hat f(\xi) e^{2\pi i x \xi}\ d\xi, \quad (4)$
the Plancherel identity $\int_{\mathbb{R}} |f(x)|^2\ dx = \int_{\mathbb{R}} |\hat f(\xi)|^2\ d\xi, \quad (5)$ and the Poisson summation formula $\sum_{n \in \mathbb{Z}} f(n) = \sum_{k \in \mathbb{Z}} \hat f(k). \quad (6)$ On the other hand, the Fourier transform also intertwines various qualitative properties of a function $f$ with “dual” qualitative properties of its Fourier transform $\hat f$; in particular, “decay” properties of $f$ tend to be associated with “regularity” properties of $\hat f$, and vice versa. For instance, the Fourier transform of a rapidly decreasing function tends to be smooth. There are complex analysis counterparts of this Fourier dictionary, in which “decay” properties are described in terms of exponentially decaying pointwise bounds, and “regularity” properties are expressed using holomorphicity on various strips, half-planes, or the entire complex plane. The following exercise gives some examples of this:
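Exercise 2 (Decay of $f$ implies regularity of $\hat f$) Let $f: \mathbb{R} \to \mathbb{C}$ be an absolutely integrable function.
- (i) If $f$ has super-exponential decay in the sense that $f(x) \lesssim_M e^{-M|x|}$ for all $x \in \mathbb{R}$ and $M > 0$ (that is to say one has $|f(x)| \leq C_M e^{-M|x|}$ for some finite quantity $C_M$ depending only on $M$), then $\hat f$ extends uniquely to an entire function $\hat f: \mathbb{C} \to \mathbb{C}$. Furthermore, this function continues to be defined by (3).
- (ii) If $f$ is supported on a compact interval $[a,b]$ then the entire function $\hat f$ from (i) obeys the bounds $|\hat f(\xi)| \leq \|f\|_{L^1(\mathbb{R})} \max( e^{2\pi a \mathrm{Im}(\xi)}, e^{2\pi b \mathrm{Im}(\xi)} )$ for $\xi \in \mathbb{C}$. In particular, if $f$ is supported in $[-M,M]$ then $|\hat f(\xi)| \leq \|f\|_{L^1(\mathbb{R})} e^{2\pi M |\mathrm{Im}(\xi)|}$.
- (iii) If $f$ obeys the bound $|f(x)| \leq C e^{-2\pi a|x|}$ for all $x \in \mathbb{R}$ and some $C, a > 0$, then $\hat f$ extends uniquely to a holomorphic function $\hat f$ on the horizontal strip $\{ \xi: |\mathrm{Im}(\xi)| < a \}$, and obeys the bound $|\hat f(\xi)| \leq \frac{C}{\pi} \frac{a}{a^2 - (\mathrm{Im}(\xi))^2}$ in this strip. Furthermore, this function continues to be defined by (3).
- (iv) If $f$ is supported on $[0,+\infty)$ (resp. $(-\infty,0]$), then there is a unique continuous extension of $\hat f$ to the lower half-plane $\{ \xi: \mathrm{Im}(\xi) \leq 0 \}$ (resp. the upper half-plane $\{ \xi: \mathrm{Im}(\xi) \geq 0 \}$) which is holomorphic in the interior of this half-plane, and such that $\hat f(\xi) \to 0$ uniformly as $\mathrm{Im}(\xi) \to -\infty$ (resp. $\mathrm{Im}(\xi) \to +\infty$). Furthermore, this function continues to be defined by (3).
Hint: to establish holomorphicity in each of these cases, use Morera’s theorem and the Fubini-Tonelli theorem. For uniqueness, use analytic continuation, or (for part (iv)) the Schwarz reflection principle.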
Later in these notes we will give a partial converse to part (ii) of this exercise, known as the Paley-Wiener theorem; there are also partial converses to the other parts of this exercise.
From (3) we observe the following intertwining property between multiplication by an exponential and complex translation: if $\xi_0$ is a complex number and $f: \mathbb{R} \to \mathbb{C}$ is an absolutely integrable function such that the modulated function $f_{\xi_0}(x) := e^{2\pi i \xi_0 x} f(x)$ is also absolutely integrable, then we have the identity $\widehat{f_{\xi_0}}(\xi) = \hat f(\xi - \xi_0) \quad (7)$
whenever $\xi$ is a complex number such that at least one of the two sides of the equation in (7) is well defined. Thus, multiplication of a function by an exponential weight corresponds (formally, at least) to translation of its Fourier transform. By using contour shifting, we will also obtain a dual relationship: under suitable holomorphicity and decay conditions on $f$, translation by a complex shift will correspond to multiplication of the Fourier transform by an exponential weight. It turns out to be possible to exploit this property to derive many Fourier-analytic identities, such as the inversion formula (4) and the Poisson summation formula (6), which we do later in these notes. (The Plancherel theorem can also be established by complex analytic methods, but this requires a little more effort; see Exercise 8.)
The material in these notes is loosely adapted from Chapter 4 of Stein-Shakarchi’s “Complex Analysis”.
Previous set of notes: 246A Notes 5. Next set of notes: Notes 2.
— 1. Jensen’s formula —
Suppose $f$ is a non-zero rational function $f = P/Q$, then by the fundamental theorem of algebra one can write $f(z) = c \frac{\prod_\rho (z - \rho)}{\prod_\zeta (z - \zeta)}$
for some non-zero constant $c$, where $\rho$ ranges over the zeroes of $P$ (counting multiplicity) and $\zeta$ ranges over the zeroes of $Q$ (counting multiplicity), and assuming $z$ avoids the zeroes of $Q$. Taking absolute values and then logarithms, we arrive at the formula $\log |f(z)| = \log |c| + \sum_\rho \log |z - \rho| - \sum_\zeta \log |z - \zeta|, \quad (1)$ as long as $z$ avoids the zeroes of both $P$ and $Q$. (In this set of notes we use $\log$ for the natural logarithm when applied to a positive real number, and $\mathrm{Log}$ for the standard branch of the complex logarithm (which extends $\log$); the multi-valued complex logarithm $\log$ will only be used in passing.) Alternatively, taking logarithmic derivatives, we arrive at the closely related formula $\frac{f'(z)}{f(z)} = \sum_\rho \frac{1}{z - \rho} - \sum_\zeta \frac{1}{z - \zeta}, \quad (2)$ again for $z$ avoiding the zeroes of both $P$ and $Q$. Thus we see that the zeroes and poles of a rational function $f$ describe the behaviour of that rational function, as well as close relatives of that function such as the log-magnitude $\log |f|$ and log-derivative $\frac{f'}{f}$. We have already seen these sorts of formulae arise in our treatment of the argument principle in 246A Notes 4.
Exercise 1 Let $P(z)$ be a complex polynomial of degree $n \geq 1$.
- (i) (Gauss-Lucas theorem) Show that the complex roots of $P'(z)$ are contained in the closed convex hull of the complex roots of $P(z)$.
- (ii) (Laguerre separation theorem) If all the complex roots of $P(z)$ are contained in a disk $D(z_0,r)$, and $\zeta \not \in D(z_0,r)$, then all the complex roots of $nP(z) + (\zeta - z) P'(z)$ are also contained in $D(z_0,r)$. (Hint: apply a suitable Möbius transformation to move $\zeta$ to infinity, and then apply part (i) to a polynomial that emerges after applying this transformation.)
There are a number of useful ways to extend these formulae to more general meromorphic functions than rational functions. Firstly there is a very handy “local” variant of (1) known as Jensen’s formula:
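Theorem 2 (Jensen’s formula) Let $f$ be a meromorphic function on an open neighbourhood of a disk $\overline{D(z_0,r)} = \{ z: |z - z_0| \leq r \}$, with all removable singularities removed. Then, if $z_0$ is neither a zero nor a pole of $f$, we have $\log |f(z_0)| = \int_0^1 \log |f(z_0 + re^{2\pi i t})|\ dt + \sum_{\rho: |\rho - z_0| \leq r} \log \frac{|\rho - z_0|}{r} - \sum_{\zeta: |\zeta - z_0| \leq r} \log \frac{|\zeta - z_0|}{r} \quad (3)$ where $\rho$ and $\zeta$ range over the zeroes and poles of $f$ respectively (counting multiplicity) in the disk $\overline{D(z_0,r)}$.
One can view (3) as a truncated (or localised) variant of (1). Note also that the summands $\log \frac{|\rho - z_0|}{r}, \log \frac{|\zeta - z_0|}{r}$ are always non-positive.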
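Jensen's formula lends itself to a quick numerical check on rational functions, where the zeroes and poles are known explicitly. The sketch below takes $f(z) = \prod_\rho (z - \rho) / \prod_\zeta (z - \zeta)$, approximates the circle average of $\log|f|$ by an equally spaced Riemann sum, and compares $\log|f(z_0)|$ with the circle average plus $\sum \log(|\rho - z_0|/r)$ over zeroes in the disk minus the analogous sum over poles. The function name `jensen_sides` and the sample zeroes and poles are hypothetical choices; the sketch assumes no zero or pole lies on the circle or at $z_0$.

```python
import cmath
import math

def jensen_sides(zeros, poles, z0=0j, r=1.0, samples=4096):
    """Return (log|f(z0)|, circle average + zero terms - pole terms) for
    f(z) = prod(z - zero) / prod(z - pole)."""
    def log_abs_f(z):
        return (sum(math.log(abs(z - rho)) for rho in zeros)
                - sum(math.log(abs(z - zeta)) for zeta in poles))

    # Equally spaced Riemann sum for the average of log|f| over the circle.
    average = sum(
        log_abs_f(z0 + r * cmath.exp(2j * math.pi * k / samples))
        for k in range(samples)
    ) / samples
    # Only zeroes and poles inside the closed disk contribute.
    zero_terms = sum(math.log(abs(rho - z0) / r)
                     for rho in zeros if abs(rho - z0) <= r)
    pole_terms = sum(math.log(abs(zeta - z0) / r)
                     for zeta in poles if abs(zeta - z0) <= r)
    return log_abs_f(z0), average + zero_terms - pole_terms

lhs, rhs = jensen_sides(zeros=[0.5, 2.0, -1.5j], poles=[0.3j, 3 + 1j])
print(lhs, rhs)
```

With these sample values only the zero at $0.5$ and the pole at $0.3i$ lie in the unit disk, and the two printed numbers should agree up to rounding.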
Proof: By perturbing $r$ slightly if necessary, we may assume that none of the zeroes or poles of $f$ (which form a discrete set) lie on the boundary circle $\{ z: |z - z_0| = r \}$. By translating and rescaling, we may then normalise $z_0 = 0$ and $r = 1$, thus our task is now to show that $\log |f(0)| = \int_0^1 \log |f(e^{2\pi i t})|\ dt + \sum_{\rho: |\rho| < 1} \log |\rho| - \sum_{\zeta: |\zeta| < 1} \log |\zeta|. \quad (4)$
We may remove the poles and zeroes inside the disk $D(0,1)$ by the useful device of Blaschke products. Suppose for instance that $f$ has a zero $\rho$ inside the disk $D(0,1)$. Observe that the function $B_\rho(z) := \frac{\rho - z}{1 - \overline{\rho} z}$ has magnitude $1$ on the unit circle $\{ z: |z| = 1 \}$, equals $\rho$ at the origin, has a simple zero at $\rho$, but has no other zeroes or poles inside the disk. Thus Jensen’s formula (4) already holds if $f$ is replaced by $B_\rho$. To prove (4) for $f$, it thus suffices to prove it for $f/B_\rho$, which effectively deletes a zero $\rho$ inside the disk $D(0,1)$ from $f$ (and replaces it instead with its inversion $1/\overline{\rho}$). Similarly we may remove all the poles inside the disk. As a meromorphic function only has finitely many poles and zeroes inside a compact set, we may thus reduce to the case when $f$ has no poles or zeroes on or inside the disk $D(0,1)$, at which point our goal is simply to show that $\log |f(0)| = \int_0^1 \log |f(e^{2\pi i t})|\ dt.$ Since $f$ has no zeroes or poles inside the disk, it has a holomorphic logarithm $F$ (Exercise 46 of 246A Notes 4). In particular, $\log |f|$ is the real part of $F$. The claim now follows by applying the mean value property (Exercise 17 of 246A Notes 3) to $\log |f|$.
An important special case of Jensen’s formula arises when $f$ is holomorphic in a neighborhood of $\overline{D(z_0,r)}$, in which case there are no contributions from poles and one simply has $\int_0^1 \log |f(z_0 + re^{2\pi i t})|\ dt = \log |f(z_0)| + \sum_{\rho: |\rho - z_0| \leq r} \log \frac{r}{|\rho - z_0|}. \quad (6)$
This is quite a useful formula, mainly because the summands $\log \frac{r}{|\rho - z_0|}$ are non-negative; it can be viewed as a more precise assertion of the subharmonicity of $\log |f|$ (see Exercises 60(ix) and 61 of 246A Notes 5). Here are some quick applications of this formula:
Exercise 3 Use (6) to give another proof of Liouville’s theorem: a bounded holomorphic function on the entire complex plane is necessarily constant.
Exercise 4 Use Jensen’s formula to prove the fundamental theorem of algebra: a complex polynomial of degree has exactly complex zeroes (counting multiplicity), and can thus be factored as for some complex numbers with . (Note that the fundamental theorem was invoked previously in this section, but only for motivational purposes, so the proof here is non-circular.)
Exercise 5 (Shifted Jensen’s formula) Let be a meromorphic function on an open neighbourhood of a disk , with all removable singularities removed. Show that for all in the open disk that are not zeroes or poles of , where and . (The function appearing in the integrand is sometimes known as the Poisson kernel, particularly if one normalises so that and .)
Exercise 6 (Bounded type)
- (i) If is a holomorphic function on that is not identically zero, show that .
- (ii) If is a meromorphic function on that is the ratio of two bounded holomorphic functions that are not identically zero, show that . (Functions of this form are said to be of bounded type and lie in the Nevanlinna class for the unit disk .)
Exercise 7 (Smoothed out Jensen formula) Let be a meromorphic function on an open set , and let be a smooth compactly supported function. Show that where range over the zeroes and poles of (respectively) in the support of . Informally argue why this identity is consistent with Jensen’s formula. (Note: as many of the functions involved here are not holomorphic, complex analysis tools are of limited use. Try using real variable tools such as Stokes’ theorem, Green’s theorem, or integration by parts.)
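As a numerical sanity check of Jensen’s formula in the holomorphic case, here is a minimal Python sketch. It assumes the standard normalisation of the formula: the average of log|f| over the circle of radius r about z0 equals log|f(z0)| plus the sum of log(r/|ρ - z0|) over the zeroes ρ inside the disk. The test polynomial, its list of zeroes `ZEROES`, and the function names are hypothetical choices made purely for illustration.

```python
import cmath
import math

# Hypothetical test function: a monic polynomial with these zeroes.
# Two zeroes lie inside the unit disk and two lie outside it.
ZEROES = [0.5, -0.3 + 0.4j, 1.5j, 2.0 - 1.0j]


def f(z):
    """Evaluate the product of (z - rho) over the chosen zeroes."""
    value = 1.0 + 0.0j
    for rho in ZEROES:
        value *= (z - rho)
    return value


def circle_mean_log_abs(func, z0, r, samples=4096):
    """Approximate the average of log|func| over the circle |z - z0| = r
    using equally spaced sample points (accurate for smooth periodic data)."""
    total = 0.0
    for k in range(samples):
        theta = 2.0 * math.pi * (k + 0.5) / samples
        total += math.log(abs(func(z0 + r * cmath.exp(1j * theta))))
    return total / samples


def jensen_sides(z0=0.0, r=1.0):
    """Return (circle average, log|f(z0)| + sum over zeroes in the disk)."""
    lhs = circle_mean_log_abs(f, z0, r)
    inside = [rho for rho in ZEROES if abs(rho - z0) < r]
    rhs = math.log(abs(f(z0))) + sum(math.log(r / abs(rho - z0)) for rho in inside)
    return lhs, rhs


if __name__ == "__main__":
    left, right = jensen_sides()
    print(left, right)  # the two numbers should agree to many digits
```

Each summand log(r/|ρ - z0|) is non-negative, which is the sign property that makes the holomorphic case so convenient for counting zeroes.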
When applied to entire functions , Jensen’s formula relates the order of growth of near infinity with the density of zeroes of . Here is a typical result:
Proposition 8 Let be an entire function, not identically zero, that obeys a growth bound for some and all . Then there exists a constant such that has at most zeroes (counting multiplicity) for any .
Entire functions that obey a growth bound of the form for every and (where depends on ) are said to be of order at most . The above theorem shows that for such functions that are not identically zero, the number of zeroes in a disk of radius does not grow much faster than . This is often a useful preliminary upper bound on the zeroes of entire functions, as the order of an entire function tends to be relatively easy to compute in practice.
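The zero-counting bound can be illustrated numerically. The sketch below uses the hypothetical example f(z) = cos(πz), an entire function of order one with f(0) = 1 whose zeroes are the half-integers. It compares the true number of zeroes in the disk of radius R with the bound obtained from Jensen’s formula on the circle of radius 2R, using the fact that each zero in the smaller disk contributes at least log 2. The function names and sample counts are illustrative assumptions.

```python
import cmath
import math


def f(z):
    """Hypothetical test function: cos(pi z), with zeroes at k + 1/2."""
    return cmath.cos(math.pi * z)


def mean_log_abs_on_circle(func, radius, samples=8192):
    """Approximate the average of log|func| over the circle |z| = radius."""
    total = 0.0
    for k in range(samples):
        theta = 2.0 * math.pi * (k + 0.5) / samples
        total += math.log(abs(func(radius * cmath.exp(1j * theta))))
    return total / samples


def count_zeroes(R):
    """Exact number of zeroes of cos(pi z) in the open disk of radius R."""
    return sum(1 for k in range(-1000, 1000) if abs(k + 0.5) < R)


def jensen_zero_bound(R):
    """Upper bound on the zero count in D(0, R): every zero there contributes
    at least log 2 to Jensen's formula applied on the circle of radius 2R."""
    mean = mean_log_abs_on_circle(f, 2.0 * R)
    return (mean - math.log(abs(f(0.0)))) / math.log(2.0)


if __name__ == "__main__":
    for R in (2.0, 5.0, 10.0):
        print(R, count_zeroes(R), jensen_zero_bound(R))
```

For this example both the true count and the bound grow linearly in R, consistent with the function having order one.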
Proof: First suppose that is non-zero. From (6) applied with and one has
Every zero in contributes at least to a summand on the right-hand side, while all other zeroes contribute a non-negative quantity, thus where denotes the number of zeroes in . This gives the claim for . When , one can shift by a small amount to make non-zero at the origin (using the fact that zeroes of holomorphic functions not identically zero are isolated), modifying in the process, and then repeating the previous arguments.
Just as (3) and (7) give truncated variants of (1), we can create truncated versions of (2). The following crude truncation is adequate for many applications:
Theorem 9 (Truncated formula for log-derivative) Let be a holomorphic function on an open neighbourhood of a disk that is not identically zero on this disk. Suppose that one has a bound of the form for some and all on the circle . Let be constants. Then one has the approximate formula for all in the disk other than zeroes of . Furthermore, the number of zeroes in the above sum is .
Proof: To abbreviate notation, we allow all implied constants in this proof to depend on .
We mimic the proof of Jensen’s formula. Firstly, we may translate and rescale so that and , so we have when , and our main task is to show that
for . Note that if then vanishes on the unit circle and hence (by the maximum principle) vanishes identically on the disk, a contradiction, so we may assume . From hypothesis we then have on the unit circle, and so from Jensen’s formula (3) we see that In particular we see that the number of zeroes with is , as claimed.
Suppose has a zero with . If we factor , where is the Blaschke product (5), then
Observe from Taylor expansion that the distance between and is , and hence for . Thus we see from (9) that we may use Blaschke products to remove all the zeroes in the annulus while only affecting the left-hand side of (8) by ; also, removing the Blaschke products does not affect on the unit circle, and only affects by thanks to (9). Thus we may assume without loss of generality that there are no zeroes in this annulus.
Similarly, given a zero with , we have , so using Blaschke products to remove all of these zeroes also only affects the left-hand side of (8) by (since the number of zeroes here is ), with also modified by at most . Thus we may assume in fact that has no zeroes whatsoever within the unit disk. We may then also normalise , then for all . By Jensen’s formula again, we have
and thus (by using the identity for any real ) On the other hand, from (7) we have which implies from (10) that and its first derivatives are on the disk . But recall from the proof of Jensen’s formula that is the derivative of a logarithm of , whose real part is . By the Cauchy-Riemann equations for , we conclude that on the disk , as required.
Exercise 10
- (i) (Borel-Carathéodory theorem) If is analytic on an open neighborhood of a disk and , show that (Hint: one can normalise , , , and . Now maps the unit disk to the half-plane . Use a Möbius transformation to map the half-plane to the unit disk and then use the Schwarz lemma.)
- (ii) Use (i) to give an alternate way to conclude the proof of Theorem 9.
A variant of the above argument allows one to make precise the heuristic that holomorphic functions locally look like polynomials:
Exercise 11 (Local Weierstrass factorisation) Let the notation and hypotheses be as in Theorem 9. Then show that for all in the disk , where is a polynomial whose zeroes are precisely the zeroes of in (counting multiplicity) and is a holomorphic function on of magnitude and first derivative on this disk. Furthermore, show that the degree of is .
Exercise 12 (Preliminary Beurling factorisation) Let denote the space of bounded analytic functions on the unit disk; this is a normed vector space with norm
- (i) If is not identically zero, and denote the zeroes of in counting multiplicity, show that and
- (ii) Let the notation be as in (i). If we define the Blaschke product where is the order of vanishing of at zero, show that this product converges absolutely to a holomorphic function on , and that for all . (It may be easier to work with finite Blaschke products first to obtain this bound.)
- (iii) Continuing the notation from (i), establish a factorisation for some holomorphic function with for all .
- (iv) (Theorem of F. and M. Riesz, special case) If extends continuously to the boundary , show that the set has zero measure.
Remark 13 The factorisation (iii) can be refined further, with being the Poisson integral of some finite measure on the unit circle. Using the Lebesgue decomposition of this finite measure into absolutely continuous and singular parts, one ends up factorising functions into “outer functions” and “inner functions”, giving the Beurling factorisation of . There are also extensions to larger spaces than (which are to as is to ), known as Hardy spaces. We will not discuss this topic further here, but see for instance this text of Garnett for a treatment.
Exercise 14 (Littlewood’s lemma) Let be holomorphic on an open neighbourhood of a rectangle for some and , with non-vanishing on the boundary of the rectangle. Show that where ranges over the zeroes of inside (counting multiplicity) and one uses a branch of which is continuous on the upper, lower, and right edges of . (This lemma is a popular tool to explore the zeroes of Dirichlet series such as the Riemann zeta function.)
Just a short announcement that next quarter I will be continuing the recently concluded 246A complex analysis class as 246B. Topics I plan to cover:
- Schwarz-Christoffel transformations and the uniformisation theorem (using the remainder of the 246A notes);
- Jensen’s formula and factorisation theorems (particularly Weierstrass and Hadamard); the Gamma function;
- Connections with the Fourier transform on the real line;
- Elliptic functions and their relatives;
- (if time permits) the Riemann zeta function and the prime number theorem.
Notes for the later material will appear on this blog in due course.
I’ve just uploaded to the arXiv my paper “Sendov’s conjecture for sufficiently high degree polynomials“. This paper is a contribution to an old conjecture of Sendov on the zeroes of polynomials:
Conjecture 1 (Sendov’s conjecture) Let be a polynomial of degree that has all zeroes in the closed unit disk . If is one of these zeroes, then has at least one zero in .
It is common in the literature on this problem to normalise to be monic, and to rotate the zero to be an element of the unit interval . As it turns out, the location of on this unit interval ends up playing an important role in the arguments.
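To make the statement concrete, here is a small Python sketch that tests the conclusion of Sendov’s conjecture on random cubics, a low-degree case already covered by the known results. For a cubic with zeroes a, b, c the derivative is a quadratic, so its two zeroes can be found with the quadratic formula. The sampling scheme, tolerance, and function names are illustrative assumptions and play no role in the paper’s arguments.

```python
import cmath
import math
import random


def critical_points_of_cubic(a, b, c):
    """Zeroes of f'(z) for f(z) = (z - a)(z - b)(z - c).
    Here f'(z) = 3 z^2 - 2 (a + b + c) z + (ab + bc + ca)."""
    s1 = a + b + c
    s2 = a * b + b * c + c * a
    disc = cmath.sqrt(4 * s1 * s1 - 12 * s2)
    return ((2 * s1 + disc) / 6, (2 * s1 - disc) / 6)


def sendov_holds_for_cubic(a, b, c, tol=1e-9):
    """Check that every zero has a critical point within distance 1,
    up to a small floating point tolerance."""
    crit = critical_points_of_cubic(a, b, c)
    return all(min(abs(z - w) for w in crit) <= 1.0 + tol for z in (a, b, c))


def random_point_in_unit_disk(rng):
    """Sample a point uniformly from the closed unit disk."""
    radius = math.sqrt(rng.random())
    angle = 2.0 * math.pi * rng.random()
    return radius * cmath.exp(1j * angle)


def count_failures(trials=2000, seed=0):
    """Number of random cubics violating the conclusion (expected: 0)."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        zeroes = [random_point_in_unit_disk(rng) for _ in range(3)]
        if not sendov_holds_for_cubic(*zeroes):
            failures += 1
    return failures


if __name__ == "__main__":
    print(count_failures())
```

The cube roots of unity give an extremal configuration: both critical points sit at the origin, at distance exactly 1 from every zero, which is why the check allows a small tolerance.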
Many cases of this conjecture are already known, for instance
- When (Brown-Xiang 1999);
- When (Gauss-Lucas theorem);
- When (Bojanov 2011);
- When for a fixed , and is sufficiently large depending on (Dégot 2014);
- When for a sufficiently large absolute constant (Chalebgwa 2020);
- When (Rubinstein 1968; Goodman-Rahman-Ratti 1969; Joyal 1969);
- When , where is sufficiently small depending on (Miller 1993; Vajaitu-Zaharescu 1993);
- When (Chijiwa 2011);
- When (Kasmalkar 2014).
In particular, in high degrees the only cases left uncovered by prior results are when is close (but not too close) to , or when is close (but not too close) to ; see Figure 1 of my paper.
Our main result covers the high degree case uniformly for all values of :
Theorem 2 There exists an absolute constant such that Sendov’s conjecture holds for all .
In principle, this reduces the verification of Sendov’s conjecture to a finite time computation, although our arguments use compactness methods and thus do not easily provide an explicit value of . I believe that the compactness arguments can be replaced with quantitative substitutes that provide an explicit , but the value of produced is likely to be extremely large (certainly much larger than ).
Because of the previous results (particularly those of Chalebgwa and Chijiwa), we will only need to establish the following two subcases of the above theorem:
Theorem 3 (Sendov’s conjecture near the origin) Under the additional hypothesis , Sendov’s conjecture holds for sufficiently large .
Theorem 4 (Sendov’s conjecture near the unit circle) Under the additional hypothesis for a fixed , Sendov’s conjecture holds for sufficiently large .
We approach these theorems using the “compactness and contradiction” strategy, assuming that there is a sequence of counterexamples whose degrees go to infinity, using various compactness theorems to extract various asymptotic objects in the limit , and somehow using these objects to derive a contradiction. There are many ways to effect such a strategy; we will use a formalism that I call “cheap nonstandard analysis” and which is common in the PDE literature, in which one repeatedly passes to subsequences as necessary whenever one invokes a compactness theorem to create a limit object. However, the particular choice of asymptotic formalism one selects is not of essential importance for the arguments.
I also found it useful to use the language of probability theory. Given a putative counterexample to Sendov’s conjecture, let be a zero of (chosen uniformly at random among the zeroes of , counting multiplicity), and let similarly be a uniformly random zero of . We introduce the logarithmic potentials
and the Stieltjes transforms Standard calculations using the fundamental theorem of algebra yield the basic identities and and in particular the random variables are linked to each other by the identity On the other hand, the hypotheses of Sendov’s conjecture (and the Gauss-Lucas theorem) place inside the unit disk . Applying Prokhorov’s theorem, and passing to a subsequence, one can then assume that the random variables converge in distribution to some limiting random variables (possibly defined on a different probability space than the original variables ), also living almost surely inside the unit disk. Standard potential theory then gives the convergence and at least in the local sense. Among other things, we then conclude from the identity (2) and some elementary inequalities that for all . This turns out to have an appealing interpretation in terms of Brownian motion: if one takes two Brownian motions in the complex plane, one originating from and one originating from , then the location where these Brownian motions first exit the unit disk will have the same distribution. (In our paper we actually replace Brownian motion with the closely related formalism of balayage.) This turns out to connect the random variables , quite closely to each other. In particular, with this observation and some additional arguments involving both the unique continuation property for harmonic functions and Grace’s theorem (discussed in this previous post), with the latter drawn from the prior work of Dégot, we can get very good control on these distributions:
Theorem 5
- (i) If , then almost surely lie in the semicircle and have the same distribution.
- (ii) If , then is uniformly distributed on the circle , and is almost surely zero.
In case (i) (and strengthening the hypothesis to to control some technical contributions of “outlier” zeroes of ), we can use this information about and (4) to ensure that the normalised logarithmic derivative has a non-negative winding number in a certain small (but not too small) circle around the origin, which by the argument principle is inconsistent with the hypothesis that has a zero at and that has no zeroes near . This is how we establish Theorem 3.
Case (ii) turns out to be more delicate. This is because there are a number of “near-counterexamples” to Sendov’s conjecture that are compatible with the hypotheses and conclusion of case (ii). The simplest such example is , where the zeroes of are uniformly distributed amongst the roots of unity (including at ), and the zeroes of are all located at the origin. In my paper I also discuss a variant of this construction, in which has zeroes mostly near the origin, but also acquires a bounded number of zeroes at various locations inside the unit disk. Specifically, we take
where for some constants and By a perturbative analysis to locate the zeroes of , one eventually would be able to arrive at a true counterexample to Sendov’s conjecture if these locations were in the open lune and if one had the inequality for all . However, if one takes the mean of this inequality in , one arrives at the inequality which is incompatible with the hypotheses and . In order to extend this argument to more general polynomials , we require a stability analysis of the endpoint equation where we now only assume the closed conditions and . The above discussion then places all the zeros on the arc and if one also takes the second Fourier coefficient of (6) one also obtains the vanishing second moment These two conditions are incompatible with each other (except in the degenerate case when all the vanish), because all the non-zero elements of the arc (7) have argument in , so in particular their square will have negative real part. It turns out that one can adapt this argument to the more general potential counterexamples to Sendov’s conjecture (in the form of Theorem 4). The starting point is to use (1), (4), and Theorem 5(ii) to obtain good control on , which one then integrates and exponentiates to get good control on , and then on a second integration one gets enough information about to pin down the location of its zeroes to high accuracy. The constraint that these zeroes lie inside the unit disk then gives an inequality resembling (5), and an adaptation of the above stability analysis is then enough to conclude. 
The arguments here are inspired by the previous arguments of Miller, which treated the case when was extremely close to via a similar perturbative analysis; the main novelty is to control the error terms not in terms of the magnitude of the largest zero of (which is difficult to manage when gets large), but rather by the variance of those zeroes, which ends up being a more tractable expression to keep track of.
Consider a disk in the complex plane. If one applies an affine-linear map to this disk, one obtains
For maps that are merely holomorphic instead of affine-linear, one has some variants of this assertion, which I am recording here mostly for my own reference:
Theorem 1 (Holomorphic images of disks) Let be a disk in the complex plane, and be a holomorphic function with .
- (i) (Open mapping theorem or inverse function theorem) contains a disk for some . (In fact there is even a holomorphic right inverse of from to .)
- (ii) (Bloch theorem) contains a disk for some absolute constant and some . (In fact there is even a holomorphic right inverse of from to .)
- (iii) (Koebe quarter theorem) If is injective, then contains the disk .
- (iv) If is a polynomial of degree , then contains the disk .
- (v) If one has a bound of the form for all and some , then contains the disk for some absolute constant . (In fact there is even a holomorphic right inverse of from to .)
Parts (i), (ii), (iii) of this theorem are standard, as indicated by the given links. I found part (iv) as (a consequence of) Theorem 2 of this paper of Dégot, who remarks that it “seems not already known in spite of its simplicity”; an equivalent form of this result also appears in Lemma 4 of this paper of Miller. The proof is simple:
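Part (iv) lends itself to a quick numerical sanity check. The sketch below assumes the statement in the form: for a polynomial f of degree n, the image of the disk of radius r about z0 contains the disk of radius |f'(z0)| r / n about f(z0). It tests this for random quadratics (n = 2), where preimages can be found with the quadratic formula. The coefficient ranges, the sampling of target points, and the function names are hypothetical.

```python
import cmath
import math
import random


def nearest_preimage_distance(b, c, z0, w):
    """For f(z) = z^2 + b z + c, return the distance from z0 to the
    nearest solution of f(z) = w."""
    disc = cmath.sqrt(b * b - 4 * (c - w))
    roots = ((-b + disc) / 2, (-b - disc) / 2)
    return min(abs(z - z0) for z in roots)


def random_complex(rng, scale=2.0):
    """A random complex number with real and imaginary parts in [-scale, scale]."""
    return complex(rng.uniform(-scale, scale), rng.uniform(-scale, scale))


def count_violations(trials=2000, seed=0, r=1.0, n=2):
    """Count sampled targets w in D(f(z0), |f'(z0)| r / n) that have no
    preimage within distance r of z0 (expected: 0)."""
    rng = random.Random(seed)
    violations = 0
    for _ in range(trials):
        b, c, z0 = random_complex(rng), random_complex(rng), random_complex(rng)
        fz0 = z0 * z0 + b * z0 + c
        dfz0 = 2 * z0 + b
        radius = abs(dfz0) * r / n
        # Sample w strictly inside the target disk.
        rho = 0.999 * radius * math.sqrt(rng.random())
        w = fz0 + rho * cmath.exp(2j * math.pi * rng.random())
        if nearest_preimage_distance(b, c, z0, w) > r:
            violations += 1
    return violations


if __name__ == "__main__":
    print(count_violations())
```

The example f(z) = z^2 with z0 = 1 and r = 1 shows why no larger disk can be expected in general: the predicted disk has radius 1 about the point 1, and the origin, which is not attained on the open disk, lies exactly on its boundary.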
Proof: (Proof of (iv)) Let , then we have a lower bound for the log-derivative of at :
(with the convention that the left-hand side is infinite when ). But by the fundamental theorem of algebra we have where are the roots of the polynomial (counting multiplicity). By the pigeonhole principle, there must therefore exist a root of such that and hence . Thus contains , and the claim follows.
The constant in (iv) is completely sharp: if and is non-zero then contains the disk
but avoids the origin, thus does not contain any disk of the form . This example also shows that despite parts (ii), (iii) of the theorem, one cannot hope for a general inclusion of the form for an absolute constant .
Part (v) is implicit in the standard proof of Bloch’s theorem (part (ii)), and is easy to establish:
Proof: (Proof of (v)) From the Cauchy inequalities one has for , hence by Taylor’s theorem with remainder for . By Rouche’s theorem, this implies that the function has a unique zero in for any , if is a sufficiently small absolute constant. The claim follows.
Note that part (v) implies part (i). A standard point picking argument also lets one deduce part (ii) from part (v):
Proof: (Proof of (ii)) By shrinking slightly if necessary we may assume that extends analytically to the closure of the disk . Let be the constant in (v) with ; we will prove (ii) with replaced by . If we have for all then we are done by (v), so we may assume without loss of generality that there is such that . If for all then by (v) we have
and we are again done. Hence we may assume without loss of generality that there is such that . Iterating this procedure in the obvious fashion we either are done, or obtain a Cauchy sequence in such that goes to infinity as , which contradicts the analytic nature of (and hence continuous nature of ) on the closure of . This gives the claim.
Here is another classical result stated by Alexander (and then proven by Kakeya and by Szegő, but also implied by a classical theorem of Grace and Heawood) that is broadly compatible with parts (iii), (iv) of the above theorem:
Proposition 2 Let be a disk in the complex plane, and be a polynomial of degree with for all . Then is injective on .
The radius is best possible, for the polynomial has non-vanishing on , but one has , and lie on the boundary of .
If one narrows slightly to then one can quickly prove this proposition as follows. Suppose for contradiction that there exist distinct with , thus if we let be the line segment contour from to then . However, by assumption we may factor where all the lie outside of . Elementary trigonometry then tells us that the argument of only varies by less than as traverses , hence the argument of only varies by less than . Thus takes values in an open half-plane avoiding the origin and so it is not possible for to vanish.
To recover the best constant of requires some effort. By taking contrapositives and applying an affine rescaling and some trigonometry, the proposition can be deduced from the following result, known variously as the Grace-Heawood theorem or the complex Rolle theorem.
Proposition 3 (Grace-Heawood theorem) Let be a polynomial of degree such that . Then contains a zero in the closure of .
This is in turn implied by a remarkable and powerful theorem of Grace (which we shall prove shortly). Given two polynomials of degree at most , define the apolar form by
Theorem 4 (Grace’s theorem) Let be a circle or line in , dividing into two open connected regions . Let be two polynomials of degree at most , with all the zeroes of lying in and all the zeroes of lying in . Then .
(Contrapositively: if , then the zeroes of cannot be separated from the zeroes of by a circle or line.)
Indeed, a brief calculation reveals the identity
where is the degree polynomial The zeroes of are for , so the Grace-Heawood theorem follows by applying Grace’s theorem with equal to the boundary of .
The same method of proof gives the following nice consequence:
Theorem 5 (Perpendicular bisector theorem) Let be a polynomial such that for some distinct . Then the zeroes of cannot all lie on one side of the perpendicular bisector of . For instance, if , then the zeroes of cannot all lie in the halfplane or the halfplane .
I’d be interested in seeing a proof of this latter theorem that did not proceed via Grace’s theorem.
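For readers who want to experiment with the apolar form, here is a minimal Python sketch. It assumes the derivative-based normalisation in which the apolar form of f and g is the alternating sum over k = 0, ..., n of (-1)^k f^(k)(0) g^(n-k)(0); other references normalise by binomial coefficients instead, which only changes the form by a constant factor. The sketch checks one instance of Grace’s theorem (zeroes separated by the unit circle give a non-zero form) and checks numerically that shifting both polynomials by the same translation leaves the form unchanged. All function names and sample zeroes are hypothetical.

```python
import math


def poly_from_zeroes(zeroes):
    """Coefficients (lowest degree first) of the monic polynomial
    with the given zeroes."""
    coeffs = [1 + 0j]
    for rho in zeroes:
        new = [0j] * (len(coeffs) + 1)
        for i, ci in enumerate(coeffs):
            new[i] += -rho * ci   # contribution of the factor's constant term
            new[i + 1] += ci      # contribution of the factor's z term
        coeffs = new
    return coeffs


def apolar_form(a, b, n):
    """Alternating sum of (-1)^k f^(k)(0) g^(n-k)(0) for k = 0..n, where a, b
    are coefficient lists of length n + 1 and f^(k)(0) = k! * a[k]."""
    return sum(
        (-1) ** k * math.factorial(k) * a[k] * math.factorial(n - k) * b[n - k]
        for k in range(n + 1)
    )


def shift(coeffs, h):
    """Coefficients of the translated polynomial z -> f(z - h)."""
    out = [0j] * len(coeffs)
    for k, ak in enumerate(coeffs):
        for j in range(k + 1):
            out[j] += ak * math.comb(k, j) * (-h) ** (k - j)
    return out


# Zeroes of f inside the unit circle, zeroes of g outside it.
F = poly_from_zeroes([0.2, -0.3j, 0.5 + 0.1j])
G = poly_from_zeroes([2.0, -3.0j, 1.5 + 1.5j])

if __name__ == "__main__":
    print(apolar_form(F, G, 3))                      # non-zero, as Grace predicts
    h = 0.7 - 0.4j
    print(apolar_form(shift(F, h), shift(G, h), 3))  # same value after translation
```

Because the form is alternating, a polynomial of odd degree n is always apolar to itself, which gives a cheap source of examples for the contrapositive formulation.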
Now we give a proof of Grace’s theorem. The case can be established by direct computation, so suppose inductively that and that the claim has already been established for . Given the involvement of circles and lines it is natural to suspect that a Möbius transformation symmetry is involved. This is indeed the case and can be made precise as follows. Let denote the vector space of polynomials of degree at most , then the apolar form is a bilinear form . Each translation on the complex plane induces a corresponding map on , mapping each polynomial to its shift . We claim that the apolar form is invariant with respect to these translations:
Taking derivatives in , it suffices to establish the skew-adjointness relation but this is clear from the alternating form of (1).
Next, we see that the inversion map also induces a corresponding map on , mapping each polynomial to its inversion . From (1) we see that this map also (projectively) preserves the apolar form:
More generally, the group of Möbius transformations on the Riemann sphere acts projectively on , with each Möbius transformation mapping each to , where is the unique (up to constants) rational function that makes this a map from to (its divisor is ). Since the Möbius transformations are generated by translations and inversion, we see that the action of Möbius transformations projectively preserves the apolar form; also, we see this action of on also moves the zeroes of each by (viewing polynomials of degree less than in as having zeroes at infinity). In particular, the hypotheses and conclusions of Grace’s theorem are preserved by this Möbius action. We can then apply such a transformation to move one of the zeroes of to infinity (thus making a polynomial of degree ), so that must now be a circle, with the zeroes of inside the circle and the remaining zeroes of outside the circle. But then By the Gauss-Lucas theorem, the zeroes of are also inside . The claim now follows from the induction hypothesis.
Dimitri Shlyakhtenko and I have uploaded to the arXiv our paper Fractional free convolution powers. For me, this project (which we started during the 2018 IPAM program on quantitative linear algebra) was motivated by a desire to understand the behavior of the minor process applied to a large random Hermitian matrix , in which one takes the successive upper left minors of and computes their eigenvalues in non-decreasing order. These eigenvalues are related to each other by the Cauchy interlacing inequalities
for , and are often arranged in a triangular array known as a Gelfand-Tsetlin pattern, as discussed in these previous blog posts.
When is large and the matrix is a random matrix with empirical spectral distribution converging to some compactly supported probability measure on the real line, then under suitable hypotheses (e.g., unitary conjugation invariance of the random matrix ensemble ), a “concentration of measure” effect occurs, with the spectral distribution of the minors for for any fixed converging to a specific measure that depends only on and . The reason for this notation is that there is a surprising description of this measure when is a natural number, namely it is the free convolution of copies of , pushed forward by the dilation map . For instance, if is the Wigner semicircular measure , then . At the random matrix level, this reflects the fact that the minor of a GUE matrix is again a GUE matrix (up to a renormalizing constant).
As first observed by Bercovici and Voiculescu and developed further by Nica and Speicher, among other authors, the notion of a free convolution power of can be extended to non-integer , thus giving the notion of a “fractional free convolution power”. This notion can be defined in several different ways. One of them proceeds via the Cauchy transform
of the measure , and can be defined by solving the Burgers-type equation with initial condition (see this previous blog post for a derivation). This equation can be solved explicitly using the -transform of , defined by solving the equation for sufficiently large , in which case one can show that (In the case of the semicircular measure , the -transform is simply the identity: .)
Nica and Speicher also gave a free probability interpretation of the fractional free convolution power: if is a noncommutative random variable in a noncommutative probability space with distribution , and is a real projection operator free of with trace , then the “minor” of (viewed as an element of a new noncommutative probability space whose elements are minors , with trace ) has the law of (we give a self-contained proof of this in an appendix to our paper). This suggests that the minor process (or fractional free convolution) can be studied within the framework of free probability theory.
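The remark that the semicircular measure has the identity as its R-transform can be checked numerically. The sketch below makes several normalisation assumptions that should be compared against the paper: the Cauchy transform is taken to be G(z) = ∫ dμ(x)/(z - x), the R-transform is taken to satisfy 1/G(z) + R(G(z)) = z, and the semicircular measure is taken to have density sqrt(4 - x^2)/(2π) on [-2, 2]. Under these assumptions, R being the identity is equivalent to 1/G(z) + G(z) = z. The function names are hypothetical.

```python
import math


def cauchy_transform_semicircle(z, samples=2000):
    """Approximate G(z), the integral of dmu(x) / (z - x), for the semicircular
    density sqrt(4 - x^2) / (2 pi) on [-2, 2]. The substitution x = 2 cos(t)
    turns dmu into (2 / pi) sin(t)^2 dt on [0, pi], a smooth integrand that
    the midpoint rule handles very accurately."""
    total = 0j
    for k in range(samples):
        t = math.pi * (k + 0.5) / samples
        total += math.sin(t) ** 2 / (z - 2.0 * math.cos(t))
    return (2.0 / math.pi) * total * (math.pi / samples)


def r_transform_residual(z):
    """Residual of the relation 1/G(z) + R(G(z)) = z with R the identity."""
    g = cauchy_transform_semicircle(z)
    return 1.0 / g + g - z


if __name__ == "__main__":
    for z in (3.0 + 1.0j, 0.5 + 2.0j, 10.0 + 0.0j):
        print(z, abs(r_transform_residual(z)))  # residuals should be tiny
```

The evaluation points are kept away from the support [-2, 2], where the integrand becomes singular and this simple quadrature would lose accuracy.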
One of the known facts about integer free convolution powers is monotonicity of the free entropy
and free Fisher information which were introduced by Voiculescu as free probability analogues of the classical probability concepts of differential entropy and classical Fisher information. (Here we correct a small typo in the normalization constant of Fisher entropy as presented in Voiculescu’s paper.) Namely, it was shown by Shylakhtenko that the quantity is monotone non-decreasing for integer , and the Fisher information is monotone non-increasing for integer . This is the free probability analogue of the corresponding monotonicities for differential entropy and classical Fisher information that was established by Artstein, Ball, Barthe, and Naor, answering a question of Shannon.Our first main result is to extend the monotonicity results of Shylakhtenko to fractional . We give two proofs of this fact, one using free probability machinery, and a more self contained (but less motivated) proof using integration by parts and contour integration. The free probability proof relies on the concept of the free score of a noncommutative random variable, which is the analogue of the classical score. The free score, also introduced by Voiculescu, can be defined by duality as measuring the perturbation with respect to semicircular noise, or more precisely
whenever is a polynomial and is a semicircular element free of . If has an absolutely continuous law for a sufficiently regular , one can calculate explicitly as , where is the Hilbert transform of , and the Fisher information is given by the formula One can also define a notion of relative free score relative to some subalgebra of noncommutative random variables.The free score interacts very well with the free minor process , in particular by standard calculations one can establish the identity
whenever is a noncommutative random variable, is an algebra of noncommutative random variables, and is a real projection of trace that is free of both and . The monotonicity of free Fisher information then follows from an application of Pythagoras’s theorem (which implies in particular that conditional expectation operators are contractions on ). The monotonicity of free entropy then follows from an integral representation of free entropy as an integral of free Fisher information along the free Ornstein-Uhlenbeck process (or equivalently, free Fisher information is essentially the rate of change of free entropy with respect to perturbation by semicircular noise). The argument also shows when equality holds in the monotonicity inequalities; this occurs precisely when is a semicircular measure up to affine rescaling.After an extensive amount of calculation of all the quantities that were implicit in the above free probability argument (in particular computing the various terms involved in the application of Pythagoras’ theorem), we were able to extract a self-contained proof of monotonicity that relied on differentiating the quantities in and using the differential equation (1). It turns out that if for sufficiently regular , then there is an identity
where is the kernel and . It is not difficult to show that is a positive semi-definite kernel, which gives the required monotonicity. It would be interesting to obtain some more insightful interpretation of the kernel and the identity (2).These monotonicity properties hint at the minor process being associated to some sort of “gradient flow” in the parameter. We were not able to formalize this intuition; indeed, it is not clear what a gradient flow on a varying noncommutative probability space even means. However, after substantial further calculation we were able to formally describe the minor process as the Euler-Lagrange equation for an intriguing Lagrangian functional that we conjecture to have a random matrix interpretation. We first work in “Lagrangian coordinates”, defining the quantity on the “Gelfand-Tsetlin pyramid”
by the formula which is well defined if the density of is sufficiently well behaved. The random matrix interpretation of is that it is the asymptotic location of the eigenvalue of the upper left minor of a random matrix with asymptotic empirical spectral distribution and with unitarily invariant distribution, thus is in some sense a continuum limit of Gelfand-Tsetlin patterns. Thus for instance the Cauchy interlacing laws in this asymptotic limit regime become After a lengthy calculation (involving extensive use of the chain rule and product rule), the equation (1) is equivalent to the Euler-Lagrange equation where is the Lagrangian density Thus the minor process is formally a critical point of the integral . The quantity measures the mean eigenvalue spacing at some location of the Gelfand-Tsetlin pyramid, and the ratio measures mean eigenvalue drift in the minor process. This suggests that this Lagrangian density is some sort of measure of entropy of the asymptotic microscale point process emerging from the minor process at this spacing and drift. There is work of Metcalfe demonstrating that this point process is given by the Boutillier bead model, so we conjecture that this Lagrangian density somehow measures the entropy density of this process.Kari Astala, Steffen Rohde, Eero Saksman and I have (finally!) uploaded to the arXiv our preprint “Homogenization of iterated singular integrals with applications to random quasiconformal maps“. This project started (and was largely completed) over a decade ago, but for various reasons it was not finalised until very recently. The motivation for this project was to study the behaviour of “random” quasiconformal maps. Recall that a (smooth) quasiconformal map is a homeomorphism that obeys the Beltrami equation
for some Beltrami coefficient ; this can be viewed as a deformation of the Cauchy-Riemann equation . Assuming that is asymptotic to at infinity, one can (formally, at least) solve for in terms of using the Beurling transform by the Neumann series We looked at the question of the asymptotic behaviour of if is a random field that oscillates at some fine spatial scale . A simple model to keep in mind is where are independent random signs and is a bump function. For models such as these, we show that a homogenisation occurs in the limit ; each multilinear expression converges weakly in probability (and almost surely, if we restrict to a lacunary sequence) to a deterministic limit, and the associated quasiconformal map similarly converges weakly in probability (or almost surely). (Results of this latter type were also recently obtained by Ivrii and Markovic by a more geometric method which is simpler, but is applied to a narrower class of Beltrami coefficients.) In the specific case (1), the limiting quasiconformal map is just the identity map , but if for instance one replaces the by non-symmetric random variables then one can have significantly more complicated limits. The convergence theorem for multilinear expressions such as is not specific to the Beurling transform ; any other translation and dilation invariant singular integral can be used here.
The random expression (2) is somewhat reminiscent of a moment of a random matrix, and one can start computing it analogously. For instance, if one has a decomposition such as (1), then (2) expands out as a sum
The random fluctuations of this sum can be treated by a routine second moment estimate, and the main task is to show that the expected value becomes asymptotically independent of .
If all the were distinct then one could use independence to factor the expectation to get
which is a relatively straightforward expression to calculate (particularly in the model (1), where all the expectations here in fact vanish). The main difficulty is that there are a number of configurations in (3) in which various of the collide with each other, preventing one from easily factoring the expression. A typical problematic contribution for instance would be a sum of the form This is an example of what we call a non-split sum. This can be compared with the split sum If we ignore the constraint in the latter sum, then it splits into where and and one can hope to treat this sum by an induction hypothesis. (To actually deal with constraints such as requires an inclusion-exclusion argument that creates some notational headaches but is ultimately manageable.) As the name suggests, the non-split configurations such as (4) cannot be factored in this fashion, and are the most difficult to handle. A direct computation using the triangle inequality (and a certain amount of combinatorics and induction) reveals that these sums are somewhat localised, in that dyadic portions such as exhibit power decay in (when measured in suitable function space norms), basically because of the large number of times one has to transition back and forth between and . Thus, morally at least, the dominant contribution to a non-split sum such as (4) comes from the local portion when . From the translation and dilation invariance of this type of expression then simplifies to something like (plus negligible errors) for some reasonably decaying function , and this can be shown to converge to a weak limit as .
In principle all of these limits are computable, but the combinatorics is remarkably complicated, and while there is certainly some algebraic structure to the calculations, it does not seem to be easily describable in terms of an existing framework (e.g., that of free probability).
A useful rule of thumb in complex analysis is that holomorphic functions behave like large degree polynomials . This can be evidenced for instance at a “local” level by the Taylor series expansion for a complex analytic function in the disk, or at a “global” level by factorisation theorems such as the Weierstrass factorisation theorem (or the closely related Hadamard factorisation theorem). One can truncate these theorems in a variety of ways (e.g., Taylor’s theorem with remainder) to be able to approximate a holomorphic function by a polynomial on various domains.
In some cases it can be convenient instead to work with polynomials of another variable such as (or more generally for a scaling parameter ). In the case of the Riemann zeta function, defined by meromorphic continuation of the formula
one ends up having the following heuristic approximation in the neighbourhood of a point on the critical line:
Heuristic 1 (Polynomial approximation) Let be a height, let be a “typical” element of , and let be an integer. Let be the linear change of variables
The requirement is necessary since the right-hand side is periodic with period in the variable (or period in the variable), whereas the zeta function is not expected to have any such periodicity, even approximately.
Let us give two non-rigorous justifications of this heuristic. Firstly, it is standard that inside the critical strip (with ) we have an approximate form
of (11). If we group the integers from to into bins depending on what powers of they lie between, we thus have
For with and we heuristically have
and so
where are the partial Dirichlet series
This gives the desired polynomial approximation.
A second non-rigorous justification is as follows. From factorisation theorems such as the Hadamard factorisation theorem we expect to have
where runs over the non-trivial zeroes of , and there are some additional factors arising from the trivial zeroes and poles of which we will ignore here; we will also completely ignore the issue of how to renormalise the product to make it converge properly. In the region , the dominant contribution to this product (besides multiplicative constants) should arise from zeroes that are also in this region. The Riemann-von Mangoldt formula suggests that for “typical” one should have about such zeroes. If one lets be any enumeration of zeroes closest to , and then repeats this set of zeroes periodically by period , one then expects to have an approximation of the form
again ignoring all issues of convergence. If one writes and , then Euler’s famous product formula for sine basically gives
(here we are glossing over some technical issues regarding renormalisation of the infinite products, which can be dealt with by studying the asymptotics as ) and hence we expect
This again gives the desired polynomial approximation.
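As a quick numerical sanity check on the one classical ingredient of this second justification, the following sketch compares a truncated form of Euler's product for sine, $\sin(\pi z) = \pi z \prod_{n \geq 1} (1 - z^2/n^2)$, against the exact value. The normalisation used in the argument above is not visible in the text, so the standard form is assumed here; the helper name `sine_product` and the test point `z` are purely illustrative.

```python
import numpy as np

def sine_product(z, num_factors):
    """Truncated Euler product: pi * z * prod_{n=1}^{K} (1 - z^2 / n^2)."""
    n = np.arange(1, num_factors + 1, dtype=float)
    return np.pi * z * np.prod(1.0 - (z * z) / (n * n))

# Illustrative test point (an assumption, not taken from the text).
z = 0.3 + 0.2j

approx = sine_product(z, 200_000)
exact = np.sin(np.pi * z)

# The omitted tail of the product is roughly exp(-z^2 / K), so the
# relative error decays only like |z|^2 / K in the truncation level K.
rel_err = abs(approx - exact) / abs(exact)
print(f"relative error with 200000 factors: {rel_err:.2e}")
```

The slow, power-type decay of the truncation error is one concrete reason why the renormalisation issues mentioned above cannot be ignored in a rigorous treatment.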
Below the fold we give a rigorous version of the second argument suitable for “microscale” analysis. More precisely, we will show
Theorem 2 Let be an integer going sufficiently slowly to infinity. Let go to zero sufficiently slowly depending on . Let be drawn uniformly at random from . Then with probability (in the limit ), and possibly after adjusting by , there exists a polynomial of degree and obeying the functional equation (9) below, such that
It should be possible to refine the arguments to extend this theorem to the mesoscale setting by letting be anything growing like , and anything growing like ; also we should be able to delete the need to adjust by . We have not attempted these optimisations here.
Many conjectures and arguments involving the Riemann zeta function can be heuristically translated into arguments involving the polynomials , which one can view as random degree polynomials if is interpreted as a random variable drawn uniformly at random from . These can be viewed as providing a “toy model” for the theory of the Riemann zeta function, in which the complex analysis is simplified to the study of the zeroes and coefficients of this random polynomial (for instance, the role of the gamma function is now played by a monomial in ). This model also makes the zeta function theory more closely resemble the function field analogues of this theory (in which the analogue of the zeta function is also a polynomial (or a rational function) in some variable , as per the Weil conjectures). The parameter is at our disposal to choose, and reflects the scale at which one wishes to study the zeta function. For “macroscopic” questions, at which one wishes to understand the zeta function at unit scales, it is natural to take (or very slightly larger), while for “microscopic” questions one would take close to and only growing very slowly with . For the intermediate “mesoscopic” scales one would take somewhere between and . Unfortunately, the statistical properties of are only understood well at a conjectural level at present; even if one assumes the Riemann hypothesis, our understanding of is largely restricted to the computation of low moments (e.g., the second or fourth moments) of various linear statistics of and related functions (e.g., , , or ).
Let’s now heuristically explore the polynomial analogues of this theory in a bit more detail. The Riemann hypothesis basically corresponds to the assertion that all the zeroes of the polynomial lie on the unit circle (which, after the change of variables , corresponds to being real); in a similar vein, the GUE hypothesis corresponds to having the asymptotic law of a random scalar times the characteristic polynomial of a random unitary matrix. Next, we consider what happens to the functional equation
A routine calculation involving Stirling’s formula reveals that
with ; one also has the closely related approximation
when . Since , applying (5) with and using the approximation (2) suggests a functional equation for :
where is the polynomial with all the coefficients replaced by their complex conjugate. Thus if we write
then the functional equation can be written as
We remark that if we use the heuristic (3) (interpreting the cutoffs in the summation in a suitably vague fashion) then this equation can be viewed as an instance of the Poisson summation formula.
Another consequence of the functional equation is that the zeroes of are symmetric with respect to inversion across the unit circle. This is of course consistent with the Riemann hypothesis, but does not obviously imply it. The phase is of little consequence in this functional equation; one could easily conceal it by working with the phase rotation of instead.
One consequence of the functional equation is that is real for any ; the same is then true for the derivative . Among other things, this implies that cannot vanish unless does also; thus the zeroes of will not lie on the unit circle except where has repeated zeroes. The analogous statement is true for ; the zeroes of will not lie on the critical line except where has repeated zeroes.
Relating to this fact, it is a classical result of Speiser that the Riemann hypothesis is true if and only if all the zeroes of the derivative of the zeta function in the critical strip lie on or to the right of the critical line. The analogous result for polynomials is
Proposition 3 We have
(where all zeroes are counted with multiplicity.) In particular, the zeroes of all lie on the unit circle if and only if the zeroes of lie in the closed unit disk.
Proof: From the functional equation we have
Thus it will suffice to show that and have the same number of zeroes outside the closed unit disk.
Set , then is a rational function that does not have a zero or pole at infinity. For not a zero of , we have already seen that and are real, so on dividing we see that is always real, that is to say
(This can also be seen by writing , where runs over the zeroes of , and using the fact that these zeroes are symmetric with respect to reflection across the unit circle.) When is a zero of , has a simple pole at with residue a positive multiple of , and so stays in the right half-plane if one traverses a semicircular arc around outside the unit disk. From this and continuity we see that stays in the right half-plane on a circle slightly larger than the unit circle, and hence by the argument principle it has the same number of zeroes and poles outside of this circle, giving the claim.
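The key step of the proof, that the polynomial and its derivative have the same number of zeroes outside the closed unit disk, is easy to test numerically. The sketch below is only an illustration under stated assumptions: the polynomial is a hypothetical example built directly from a zero set that is symmetric under inversion across the unit circle (two pairs $\rho, 1/\overline{\rho}$ plus three zeroes on the circle), rather than from the zeta function, and the tolerance `tol` is an arbitrary choice to absorb rounding error for zeroes on the circle.

```python
import numpy as np

def count_outside_unit_disk(coeffs, tol=1e-6):
    """Count zeroes (with multiplicity) of a polynomial with |z| > 1 + tol."""
    roots = np.roots(coeffs)
    return int(np.sum(np.abs(roots) > 1.0 + tol))

# Hypothetical zero set, symmetric under the inversion z -> 1 / conj(z).
outside = [2.0 + 0.0j, 1.5 * np.exp(1j)]
on_circle = [np.exp(0.3j), np.exp(2.0j), np.exp(-1.2j)]

zeros = []
for rho in outside:
    zeros += [rho, 1.0 / np.conj(rho)]  # each outside zero and its mirror image
zeros += on_circle

p = np.poly(zeros)   # monic polynomial with these zeroes (degree 7)
dp = np.polyder(p)   # its derivative (degree 6)

n_p = count_outside_unit_disk(p)
n_dp = count_outside_unit_disk(dp)
print(n_p, n_dp)
```

Moving one of the circle zeroes off the circle without adding its mirror image breaks the inversion symmetry, and the two counts are then no longer forced to agree.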
From the functional equation and the chain rule, is a zero of if and only if is a zero of . We can thus write the above proposition in the equivalent form
One can use this identity to get a lower bound on the number of zeroes of by the method of mollifiers. Namely, for any other polynomial , we clearly have
By Jensen’s formula, we have for any that
We therefore have
As the logarithm function is concave, we can apply Jensen’s inequality to conclude
where the expectation is over the parameter. It turns out that by choosing the mollifier carefully in order to make behave like the function (while keeping the degree small enough that one can compute the second moment here), and then optimising in , one can use this inequality to get a positive fraction of zeroes of on the unit circle on average. This is the polynomial analogue of a classical argument of Levinson, who used this to show that at least one third of the zeroes of the Riemann zeta function are on the critical line; all later improvements on this fraction have been based on some version of Levinson’s method, mainly focusing on more advanced choices for the mollifier and of the differential operator that implicitly appears in the above approach. (The most recent lower bound I know of is , due to Pratt and Robles. In principle (as observed by Farmer) this bound can get arbitrarily close to if one is allowed to use arbitrarily long mollifiers, but establishing this seems of comparable difficulty to unsolved problems such as the pair correlation conjecture; see this paper of Radziwill for more discussion.) A variant of these techniques can also establish “zero density estimates” of the following form: for any , the number of zeroes of that lie further than from the unit circle is of order on average for some absolute constant . Thus, roughly speaking, most zeroes of lie within of the unit circle. (Analogues of these results for the Riemann zeta function were worked out by Selberg, by Jutila, and by Conrey, with increasingly strong values of .)
The zeroes of tend to live somewhat closer to the origin than the zeroes of . Suppose for instance that we write
where are the zeroes of , then by evaluating at zero we see that
and the right-hand side is of unit magnitude by the functional equation. However, if we differentiate
where are the zeroes of , then by evaluating at zero we now see that
The right-hand side would now be typically expected to be of size , and so on average we expect the to have magnitude like , that is to say pushed inwards from the unit circle by a distance roughly . The analogous result for the Riemann zeta function is that the zeroes of at height lie at a distance roughly to the right of the critical line on the average; see this paper of Levinson and Montgomery for a precise statement.
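The comparison between the two products of zeroes is just Vieta's formula applied to the polynomial and to its derivative, and can be checked on a toy example. The sketch below assumes, purely for illustration, a monic degree $N$ polynomial with randomly chosen zeroes on the unit circle (a stand-in for the Riemann hypothesis analogue, not the random polynomial of the text); all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 12

# Illustrative assumption: all N zeroes lie on the unit circle.
angles = rng.uniform(0.0, 2.0 * np.pi, size=N)
p = np.poly(np.exp(1j * angles))  # coefficients a_N, ..., a_1, a_0 (a_N = 1)
dp = np.polyder(p)                # coefficients N a_N, ..., 2 a_2, a_1

# Product of the moduli of the zeroes of P: |a_0 / a_N|, which is 1 here.
prod_zeros = np.prod(np.abs(np.roots(p)))

# Product of the moduli of the zeroes of P': |a_1| / (N |a_N|) by Vieta.
prod_crit = np.prod(np.abs(np.roots(dp)))
vieta = abs(p[-2]) / (N * abs(p[0]))

# Geometric mean modulus of the N - 1 zeroes of P'.
geo_mean = prod_crit ** (1.0 / (N - 1))
print(prod_zeros, prod_crit, vieta, geo_mean)
```

The extra factor of $N$ in the denominator is what pulls the geometric mean of the zeroes of the derivative inside the unit circle, in line with the heuristic above.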