You are currently browsing the tag archive for the ‘Fourier transform’ tag.
Previous set of notes: Notes 1. Next set of notes: Notes 3.
In Exercise 5 (and Lemma 1) of 246A Notes 4 we already observed some links between complex analysis on the disk (or annulus) and Fourier series on the unit circle:
- (i) Functions
that are holomorphic on a disk
are expressed by a convergent Fourier series (and also Taylor series)
for
(so in particular
), where
conversely, every infinite sequenceof coefficients obeying (1) arises from such a function
.
- (ii) Functions
that are holomorphic on an annulus
are expressed by a convergent Fourier series (and also Laurent series)
, where
conversely, every doubly infinite sequenceof coefficients obeying (2) arises from such a function
.
- (iii) In the situation of (ii), there is a unique decomposition
where
extends holomorphically to
, and
extends holomorphically to
and goes to zero at infinity, and are given by the formulae
whereis any anticlockwise contour in
enclosing
, and and
whereis any anticlockwise contour in
enclosing
but not
.
This connection lets us interpret various facts about Fourier series through the lens of complex analysis, at least for some special classes of Fourier series. For instance, the Fourier inversion formula becomes the Cauchy-type formula for the Laurent or Taylor coefficients of
, in the event that the coefficients are doubly infinite and obey (2) for some
, or singly infinite and obey (1) for some
.
It turns out that there are similar links between complex analysis on a half-plane (or strip) and Fourier integrals on the real line, which we will explore in these notes.
We first fix a normalisation for the Fourier transform. If is an absolutely integrable function on the real line, we define its Fourier transform
by the formula
Exercise 1 (Fourier transform of Gaussian) Ifis a complex number with
and
is the Gaussian function
, show that the Fourier transform
is given by the Gaussian
, where we use the standard branch for
.
The Fourier transform has many remarkable properties. On the one hand, as long as the function is sufficiently “reasonable”, the Fourier transform enjoys a number of very useful identities, such as the Fourier inversion formula
Exercise 2 (Decay ofimplies regularity of
) Let
be an absolutely integrable function.
Hint: to establish holomorphicity in each of these cases, use Morera’s theorem and the Fubini-Tonelli theorem. For uniqueness, use analytic continuation, or (for part (iv)) the Schwartz reflection principle.
- (i) If
has super-exponential decay in the sense that
for all
and
(that is to say one has
for some finite quantity
depending only on
), then
extends uniquely to an entire function
. Furthermore, this function continues to be defined by (3).
- (ii) If
is supported on a compact interval
then the entire function
from (i) obeys the bounds
for
. In particular, if
is supported in
then
.
- (iii) If
obeys the bound
for all
and some
, then
extends uniquely to a holomorphic function
on the horizontal strip
, and obeys the bound
in this strip. Furthermore, this function continues to be defined by (3).
- (iv) If
is supported on
(resp.
), then there is a unique continuous extension of
to the lower half-plane
(resp. the upper half-plane
) which is holomorphic in the interior of this half-plane, and such that
uniformly as
(resp.
). Furthermore, this function continues to be defined by (3).
Later in these notes we will give a partial converse to part (ii) of this exercise, known as the Paley-Wiener theorem; there are also partial converses to the other parts of this exercise.
From (3) we observe the following intertwining property between multiplication by an exponential and complex translation: if is a complex number and
is an absolutely integrable function such that the modulated function
is also absolutely integrable, then we have the identity
The material in these notes is loosely adapted from Chapter 4 of Stein-Shakarchi’s “Complex Analysis”.
We will shortly turn to the complex-analytic approach to multiplicative number theory, which relies on the basic properties of complex analytic functions. In this supplement to the main notes, we quickly review the portions of complex analysis that we will be using in this course. We will not attempt a comprehensive review of this subject; for instance, we will completely neglect the conformal geometry or Riemann surface aspect of complex analysis, and we will also avoid using the various boundary convergence theorems for Taylor series or Dirichlet series (the latter type of result is traditionally utilised in multiplicative number theory, but I personally find them a little unintuitive to use, and will instead rely on a slightly different set of complex-analytic tools). We will also focus on the “local” structure of complex analytic functions, in particular adopting the philosophy that such functions behave locally like complex polynomials; the classical “global” theory of entire functions, while traditionally used in the theory of the Riemann zeta function, will be downplayed in these notes. On the other hand, we will play up the relationship between complex analysis and Fourier analysis, as we will incline to using the latter tool over the former in some of the subsequent material. (In the traditional approach to the subject, the Mellin transform is used in place of the Fourier transform, but we will not emphasise the role of the Mellin transform here.)
We begin by recalling the notion of a holomorphic function, which will later be shown to be essentially synonymous with that of a complex analytic function.
Definition 1 (Holomorphic function) Let
be an open subset of
, and let
be a function. If
, we say that
is complex differentiable at
if the limit
exists, in which case we refer to
as the (complex) derivative of
at
. If
is differentiable at every point
of
, and the derivative
is continuous, we say that
is holomorphic on
.
Exercise 2 Show that a function
is holomorphic if and only if the two-variable function
is continuously differentiable on
and obeys the Cauchy-Riemann equation
Basic examples of holomorphic functions include complex polynomials
as well as the complex exponential function
which are holomorphic on the entire complex plane (i.e., they are entire functions). The sum or product of two holomorphic functions is again holomorphic; the quotient of two holomorphic functions is holomorphic so long as the denominator is non-zero. Finally, the composition of two holomorphic functions is holomorphic wherever the composition is defined.
- (i) Establish Euler’s formula
for all
. (Hint: it is a bit tricky to do this starting from the trigonometric definitions of sine and cosine; I recommend either using the Taylor series formulations of these functions instead, or alternatively relying on the ordinary differential equations obeyed by sine and cosine.)
- (ii) Show that every non-zero complex number
has a complex logarithm
such that
, and that this logarithm is unique up to integer multiples of
.
- (iii) Show that there exists a unique principal branch
of the complex logarithm in the region
, defined by requiring
to be a logarithm of
with imaginary part between
and
. Show that this principal branch is holomorphic with derivative
.
In real analysis, we have the fundamental theorem of calculus, which asserts that
whenever is a real interval and
is a continuously differentiable function. The complex analogue of this fact is that
whenever is a holomorphic function, and
is a contour in
, by which we mean a piecewise continuously differentiable function, and the contour integral
for a continuous function
is defined via change of variables as
The complex fundamental theorem of calculus (2) follows easily from the real fundamental theorem and the chain rule.
In real analysis, we have the rather trivial fact that the integral of a continuous function on a closed contour is always zero:
In complex analysis, the analogous fact is significantly more powerful, and is known as Cauchy’s theorem:
Theorem 4 (Cauchy’s theorem) Let
be a holomorphic function in a simply connected open set
, and let
be a closed contour in
(thus
). Then
.
Exercise 5 Use Stokes’ theorem to give a proof of Cauchy’s theorem.
A useful reformulation of Cauchy’s theorem is that of contour shifting: if is a holomorphic function on a open set
, and
are two contours in an open set
with
and
, such that
can be continuously deformed into
, then
. A basic application of contour shifting is the Cauchy integral formula:
Theorem 6 (Cauchy integral formula) Let
be a holomorphic function in a simply connected open set
, and let
be a closed contour which is simple (thus
does not traverse any point more than once, with the exception of the endpoint
that is traversed twice), and which encloses a bounded region
in the anticlockwise direction. Then for any
, one has
Proof: Let be a sufficiently small quantity. By contour shifting, one can replace the contour
by the sum (concatenation) of three contours: a contour
from
to
, a contour
traversing the circle
once anticlockwise, and the reversal
of the contour
that goes from
to
. The contributions of the contours
cancel each other, thus
By a change of variables, the right-hand side can be expanded as
Sending , we obtain the claim.
The Cauchy integral formula has many consequences. Specialising to the case when traverses a circle
around
, we conclude the mean value property
whenever is holomorphic in a neighbourhood of the disk
. In a similar spirit, we have the maximum principle for holomorphic functions:
Lemma 7 (Maximum principle) Let
be a simply connected open set, and let
be a simple closed contour in
enclosing a bounded region
anti-clockwise. Let
be a holomorphic function. If we have the bound
for all
on the contour
, then we also have the bound
for all
.
Proof: We use an argument of Landau. Fix . From the Cauchy integral formula and the triangle inequality we have the bound
for some constant depending on
and
. This ostensibly looks like a weaker bound than what we want, but we can miraculously make the constant
disappear by the “tensor power trick“. Namely, observe that if
is a holomorphic function bounded in magnitude by
on
, and
is a natural number, then
is a holomorphic function bounded in magnitude by
on
. Applying the preceding argument with
replaced by
we conclude that
and hence
Sending , we obtain the claim.
Another basic application of the integral formula is
Corollary 8 Every holomorphic function
is complex analytic, thus it has a convergent Taylor series around every point
in the domain. In particular, holomorphic functions are smooth, and the derivative of a holomorphic function is again holomorphic.
Conversely, it is easy to see that complex analytic functions are holomorphic. Thus, the terms “complex analytic” and “holomorphic” are synonymous, at least when working on open domains. (On a non-open set , saying that
is analytic on
is equivalent to asserting that
extends to a holomorphic function of an open neighbourhood of
.) This is in marked contrast to real analysis, in which a function can be continuously differentiable, or even smooth, without being real analytic.
Proof: By translation, we may suppose that . Let
be a a contour traversing the circle
that is contained in the domain
, then by the Cauchy integral formula one has
for all in the disk
. As
is continuously differentiable (and hence continuous) on
, it is bounded. From the geometric series formula
and dominated convergence, we conclude that
with the right-hand side an absolutely convergent series for , and the claim follows.
Exercise 9 Establish the generalised Cauchy integral formulae
for any non-negative integer
, where
is the
-fold complex derivative of
.
This in turn leads to a converse to Cauchy’s theorem, known as Morera’s theorem:
Corollary 10 (Morera’s theorem) Let
be a continuous function on an open set
with the property that
for all closed contours
. Then
is holomorphic.
Proof: We can of course assume to be non-empty and connected (hence path-connected). Fix a point
, and define a “primitive”
of
by defining
, with
being any contour from
to
(this is well defined by hypothesis). By mimicking the proof of the real fundamental theorem of calculus, we see that
is holomorphic with
, and the claim now follows from Corollary 8.
An important consequence of Morera’s theorem for us is
Corollary 11 (Locally uniform limit of holomorphic functions is holomorphic) Let
be holomorphic functions on an open set
which converge locally uniformly to a function
. Then
is also holomorphic on
.
Proof: By working locally we may assume that is a ball, and in particular simply connected. By Cauchy’s theorem,
for all closed contours
in
. By local uniform convergence, this implies that
for all such contours, and the claim then follows from Morera’s theorem.
Now we study the zeroes of complex analytic functions. If a complex analytic function vanishes at a point
, but is not identically zero in a neighbourhood of that point, then by Taylor expansion we see that
factors in a sufficiently small neighbourhood of
as
for some natural number (which we call the order or multiplicity of the zero at
) and some function
that is complex analytic and non-zero near
; this generalises the factor theorem for polynomials. In particular, the zero
is isolated if
does not vanish identically near
. We conclude that if
is connected and
vanishes on a neighbourhood of some point
in
, then it must vanish on all of
(since the maximal connected neighbourhood of
in
on which
vanishes cannot have any boundary point in
). This implies unique continuation of analytic functions: if two complex analytic functions on
agree on a non-empty open set, then they agree everywhere. In particular, if a complex analytic function does not vanish everywhere, then all of its zeroes are isolated, so in particular it has only finitely many zeroes on any given compact set.
Recall that a rational function is a function which is a quotient
of two polynomials (at least outside of the set where
vanishes). Analogously, let us define a meromorphic function on an open set
to be a function
defined outside of a discrete subset
of
(the singularities of
), which is locally the quotient
of holomorphic functions, in the sense that for every
, one has
in a neighbourhood of
excluding
, with
holomorphic near
and with
non-vanishing outside of
. If
and
has a zero of equal or higher order than
at
, then the singularity is removable and one can extend the meromorphic function holomorphically across
(by the holomorphic factor theorem (4)); otherwise, the singularity is non-removable and is known as a pole, whose order is equal to the difference between the order of
and the order of
at
. (If one wished, one could extend meromorphic functions to the poles by embedding
in the Riemann sphere
and mapping each pole to
, but we will not do so here. One could also consider non-meromorphic functions with essential singularities at various points, but we will have no need to analyse such singularities in this course.) If the order of a pole or zero is one, we say that it is simple; if it is two, we say it is double; and so forth.
Exercise 12 Show that the space of meromorphic functions on a non-empty open set
, quotiented by almost everywhere equivalence, forms a field.
By quotienting two Taylor series, we see that if a meromorphic function has a pole of order
at some point
, then it has a Laurent expansion
absolutely convergent in a neighbourhood of excluding
itself, and with
non-zero. The Laurent coefficient
has a special significance, and is called the residue of the meromorphic function
at
, which we will denote as
. The importance of this coefficient comes from the following significant generalisation of the Cauchy integral formula, known as the residue theorem:
Exercise 13 (Residue theorem) Let
be a meromorphic function on a simply connected domain
, and let
be a closed contour in
enclosing a bounded region
anticlockwise, and avoiding all the singularities of
. Show that
where
is summed over all the poles of
that lie in
.
The residue theorem is particularly useful when applied to logarithmic derivatives of meromorphic functions
, because the residue is of a specific form:
Exercise 14 Let
be a meromorphic function on an open set
that does not vanish identically. Show that the only poles of
are simple poles (poles of order
), occurring at the poles and zeroes of
(after all removable singularities have been removed). Furthermore, the residue of
at a pole
is an integer, equal to the order of zero of
if
has a zero at
, or equal to negative the order of pole at
if
has a pole at
.
Remark 15 The fact that residues of logarithmic derivatives of meromorphic functions are automatically integers is a remarkable feature of the complex analytic approach to multiplicative number theory, which is difficult (though not entirely impossible) to duplicate in other approaches to the subject. Here is a sample application of this integrality, which is challenging to reproduce by non-complex-analytic means: if
is meromorphic near
, and one has the bound
as
, then
must in fact stay bounded near
, because the only integer of magnitude less than
is zero.
Roth’s theorem on arithmetic progressions asserts that every subset of the integers of positive upper density contains infinitely many arithmetic progressions of length three. There are many versions and variants of this theorem. Here is one of them:
Theorem 1 (Roth’s theorem) Let
be a compact abelian group, with Haar probability measure
, which is
-divisible (i.e. the map
is surjective) and let
be a measurable subset of
with
for some
. Then we have
where
denotes the bound
for some
depending only on
.
This theorem is usually formulated in the case that is a finite abelian group of odd order (in which case the result is essentially due to Meshulam) or more specifically a cyclic group
of odd order (in which case it is essentially due to Varnavides), but is also valid for the more general setting of
-divisible compact abelian groups, as we shall shortly see. One can be more precise about the dependence of the implied constant
on
, but to keep the exposition simple we will work at the qualitative level here, without trying at all to get good quantitative bounds. The theorem is also true without the
-divisibility hypothesis, but the proof we will discuss runs into some technical issues due to the degeneracy of the
shift in that case.
We can deduce Theorem 1 from the following more general Khintchine-type statement. Let denote the Pontryagin dual of a compact abelian group
, that is to say the set of all continuous homomorphisms
from
to the (additive) unit circle
. Thus
is a discrete abelian group, and functions
have a Fourier transform
defined by
If is
-divisible, then
is
-torsion-free in the sense that the map
is injective. For any finite set
and any radius
, define the Bohr set
where denotes the distance of
to the nearest integer. We refer to the cardinality
of
as the rank of the Bohr set. We record a simple volume bound on Bohr sets:
Lemma 2 (Volume packing bound) Let
be a compact abelian group with Haar probability measure
. For any Bohr set
, we have
Proof: We can cover the torus by
translates
of the cube
. Then the sets
form an cover of
. But all of these sets lie in a translate of
, and the claim then follows from the translation invariance of
.
Given any Bohr set , we define a normalised “Lipschitz” cutoff function
by the formula
where is the constant such that
thus
The function should be viewed as an
-normalised “tent function” cutoff to
. Note from Lemma 2 that
We then have the following sharper version of Theorem 1:
Theorem 3 (Roth-Khintchine theorem) Let
be a
-divisible compact abelian group, with Haar probability measure
, and let
. Then for any measurable function
, there exists a Bohr set
with
and
such that
where
denotes the convolution operation
A variant of this result (expressed in the language of ergodic theory) appears in this paper of Bergelson, Host, and Kra; a combinatorial version of the Bergelson-Host-Kra result that is closer to Theorem 3 subsequently appeared in this paper of Ben Green and myself, but this theorem arguably appears implicitly in a much older paper of Bourgain. To see why Theorem 3 implies Theorem 1, we apply the theorem with and
equal to a small multiple of
to conclude that there is a Bohr set
with
and
such that
But from (2) we have the pointwise bound , and Theorem 1 follows.
Below the fold, we give a short proof of Theorem 3, using an “energy pigeonholing” argument that essentially dates back to the 1986 paper of Bourgain mentioned previously (not to be confused with a later 1999 paper of Bourgain on Roth’s theorem that was highly influential, for instance in emphasising the importance of Bohr sets). The idea is to use the pigeonhole principle to choose the Bohr set to capture all the “large Fourier coefficients” of
, but such that a certain “dilate” of
does not capture much more Fourier energy of
than
itself. The bound (3) may then be obtained through elementary Fourier analysis, without much need to explicitly compute things like the Fourier transform of an indicator function of a Bohr set. (However, the bound obtained by this argument is going to be quite poor – of tower-exponential type.) To do this we perform a structural decomposition of
into “structured”, “small”, and “highly pseudorandom” components, as is common in the subject (e.g. in this previous blog post), but even though we crucially need to retain non-negativity of one of the components in this decomposition, we can avoid recourse to conditional expectation with respect to a partition (or “factor”) of the space, using instead convolution with one of the
considered above to achieve a similar effect.
The classification of finite simple groups (CFSG), first announced in 1983 but only fully completed in 2004, is one of the monumental achievements of twentieth century mathematics. Spanning hundreds of papers and tens of thousands of pages, it has been called the “enormous theorem”. A “second generation” proof of the theorem is nearly completed which is a little shorter (estimated at about five thousand pages in length), but currently there is no reasonably sized proof of the classification.
An important precursor of the CFSG is the Feit-Thompson theorem from 1962-1963, which asserts that every finite group of odd order is solvable, or equivalently that every non-abelian finite simple group has even order. This is an immediate consequence of CFSG, and conversely the Feit-Thompson theorem is an essential starting point in the proof of the classification, since it allows one to reduce matters to groups of even order for which key additional tools (such as the Brauer-Fowler theorem) become available. The original proof of the Feit-Thompson theorem is 255 pages long, which is significantly shorter than the proof of the CFSG, but still far from short. While parts of the proof of the Feit-Thompson theorem have been simplified (and it has recently been converted, after six years of effort, into an argument that has been verified by the proof assistant Coq), the available proofs of this theorem are still extremely lengthy by any reasonable standard.
However, there is a significantly simpler special case of the Feit-Thompson theorem that was established previously by Suzuki in 1957, which was influential in the proof of the more general Feit-Thompson theorem (and thus indirectly to the proof of CFSG). Define a CA-group to be a group with the property that the centraliser
of any non-identity element
is abelian; equivalently, the commuting relation
(defined as the relation that holds when
commutes with
, thus
) is an equivalence relation on the non-identity elements
of
. Trivially, every abelian group is CA. A non-abelian example of a CA-group is the
group of invertible affine transformations
on a field
. A little less obviously, the special linear group
over a finite field
is a CA-group when
is a power of two. The finite simple groups of Lie type are not, in general, CA-groups, but when the rank is bounded they tend to behave as if they were “almost CA”; the centraliser of a generic element in
, for instance, when
is bounded and
is large), is typically a maximal torus (because most elements in
are regular semisimple) which is certainly abelian. In view of the CFSG, we thus see that CA or nearly CA groups form an important subclass of the simple groups, and it is thus of interest to study them separately. To this end, we have
Theorem 1 (Suzuki’s theorem on CA-groups) Every finite CA-group of odd order is solvable.
Of course, this theorem is superceded by the more general Feit-Thompson theorem, but Suzuki’s proof is substantially shorter (the original proof is nine pages) and will be given in this post. (See this survey of Solomon for some discussion of the link between Suzuki’s argument and the Feit-Thompson argument.) Suzuki’s analysis can be pushed further to give an essentially complete classification of all the finite CA-groups (of either odd or even order), but we will not pursue these matters here.
Moving even further down the ladder of simple precursors of CSFG is the following theorem of Frobenius from 1901. Define a Frobenius group to be a finite group which has a subgroup
(called the Frobenius complement) with the property that all the non-trivial conjugates
of
for
, intersect
only at the origin. For instance the
group is also a Frobenius group (take
to be the affine transformations that fix a specified point
, e.g. the origin). This example suggests that there is some overlap between the notions of a Frobenius group and a CA group. Indeed, note that if
is a CA-group and
is a maximal abelian subgroup of
, then any conjugate
of
that is not identical to
will intersect
only at the origin (because
and each of its conjugates consist of equivalence classes under the commuting relation
, together with the identity). So if a maximal abelian subgroup
of a CA-group is its own normaliser (thus
is equal to
), then the group is a Frobenius group.
Frobenius’ theorem places an unexpectedly strong amount of structure on a Frobenius group:
Theorem 2 (Frobenius’ theorem) Let
be a Frobenius group with Frobenius complement
. Then there exists a normal subgroup
of
(called the Frobenius kernel of
) such that
is the semi-direct product
of
and
.
Roughly speaking, this theorem indicates that all Frobenius groups “behave” like the example (which is a quintessential example of a semi-direct product).
Note that if every CA-group of odd order was either Frobenius or abelian, then Theorem 2 would imply Theorem 1 by an induction on the order of , since any subgroup of a CA-group is clearly again a CA-group. Indeed, the proof of Suzuki’s theorem does basically proceed by this route (Suzuki’s arguments do indeed imply that CA-groups of odd order are Frobenius or abelian, although we will not quite establish that fact here).
Frobenius’ theorem can be reformulated in the following concrete combinatorial form:
Theorem 3 (Frobenius’ theorem, equivalent version) Let
be a group of permutations acting transitively on a finite set
, with the property that any non-identity permutation in
fixes at most one point in
. Then the set of permutations in
that fix no points in
, together with the identity, is closed under composition.
Again, a good example to keep in mind for this theorem is when is the group of affine permutations on a field
(i.e. the
group for that field), and
is the set of points on that field. In that case, the set of permutations in
that do not fix any points are the non-trivial translations.
To deduce Theorem 3 from Theorem 2, one applies Theorem 2 to the stabiliser of a single point in . Conversely, to deduce Theorem 2 from Theorem 3, set
to be the space of left-cosets of
, with the obvious left
-action; one easily verifies that this action is faithful, transitive, and each non-identity element
of
fixes at most one left-coset of
(basically because it lies in at most one conjugate of
). If we let
be the elements of
that do not fix any point in
, plus the identity, then by Theorem 3
is closed under composition; it is also clearly closed under inverse and conjugation, and is hence a normal subgroup of
. From construction
is the identity plus the complement of all the
conjugates of
, which are all disjoint except at the identity, so by counting elements we see that
As normalises
and is disjoint from
, we thus see that
is all of
, giving Theorem 2.
Despite the appealingly concrete and elementary form of Theorem 3, the only known proofs of that theorem (or equivalently, Theorem 2) in its full generality proceed via the machinery of group characters (which one can think of as a version of Fourier analysis for nonabelian groups). On the other hand, once one establishes the basic theory of these characters (reviewed below the fold), the proof of Frobenius’ theorem is very short, which gives quite a striking example of the power of character theory. The proof of Suzuki’s theorem also proceeds via character theory, and is basically a more involved version of the Frobenius argument; again, no character-free proof of Suzuki’s theorem is currently known. (The proofs of Feit-Thompson and CFSG also involve characters, but those proofs also contain many other arguments of much greater complexity than the character-based portions of the proof.)
It seems to me that the above four theorems (Frobenius, Suzuki, Feit-Thompson, and CFSG) provide a ladder of sorts (with exponentially increasing complexity at each step) to the full classification, and that any new approach to the classification might first begin by revisiting the earlier theorems on this ladder and finding new proofs of these results first (in particular, if one had a “robust” proof of Suzuki’s theorem that also gave non-trivial control on “almost CA-groups” – whatever that means – then this might lead to a new route to classifying the finite simple groups of Lie type and bounded rank). But even for the simplest two results on this ladder – Frobenius and Suzuki – it seems remarkably difficult to find any proof that is not essentially the character-based proof. (Even trying to replace character theory by its close cousin, representation theory, doesn’t seem to work unless one gives in to the temptation to take traces everywhere and put the characters back in; it seems that rather than abandon characters altogether, one needs to find some sort of “robust” generalisation of existing character-based methods.) In any case, I am recording here the standard character-based proofs of the theorems of Frobenius and Suzuki below the fold. There is nothing particularly novel here, but I wanted to collect all the relevant material in one place, largely for my own benefit.
Let be
Hermitian matrices, with eigenvalues
and
. The Harish-Chandra–Itzykson-Zuber integral formula exactly computes the integral
where is integrated over the Haar probability measure of the unitary group
and
is a non-zero complex parameter, as the expression
when the eigenvalues of are simple, where
denotes the Vandermonde determinant
and is the constant
There are at least two standard ways to prove this formula in the literature. One way is by applying the Duistermaat-Heckman theorem to the pushforward of Liouville measure on the coadjoint orbit (or more precisely, a rotation of such an orbit by
) under the moment map
, and then using a stationary phase expansion. Another way, which I only learned about recently, is to use the formulae for evolution of eigenvalues under Dyson Brownian motion (as well as the closely related formulae for the GUE ensemble), which were derived in this previous blog post. Both of these approaches can be found in several places in the literature (the former being observed in the original paper of Duistermaat and Heckman, and the latter observed in the paper of Itzykson and Zuber as well as in this later paper of Johansson), but I thought I would record both of these here for my own benefit.
The Harish-Chandra-Itzykson-Zuber formula can be extended to other compact Lie groups than . At first glance, this might suggest that these formulae could be of use in the study of the GOE ensemble, but unfortunately the Lie algebra associated to
corresponds to real anti-symmetric matrices rather than real symmetric matrices. This also occurs in the
case, but there one can simply multiply by
to rotate a complex skew-Hermitian matrix into a complex Hermitian matrix. This is consistent, though, with the fact that the (somewhat rarely studied) anti-symmetric GOE ensemble has cleaner formulae (in particular, having a determinantal structure similar to GUE) than the (much more commonly studied) symmetric GOE ensemble.
Let be a compact group. (Throughout this post, all topological groups are assumed to be Hausdorff.) Then
has a number of unitary representations, i.e. continuous homomorphisms
to the group
of unitary operators on a Hilbert space
, equipped with the strong operator topology. In particular, one has the left-regular representation
, where we equip
with its normalised Haar measure
(and the Borel
-algebra) to form the Hilbert space
, and
is the translation operation
We call two unitary representations and
isomorphic if one has
for some unitary transformation
, in which case we write
.
Given two unitary representations and
, one can form their direct sum
in the obvious manner:
. Conversely, if a unitary representation
has a closed invariant subspace
of
(thus
for all
), then the orthogonal complement
is also invariant, leading to a decomposition
of
into the subrepresentations
,
. Accordingly, we will call a unitary representation
irreducible if
is nontrivial (i.e.
) and there are no nontrivial invariant subspaces (i.e. no invariant subspaces other than
and
); the irreducible representations play a role in the subject analogous to those of prime numbers in multiplicative number theory. By the principle of infinite descent, every finite-dimensional unitary representation is then expressible (perhaps non-uniquely) as the direct sum of irreducible representations.
The Peter-Weyl theorem asserts, among other things, that the same claim is true for the regular representation:
Theorem 1 (Peter-Weyl theorem) Let
be a compact group. Then the regular representation
is isomorphic to the direct sum of irreducible representations. In fact, one has
, where
is an enumeration of the irreducible finite-dimensional unitary representations
of
(up to isomorphism). (It is not difficult to see that such an enumeration exists.)
In the case when is abelian, the Peter-Weyl theorem is a consequence of the Plancherel theorem; in that case, the irreducible representations are all one dimensional, and are thus indexed by the space
of characters
(i.e. continuous homomorphisms into the unit circle
), known as the Pontryagin dual of
. (See for instance my lecture notes on the Fourier transform.) Conversely, the Peter-Weyl theorem can be used to deduce the Plancherel theorem for compact groups, as well as other basic results in Fourier analysis on these groups, such as the Fourier inversion formula.
Because the regular representation is faithful (i.e. injective), a corollary of the Peter-Weyl theorem (and a classical theorem of Cartan) is that every compact group can be expressed as the inverse limit of Lie groups, leading to a solution to Hilbert’s fifth problem in the compact case. Furthermore, the compact case is then an important building block in the more general theory surrounding Hilbert’s fifth problem, and in particular a result of Yamabe that any locally compact group contains an open subgroup that is the inverse limit of Lie groups.
I’ve recently become interested in the theory around Hilbert’s fifth problem, due to the existence of a correspondence principle between locally compact groups and approximate groups, which play a fundamental role in arithmetic combinatorics. I hope to elaborate upon this correspondence in a subsequent post, but I will mention that versions of this principle play a crucial role in Gromov’s proof of his theorem on groups of polynomial growth (discussed previously on this blog), and in a more recent paper of Hrushovski on approximate groups (also discussed previously). It is also analogous in many ways to the more well-known Furstenberg correspondence principle between ergodic theory and combinatorics (also discussed previously).
Because of the above motivation, I have decided to write some notes on how the Peter-Weyl theorem is proven. This is utterly standard stuff in abstract harmonic analysis; these notes are primarily for my own benefit, but perhaps they may be of interest to some readers also.
A recurring theme in mathematics is that of duality: a mathematical object can either be described internally (or in physical space, or locally), by describing what
physically consists of (or what kind of maps exist into
), or externally (or in frequency space, or globally), by describing what
globally interacts or resonates with (or what kind of maps exist out of
). These two fundamentally opposed perspectives on the object
are often dual to each other in various ways: performing an operation on
may transform it one way in physical space, but in a dual way in frequency space, with the frequency space description often being a “inversion” of the physical space description. In several important cases, one is fortunate enough to have some sort of fundamental theorem connecting the internal and external perspectives. Here are some (closely inter-related) examples of this perspective:
- Vector space duality A vector space
over a field
can be described either by the set of vectors inside
, or dually by the set of linear functionals
from
to the field
(or equivalently, the set of vectors inside the dual space
). (If one is working in the category of topological vector spaces, one would work instead with continuous linear functionals; and so forth.) A fundamental connection between the two is given by the Hahn-Banach theorem (and its relatives).
- Vector subspace duality In a similar spirit, a subspace
of
can be described either by listing a basis or a spanning set, or dually by a list of linear functionals that cut out that subspace (i.e. a spanning set for the orthogonal complement
. Again, the Hahn-Banach theorem provides a fundamental connection between the two perspectives.
- Convex duality More generally, a (closed, bounded) convex body
in a vector space
can be described either by listing a set of (extreme) points whose convex hull is
, or else by listing a set of (irreducible) linear inequalities that cut out
. The fundamental connection between the two is given by the Farkas lemma.
- Ideal-variety duality In a slightly different direction, an algebraic variety
in an affine space
can be viewed either “in physical space” or “internally” as a collection of points in
, or else “in frequency space” or “externally” as a collection of polynomials on
whose simultaneous zero locus cuts out
. The fundamental connection between the two perspectives is given by the nullstellensatz, which then leads to many of the basic fundamental theorems in classical algebraic geometry.
- Hilbert space duality An element
in a Hilbert space
can either be thought of in physical space as a vector in that space, or in momentum space as a covector
on that space. The fundamental connection between the two is given by the Riesz representation theorem for Hilbert spaces.
- Semantic-syntactic duality Much more generally still, a mathematical theory can either be described internally or syntactically via its axioms and theorems, or externally or semantically via its models. The fundamental connection between the two perspectives is given by the Gödel completeness theorem.
- Intrinsic-extrinsic duality A (Riemannian) manifold
can either be viewed intrinsically (using only concepts that do not require an ambient space, such as the Levi-Civita connection), or extrinsically, for instance as the level set of some defining function in an ambient space. Some important connections between the two perspectives includes the Nash embedding theorem and the theorema egregium.
- Group duality A group
can be described either via presentations (lists of generators, together with relations between them) or representations (realisations of that group in some more concrete group of transformations). A fundamental connection between the two is Cayley’s theorem. Unfortunately, in general it is difficult to build upon this connection (except in special cases, such as the abelian case), and one cannot always pass effortlessly from one perspective to the other.
- Pontryagin group duality A (locally compact Hausdorff) abelian group
can be described either by listing its elements
, or by listing the characters
(i.e. continuous homomorphisms from
to the unit circle, or equivalently elements of
). The connection between the two is the focus of abstract harmonic analysis.
- Pontryagin subgroup duality A subgroup
of a locally compact abelian group
can be described either by generators in
, or generators in the orthogonal complement
. One of the fundamental connections between the two is the Poisson summation formula.
- Fourier duality A (sufficiently nice) function
on a locally compact abelian group
(equipped with a Haar measure
) can either be described in physical space (by its values
at each element
of
) or in frequency space (by the values
at elements
of the Pontryagin dual
). The fundamental connection between the two is the Fourier inversion formula.
- The uncertainty principle The behaviour of a function
at physical scales above (resp. below) a certain scale
is almost completely controlled by the behaviour of its Fourier transform
at frequency scales below (resp. above) the dual scale
and vice versa, thanks to various mathematical manifestations of the uncertainty principle. (The Poisson summation formula can also be viewed as a variant of this principle, using subgroups instead of scales.)
- Stone/Gelfand duality A (locally compact Hausdorff) topological space
can be viewed in physical space (as a collection of points), or dually, via the
algebra
of continuous complex-valued functions on that space, or (in the case when
is compact and totally disconnected) via the boolean algebra of clopen sets (or equivalently, the idempotents of
). The fundamental connection between the two is given by the Stone representation theorem or the (commutative) Gelfand-Naimark theorem.
I have discussed a fair number of these examples in previous blog posts (indeed, most of the links above are to my own blog). In this post, I would like to discuss the uncertainty principle, that describes the dual relationship between physical space and frequency space. There are various concrete formalisations of this principle, most famously the Heisenberg uncertainty principle and the Hardy uncertainty principle – but in many situations, it is the heuristic formulation of the principle that is more useful and insightful than any particular rigorous theorem that attempts to capture that principle. Unfortunately, it is a bit tricky to formulate this heuristic in a succinct way that covers all the various applications of that principle; the Heisenberg inequality is a good start, but it only captures a portion of what the principle tells us. Consider for instance the following (deliberately vague) statements, each of which can be viewed (heuristically, at least) as a manifestation of the uncertainty principle:
- A function which is band-limited (restricted to low frequencies) is featureless and smooth at fine scales, but can be oscillatory (i.e. containing plenty of cancellation) at coarse scales. Conversely, a function which is smooth at fine scales will be almost entirely restricted to low frequencies.
- A function which is restricted to high frequencies is oscillatory at fine scales, but is negligible at coarse scales. Conversely, a function which is oscillatory at fine scales will be almost entirely restricted to high frequencies.
- Projecting a function to low frequencies corresponds to averaging out (or spreading out) that function at fine scales, leaving only the coarse scale behaviour.
- Projecting a frequency to high frequencies corresponds to removing the averaged coarse scale behaviour, leaving only the fine scale oscillation.
- The number of degrees of freedom of a function is bounded by the product of its spatial uncertainty and its frequency uncertainty (or more generally, by the volume of the phase space uncertainty). In particular, there are not enough degrees of freedom for a non-trivial function to be simulatenously localised to both very fine scales and very low frequencies.
- To control the coarse scale (or global) averaged behaviour of a function, one essentially only needs to know the low frequency components of the function (and vice versa).
- To control the fine scale (or local) oscillation of a function, one only needs to know the high frequency components of the function (and vice versa).
- Localising a function to a region of physical space will cause its Fourier transform (or inverse Fourier transform) to resemble a plane wave on every dual region of frequency space.
- Averaging a function along certain spatial directions or at certain scales will cause the Fourier transform to become localised to the dual directions and scales. The smoother the averaging, the sharper the localisation.
- The smoother a function is, the more rapidly decreasing its Fourier transform (or inverse Fourier transform) is (and vice versa).
- If a function is smooth or almost constant in certain directions or at certain scales, then its Fourier transform (or inverse Fourier transform) will decay away from the dual directions or beyond the dual scales.
- If a function has a singularity spanning certain directions or certain scales, then its Fourier transform (or inverse Fourier transform) will decay slowly along the dual directions or within the dual scales.
- Localisation operations in position approximately commute with localisation operations in frequency so long as the product of the spatial uncertainty and the frequency uncertainty is significantly larger than one.
- In the high frequency (or large scale) limit, position and frequency asymptotically behave like a pair of classical observables, and partial differential equations asymptotically behave like classical ordinary differential equations. At lower frequencies (or finer scales), the former becomes a “quantum mechanical perturbation” of the latter, with the strength of the quantum effects increasing as one moves to increasingly lower frequencies and finer spatial scales.
- Etc., etc.
- Almost all of the above statements generalise to other locally compact abelian groups than
or
, in which the concept of a direction or scale is replaced by that of a subgroup or an approximate subgroup. (In particular, as we will see below, the Poisson summation formula can be viewed as another manifestation of the uncertainty principle.)
I think of all of the above (closely related) assertions as being instances of “the uncertainty principle”, but it seems difficult to combine them all into a single unified assertion, even at the heuristic level; they seem to be better arranged as a cloud of tightly interconnected assertions, each of which is reinforced by several of the others. The famous inequality is at the centre of this cloud, but is by no means the only aspect of it.
The uncertainty principle (as interpreted in the above broad sense) is one of the most fundamental principles in harmonic analysis (and more specifically, to the subfield of time-frequency analysis), second only to the Fourier inversion formula (and more generally, Plancherel’s theorem) in importance; understanding this principle is a key piece of intuition in the subject that one has to internalise before one can really get to grips with this subject (and also with closely related subjects, such as semi-classical analysis and microlocal analysis). Like many fundamental results in mathematics, the principle is not actually that difficult to understand, once one sees how it works; and when one needs to use it rigorously, it is usually not too difficult to improvise a suitable formalisation of the principle for the occasion. But, given how vague this principle is, it is difficult to present this principle in a traditional “theorem-proof-remark” manner. Even in the more informal format of a blog post, I was surprised by how challenging it was to describe my own understanding of this piece of mathematics in a linear fashion, despite (or perhaps because of) it being one of the most central and basic conceptual tools in my own personal mathematical toolbox. In the end, I chose to give below a cloud of interrelated discussions about this principle rather than a linear development of the theory, as this seemed to more closely align with the nature of this principle.
In these notes we lay out the basic theory of the Fourier transform, which is of course the most fundamental tool in harmonic analysis and also of major importance in related fields (functional analysis, complex analysis, PDE, number theory, additive combinatorics, representation theory, signal processing, etc.). The Fourier transform, in conjunction with the Fourier inversion formula, allows one to take essentially arbitrary (complex-valued) functions on a group (or more generally, a space
that
acts on, e.g. a homogeneous space
), and decompose them as a (discrete or continuous) superposition of much more symmetric functions on the domain, such as characters
; the precise superposition is given by Fourier coefficients
, which take values in some dual object such as the Pontryagin dual
of
. Characters behave in a very simple manner with respect to translation (indeed, they are eigenfunctions of the translation action), and so the Fourier transform tends to simplify any mathematical problem which enjoys a translation invariance symmetry (or an approximation to such a symmetry), and is somehow “linear” (i.e. it interacts nicely with superpositions). In particular, Fourier analytic methods are particularly useful for studying operations such as convolution
and set-theoretic addition
, or the closely related problem of counting solutions to additive problems such as
or
, where
are constrained to lie in specific sets
. The Fourier transform is also a particularly powerful tool for solving constant-coefficient linear ODE and PDE (because of the translation invariance), and can also approximately solve some variable-coefficient (or slightly non-linear) equations if the coefficients vary smoothly enough and the nonlinear terms are sufficiently tame.
The Fourier transform also provides an important new way of looking at a function
, as it highlights the distribution of
in frequency space (the domain of the frequency variable
) rather than physical space (the domain of the physical variable
). A given property of
in the physical domain may be transformed to a rather different-looking property of
in the frequency domain. For instance:
- Smoothness of
in the physical domain corresponds to decay of
in the Fourier domain, and conversely. (More generally, fine scale properties of
tend to manifest themselves as coarse scale properties of
, and conversely.)
- Convolution in the physical domain corresponds to pointwise multiplication in the Fourier domain, and conversely.
- Constant coefficient differential operators such as
in the physical domain corresponds to multiplication by polynomials such as
in the Fourier domain, and conversely.
- More generally, translation invariant operators in the physical domain correspond to multiplication by symbols in the Fourier domain, and conversely.
- Rescaling in the physical domain by an invertible linear transformation corresponds to an inverse (adjoint) rescaling in the Fourier domain.
- Restriction to a subspace (or subgroup) in the physical domain corresponds to projection to the dual quotient space (or quotient group) in the Fourier domain, and conversely.
- Frequency modulation in the physical domain corresponds to translation in the frequency domain, and conversely.
(We will make these statements more precise below.)
On the other hand, some operations in the physical domain remain essentially unchanged in the Fourier domain. Most importantly, the norm (or energy) of a function
is the same as that of its Fourier transform, and more generally the inner product
of two functions
is the same as that of their Fourier transforms. Indeed, the Fourier transform is a unitary operator on
(a fact which is variously known as the Plancherel theorem or the Parseval identity). This makes it easier to pass back and forth between the physical domain and frequency domain, so that one can combine techniques that are easy to execute in the physical domain with other techniques that are easy to execute in the frequency domain. (In fact, one can combine the physical and frequency domains together into a product domain known as phase space, and there are entire fields of mathematics (e.g. microlocal analysis, geometric quantisation, time-frequency analysis) devoted to performing analysis on these sorts of spaces directly, but this is beyond the scope of this course.)
In these notes, we briefly discuss the general theory of the Fourier transform, but will mainly focus on the two classical domains for Fourier analysis: the torus , and the Euclidean space
. For these domains one has the advantage of being able to perform very explicit algebraic calculations, involving concrete functions such as plane waves
or Gaussians
.
[This post was typeset using a LaTeX to WordPress-HTML converter kindly provided to me by Luca Trevisan.]
Many properties of a (sufficiently nice) function are reflected in its Fourier transform
, defined by the formula
For instance, decay properties of are reflected in smoothness properties of
, as the following table shows:
If |
then |
and this relates to… |
Square-integrable | square-integrable | Plancherel’s theorem |
Absolutely integrable | continuous | Riemann-Lebesgue lemma |
Rapidly decreasing | smooth | theory of Schwartz functions |
Exponentially decreasing | analytic in a strip | |
Compactly supported | entire and at most exponential growth | Paley-Wiener theorem |
Another important relationship between a function and its Fourier transform
is the uncertainty principle, which roughly asserts that if a function
is highly localised in space, then its Fourier transform
must be widely dispersed in space, or to put it another way,
and
cannot both decay too strongly at infinity (except of course in the degenerate case
). There are many ways to make this intuition precise. One of them is the Heisenberg uncertainty principle, which asserts that if we normalise
then we must have
thus forcing at least one of or
to not be too concentrated near the origin. This principle can be proven (for sufficiently nice
, initially) by observing the integration by parts identity
and then using Cauchy-Schwarz and the Plancherel identity.
Another well known manifestation of the uncertainty principle is the fact that it is not possible for and
to both be compactly supported (unless of course they vanish entirely). This can be in fact be seen from the above table: if
is compactly supported, then
is an entire function; but the zeroes of a non-zero entire function are isolated, yielding a contradiction unless
vanishes. (Indeed, the table also shows that if one of
and
is compactly supported, then the other cannot have exponential decay.)
On the other hand, we have the example of the Gaussian functions ,
, which both decay faster than exponentially. The classical Hardy uncertainty principle asserts, roughly speaking, that this is the fastest that
and
can simultaneously decay:
Theorem 1 (Hardy uncertainty principle) Suppose that
is a (measurable) function such that
and
for all
and some
. Then
is a scalar multiple of the gaussian
.
This theorem is proven by complex-analytic methods, in particular the Phragmén-Lindelöf principle; for sake of completeness we give that proof below. But I was curious to see if there was a real-variable proof of the same theorem, avoiding the use of complex analysis. I was able to find the proof of a slightly weaker theorem:
Theorem 2 (Weak Hardy uncertainty principle) Suppose that
is a non-zero (measurable) function such that
and
for all
and some
. Then
for some absolute constant
.
Note that the correct value of should be
, as is implied by the true Hardy uncertainty principle. Despite the weaker statement, I thought the proof might still might be of interest as it is a little less “magical” than the complex-variable one, and so I am giving it below.
In the next few lectures, we will be studying four major classes of function spaces. In decreasing order of generality, these classes are the topological vector spaces, the normed vector spaces, the Banach spaces, and the Hilbert spaces. In order to motivate the discussion of the more general classes of spaces, we will first focus on the most special class – that of (real and complex) Hilbert spaces. These spaces can be viewed as generalisations of (real and complex) Euclidean spaces such as and
to infinite-dimensional settings, and indeed much of one’s Euclidean geometry intuition concerning lengths, angles, orthogonality, subspaces, etc. will transfer readily to arbitrary Hilbert spaces; in contrast, this intuition is not always accurate in the more general vector spaces mentioned above. In addition to Euclidean spaces, another fundamental example of Hilbert spaces comes from the Lebesgue spaces
of a measure space
. (There are of course many other Hilbert spaces of importance in complex analysis, harmonic analysis, and PDE, such as Hardy spaces
, Sobolev spaces
, and the space
of Hilbert-Schmidt operators, but we will not discuss those spaces much in this course. Complex Hilbert spaces also play a fundamental role in the foundations of quantum mechanics, being the natural space to hold all the possible states of a quantum system (possibly after projectivising the Hilbert space), but we will not discuss this subject here.)
Hilbert spaces are the natural abstract framework in which to study two important (and closely related) concepts: orthogonality and unitarity, allowing us to generalise familiar concepts and facts from Euclidean geometry such as the Cartesian coordinate system, rotations and reflections, and the Pythagorean theorem to Hilbert spaces. (For instance, the Fourier transform is a unitary transformation and can thus be viewed as a kind of generalised rotation.) Furthermore, the Hodge duality on Euclidean spaces has a partial analogue for Hilbert spaces, namely the Riesz representation theorem for Hilbert spaces, which makes the theory of duality and adjoints for Hilbert spaces especially simple (when compared with the more subtle theory of duality for, say, Banach spaces). Much later (next quarter, in fact), we will see that this duality allows us to extend the spectral theorem for self-adjoint matrices to that of self-adjoint operators on a Hilbert space.
These notes are only the most basic introduction to the theory of Hilbert spaces. In particular, the theory of linear transformations between two Hilbert spaces, which is perhaps the most important aspect of the subject, is not covered much at all here (but I hope to discuss it further in future lectures.)
Recent Comments