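A finite group $G$ is said to be a Frobenius group if there is a non-trivial subgroup $H$ of $G$ (known as the Frobenius complement of $G$) such that the conjugates $gHg^{-1}$ of $H$ are “as disjoint as possible” in the sense that $H \cap gHg^{-1} = \{1\}$ whenever $g \not\in H$. This gives a decomposition
$$G = \bigcup_{gH \in G/H} (gHg^{-1} \setminus \{1\}) \cup K \qquad (1)$$
where the Frobenius kernel $K$ of $G$ is defined as the identity element $1$ together with all the non-identity elements that are not conjugate to any element of $H$. Taking cardinalities, we conclude that
$$|G| = \frac{|G|}{|H|} (|H| - 1) + |K|$$
and hence
$$|H| |K| = |G|. \qquad (2)$$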
A remarkable theorem of Frobenius gives an unexpected amount of structure on $K$ and hence on $G$:
Theorem 1 (Frobenius’ theorem) Let $G$ be a Frobenius group with Frobenius complement $H$ and Frobenius kernel $K$. Then $K$ is a normal subgroup of $G$, and hence (by (2) and the disjointness of $H$ and $K$ outside the identity) $G$ is the semidirect product $K \rtimes H$ of $H$ and $K$.
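As a sanity check on the definitions above, here is a minimal Python sketch that verifies the Frobenius condition, the counting identity $|H||K| = |G|$, and the normality of the kernel on one small example. The example (the group of affine maps $x \mapsto ax+b$ over the field with $P = 5$ elements, with $H$ the maps fixing $0$) and all function names are illustrative assumptions, not part of the argument.

```python
# Illustrative brute-force check on the affine group of the field with P elements.
P = 5  # assumption: a small prime chosen for illustration

IDENTITY = (1, 0)


def compose(g, h):
    """Composition g∘h of affine maps x -> a*x + b, stored as pairs (a, b)."""
    (a, b), (c, d) = g, h
    return ((a * c) % P, (a * d + b) % P)


def inverse(g):
    a, b = g
    a_inv = pow(a, -1, P)  # modular inverse (needs Python 3.8+)
    return (a_inv, (-a_inv * b) % P)


def conjugate(g, h):
    """Return g h g^{-1}."""
    return compose(compose(g, h), inverse(g))


G = [(a, b) for a in range(1, P) for b in range(P)]
H = {(a, 0) for a in range(1, P)}  # the maps fixing the point 0


def is_frobenius_complement():
    """H meets g H g^{-1} only in the identity whenever g is not in H."""
    return all(
        {conjugate(g, h) for h in H} & H == {IDENTITY}
        for g in G
        if g not in H
    )


def frobenius_kernel():
    """Identity together with all elements not conjugate to any element of H."""
    conjugates_of_H = {conjugate(g, h) for g in G for h in H}
    return {IDENTITY} | (set(G) - conjugates_of_H)


K = frobenius_kernel()
print(is_frobenius_complement())
print(len(H) * len(K) == len(G))
print(all(compose(x, y) in K for x in K for y in K))
print(all(conjugate(g, k) in K for g in G for k in K))
```

In this example the kernel turns out to be the set of translations $x \mapsto x + b$, so the closure checks are easy to confirm by hand; the point of the theorem is that the same closure holds for every Frobenius group.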
I discussed Frobenius’ theorem and its proof in this recent blog post. This proof uses the theory of characters on a finite group $G$, in particular relying on the fact that a character on a subgroup $H$ can induce a character on $G$, which can then be decomposed into irreducible characters with natural number coefficients. Remarkably, even though a century has passed since Frobenius’ original argument, there is no proof known of this theorem which avoids character theory entirely; there are elementary proofs known when the complement $H$ has even order or when $H$ is solvable (we review both of these cases below the fold), which by the Feit-Thompson theorem does cover all the cases, but the proof of the Feit-Thompson theorem involves plenty of character theory (and also relies on Theorem 1). (The answers to this MathOverflow question give a good overview of the current state of affairs.)
I have been playing around recently with the problem of finding a character-free proof of Frobenius’ theorem. I didn’t succeed in obtaining a completely elementary proof, but I did find an argument which replaces character theory (which can be viewed as coming from the representation theory of the non-commutative group algebra $\mathbf{C}G \equiv L^2(G)$) with the Fourier analysis of class functions (i.e. the representation theory of the centre $Z(\mathbf{C}G) \equiv L^2(G)^G$ of the group algebra), thus replacing non-commutative representation theory by commutative representation theory. This is not a particularly radical departure from the existing proofs of Frobenius’ theorem, but it did seem to be a new proof which was technically “character-free” (even if it was not all that far from character-based in spirit), so I thought I would record it here.
The main ideas are as follows. The space $L^2(G)^G$ of class functions can be viewed as a commutative algebra with respect to the convolution operation $*$; as the regular representation is unitary and faithful, this algebra contains no nilpotent elements. As such, (Gelfand-style) Fourier analysis suggests that one can analyse this algebra through the idempotents: class functions $\phi$ such that $\phi * \phi = \phi$. In terms of characters, idempotents are nothing more than sums of the form $\sum_{\chi \in \Sigma} \chi(1) \chi$ for various collections $\Sigma$ of characters, but we can perform a fair amount of analysis on idempotents directly without recourse to characters. In particular, it turns out that idempotents enjoy some important integrality properties that can be established without invoking characters: for instance, by taking traces one can check that $\phi(1)$ is a natural number, and more generally we will show that $\mathbf{E}_{(a,b) \in S} \mathbf{E}_{x \in G} \phi(a x b^{-1} x^{-1})$ is a natural number whenever $S$ is a subgroup of $G \times G$ (see Corollary 4 below). For instance, the quantity
$$\mathrm{rank}(\phi) := \mathbf{E}_{a \in G} \mathbf{E}_{x \in G} \phi(a x a^{-1} x^{-1})$$
is a natural number which we will call the rank of $\phi$ (as it is also the linear rank of the transformation $f \mapsto f * \phi$ on $L^2(G)$).
In the case that $G$ is a Frobenius group with kernel $K$, the above integrality properties can be used after some elementary manipulations to establish that for any idempotent $\phi$, the quantity
$$\frac{1}{|G|} \sum_{a \in K} \mathbf{E}_{x \in G} \phi(a x a^{-1} x^{-1}) - \frac{1}{|G| |K|} \sum_{a,b \in K} \phi(a b^{-1}) \qquad (3)$$
is an integer. On the other hand, one can also show by elementary means that this quantity lies between $0$ and $\mathrm{rank}(\phi)$. These two facts are not strong enough on their own to impose much further structure on $\phi$, unless one restricts attention to minimal idempotents $\phi$. In this case spectral theory (or Gelfand theory, or the fundamental theorem of algebra) tells us that $\phi$ has rank one, and then the integrality gap comes into play and forces the quantity (3) to always be either zero or one. This can be used to imply that the convolution action of every minimal idempotent $\phi$ either preserves $\frac{|G|}{|K|} 1_K$ or annihilates it, which makes $\frac{|G|}{|K|} 1_K$ itself an idempotent, which makes $K$ normal.
Suppose that $G$ is a finite group of even order, thus $|G|$ is a multiple of two. By Cauchy’s theorem, this implies that $G$ contains an involution: an element $g$ in $G$ of order two. (Indeed, if no such involution existed, then $G$ would be partitioned into doubletons $\{g, g^{-1}\}$ together with the identity, so that $|G|$ would be odd, a contradiction.) Of course, groups of odd order have no involutions $g$, thanks to Lagrange’s theorem (since $G$ cannot split into doubletons $\{h, hg\}$).
The classical Brauer-Fowler theorem asserts that if a group has many involutions, then it must have a large non-trivial subgroup:
Theorem 1 (Brauer-Fowler theorem) Let $G$ be a finite group with at least $|G|/n$ involutions for some $n > 1$. Then $G$ contains a proper subgroup $H$ of index at most $n^2$.
This theorem (which is Theorem 2F in the original paper of Brauer and Fowler, who in fact manage to sharpen $n^2$ slightly to $n(n+2)/2$) has a number of quick corollaries which are also referred to as “the” Brauer-Fowler theorem. For instance, if $g$ is an involution of a group $G$, and the centraliser $C_G(g)$ has order $n$, then clearly $n \geq 2$ (as $C_G(g)$ contains $1$ and $g$) and the conjugacy class $\{aga^{-1}: a \in G\}$ has order $|G|/n$ (since the map $a \mapsto aga^{-1}$ has preimages that are cosets of $C_G(g)$). Every conjugate of an involution is again an involution, so by the Brauer-Fowler theorem $G$ contains a subgroup of order at least $\max(n, |G|/n^2)$. In particular, we can conclude that every group $G$ of even order contains a proper subgroup of order at least $|G|^{1/3}$.
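The counting in the preceding paragraph (an involution whose centraliser has order $n$ has a conjugacy class of size $|G|/n$, consisting entirely of involutions) can be checked by brute force on a small group. The sketch below uses the symmetric group on four points and a transposition as the involution; both choices, and all helper names, are assumptions made purely for illustration.

```python
from itertools import permutations


def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def involutions(group, identity):
    """All elements of order exactly two."""
    return [g for g in group if g != identity and compose(g, g) == identity]


def centraliser(group, g):
    return [a for a in group if compose(a, g) == compose(g, a)]


def conjugacy_class(group, g):
    return {compose(compose(a, g), inverse(a)) for a in group}


S4 = list(permutations(range(4)))
E = tuple(range(4))
g = (1, 0, 2, 3)  # the transposition swapping 0 and 1

n = len(centraliser(S4, g))
cls = conjugacy_class(S4, g)
print(n, len(cls), len(S4))
print(all(h in involutions(S4, E) for h in cls))
```

Here the class size times the centraliser order recovers the group order, which is the orbit-stabiliser count used in the text.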
Another corollary is that the size of a simple group of even order can be controlled by the size of a centraliser of one of its involutions:
Corollary 2 (Brauer-Fowler theorem) Let $G$ be a finite simple group with an involution $g$, and suppose that $C_G(g)$ has order $n$. Then $G$ has order at most $(n^2)!$.
Indeed, by the previous discussion $G$ has a proper subgroup $H$ of index at most $n^2$, which then gives a non-trivial permutation action of $G$ on the coset space $G/H$. The kernel of this action is a proper normal subgroup of $G$ and is thus trivial, so the action is faithful, and the claim follows.
If one assumes the Feit-Thompson theorem that all groups of odd order are solvable, then Corollary 2 suggests a strategy (first proposed by Brauer himself in 1954) to prove the classification of finite simple groups (CFSG) by induction on the order of the group. Namely, assume for contradiction that the CFSG failed, so that there is a counterexample $G$ of minimal order $|G|$ to the classification. This is a non-abelian finite simple group; by the Feit-Thompson theorem, it has even order and thus has at least one involution $g$. Take such an involution and consider its centraliser $C_G(g)$; this is a proper subgroup of $G$ of some order $n < |G|$. As $G$ is a minimal counterexample to the classification, one can in principle describe $C_G(g)$ in terms of the CFSG by factoring the group into simple components (via a composition series) and applying the CFSG to each such component. Now, the “only” thing left to do is to verify, for each isomorphism class of $C_G(g)$, that all the possible simple groups $G$ that could have this type of group as a centraliser of an involution obey the CFSG; Corollary 2 tells us that for each such isomorphism class for $C_G(g)$, there are only finitely many $G$ that could generate this class for one of its centralisers, so this task should be doable in principle for any given isomorphism class for $C_G(g)$. That’s all one needs to do to prove the classification of finite simple groups!
Needless to say, this program turns out to be far more difficult than the above summary suggests, and the actual proof of the CFSG does not quite proceed along these lines. However, a significant portion of the argument is based on a generalisation of this strategy, in which the concept of a centraliser of an involution is replaced by the more general notion of a normaliser of a $p$-group, and one studies not just a single normaliser but rather the entire family of such normalisers and how they interact with each other (and in particular, which normalisers of $p$-groups commute with each other), motivated in part by the theory of Tits buildings for Lie groups which dictates a very specific type of interaction structure between these $p$-groups in the key case when $G$ is a (sufficiently high rank) finite simple group of Lie type over a field of characteristic $p$. See the text of Aschbacher, Lyons, Smith, and Solomon for a more detailed description of this strategy.
The Brauer-Fowler theorem can be proven by a nice application of character theory, of the type discussed in this recent blog post, ultimately based on analysing the alternating tensor power of representations; I reproduce a version of this argument (taken from this text of Isaacs) below the fold. (The original argument of Brauer and Fowler is more combinatorial in nature.) However, I wanted to record a variant of the argument that relies not on the fine properties of characters, but on the cruder theory of quasirandomness for groups, the modern study of which was initiated by Gowers, and is discussed for instance in this previous post. It gives the following slightly weaker version of Corollary 2:
Corollary 3 (Weak Brauer-Fowler theorem) Let $G$ be a finite simple group with an involution $g$, and suppose that $C_G(g)$ has order $n$. Then $G$ can be identified with a subgroup of the unitary group $U_{4n^3}(\mathbf{C})$.
One can get an upper bound on $|G|$ from this corollary using Jordan’s theorem, but the resulting bound is a bit weaker than that in Corollary 2 (and the best bounds on Jordan’s theorem require the CFSG!).
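Proof: Let $A$ be the set of all involutions in $G$, then as discussed above $|A| \geq |G|/n$. We may assume that $G$ has no non-trivial unitary representation of dimension less than $4n^3$ (since such representations are automatically faithful by the simplicity of $G$); thus, in the language of quasirandomness, $G$ is $4n^3$-quasirandom, and is also non-abelian. We have the basic convolution estimate
$$\left\| 1_A * 1_A * 1_A - \frac{|A|^3}{|G|} \right\|_{\ell^\infty(G)} \leq (4n^3)^{-1/2} |G|^{1/2} |A|^{3/2}$$
(see Exercise 10 from this previous blog post). In particular,
$$1_A * 1_A * 1_A(1) \geq \frac{|A|^3}{|G|} - (4n^3)^{-1/2} |G|^{1/2} |A|^{3/2} \geq \frac{1}{2n^3} |G|^2$$
and so there are at least $\frac{1}{2n^3} |G|^2$ pairs $(g,h) \in A \times A$ such that $gh \in A^{-1} = A$, i.e. involutions $g, h$ whose product is also an involution. But any such involutions necessarily commute, since
$$g (gh) h = g^2 h^2 = 1 = (gh)^2 = g (hg) h.$$
Thus there are at least $\frac{1}{2n^3} |G|^2$ pairs $(g,h) \in G \times G$ of non-identity elements that commute, so by the pigeonhole principle there is a non-identity $g \in G$ whose centraliser $C_G(g)$ has order at least $\frac{1}{2n^3} |G|$. This centraliser cannot be all of $G$ since this would make $g$ central which contradicts the non-abelian simple nature of $G$. But then the quasiregular representation of $G$ on $G/C_G(g)$ has dimension at most $2n^3$, contradicting the quasirandomness.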
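The one purely algebraic step in the proof, that two involutions whose product is again an involution must commute, is easy to test exhaustively. The following sketch does so in the symmetric group on four points; the choice of group and the variable names are assumptions for illustration only, and the check says nothing about the quasirandomness part of the argument.

```python
from itertools import permutations


def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))


S4 = list(permutations(range(4)))
E = tuple(range(4))

# A is the set of involutions, as in the proof.
A = [g for g in S4 if g != E and compose(g, g) == E]

# Pairs of involutions whose product is also an involution.
pairs = [(g, h) for g in A for h in A if compose(g, h) in A]

all_commute = all(compose(g, h) == compose(h, g) for g, h in pairs)
print(len(pairs) > 0, all_commute)
```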
An abstract finite-dimensional complex Lie algebra, or Lie algebra for short, is a finite-dimensional complex vector space $\mathfrak{g}$ together with an anti-symmetric bilinear form $[,] = [,]_{\mathfrak{g}}: \mathfrak{g} \times \mathfrak{g} \to \mathfrak{g}$ that obeys the Jacobi identity
$$[[x,y],z] + [[y,z],x] + [[z,x],y] = 0 \qquad (1)$$
for all $x,y,z \in \mathfrak{g}$; by anti-symmetry one can also rewrite the Jacobi identity as
$$[x,[y,z]] = [[x,y],z] + [y,[x,z]]. \qquad (2)$$
We will usually omit the subscript from the Lie bracket $[,]_{\mathfrak{g}}$ when this will not cause ambiguity. A homomorphism $\phi: \mathfrak{g} \to \mathfrak{h}$ between two Lie algebras $\mathfrak{g}, \mathfrak{h}$ is a linear map that respects the Lie bracket, thus $\phi([x,y]_{\mathfrak{g}}) = [\phi(x),\phi(y)]_{\mathfrak{h}}$ for all $x,y \in \mathfrak{g}$. As with many other classes of mathematical objects, the class of Lie algebras together with their homomorphisms then form a category. One can of course also consider Lie algebras in infinite dimension or over other fields, but we will restrict attention throughout these notes to the finite-dimensional complex case. The trivial, zero-dimensional Lie algebra is denoted $0$; Lie algebras of positive dimension will be called non-trivial.
Lie algebras come up in many contexts in mathematics, in particular arising as the tangent space of complex Lie groups. It is thus very profitable to think of Lie algebras as being the infinitesimal component of a Lie group, and in particular almost all of the notation and concepts that are applicable to Lie groups (e.g. nilpotence, solvability, extensions, etc.) have infinitesimal counterparts in the category of Lie algebras (often with exactly the same terminology). See this previous blog post for more discussion about the connection between Lie algebras and Lie groups (that post was focused over the reals instead of the complexes, but much of the discussion carries over to the complex case).
A particular example of a Lie algebra is the general linear Lie algebra $\mathfrak{gl}(V)$ of linear transformations $x: V \to V$ on a finite-dimensional complex vector space (or vector space for short) $V$, with the commutator Lie bracket $[x,y] := xy - yx$; one easily verifies that this is indeed an abstract Lie algebra. We will define a concrete Lie algebra to be a Lie algebra that is a subalgebra of $\mathfrak{gl}(V)$ for some vector space $V$, and similarly define a representation of a Lie algebra $\mathfrak{g}$ to be a homomorphism $\rho: \mathfrak{g} \to \mathfrak{h}$ into a concrete Lie algebra $\mathfrak{h}$. It is a deep theorem of Ado (discussed in this previous post) that every abstract Lie algebra is in fact isomorphic to a concrete one (or equivalently, that every abstract Lie algebra has a faithful representation), but we will not need or prove this fact here.
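The claim that the commutator bracket makes square matrices into a Lie algebra can be spot-checked numerically. The sketch below tests anti-symmetry and the Jacobi identity on random integer matrices; the matrix size, the seed, and the helper names are arbitrary assumptions, and a random test of course illustrates the identity rather than proving it.

```python
import random


def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def bracket(x, y):
    """Commutator [x, y] = xy - yx."""
    n = len(x)
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[i][j] - yx[i][j] for j in range(n)] for i in range(n)]


def jacobi_defect(x, y, z):
    """[[x,y],z] + [[y,z],x] + [[z,x],y], which should be the zero matrix."""
    n = len(x)
    terms = [
        bracket(bracket(x, y), z),
        bracket(bracket(y, z), x),
        bracket(bracket(z, x), y),
    ]
    return [[sum(t[i][j] for t in terms) for j in range(n)] for i in range(n)]


def random_matrix(rng, n=3):
    return [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]


rng = random.Random(0)
x, y, z = (random_matrix(rng) for _ in range(3))
zero = [[0] * 3 for _ in range(3)]

print(jacobi_defect(x, y, z) == zero)
print(bracket(x, y) == [[-v for v in row] for row in bracket(y, x)])
```

Integer entries are used so that the comparison with the zero matrix is exact rather than subject to floating-point error.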
Even without Ado’s theorem, though, the structure of abstract Lie algebras is very well understood. As with objects in many other algebraic categories, a basic way to understand a Lie algebra $\mathfrak{g}$ is to factor it into two simpler algebras $\mathfrak{h}, \mathfrak{k}$ via a short exact sequence
$$0 \to \mathfrak{h} \to \mathfrak{g} \to \mathfrak{k} \to 0, \qquad (3)$$
thus one has an injective homomorphism from $\mathfrak{h}$ to $\mathfrak{g}$ and a surjective homomorphism from $\mathfrak{g}$ to $\mathfrak{k}$ such that the image of the former homomorphism is the kernel of the latter. (To be pedantic, a short exact sequence in a general category requires these homomorphisms to be monomorphisms and epimorphisms respectively, but in the category of Lie algebras these turn out to reduce to the more familiar concepts of injectivity and surjectivity respectively.) Given such a sequence, one can (non-uniquely) identify $\mathfrak{g}$ with the vector space $\mathfrak{h} \times \mathfrak{k}$ equipped with a Lie bracket of the form
$$[(t,x),(s,y)]_{\mathfrak{g}} = ([t,s]_{\mathfrak{h}} + A(x,s) - A(y,t) + B(x,y), [x,y]_{\mathfrak{k}}) \qquad (4)$$
for some bilinear maps $A: \mathfrak{k} \times \mathfrak{h} \to \mathfrak{h}$ and $B: \mathfrak{k} \times \mathfrak{k} \to \mathfrak{h}$ that obey some Jacobi-type identities which we will not record here. Understanding exactly what maps $A, B$ are possible here (up to coordinate change) can be a difficult task (and is one of the key objectives of Lie algebra cohomology), but in principle at least, the problem of understanding $\mathfrak{g}$ can be reduced to that of understanding its factors $\mathfrak{k}, \mathfrak{h}$. To emphasise this, I will (perhaps idiosyncratically) express the existence of a short exact sequence (3) by the ATLAS-type notation
$$\mathfrak{g} = \mathfrak{h} . \mathfrak{k} \qquad (5)$$
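although one should caution that for given $\mathfrak{h}$ and $\mathfrak{k}$, there can be multiple non-isomorphic $\mathfrak{g}$ that can form a short exact sequence with $\mathfrak{h}, \mathfrak{k}$, so that $\mathfrak{h} . \mathfrak{k}$ is not a uniquely defined combination of $\mathfrak{h}$ and $\mathfrak{k}$; one could emphasise this by writing $\mathfrak{h} ._{A,B} \mathfrak{k}$ instead of $\mathfrak{h} . \mathfrak{k}$, though we will not do so here. We will refer to $\mathfrak{g}$ as an extension of $\mathfrak{k}$ by $\mathfrak{h}$, and read the notation (5) as “$\mathfrak{g}$ is $\mathfrak{h}$-by-$\mathfrak{k}$”; confusingly, these two notations reverse the subject and object of “by”, but unfortunately both notations are well entrenched in the literature. We caution that the operation $.$ is not commutative, and it is only partly associative: every Lie algebra of the form $\mathfrak{k} . (\mathfrak{h} . \mathfrak{l})$ is also of the form $(\mathfrak{k} . \mathfrak{h}) . \mathfrak{l}$, but the converse is not true (see this previous blog post for some related discussion). As we are working in the infinitesimal world of Lie algebras (which have an additive group operation) rather than Lie groups (in which the group operation is usually written multiplicatively), it may help to think of $\mathfrak{h} . \mathfrak{k}$ as a (twisted) “sum” of $\mathfrak{h}$ and $\mathfrak{k}$ rather than a “product”; for instance, we have $\mathfrak{g} = 0 . \mathfrak{g}$ and $\mathfrak{g} = \mathfrak{g} . 0$, and also $\dim (\mathfrak{h} . \mathfrak{k}) = \dim \mathfrak{h} + \dim \mathfrak{k}$.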
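Special examples of extensions $\mathfrak{h} . \mathfrak{k}$ of $\mathfrak{k}$ by $\mathfrak{h}$ include the direct sum (or direct product) $\mathfrak{h} \oplus \mathfrak{k}$ (also denoted $\mathfrak{h} \times \mathfrak{k}$), which is given by the construction (4) with $A$ and $B$ both vanishing, and the split extension (or semidirect product) $\mathfrak{h} : \mathfrak{k} = \mathfrak{h} :_\rho \mathfrak{k}$ (also denoted $\mathfrak{h} \rtimes \mathfrak{k} = \mathfrak{h} \rtimes_\rho \mathfrak{k}$), which is given by the construction (4) with $B$ vanishing and the bilinear map $A: \mathfrak{k} \times \mathfrak{h} \to \mathfrak{h}$ taking the form $A(x,t) = \rho(x)(t)$ for some representation $\rho: \mathfrak{k} \to \mathrm{Der}(\mathfrak{h})$ of $\mathfrak{k}$ in the concrete Lie algebra of derivations $\mathrm{Der}(\mathfrak{h}) \subset \mathfrak{gl}(\mathfrak{h})$ of $\mathfrak{h}$, that is to say the algebra of linear maps $D: \mathfrak{h} \to \mathfrak{h}$ that obey the Leibniz rule
$$D[s,t]_{\mathfrak{h}} = [Ds,t]_{\mathfrak{h}} + [s,Dt]_{\mathfrak{h}}$$
for all $s,t \in \mathfrak{h}$. (The derivation algebra $\mathrm{Der}(\mathfrak{g})$ of a Lie algebra $\mathfrak{g}$ is analogous to the automorphism group $\mathrm{Aut}(G)$ of a Lie group $G$, with the two concepts being intertwined by the tangent space functor $G \mapsto \mathfrak{g}$ from Lie groups to Lie algebras (i.e. the derivation algebra is the infinitesimal version of the automorphism group). Of course, this functor also intertwines the Lie algebra and Lie group versions of most of the other concepts discussed here, such as extensions, semidirect products, etc.)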
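There are two general ways to factor a Lie algebra $\mathfrak{g}$ as an extension $\mathfrak{h} . \mathfrak{k}$ of a smaller Lie algebra $\mathfrak{k}$ by another smaller Lie algebra $\mathfrak{h}$. One is to locate a Lie algebra ideal (or ideal for short) $\mathfrak{h}$ in $\mathfrak{g}$, thus $[\mathfrak{h},\mathfrak{g}] \subset \mathfrak{h}$, where $[\mathfrak{h},\mathfrak{g}]$ denotes the Lie algebra generated by $\{[x,y]: x \in \mathfrak{h}, y \in \mathfrak{g}\}$, and then take $\mathfrak{k}$ to be the quotient space $\mathfrak{g}/\mathfrak{h}$ in the usual manner; one can check that $\mathfrak{h}$, $\mathfrak{k}$ are also Lie algebras and that we do indeed have a short exact sequence
$$\mathfrak{g} = \mathfrak{h} . (\mathfrak{g}/\mathfrak{h}).$$
Conversely, whenever one has a factorisation $\mathfrak{g} = \mathfrak{h} . \mathfrak{k}$, one can identify $\mathfrak{h}$ with an ideal in $\mathfrak{g}$, and $\mathfrak{k}$ with the quotient of $\mathfrak{g}$ by $\mathfrak{h}$.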
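The other general way to obtain such a factorisation is to start with a homomorphism $\rho: \mathfrak{g} \to \mathfrak{m}$ of $\mathfrak{g}$ into another Lie algebra $\mathfrak{m}$, take $\mathfrak{k}$ to be the image $\rho(\mathfrak{g})$ of $\mathfrak{g}$, and $\mathfrak{h}$ to be the kernel $\ker \rho := \{x \in \mathfrak{g}: \rho(x) = 0\}$. Again, it is easy to see that this does indeed create a short exact sequence:
$$\mathfrak{g} = (\ker \rho) . \rho(\mathfrak{g}).$$
Conversely, whenever one has a factorisation $\mathfrak{g} = \mathfrak{h} . \mathfrak{k}$, one can identify $\mathfrak{k}$ with the image of $\mathfrak{g}$ under some homomorphism, and $\mathfrak{h}$ with the kernel of that homomorphism. Note that if a representation $\rho: \mathfrak{g} \to \mathfrak{gl}(V)$ is faithful (i.e. injective), then the kernel is trivial and $\mathfrak{g}$ is isomorphic to $\rho(\mathfrak{g})$.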
Now we consider some examples of factoring some class of Lie algebras into simpler Lie algebras. The easiest examples of Lie algebras to understand are the abelian Lie algebras $\mathfrak{g}$, in which the Lie bracket identically vanishes. Every one-dimensional Lie algebra is automatically abelian, and thus isomorphic to the scalar algebra $\mathbf{C}$. Conversely, by using an arbitrary linear basis of $\mathfrak{g}$, we see that an abelian Lie algebra is isomorphic to the direct sum of one-dimensional algebras. Thus, a Lie algebra is abelian if and only if it is isomorphic to the direct sum of finitely many copies of $\mathbf{C}$.
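Now consider a Lie algebra $\mathfrak{g}$ that is not necessarily abelian. We then form the derived algebra $[\mathfrak{g},\mathfrak{g}]$; this algebra is trivial if and only if $\mathfrak{g}$ is abelian. It is easy to see that $[\mathfrak{h},\mathfrak{k}]$ is an ideal whenever $\mathfrak{h},\mathfrak{k}$ are ideals, so in particular the derived algebra $[\mathfrak{g},\mathfrak{g}]$ is an ideal and we thus have the short exact sequence
$$0 \to [\mathfrak{g},\mathfrak{g}] \to \mathfrak{g} \to \mathfrak{g}/[\mathfrak{g},\mathfrak{g}] \to 0.$$
The algebra $\mathfrak{g}/[\mathfrak{g},\mathfrak{g}]$ is the maximal abelian quotient of $\mathfrak{g}$, and is known as the abelianisation of $\mathfrak{g}$. If it is trivial, we call the Lie algebra perfect. If instead it is non-trivial, then the derived algebra has strictly smaller dimension than $\mathfrak{g}$. From this, it is natural to associate two series to any Lie algebra $\mathfrak{g}$, the lower central series
$$\mathfrak{g}_1 := \mathfrak{g}; \quad \mathfrak{g}_2 := [\mathfrak{g}, \mathfrak{g}_1]; \quad \mathfrak{g}_3 := [\mathfrak{g}, \mathfrak{g}_2]; \quad \ldots$$
and the derived series
$$\mathfrak{g}^{(1)} := \mathfrak{g}; \quad \mathfrak{g}^{(2)} := [\mathfrak{g}^{(1)}, \mathfrak{g}^{(1)}]; \quad \mathfrak{g}^{(3)} := [\mathfrak{g}^{(2)}, \mathfrak{g}^{(2)}]; \quad \ldots$$
By induction we see that these are both decreasing series of ideals of $\mathfrak{g}$, with the derived series being slightly smaller ($\mathfrak{g}^{(k)} \subseteq \mathfrak{g}_k$ for all $k$). We say that a Lie algebra is nilpotent if its lower central series is eventually trivial, and solvable if its derived series eventually becomes trivial. Thus, abelian Lie algebras are nilpotent, and nilpotent Lie algebras are solvable, but the converses are not necessarily true. For instance, in the general linear Lie algebra $\mathfrak{gl}_n = \mathfrak{gl}(\mathbf{C}^n)$, which can be identified with the Lie algebra of $n \times n$ complex matrices, the subalgebra $\mathfrak{n}$ of strictly upper triangular matrices is nilpotent (but not abelian for $n \geq 3$), while the subalgebra $\mathfrak{b}$ of upper triangular matrices is solvable (but not nilpotent for $n \geq 2$). It is also clear that any subalgebra of a nilpotent algebra is nilpotent, and similarly for solvable or abelian algebras.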
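To make the two series concrete, the following sketch computes the dimensions along the lower central series and the derived series for the strictly upper triangular and the upper triangular $3 \times 3$ matrices. It is only an illustration under stated assumptions: the size $N = 3$, the use of exact `Fraction` arithmetic, and every helper name are choices made here, not taken from the notes.

```python
from fractions import Fraction

N = 3  # assumption: 3x3 matrices, small enough to check by hand


def unit(i, j):
    """Elementary matrix with a single 1 in position (i, j)."""
    return [[Fraction(int((r, c) == (i, j))) for c in range(N)] for r in range(N)]


def bracket(x, y):
    """Commutator [x, y] = xy - yx."""
    def mul(a, b):
        return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
                for i in range(N)]
    xy, yx = mul(x, y), mul(y, x)
    return [[xy[i][j] - yx[i][j] for j in range(N)] for i in range(N)]


def basis_of_span(mats):
    """Basis of the linear span, via incremental Gaussian elimination."""
    basis = []  # pairs (pivot index, flattened row)
    for m in mats:
        row = [v for r in m for v in r]
        for pivot, b in basis:
            if row[pivot] != 0:
                factor = row[pivot] / b[pivot]
                row = [rv - factor * bv for rv, bv in zip(row, b)]
        if any(row):
            pivot = next(i for i, v in enumerate(row) if v != 0)
            basis.append((pivot, row))
    return [[row[i * N:(i + 1) * N] for i in range(N)] for _, row in basis]


def bracket_span(xs, ys):
    """Basis of the span of all brackets [x, y] with x in xs, y in ys."""
    return basis_of_span([bracket(x, y) for x in xs for y in ys])


def series_dims(start, step):
    """Dimensions along a series, stopping once the dimension stabilises."""
    dims, current = [len(start)], start
    while True:
        nxt = step(current)
        if len(nxt) == len(current):
            return dims
        dims.append(len(nxt))
        current = nxt


def lower_central_dims(g):
    return series_dims(g, lambda cur: bracket_span(g, cur))


def derived_dims(g):
    return series_dims(g, lambda cur: bracket_span(cur, cur))


STRICT_UPPER = [unit(i, j) for i in range(N) for j in range(N) if i < j]
UPPER = [unit(i, j) for i in range(N) for j in range(N) if i <= j]

print(lower_central_dims(STRICT_UPPER))
print(derived_dims(UPPER))
print(lower_central_dims(UPPER))
```

A list ending in `0` means the series reaches the trivial algebra; a list that stops at a positive dimension means the series has stabilised without becoming trivial, which is how the upper triangular matrices fail to be nilpotent.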
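From the above discussion we see that a Lie algebra $\mathfrak{g}$ is solvable if and only if it can be represented by a tower of abelian extensions, thus
$$\mathfrak{g} = \mathfrak{a}_1 . (\mathfrak{a}_2 . \ldots (\mathfrak{a}_{k-1} . \mathfrak{a}_k) \ldots )$$
for some abelian $\mathfrak{a}_1, \ldots, \mathfrak{a}_k$. Similarly, a Lie algebra $\mathfrak{g}$ is nilpotent if it is expressible as a tower of central extensions (so that in all the extensions $\mathfrak{h} . \mathfrak{k}$ in the above factorisation, $\mathfrak{h}$ is central in $\mathfrak{h} . \mathfrak{k}$, where we say that $\mathfrak{h}$ is central in $\mathfrak{g}$ if $[\mathfrak{h},\mathfrak{g}] = 0$). We also see that an extension $\mathfrak{h} . \mathfrak{k}$ is solvable if and only if both factors $\mathfrak{h}, \mathfrak{k}$ are solvable. Splitting abelian algebras into cyclic (i.e. one-dimensional) ones, we thus see that a finite-dimensional Lie algebra is solvable if and only if it is polycyclic, i.e. it can be represented by a tower of cyclic extensions.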
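For our next fundamental example of using short exact sequences to split a general Lie algebra into simpler objects, we observe that every abstract Lie algebra $\mathfrak{g}$ has an adjoint representation $\mathrm{ad}: \mathfrak{g} \to \mathrm{ad}(\mathfrak{g}) \subset \mathfrak{gl}(\mathfrak{g})$, where for each $x \in \mathfrak{g}$, $\mathrm{ad}\, x \in \mathfrak{gl}(\mathfrak{g})$ is the linear map $(\mathrm{ad}\, x)(y) := [x,y]$; one easily verifies that this is indeed a representation (indeed, (2) is equivalent to the assertion that $\mathrm{ad}\, [x,y] = [\mathrm{ad}\, x, \mathrm{ad}\, y]$ for all $x,y \in \mathfrak{g}$). The kernel of this representation is the center $Z(\mathfrak{g}) := \{x \in \mathfrak{g}: [x,\mathfrak{g}] = 0\}$, which is the maximal central subalgebra of $\mathfrak{g}$. We thus have the short exact sequence
$$0 \to Z(\mathfrak{g}) \to \mathfrak{g} \to \mathrm{ad}(\mathfrak{g}) \to 0$$
which, among other things, shows that every abstract Lie algebra is a central extension of a concrete Lie algebra (which can serve as a cheap substitute for Ado’s theorem mentioned earlier).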
For our next fundamental decomposition of Lie algebras, we need some more definitions. A Lie algebra $\mathfrak{g}$ is simple if it is non-abelian and has no ideals other than $0$ and $\mathfrak{g}$; thus simple Lie algebras cannot be factored $\mathfrak{g} = \mathfrak{h} . \mathfrak{k}$ into strictly smaller algebras $\mathfrak{h}, \mathfrak{k}$. In particular, simple Lie algebras are automatically perfect and centerless. We have the following fundamental theorem:
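Theorem 1 (Equivalent definitions of semisimplicity) Let $\mathfrak{g}$ be a Lie algebra. Then the following are equivalent:
- (i) $\mathfrak{g}$ does not contain any non-trivial solvable ideal.
- (ii) $\mathfrak{g}$ does not contain any non-trivial abelian ideal.
- (iii) The Killing form $K: \mathfrak{g} \times \mathfrak{g} \to \mathbf{C}$, defined as the bilinear form $K(x,y) := \mathrm{tr}_{\mathfrak{g}}((\mathrm{ad}\, x)(\mathrm{ad}\, y))$, is non-degenerate on $\mathfrak{g}$.
- (iv) $\mathfrak{g}$ is isomorphic to the direct sum of finitely many non-abelian simple Lie algebras.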
We review the proof of this theorem later in these notes. A Lie algebra obeying any (and hence all) of the properties (i)-(iv) is known as a semisimple Lie algebra. The statement (iv) is usually taken as the definition of semisimplicity; the equivalence of (iv) and (i) is a special case of Weyl’s complete reducibility theorem (see Theorem 44), and the equivalence of (iv) and (iii) is known as the Cartan semisimplicity criterion. (The equivalence of (i) and (ii) is easy.)
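As a small worked instance of criterion (iii), the sketch below computes the Killing form of the Lie algebra of traceless $2 \times 2$ matrices in the basis $e, f, h$ and checks that its Gram matrix has non-zero determinant. The choice of this algebra and basis, and all function names, are assumptions made for illustration; the code only uses the definition of the Killing form as the trace of a product of adjoint maps.

```python
def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def bracket(x, y):
    """Commutator [x, y] = xy - yx."""
    n = len(x)
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[i][j] - yx[i][j] for j in range(n)] for i in range(n)]


# Basis of the traceless 2x2 matrices.
E = [[0, 1], [0, 0]]
F = [[0, 0], [1, 0]]
H = [[1, 0], [0, -1]]
BASIS = [E, F, H]


def coords(m):
    """Coordinates of a traceless 2x2 matrix in the basis (E, F, H)."""
    return [m[0][1], m[1][0], m[0][0]]


def ad(x):
    """Matrix of y -> [x, y] in the basis (E, F, H); column j is the image of BASIS[j]."""
    columns = [coords(bracket(x, b)) for b in BASIS]
    return [[columns[j][i] for j in range(3)] for i in range(3)]


def killing(x, y):
    """Killing form K(x, y) = tr(ad x ad y)."""
    product = matmul(ad(x), ad(y))
    return sum(product[i][i] for i in range(3))


def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


KILLING = [[killing(x, y) for y in BASIS] for x in BASIS]
print(KILLING)
print(det3(KILLING) != 0)
```

A non-zero determinant of the Gram matrix is exactly non-degeneracy of the form in this basis.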
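If $\mathfrak{h}$ and $\mathfrak{k}$ are solvable ideals of a Lie algebra $\mathfrak{g}$, then it is not difficult to see that the vector sum $\mathfrak{h} + \mathfrak{k}$ is also a solvable ideal (because on quotienting by $\mathfrak{h}$ we see that the derived series of $\mathfrak{h} + \mathfrak{k}$ must eventually fall inside $\mathfrak{h}$, and thence must eventually become trivial by the solvability of $\mathfrak{h}$). As our Lie algebras are finite dimensional, we conclude that $\mathfrak{g}$ has a unique maximal solvable ideal, known as the radical $\mathrm{rad}\, \mathfrak{g}$ of $\mathfrak{g}$. The quotient $\mathfrak{g}/\mathrm{rad}\, \mathfrak{g}$ is then a Lie algebra with trivial radical, and is thus semisimple by the above theorem, giving the Levi decomposition
$$\mathfrak{g} = \mathrm{rad}\, \mathfrak{g} . (\mathfrak{g}/\mathrm{rad}\, \mathfrak{g})$$
expressing an arbitrary Lie algebra as an extension of a semisimple Lie algebra $\mathfrak{g}/\mathrm{rad}\, \mathfrak{g}$ by a solvable algebra $\mathrm{rad}\, \mathfrak{g}$ (and it is not hard to see that this is the only possible such extension up to isomorphism). Indeed, a deep theorem of Levi allows one to upgrade this decomposition to a split extension
$$\mathfrak{g} = \mathrm{rad}\, \mathfrak{g} : (\mathfrak{g}/\mathrm{rad}\, \mathfrak{g})$$
although we will not need or prove this result here.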
In view of the above decompositions, we see that we can factor any Lie algebra (using a suitable combination of direct sums and extensions) into a finite number of simple Lie algebras and the scalar algebra $\mathbf{C}$. In principle, this means that one can understand an arbitrary Lie algebra once one understands all the simple Lie algebras (which, being defined over $\mathbf{C}$, are somewhat confusingly referred to as simple complex Lie algebras in the literature). Amazingly, this latter class of algebras are completely classified:
Theorem 2 (Classification of simple Lie algebras) Up to isomorphism, every simple Lie algebra is of one of the following forms:
- $A_n = \mathfrak{sl}_{n+1}$ for some $n \geq 1$.
- $B_n = \mathfrak{so}_{2n+1}$ for some $n \geq 2$.
- $C_n = \mathfrak{sp}_{2n}$ for some $n \geq 3$.
- $D_n = \mathfrak{so}_{2n}$ for some $n \geq 4$.
- $E_6$, $E_7$, or $E_8$.
- $F_4$.
- $G_2$.
(The precise definition of the classical Lie algebras $A_n, B_n, C_n, D_n$ and the exceptional Lie algebras $E_6, E_7, E_8, F_4, G_2$ will be recalled later.)
(One can extend the families $A_n, B_n, C_n, D_n$ of classical Lie algebras a little bit to smaller values of $n$, but the resulting algebras are either isomorphic to other algebras on this list, or cease to be simple; see this previous post for further discussion.)
This classification is a basic starting point for the classification of many other related objects, including Lie algebras and Lie groups over more general fields (e.g. the reals $\mathbf{R}$), as well as finite simple groups. Being so fundamental to the subject, this classification is covered in almost every basic textbook in Lie algebras, and I myself learned it many years ago in an honours undergraduate course back in Australia. The proof is rather lengthy, though, and I have always had difficulty keeping it straight in my head. So I have decided to write some notes on the classification in this blog post, aiming to be self-contained (though moving rapidly). There is no new material in this post, though; it is all drawn from standard reference texts (I relied particularly on Fulton and Harris’s text, which I highly recommend). In fact it seems remarkably hard to deviate from the standard routes given in the literature to the classification; I would be interested in knowing about other ways to reach the classification (or substeps in that classification) that are genuinely different from the orthodox route.
The classification of finite simple groups (CFSG), first announced in 1983 but only fully completed in 2004, is one of the monumental achievements of twentieth century mathematics. Spanning hundreds of papers and tens of thousands of pages, it has been called the “enormous theorem”. A “second generation” proof of the theorem is nearly completed which is a little shorter (estimated at about five thousand pages in length), but currently there is no reasonably sized proof of the classification.
An important precursor of the CFSG is the Feit-Thompson theorem from 1962-1963, which asserts that every finite group of odd order is solvable, or equivalently that every non-abelian finite simple group has even order. This is an immediate consequence of CFSG, and conversely the Feit-Thompson theorem is an essential starting point in the proof of the classification, since it allows one to reduce matters to groups of even order for which key additional tools (such as the Brauer-Fowler theorem) become available. The original proof of the Feit-Thompson theorem is 255 pages long, which is significantly shorter than the proof of the CFSG, but still far from short. While parts of the proof of the Feit-Thompson theorem have been simplified (and it has recently been converted, after six years of effort, into an argument that has been verified by the proof assistant Coq), the available proofs of this theorem are still extremely lengthy by any reasonable standard.
However, there is a significantly simpler special case of the Feit-Thompson theorem that was established previously by Suzuki in 1957, which was influential in the proof of the more general Feit-Thompson theorem (and thus indirectly to the proof of CFSG). Define a CA-group to be a group $G$ with the property that the centraliser $C_G(x) := \{g \in G: gx = xg\}$ of any non-identity element $x \in G$ is abelian; equivalently, the commuting relation $x \sim y$ (defined as the relation that holds when $x$ commutes with $y$, thus $xy = yx$) is an equivalence relation on the non-identity elements $G \setminus \{1\}$ of $G$. Trivially, every abelian group is CA. A non-abelian example of a CA-group is the $ax+b$ group of invertible affine transformations $x \mapsto ax+b$ on a field $F$. A little less obviously, the special linear group $SL_2(F_q)$ over a finite field $F_q$ is a CA-group when $q$ is a power of two. The finite simple groups of Lie type are not, in general, CA-groups, but when the rank is bounded they tend to behave as if they were “almost CA”; the centraliser of a generic element in $SL_d(F_q)$, for instance (when $d$ is bounded and $q$ is large), is typically a maximal torus (because most elements in $SL_d(F_q)$ are regular semisimple) which is certainly abelian. In view of the CFSG, we thus see that CA or nearly CA groups form an important subclass of the simple groups, and it is thus of interest to study them separately. To this end, we have
Theorem 1 (Suzuki’s theorem on CA-groups) Every finite CA-group of odd order is solvable.
Of course, this theorem is superseded by the more general Feit-Thompson theorem, but Suzuki’s proof is substantially shorter (the original proof is nine pages) and will be given in this post. (See this survey of Solomon for some discussion of the link between Suzuki’s argument and the Feit-Thompson argument.) Suzuki’s analysis can be pushed further to give an essentially complete classification of all the finite CA-groups (of either odd or even order), but we will not pursue these matters here.
Moving even further down the ladder of simple precursors of CFSG is the following theorem of Frobenius from 1901. Define a Frobenius group to be a finite group $G$ which has a subgroup $H$ (called the Frobenius complement) with the property that all the non-trivial conjugates $gHg^{-1}$ of $H$, for $g \in G \setminus H$, intersect $H$ only at the identity. For instance the $ax+b$ group is also a Frobenius group (take $H$ to be the affine transformations that fix a specified point $x_0 \in F$, e.g. the origin). This example suggests that there is some overlap between the notions of a Frobenius group and a CA group. Indeed, note that if $G$ is a CA-group and $H$ is a maximal abelian subgroup of $G$, then any conjugate $gHg^{-1}$ of $H$ that is not identical to $H$ will intersect $H$ only at the identity (because $H$ and each of its conjugates consist of equivalence classes under the commuting relation $\sim$, together with the identity). So if a maximal abelian subgroup $H$ of a CA-group is its own normaliser (thus $N(H) := \{g \in G: gH = Hg\}$ is equal to $H$), then the group is a Frobenius group.
Frobenius’ theorem places an unexpectedly strong amount of structure on a Frobenius group:
Theorem 2 (Frobenius’ theorem) Let $G$ be a Frobenius group with Frobenius complement $H$. Then there exists a normal subgroup $K$ of $G$ (called the Frobenius kernel of $G$) such that $G$ is the semi-direct product $K \rtimes H$ of $H$ and $K$.
Roughly speaking, this theorem indicates that all Frobenius groups “behave” like the $ax+b$ example (which is a quintessential example of a semi-direct product).
Note that if every CA-group of odd order was either Frobenius or abelian, then Theorem 2 would imply Theorem 1 by an induction on the order of $G$, since any subgroup of a CA-group is clearly again a CA-group. Indeed, the proof of Suzuki’s theorem does basically proceed by this route (Suzuki’s arguments do indeed imply that CA-groups of odd order are Frobenius or abelian, although we will not quite establish that fact here).
Frobenius’ theorem can be reformulated in the following concrete combinatorial form:
Theorem 3 (Frobenius’ theorem, equivalent version) Let $G$ be a group of permutations acting transitively on a finite set $X$, with the property that any non-identity permutation in $G$ fixes at most one point in $X$. Then the set of permutations in $G$ that fix no points in $X$, together with the identity, is closed under composition.
Again, a good example to keep in mind for this theorem is when $G$ is the group of affine permutations on a field $F$ (i.e. the $ax+b$ group for that field), and $X$ is the set of points on that field. In that case, the set of permutations in $G$ that do not fix any points are the non-trivial translations.
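The affine example can be checked directly in the permutation language of Theorem 3. In the sketch below the field is taken to be the integers modulo $P = 7$ and each affine map is stored as a tuple of images; the prime and all names are illustrative assumptions, and the check confirms the hypotheses and the conclusion of the theorem only in this one case.

```python
P = 7  # assumption: a small prime, so the integers mod P form a field


def affine(a, b):
    """The permutation x -> a*x + b of {0, ..., P-1}, as a tuple of images."""
    return tuple((a * x + b) % P for x in range(P))


def compose(p, q):
    """Return p∘q (apply q first, then p)."""
    return tuple(p[q[x]] for x in range(P))


def fixed_points(p):
    return [x for x in range(P) if p[x] == x]


IDENTITY = tuple(range(P))
G = {affine(a, b) for a in range(1, P) for b in range(P)}

# Hypothesis: every non-identity permutation fixes at most one point.
max_fixed = max(len(fixed_points(p)) for p in G if p != IDENTITY)

# Fixed-point-free permutations together with the identity.
K = {p for p in G if not fixed_points(p)} | {IDENTITY}
translations = {affine(1, b) for b in range(P)}

print(max_fixed)
print(K == translations)
print(all(compose(p, q) in K for p in K for q in K))
```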
To deduce Theorem 3 from Theorem 2, one applies Theorem 2 to the stabiliser of a single point in $X$. Conversely, to deduce Theorem 2 from Theorem 3, set $X := G/H = \{gH: g \in G\}$ to be the space of left-cosets of $H$, with the obvious left $G$-action; one easily verifies that this action is faithful, transitive, and each non-identity element $g$ of $G$ fixes at most one left-coset of $H$ (basically because it lies in at most one conjugate of $H$). If we let $K$ be the elements of $G$ that do not fix any point in $X$, plus the identity, then by Theorem 3 $K$ is closed under composition; it is also clearly closed under inverse and conjugation, and is hence a normal subgroup of $G$. From construction $K$ is the identity plus the complement of all the $|G|/|H|$ conjugates of $H$, which are all disjoint except at the identity, so by counting elements we see that
$$|K| = |G| - \frac{|G|}{|H|} (|H| - 1) = \frac{|G|}{|H|}.$$
As $H$ normalises $K$ and is disjoint from $K$ outside the identity, we thus see that $KH = K \rtimes H$ is all of $G$, giving Theorem 2.
Despite the appealingly concrete and elementary form of Theorem 3, the only known proofs of that theorem (or equivalently, Theorem 2) in its full generality proceed via the machinery of group characters (which one can think of as a version of Fourier analysis for nonabelian groups). On the other hand, once one establishes the basic theory of these characters (reviewed below the fold), the proof of Frobenius’ theorem is very short, which gives quite a striking example of the power of character theory. The proof of Suzuki’s theorem also proceeds via character theory, and is basically a more involved version of the Frobenius argument; again, no character-free proof of Suzuki’s theorem is currently known. (The proofs of Feit-Thompson and CFSG also involve characters, but those proofs also contain many other arguments of much greater complexity than the character-based portions of the proof.)
It seems to me that the above four theorems (Frobenius, Suzuki, Feit-Thompson, and CFSG) provide a ladder of sorts (with exponentially increasing complexity at each step) to the full classification, and that any new approach to the classification might first begin by revisiting the earlier theorems on this ladder and finding new proofs of these results first (in particular, if one had a “robust” proof of Suzuki’s theorem that also gave non-trivial control on “almost CA-groups” – whatever that means – then this might lead to a new route to classifying the finite simple groups of Lie type and bounded rank). But even for the simplest two results on this ladder – Frobenius and Suzuki – it seems remarkably difficult to find any proof that is not essentially the character-based proof. (Even trying to replace character theory by its close cousin, representation theory, doesn’t seem to work unless one gives in to the temptation to take traces everywhere and put the characters back in; it seems that rather than abandon characters altogether, one needs to find some sort of “robust” generalisation of existing character-based methods.) In any case, I am recording here the standard character-based proofs of the theorems of Frobenius and Suzuki below the fold. There is nothing particularly novel here, but I wanted to collect all the relevant material in one place, largely for my own benefit.
Way back in 2007, I wrote a blog post giving Einstein’s derivation of his famous equation $E = mc^2$ for the rest energy $E$ of a body with mass $m$. (Throughout this post, mass is used to refer to the invariant mass (also known as rest mass) of an object.) This derivation used a number of physical assumptions, including the following:
- The two postulates of special relativity: firstly, that the laws of physics are the same in every inertial reference frame, and secondly that the speed of light in vacuum takes the same value $c$ in every such inertial frame.
- Planck’s relation and de Broglie’s law for photons, relating the frequency, energy, and momentum of such photons together.
- The law of conservation of energy, and the law of conservation of momentum, as well as the additivity of these quantities (i.e. the energy of a system is the sum of the energy of its components, and similarly for momentum).
- The Newtonian approximations
,
to energy and momentum at low velocities.
The argument was one-dimensional in nature, in the sense that only one of the three spatial dimensions was actually used in the proof.
As was pointed out in comments in the previous post by Laurens Gunnarsen, this derivation has the curious feature of needing some laws from quantum mechanics (specifically, the Planck and de Broglie laws) in order to derive an equation in special relativity (which does not ostensibly require quantum mechanics). One can then ask whether one can give a derivation that does not require such laws. As pointed out in previous comments, one can use the representation theory of the Lorentz group to give a nice derivation that avoids any quantum mechanics, but it now needs at least two spatial dimensions instead of just one. I decided to work out this derivation in a way that does not explicitly use representation theory (although it is certainly lurking beneath the surface). The concept of momentum is only barely used in this derivation, and the main ingredients are now reduced to the following:
- The two postulates of special relativity;
- The law of conservation of energy (and the additivity of energy);
- The Newtonian approximation
at low velocities.
The argument (which uses a little bit of calculus, but is otherwise elementary) is given below the fold. Whereas Einstein’s original argument considers a mass emitting two photons in several different reference frames, the argument here considers a large mass breaking up into two equal smaller masses. Viewing this situation in different reference frames gives a functional equation for the relationship between energy, mass, and velocity, which can then be solved using some calculus, using the Newtonian approximation as a boundary condition, to give the famous formula.
Disclaimer: As with the previous post, the arguments here are physical arguments rather than purely mathematical ones, and thus do not really qualify as a rigorous mathematical argument, due to the implicit use of a number of physical and metaphysical hypotheses beyond the ones explicitly listed above. (But it would be difficult to say anything non-tautological at all about the physical world if one could rely solely on rigorous mathematical reasoning.)
In the previous set of notes we saw how a representation-theoretic property of groups, namely Kazhdan’s property (T), could be used to demonstrate expansion in Cayley graphs. In this set of notes we discuss a different representation-theoretic property of groups, namely quasirandomness, which is also useful for demonstrating expansion in Cayley graphs, though in a somewhat different way to property (T). For instance, whereas property (T), being qualitative in nature, is only interesting for infinite groups such as or
, and only creates Cayley graphs after passing to a finite quotient, quasirandomness is a quantitative property which is directly applicable to finite groups, and is able to deduce expansion in a Cayley graph, provided that random walks in that graph are known to become sufficiently “flat” in a certain sense.
The definition of quasirandomness is easy enough to state:
Definition 1 (Quasirandom groups) Let $G$ be a finite group, and let $D \geq 1$. We say that $G$ is $D$-quasirandom if all non-trivial unitary representations $\rho: G \to U_d(\mathbf{C})$ of $G$ have dimension $d$ at least $D$. (Recall a representation is trivial if $\rho(g)$ is the identity for all $g \in G$.)
Exercise 1 Let
be a finite group, and let
. A unitary representation
is said to be irreducible if
has no
-invariant subspaces other than
and
. Show that
is
-quasirandom if and only if every non-trivial irreducible representation of
has dimension at least
.
Remark 1 The terminology “quasirandom group” was introduced explicitly (though with slightly different notational conventions) by Gowers in 2008 in his detailed study of the concept; the name arises because dense Cayley graphs in quasirandom groups are quasirandom graphs in the sense of Chung, Graham, and Wilson, as we shall see below. This property had already been used implicitly to construct expander graphs by Sarnak and Xue in 1991, and more recently by Gamburd in 2002 and by Bourgain and Gamburd in 2008. One can of course define quasirandomness for more general locally compact groups than the finite ones, but we will only need this concept in the finite case. (A paper of Kunze and Stein from 1960, for instance, exploits the quasirandomness properties of the locally compact group
to obtain mixing estimates in that group.)
Quasirandomness behaves fairly well with respect to quotients and short exact sequences:
Exercise 2 Let
be a short exact sequence of finite groups
.
- (i) If
is
-quasirandom, show that
is
-quasirandom also. (Equivalently: any quotient of a
-quasirandom finite group is again a
-quasirandom finite group.)
- (ii) Conversely, if
and
are both
-quasirandom, show that
is
-quasirandom also. (In particular, the direct or semidirect product of two
-quasirandom finite groups is again a
-quasirandom finite group.)
Informally, we will call quasirandom if it is
-quasirandom for some “large”
, though the precise meaning of “large” will depend on context. For applications to expansion in Cayley graphs, “large” will mean “
for some constant
independent of the size of
”, but other regimes of
are certainly of interest.
The way we have set things up, the trivial group is infinitely quasirandom (i.e. it is
-quasirandom for every
). This is however a degenerate case and will not be discussed further here. In the non-trivial case, a finite group can only be quasirandom if it is large and has no large subgroups:
Exercise 3 Let
, and let
be a finite
-quasirandom group.
- (i) Show that if
is non-trivial, then
. (Hint: use the mean zero component
of the regular representation
.) In particular, non-trivial finite groups cannot be infinitely quasirandom.
- (ii) Show that any proper subgroup
of
has index
. (Hint: use the mean zero component of the quasiregular representation.)
The following exercise shows that quasirandom groups have to be quite non-abelian, and in particular perfect:
Exercise 4 (Quasirandomness, abelianness, and perfection) Let
be a finite group.
- (i) If
is abelian and non-trivial, show that
is not
-quasirandom. (Hint: use Fourier analysis or the classification of finite abelian groups.)
- (ii) Show that
is
-quasirandom if and only if it is perfect, i.e. the commutator group
is equal to
. (Equivalently,
is
-quasirandom if and only if it has no non-trivial abelian quotients.)
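Perfection, in the sense of having no non-trivial abelian quotients, is easy to test by brute force on small permutation groups. The following is a minimal sketch: the helper names are hypothetical, permutations are stored as tuples, and the commutator subgroup is computed as the subgroup generated by all commutators.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first), as a tuple."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def is_even(p):
    """A permutation is even iff it has an even number of inversions."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return inversions % 2 == 0

def generated_subgroup(generators, n):
    """Closure under composition; in a finite group this is the generated subgroup."""
    identity = tuple(range(n))
    elements = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return elements

def commutator_subgroup(group, n):
    commutators = {
        compose(compose(g, h), compose(inverse(g), inverse(h)))
        for g in group for h in group
    }
    return generated_subgroup(commutators, n)

def alternating_group(n):
    return {p for p in permutations(range(n)) if is_even(p)}

def symmetric_group(n):
    return set(permutations(range(n)))

for name, group, n in [("A_4", alternating_group(4), 4),
                       ("A_5", alternating_group(5), 5),
                       ("S_5", symmetric_group(5), 5)]:
    derived = commutator_subgroup(group, n)
    print(name, len(group), len(derived), "perfect" if derived == group else "not perfect")
```

Of these three groups only the alternating group on five letters is perfect; the other two have proper commutator subgroups and hence non-trivial abelian quotients.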
Later on we shall see that there is a converse to the above two exercises; any non-trivial perfect finite group with no large subgroups will be quasirandom.
Exercise 5 Let
be a finite
-quasirandom group. Show that for any subgroup
of
,
is
-quasirandom, where
is the index of
in
. (Hint: use induced representations.)
Now we give an example of a more quasirandom group.
Lemma 2 (Frobenius lemma) If $F$ is a field of some prime order $p$, then $SL_2(F)$ is $\frac{p-1}{2}$-quasirandom.
This should be compared with the cardinality $|SL_2(F)|$ of the special linear group, which is easily computed to be $p(p^2-1) = p^3 - p$.
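The cardinality of the special linear group can be confirmed by direct enumeration for small primes. The following is a minimal sketch, assuming the field is the integers modulo p and comparing against the standard count p(p^2 - 1); the function name is hypothetical.

```python
from itertools import product

def sl2_elements(p):
    """All 2x2 matrices (a, b, c, d) over Z/p with determinant 1."""
    return [
        (a, b, c, d)
        for a, b, c, d in product(range(p), repeat=4)
        if (a * d - b * c) % p == 1
    ]

for p in (2, 3, 5, 7):
    # Brute-force count next to the closed-form count p * (p^2 - 1).
    print(p, len(sl2_elements(p)), p * (p * p - 1))
```

The group thus has size comparable to the cube of p, so a quasirandomness bound that is linear in p is a fixed power of the size of the group.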
Proof: We may of course take to be odd. Suppose for contradiction that we have a non-trivial representation
on a unitary group of some dimension
with
. Set
to be the group element
and suppose first that is non-trivial. Since
, we have
; thus all the eigenvalues of
are
roots of unity. On the other hand, by conjugating
by diagonal matrices in
, we see that
is conjugate to
(and hence
conjugate to
) whenever
is a quadratic residue mod
. As such, the eigenvalues of
must be permuted by the operation
for any quadratic residue mod
. Since
has at least one non-trivial eigenvalue, and there are
distinct quadratic residues, we conclude that
has at least
distinct eigenvalues. But
is a
matrix with
, a contradiction. Thus
lies in the kernel of
. By conjugation, we then see that this kernel contains all unipotent matrices. But these matrices generate
(see exercise below), and so
is trivial, a contradiction.
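The conjugation step in this proof can be verified numerically. The following is a minimal sketch, assuming that the group element in question is the standard upper unipotent matrix with rows (1, 1) and (0, 1), and that the diagonal matrices have entries t and 1/t; all names are hypothetical.

```python
def mat_mul(x, y, p):
    """Multiply 2x2 matrices (stored as nested tuples) modulo p."""
    (a, b), (c, d) = x
    (e, f), (g, h) = y
    return (((a * e + b * g) % p, (a * f + b * h) % p),
            ((c * e + d * g) % p, (c * f + d * h) % p))

def mat_pow(x, n, p):
    result = ((1, 0), (0, 1))
    for _ in range(n):
        result = mat_mul(result, x, p)
    return result

def conjugation_check(p):
    """Check that diag(t, 1/t) conjugates a to the power a^(t^2) for every t != 0."""
    a = ((1, 1), (0, 1))
    for t in range(1, p):
        t_inv = pow(t, p - 2, p)          # inverse of t mod p, by Fermat's little theorem
        d = ((t, 0), (0, t_inv))
        d_inv = ((t_inv, 0), (0, t))
        conjugate = mat_mul(mat_mul(d, a, p), d_inv, p)
        if conjugate != mat_pow(a, (t * t) % p, p):
            return False
    return True

def quadratic_residues(p):
    """The set of non-zero squares modulo p."""
    return {(t * t) % p for t in range(1, p)}

for p in (3, 5, 7, 11):
    print(p, conjugation_check(p), len(quadratic_residues(p)), (p - 1) // 2)
```

For each odd prime tested, the exponents that arise are exactly the non-zero squares, and there are (p - 1)/2 of them, which is the count used in the eigenvalue argument.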
Exercise 6 Show that for any prime
, the unipotent matrices
for
ranging over
generate
as a group.
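For small primes the claim of this exercise can be checked by computer, which is of course not a proof. The following is a minimal sketch, assuming the unipotent matrices are the upper and lower triangular ones with ones on the diagonal, and using only the two with off-diagonal entry 1 (the rest are powers of these); the names are hypothetical.

```python
def mat_mul(x, y, p):
    """Multiply 2x2 matrices (stored as nested tuples) modulo p."""
    (a, b), (c, d) = x
    (e, f), (g, h) = y
    return (((a * e + b * g) % p, (a * f + b * h) % p),
            ((c * e + d * g) % p, (c * f + d * h) % p))

def generated_by_unipotents(p):
    """Closure of the upper and lower unipotent generators under multiplication."""
    generators = [((1, 1), (0, 1)), ((1, 0), (1, 1))]
    identity = ((1, 0), (0, 1))
    seen = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for s in generators:
            y = mat_mul(s, x, p)
            if y not in seen:
                seen.add(y)
                frontier.append(y)
    return seen

for p in (2, 3, 5, 7):
    # Size of the generated subgroup next to the order p * (p^2 - 1) of SL_2.
    print(p, len(generated_by_unipotents(p)), p * (p * p - 1))
```

Since every product of the generators has determinant 1, agreement of the two counts means the generated subgroup is the whole special linear group for these primes.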
Exercise 7 Let
be a finite group, and let
. If
is generated by a collection
of
-quasirandom subgroups, show that
is itself
-quasirandom.
Exercise 8 Show that
is
-quasirandom for any
and any prime
. (This is not sharp; the optimal bound here is
, which follows from the results of Landazuri and Seitz.)
As a corollary of the above results and Exercise 2, we see that the projective special linear group is also
-quasirandom.
Remark 2 One can ask whether the bound
in Lemma 2 is sharp, assuming of course that
is odd. Noting that
acts linearly on the plane
, we see that it also acts projectively on the projective line
, which has
elements. Thus
acts via the quasiregular representation on the
-dimensional space
, and also on the
-dimensional subspace
; this latter representation (known as the Steinberg representation) is irreducible. This shows that the
bound cannot be improved beyond
. More generally, given any character
,
acts on the
-dimensional space
of functions
that obey the twisted dilation invariance
for all
and
; these are known as the principal series representations. When
is the trivial character, this is the quasiregular representation discussed earlier. For most other characters, this is an irreducible representation, but it turns out that when
is the quadratic representation (thus taking values in
while being non-trivial), the principal series representation splits into the direct sum of two
-dimensional representations, which comes very close to matching the bound in Lemma 2. There is a parallel series of representations to the principal series (known as the discrete series) which is more complicated to describe (roughly speaking, one has to embed
in a quadratic extension
and then use a rotated version of the above construction, to change a split torus into a non-split torus), but can generate irreducible representations of dimension
, showing that the bound in Lemma 2 is in fact exactly sharp. These constructions can be generalised to arbitrary finite groups of Lie type using Deligne-Lusztig theory, but this is beyond the scope of this course (and of my own knowledge in the subject).
Exercise 9 Let
be an odd prime. Show that for any
, the alternating group
is
-quasirandom. (Hint: show that all cycles of order
in
are conjugate to each other in
(and not just in
); in particular, a cycle is conjugate to its
power for all
. Also, as
,
is simple, and so the cycles of order
generate the entire group.)
Remark 3 By using more precise information on the representations of the alternating group (using the theory of Specht modules and Young tableaux), one can show the slightly sharper statement that
is
-quasirandom for
(but is only
-quasirandom for
due to icosahedral symmetry, and
-quasirandom for
due to lack of perfectness). Using Exercise 3 with the index
subgroup
, we see that the bound
cannot be improved. Thus,
(for large
) is not as quasirandom as the special linear groups
(for
large and
bounded), because in the latter case the quasirandomness is as strong as a power of the size of the group, whereas in the former case it is only logarithmic in size.
If one replaces the alternating group
with the slightly larger symmetric group
, then quasirandomness is destroyed (since
, having the abelian quotient
, is not perfect); indeed,
is
-quasirandom and no better.
Remark 4 Thanks to the monumental achievement of the classification of finite simple groups, we know that apart from a finite number (26, to be precise) of sporadic exceptions, all finite simple groups (up to isomorphism) are either a cyclic group
, an alternating group
, or a finite simple group of Lie type such as
. (We will define the concept of a finite simple group of Lie type more precisely in later notes, but suffice to say for now that such groups are constructed from reductive algebraic groups, for instance
is constructed from
in characteristic
.) In the case of finite simple groups
of Lie type with bounded rank
, it is known from the work of Landazuri and Seitz that such groups are
-quasirandom for some
depending only on the rank. On the other hand, by the previous remark, the large alternating groups do not have this property, and one can show that the finite simple groups of Lie type with large rank also do not have this property. Thus, we see using the classification that if a finite simple group
is
-quasirandom for some
and
is sufficiently large depending on
, then
is a finite simple group of Lie type with rank
. It would be of interest to see if there was an alternate way to establish this fact that did not rely on the classification, as it may lead to an alternate approach to proving the classification (or perhaps a weakened version thereof).
A key reason why quasirandomness is desirable for the purposes of demonstrating expansion is that quasirandom groups happen to be rapidly mixing at large scales, as we shall see below the fold. As such, quasirandomness is an important tool for demonstrating expansion in Cayley graphs, though because expansion is a phenomenon that must hold at all scales, one needs to supplement quasirandomness with some additional input that creates mixing at small or medium scales also before one can deduce expansion. As an example of this technique of combining quasirandomness with mixing at small and medium scales, we present a proof (due to Sarnak-Xue, and simplified by Gamburd) of a weak version of the famous “3/16 theorem” of Selberg on the least non-trivial eigenvalue of the Laplacian on a modular curve, which among other things can be used to construct a family of expander Cayley graphs in (compare this with the property (T)-based methods in the previous notes, which could construct expander Cayley graphs in
for any fixed
).
In the previous set of notes we introduced the notion of expansion in arbitrary -regular graphs. For the rest of the course, we will now focus attention primarily on a special type of
-regular graph, namely a Cayley graph.
Definition 1 (Cayley graph) Let $G = (G,\cdot)$ be a group, and let $S$ be a finite subset of $G$. We assume that $S$ is symmetric (thus $s^{-1} \in S$ whenever $s \in S$) and does not contain the identity $1$ (this is to avoid loops). Then the (right-invariant) Cayley graph $Cay(G,S)$ is defined to be the graph with vertex set $G$ and edge set $\{ \{sx, x\}: x \in G, s \in S \}$, thus each vertex $x \in G$ is connected to the $|S|$ elements $sx$ for $s \in S$, and so $Cay(G,S)$ is a $|S|$-regular graph.
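To make the definition concrete, here is a minimal sketch that builds a small Cayley graph and checks the properties just described. It assumes the convention that a vertex x is joined to sx for each generator s (consistent with right translations being automorphisms), and takes the group of permutations of three points with two transpositions as generators; the names are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first), as a tuple."""
    return tuple(p[q[i]] for i in range(len(q)))

def cayley_graph(group, generators):
    """Adjacency lists: x is joined to s*x for each generator s."""
    return {x: [compose(s, x) for s in generators] for x in group}

group = list(permutations(range(3)))
S = [(1, 0, 2), (0, 2, 1)]   # two transpositions; each is its own inverse

graph = cayley_graph(group, S)

# Each vertex has exactly |S| distinct neighbours, and none is its own neighbour.
regular = all(len(set(nbrs)) == len(S) for nbrs in graph.values())
no_loops = all(x not in graph[x] for x in graph)

# Symmetry of S makes the adjacency relation symmetric.
undirected = all(x in graph[y] for x in graph for y in graph[x])

# Right translation x -> x*g carries neighbours to neighbours.
right_invariant = all(
    set(graph[compose(x, g)]) == {compose(y, g) for y in graph[x]}
    for g in group for x in group
)

print(regular, no_loops, undirected, right_invariant)
```

With these two generators the graph is a cycle on six vertices.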
Example 2 The graph in Exercise 3 of Notes 1 is the Cayley graph on
with generators
.
Remark 3 We call the above Cayley graphs right-invariant because every right translation
on
is a graph automorphism of
. This group of automorphisms acts transitively on the vertex set of the Cayley graph. One can thus view a Cayley graph as a homogeneous space of
, as it “looks the same” from every vertex. One could of course also consider left-invariant Cayley graphs, in which
is connected to
rather than
. However, the two such graphs are isomorphic using the inverse map
, so we may without loss of generality restrict our attention throughout to right-invariant Cayley graphs.
Remark 4 For minor technical reasons, it will be convenient later on to allow
to contain the identity and to come with multiplicity (i.e. it will be a multiset rather than a set). If one does so, of course, the resulting Cayley graph will now contain some loops and multiple edges.
For the purposes of building expander families, we would of course want the underlying group to be finite. However, it will be convenient at various times to “lift” a finite Cayley graph up to an infinite one, and so we permit
to be infinite in our definition of a Cayley graph.
We will also sometimes consider a generalisation of a Cayley graph, known as a Schreier graph:
Definition 5 (Schreier graph) Let
be a finite group that acts (on the left) on a space
, thus there is a map
from
to
such that
and
for all
and
. Let
be a symmetric subset of
which acts freely on
in the sense that
for all
and
, and
for all distinct
and
. Then the Schreier graph
is defined to be the graph with vertex set
and edge set
.
Example 6 Every Cayley graph
is also a Schreier graph
, using the obvious left-action of
on itself. The
-regular graphs formed from
permutations
that were studied in the previous set of notes are also Schreier graphs provided that
for all distinct
, with the underlying group being the permutation group
(which acts on the vertex set
in the obvious manner), and
.
Exercise 7 If
is an even integer, show that every
-regular graph is a Schreier graph involving a set
of generators of cardinality
. (Hint: you may assume without proof Petersen’s 2-factor theorem, which asserts that every
-regular graph with
even can be decomposed into
edge-disjoint
-regular graphs. Now use the previous example.)
We return now to Cayley graphs. It is easy to characterise qualitative expansion properties of Cayley graphs:
Exercise 8 (Qualitative expansion) Let
be a finite Cayley graph.
- (i) Show that
is a one-sided
-expander for
for some
if and only if
generates
.
- (ii) Show that
is a two-sided
-expander for
for some
if and only if
generates
, and furthermore
intersects each index
subgroup of
.
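The two conditions in this exercise correspond to familiar graph properties: the generators generate the group exactly when the Cayley graph is connected, and (assuming, as is standard, that the subgroups in part (ii) are those of index two) a connected Cayley graph is bipartite exactly when the generators avoid some index two subgroup. The following is a minimal sketch on the cyclic group of order 6, written additively; the names are hypothetical.

```python
from collections import deque

def cyclic_cayley_graph(n, S):
    """Cayley graph on Z/n: x is joined to s + x for each s in S."""
    return {x: [(s + x) % n for s in S] for x in range(n)}

def components_and_bipartite(graph):
    """Count connected components and test 2-colourability by breadth-first search."""
    colour = {}
    components = 0
    bipartite = True
    for start in graph:
        if start in colour:
            continue
        components += 1
        colour[start] = 0
        queue = deque([start])
        while queue:
            x = queue.popleft()
            for y in graph[x]:
                if y not in colour:
                    colour[y] = 1 - colour[x]
                    queue.append(y)
                elif colour[y] == colour[x]:
                    bipartite = False
    return components, bipartite

# S = {2, 4} does not generate Z/6: the graph is disconnected.
print(components_and_bipartite(cyclic_cayley_graph(6, [2, 4])))
# S = {1, 5} generates Z/6 but misses the even subgroup: connected and bipartite.
print(components_and_bipartite(cyclic_cayley_graph(6, [1, 5])))
# S = {1, 5, 2, 4} generates and meets the even subgroup: connected, not bipartite.
print(components_and_bipartite(cyclic_cayley_graph(6, [1, 5, 2, 4])))
```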
We will however be interested in more quantitative expansion properties, in which the expansion constant is independent of the size of the Cayley graph, so that one can construct non-trivial expander families
of Cayley graphs.
One can analyse the expansion of Cayley graphs in a number of ways. For instance, by taking the edge expansion viewpoint, one can study Cayley graphs combinatorially, using the product set operation
of subsets of .
Exercise 9 (Combinatorial description of expansion) Let
be a family of finite
-regular Cayley graphs. Show that
is a one-sided expander family if and only if there is a constant
independent of
such that
for all sufficiently large
and all subsets
of
with
.
One can also give a combinatorial description of two-sided expansion, but it is more complicated and we will not use it here.
Exercise 10 (Abelian groups do not expand) Let
be a family of finite
-regular Cayley graphs, with the
all abelian, and the
generating
. Show that
are a one-sided expander family if and only if the Cayley graphs have bounded cardinality (i.e.
). (Hint: assume for contradiction that
is a one-sided expander family with
, and show by two different arguments that
grows at least exponentially in
and also at most polynomially in
, giving the desired contradiction.)
The right-invariant nature of Cayley graphs also suggests that such graphs can be profitably analysed using some sort of Fourier analysis; as the underlying symmetry group is not necessarily abelian, one should use the Fourier analysis of non-abelian groups, which is better known as (unitary) representation theory. The Fourier-analytic nature of Cayley graphs can be highlighted by recalling the operation of convolution of two functions , defined by the formula
This convolution operation is bilinear and associative (at least when one imposes a suitable decay condition on the functions, such as compact support), but is not commutative unless is abelian. (If one is more algebraically minded, one can also identify
(when
is finite, at least) with the group algebra
, in which case convolution is simply the multiplication operation in this algebra.) The adjacency operator
on a Cayley graph
can then be viewed as a convolution
where is the probability density
where is the Kronecker delta function on
. Using the spectral definition of expansion, we thus see that
is a one-sided expander if and only if
whenever is orthogonal to the constant function
, and is a two-sided expander if
whenever is orthogonal to the constant function
.
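The identification of the adjacency operator with a convolution can be tested on a small non-abelian group. The following is a minimal sketch under stated assumptions: convolution is taken in the common convention (f*g)(x) = sum over y of f(y) g(y^{-1}x), the adjacency operator sends f to the function whose value at x is the sum of f(sx) over the generators s, and the group is the permutations of three points. All names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first), as a tuple."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

group = list(permutations(range(3)))
S = [(1, 0, 2), (0, 2, 1)]   # symmetric generating set of two transpositions

def convolve(f, g):
    """(f*g)(x) = sum_y f(y) g(y^{-1} x), for functions stored as dicts."""
    return {x: sum(f[y] * g[compose(inverse(y), x)] for y in group) for x in group}

def delta(a):
    """Kronecker delta at the group element a."""
    return {x: 1 if x == a else 0 for x in group}

def adjacency(f):
    """Unnormalised adjacency operator: sum f over the neighbours s*x of x."""
    return {x: sum(f[compose(s, x)] for s in S) for x in group}

# Uniform probability density on S.
mu = {x: Fraction(S.count(x), len(S)) for x in group}

# An arbitrary test function.
f = {x: Fraction(i * i + 1) for i, x in enumerate(group)}

lhs = adjacency(f)
rhs = {x: len(S) * value for x, value in convolve(mu, f).items()}
print(lhs == rhs)

# Convolution of deltas multiplies the base points, so it is not commutative here.
a, b = S
print(convolve(delta(a), delta(b)) == delta(compose(a, b)))
print(convolve(delta(a), delta(b)) == convolve(delta(b), delta(a)))
```

Exact rational arithmetic is used so that the comparison of the two sides is not affected by rounding.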
We remark that the above spectral definition of expansion can be easily extended to symmetric sets which contain the identity or have multiplicity (i.e. are multisets). (We retain symmetry, though, in order to keep the operation of convolution by
self-adjoint.) In particular, one can say (with some slight abuse of notation) that a set of elements
of
(possibly with repetition, and possibly with some elements equalling the identity) generates a one-sided or two-sided
-expander if the associated symmetric probability density
obeys either (2) or (3).
We saw in the last set of notes that expansion can be characterised in terms of random walks. One can of course specialise this characterisation to the Cayley graph case:
Exercise 11 (Random walk description of expansion) Let
be a family of finite
-regular Cayley graphs, and let
be the associated probability density functions. Let
be a constant.
- Show that the
are a two-sided expander family if and only if there exists a
such that for all sufficiently large
, one has
for some
, where
denotes the convolution of
copies of
.
- Show that the
are a one-sided expander family if and only if there exists a
such that for all sufficiently large
, one has
for some
.
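The flattening of iterated convolutions can be observed numerically on cyclic groups. The following is a minimal sketch, not the quantitative criterion of the exercise: it simply tracks the largest deviation of the n-step distribution from the uniform distribution, on cyclic groups written additively with generators plus and minus one. The function names are hypothetical.

```python
def step(dist, n, S):
    """One convolution with the uniform measure on S, on the cyclic group Z/n."""
    return [sum(dist[(x - s) % n] for s in S) / len(S) for x in range(n)]

def distance_to_uniform(n, S, steps):
    """Sup-norm distance between the steps-fold convolution power and uniform."""
    dist = [1.0] + [0.0] * (n - 1)      # the walk starts at the identity
    for _ in range(steps):
        dist = step(dist, n, S)
    return max(abs(value - 1.0 / n) for value in dist)

# On Z/7 the walk with steps +1 and -1 equidistributes.
print(distance_to_uniform(7, [1, 6], 200))
# On Z/6 the same walk is trapped by parity and never becomes flat.
print(distance_to_uniform(6, [1, 5], 200))
```

The second case is the bipartite obstruction: after an even number of steps the walk on the group of order 6 is supported on the even residues only.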
In this set of notes, we will connect expansion of Cayley graphs to an important property of certain infinite groups, known as Kazhdan’s property (T) (or property (T) for short). In 1973, Margulis exploited this property to create the first known explicit and deterministic examples of expanding Cayley graphs. As it turns out, property (T) is somewhat overpowered for this purpose; in particular, we now know that there are many families of Cayley graphs for which the associated infinite group does not obey property (T) (or weaker variants of this property, such as property $(\tau)$). In later notes we will therefore turn to other methods of creating Cayley graphs that do not rely on property (T). Nevertheless, property (T) is of substantial intrinsic interest, and also has many connections to parts of mathematics other than the theory of expander graphs, so it is worth spending some time to discuss it here.
The material here is based in part on this recent text on property (T) by Bekka, de la Harpe, and Valette (available online here).
In the last few notes, we have been steadily reducing the amount of regularity needed on a topological group in order to be able to show that it is in fact a Lie group, in the spirit of Hilbert’s fifth problem. Now, we will work on Hilbert’s fifth problem from the other end, starting with the minimal assumption of local compactness on a topological group , and seeing what kind of structures one can build using this assumption. (For simplicity we shall mostly confine our discussion to global groups rather than local groups for now.) In view of the preceding notes, we would like to see two types of structures emerge in particular:
- representations of
into some more structured group, such as a matrix group
; and
- metrics on
that capture the escape and commutator structure of
(i.e. Gleason metrics).
To build either of these structures, a fundamentally useful tool is that of (left-) Haar measure – a left-invariant Radon measure on
. (One can of course also consider right-Haar measures; in many cases (such as for compact or abelian groups), the two concepts are the same, but this is not always the case.) This concept generalises the concept of Lebesgue measure on Euclidean spaces
, which is of course fundamental in analysis on those spaces.
Haar measures will help us build useful representations and useful metrics on locally compact groups . For instance, a Haar measure
gives rise to the regular representation
that maps each element
of
to the unitary translation operator
on the Hilbert space
of square-integrable measurable functions on
with respect to this Haar measure by the formula
(The presence of the inverse is convenient in order to obtain the homomorphism property
without a reversal in the group multiplication.) In general, this is an infinite-dimensional representation; but in many cases (and in particular, in the case when
is compact) we can decompose this representation into a useful collection of finite-dimensional representations, leading to the Peter-Weyl theorem, which is a fundamental tool for understanding the structure of compact groups. This theorem is particularly simple in the compact abelian case, where it turns out that the representations can be decomposed into one-dimensional representations
, better known as characters, leading to the theory of Fourier analysis on general compact abelian groups. With this and some additional (largely combinatorial) arguments, we will also be able to obtain satisfactory structural control on locally compact abelian groups as well.
The link between Haar measure and useful metrics on is a little more complicated. Firstly, once one has the regular representation
, and given a suitable “test” function
, one can then embed
into
(or into other function spaces on
, such as
or
) by mapping a group element
to the translate
of
in that function space. (This map might not actually be an embedding if
enjoys a non-trivial translation symmetry
, but let us ignore this possibility for now.) One can then pull the metric structure on the function space back to a metric on
, for instance defining an
-based metric
if is square-integrable, or perhaps a
-based metric
if is continuous and compactly supported (with
denoting the supremum norm). These metrics tend to have several nice properties (for instance, they are automatically left-invariant), particularly if the test function is chosen to be sufficiently “smooth”. For instance, if we introduce the differentiation (or more precisely, finite difference) operators
(so that ) and use the metric (1), then a short computation (relying on the translation-invariance of the
norm) shows that
for all . This suggests that commutator estimates, such as those appearing in the definition of a Gleason metric in Notes 2, might be available if one can control “second derivatives” of
; informally, we would like our test functions
to have a “
” type regularity.
If was already a Lie group (or something similar, such as a
local group) then it would not be too difficult to concoct such a function
by using local coordinates. But of course the whole point of Hilbert’s fifth problem is to do without such regularity hypotheses, and so we need to build
test functions
by other means. And here is where the Haar measure comes in: it provides the fundamental tool of convolution
between two suitable functions , which can be used to build smoother functions out of rougher ones. For instance:
Exercise 1 Let
be continuous, compactly supported functions which are Lipschitz continuous. Show that the convolution
using Lebesgue measure on
obeys the
-type commutator estimate
for all
and some finite quantity
depending only on
.
This exercise suggests a strategy to build Gleason metrics by convolving together some “Lipschitz” test functions and then using the resulting convolution as a test function to define a metric. This strategy may seem somewhat circular because one needs a notion of metric in order to define Lipschitz continuity in the first place, but it turns out that the properties required on that metric are weaker than those that the Gleason metric will satisfy, and so one will be able to break the circularity by using a “bootstrap” or “induction” argument.
We will discuss this strategy – which is due to Gleason, and is fundamental to all currently known solutions to Hilbert’s fifth problem – in later posts. In this post, we will construct Haar measure on general locally compact groups, and then establish the Peter-Weyl theorem, which in turn can be used to obtain a reasonably satisfactory structural classification of both compact groups and locally compact abelian groups.
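One of the fundamental structures in modern mathematics is that of a group. Formally, a group is a set $G$ equipped with an identity element $1 \in G$, a multiplication operation $\cdot: G \times G \to G$, and an inversion operation $()^{-1}: G \to G$ obeying the following axioms:
- (Closure) If $g, h \in G$, then $g \cdot h$ and $g^{-1}$ are well-defined and lie in $G$. (This axiom is redundant from the above description, but we include it for emphasis.)
- (Associativity) If $g, h, k \in G$, then $(g \cdot h) \cdot k = g \cdot (h \cdot k)$.
- (Identity) If $g \in G$, then $g \cdot 1 = 1 \cdot g = g$.
- (Inverse) If $g \in G$, then $g \cdot g^{-1} = g^{-1} \cdot g = 1$.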
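For a finite set, the four axioms can be checked mechanically. The following is a minimal sketch of such a checker, with a hypothetical function name and with the identity, multiplication and inversion passed in explicitly.

```python
def is_group(elements, identity, mul, inv):
    """Check closure, associativity, identity and inverse on a finite set."""
    elements = list(elements)
    members = set(elements)
    closure = (all(mul(g, h) in members for g in elements for h in elements)
               and all(inv(g) in members for g in elements))
    associativity = all(mul(mul(g, h), k) == mul(g, mul(h, k))
                        for g in elements for h in elements for k in elements)
    identity_law = all(mul(g, identity) == g == mul(identity, g) for g in elements)
    inverse_law = all(mul(g, inv(g)) == identity == mul(inv(g), g) for g in elements)
    return closure and associativity and identity_law and inverse_law

# The integers modulo 6 under addition form a group.
print(is_group(range(6), 0, lambda g, h: (g + h) % 6, lambda g: (-g) % 6))

# The non-zero residues modulo 7 under multiplication form a group.
print(is_group(range(1, 7), 1, lambda g, h: (g * h) % 7, lambda g: pow(g, 5, 7)))

# The integers modulo 6 under multiplication do not: 0 has no inverse.
print(is_group(range(6), 1, lambda g, h: (g * h) % 6, lambda g: g))
```

The third example is a monoid in the sense discussed below: it satisfies every axiom except the inverse axiom.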
One can also consider additive groups instead of multiplicative groups, with the obvious changes of notation. By convention, additive groups are always understood to be abelian, so it is convenient to use additive notation when one wishes to emphasise the abelian nature of the group structure. As usual, we often abbreviate
by
(and
by
) when there is no chance of confusion.
If furthermore is equipped with a topology, and the group operations
are continuous in this topology, then
is a topological group. Any group can be made into a topological group by imposing the discrete topology, but there are many more interesting examples of topological groups, such as Lie groups, in which
is not just a topological space, but is in fact a smooth manifold (and the group operations are not merely continuous, but also smooth).
There are many naturally occurring group-like objects that obey some, but not all, of the axioms. For instance, monoids are required to obey the closure, associativity, and identity axioms, but not the inverse axiom. If we also drop the identity axiom, we end up with a semigroup. Groupoids do not necessarily obey the closure axiom, but obey (versions of) the associativity, identity, and inverse axioms. And so forth.
Another group-like concept is that of a local topological group (or local group, for short), which is essentially a topological group with the closure axiom omitted (but which does not obey the same set of axioms as a groupoid); local groups arise primarily in the study of local properties of (global) topological groups, and also in the study of approximate groups in additive combinatorics. Formally, a local group is a topological space
$G$ equipped with an identity element $1 \in G$, a partially defined but continuous multiplication operation $\cdot: \Omega \to G$ for some domain $\Omega \subset G \times G$, and a partially defined but continuous inversion operation $()^{-1}: \Lambda \to G$, where $\Lambda \subset G$, obeying the following axioms:
- (Local closure) $\Omega$ is an open neighbourhood of $G \times \{1\} \cup \{1\} \times G$, and $\Lambda$ is an open neighbourhood of $1$.
- (Local associativity) If $g, h, k \in G$ are such that $(g \cdot h) \cdot k$ and $g \cdot (h \cdot k)$ are both well-defined, then they are equal. (Note however that it may be possible for one of these products to be defined but not the other, in contrast for instance with groupoids.)
- (Identity) For all $g \in G$, $g \cdot 1 = 1 \cdot g = g$.
- (Local inverse) If $g \in G$ and $g^{-1}$ is well-defined, then $g \cdot g^{-1} = g^{-1} \cdot g = 1$. (In particular this, together with the other axioms, forces $1^{-1} = 1$.)
We will often refer to ordinary groups as global groups (and topological groups as global topological groups) to distinguish them from local groups. Every global topological group is a local group, but not conversely.
One can consider discrete local groups, in which the topology is the discrete topology; in this case, the openness and continuity axioms in the definition are automatic and can be omitted. At the other extreme, one can consider local Lie groups, in which the local group has the structure of a smooth manifold, and the group operations are smooth. We can also consider symmetric local groups, in which
(i.e. inverses are always defined). Symmetric local groups have the advantage of local homogeneity: given any
, the operation of left-multiplication
is locally inverted by
near the identity, thus giving a homeomorphism between a neighbourhood of
and a neighbourhood of the identity; in particular, we see that given any two group elements
in a symmetric local group
, there is a homeomorphism between a neighbourhood of
and a neighbourhood of
. (If the symmetric local group is also Lie, then these homeomorphisms are in fact diffeomorphisms.) This local homogeneity already simplifies a lot of the possible topology of symmetric local groups, as it basically means that the local topological structure of such groups is determined by the local structure at the origin. (For instance, all connected components of a local Lie group necessarily have the same dimension.) It is easy to see that any local group has at least one symmetric open neighbourhood of the identity, so in many situations we can restrict to the symmetric case without much loss of generality.
A prime example of a local group can be formed by restricting any global topological group to an open neighbourhood
of the identity, with the domains
and
one easily verifies that this gives the structure of a local group (which we will sometimes call
to emphasise the original group
). If
is symmetric (i.e.
), then we in fact have a symmetric local group. One can also restrict local groups
to open neighbourhoods
to obtain a smaller local group
by the same procedure (adopting the convention that statements such as
or
are considered false if the left-hand side is undefined). (Note though that if one restricts to non-open neighbourhoods of the identity, then one usually does not get a local group; for instance
is not a local group (why?).)
Finite subsets of (Hausdorff) groups containing the identity can be viewed as local groups. This point of view turns out to be particularly useful for studying approximate groups in additive combinatorics, a point which I hope to expound more on later. Thus, for instance, the discrete interval $\{-9, \ldots, 9\}$ is an additive symmetric local group, which informally might model an adding machine that can only handle (signed) one-digit numbers. More generally, one can view a local group as an object that behaves like a group near the identity, but for which the group laws (and in particular, the closure axiom) can start breaking down once one moves far enough away from the identity.
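To make the restriction construction concrete, here is a minimal Python sketch for the discrete case. It assumes the ambient global group is the additive group of integers and takes the finite set to be the signed one-digit numbers from -9 to 9; the function names `restrict_integers` and `check_local_group_axioms` are hypothetical helpers introduced only for this illustration. Since the topology is discrete, the openness and continuity requirements are automatic and are not checked.

```python
from itertools import product

def restrict_integers(U):
    """Restrict the additive group of integers to a finite subset U containing 0.

    Returns (omega, lam): the set of pairs (g, h) for which g + h is defined
    (i.e. stays in U), and the set of g for which -g is defined.
    """
    U = frozenset(U)
    assert 0 in U, "the identity must belong to U"
    omega = {(g, h) for g, h in product(U, repeat=2) if g + h in U}
    lam = {g for g in U if -g in U}
    return omega, lam

def check_local_group_axioms(U, omega, lam):
    """Brute-force check of the identity, local inverse and local associativity axioms."""
    # Identity: 0 + g and g + 0 must be defined (they then equal g).
    identity = all((0, g) in omega and (g, 0) in omega for g in U)
    # Local inverse: whenever -g is defined, g + (-g) and (-g) + g must be defined.
    inverse = all((g, -g) in omega and (-g, g) in omega for g in lam)
    # Local associativity: whenever both bracketings are defined, they must agree.
    # (This is automatic here, since the operation is inherited from the integers.)
    assoc = all(
        (g + h) + k == g + (h + k)
        for g, h, k in product(U, repeat=3)
        if (g, h) in omega and (g + h, k) in omega
        and (h, k) in omega and (g, h + k) in omega
    )
    return identity and inverse and assoc

U = set(range(-9, 10))
omega, lam = restrict_integers(U)
print(check_local_group_axioms(U, omega, lam))  # True
print(lam == U)         # True: inverses are always defined, so this local group is symmetric
print((6, 5) in omega)  # False: 6 + 5 = 11 leaves the interval
```

Replacing `U` by a non-symmetric set such as `{0, 1, 2}` still passes the axiom check, but `lam` shrinks to `{0}`, so the resulting local group is no longer symmetric.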
One can formalise this intuition as follows. Let us say that a word in a local group
is well-defined in
(or well-defined, for short) if every possible way of associating this word using parentheses is well-defined from applying the product operation. For instance, in order for
to be well-defined,
,
,
,
, and
must all be well-defined. In the preceding example
,
is not well-defined because one of the ways of associating this sum, namely
, is not well-defined (even though
is well-defined).
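The notion of a well-defined word can be tested by brute force in a small discrete local group. The sketch below assumes the additive local group of integers between -9 and 9 and enumerates every way of associating a word; the helper names `evaluations` and `is_well_defined`, and the sample word -2 + 6 + 5, are illustrative choices rather than anything prescribed by the definition.

```python
def evaluations(word, lo=-9, hi=9):
    """Evaluate every parenthesisation of an additive word in {lo, ..., hi}.

    Returns a list with one entry per way of associating the word:
    the resulting value, or None if some intermediate sum leaves the
    interval, so that this way of associating is not well-defined.
    The letters of the word are assumed to lie in the interval already.
    """
    if len(word) == 1:
        return [word[0]]
    results = []
    for split in range(1, len(word)):
        for left in evaluations(word[:split], lo, hi):
            for right in evaluations(word[split:], lo, hi):
                if left is None or right is None or not lo <= left + right <= hi:
                    results.append(None)
                else:
                    results.append(left + right)
    return results

def is_well_defined(word, lo=-9, hi=9):
    """A word is well-defined when every way of associating it is well-defined."""
    return None not in evaluations(word, lo, hi)

print(evaluations((-2, 6, 5)))      # [None, 9]: -2 + (6 + 5) fails, (-2 + 6) + 5 = 9
print(is_well_defined((-2, 6, 5)))  # False
print(is_well_defined((1, 2, 3)))   # True
```

The first entry of the output corresponds to the bracketing in which the inner sum 6 + 5 = 11 falls outside the interval, which is exactly the kind of failure the definition is designed to rule out.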
Exercise 1 (Iterating the associative law)
- Show that if a word
in a local group is well-defined, then all ways of associating this word give the same answer, and so we can uniquely evaluate
as an element in
.
- Give an example of a word
in a local group which has two ways of being associated that are both well-defined, but give different answers. (Hint: the local associativity axiom prevents this from happening for
, so try
. A small discrete local group will already suffice to give a counterexample; verifying the local group axioms is easier if one makes the domain of definition of the group operations as small as one can get away with while still having the counterexample.)
Exercise 2 Show that the number of ways to associate a word
is given by the Catalan number
.
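As a numerical sanity check (not a proof) of the count in Exercise 2, one can compare a direct recursive count of the bracketings with the closed form for the Catalan numbers. The sketch assumes the standard indexing in which a word of $n$ letters has $C_{n-1} = \frac{1}{n}\binom{2n-2}{n-1}$ bracketings; the function names are hypothetical.

```python
from math import comb

def count_associations(n):
    """Count the ways to fully parenthesise a word of n letters (n >= 1).

    The outermost product splits the word into a left part of k letters
    and a right part of n - k letters, each bracketed independently.
    """
    if n == 1:
        return 1
    return sum(count_associations(k) * count_associations(n - k) for k in range(1, n))

def catalan(m):
    """The m-th Catalan number, binom(2m, m) / (m + 1)."""
    return comb(2 * m, m) // (m + 1)

for n in range(1, 9):
    assert count_associations(n) == catalan(n - 1)

print([count_associations(n) for n in range(1, 7)])  # [1, 1, 2, 5, 14, 42]
```

In particular a word of four letters has five bracketings, matching the five products that must be defined for such a word to be well-defined.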
Exercise 3 Let
be a local group, and let
be an integer. Show that there exists a symmetric open neighbourhood
of the identity such that every word of length
in
is well-defined in
(or more succinctly,
is well-defined). (Note though that these words will usually only take values in
, rather than in
, and also the sets
tend to become smaller as
increases.)
In many situations (such as when one is investigating the local structure of a global group) one is only interested in the local properties of a (local or global) group. We can formalise this by the following definition. Let us call two local groups and
locally identical if they have a common restriction, thus there exists a set
such that
(thus,
, and the topology and group operations of
and
agree on
). This is easily seen to be an equivalence relation. We call an equivalence class
of local groups a group germ.
Let be a property of a local group (e.g. abelianness, connectedness, compactness, etc.). We call a group germ locally
if every local group in that germ has a restriction that obeys
; we call a local or global group
locally
if its germ is locally
(or equivalently, every open neighbourhood of the identity in
contains a further neighbourhood that obeys
). Thus, the study of local properties of (local or global) groups is subsumed by the study of group germs.
Exercise 4
- Show that the above general definition is consistent with the usual definitions of the properties “connected” and “locally connected” from point-set topology.
- Strictly speaking, the above definition is not consistent with the usual definitions of the properties “compact” and “locally compact” from point-set topology because in the definition of local compactness, the compact neighbourhoods are certainly not required to be open. Show however that the point-set topology notion of “locally compact” is equivalent, using the above conventions, to the notion of “locally precompact inside of an ambient local group”. Of course, this is a much more clumsy terminology, and so we shall abuse notation slightly and continue to use the standard terminology “locally compact” even though it is, strictly speaking, not compatible with the above general convention.
- Show that a local group is discrete if and only if it is locally trivial.
- Show that a connected global group is abelian if and only if it is locally abelian. (Hint: in a connected global group, the only open subgroup is the whole group.)
- Show that a global topological group is first-countable if and only if it is locally first countable. (By the Birkhoff-Kakutani theorem, this implies that such groups are metrisable if and only if they are locally metrisable.)
- Let
be a prime. Show that the solenoid group
, where
is the
-adic integers and
is the diagonal embedding of
inside
, is connected but not locally connected.
Remark 1 One can also study the local properties of groups using nonstandard analysis. Instead of group germs, one works (at least in the case when
is first countable) with the monad
of the identity element
of
, defined as the nonstandard group elements
in
that are infinitesimally close to the origin in the sense that they lie in every standard neighbourhood of the identity. The monad
is closely related to the group germ
, but has the advantage of being a genuine (global) group, as opposed to an equivalence class of local groups. It is possible to recast most of the results here in this nonstandard formulation; see e.g. the classic text of Robinson. However, we will not adopt this perspective here.
A useful fact to know is that Lie structure is local. Call a (global or local) topological group Lie if it can be given the structure of a (global or local) Lie group.
Lemma 1 (Lie is a local property) A global topological group
is Lie if and only if it is locally Lie. The same statement holds for local groups
as long as they are symmetric.
We sketch a proof of this lemma below the fold. One direction is obvious, as the restriction of a global Lie group to an open neighbourhood of the origin is clearly a local Lie group; for instance, the continuous interval is a symmetric local Lie group. The converse direction is almost as easy, but (because we are not assuming
to be connected) requires one non-trivial fact, namely that local homomorphisms between local Lie groups are automatically smooth; details are provided below the fold.
As with so many other basic classes of objects in mathematics, it is of fundamental importance to specify and study the morphisms between local groups (and group germs). Given two local groups , we can define the notion of a (continuous) homomorphism
between them, defined as a continuous map with
such that whenever are such that
is well-defined, then
is well-defined and equal to
; similarly, whenever
is such that
is well-defined, then
is well-defined and equal to
. (In abstract algebra, the continuity requirement is omitted from the definition of a homomorphism; we will call such maps discrete homomorphisms to distinguish them from the continuous ones which will be the ones studied here.)
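In the discrete setting the homomorphism conditions can be checked exhaustively. The following sketch assumes both local groups are finite sets of integers containing 0, with addition and negation defined exactly when the result stays in the set; continuity is automatic for the discrete topology, so this really tests the notion of a discrete homomorphism. The name `is_discrete_local_homomorphism` is a hypothetical helper.

```python
def is_discrete_local_homomorphism(phi, G, H):
    """Check that phi is a homomorphism between additive discrete local groups G and H.

    G and H are finite sets of integers containing 0; g + h (resp. -g) is
    regarded as well-defined in a set exactly when the result lies in that set.
    """
    G, H = set(G), set(H)
    # phi must map G into H and preserve the identity.
    if any(phi(g) not in H for g in G) or phi(0) != 0:
        return False
    # Whenever g + h is well-defined in G, phi(g) + phi(h) must be
    # well-defined in H and equal to phi(g + h).
    for g in G:
        for h in G:
            if g + h in G:
                s = phi(g) + phi(h)
                if s not in H or s != phi(g + h):
                    return False
    # Whenever -g is well-defined in G, -phi(g) must be well-defined in H
    # and equal to phi(-g).
    for g in G:
        if -g in G and (-phi(g) not in H or phi(-g) != -phi(g)):
            return False
    return True

H = range(-9, 10)
print(is_discrete_local_homomorphism(lambda x: 2 * x, range(-4, 5), H))  # True
print(is_discrete_local_homomorphism(lambda x: x * x, range(-3, 4), H))  # False
print(is_discrete_local_homomorphism(lambda x: 3 * x, range(-4, 5), H))  # False: 3 * 4 = 12 is not in H
```

The doubling map succeeds because every sum that is defined in the source stays inside the target after doubling, whereas the squaring map fails additivity and the tripling map does not even land in the target.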
It is often more convenient to work locally: define a local (continuous) homomorphism from
to
to be a homomorphism from an open neighbourhood
of the identity to
. Given two local homomorphisms
,
from one pair of locally identical groups
to another pair
, we say that
are locally identical if they agree on some open neighbourhood of the identity in
(note that it does not matter here whether we require openness in
, in
, or both). An equivalence class
of local homomorphisms will be called a germ homomorphism (or morphism for short) from the group germ
to the group germ
.
Exercise 5 Show that the class of group germs, equipped with the germ homomorphisms, becomes a category. (Strictly speaking, because group germs are themselves classes rather than sets, the collection of all group germs is a second-order class rather than a class, but this set-theoretic technicality can be resolved in a number of ways (e.g. by restricting all global and local groups under consideration to some fixed “universe”) and should be ignored for this exercise.)
As is usual in category theory, once we have a notion of a morphism, we have a notion of an isomorphism: two group germs are isomorphic if there are germ homomorphisms
,
that invert each other. Lifting back to local groups, the associated notion is that of local isomorphism: two local groups
are locally isomorphic if there exist local isomorphisms
and
from
to
and from
to
that locally invert each other, thus
for
sufficiently close to
, and
for
sufficiently close to
. Note that all local properties of (global or local) groups that can be defined purely in terms of the group and topological structures will be preserved under local isomorphism. Thus, for instance, if
are locally isomorphic local groups, then
is locally connected iff
is,
is locally compact iff
is, and (by Lemma 1)
is Lie iff
is.
Exercise 6
- Show that the additive global groups and
are locally isomorphic.
- Show that every locally path-connected group is locally isomorphic to a path-connected, simply connected group.
— 1. Lie’s third theorem —
Lie’s fundamental theorems of Lie theory link the Lie group germs to Lie algebras. Observe that if is a locally Lie group germ, then the tangent space
at the identity of this germ is well-defined, and is a finite-dimensional vector space. If we choose
to be symmetric, then
can also be identified with the left-invariant (say) vector fields on
, which are first-order differential operators on
. The Lie bracket for vector fields then endows
with the structure of a Lie algebra. It is easy to check that every morphism
of locally Lie germs gives rise (via the derivative map at the identity) to a morphism
of the associated Lie algebras. From the Baker-Campbell-Hausdorff formula (which is valid for local Lie groups, as discussed in this previous post) we conversely see that
uniquely determines the germ homomorphism
. Thus the derivative map provides a covariant functor from the category of locally Lie group germs to the category of (finite-dimensional) Lie algebras. In fact, this functor is an isomorphism, which is part of a fact known as Lie’s third theorem:
Theorem 2 (Lie’s third theorem) For this theorem, all Lie algebras are understood to be finite dimensional (and over the reals).
- Every Lie algebra
is the Lie algebra of a local Lie group germ
, which is unique up to germ isomorphism (fixing
).
- Every Lie algebra
is the Lie algebra of some global connected, simply connected Lie group
, which is unique up to Lie group isomorphism (fixing
).
- Every homomorphism
between Lie algebras is the derivative of a unique germ homomorphism
between the associated local Lie group germs.
- Every homomorphism
between Lie algebras is the derivative of a unique Lie group homomorphism
between the associated global connected, simply connected, Lie groups.
- Every local Lie group germ is the germ of a global connected, simply connected Lie group
, which is unique up to Lie group isomorphism. In particular, every local Lie group is locally isomorphic to a global Lie group.
We record the (standard) proof of this theorem below the fold, which is ultimately based on Ado’s theorem and the Baker-Campbell-Hausdorff formula. Lie’s third theorem (which, actually, was proven in full generality by Cartan) demonstrates the equivalence of three categories: the category of finite-dimensional Lie algebras, the category of local Lie group germs, and the category of connected, simply connected Lie groups.
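To get a feel for how the Baker-Campbell-Hausdorff formula lets the Lie bracket determine the group law, one can test it in the simplest non-abelian case. The sketch below is only an illustration, not part of the proof: it assumes the Lie algebra of strictly upper triangular $3 \times 3$ matrices (the Heisenberg algebra), where all brackets of depth two vanish and the formula truncates to $\log(e^X e^Y) = X + Y + \frac{1}{2}[X,Y]$. Exact rational arithmetic is used so that the comparison is an identity rather than a floating-point approximation; all helper names are hypothetical.

```python
from fractions import Fraction

def heisenberg(a, b, c):
    """Strictly upper triangular 3x3 matrix with superdiagonal a, b and corner c."""
    rows = [[0, a, c], [0, 0, b], [0, 0, 0]]
    return [[Fraction(x) for x in row] for row in rows]

IDENTITY = [[Fraction(int(i == j)) for j in range(3)] for i in range(3)]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def mat_add(A, B):
    return [[A[i][j] + B[i][j] for j in range(3)] for i in range(3)]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def bracket(X, Y):
    """Lie bracket [X, Y] = XY - YX."""
    return mat_add(mat_mul(X, Y), scale(-1, mat_mul(Y, X)))

def exp_nilpotent(N):
    """Matrix exponential of a strictly upper triangular 3x3 matrix.

    Since N^3 = 0, the exponential series terminates: exp(N) = I + N + N^2 / 2.
    """
    return mat_add(mat_add(IDENTITY, N), scale(Fraction(1, 2), mat_mul(N, N)))

X = heisenberg(1, 2, 3)
Y = heisenberg(4, -1, 5)
# Truncated Baker-Campbell-Hausdorff series: Z = X + Y + [X, Y] / 2.
Z = mat_add(mat_add(X, Y), scale(Fraction(1, 2), bracket(X, Y)))

print(mat_mul(exp_nilpotent(X), exp_nilpotent(Y)) == exp_nilpotent(Z))               # True
print(mat_mul(exp_nilpotent(X), exp_nilpotent(Y)) == exp_nilpotent(mat_add(X, Y)))   # False
```

The second comparison fails because $X$ and $Y$ do not commute here, so the bracket correction is genuinely needed.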
— 2. Globalising a local group —
Many properties of a local group improve after passing to a smaller neighbourhood of the identity. Here are some simple examples:
Exercise 7 Let
be a local group.
- Give an example to show that
does not necessarily obey the cancellation laws
for
(with the convention that statements such as
are false if either side is undefined). However, show that there exists an open neighbourhood
of
within which the cancellation law holds.
- Repeat the previous part, but with the cancellation law (1) replaced by the inversion law
- Repeat the previous part, but with the inversion law replaced by the involution law
Note that the counterexamples in the above exercise demonstrate that not every local group is the restriction of a global group, because global groups (and hence, their restrictions) always obey the cancellation law (1), the inversion law (2), and the involution law (3). Another way in which a local group can fail to come from a global group is if it contains relations which can interact in a “global” way to cause trouble, in a fashion which is invisible at the local level. For instance, consider the open unit cube , and consider four points
in this cube that are close to the upper four corners
of this cube respectively. Define an equivalence relation
on this cube by setting
if
and
is equal to either
or
for some
. Note that this is indeed an equivalence relation if
are close enough to the corners (as this forces all non-trivial combinations
to lie outside the doubled cube
). The quotient space
(which is a cube with bits around opposite corners identified together) can then be seen to be a symmetric additive local Lie group, but will usually not come from a global group. Indeed, it is not hard to see that if
is the restriction of a global group
, then
must be a Lie group with Lie algebra
(by Lemma 1), and so the connected component
of
containing the identity is isomorphic to
for some sublattice
of
that contains
; but for generic
, there is no such lattice, as the
will generate a dense subset of
. (The situation here is somewhat analogous to a number of famous Escher prints, such as Ascending and Descending, in which the geometry is locally consistent but globally inconsistent.) We will give this sort of argument in more detail below the fold (see the proof of Proposition 7).
Nevertheless, the space is still locally isomorphic to a global Lie group, namely
; for instance, the open neighbourhood
is isomorphic to
, which is an open neighbourhood of
. More generally, Lie’s third theorem tells us that any local Lie group is locally isomorphic to a global Lie group.
Let us call a local group globalisable if it is locally isomorphic to a global group; thus Lie’s third theorem tells us that every local Lie group is globalisable. Thanks to Goldbring’s solution to the local version of Hilbert’s fifth problem, we also know that locally Euclidean local groups are globalisable. A modification of this argument by van den Dries and Goldbring shows in fact that every locally compact local group is globalisable.
In view of these results, it is tempting to conjecture that all local groups are globalisable; among other things, this would simplify the proof of Lie’s third theorem (and of the local version of Hilbert’s fifth problem). Unfortunately, this claim as stated is false:
Theorem 3 There exist local groups
which are not globalisable.
The counterexamples used to establish Theorem 3 are remarkably delicate; the first example I know of is due to van Est and Korthagen. One reason for this, of course, is that the previous results prevent one from using any local Lie group, or even any locally compact local group, as a counterexample. We will present a (somewhat complicated) example below, based on the unit ball in the infinite-dimensional Banach space .
However, there are certainly many situations in which we can globalise a local group. For instance, this is the case if one has a locally faithful representation of that local group inside a global group:
Lemma 4 (Faithful representation implies globalisability) Let
be a local group, and suppose there exists an injective local homomorphism
from
into a global topological group
with
symmetric. Then
is isomorphic to the restriction of a global topological group to an open neighbourhood of the identity; in particular,
is globalisable.
The material here is based in part on this paper of Olver and this paper of Goldbring.
The classical formulation of Hilbert’s fifth problem asks whether topological groups that have the topological structure of a manifold are necessarily Lie groups. This is indeed the case, thanks to the following theorem of Gleason and Montgomery-Zippin:
Theorem 1 (Hilbert’s fifth problem) Let
be a topological group which is locally Euclidean. Then
is isomorphic to a Lie group.
We have discussed the proof of this result, and of related results, in previous posts. There is however a generalisation of Hilbert’s fifth problem which remains open, namely the Hilbert-Smith conjecture, in which it is a space acted on by the group which has the manifold structure, rather than the group itself:
Conjecture 2 (Hilbert-Smith conjecture) Let
be a locally compact topological group which acts continuously and faithfully (or effectively) on a connected finite-dimensional manifold
. Then
is isomorphic to a Lie group.
Note that Conjecture 2 easily implies Theorem 1 as one can pass to the connected component of a locally Euclidean group (which is clearly locally compact), and then look at the action of
on itself by left-multiplication.
The hypothesis that the action is faithful (i.e. each non-identity group element acts non-trivially on
) cannot be completely eliminated, as any group
will have a trivial action on any space
. The requirement that
be locally compact is similarly necessary: consider for instance the diffeomorphism group
of, say, the unit circle
, which acts on
but is infinite dimensional and is not locally compact (with, say, the uniform topology). Finally, the connectedness of
is also important: the infinite torus
(with the product topology) acts faithfully on the disconnected manifold
by the action
The conjecture in full generality remains open. However, there are a number of partial results. For instance, it was observed by Montgomery and Zippin that the conjecture is true for transitive actions, by a modification of the argument used to establish Theorem 1. This special case of the Hilbert-Smith conjecture (or more precisely, a generalisation thereof in which “finite-dimensional manifold” was replaced by “locally connected locally compact finite-dimensional”) was used in Gromov’s proof of his famous theorem on groups of polynomial growth. I record the argument of Montgomery and Zippin below the fold.
Another partial result is the reduction of the Hilbert-Smith conjecture to the -adic case. Indeed, it is known that Conjecture 2 is equivalent to
Conjecture 3 (Hilbert-Smith conjecture for
-adic actions) It is not possible for a
-adic group
to act continuously and effectively on a connected finite-dimensional manifold
.
The reduction to the -adic case follows from the structural theory of locally compact groups (specifically, the Gleason-Yamabe theorem discussed in previous posts) and some results of Newman that sharply restrict the ability of periodic actions on a manifold
to be close to the identity. I record this argument (which appears for instance in this paper of Lee) below the fold also.