As someone who had a relatively light graduate education in algebra, the import of Yoneda’s lemma in category theory has always eluded me somewhat; the statement and proof are simple enough, but definitely have the “abstract nonsense” flavor that one often ascribes to this part of mathematics, and I struggled to connect it to the more grounded forms of intuition, such as those based on concrete examples, that I was more comfortable with. There is a popular MathOverflow post devoted to this question, with many answers that were helpful to me, but I still felt vaguely dissatisfied. However, recently when pondering the very concrete concept of a polynomial, I accidentally stumbled upon a special case of Yoneda’s lemma in action, which clarified this lemma conceptually for me. In the end it was a very simple observation (and would be extremely pedestrian to anyone who works in an algebraic field of mathematics), but as I found it helpful to a non-algebraist such as myself, I thought I would share it here in case others similarly find it helpful.
In algebra we see a distinction between a polynomial form (also known as a formal polynomial), and a polynomial function, although this distinction is often elided in more concrete applications. A polynomial form in, say, one variable with integer coefficients, is a formal expression of the form

$\displaystyle P(x) = a_n x^n + \dots + a_1 x + a_0 \ \ \ \ \ (1)$

where $a_0, \dots, a_n$ are coefficients in the integers, and $x$ is an indeterminate: a symbol that is often intended to be interpreted as an integer, real number, complex number, or element of some more general ring $R$, but is for now a purely formal object. The collection of such polynomial forms is denoted ${\bf Z}[x]$, and is a commutative ring.

A polynomial form $P$ can be interpreted in any ring $R$ (even non-commutative ones) to create a polynomial function $P_R: R \rightarrow R$, defined by the formula

$\displaystyle P_R(r) := a_n r^n + \dots + a_1 r + a_0 \ \ \ \ \ (2)$

for any $r \in R$. This definition (2) looks so similar to the definition (1) that we usually abuse notation and conflate $P_R$ with $P$. This conflation is supported by the identity theorem for polynomials, which asserts that if two polynomial forms $P$, $Q$ agree at an infinite number of (say) complex numbers, thus $P_{\bf C}(z) = Q_{\bf C}(z)$ for infinitely many $z \in {\bf C}$, then they agree as polynomial forms (i.e., their coefficients match). But this conflation is sometimes dangerous, particularly when working in finite characteristic. For instance:
- (i) The linear forms $x$ and $-x$ are distinct as polynomial forms, but agree when interpreted in the ring ${\bf Z}/2{\bf Z}$, since $r = -r$ for all $r \in {\bf Z}/2{\bf Z}$.
- (ii) Similarly, if $p$ is a prime, then the degree one form $x$ and the degree $p$ form $x^p$ are distinct as polynomial forms (and in particular have distinct degrees), but agree when interpreted in the ring ${\bf Z}/p{\bf Z}$, thanks to Fermat’s little theorem.
- (iii) The polynomial form $x^2+1$ has no roots when interpreted in the reals ${\bf R}$, but has roots when interpreted in the complex numbers ${\bf C}$. Similarly, the linear form $2x-1$ has no roots when interpreted in the integers ${\bf Z}$, but has roots when interpreted in the rationals ${\bf Q}$.
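The finite-characteristic examples (i) and (ii) are easy to check numerically. The following is a small sketch in Python (not from the post; the helper `poly_eval` is mine), verifying that the forms induce identical functions despite having different coefficient lists:

```python
# Numerical check of examples (i) and (ii): distinct polynomial forms
# can induce identical functions on a finite ring such as Z/pZ.

def poly_eval(coeffs, x, p):
    """Evaluate a polynomial form (coefficients listed with the constant
    term first) at x in the ring Z/pZ, via Horner's rule."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

# Example (i): the forms x and -x agree as functions on Z/2Z.
assert [0, 1] != [0, -1]  # distinct as forms
assert all(poly_eval([0, 1], r, 2) == poly_eval([0, -1], r, 2)
           for r in range(2))

# Example (ii): the forms x and x^p agree as functions on Z/pZ
# by Fermat's little theorem, despite having different degrees.
p = 7
form_x = [0, 1]            # x
form_xp = [0] * p + [1]    # x^p
assert form_x != form_xp   # distinct as forms
assert all(poly_eval(form_x, r, p) == poly_eval(form_xp, r, p)
           for r in range(p))
```
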
The above examples show that if one only interprets polynomial forms in a specific ring $R$, then some information about the polynomial could be lost (and some features of the polynomial, such as roots, may be “invisible” to that interpretation). But this turns out not to be the case if one considers interpretations in all rings simultaneously, as we shall now discuss.
If $R$, $S$ are two different rings, then the polynomial functions $P_R: R \rightarrow R$ and $P_S: S \rightarrow S$ arising from interpreting a polynomial form $P$ in these two rings are, strictly speaking, different functions. However, they are often closely related to each other. For instance, if $R$ is a subring of $S$, then $P_R$ agrees with the restriction of $P_S$ to $R$. More generally, if there is a ring homomorphism $\phi: R \rightarrow S$ from $R$ to $S$, then $P_R$ and $P_S$ are intertwined by the relation

$\displaystyle \phi( P_R(r) ) = P_S( \phi(r) ) \ \ \ \ \ (3)$

which basically asserts that ring homomorphisms respect polynomial operations. Note that the previous observation corresponded to the case when $\phi$ was an inclusion homomorphism. Another example comes from the complex conjugation automorphism $z \mapsto \overline{z}$ on the complex numbers, in which case (3) asserts the identity

$\displaystyle \overline{P_{\bf C}(z)} = P_{\bf C}(\overline{z})$

for any polynomial function $P_{\bf C}$ on the complex numbers, and any complex number $z$.

What was surprising to me (as someone who had not internalized the Yoneda lemma) was that the converse statement was true: if one had a function $F_R: R \rightarrow R$ associated to every ring $R$ that obeyed the intertwining relation

$\displaystyle \phi( F_R(r) ) = F_S( \phi(r) ) \ \ \ \ \ (4)$
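For instance, the conjugation instance of (3) is easy to test numerically; here is a quick sketch (the particular polynomial is an arbitrary choice of mine, not one from the post):

```python
# Check the intertwining identity for complex conjugation: for a
# polynomial P with integer coefficients, conj(P(z)) = P(conj(z)).

def P(z):
    # An arbitrary integer-coefficient polynomial: 3z^4 - 2z^2 + 5z - 1.
    return 3 * z**4 - 2 * z**2 + 5 * z - 1

for z in [2 + 3j, -1.5 + 0.25j, 1j]:
    assert abs(P(z).conjugate() - P(z.conjugate())) < 1e-9
```

The identity holds precisely because the coefficients are integers (fixed by conjugation); a "polynomial" with a coefficient like $i$ would fail it.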
for every ring homomorphism $\phi: R \rightarrow S$, then there was a unique polynomial form $P \in {\bf Z}[x]$ such that $F_R = P_R$ for all rings $R$. This seemed surprising to me because the functions $F_R$ were a priori arbitrary functions, and as an analyst I would not expect them to have polynomial structure. But the fact that (4) holds for all rings and all homomorphisms is in fact rather powerful. As an analyst, I am tempted to proceed by first working with the ring ${\bf C}$ of complex numbers and taking advantage of the aforementioned identity theorem, but this turns out to be tricky because ${\bf C}$ does not “talk” to all the other rings enough, in the sense that there are not always as many ring homomorphisms from ${\bf C}$ to a given ring $R$ as one would like. But there is in fact a more elementary argument that takes advantage of a ring that is particularly relevant (and “talkative”) to the theory of polynomials, namely the ring ${\bf Z}[x]$ of polynomial forms themselves. Given any other ring $R$, and any element $r$ of that ring, there is a unique ring homomorphism $\phi_r: {\bf Z}[x] \rightarrow R$ from ${\bf Z}[x]$ to $R$ that maps $x$ to $r$, namely the evaluation map that sends a polynomial form $P$ to its evaluation $P_R(r)$ at $r$. Applying (4) to this ring homomorphism, and specializing to the element $x$ of ${\bf Z}[x]$, we conclude that

$\displaystyle F_R(r) = \phi_r( F_{{\bf Z}[x]}(x) )$

for any ring $R$ and any $r \in R$. If we then define $P$ to be the formal polynomial $F_{{\bf Z}[x]}(x)$, then this identity can be rewritten as

$\displaystyle F_R(r) = P_R(r),$

and so we have indeed shown that the family $(F_R)_R$ arises from a polynomial form $P$. Conversely, from the identity

$\displaystyle P = P_{{\bf Z}[x]}(x),$

valid for any polynomial form $P$, we see that two polynomial forms can only generate the same polynomial functions for all rings $R$ if they are identical as polynomial forms. So the polynomial form $P$ associated to the family $(F_R)_R$ is unique.

We have thus created an identification of form and function: polynomial forms are in one-to-one correspondence with families of functions obeying the intertwining relation (4). But this identification can be interpreted as a special case of the Yoneda lemma, as follows.
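The extraction step above, applying $F$ in the ring of polynomial forms and evaluating at the indeterminate $x$, can be illustrated with a toy computation. The sketch below is my own and not from the post: the minimal `Poly` class stands in for ${\bf Z}[x]$, and `F` is a sample family defined by the same ring operations in every ring (which is what makes it obey (4)).

```python
# Toy illustration: a family F_R defined uniformly by ring operations
# in every ring R, evaluated in Z[x] at the element x, recovers the
# underlying polynomial form.  Poly is a minimal stand-in for Z[x].

class Poly:
    def __init__(self, coeffs):
        # Coefficients, constant term first; trim trailing zeros.
        c = list(coeffs)
        while c and c[-1] == 0:
            c.pop()
        self.coeffs = c

    def _lift(self, other):
        # Interpret an integer as a constant polynomial.
        return other if isinstance(other, Poly) else Poly([other])

    def __add__(self, other):
        o = self._lift(other)
        n = max(len(self.coeffs), len(o.coeffs))
        a = self.coeffs + [0] * (n - len(self.coeffs))
        b = o.coeffs + [0] * (n - len(o.coeffs))
        return Poly([u + v for u, v in zip(a, b)])

    __radd__ = __add__

    def __mul__(self, other):
        o = self._lift(other)
        out = [0] * (len(self.coeffs) + len(o.coeffs) - 1 or 1)
        for i, a in enumerate(self.coeffs):
            for j, b in enumerate(o.coeffs):
                out[i + j] += a * b
        return Poly(out)

    __rmul__ = __mul__

def F(r):
    # The same formula in every ring: built from ring operations and
    # integer constants alone, so the family (F_R) satisfies (4).
    return r * r * r + 2 * r + 7

# Evaluating F in Z[x] at the indeterminate x recovers the form:
x = Poly([0, 1])
assert F(x).coeffs == [7, 2, 0, 1]   # i.e. 7 + 2x + x^3

# Sanity check: the same recipe in the ring Z gives the same values.
assert F(5) == 5**3 + 2 * 5 + 7
```

Any function built purely from ring operations and integer constants is recovered this way; a function that used, say, complex conjugation or absolute values would not define such an intertwining family in the first place.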
There are two categories in play here: the category ${\bf Ring}$ of rings (where the morphisms are ring homomorphisms), and the category ${\bf Set}$ of sets (where the morphisms are arbitrary functions). There is an obvious forgetful functor ${\rm Forget}: {\bf Ring} \rightarrow {\bf Set}$ between these two categories that takes a ring and removes all of the algebraic structure, leaving behind just the underlying set. A collection of functions $F_R: R \rightarrow R$ (i.e., ${\bf Set}$-morphisms) for each ring $R$ in ${\bf Ring}$ that obeys the intertwining relation (4) is precisely the same thing as a natural transformation from the forgetful functor to itself. So we have identified formal polynomials in ${\bf Z}[x]$, as a set, with natural endomorphisms of the forgetful functor:

$\displaystyle {\bf Z}[x] \equiv \mathrm{Nat}({\rm Forget}, {\rm Forget}). \ \ \ \ \ (5)$
Informally: polynomial forms are precisely those operations on rings that are respected by ring homomorphisms.

What does this have to do with Yoneda’s lemma? Well, remember that every element $r$ of a ring $R$ came with an evaluation homomorphism $\phi_r: {\bf Z}[x] \rightarrow R$. Conversely, every homomorphism from ${\bf Z}[x]$ to $R$ will be of the form $\phi_r$ for a unique $r \in R$ – indeed, $r$ will just be the image of $x$ under this homomorphism. So the evaluation homomorphism provides a one-to-one correspondence between elements of $R$, and ring homomorphisms in $\mathrm{Hom}({\bf Z}[x], R)$. This correspondence is at the level of sets, so this gives the identification

$\displaystyle {\rm Forget}(R) \equiv \mathrm{Hom}({\bf Z}[x], R).$
Thus our identification can be written as

$\displaystyle \mathrm{Nat}(\mathrm{Hom}({\bf Z}[x], -), {\rm Forget}) \equiv {\rm Forget}({\bf Z}[x]),$

which is now clearly a special case of the Yoneda lemma

$\displaystyle \mathrm{Nat}(\mathrm{Hom}(A, -), F) \equiv F(A)$

that applies to any functor $F: {\mathcal C} \rightarrow {\bf Set}$ from a (locally small) category ${\mathcal C}$ and any object $A$ in ${\mathcal C}$. And indeed, if one inspects the standard proof of this lemma, it is essentially the same argument as the argument we used above to establish the identification (5). More generally, it seems to me that the Yoneda lemma is often used to identify “formal” objects with their “functional” interpretations, as long as one simultaneously considers interpretations across an entire category (such as the category of rings), as opposed to just a single interpretation in a single object of the category, in which case there may be some loss of information due to the peculiarities of that specific object. Grothendieck’s “functor of points” interpretation of a scheme, discussed in this previous blog post, is one typical example of this.
Kevin Ford, Dimitris Koukoulopoulos and I have just uploaded to the arXiv our paper “A lower bound on the mean value of the Erdős–Hooley delta function“. This paper complements the recent paper of Dimitris and myself obtaining the upper bound

$\displaystyle \frac{1}{x} \sum_{n \leq x} \Delta(n) \ll (\log\log x)^{11/4}$

on the mean value of the Erdős–Hooley delta function

$\displaystyle \Delta(n) := \sup_u \# \{ d|n: e^u < d \leq e^{u+1} \}.$
on the mean value of the Erdős-Hooley delta function In this paper we obtain a lower bound where is an exponent that arose in previous work of result of Ford, Green, and Koukoulopoulos, who showed that for all outside of a set of density zero. The previous best known lower bound for the mean value was due to Hall and Tenenbaum.The point is the main contributions to the mean value of are driven not by “typical” numbers of some size , but rather of numbers that have a splitting
where $m$ is the product of the primes of $n$ between some intermediate threshold $z$ and $x$, and behaves “typically” (so in particular, it has about $\log\log x$ prime factors, as per the Hardy–Ramanujan law and the Erdős–Kac law), but $d$ is the product of the primes of $n$ up to $z$ and has double the number of typical prime factors – $2\log\log z$, rather than $\log\log z$ – thus $d$ is the type of number that would make a significant contribution to the mean value of the divisor function $\tau$. Here the threshold $z$ ranges over a sequence of intermediate scales indexed by an integer in a range governed by a small constant; the different values of $z$ give essentially disjoint contributions.

From the easy pigeonhole lower bound (2) for $\Delta(n)$ in terms of $\tau(d)$ and $\Delta(m)$, and the fact that a suitably normalized divisor function has mean about one, one would expect to get the above result provided that one could get a lower bound of the form (1) for most typical $m$ with prime factors between $z$ and $x$. Unfortunately, due to the lack of small prime factors in $m$, the arguments of Ford, Green, and Koukoulopoulos that give (1) for typical $n$ do not quite work for the rougher numbers $m$.

However, it turns out that one can get around this problem by replacing (2) by a more efficient inequality involving the enlarged quantity

$\displaystyle \Delta^{(v)}(m) := \sup_u \# \{ d|m: e^u < d \leq e^{u+v} \},$

which recovers $\Delta(m)$ when $v=1$. This inequality is easily proven by applying the pigeonhole principle to the factors of $n$ of the form $d'm'$, where $d'$ is one of the factors of $d$, and $m'$ is one of the factors of $m$ in the optimal interval $(e^u, e^{u+v}]$. The extra room provided by the enlargement of the range turns out to be sufficient to adapt the Ford–Green–Koukoulopoulos argument to the rough setting. In fact we are able to use the main technical estimate from that paper as a “black box”, namely that if one considers a random set of natural numbers in a suitable range, with each element lying in the set independently with an appropriate probability, then with high probability many of the subset sums of this random set attain the same value.
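To make the delta function concrete, here is a brute-force computation of $\Delta(n)$ for small $n$ (my own sketch, directly from the definition $\Delta(n) = \sup_u \#\{d|n: e^u < d \leq e^{u+1}\}$; the endpoint convention does not affect the maximum):

```python
# Brute-force computation of the Erdos-Hooley delta function for small n.
import math

def divisors(n):
    """All divisors of n, by trial division up to sqrt(n)."""
    divs = []
    for d in range(1, math.isqrt(n) + 1):
        if n % d == 0:
            divs.append(d)
            if d != n // d:
                divs.append(n // d)
    return sorted(divs)

def delta(n):
    """Max number of divisors of n in a window of length 1 in log scale.
    The maximum over u is attained when the window starts at the log of
    some divisor, so it suffices to slide over those values."""
    logs = [math.log(d) for d in divisors(n)]
    return max(sum(1 for v in logs if u <= v < u + 1) for u in logs)

assert delta(7) == 1    # divisors 1, 7 are far apart in log scale
assert delta(6) == 2    # divisors 2, 3 share a unit log-window
assert delta(12) == 3   # divisors 2, 3, 4 (or 3, 4, 6) share a window
```
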
(Initially, what “high probability” means is just “close to $1$”, but one can reduce the failure probability significantly by a “tensor power trick” taking advantage of Bennett’s inequality.)

I have just uploaded to the arXiv my paper “The convergence of an alternating series of Erdős, assuming the Hardy–Littlewood prime tuples conjecture“. This paper concerns an old problem of Erdős concerning whether the alternating series $\sum_{n=1}^\infty \frac{(-1)^n n}{p_n}$ converges, where $p_n$ denotes the $n^{\rm th}$ prime. The main result of this paper is that the answer to this question is affirmative assuming a sufficiently strong version of the Hardy–Littlewood prime tuples conjecture.
The alternating series test does not apply here because the ratios $\frac{n}{p_n}$ are not monotonically decreasing. The deviations from monotonicity arise from fluctuations in the prime gaps $p_{n+1}-p_n$, so the enemy arises from biases in the prime gaps for odd and even $n$. By changing variables from $n$ to $p_n$ (or more precisely, to the integers in the range between $p_n$ and $p_{n+1}$), this is basically equivalent to biases in the parity of the prime counting function $\pi(x)$. Indeed, it is an unpublished observation of Said that the convergence of $\sum_{n=1}^\infty \frac{(-1)^n n}{p_n}$ is equivalent to the convergence of $\int_2^\infty (-1)^{\pi(x)} \frac{dx}{x}$. So this question is really about trying to get a sufficiently strong amount of equidistribution for the parity of $\pi(x)$.
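None of this can be settled numerically – any convergence would be far too slow to see – but the slow oscillation of the partial sums is easy to observe. A quick experiment of my own (not from the paper):

```python
# Numerical experiment: partial sums of the alternating series
# sum_n (-1)^n n / p_n.  This only illustrates the slow oscillation of
# the partial sums; it is not evidence for or against convergence.

def primes_up_to(limit):
    """Simple sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(sieve[p * p :: p]))
    return [i for i, is_p in enumerate(sieve) if is_p]

primes = primes_up_to(10 ** 6)
s, partial = 0.0, []
for n, p in enumerate(primes, start=1):
    s += (-1) ** n * n / p
    partial.append(s)

# First two partial sums by hand: -1/2, then -1/2 + 2/3 = 1/6.
assert abs(partial[0] + 0.5) < 1e-12
assert abs(partial[1] - 1 / 6) < 1e-12

# The tail of the partial sums drifts only very slowly:
tail = partial[len(partial) // 2 :]
print(min(tail), max(tail))
```
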
The prime tuples conjecture does not directly say much about the value of the prime counting function $\pi(x)$; however, it can be used to control differences $\pi(x+\lambda \log x) - \pi(x)$ for $\lambda > 0$ not too large. Indeed, it is a famous calculation of Gallagher that for fixed $\lambda$, and $x$ chosen randomly from a range such as $[N, 2N]$, the quantity $\pi(x+\lambda\log x) - \pi(x)$ is asymptotically distributed according to the Poisson distribution of mean $\lambda$, if the prime tuples conjecture holds. In particular, the parity $(-1)^{\pi(x+\lambda\log x) - \pi(x)}$ of this quantity should have mean asymptotic to $e^{-2\lambda}$. An application of the van der Corput $A$-process then gives some decay on the mean of $(-1)^{\pi(x)}$ as well. Unfortunately, this decay is a bit too weak for this problem; even if one uses the most quantitative version of Gallagher’s calculation, worked out in a recent paper of (Vivian) Kuperberg, the best bound on the mean of $(-1)^{\pi(x)}$ is something like a single power of $\frac{1}{\log x}$, which is not quite strong enough to overcome the doubly logarithmic divergence of $\int^\infty \frac{dx}{x\log x}$.
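The Poisson parity heuristic is easy to verify: if $N$ is Poisson with mean $\lambda$, then ${\bf E}(-1)^N = \sum_k e^{-\lambda} \frac{\lambda^k}{k!} (-1)^k = e^{-2\lambda}$. A quick check of this identity (my own sketch):

```python
# Verify that E[(-1)^N] = e^{-2 lam} for N Poisson with mean lam,
# by summing the (rapidly convergent) alternating probability series.
import math

def parity_mean(lam, terms=100):
    total = 0.0
    term = math.exp(-lam)      # P(N = 0)
    for k in range(terms):
        total += (-1) ** k * term
        term *= lam / (k + 1)  # advance P(N = k) -> P(N = k + 1)
    return total

for lam in [0.0, 0.1, 1.0, 2.5, 5.0]:
    assert abs(parity_mean(lam) - math.exp(-2 * lam)) < 1e-12
```
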
To get around this obstacle, we take advantage of the random sifted model of the primes that was introduced in a paper of Banks, Ford, and myself. To model the primes in a short interval $I = [x, x+\lambda\log x]$, with $x$ drawn randomly from a range such as $[N, 2N]$, we remove one random residue class from this interval for all primes up to Pólya’s “magic cutoff”. The prime tuples conjecture can then be interpreted as the assertion that the random set produced by this sieving process is statistically a good model for the primes in $I$. After some standard manipulations (using a version of the Bonferroni inequalities, as well as some upper bounds of Kuperberg), the problem then boils down to getting sufficiently strong estimates for the expected parity of the cardinality of the random sifted set.
For this problem, the main advantage of working with the random sifted model, rather than with the primes themselves or with the singular series arising from the prime tuples conjecture, is that the sifted model can be studied iteratively via the partially sifted sets $A_z$ arising from sifting by the primes up to some intermediate threshold $z$, and that the expected parity of the cardinality of $A_z$ experiences some decay in $z$. Indeed, once $z$ exceeds the length of the interval being sifted, sifting by an additional prime $p$ will cause $A_z$ to lose one element with probability $\# A_z / p$, and to remain unchanged with probability $1 - \# A_z / p$. If $\# A_z$ concentrates around some value $s$, this suggests that the expected parity will decay by a factor of about $1 - \frac{2s}{p}$ with each such sifting step, and iterating this over the remaining primes should give good bounds on the final expected parity. It turns out that existing second moment calculations of Montgomery and Soundararajan suffice to obtain enough concentration to make this strategy work.
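The iterative sifting heuristic can be simulated directly. The following Monte Carlo sketch is my own, with arbitrary small parameters: it sifts an interval by one random residue class for each prime up to a cutoff, and tracks the expected parity of the survivor count.

```python
# Monte Carlo sketch of the iterative sifting heuristic: sift an
# interval by one random residue class mod p for each prime p up to a
# cutoff, and track the expected parity (-1)^{#A} of the survivors.
import random

def small_primes(limit):
    """Primes up to limit, by trial division (fine for small limits)."""
    return [p for p in range(2, limit + 1)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]

def sifted_parity_mean(length=50, cutoff=200, trials=2000, seed=0):
    rng = random.Random(seed)
    ps = small_primes(cutoff)
    total = 0
    for _ in range(trials):
        a = set(range(length))
        for p in ps:
            r = rng.randrange(p)          # one random residue class mod p
            a = {n for n in a if n % p != r}
        total += (-1) ** len(a)
    return total / trials

m = sifted_parity_mean()
# Once p exceeds the interval length, each sifting step flips the parity
# with probability #A/p, damping the mean parity; it should sit strictly
# inside (-1, 1) rather than at the extremes.
assert -1.0 < m < 1.0
print(m)
```
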