You are currently browsing the category archive for the ‘math.NT’ category.
I’ve just uploaded to the arXiv my paper “Dense sets of natural numbers with unusually large least common multiples“. This short paper answers (in the negative) a somewhat obscure question of Erdös and Graham:
Problem 1 Is it true that if is a set of natural numbers for which goes to infinity as , then the quantity also goes to infinity as ?
At first glance, this problem may seem rather arbitrary, but it can be motivated as follows. The hypothesis that (1) goes to infinity is a largeness condition on ; in view of Mertens’ theorem, it can be viewed as an assertion that is denser than the set of primes. On the other hand, the conclusion that (2) grows is an assertion that becomes significantly larger than on the average for large ; that is to say, that many pairs of numbers in share a common factor. Intuitively, the problem is then asking whether sets that are significantly denser than the primes must start having lots of common factors on average.
For sake of comparison, it is easy to see that if (1) goes to infinity, then at least one pair of distinct elements in must have a non-trivial common factor. For if this were not the case, then the elements of are pairwise coprime, so each prime has at most one multiple in , and so can contribute at most to the sum in (1), and hence by Mertens’ theorem, and the fact that every natural number greater than one is divisible by at least one prime , the quantity (1) stays bounded, a contradiction.
It turns out, though, that the answer to the above problem is negative; one can find sets that are denser than the primes, but for which (2) stays bounded, so that the least common multiples in the set are unusually large. It was a bit surprising to me that this question had not been resolved long ago (in fact, I was not able to find any prior literature on the problem beyond the original reference of Erdös and Graham); in contrast, another problem of Erdös and Graham concerning sets with unusually small least common multiples was extensively studied (and essentially solved) about twenty years ago, while the study of sets with unusually large greatest common divisor for many pairs in the set has recently become somewhat popular, due to their role in the proof of the Duffin-Schaeffer conjecture by Koukoulopoulos and Maynard.
To search for counterexamples, it is natural to look for numbers with relatively few prime factors, in order to reduce their common factors and increase their least common multiple. A particularly simple example, whose verification is on the level of an exercise in a graduate analytic number theory course, is the set of semiprimes (products of two primes), for which one can readily verify that (1) grows like but (2) stays bounded. With a bit more effort, I was able to optimize the construction and uncover the true threshold for boundedness of (2), which was a little unexpected:
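As a quick numerical illustration of the semiprime example (a back-of-the-envelope sketch of my own, not taken from the paper), one can compare the reciprocal sums of the primes and the semiprimes: Mertens' theorem predicts the former grows like log log x, while the latter grows like (log log x)^2/2 and so eventually pulls ahead.

```python
# Sketch: compare the reciprocal sums of primes and semiprimes up to x.
# By Mertens' theorem the prime sum grows like log log x, while the
# semiprime sum grows like (log log x)^2 / 2.

def sieve_big_omega(limit):
    """Omega(n) = number of prime factors counted with multiplicity, for 0..limit."""
    omega = [0] * (limit + 1)
    for p in range(2, limit + 1):
        if omega[p] == 0:  # p has no smaller prime factor, so p is prime
            for k in range(p, limit + 1, p):
                m = k
                while m % p == 0:  # add the multiplicity of p in k
                    omega[k] += 1
                    m //= p
    return omega

x = 10**5
omega = sieve_big_omega(x)
primes = [n for n in range(2, x + 1) if omega[n] == 1]
semiprimes = [n for n in range(2, x + 1) if omega[n] == 2]
prime_sum = sum(1.0 / n for n in primes)
semi_sum = sum(1.0 / n for n in semiprimes)
print(prime_sum, semi_sum)
```

(The helper name `sieve_big_omega` is mine; the computation is only meant to make the growth rates concrete.)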
Theorem 2
The proofs are not particularly long or deep, but I thought I would record here some of the process towards finding them. My first step was to try to simplify the condition that (2) stays bounded. In order to use probabilistic intuition, I first expressed this condition in probabilistic terms as
for large , where are independent random variables drawn from with probability density function The presence of the least common multiple in the denominator is annoying, but one can easily flip the expression to the greatest common divisor: If the expression were a product of a function of and a function of , then by independence this expectation would decouple into simpler averages involving just one random variable instead of two. Of course, the greatest common divisor is not of this form, but there is a standard trick in analytic number theory to decouple the greatest common divisor, namely to use the classic Gauss identity , with the Euler totient function, to write Inserting this formula and interchanging the sum and expectation, we can now express the condition as bounding a sum of squares: Thus, the condition (2) is really an assertion to the effect that typical elements of do not have many divisors. From experience in sieve theory, the probabilities tend to behave multiplicatively in , so the expression here heuristically behaves like an Euler product that looks something like and so the condition (2) is morally something like Comparing this with Mertens' theorems, this leads to the heuristic prediction that (for a typical prime much smaller than ) should decay somewhat like (ignoring for now factors of ). This can be compared to the example of the set of primes or semiprimes on one hand, where the probability is like , and the set of all natural numbers on the other hand, where the probability is like . So the critical behavior should come from sets that are in some sense “halfway” between the primes and the natural numbers.

It is then natural to try a random construction, in which one sieves out the natural numbers by permitting each natural number to survive with a probability resembling , in order to get the predicted behavior for .
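The decoupling step here rests on Gauss's identity, which states that n is the sum of the totients of its divisors, so that the gcd of two numbers can be expanded as a sum of totients over common divisors. This is easy to verify numerically (a quick sketch; the helper names are my own, and the trial-division totient is only suitable for small inputs):

```python
import math

def phi(n):
    """Euler totient via trial-division factorization (fine for small n)."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p  # multiply result by (1 - 1/p)
        p += 1
    if m > 1:  # leftover prime factor
        result -= result // m
    return result

def gcd_via_gauss(n, m):
    """gcd(n, m) as the sum of phi(d) over d dividing both n and m."""
    return sum(phi(d) for d in range(1, min(n, m) + 1)
               if n % d == 0 and m % d == 0)

# Check the expansion against the built-in gcd on a small range.
for n in range(1, 60):
    for m in range(1, 60):
        assert gcd_via_gauss(n, m) == math.gcd(n, m)
print(gcd_via_gauss(12, 18))
```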
Performing some standard calculations, this construction could ensure (2) bounded with a density a little bit less than the one stated in the main theorem; after optimizing the parameters, I could only get something like
I was stuck on optimizing the construction further, so I turned my attention to a positive result in the spirit of (ii) of the main theorem. On playing around with (3), I observed that one could use Cauchy-Schwarz and Mertens' theorem to obtain the bound which was in line with the previous heuristic that should behave like . The left-hand side had a simple interpretation: by linearity of expectation, it was the expected number of prime factors of . So the boundedness of (2) implied that a typical element of only had about prime factors, in contrast to the predicted by the Hardy-Ramanujan law. Standard methods from the anatomy of integers can then be used to see how dense a set with that many prime factors could be, and this soon led to a short proof of part (ii) of the main theorem (I eventually found for instance that Jensen's inequality could be used to create a particularly slick argument).

It then remained to improve the lower bound construction to eliminate the losses in the exponents. By deconstructing the proof of the upper bound, it became natural to consider something like the set of natural numbers that had at most prime factors. This construction actually worked for some scales – namely those for which was a natural number – but there were some strange “discontinuities” in the analysis that prevented me from establishing the boundedness of (2) for arbitrary scales . The basic problem was that increasing the number of permitted prime factors from one natural number threshold to another ended up increasing the density of the set by an unbounded factor (of the order of , in practice), which heavily disrupted the task of trying to keep the ratio (2) bounded.
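Incidentally, the Hardy–Ramanujan prediction invoked here (a typical number up to x has about log log x prime factors) is easy to check empirically. The sketch below (my own code, using the standard Mertens-type constant 0.2615 for the average of the number of distinct prime factors) computes the average of omega(n):

```python
import math

def omega_counts(limit):
    """omega(n) = number of distinct prime factors, for 0..limit."""
    omega = [0] * (limit + 1)
    for p in range(2, limit + 1):
        if omega[p] == 0:  # p is prime
            for k in range(p, limit + 1, p):
                omega[k] += 1
    return omega

x = 10**5
omega = omega_counts(x)
avg = sum(omega[2:]) / (x - 1)
# The average of omega(n) over n <= x is log log x + B_1 + o(1),
# with B_1 (the Mertens constant) approximately 0.2615.
print(avg, math.log(math.log(x)) + 0.2615)
```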
Usually the resolution to these sorts of discontinuities is to use some sort of random “average” of two or more deterministic constructions – for instance, by taking some random union of some numbers with prime factors and some numbers with prime factors – but the numerology turned out to be somewhat unfavorable, allowing for some improvement in the lower bounds over my previous construction, but not enough to close the gap entirely. It was only after substantial trial and error that I was able to find a working deterministic construction, where at a given scale one collected either numbers with at most prime factors, or numbers with prime factors but with the largest prime factor in a specific range, in which I could finally get the numerator and denominator in (2) to be in balance for every . But once the construction was written down, the verification of the required properties ended up being quite routine.
Many modern mathematical proofs are a combination of conceptual arguments and technical calculations. There is something of a tradeoff between the two: one can add more conceptual arguments to try to reduce the technical computations, or vice versa. (Among other things, this leads to a Berkson paradox-like phenomenon in which a negative correlation can be observed between the two aspects of a proof; see this recent Mastodon post of mine for more discussion.)
In a recent article, Heather Macbeth argues that the preferred balance between conceptual and computational arguments is quite different for a computer-assisted proof than it is for a purely human-readable proof. In the latter, there is a strong incentive to minimize the amount of calculation to the point where it can be checked by hand, even if this requires a certain amount of ad hoc rearrangement of cases, unmotivated parameter selection, or otherwise non-conceptual additions to the arguments in order to reduce the calculation. But in the former, once one is willing to outsource any tedious verification or optimization task to a computer, the incentives are reversed: freed from the need to arrange the argument to reduce the amount of calculation, one can now describe an argument by listing the main ingredients and then letting the computer figure out a suitable way to combine them to give the stated result. The two approaches can thus be viewed as complementary ways to describe a result, with neither necessarily being superior to the other.
In this post, I would like to illustrate this computation-outsourced approach with the topic of zero-density theorems for the Riemann zeta function, in which all computer verifiable calculations (as well as other routine but tedious arguments) are performed “off-stage”, with the intent of focusing only on the conceptual inputs to these theorems.
Zero-density theorems concern upper bounds for the quantity for a given and large , which is defined as the number of zeroes of the Riemann zeta function in the rectangle . (There is also an important generalization of this quantity to -functions, but for simplicity we will focus on the classical zeta function case here). Such quantities are important in analytic number theory for many reasons, one of which is through explicit formulae such as the Riemann-von Mangoldt explicit formula
relating the prime numbers to the zeroes of the zeta function (the “music of the primes”). The better bounds one has on , the more control one has on the complicated term on the right-hand side.

Clearly is non-increasing in . The Riemann-von Mangoldt formula, together with the functional equation, gives us the asymptotic
in the case, while the prime number theorem tells us that The various zero-free regions for the zeta function can be viewed as slight improvements to (2); for instance, the classical zero-free region is equivalent to the assertion that vanishes if for some small absolute constant , and the Riemann hypothesis is equivalent to the assertion that for all .

Experience has shown that the most important quantity to control here is the exponent , defined as the least constant for which one has an asymptotic
as . Thus, for instance, , and is a non-increasing function of , so we obtain the trivial “von Mangoldt” zero density theorem Of particular interest is the supremal value of , which has to be at least thanks to (3). The density hypothesis asserts that the maximum is in fact exactly , or equivalently that for all . This is of course implied by the Riemann hypothesis (which clearly implies that for all ), but is a more tractable hypothesis to work with; for instance, the hypothesis is already known to hold for by the work of Bourgain (building upon many previous authors). The quantity directly impacts our understanding of the prime number theorem in short intervals; indeed, it is not difficult using (1) (as well as the Vinogradov-Korobov zero-free region) to establish a short interval prime number theorem for all if is a fixed exponent, or for almost all if is a fixed exponent. Until recently, the best upper bound for was , thanks to a 1972 result of Huxley; but this was recently lowered to in a breakthrough work of Guth and Maynard.

In between the papers of Huxley and Guth-Maynard are dozens of additional improvements on , though it is only the Guth-Maynard paper that actually lowered the supremum norm . A summary of most of the state of the art before Guth-Maynard may be found in Table 2 of this recent paper of Trudgian and Yang; it is complicated, but it is easy enough to get a computer to illustrate it with a plot:
(For an explanation of what is going on under the assumption of the Lindelöf hypothesis, see below the fold.) This plot represents the combined effort of nearly a dozen papers, each one of which claims one or more components of the depicted piecewise smooth curve, and is written in the “human-readable” style mentioned above, where the argument is arranged to reduce the amount of tedious computation to human-verifiable levels, even if this comes at the cost of obscuring the conceptual ideas. (For an animation of how this bound improved over time, see here.) Below the fold, I will try to describe (in sketch form) some of the standard ingredients that go into these papers, in particular the routine reduction of deriving zero density estimates from large value theorems for Dirichlet series. We will not attempt to rewrite the entire literature of zero-density estimates in this fashion, but focus on some illustrative special cases.
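To give one tiny concrete instance of the kind of computation being outsourced here: the classical bounds of Ingham (A(σ) ≤ 3/(2−σ)) and Huxley (A(σ) ≤ 3/(3σ−1) for σ ≥ 3/4) can be combined pointwise in σ by a few lines of code, and the supremum of the combined curve recovers Huxley's value 12/5, attained at σ = 3/4. This sketch uses only those two classical bounds, not the full Trudgian–Yang table:

```python
# Combine two classical zero-density bounds pointwise in sigma and
# locate the supremum of the resulting exponent A(sigma).

def ingham(sigma):
    """Ingham (1940): A(sigma) <= 3/(2 - sigma)."""
    return 3.0 / (2.0 - sigma)

def huxley(sigma):
    """Huxley (1972): A(sigma) <= 3/(3*sigma - 1), valid for sigma >= 3/4."""
    return 3.0 / (3.0 * sigma - 1.0)

def best(sigma):
    bounds = [ingham(sigma)]
    if sigma >= 0.75:
        bounds.append(huxley(sigma))
    return min(bounds)

grid = [0.5 + k * 0.0001 for k in range(4999)]  # sigma in [1/2, 1)
sup_A = max(best(s) for s in grid)
print(sup_A)  # the crossover at sigma = 3/4 gives 12/5 = 2.4
```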
I’ve just uploaded to the arXiv my paper “On product representations of squares“. This short paper answers (in the negative) a (somewhat obscure) question of Erdös. Namely, for any , let be the size of the largest subset of with the property that no distinct elements of multiply to a square. In a paper by Erdös, Sárközy, and Sós, the following asymptotics were shown for fixed :
- .
- .
- .
- for .
- for .
- for .
In the end, the argument turned out to be relatively simple; no advanced results from additive combinatorics, graph theory, or analytic number theory were required. I found it convenient to proceed via the probabilistic method (although the more combinatorial technique of double counting would also suffice here). The main idea is to generate a tuple of distinct random natural numbers in which multiply to a square, and which are reasonably uniformly distributed throughout , in that each individual number is attained by one of the random variables with a probability of . If one can find such a distribution, then if the density of is sufficiently close to , it will happen with positive probability that each of the will lie in , giving the claim.
When , this strategy cannot work, as it contradicts the arguments of Erdös, Sárközy, and Sós. The reason can be explained as follows. The most natural way to generate a triple of random natural numbers in which multiply to a square is to set
for some random natural numbers . But if one wants all these numbers to have magnitude , one sees on taking logarithms that one would need which by elementary linear algebra forces so in particular each of the would have a factor comparable to . However, it follows from known results on the “multiplication table problem” (how many distinct integers are there in the multiplication table?) that most numbers up to do not have a factor comparable to . (Quick proof: by the Hardy–Ramanujan law, a typical number of size or of size has factors, hence typically a number of size will not factor into two factors of size .) So the above strategy cannot work for .

However, the situation changes for larger . For instance, for , we can try the same strategy with the ansatz
Whereas before there were three (approximate) equations constraining three unknowns, now we would have four equations and six unknowns, and so we no longer have strong constraints on any of the . So in principle we now have a chance to find a suitable random choice of the . The most significant remaining obstacle is the Hardy–Ramanujan law: since the typically have prime factors, it is natural in this case to choose each to have prime factors. As it turns out, if one does this (basically by requiring each prime to divide with an independent probability of about , for some small , and then also adding in one large prime to bring the magnitude of the to be comparable to ), the calculations all work out, and one obtains the claimed result.
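The multiplication table phenomenon invoked above can be observed directly (a quick numerical sketch of my own): the number of distinct products in an N × N multiplication table is a vanishing fraction of N² as N grows, reflecting the Erdős multiplication table theorem.

```python
def distinct_products(n):
    """Count distinct entries in the n-by-n multiplication table."""
    return len({i * j for i in range(1, n + 1) for j in range(1, n + 1)})

# The density of distinct products drops as n grows.
for n in (10, 100, 1000):
    d = distinct_products(n)
    print(n, d, d / n**2)
```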
Earlier this year, I gave a series of lectures at the Joint Mathematics Meetings in San Francisco. I am uploading here the slides for these talks:
- “Machine assisted proof” (Video here)
- “Translational tilings of Euclidean space” (Video here)
- “Correlations of multiplicative functions” (Video here)
I also have written a text version of the first talk, which has been submitted to the Notices of the American Mathematical Society.
I have just uploaded to the arXiv my paper “Monotone non-decreasing sequences of the Euler totient function“. This paper concerns the quantity , defined as the length of the longest subsequence of the numbers from to for which the Euler totient function is non-decreasing. The first few values of are
(OEIS A365339). For instance, because the totient function is non-decreasing on the set or , but not on the set .

Since for any prime , we have , where is the prime counting function. Empirically, the primes come quite close to achieving the maximum length ; indeed it was conjectured by Pollack, Pomerance, and Treviño, based on numerical evidence, that one had
for all ; this conjecture is verified up to . The previous best known upper bound was basically of the form as for an explicit constant , from combining results from the above paper with that of Ford or of Maier-Pomerance. In this paper we obtain the asymptotic so in particular . This answers a question of Erdős, as well as a closely related question of Pollack, Pomerance, and Treviño.

The methods of proof turn out to be mostly elementary (the most advanced result from analytic number theory we need is the prime number theorem with classical error term). The basic idea is to isolate one key prime factor of a given number which has a sizeable influence on the totient function . For instance, for “typical” numbers , one has a factorization
where is a medium-sized prime, is a significantly larger prime, and is a number with all prime factors less than . This leads to an approximation As a consequence, if we temporarily hold fixed, and also localize to a relatively short interval, then can only be non-decreasing in if is also non-decreasing at the same time. This turns out to significantly cut down on the possible length of a non-decreasing sequence in this regime, particularly if is large; this can be formalized by partitioning the range of into various subintervals and inspecting how this (and the monotonicity hypothesis on ) constrains the values of associated to each subinterval. When is small, we instead use a factorization where is very smooth (i.e., has no large prime factors), and is a large prime. Now we have the approximation and we can conclude that will have to basically be piecewise constant in order for to be non-decreasing. Pursuing this analysis more carefully (in particular controlling the size of various exceptional sets in which the above analysis breaks down), we end up achieving the main theorem so long as we can prove the preliminary inequality for all positive rational numbers . This is in fact also a necessary condition; any failure of this inequality can be easily converted to a counterexample to the bound (2), by considering numbers of the form (3) with equal to a fixed constant (and omitting a few rare values of where the approximation (4) is bad enough that is temporarily decreasing). Fortunately, there is a minor miracle, relating to the fact that the largest prime factor of the denominator of in lowest terms necessarily equals the largest prime factor of , that allows one to evaluate the left-hand side of (5) almost exactly (this expression either vanishes, or is the product of for some primes ranging up to the largest prime factor of ), which allows one to easily establish (5).
If one were to try to prove an analogue of our main result for the sum-of-divisors function , one would need the analogue of (5), which looks within reach of current methods (and was even claimed without proof by Erdős), but does not have a full proof in the literature at present.

In the final section of the paper we discuss some near counterexamples to the strong conjecture (1) that indicate that it is likely going to be difficult to get close to proving this conjecture without assuming some rather strong hypotheses. Firstly, we show that failure of Legendre's conjecture on the existence of a prime between any two consecutive squares can lead to a counterexample to (1). Secondly, we show that failure of the Dickson-Hardy-Littlewood conjecture can lead to a separate (and more dramatic) failure of (1), in which the primes are no longer the dominant sequence on which the totient function is non-decreasing, but rather the numbers which are a power of two times a prime become the dominant sequence. This suggests that any significant improvement to (2) would require assuming something comparable to the prime tuples conjecture, and perhaps also some unproven hypotheses on prime gaps.
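The quantity a(n) defined at the start of this post can be computed directly, via a totient sieve together with the standard patience-sorting algorithm for longest non-decreasing subsequences (a sketch of my own, not the code used for the paper's numerics):

```python
import bisect

def totients(limit):
    """Euler phi for 0..limit via the standard sieve."""
    phi = list(range(limit + 1))
    for p in range(2, limit + 1):
        if phi[p] == p:  # p is prime
            for k in range(p, limit + 1, p):
                phi[k] -= phi[k] // p
    return phi

def a(n, phi):
    """Length of the longest non-decreasing subsequence of phi(1), ..., phi(n).

    Patience sorting: tails[i] is the smallest possible last element of a
    non-decreasing subsequence of length i+1; bisect_right (rather than
    bisect_left) allows repeated values, i.e. non-strict monotonicity."""
    tails = []
    for v in phi[1:n + 1]:
        i = bisect.bisect_right(tails, v)
        if i == len(tails):
            tails.append(v)
        else:
            tails[i] = v
    return len(tails)

phi = totients(100)
print([a(n, phi) for n in range(1, 11)])
```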
Kevin Ford, Dimitris Koukoulopoulos and I have just uploaded to the arXiv our paper “A lower bound on the mean value of the Erdős-Hooley delta function“. This paper complements the recent paper of Dimitris and myself obtaining the upper bound
on the mean value of the Erdős-Hooley delta function In this paper we obtain a lower bound where is an exponent that arose in previous work of result of Ford, Green, and Koukoulopoulos, who showed that for all outside of a set of density zero. The previous best known lower bound for the mean value was due to Hall and Tenenbaum.The point is the main contributions to the mean value of are driven not by “typical” numbers of some size , but rather of numbers that have a splitting
where is the product of primes between some intermediate threshold and , and behaves “typically” (so in particular, it has about prime factors, as per the Hardy-Ramanujan law and the Erdős-Kac law), while is the product of primes up to and has double the number of typical prime factors – , rather than – thus is the type of number that would make a significant contribution to the mean value of the divisor function . Here is such that is an integer in the range for some small constant ; there are basically different values of , which give essentially disjoint contributions. From the easy inequalities (the latter coming from the pigeonhole principle) and the fact that has mean about one, one would expect to get the above result provided that one could get a lower bound of the form for most typical with prime factors between and . Unfortunately, due to the lack of small prime factors in , the arguments of Ford, Green, Koukoulopoulos that give (1) for typical do not quite work for the rougher numbers . However, it turns out that one can get around this problem by replacing (2) by the more efficient inequality where is an enlarged version of when . This inequality is easily proven by applying the pigeonhole principle to the factors of of the form , where is one of the factors of , and is one of the factors of in the optimal interval . The extra room provided by the enlargement of the range to turns out to be sufficient to adapt the Ford-Green-Koukoulopoulos argument to the rough setting. In fact we are able to use the main technical estimate from that paper as a “black box”, namely that if one considers a random subset of for some small and sufficiently large with each lying in with an independent probability , then with high probability there should be subset sums of that attain the same value.
(Initially, what “high probability” means is just “close to “, but one can reduce the failure probability significantly as by a “tensor power trick” taking advantage of Bennett's inequality.)

I have just uploaded to the arXiv my paper “The convergence of an alternating series of Erdős, assuming the Hardy–Littlewood prime tuples conjecture“. This paper concerns an old problem of Erdős concerning whether the alternating series converges, where denotes the prime. The main result of this paper is that the answer to this question is affirmative assuming a sufficiently strong version of the Hardy–Littlewood prime tuples conjecture.
The alternating series test does not apply here because the ratios are not monotonically decreasing. The deviations from monotonicity arise from fluctuations in the prime gaps , so the enemy arises from biases in the prime gaps for odd and even . By changing variables from to (or more precisely, to integers in the range between and ), this is basically equivalent to biases in the parity of the prime counting function. Indeed, it is an unpublished observation of Said that the convergence of is equivalent to the convergence of . So this question is really about trying to get a sufficiently strong amount of equidistribution for the parity of .
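The series in question is the sum of (-1)^n n/p_n over n; since n/p_n behaves like 1/log n, the terms decay too slowly for absolute convergence, and everything hinges on cancellation. One can watch the slow oscillation of the partial sums numerically (a sketch; of course, no finite computation can settle convergence):

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes."""
    is_p = [True] * (limit + 1)
    is_p[0] = is_p[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if is_p[p]:
            is_p[p * p :: p] = [False] * len(is_p[p * p :: p])
    return [n for n in range(limit + 1) if is_p[n]]

primes = primes_up_to(10**6)
partial = []
s = 0.0
for n, p in enumerate(primes, start=1):
    s += (-1) ** n * n / p  # n-th term of the alternating series
    partial.append(s)
print(partial[-1])
```

The consecutive partial sums differ only by the last term n/p_n, which is already below 0.08 by the time p_n reaches a million, but the drift of the oscillation center is what the paper's analysis has to control.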
The prime tuples conjecture does not directly say much about the value of ; however, it can be used to control differences for and not too large. Indeed, it is a famous calculation of Gallagher that for fixed , and chosen randomly from to , the quantity is distributed according to the Poisson distribution of mean asymptotically if the prime tuples conjecture holds. In particular, the parity of this quantity should have mean asymptotic to . An application of the van der Corput -process then gives some decay on the mean of as well. Unfortunately, this decay is a bit too weak for this problem; even if one uses the most quantitative version of Gallagher’s calculation, worked out in a recent paper of (Vivian) Kuperberg, the best bound on the mean is something like , which is not quite strong enough to overcome the doubly logarithmic divergence of .
To get around this obstacle, we take advantage of the random sifted model of the primes that was introduced in a paper of Banks, Ford, and myself. To model the primes in an interval such as with drawn randomly from say , we remove one random residue class from this interval for all primes up to Pólya's “magic cutoff” . The prime tuples conjecture can then be interpreted as the assertion that the random set produced by this sieving process is statistically a good model for the primes in . After some standard manipulations (using a version of the Bonferroni inequalities, as well as some upper bounds of Kuperberg), the problem then boils down to getting sufficiently strong estimates for the expected parity of the random sifted set .
For this problem, the main advantage of working with the random sifted model, rather than with the primes or the singular series arising from the prime tuples conjecture, is that the sifted model can be studied iteratively from the partially sifted sets arising from sifting primes up to some intermediate threshold , and that the expected parity of the experiences some decay in . Indeed, once exceeds the length of the interval , sifting by an additional prime will cause to lose one element with probability , and remain unchanged with probability . If concentrates around some value , this suggests that the expected parity will decay by a factor of about as one increases to , and iterating this should give good bounds on the final expected parity . It turns out that existing second moment calculations of Montgomery and Soundararajan suffice to obtain enough concentration to make this strategy work.
Dimitris Koukoulopoulos and I have just uploaded to the arXiv our paper “An upper bound on the mean value of the Erdős-Hooley delta function“. This paper concerns a (still somewhat poorly understood) basic arithmetic function in multiplicative number theory, namely the Erdos-Hooley delta function
where The function measures the extent to which the divisors of a natural number can be concentrated in a dyadic (or more precisely, -dyadic) interval . From the pigeonhole principle, we have the bounds where is the usual divisor function. The statistical behavior of the divisor function is well understood; for instance, if is drawn at random from to , then the mean value of is roughly , the median is roughly , and (by the Erdős-Kac theorem) asymptotically has a log-normal distribution. In particular, there is a small proportion of highly divisible numbers that skew the mean to be significantly higher than the median.

On the other hand, the statistical behavior of the Erdős-Hooley delta function is significantly less well understood, even conjecturally. Again drawing at random from to for large , the median is known to be somewhere between and for large – a (difficult) recent result of Ford, Green, and Koukoulopoulos (for the lower bound) and La Bretèche and Tenenbaum (for the upper bound). And the mean was even less well controlled; the best previous bounds were
for any , with the lower bound due to Hall and Tenenbaum, and the upper bound a recent result of La Bretèche and Tenenbaum.

The main result of this paper is an improvement of the upper bound to
It is still unclear to us exactly what to conjecture regarding the actual order of the mean value.

The reason we looked into this problem was that it was connected to forthcoming work of David Conlon, Jacob Fox, and Huy Pham on the following problem of Erdős: what is the size of the largest subset of with the property that no non-empty subset of sums to a perfect square? Erdős observed that one can obtain sets of size (basically by considering certain homogeneous arithmetic progressions), and Nguyen and Vu showed an upper bound of . With our mean value bound as input, together with several new arguments, Conlon, Fox, and Pham have been able to improve the upper bound to .
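To make the definition of the delta function concrete, it can be computed by brute force for small n: since the count of divisors in an interval (u, eu] is maximized as u approaches a divisor from below, it suffices to scan windows [d, ed) anchored at each divisor d (a sketch; the helper names are mine):

```python
import math

def divisors(n):
    """All divisors of n, via trial division up to sqrt(n)."""
    small = [d for d in range(1, math.isqrt(n) + 1) if n % d == 0]
    return sorted(set(small + [n // d for d in small]))

def delta(n):
    """Erdos-Hooley delta function: max over u of #{d | n : u < d <= e*u}.

    The supremum is attained with u just below a divisor, so it suffices
    to count divisors in the window [d, e*d) for each divisor d."""
    ds = divisors(n)
    return max(sum(1 for d2 in ds if d <= d2 < math.e * d) for d in ds)

# Example: the divisors 2, 3, 4 of 12 fit in one window, so delta(12) = 3.
print(delta(12), delta(6), max(delta(n) for n in range(1, 1000)))
```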
Let me now discuss some of the ingredients of the proof. The first few steps are standard. Firstly we may restrict attention to square-free numbers without much difficulty (the point being that if a number factors as with squarefree, then ). Next, because a square-free number can be uniquely factored as where is a prime and lies in the finite set of squarefree numbers whose prime factors are less than , and , it is not difficult to establish the bound
The upshot of this is that one can replace an ordinary average with a logarithmic average; thus it suffices to show We actually prove a slightly more refined distributional estimate: for any , we have a bound outside of an exceptional set which is small in the sense that It is not difficult to get from this distributional estimate to the logarithmic average estimate (1) (worsening the exponent to ).

To get some intuition on the size of , we observe that if and is the factor of coming from the prime factors less than , then
On the other hand, standard estimates let one establish that for all , and all outside of an exceptional set that is small in the sense (3); in fact it turns out that one can also get an additional gain in this estimate unless is close to , which turns out to be useful when optimizing the bounds. So we would like to approximately reverse the inequalities in (4) and get from (5) to (2), possibly after throwing away further exceptional sets of size (3).

At this point we perform another standard technique, namely the moment method of controlling the supremum by the moments
for natural numbers ; it is not difficult to establish the bound and one expects this bound to become essentially sharp once . We will be able to show a moment bound for any for some exceptional set obeying the smallness condition (3) (actually, for technical reasons we need to improve the right-hand side slightly to close an induction on ); this will imply the distributional bound (2) from a standard Markov inequality argument (setting ).

The strategy is then to obtain a good recursive inequality for (averages of) . As in the reduction to (1), we factor where is a prime and . One observes the identity
for any ; taking moments, one obtains the identity As in previous literature, one can try to average in here and apply Hölder’s inequality. But it is convenient to first use the symmetry of the summand in to reduce to the case of relatively small values of : One can extract out the term as It is convenient to eliminate the factor of by dividing out by the divisor function: This inequality is suitable for iterating and also averaging in and . After some standard manipulations (using the Brun–Titchmarsh and Hölder inequalities), one is able to estimate sums such as in terms of sums such as (assuming a certain monotonicity property of the exceptional set that turns out to hold in our application). By an induction hypothesis and a Markov inequality argument, one can get a reasonable pointwise upper bound on (after removing another exceptional set), and the net result is that one can basically control the sum (6) in terms of expressions such as for various . This allows one to estimate these expressions efficiently by induction.
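To see the moment method step in isolation: Markov's inequality applied to a k-th moment gives the tail bound P(X ≥ λ) ≤ E[X^k]/λ^k, which sharpens rapidly in k for light-tailed quantities. The following toy sketch (a generic sum of indicator variables standing in for the divisor-type sums of the paper, not the paper's actual argument) shows the effect numerically:

```python
import random

random.seed(0)

# Empirical demonstration that k-th moment (Markov) bounds
#   P(X >= lam) <= E[X^k] / lam^k
# tighten as k increases.  X here is Binomial(20, 0.1), a light-tailed
# stand-in for the quantities controlled in the paper.
N_SAMPLES, n, p, lam = 20000, 20, 0.1, 8
samples = [sum(random.random() < p for _ in range(n)) for _ in range(N_SAMPLES)]

empirical_tail = sum(x >= lam for x in samples) / N_SAMPLES
bounds = {}
for k in (1, 2, 4, 8):
    kth_moment = sum(x ** k for x in samples) / N_SAMPLES
    bounds[k] = kth_moment / lam ** k
    print(f"k={k}: moment bound {bounds[k]:.6f} vs empirical tail {empirical_tail:.6f}")
```

Note that Markov's inequality applied to the empirical distribution holds exactly, so every bound dominates the empirical tail, while the k=8 bound is far tighter than the k=1 bound.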
Tamar Ziegler and I have just uploaded to the arXiv our paper “Infinite partial sumsets in the primes“. This is a short paper inspired by a recent result of Kra, Moreira, Richter, and Robertson (discussed for instance in this Quanta article from last December) showing that for any set of natural numbers of positive upper density, there exists a sequence of natural numbers and a shift such that for all (this answers a question of Erdős). In view of the “transference principle“, it is then plausible to ask whether the same result holds if is replaced by the primes. We can show the following results:
Theorem 1
- (i) If the Hardy-Littlewood prime tuples conjecture (or the weaker conjecture of Dickson) is true, then there exists an increasing sequence of primes such that is prime for all .
- (ii) Unconditionally, there exist increasing sequences and of natural numbers such that is prime for all .
- (iii) These conclusions fail if “prime” is replaced by “positive (relative) density subset of the primes” (even if the density is equal to 1).
We remark that it was shown by Balog that there (unconditionally) exist arbitrarily long but finite sequences of primes such that is prime for all . (This result can also be recovered from the later results of Ben Green, myself, and Tamar Ziegler.) Also, it had previously been shown by Granville that, on the Hardy-Littlewood prime tuples conjecture, there exist increasing sequences and of natural numbers such that is prime for all .
The conclusion of (i) is stronger than that of (ii) (which is of course consistent with the former being conditional and the latter unconditional). The conclusion (ii) also implies the well-known theorem of Maynard that for any given , there exist infinitely many -tuples of primes of bounded diameter, and indeed our proof of (ii) uses the same “Maynard sieve” that powers the proof of that theorem (though we use a formulation of that sieve closer to that in this blog post of mine). Indeed, the failure of (iii) basically arises from the failure of Maynard’s theorem for dense subsets of primes, simply by removing those clusters of primes that are unusually closely spaced.
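For concreteness, the kind of bounded-diameter prime cluster that Maynard's theorem produces infinitely often can be located by brute force at small heights. Here is a sketch searching the admissible shift pattern (0, 2, 6, 8, 12) of diameter 12 (a pattern chosen purely for illustration; the sieve argument is of course not a brute-force search):

```python
def is_prime(n: int) -> bool:
    # simple trial division, adequate for small n
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

SHIFTS = (0, 2, 6, 8, 12)  # an admissible pattern of diameter 12

# all n <= 10000 for which n + h is prime for every shift h
clusters = [n for n in range(2, 10001)
            if all(is_prime(n + h) for h in SHIFTS)]
print(clusters)  # begins 5, 11, 101, 1481 (e.g. 5, 7, 11, 13, 17)
```

Removing such unusually tightly spaced clusters from the primes is exactly the operation that makes conclusion (iii) fail for dense subsets of the primes.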
Our proof of (i) was initially inspired by the topological dynamics methods used by Kra, Moreira, Richter, and Robertson, but we managed to condense it to a purely elementary argument (taking up only half a page) that makes no reference to topological dynamics and builds up the sequence recursively by repeated application of the prime tuples conjecture.
The proof of (ii) takes up the majority of the paper. It is easiest to phrase the argument in terms of “prime-producing tuples” – tuples for which there are infinitely many with all prime. Maynard’s theorem is equivalent to the existence of arbitrarily long prime-producing tuples; our theorem is equivalent to the stronger assertion that there exists an infinite sequence such that every initial segment is prime-producing. The main new tool for achieving this is the following cute measure-theoretic lemma of Bergelson:
Lemma 2 (Bergelson intersectivity lemma) Let be subsets of a probability space of measure uniformly bounded away from zero, thus . Then there exists a subsequence such that for all .
This lemma has a short proof, though not an entirely obvious one. Firstly, by deleting a null set from , one can assume that all finite intersections are either positive measure or empty. Secondly, a routine application of Fatou’s lemma shows that the maximal function has a positive integral, hence must be positive at some point . Thus there is a subsequence whose finite intersections all contain , and thus have positive measure, as desired by the previous reduction.
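The finite-dimensional content of this argument is easy to simulate: in a toy probability space, the "maximal function" step amounts to finding a point that lies in at least the average proportion of the sets, and the subsequence is then simply the sets containing that point. A minimal sketch, with an arbitrary seeded random family standing in for the events of the paper:

```python
import random

random.seed(1)

# Toy model of the Bergelson intersectivity lemma: a 12-point uniform
# probability space, and 50 sets each of measure 1/3 (4 points out of 12).
X = range(12)
N = 50
sets = [frozenset(random.sample(X, 4)) for _ in range(N)]

# Finite analogue of "the maximal function has positive integral":
# by pigeonhole, some point lies in at least the average number (N/3)
# of the sets.
counts = {x: sum(x in E for E in sets) for x in X}
x_star = max(counts, key=counts.get)

# Subsequence of sets containing x_star: every finite intersection of
# these sets contains x_star, hence has positive measure.
subsequence = [i for i, E in enumerate(sets) if x_star in E]
common = frozenset.intersection(*(sets[i] for i in subsequence))
print(len(subsequence), x_star in common)
```

In the infinite setting the pigeonhole count is replaced by Fatou's lemma applied to the averages of the indicator functions, but the mechanism is the same.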
It turns out that one cannot quite combine the standard Maynard sieve with the intersectivity lemma because the events that show up (which roughly correspond to the event that is prime for some random number (with a well-chosen probability distribution) and some shift ) have their probability going to zero, rather than being uniformly bounded from below. To get around this, we borrow an idea from a paper of Banks, Freiberg, and Maynard, and group the shifts into various clusters , chosen in such a way that the probability that at least one of is prime is bounded uniformly from below. One then applies the Bergelson intersectivity lemma to those events and uses many applications of the pigeonhole principle to conclude.
Kaisa Matomäki, Xuancheng Shao, Joni Teräväinen, and I have just uploaded to the arXiv our preprint “Higher uniformity of arithmetic functions in short intervals I. All intervals“. This paper investigates the higher order (Gowers) uniformity of standard arithmetic functions in analytic number theory (and specifically, the Möbius function , the von Mangoldt function , and the generalised divisor functions ) in short intervals , where is large and lies in the range for a fixed constant (that one would like to be as small as possible). If we let denote one of the functions , then there is extensive literature on the estimation of short sums
and some literature also on the estimation of exponential sums such as for a real frequency , where . For applications in the additive combinatorics of such functions , it is also necessary to consider more general correlations, such as polynomial correlations where is a polynomial of some fixed degree, or more generally where is a nilmanifold of fixed degree and dimension (and with some control on structure constants), is a polynomial map, and is a Lipschitz function (with some bound on the Lipschitz constant). Indeed, thanks to the inverse theorem for the Gowers uniformity norm, such correlations let one control the Gowers uniformity norm of (possibly after subtracting off some renormalising factor) on such short intervals , which can in turn be used to control other multilinear correlations involving such functions. Traditionally, asymptotics for such sums are expressed in terms of a “main term” of some arithmetic nature, plus an error term that is estimated in magnitude. For instance, a sum such as would be approximated in terms of a main term that vanishes (or is negligible) if is “minor arc”, but would be expressible in terms of something like a Ramanujan sum if is “major arc”, together with an error term. We found it convenient to cancel off such main terms by subtracting an approximant from each of the arithmetic functions and then getting upper bounds on remainder correlations such as
(actually for technical reasons we also allow the variable to be restricted further to a subprogression of , but let us ignore this minor extension for this discussion). There is some flexibility in how to choose these approximants, but we eventually found it convenient to use the following choices.
- For the Möbius function , we simply set , as per the Möbius pseudorandomness conjecture. (One could choose a more sophisticated approximant in the presence of a Siegel zero, as I did with Joni in this recent paper, but we do not do so here.)
- For the von Mangoldt function , we eventually went with the Cramér-Granville approximant , where and .
- For the divisor functions , we used a somewhat complicated-looking approximant for some explicit polynomials , chosen so that and have almost exactly the same sums along arithmetic progressions (see the paper for details).
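To get a feel for the von Mangoldt approximant, here is a hedged numerical sketch: with W the product of the primes up to a small cutoff R, a Cramér-Granville style approximant (W/φ(W))·1 on the integers coprime to W has almost the same mean over [1, N] as Λ itself (the cutoffs and normalisations used in the paper differ; this only illustrates the shape of the approximant):

```python
from math import gcd, log

def primes_up_to(m):
    # simple sieve of Eratosthenes
    sieve = [True] * (m + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(m ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i:: i] = [False] * len(sieve[i * i:: i])
    return [p for p in range(2, m + 1) if sieve[p]]

N, R = 100_000, 11  # R is an illustrative cutoff, not the paper's choice
W, phi_W = 1, 1
for p in primes_up_to(R):
    W *= p
    phi_W *= p - 1

# Chebyshev psi(N) = sum of Lambda(n) over n <= N (log p over prime powers)
psi = 0.0
for p in primes_up_to(N):
    pk = p
    while pk <= N:
        psi += log(p)
        pk *= p

# Cramer-Granville style approximant: (W/phi(W)) on n coprime to W, else 0
approx = (W / phi_W) * sum(1 for n in range(1, N + 1) if gcd(n, W) == 1)

print(psi, approx)  # both within a couple of percent of N = 100000
```

Both sums are close to N, consistent with the prime number theorem on one side and the density of integers coprime to W on the other.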
The objective is then to obtain bounds on sums such as (1) that improve upon the “trivial bound” that one can get with the triangle inequality and standard number theory bounds such as the Brun-Titchmarsh inequality. For and , the Siegel-Walfisz theorem suggests that it is reasonable to expect error terms that have “strongly logarithmic savings” in the sense that they gain a factor of over the trivial bound for any ; for , the Dirichlet hyperbola method suggests instead that one has “power savings” in that one should gain a factor of over the trivial bound for some . In the case of the Möbius function , there is an additional trick (introduced by Matomäki and Teräväinen) that allows one to lower the exponent somewhat at the cost of only obtaining “weakly logarithmic savings” of shape for some small .
Our main estimates on sums of the form (1) work in the following ranges:
- For , one can obtain strongly logarithmic savings on (1) for , and power savings for .
- For , one can obtain weakly logarithmic savings for .
- For , one can obtain power savings for .
- For , one can obtain power savings for .
Conjecturally, one should be able to obtain power savings in all cases, and lower down to zero, but the ranges of exponents and savings given here seem to be the limit of current methods unless one assumes additional hypotheses, such as GRH. The result for correlation against Fourier phases was established previously by Zhan, and the result for such phases and was established previously by Matomäki and Teräväinen.
By combining these results with tools from additive combinatorics, one can obtain a number of applications:
- Direct insertion of our bounds in the recent work of Kanigowski, Lemańczyk, and Radziwiłł on the prime number theorem on dynamical systems that are analytic skew products gives some improvements in the exponents there.
- We can obtain a “short interval” version of a multiple ergodic theorem along primes established by Frantzikinakis-Host-Kra and Wooley-Ziegler, in which we average over intervals of the form rather than .
- We can obtain a “short interval” version of the “linear equations in primes” asymptotics obtained by Ben Green, Tamar Ziegler, and myself in this sequence of papers, where the variables in these equations lie in short intervals rather than long intervals such as .
We now briefly discuss some of the ingredients of the proof of our main results. The first step is standard, using combinatorial decompositions (based on the Heath-Brown identity and (for the result) the Ramaré identity) to decompose into more tractable sums of the following types:
- Type sums, which are basically of the form for some weights of controlled size and some cutoff that is not too large;
- Type sums, which are basically of the form for some weights , of controlled size and some cutoffs that are not too close to or to ;
- Type sums, which are basically of the form for some weights of controlled size and some cutoff that is not too large.
The precise ranges of the cutoffs depend on the choice of ; our methods fail once these cutoffs pass a certain threshold, and this is the reason for the exponents being what they are in our main results.
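The combinatorial decompositions all rest on Möbius-type identities for the von Mangoldt function; the simplest of these, which the Heath-Brown identity iterates and refines, is Λ(n) = Σ_{d|n} μ(d) log(n/d). A quick numerical sanity check of this basic identity (not, of course, the full Heath-Brown decomposition):

```python
from math import log

def mobius(n: int) -> int:
    # mu(n): 0 if n has a squared prime factor, else (-1)^(number of prime factors)
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0
            result = -result
        d += 1
    if n > 1:
        result = -result
    return result

def von_mangoldt(n: int) -> float:
    # Lambda(n) = log p if n is a power of the prime p, else 0
    for p in range(2, n + 1):
        if n % p == 0:
            while n % p == 0:
                n //= p
            return log(p) if n == 1 else 0.0
    return 0.0

def lambda_via_mobius(n: int) -> float:
    # Lambda(n) = sum over d | n of mu(d) * log(n/d)
    return sum(mobius(d) * log(n / d) for d in range(1, n + 1) if n % d == 0)

max_err = max(abs(lambda_via_mobius(n) - von_mangoldt(n)) for n in range(1, 201))
print(max_err)  # numerically zero
```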
The Type sums involving nilsequences can be treated by methods similar to those in this previous paper of Ben Green and myself; the main innovations are in the treatment of the Type and Type sums.
For the Type sums, one can split into the “abelian” case in which (after some Fourier decomposition) the nilsequence is basically of the form , and the “non-abelian” case in which is non-abelian and exhibits non-trivial oscillation in a central direction. In the abelian case we can adapt arguments of Matomäki and Shao, which use Cauchy-Schwarz and the equidistribution properties of polynomials to obtain good bounds unless is “major arc” in the sense that it resembles (or “pretends to be”) for some Dirichlet character and some frequency , but in this case one can use classical multiplicative methods to control the correlation. It turns out that the non-abelian case can be treated similarly. After applying Cauchy-Schwarz, one ends up analyzing the equidistribution of the four-variable polynomial sequence
as range in various dyadic intervals. Using the known multidimensional equidistribution theory of polynomial maps in nilmanifolds, one can eventually show in the non-abelian case that this sequence either has enough equidistribution to give cancellation, or else the nilsequence involved can be replaced with one from a lower dimensional nilmanifold, in which case one can apply an induction hypothesis. For the Type sum, a model sum to study is
which one can expand as We experimented with a number of ways to treat this type of sum (including automorphic form methods, or methods based on the Voronoi formula or van der Corput’s inequality), but somewhat to our surprise, the most efficient approach was an elementary one, in which one uses the Dirichlet approximation theorem to decompose the hyperbolic region into a number of arithmetic progressions, and then uses equidistribution theory to establish cancellation of sequences such as on the majority of these progressions. As it turns out, this strategy works well in the regime unless the nilsequence involved is “major arc”, but the latter case is treatable by existing methods as discussed previously; this is why the exponent for our result can be as low as . In a sequel to this paper (currently in preparation), we will obtain analogous results for almost all intervals with in the range , in which we will be able to lower all the way to .
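The Dirichlet approximation step can be made concrete: for any real α and cutoff Q there exist integers p, q with 1 ≤ q ≤ Q and |α − p/q| ≤ 1/(qQ), and a brute-force scan suffices for illustration (this is only the elementary pigeonhole ingredient, not the decomposition of the hyperbolic region carried out in the paper):

```python
from math import pi

def dirichlet_approx(alpha: float, Q: int):
    """Return (p, q) with 1 <= q <= Q and |alpha - p/q| <= 1/(q*Q).

    Existence is the Dirichlet approximation theorem (pigeonhole on the
    fractional parts of q*alpha); here we simply scan q = 1, ..., Q and
    return the first q that works.
    """
    for q in range(1, Q + 1):
        p = round(alpha * q)
        if abs(alpha - p / q) <= 1 / (q * Q):
            return p, q
    raise AssertionError("unreachable: Dirichlet's theorem guarantees a solution")

p, q = dirichlet_approx(pi, 100)
print(p, q)  # 22 7, since |pi - 22/7| < 1/700
```

Each such rational approximation p/q splits the relevant range into arithmetic progressions modulo q, on which equidistribution can then be tested.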