[This post is dedicated to Luca Trevisan, who recently passed away due to cancer. Though far from his most significant contribution to the field, I would like to mention that, as with most of my other blog posts on this site, this page was written with the assistance of Luca’s LaTeX to WordPress converter. Mathematically, his work and insight on pseudorandomness in particular have greatly informed how I myself think about the concept. – T.]

Recently, Timothy Gowers, Ben Green, Freddie Manners, and I were able to establish the following theorem:

Theorem 1 (Marton’s conjecture) Let {A \subset {\bf F}_2^n} be non-empty with {|A+A| \leq K|A|}. Then there exists a subgroup {H} of {{\bf F}_2^n} with {|H| \leq |A|} such that {A} is covered by at most {2K^C} translates of {H}, for some absolute constant {C}.

We established this result with {C=12}, although it has since been improved to {C=9} by Jyun-Jie Liao.
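
Before turning to the proof, it may help to see the objects in Theorem 1 in a concrete toy case. The following snippet is purely illustrative (it is not part of the paper or its Lean formalization, and the set {A}, the subgroup {H}, and the helper functions are ad hoc choices of mine): it computes the doubling constant {K = |A+A|/|A|} of a small set {A \subset {\bf F}_2^4} and exhibits a covering of {A} by two translates of a subgroup.

```python
# Illustration only: elements of F_2^n are encoded as integers, with the
# group operation given by bitwise XOR.  The set A is an arbitrary toy example.

def sumset(A, B):
    """A+B = {a+b : a in A, b in B} in F_2^n (addition = XOR)."""
    return {a ^ b for a in A for b in B}

def span(S):
    """The subgroup of F_2^n generated by S (closure under XOR)."""
    H = {0}
    for s in S:
        H |= {h ^ s for h in H}
    return H

H = span({0b0001, 0b0010})      # a 4-element subgroup of F_2^4
A = H | {0b1000}                # adjoin one element outside H

AA = sumset(A, A)
K = len(AA) / len(A)
print(f"|A| = {len(A)}, |A+A| = {len(AA)}, K = {K:.2f}")

# A is covered by two translates of H, in the spirit of Theorem 1:
cover = set().union(*({h ^ t for h in H} for t in (0b0000, 0b1000)))
print("A covered by 2 translates of H:", A <= cover)
```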

Our proof was written in order to optimize the constant {C} as much as possible; similarly for the more detailed blueprint of the proof that was prepared in order to formalize the result in Lean. I have been asked a few times whether it is possible to present a streamlined and more conceptual version of the proof in which one does not try to establish an explicit constant {C}, but just to show that the result holds for some constant {C}. This is what I will attempt to do in this post, though some of the more routine steps will be outsourced to the aforementioned blueprint.

The key concept here is that of the entropic Ruzsa distance {d[X;Y]} between two random variables {X,Y} taking values in {{\bf F}_2^n}, defined as

\displaystyle  d[X;Y] := {\mathbf H}[X'+Y'] - \frac{1}{2} {\mathbf H}[X] - \frac{1}{2} {\mathbf H}[Y]

where {X',Y'} are independent copies of {X,Y}, and {{\mathbf H}[X]} denotes the Shannon entropy of {X}. This distance is symmetric and non-negative, and obeys the triangle inequality

\displaystyle  d[X;Z] \leq d[X;Y] + d[Y;Z]

for any random variables {X,Y,Z}; see the blueprint for a proof. The above theorem then follows from an entropic analogue:

Theorem 2 (Entropic Marton’s conjecture) Let {X} be a {{\bf F}_2^n}-valued random variable with {d[X;X] \leq \log K}. Then there exists a uniform random variable {U_H} on a subgroup {H} of {{\bf F}_2^n} such that {d[X; U_H] \leq C \log K} for some absolute constant {C}.

We were able to establish Theorem 2 with {C=11}, which implies Theorem 1 with {C=12} by fairly standard additive combinatorics manipulations (such as the Ruzsa covering lemma); see the blueprint for details.
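
As an aside, the entropic Ruzsa distance is easy to experiment with numerically. Here is a minimal sketch (again my own ad hoc code, not taken from the blueprint) that represents the law of an {{\bf F}_2^n}-valued random variable as an array of length {2^n} indexed by group elements, computes {d[X;Y]} directly from the definition, and spot-checks the symmetry, non-negativity, and triangle inequality mentioned above on random distributions.

```python
import numpy as np

def H(p):
    """Shannon entropy (in bits) of a probability vector p."""
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def xor_convolve(p, q):
    """Law of X'+Y' for independent X' ~ p, Y' ~ q on F_2^n (XOR on indices)."""
    r = np.zeros(len(p))
    for x in range(len(p)):
        for y in range(len(q)):
            r[x ^ y] += p[x] * q[y]
    return r

def ruzsa_dist(p, q):
    """Entropic Ruzsa distance d[X;Y] = H[X'+Y'] - H[X]/2 - H[Y]/2."""
    return H(xor_convolve(p, q)) - 0.5 * H(p) - 0.5 * H(q)

rng = np.random.default_rng(0)
N = 8                                    # the group F_2^3 has 2^3 = 8 elements
p, q, r = (rng.random(N) for _ in range(3))
p, q, r = p / p.sum(), q / q.sum(), r / r.sum()

print("symmetric:    ", np.isclose(ruzsa_dist(p, q), ruzsa_dist(q, p)))
print("non-negative: ", ruzsa_dist(p, q) >= -1e-12)
print("triangle ineq:", ruzsa_dist(p, r) <= ruzsa_dist(p, q) + ruzsa_dist(q, r) + 1e-12)
```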

The key proposition needed to establish Theorem 2 is the following distance decrement property:

Proposition 3 (Distance decrement) If {X,Y} are {{\bf F}_2^n}-valued random variables, then one can find {{\bf F}_2^n}-valued random variables {X',Y'} such that

\displaystyle  d[X';Y'] \leq (1-\eta) d[X;Y]

and

\displaystyle  d[X;X'], d[Y;Y'] \leq C d[X;Y]

for some absolute constants {C, \eta > 0}.

Indeed, suppose this proposition held. Starting with {X_0, Y_0} both equal to {X} and iterating, one can then find sequences of random variables {X_n, Y_n} with {X_0=Y_0=X},

\displaystyle  d[X_n;Y_n] \leq (1-\eta)^n d[X;X],

and

\displaystyle  d[X_{n+1};X_n], d[Y_{n+1};Y_n] \leq C (1-\eta)^n d[X;X].

In particular, from the triangle inequality and summing the geometric series, we have

\displaystyle  d[X_n;X], d[Y_n;X] \leq \frac{C}{\eta} d[X;X].

By weak compactness, some subsequence of the {X_n}, {Y_n} converges to limiting random variables {X_\infty, Y_\infty}, and by some simple continuity properties of entropic Ruzsa distance, we conclude that

\displaystyle  d[X_\infty;Y_\infty] = 0

and

\displaystyle  d[X_\infty;X], d[Y_\infty;X] \leq \frac{C}{\eta} d[X;X].

Theorem 2 then follows from the “100% inverse theorem” for entropic Ruzsa distance; see the blueprint for details.

To prove Proposition 3, we can reformulate it as follows:

Proposition 4 (Lack of distance decrement implies vanishing) If {X,Y} are {{\bf F}_2^n}-valued random variables, with the property that

\displaystyle  d[X';Y'] > d[X;Y] - \eta ( d[X;Y] + d[X';X] + d[Y';Y] ) \ \ \ \ \ (1)

for all {{\bf F}_2^n}-valued random variables {X',Y'} and some sufficiently small absolute constant {\eta > 0}, then one can derive a contradiction.

Indeed, if the above proposition holds, then (1) must fail for some {X',Y'}, so that

\displaystyle  d[X';Y'] \leq d[X;Y] - \eta ( d[X;Y] + d[X';X] + d[Y';Y] )

for this choice of {X',Y'}. Rearranging, and using the non-negativity of entropic Ruzsa distance, this gives both {d[X';Y'] \leq (1-\eta) d[X;Y]} and {d[X';X], d[Y';Y] \leq \frac{1}{\eta} d[X;Y]}, which is Proposition 3 with {C = 1/\eta}.

The entire game is now to use Shannon entropy inequalities and “entropic Ruzsa calculus” to deduce a contradiction from (1) for {\eta} small enough. This we will do below the fold, but before doing so, let us first make some adjustments to (1) that will make it more useful for our purposes. Firstly, because conditional entropic Ruzsa distance (see blueprint for definitions) is an average of unconditional entropic Ruzsa distance, we can automatically upgrade (1) to the conditional version

\displaystyle  d[X'|Z;Y'|W] \geq d[X;Y] - \eta ( d[X;Y] + d[X'|Z;X] + d[Y'|W;Y] )

for any random variables {Z,W} that are possibly coupled with {X',Y'} respectively. In particular, if we define a “relevant” random variable {X'} (conditioned with respect to some auxiliary data {Z}) to be a random variable for which

\displaystyle  d[X'|Z;X] = O( d[X;Y] )

or equivalently (by the triangle inequality)

\displaystyle  d[X'|Z;Y] = O( d[X;Y] )

then we have the useful lower bound

\displaystyle  d[X'|Z;Y'|W] \geq (1-O(\eta)) d[X;Y] \ \ \ \ \ (2)

whenever {X'} and {Y'} are relevant conditioning on {Z, W} respectively. This is quite a useful bound, since the laws of “entropic Ruzsa calculus” will tell us, roughly speaking, that virtually any random variable that we can create from taking various sums of copies of {X,Y} and conditioning against other sums, will be relevant. (Informally: the space of relevant random variables is {(1-O(\eta))d[X;Y]}-separated with respect to the entropic Ruzsa distance.)
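
To make the averaging step used above concrete, here is a rough sketch (illustrative code of my own; the data structures are ad hoc, and the official definitions are in the blueprint) of conditional entropic Ruzsa distance {d[X'|Z; Y'|W]} as a weighted average of unconditional distances between the conditioned laws; averaging (1) over the conditioning values is exactly how one obtains its conditional upgrade.

```python
# Sketch only: a joint law of (X', Z) is given as a dict mapping (value,
# condition) pairs to probabilities; the conditioning values for Z and W are
# weighted independently of each other.
import numpy as np

def H(p):
    """Shannon entropy (bits) of a probability vector."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def ruzsa_dist(p, q):
    """d[X;Y] = H[X'+Y'] - H[X]/2 - H[Y]/2, with addition given by XOR."""
    r = np.zeros(len(p))
    for x in range(len(p)):
        for y in range(len(q)):
            r[x ^ y] += p[x] * q[y]
    return H(r) - 0.5 * H(p) - 0.5 * H(q)

def cond_ruzsa_dist(joint_xz, joint_yw, N):
    """d[X'|Z; Y'|W] = sum_{z,w} P(Z=z) P(W=w) d[(X'|Z=z); (Y'|W=w)]."""
    def fibres(joint):
        by_cond = {}
        for (x, z), prob in joint.items():
            by_cond.setdefault(z, np.zeros(N))[x] += prob
        return [(p.sum(), p / p.sum()) for p in by_cond.values()]
    return sum(pz * pw * ruzsa_dist(px, py)
               for pz, px in fibres(joint_xz)
               for pw, py in fibres(joint_yw))

# Example: X' uniform on F_2^2 with Z its lowest bit; Y' uniform with trivial W.
N = 4
joint_xz = {(x, x & 1): 0.25 for x in range(N)}
joint_yw = {(y, 0): 0.25 for y in range(N)}
print(cond_ruzsa_dist(joint_xz, joint_yw, N))
```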

— 1. Main argument —

Now we derive more and more consequences of (2) – at some point crucially using the hypothesis that we are in characteristic two – before we reach a contradiction.

Right now, our hypothesis (2) only supplies lower bounds on entropic distances. The crucial ingredient that allows us to proceed is what we call the fibring identity, which lets us convert these lower bounds into useful upper bounds as well, which in fact match up very nicely when {\eta} is small. Informally, the fibring identity captures the intuitive fact that the doubling constant of a set {A} should be at least as large as the doubling constant of the image {\pi(A)} of that set under a homomorphism, times the doubling constant of a typical fiber {A \cap \pi^{-1}(\{z\})} of that homomorphism; and furthermore, one should only be close to equality if the fibers “line up” in some sense.

Here is the fibring identity:

Proposition 5 (Fibring identity) Let {\pi: G \rightarrow H} be a homomorphism. Then for any independent {G}-valued random variables {X, Y}, one has

\displaystyle  d[X;Y] = d[\pi(X); \pi(Y)] + d[X|\pi(X); Y|\pi(Y)]

\displaystyle  + I[X-Y : \pi(X),\pi(Y) | \pi(X)-\pi(Y) ].

The proof is of course in the blueprint, but given that it is a central pillar of the argument, I reproduce it here.

Proof: Expanding out the definition of Ruzsa distance, and using the conditional entropy chain rule

\displaystyle  {\mathbf H}[X] = {\mathbf H}[\pi(X)] + {\mathbf H}[X|\pi(X)]

and

\displaystyle  {\mathbf H}[Y] = {\mathbf H}[\pi(Y)] + {\mathbf H}[Y|\pi(Y)],

it suffices to establish the identity

\displaystyle  {\mathbf H}[X-Y] = {\mathbf H}[\pi(X)-\pi(Y)] + {\mathbf H}[X - Y|\pi(X), \pi(Y)]

\displaystyle  + I[X-Y : \pi(X),\pi(Y) | \pi(X)-\pi(Y) ].

But from the chain rule again we have

\displaystyle  {\mathbf H}[X-Y] = {\mathbf H}[\pi(X)-\pi(Y)] + {\mathbf H}[X - Y|\pi(X)-\pi(Y)]

and from the definition of conditional mutual information (using the fact that {\pi(X)-\pi(Y)} is determined both by {X-Y} and by {(\pi(X),\pi(Y))}) one has

\displaystyle  {\mathbf H}[X - Y|\pi(X)-\pi(Y)] = {\mathbf H}[X - Y|\pi(X), \pi(Y)]

\displaystyle  + I[X-Y : \pi(X),\pi(Y) | \pi(X)-\pi(Y) ]

giving the claim. \Box
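
Since the fibring identity is the workhorse of the argument, a numerical sanity check may be reassuring. The sketch below is illustrative only (the choice of group, homomorphism, and encoding are mine): it takes {G = {\bf F}_2^2} and {H = {\bf F}_2}, encoded as integers under XOR, with {\pi} the homomorphism that keeps the lowest bit, draws random laws for independent {X, Y}, and checks that the three terms on the right-hand side of Proposition 5 sum to {d[X;Y]}. (In characteristic two all the subtractions are additions, i.e. XORs.)

```python
import numpy as np
from collections import defaultdict

def H_vec(p):
    """Shannon entropy (bits) of a probability vector."""
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def H_dict(joint):
    """Shannon entropy (bits) of a dict mapping outcomes to probabilities."""
    return float(-sum(prob * np.log2(prob) for prob in joint.values() if prob > 0))

def ruzsa_dist(p, q):
    """d[X;Y] = H[X'+Y'] - H[X]/2 - H[Y]/2, with addition given by XOR."""
    r = np.zeros(len(p))
    for x in range(len(p)):
        for y in range(len(q)):
            r[x ^ y] += p[x] * q[y]
    return H_vec(r) - 0.5 * H_vec(p) - 0.5 * H_vec(q)

def cond_mutual_info(triples):
    """I[A:B|C] = H[A,C] + H[B,C] - H[A,B,C] - H[C] from a joint law on (a,b,c)."""
    def marginal(idx):
        m = defaultdict(float)
        for key, prob in triples.items():
            m[tuple(key[i] for i in idx)] += prob
        return m
    return (H_dict(marginal((0, 2))) + H_dict(marginal((1, 2)))
            - H_dict(triples) - H_dict(marginal((2,))))

N = 4                                     # G = F_2^2, encoded as {0,1,2,3}
pi = lambda x: x & 1                      # a homomorphism G -> F_2 (lowest bit)
rng = np.random.default_rng(1)
pX, pY = rng.random(N), rng.random(N)
pX, pY = pX / pX.sum(), pY / pY.sum()

lhs = ruzsa_dist(pX, pY)                  # d[X;Y]

# d[pi(X); pi(Y)]
pPiX = np.array([pX[0] + pX[2], pX[1] + pX[3]])
pPiY = np.array([pY[0] + pY[2], pY[1] + pY[3]])
term1 = ruzsa_dist(pPiX, pPiY)

# d[X|pi(X); Y|pi(Y)]: weighted average of distances between fibre laws
def fibre(p, s):
    q = np.array([p[x] if pi(x) == s else 0.0 for x in range(N)])
    return q / q.sum()
term2 = sum(pPiX[s] * pPiY[t] * ruzsa_dist(fibre(pX, s), fibre(pY, t))
            for s in range(2) for t in range(2))

# I[X-Y : (pi(X),pi(Y)) | pi(X)-pi(Y)]
triples = defaultdict(float)
for x in range(N):
    for y in range(N):
        triples[(x ^ y, (pi(x), pi(y)), pi(x) ^ pi(y))] += pX[x] * pY[y]
term3 = cond_mutual_info(triples)

print(np.isclose(lhs, term1 + term2 + term3))   # the fibring identity holds
```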

We will only care about the characteristic {2} setting here, so we will now assume that all groups involved are {2}-torsion, so that we can replace all subtractions with additions. If we specialize the fibring identity to the case where {G = {\bf F}_2^n \times {\bf F}_2^n}, {H = {\bf F}_2^n}, {\pi: G \rightarrow H} is the addition map {\pi(x,y) = x+y}, and {X = (X_1, X_2)}, {Y = (Y_1, Y_2)} are pairs of independent random variables in {{\bf F}_2^n}, we obtain the following corollary:

Corollary 6 Let {X_1,X_2,Y_1,Y_2} be independent {{\bf F}_2^n}-valued random variables. Then we have the identity

\displaystyle  d[X_1;Y_1] + d[X_2;Y_2] = d[X_1+X_2;Y_1+Y_2]

\displaystyle  + d[X_1|X_1+X_2;Y_1|Y_1+Y_2]

\displaystyle  + I[(X_1+Y_1, X_2+Y_2) : (X_1+X_2,Y_1+Y_2) | X_1+X_2+Y_1+Y_2 ].

This is a useful and flexible identity, especially when combined with (2). For instance, we can discard the conditional mutual information term as being non-negative, to obtain the inequality

\displaystyle  d[X_1;Y_1] + d[X_2;Y_2] \geq d[X_1+X_2;Y_1+Y_2]

\displaystyle  + d[X_1|X_1+X_2;Y_1|Y_1+Y_2].

If we let {X_1, Y_1, X_2, Y_2} be independent copies of {X, Y, Y, X} respectively (note the swap in the last two variables!) we obtain

\displaystyle  2 d[X;Y] \geq d[X+Y;X+Y] + d[X_1|X_1+X_2;Y_1|Y_1+Y_2].

From entropic Ruzsa calculus, one can check that {X+Y}, {X_1|X_1+X_2}, and {Y_1|Y_1+Y_2} are all relevant random variables, so from (2) we now obtain both upper and lower bounds for {d[X+Y;X+Y]}:

\displaystyle  d[X+Y; X+Y] = (1 + O(\eta)) d[X;Y].

A pleasant upshot of this is that we now get to work in the symmetric case {X=Y} without loss of generality. Indeed, if we set {X^* := X+Y}, we now have from (2) that

\displaystyle  d[X'|Z; Y'|W] \geq (1-O(\eta)) d[X^*;X^*] \ \ \ \ \ (3)

whenever {X'|Z, Y'|W} are relevant, which by entropic Ruzsa calculus is equivalent to asking that

\displaystyle  d[X'|Z; X^*], d[Y'|W; X^*] = O(d[X^*;X^*]).

Now we use the fibring identity again, relabeling {Y_1,Y_2} as {X_3,X_4} and requiring {X_1,X_2,X_3,X_4} to be independent copies of {X^*}. We conclude that

\displaystyle  2d[X^*; X^*] = d[X_1+X_2;X_3+X_4] + d[X_1|X_1+X_2;X_3|X_3+X_4]

\displaystyle  + I[(X_1+X_3, X_2+X_4) : (X_1+X_2,X_3+X_4) | X_1+X_2+X_3+X_4 ].

As before, the random variables {X_1+X_2}, {X_3+X_4}, {X_1|X_1+X_2}, {X_3|X_3+X_4} are all relevant, so from (3) we have

\displaystyle  d[X_1+X_2;X_3+X_4], d[X_1|X_1+X_2;X_3|X_3+X_4]

\displaystyle  \geq (1-O(\eta)) d[X^*;X^*].

We could now also match these lower bounds with upper bounds, but the more important takeaway from this analysis is a really good bound on the conditional mutual information: since the two distance terms already account for {(2-O(\eta)) d[X^*;X^*]} of the left-hand side {2d[X^*;X^*]}, we must have

\displaystyle  I[(X_1+X_3, X_2+X_4) : (X_1+X_2,X_3+X_4) | X_1+X_2+X_3+X_4 ]

\displaystyle = O(\eta) d[X^*;X^*].

By the data processing inequality, we can discard some of the randomness here, and conclude

\displaystyle  I[X_1+X_3 : X_1+X_2 | X_1+X_2+X_3+X_4 ] = O(\eta) d[X^*;X^*].

Let us introduce the random variables

\displaystyle  S := X_1+X_2+X_3+X_4; U := X_1+X_2; V := X_1 + X_3

then we have

\displaystyle  I[U : V | S] = O(\eta) d[X^*;X^*].

Intuitively, this means that {U} and {V} are very nearly independent given {S}. For sake of argument, let us assume that they are actually independent; one can achieve something resembling this by invoking the entropic Balog-Szemerédi-Gowers theorem, established in the blueprint, after conceding some losses of {O(\eta) d[X^*;X^*]} in the entropy, but we skip over the details for this blog post. The key point now is that because we are in characteristic {2}, {U+V} has the same form as {U} or {V}:

\displaystyle  U + V = X_2 + X_3.

In particular, by permutation symmetry, we have

\displaystyle  {\mathbf H}[U+V|S] ={\mathbf H}[U|S] ={\mathbf H}[V|S],

and so by the definition of conditional Ruzsa distance we have a massive distance decrement

\displaystyle  {\bf E}_s d[U|S=s; V|S=s] = 0,

(where {s} is drawn from the distribution of {S}), contradicting (1) as desired. (In reality, we end up decreasing the distance not all the way to zero, but instead to {O(\eta d[X^*;X^*])} due to losses in the Balog-Szemerédi-Gowers theorem, but this is still enough to reach a contradiction. The quantity {{\bf E}_s d[U|S=s; V|S=s]} is very similar to {d[U|S; V|S]}, but is slightly different; the latter quantity is {{\bf E}_{s,s'}d[U|S=s; V|S=s']}.)
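
For readers who want to see these quantities concretely, the following sketch (illustrative code of my own; it does not attempt to reproduce the Balog-Szemerédi-Gowers step) takes four independent copies of an arbitrary law on {{\bf F}_2^2}, forms {S, U, V} as above, and computes {I[U:V|S]} together with the conditional entropies. For a generic law the mutual information need not be small (smallness is what the hypothesis (1) supplies in the argument above), but the characteristic-two symmetry {{\mathbf H}[U+V|S] = {\mathbf H}[U|S] = {\mathbf H}[V|S]} holds identically.

```python
import numpy as np
from collections import defaultdict
from itertools import product

def H(joint):
    """Shannon entropy (bits) of a dict mapping outcomes to probabilities."""
    return float(-sum(p * np.log2(p) for p in joint.values() if p > 0))

N = 4                                    # the group F_2^2
rng = np.random.default_rng(2)
p = rng.random(N)
p /= p.sum()                             # an arbitrary law for X*

# Joint law of (U, V, S) for independent X1,...,X4 with law p.
joint_uvs = defaultdict(float)
for x1, x2, x3, x4 in product(range(N), repeat=4):
    prob = p[x1] * p[x2] * p[x3] * p[x4]
    joint_uvs[(x1 ^ x2, x1 ^ x3, x1 ^ x2 ^ x3 ^ x4)] += prob

def marginal(idx):
    m = defaultdict(float)
    for key, prob in joint_uvs.items():
        m[tuple(key[i] for i in idx)] += prob
    return m

H_S = H(marginal((2,)))
H_US, H_VS, H_UVS = H(marginal((0, 2))), H(marginal((1, 2))), H(joint_uvs)
print("I[U:V|S] =", H_US + H_VS - H_UVS - H_S)

# The characteristic-2 symmetry: U+V = X2+X3, so H[U+V|S] = H[U|S] = H[V|S].
joint_sum_s = defaultdict(float)
for (u, v, s), prob in joint_uvs.items():
    joint_sum_s[(u ^ v, s)] += prob
print("H[U|S]   =", H_US - H_S)
print("H[V|S]   =", H_VS - H_S)
print("H[U+V|S] =", H(joint_sum_s) - H_S)
```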

Remark 7 A similar argument works in the {m}-torsion case for general {m}. Instead of decrementing the entropic Ruzsa distance, one instead decrements a “multidistance”

\displaystyle  {\mathbf H}[X_1 + \dots + X_m] - \frac{1}{m} ({\mathbf H}[X_1] + \dots + {\mathbf H}[X_m])

for independent {X_1,\dots,X_m}. By an iterated version of the fibring identity, one can first reduce again to the symmetric case where the random variables are all copies of the same variable {X^*}. If one then takes {X_{i,j}}, {i,j=1,\dots,m} to be an array of {m^2} copies of {X^*}, one can get to the point where the row sums {\sum_i X_{i,j}} and the column sums {\sum_j X_{i,j}} have small conditional mutual information with respect to the double sum {S := \sum_i \sum_j X_{i,j}}. If we then set {U := \sum_i \sum_j j X_{i,j}} and {V := \sum_i \sum_j i X_{i,j}}, the data processing inequality again shows that {U} and {V} are nearly independent given {S}. The {m}-torsion now crucially intervenes as before to ensure that {U+V = \sum_i \sum_j (i+j) X_{i,j}} has the same form as {U} or {V}, leading to a contradiction as before. See this previous blog post for more discussion.