Zulip Chat Archive

Stream: Polynomial Freiman-Ruzsa conjecture

Topic: New version of "outstanding tasks"


Terence Tao (Nov 22 2023 at 20:07):

A lot of progress in the last 48 hours! Most of the tasks that were outstanding in the previous list are now done, so here I will collect some new tasks for people to claim, with some brief comments on them. If you want to claim one or more of these lemmas, please do so by replying in this thread. The previous update thread can be found here.

Existing claims:

  1. @Jonas Bayer claims Bounded entropy implies concentration
  2. @Ben Eltschig has established Relating conditional Ruzsa distance to Ruzsa distance
  3. @Yaël Dillies and @Bhavik Mehta claim Entropic Balog-Szemeredi-Gowers
  4. @Kalle Kytölä has established tau has a minimizer
  5. @Rémy Degenne has completed conditioned Ruzsa distance and claims Alternate form of distance

Outstanding tasks:

  1. General fibring identity This will be an application of a large number of the entropic Ruzsa calculus lemmas, a bit similar to how the Kaimonovich-Vershik lemma is proven for instance, or the improved Ruzsa triangle inequality. Established by @Aaron Anderson
  2. Special case of Fibring identity Again, mostly a large number of applications of preceding entropy identities. Claimed by @Aaron Anderson
  3. Existence of independent copies We currently have a full proof of existence of two independent copies, and can deduce three independent copies from the general case, but don't have the general case yet (also we will later need a version with four independent copies to deduce from the general case). Established by @Floris van Doorn
  4. Entropy of uniform random variable Should be doable now that the definition of a uniform random variable is formalized. Needed to prove PFR Claimed by @Paul Lezeau
  5. Existence of uniform random variable Should be doable now that the definition of a uniform random variable is formalized. Needed to prove PFR Established by @Sebastien Gouezel
  6. Polynomial Freiman-Ruzsa conjecture (i.e. the final theorem!) This is a moderately challenging derivation from existing lemmas, including items 4 and 5 above, the entropy version of PFR, and the "bounded entropy implies concentration" lemma. Most of these are not yet proven, but it should be possible to locally prove this result now based on these black boxes. Claimed by @Sebastien Gouezel
  7. Bound on conditional mutual informations We are not close to proving the prerequisites for this lemma yet, but the derivation of this lemma from previous results is basically just combining together linear inequalities and should be straightforward. @Kyle Miller reduced matters to establishing a symmetry identity I2=I3I_2 = I_3, but this latter identity still needs to be proven and this task is currently available to claim.
  8. Existence of conditionally independent trials This is a variant of Existence of independent copies that is needed for the Balog-Szemeredi-Gowers lemma (so it may make sense for whoever is working on this to coordinate with Yael and Bhavik on this).
  9. Fibring identity for first estimate This should be an immediate application of Special case of Fibring identity
  10. Lower bound on distances This should be an immediate application of Distance lower bound Established by @Kyle Miller
  11. Lower bound on conditional distances This should be an immediate application of Conditional distance lower bound Established by @Kyle Miller
  12. Upper bound on distance differences This should be an immediate application of Comparison of Ruzsa distances, I Established by @Arend Mellendijk
  13. First estimate This should be a linear combination of inequalities 9-12.
  14. Entropy bound on quadruple sum This is a different linear combination of inequalities 9-12.
  15. Conditional distance lower bound Currently blocked due to the lack of definition of definition of conditional distance, but once this definition is in place, this inequality should in principle follow easily from Distance lower bound.
  16. Bound on distance increments This should be a straightforward linear combination of Entropy bound on quadruple sum and Comparison of Ruzsa distances, II
  17. Distance between sums - This should be a routine linear combination of Distance lower bound and Comparison of Ruzsa distances, I.
  18. Second estimate This is a linear combination of several other lemmas, including #2, #14, #15, #17, and Comparison of Ruzsa distances, I.
  19. tau decrement This combines a number of difficult estimates to produce a new pair of random variables with a smaller tau-functional than the initial pair. While this is a substantial result in the paper, the derivation from previous lemmas is relatively straightforward and this result may only be of moderate difficulty to formalize.
  20. Constructing good variables This is a technical lemma that is not yet ready to be formalized, because it is awaiting the formalization of Entropic Balog-Szemeredi-Gowers, but is included here for sake of completeness.
  21. Conditional independence This was previously incorrectly marked as done, but actually we don't have a valid definition yet, which is holding up some other tasks. Established by @Terence Tao
  22. Vanishing conditional mutual information This was previously incorrectly marked as done, but actually we didn't have a valid proof yet (because we lacked a definition of #21). But it should follow readily from Vanishing of mutual information. Established by @Terence Tao

We're getting into more challenging territory now (in particular, moving out of the appendices of the paper and into the main body of argument), so there is less "low hanging fruit" available unfortunately. Also some of the lemmas may require some additional hypotheses (e.g., measurability) that are not currently included in the statement: if you do modify the lemma to incorporate these hypotheses, you may have to check that any results that rely on the lemma are similarly updated.

Paul Lezeau (Nov 22 2023 at 20:12):

I'd like to claim the entropy of a uniform random variable :)

Kevin Buzzard (Nov 22 2023 at 20:13):

Just to remark that I think these well-documented and clear lists are very useful. We didn't use anything like this in LTE, people just talked amongst themselves about what to do next to a large extent. I'm watching carefully because I want to start on my FLT project at some point in the near future.

Aaron Anderson (Nov 22 2023 at 21:10):

I'm gonna take a stab at the general fibring identity.

Floris van Doorn (Nov 22 2023 at 21:35):

I'll try the existence of independent copies next.

Floris van Doorn (Nov 22 2023 at 22:53):

done

Kyle Miller (Nov 22 2023 at 23:06):

I'm trying 7 right now

Shreyas Srinivas (Nov 22 2023 at 23:10):

I would like to try the 5. existence of uniform random variables. But since I am probably the most novice here, I'll also probably give up soon. If that's okay, I'd love to take it for now, see what I can get and then pass the baton in an hour or so.

Terence Tao (Nov 22 2023 at 23:29):

Kevin Buzzard said:

Just to remark that I think these well-documented and clear lists are very useful. We didn't use anything like this in LTE, people just talked amongst themselves about what to do next to a large extent. I'm watching carefully because I want to start on my FLT project at some point in the near future.

For a project of this size it seems manageable to update these lists by hand, but for something as large as FLT I would imagine one would want a more automated tool (maybe a Zulip bot?) that integrates with Blueprint in which registered users can claim/relinquish/complete blueprint items directly, perhaps with some layer of moderator approval.

Shreyas Srinivas (Nov 23 2023 at 00:35):

It is well past my bedtime, so I should hand over the token on 5. If it is still open tomorrow morning, I will pick it up again.

Shreyas Srinivas (Nov 23 2023 at 00:36):

I do have a question: why should \Omega be from a universe different than all other universes used in variable before (\Omega : Type*)

Shreyas Srinivas (Nov 23 2023 at 00:41):

Also why not use the PMF API? For example see this linked code from mathlib

Terence Tao (Nov 23 2023 at 00:47):

Given the rapid rate at which tasks are being completed, I just expanded the list of tasks. I broke up the tasks to establish in the "First estimate" section into smaller pieces: four claims that are basically specializations of previous lemmas to the situation in first_estimate.lean, and then two claims that are basically linear combinations of the first four claims. Any one of these six claims might be particularly suitable for beginners (though there may be technical issues aligning the statements of these claims with their predecessors).

I'll try to do something similar with the "Second estimate" section later today. (It is a strange feeling when the formalization is proceeding so quickly that I have to race a little bit to keep the blueprint, task list, and lean stubs up to date, without doing any actual coding.)

Kyle Miller (Nov 23 2023 at 01:23):

I finished part of 7, but not its prerequisite I₃_eq, and I'll leave that for someone else to claim.

Kyle Miller (Nov 23 2023 at 01:34):

Here's 10 and 11

Terence Tao (Nov 23 2023 at 02:52):

Shreyas Srinivas said:

I do have a question: why should \Omega be from a universe different than all other universes used in variable before (\Omega : Type*)

I think it's difficult to create vectors such as ![ Ω, Ω', Ω''] unless all the spaces belong to the same universe, but I admit I don't really understand this subtlety. For our application I'm sure it can't hurt to have all spaces come from the same universe.

Terence Tao (Nov 23 2023 at 05:18):

I decided to go ahead and add tasks for all of the outstanding unclaimed bubbles on blueprint, with all but one are ready to be formalized at the current state of the blueprint). In the end I split up the second estimate into two medium sized pieces rather than six or so tiny pieces. Only 15 unclaimed tasks left in the entire project - the end is on the horizon!

Sebastien Gouezel (Nov 23 2023 at 07:26):

I'd like to claim 6 (going from the entropy version of the main statement to the covering version).

Sebastien Gouezel (Nov 23 2023 at 07:27):

I expect it won't be very smooth (in particular one needs to make a link with linear algebra, to find a subspace of half cardinality in a given subspace), so don't hold your breath :-)

Sebastien Gouezel (Nov 23 2023 at 12:30):

In fact I will avoid linear algebra alltogether, and argue that any group of cardinality p^n contains subgroups of cardinality p^k for any k < n (which is easy to prove by induction over n by quotienting by a nontrivial element in the center). And it turns out we already have this in mathlib, in docs#Sylow.exists_subgroup_card_pow_prime

Terence Tao (Nov 23 2023 at 15:54):

Sebastien Gouezel said:

I expect it won't be very smooth (in particular one needs to make a link with linear algebra, to find a subspace of half cardinality in a given subspace), so don't hold your breath :-)

I see you already think you have a workaround for this issue, but if not there is always the option of spinning off the one fact you don't see how to readily prove as a separate lemma with a sorry and reporting it so that I can put that fact back on the list of outstanding tasks (and also update the blueprint accordingly). We already did this with Item 7, in which the "routine combination of linear inequalities" also involved establishing an identity

I( X₁ + X₂ : X₁' + X₁ | X₁ + X₂ + X₁' + X₂' ] = I( X₁' + X₂ : X₁' + X₁ | X₁ + X₂ + X₁' + X₂' ]

coming from the symmetry of interchanging X₁ and X₁' , which we returned to the list of tasks.

Aaron Anderson (Nov 23 2023 at 19:37):

I've finished the general fibring lemma, and now I'm setting up the PR. I'm gonna go ahead and claim the specific fibring lemma.

Arend Mellendijk (Nov 24 2023 at 00:12):

I'm working on 12 at the moment.

Arend Mellendijk (Nov 24 2023 at 02:21):

And done

Ben Eltschig (Nov 24 2023 at 03:36):

Lemma 3.22 is also fully done now; see #73.

Terence Tao (Nov 24 2023 at 03:43):

There was a blueprint oversight with condIndepFun incorrectly marked as defined, when it fact we did not actually provide a definition. I think I have rectified this with what should be a usable definition of conditional independence, so hopefully the items that were previously blocked by not having such a definition (8 and 22) should now be available to claim:

def condIndepFun (X : Ω  S) (Y : Ω  T) (Z : Ω  U) (μ : Measure Ω) : Prop := ∀ᵐ z  (μ.map Z),  IndepFun X Y (μ[|Z ⁻¹' {z}])

It's possible though that we may want to generalize this definition to a less "pointwise" definition in case one wants to extend the notion of conditional independence to continuous random variables Z (not needed for this project, but perhaps if one ever wants to have a notion of conditional independence in Mathlib, which from what I can tell doesn't currently support such a notion).

EDIT: 22 is now done.

Rémy Degenne (Nov 24 2023 at 06:59):

There is no conditional independence in mathlib yet but I have an open mathlib pr with a definition: https://github.com/leanprover-community/mathlib4/pull/6098


Last updated: Dec 20 2023 at 11:08 UTC