Zulip Chat Archive

Stream: Polynomial Freiman-Ruzsa conjecture

Topic: Outstanding tasks for PFR III, version 3.0


Terence Tao (Nov 02 2024 at 20:21):

Time to restart the third phase of the PFR project, in which we formalize the m-torsion variant, as well as the improved constant from 11 to 9!

Here is the current state of play. (See also the blueprint dependency graph, where I manually count about 46 blue bubbles that need filling in.)

Ongoing tasks:

Outstanding tasks for the m-torsion project:

A.1. Nonnegativity Should follow quickly from pfr#max_entropy_le_entropy_add. Completed by @Arend Mellendijk
A.2. Multidistance and Ruzsa distance I A relatively straightforward application of existing Ruzsa distance inequalities together with the definition of multidistance.
A.3. Multidistance and Ruzsa distance II A corollary of A.1 and existing Ruzsa distance inequalities.
A.4. Multidistance and Ruzsa distance III A more complicated application of Ruzsa distance inequalities.
A.5. Multidistance and Ruzsa distance IV Another more complicated application of Ruzsa distance inequalities.
A.6. Vanishing Will follow from A.2 and existing Ruzsa distance inequalities.
A.7. Nonnegativity of conditional multidistance. This is almost completed thanks to A.1, but there is one sorry due to the need to handle events of probability zero. Completed by @Terence Tao

Outstanding tasks for the 2-torsion project, focusing on finishing the KL divergence material, and introducing the rho family of functionals:
B.1. Kullback–Leibler and sums Should be routine from already established facts about KL divergence. Established by @Sébastien Gouëzel
B.2. Kullback–Leibler and conditioning Should follow easily from the relationships between conditional and unconditional entropy and KL divergence. Established by @Sébastien Gouëzel
B.3. Conditional Gibbs inequality Should follow easily from the already proven Gibbs inequality and definition. Established by @Sébastien Gouëzel
B.4. Rho minus nonnegative Should follow from Gibbs inequality and definitions. Established by @Sébastien Gouëzel
B.5. Rho minus of subgroup Mostly just calculation. Established by @Sébastien Gouëzel
B.6. Rho plus of subgroup Corollary of B.5. Established by @Sébastien Gouëzel
B.7. Rho of subgroup Follows from B.5 and B.6. Established by @Sébastien Gouëzel
B.8. Rho invariant This will follow from translation-invariance of entropy and KL divergence. Established by @Sébastien Gouëzel
B.9. Rho of uniform This follows quickly from previous facts about rho. Established by @Sébastien Gouëzel
B.10. Rho and sums Should follow from B.1. Established by @Sébastien Gouëzel
B.11. Conditional rho and relabeling Should be a straightforward consequence of definitions and known facts about entropy. Established by @Sébastien Gouëzel
B.12. Continuity of rho Potentially tricky, but continuous_tau_restrict_probabilityMeasure could be a guide. Established by @Sébastien Gouëzel
B.13. φ-minimizers exist Should be somewhat similar to tau_minimizer_exists (https://teorth.github.io/pfr/docs/PFR/TauFunctional.html#tau_minimizer_exists) and use B.12. Established by @Sébastien Gouëzel
B.14. Bound on I_1 This is similar to the corresponding bound for the original PFR argument. Established by @Sébastien Gouëzel
B.15. Difference between I_1 and I_2 Follows from already established fibring identities. Established by @Sébastien Gouëzel

Sébastien Gouëzel (Nov 02 2024 at 20:48):

Can I claim B.1 to B.8?

Sébastien Gouëzel (Nov 04 2024 at 12:34):

PR at https://github.com/teorth/pfr/pull/228. It was fun, but now I should go back to real life for some time :-)

Arend Mellendijk (Nov 04 2024 at 15:23):

Here's task A.1: https://github.com/teorth/pfr/pull/229

Sébastien Gouëzel (Nov 08 2024 at 20:28):

I have a referee report to finish, so that gave me an excuse to formalize some more rho functional lemmas instead, in https://github.com/teorth/pfr/pull/230. Structured procrastination works really well :-)

Sébastien Gouëzel (Nov 18 2024 at 09:08):

I've completed the proof that the constant 9 works in PFR instead of 11, in https://github.com/teorth/pfr/pull/231. It went smoothly, for the most part.

I've found the proof for continuity of rho in the blueprint a little bit too optimistic: instead of "Clear from definition", it would be fairer to say "this is an easy exercise", because the lack of continuity of (x, y) ↦ log (x / y) has to be dealt with somehow.
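(For instance, along the diagonal x = y one has log (x / y) = 0, while along x = y² with y → 0⁺ one has log (x / y) = log y → -∞, so (x, y) ↦ log (x / y) has no continuous extension to (0, 0).)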

Also, once I was done and ran #print axioms better_PFR_conjecture', I was unpleasantly surprised to see that it depended on sorryAx. In fact, this has been the case for all the main theorems in the project since the commit https://github.com/teorth/pfr/commit/ebd4d5babe62b9ae162c2667c05d774269afd4b1#diff-3ea916cc29c2e35fe3f117b8e69973be0c8c26d1da802976094fb9caf6a5e8d9R72 introduced a sorry in measure_compl_support. Anyway, it wasn't hard to fix once I had located the source of the sorry. By the way, is there a better command than #print axioms that would have told me where the sorry was coming from? I had to do a bit of detective work to find it by hand...
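For reference, a minimal sketch of the check in question (with a hypothetical lemma name): any declaration whose proof uses sorry, directly or through a dependency, is reported as depending on sorryAx.

-- hypothetical example: a proof containing a sorry
theorem demo_lemma : 1 + 1 = 2 := by sorry

#print axioms demo_lemma
-- 'demo_lemma' depends on axioms: [sorryAx]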

Yaël Dillies (Nov 18 2024 at 09:10):

Yes, sorry about measure_compl_support. I was stuck in the middle of a bigger refactor and thought it would make an easy exercise for someone else :grinning:

Yaël Dillies (Nov 18 2024 at 09:11):

For axiom-tracking, see #general > Tracking axioms

Terence Tao (Nov 18 2024 at 20:03):

Wow, that's amazing! I thought it would take a much more painstaking and slow approach, but perhaps having a working model of formalized PFR for the previous constants helped.

Ah, yes, you're right, a bit of massaging is needed to make it self-evident that Kullback-Leibler divergence is in fact a continuous operation.

Sébastien Gouëzel (Nov 18 2024 at 20:51):

Clearly, the fact that we already had a formalization for exponent 11 made a big difference, as the approach is pretty similar. I could copy and paste lemmas, modify the statements a little, and fix the proofs wherever Lean complained. Or, at least, take strong inspiration from the statements, and see which lemma was useful at which place. A huge speedup overall!

There's maybe a lesson to be learned here. In analysis, often it's not so much that one uses a big black box, but rather that one uses a standard technique with minor modifications here and there. It looks like formalization also works quite nicely in this kind of situation -- and it's also the kind of situation where I would expect AI to have good potential.

Terence Tao (Nov 18 2024 at 20:57):

I certainly noticed something similar when we moved from C=12 to C=11... often one could copy and paste the code, change a 12 to an 11 at the top, and fix the few statements where Lean actually complained. It also makes me think that for doing formalized analysis it is more convenient to work with explicit constants everywhere rather than with the more standard asymptotic notation like O() and o(), because maintaining the constants as the argument improves over time is actually relatively painless compared with other formalization tasks, and of course Lean guarantees that one is not propagating an arithmetic error or something all over the place.

Yaël Dillies (Nov 18 2024 at 21:23):

A style Bhavik and I are experimenting with is making all constants appearing in statements into separate defs. Something like

def constLemmaA : ℝ := ...

lemma lemmaA : ... ≤ constLemmaA := ...

def constLemmaB : ℝ := 1 - ε * constLemmaA

lemma lemmaB : ... ≤ constLemmaB := ...

Yaël Dillies (Nov 18 2024 at 21:25):

This way, whenever a constant changes, the change only propagates to the other constants/theorems that directly mention it, instead of long-range fixes being needed (as in Sébastien's pfr#232)

Yaël Dillies (Nov 18 2024 at 21:26):

This comes at the cost of being less explicit about what the constant is, but that can be fixed in the final theorem by unfolding all the constants present
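Something like the following minimal sketch (hypothetical names and made-up values, just to illustrate the unfolding step):

import Mathlib

-- hypothetical constants in the style described above
noncomputable def constA : ℝ := 3
noncomputable def constB : ℝ := 1 - 2 * constA

-- unfolding both defs in the final statement turns the bound into explicit arithmetic
example : constB = -5 := by
  norm_num [constA, constB]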

Kevin Buzzard (Nov 18 2024 at 22:13):

Here's a true story and a problem with mathematics which are related to this.

The true story: Ken Ribet proved in the late 80s that FLT followed from (what was then called) the Taniyama-Shimura-Weil conjecture; this was the paper which made Wiles drop everything and start working on FLT. Ken's paper (after the FLT-related intro) begins with the line "let p be an odd prime", an assumption which runs through the entire 35-or-so-page paper, and he then goes on to prove a level-lowering result for mod p modular forms and mod p Galois representations.

A few years after the Wiles breakthrough, Richard Taylor drew up a strategy for proving new cases of an old conjecture of Artin on L-functions, and one of the ingredients in the strategy was a strengthening of Ken's result so that it included the case p=2. I started working on this in about 2000, and it was a phenomenally painful task, because not only did I have to read Ken's paper very carefully, but I also had to check every reference, because p=2 is a famously thorny prime in this area of number theory (not only does GL_2(Z) have elements of order 2, but GL_1(Z) has elements of order 2 and there are even some questions about GL_1 (for example the Iwasawa main conjecture on p-adic L-functions for totally real fields) which have not been satisfactorily resolved for p=2 yet). You really never knew what you were going to hit; Ribet would refer to SGA7 and so you'd have to go through these gigantic tomes of Grothendieck et al. and really check hard to make sure that the results he was quoting didn't assume p>2, or find your way around them if they did. I knew some tricks (which I'd learnt from Ken and others) about how to prove various intermediate lemmas for p=2, but I also knew that the experts seemed convinced that proving the full level-lowering theorem was hard. I still remember clear as day where I was (the library in Oberwolfach) when I realised that in fact all of the issues were surmountable and the paper worked fine for p=2 (as long as you inserted a few tweaks here and there). I emailed Ken and Richard to say "what is all the fuss about? level-lowering works fine for p=2?" and I got emails back from them simultaneously, both saying "I thought that the problem was here", for two completely different values of "here". So in fact the theorem had been known to the union of the experts; it was just that no single expert happened to know it. I wrote a paper "level-lowering for p=2" which contained none of my own ideas but was extremely influential, because this was the last ingredient Taylor needed for his Artin strategy, so I also got a multi-author Duke paper out of it proving new cases of an old conjecture. And all because removing the assumption p>2 was easy, but nobody had noticed it was easy. Terry has pointed out how formalization makes this kind of exercise much easier; I would have loved to have had a formalization of Ken's result back then (ironically, the strategy I'm following for FLT doesn't need Ken's result!)

The problem with mathematics is a version of this phenomenon which is a couple of orders of magnitude worse, and is explained in the second half of Michael Harris' blog post here. If you skip over his usual anti-formalization/AI rant in the first half, you'll run into a section "Dreams of an AI scribe (or, my secret plan to sell out)" accompanied by a picture of the front cover of a 727-page green book called "Stabilisation de la formule des traces tordue, Vol 2". The game here is that this book, and volume 1, prove stabilization of the twisted trace formula in the number field case, but Michael wants it in the function field case, to finish the proof of another theorem he's working on. There is an analogy between number fields and function fields, for example their integer rings are both Dedekind domains with finite residue fields and finite class groups etc, but there are also differences (for example the function fields are characteristic p, so one prime p becomes a "special" prime that you can't divide by, and this screws some stuff up). Michael points out that going from the number field theorem to the function field theorem will be hard, because either you write a paper saying "here is the diff between the green books and the function field theorem" (which will be kind of hard to publish, he conjectures, although I managed to publish my p=2 paper by writing just the diff, but that was only 8 pages), or you publish another 1000 pages (which will be kind of hard to publish, because 90% of it will be plagiarised). This means (according to Michael, and I suspect he will have evidence for this claim) that humans are reluctant to work on the problem, which leaves him stuck. If AI could do it however...


Last updated: May 02 2025 at 03:31 UTC