Zulip Chat Archive

Stream: mathlib4

Topic: Moderately big results formalized in other proof assistants


Jason Gross (Nov 12 2024 at 16:45):

I'm looking for a handful of moderately big results that are formalized in other proof assistants but not Lean. My current list is:

  • four-color theorem
  • Kepler conjecture / sphere packing
  • odd-order theorem

Are any of these currently formalized in Lean? Does anyone have a sense of how far off the current results in mathlib are from these results / the basic theory needed for these results?

Kevin Buzzard (Nov 12 2024 at 18:04):

Prime number theorem has been done in several other proof assistants and is actively being worked on in Lean (edit: see for example this message). Finite-dimensionality of spaces of modular forms is done in Isabelle and is being worked on in Lean. But these are both standard graduate-level mathematics, so are perhaps not what you're looking for (PNT was moderately big 20 years ago when Jeremy did it, because the systems were far less mature).

None of the results you state above are formalized in Lean. 4CT and Kepler are both computer-assisted proofs, of which we have far fewer in Lean. But both of these theorems are "end results", i.e. you don't really use them to deduce other things, so perhaps there's not too much motivation to do them again. The paper would say "we formalised an old result which was already formalised elsewhere", and perhaps that isn't such a big selling point.

Odd order is used in the classification of finite simple groups and this is the big result which I think some people (but certainly not all people) would like to see formalized. However humans have not finished writing it up properly yet -- the details are in the literature but currently you apparently have to be an expert to know how to extract them. One issue with finite group theory is that there was a huge exodus of experts after the classification was announced, so there's not a big pool of finite group theorists which we could tap into to do the dirty work of formalizing the 14 volumes. The total amount of work might even be more than FLT. However one huge difference between the classification theorem and FLT is that the classification theorem is a theorem about finite groups and the proof is thousands of results about finite groups, whereas the statement of FLT is about positive integers but the proof goes on a gigantic tour of algebra, analysis, topology, geometry etc etc; the two proofs are in some sense very different. My money would be on the classification (and in particular the odd order theorem) being formalized by an AI in Lean before it's formalized by humans in Lean.

Kevin Buzzard (Nov 12 2024 at 18:08):

For 4CT and the odd-order theorem I think it would not be far from the truth to say that essentially all of the basic theory is already in mathlib, and what isn't there would not be hard to implement. I know far less about the mathematics involved in Kepler, but again it would not surprise me if mathlib had many of the necessary ingredients, simply because it is the goal of mathlib to formalize mathematics which is being used by other projects. A great piece of evidence that this is working is the formalization of the polynomial Freiman-Ruzsa conjecture, where essentially all of the prerequisites were there already and it was just a case of putting everything together, which is why it took only 3 weeks. A non-example would be the Liquid Tensor Experiment, where 6 months in we were 5 lines from the end of the proof, and it took another year to complete it, because those 5 lines assumed a huge amount of homological algebra which had to be built from scratch.

Josha Dekker (Nov 12 2024 at 18:10):

Also, are you familiar with https://www.cs.ru.nl/~freek/100/? This compares the various proof assistants along 100 problems; you may be able to select problems that match your parameters from there!

Kevin Buzzard (Nov 12 2024 at 18:13):

As for other big results proved in other systems but not Lean, one that springs to mind is the proof that the Lorenz attractor is chaotic, which was formalized by Immler in Isabelle/HOL. Again we don't have this in Lean. Again it's computer-assisted (by which I mean the original human proof used a computer in the more traditional way, like 4CT and Kepler). There has been very little enthusiasm here for formalizing computer-assisted proofs. To give one example: Helfgott announced a proof of the weak Goldbach conjecture (every odd number >= 7 is the sum of 3 primes), and the proof goes like this: check the statement on a computer for all numbers up to 10^30 or so (a highly nontrivial task, as I'm sure you'll be aware; Helfgott used some tricks to do this) and use theoretical methods for numbers bigger than this. Several years ago I tried to encourage people to try doing this in Lean 4, or at least to attempt it so we could see how far off we were, but nobody took the bait. The reason I think this is a nice question is that the mathematics underlying it is very simple, so it would be a test of whether Lean could do the computer-assisted part of this computer-assisted proof.
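(To make the target concrete: the statement itself is tiny in Lean terms. A plausible Mathlib-style rendering — purely illustrative, the name and exact form are not from any existing file — would be something like:

```lean
-- Illustrative sketch only: a possible Lean 4 / Mathlib statement of the
-- weak Goldbach conjecture, "every odd number >= 7 is the sum of 3 primes".
theorem weak_goldbach (n : ℕ) (h₁ : Odd n) (h₂ : 7 ≤ n) :
    ∃ p q r : ℕ, p.Prime ∧ q.Prime ∧ r.Prime ∧ p + q + r = n := by
  sorry
```

The hard part is of course the `sorry`, which hides both the 100 pages of analysis and the up-to-10^30 computation.)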

Kevin Buzzard (Nov 12 2024 at 18:17):

It's finally perhaps worth noting here that my impression is that the communities associated to the other proof assistants tend to be more on the CS side of things, and indeed I think formally verifying computer-assisted proofs was something where they felt that they were helping the maths community, because who knows if our human code has bugs. However computer-assisted proofs have a rather funny status in the mathematics community, because what mathematicians are often looking for when reading a proof is to get an understanding of the argument, and a computer-assisted proof might well not give this. I'm not old enough to remember the hoo-hah which was caused by the announcement of 4CT but my understanding is that some people were very disappointed, and it's certainly the case that mathematicians still work on trying to find a "proper proof", i.e. one which humans can understand conceptually from end to end (indeed there was a prominent announcement a couple of years ago of a computer-free proof of 4CT which generated a bit of a stir, but in the end the proof fell through).

Floris van Doorn (Nov 12 2024 at 18:18):

Kevin Buzzard said:

There has been very little enthusiasm here for formalizing computer-assisted proofs.

I do agree that is not at all the focus of most people here, but there are some nice examples of computer-assisted proofs formalized in Lean, such as the empty hexagon number: https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITP.2024.35

Kevin Buzzard (Nov 12 2024 at 18:20):

Yes, I should definitely retract/reword that statement! Although I can't now, because you quoted it :-) So clearly some people here are interested, but the majority of what's going on here is more on the theoretical side of things than on verifying program correctness.

Jason Gross (Nov 12 2024 at 19:04):

Thanks! I'm looking at these results precisely because I'm interested in getting AI to formalize them, as a sort of on-ramp to automating program verification. I expect "translate odd order from Coq to Lean" to be marginally simpler than "translate 4CT from Coq to Lean", which I expect to be a bit simpler than "translate the Kepler conjecture from Coq to Lean". It sounds like the weak Goldbach conjecture might be a good step after this, and probably simpler in various ways than "translate CompCert from Coq to Lean".

Jason Gross (Nov 12 2024 at 19:24):

Also, btw, regarding "proper proof" @Kevin Buzzard , I recently published a paper (to appear in NeurIPS) making the case (empirically, but somewhat informally) that the length of a formal proof* can be used as a quantitative metric of the degree of understanding encoded in the proof.

*Length of proof in a non-computational theory of logic; in Lean & Coq it would be proof checking time, which includes, in this case, the running time of the computation for 4CT / Kepler / weak Goldbach

Kevin Buzzard (Nov 12 2024 at 20:10):

I have no feeling at all about the ratio of the time taken to compile the theoretical part of the argument vs. the time taken to compile the computational part for 4CT/Kepler/Lorenz. My guess is that the computational verification of the <= 10^30 part of weak Goldbach would easily dominate the compile time for the > 10^30 part of a theoretical Lean formalisation of Helfgott (100 pages of analysis).

Jason Gross (Nov 13 2024 at 00:43):

Surely the theoretical part for 4CT/Kepler/Lorenz compiles faster than the computational part, right? Note that I'm not talking about proof search time (tactics), only the kernel checking time (Qed in Coq). Roughly equivalently, how much LaTeX is required when you impose the constraint that the amount of detail left to the reader should be uniform across the document. Think of it this way: for the theoretical part, you could write down the math in a textbook or a paper. The part of the math that is specific to 4CT/Kepler/Lorenz, and not reusable, should not be more than, what, a couple dozen to a couple hundred pages of LaTeX?

By contrast, Google says that 4CT has over 600 cases, and Kepler has over 5000 cases and requires solving about 100,000 linear programming problems. Even if we assume that you have some compact representation of graphs that allows you to establish the validity of a 4-coloring of a graph in a single paragraph, that's still ~200 pages of case-work for 4CT. And even if every linear programming problem has a solution that can be validated in a single line, that's still ~1000 pages of solving linear programming problems for Kepler.
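(As a back-of-the-envelope check of those page counts — the per-page densities are my own assumptions, not established figures:

```python
# Rough page-count estimates for the case analyses mentioned above.
# Assumed densities (my guesses): ~3 single-paragraph cases per LaTeX
# page, ~100 one-line certificates per LaTeX page.
cases_4ct = 600        # reducible configurations checked in 4CT
lps_kepler = 100_000   # linear programs solved in the Kepler proof

pages_4ct = cases_4ct / 3        # one paragraph per 4-coloring check
pages_kepler = lps_kepler / 100  # one line per LP certificate

print(f"4CT: ~{pages_4ct:.0f} pages, Kepler: ~{pages_kepler:.0f} pages")
# -> 4CT: ~200 pages, Kepler: ~1000 pages
```

which matches the ~200 and ~1000 figures above.)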

Jason Gross (Nov 13 2024 at 00:44):

(I guess it would still be interesting to actually get empirical measurements of these times, though.)

Mario Carneiro (Nov 13 2024 at 01:26):

Jason Gross said:

I expect "translate odd order from Coq to Lean" to be marginally simpler than "translate 4CT from Coq to Lean",

I looked into this with Assia not long ago. Our estimate was that 4CT would be easier than odd order, because it is an older proof and the style is more regular.

Mario Carneiro (Nov 13 2024 at 01:28):

The computer-assisted part is not the hard part, since there you just translate the program (manually if necessary).

Mario Carneiro (Nov 13 2024 at 01:28):

Odd order has a lot more "text part" than 4CT.

Abraham Solomon (Nov 13 2024 at 03:58):

Josha Dekker said:

Also, are you familiar with https://www.cs.ru.nl/~freek/100/? This compares the various proof assistants along 100 problems; you may be able to select problems that match your parameters from there!

There might not be an obvious answer to this, but why is the CLT apparently only formalised in Isabelle? Is there something specific about it that's hard to do in type theory?

Joseph Myers (Nov 13 2024 at 05:01):

Note that originally the Kepler conjecture was formalized in the union of HOL Light (most parts) and Isabelle (the classification of certain graphs). They wanted to get it all into one system; I'm not sure if that's been done yet.

Joseph Myers (Nov 13 2024 at 05:03):

I've wondered which will come first: FLT completely formalized in Lean, or all the other 99 theorems on the 100-theorems list (each of which has been done in at least one system) formalized in Lean (right now we have 79, maybe plus a few more that have been done but not added to the list while PRing to mathlib is ongoing).

Kim Morrison (Nov 13 2024 at 05:05):

The central limit theorem is in progress at https://remydegenne.github.io/CLT/index.html. I don't think there's any particular obstacle, except perhaps Mathlib's tendency to do things in great generality and to ensure the reusability of components, which sometimes takes time.

Abraham Solomon (Nov 13 2024 at 05:39):

Kim Morrison said:

The central limit theorem is in progress at https://remydegenne.github.io/CLT/index.html. I don't think there's any particular obstacle, except perhaps Mathlib's tendency to do things in great generality and to ensure the reusability of components, which sometimes takes time.

Cool, thanks so much! Is there some place to see which of the 21 remaining theorems are in progress (out of curiosity)? I briefly tried looking for it as a topic on Zulip but came up short.

Johan Commelin (Nov 13 2024 at 07:15):

@Abraham Solomon See #mathlib4 > Remaining 100 theorems, which I just started

Joachim Breitner (Nov 13 2024 at 09:31):

Jason Gross said:

only the kernel checking time (Qed in Coq). Roughly equivalently, how much LaTeX is required when you impose the constraint that the amount of detail left to the reader should be uniform across the document.

Interesting idea. However, I expect it works well only for large computational proofs. For most Lean proofs, kernel checking time is not the bottleneck, so little effort goes into making it uniformly fast – optimization effort stops when it "feels fast enough", and that is a property that depends on other factors, e.g. the absolute time and the time relative to elaboration. If you wait 2s for the SAT solver to return, it doesn't matter much whether the kernel checking takes 200ms or 20ms.

Martin Dvořák (Nov 13 2024 at 10:03):

I would call them "big results", not "moderately big results".

Jason Gross (Nov 13 2024 at 16:30):

@Joachim Breitner Yes, I mostly agree. For the proofs that I was looking at, where there's a parametrized family of theorems & proofs, I was only looking at asymptotic differences in proof length. And even then the amount of understanding required to shift between different polynomial exponents is relatively small in most cases, though sometimes we "hit a wall". I expect that small constant factors are mostly not meaningful. And it's not (yet) clear how to make comparisons across proofs of fundamentally different theorems.


Last updated: May 02 2025 at 03:31 UTC