Zulip Chat Archive

Stream: general

Topic: dreams of big projects


Kevin Buzzard (Feb 19 2025 at 14:47):

FLT has been the big project I've been dreaming of for years. What other big Lean projects do people dream about? I'm talking about big projects which would unarguably have high visibility and impact and which would perhaps attract new users?

Example: There is room for at least one more "Langlands program" type project I should think, for example something on geometric Langlands would be feasible given that the popular science blogs are already reporting on it when it's not even been refereed!

Non-example: I don't think that ABC would be a good example, despite there being a lot of noise about it, simply because of the high chance of failure.

Jeremy Tan (Feb 19 2025 at 14:56):

How about Keevash's existence theorem for Steiner systems?

Shreyas Srinivas (Feb 19 2025 at 14:56):

My easy answer : Algolib. A library of algorithms theory that uses a standardised notion of complexity computation of functions (using a generic query model), and encodes proofs of complexity and correctness of algorithms

Shreyas Srinivas (Feb 19 2025 at 14:57):

And to be able to bring it to a level where recent results can be formalised in a few weeks.

Shreyas Srinivas (Feb 19 2025 at 14:58):

Given the messy state of the algorithms literature, I expect this to be a tad bit more challenging than for math.

Michael Stoll (Feb 19 2025 at 15:04):

Closer to FLT: I've been toying with the idea of formalizing a proof of Mordell's Conjecture (see these slides from the Mordell Conference last year).

Patrick Massot (Feb 19 2025 at 15:07):

There are not so many theorems comparable to FLT in terms of public recognition. The obvious candidate in pure algebra is the classification of finite simple groups. I know that very few people here (probably nobody?) feel excited about finite group theory, but we cannot avoid discussing this topic from time to time. In geometry and topology, I think the only comparable result is Perelman’s proof of the Poincaré conjecture.

Ruben Van de Velde (Feb 19 2025 at 15:12):

Can we state the classification yet?

Riccardo Brasca (Feb 19 2025 at 15:17):

Ruben Van de Velde said:

Can we state the classification yet?

Stating the classification is probably a reasonable project. Even proving that all the groups are simple is not impossible. CC @Antoine Chambert-Loir

Yaël Dillies (Feb 19 2025 at 15:23):

A big project that's been on Bhavik and I's radar is Szemerédi's theorem. It's of course close to my heart because it's additive combinatorics, but it has also since inspired great advances in ergodic theory, model theory, and was the motivation behind the birth of higher order Fourier analysis. There are four-five categories of proofs, and every one of them would provide huge value to mathlib and the greater community if formalised.

Michael Rothgang (Feb 19 2025 at 15:28):

Riccardo Brasca said:

Ruben Van de Velde said:

Can we state the classification yet?

Stating the classification is probably a reasonable project. Even proving that all the groups are simple is not impossible. CC Antoine Chambert-Loir

Didn't the formal abstracts project have a statement of the classification (with lots of sorries, of course)? CC @Floris van Doorn who worked on this, I believe

Pim Otte (Feb 19 2025 at 15:33):

Along another axis: A web development framework, perhaps along the lines of IHP. I don't think the time is ripe for this. For one, it depends on development of the http library in core. The feasibility also dependes on how willing one is to use the FFI at some point.

The goal here would be to create a framework with an opinionated architecture (Event Sourcing/CQRS?) that is amenable to formalization, along with automation. Then users would just have to specify a data model and implement functions, and if those functions satisfy the constraints implied by the architecture, the theorems about integrated functionality could be easily derived.

Patrick Massot (Feb 19 2025 at 15:35):

That would be great indeed. But having the analogue of python’s requests library would already be a huge improvement to applicability in ordinary programming.

Riccardo Brasca (Feb 19 2025 at 15:38):

In my opinion if we want something that in some sense will make even more news than Fermat, the only possibility is to formally prove an open "big" conjecture. This already happened in certain field of course, but we are very far from being able to that in, say, algebraic geometry or analysis.

Julian Berman (Feb 19 2025 at 15:39):

Is #general > Robertson–Seymour theorem perhaps another example?

Patrick Massot (Feb 19 2025 at 15:43):

Riccardo, I’m sure Terry will let us know when he’ll prove existence of blow-up for Navier-Stokes so that we can help him formalize it. In the mean time I think formalizing Perelman’s proof of the Poincaré conjecture would still be nice.

Chris Birkbeck (Feb 19 2025 at 15:44):

Riccardo Brasca said:

In my opinion if we want something that in some sense will make even more news than Fermat, the only possibility is to formally prove an open "big" conjecture. This already happened in certain field of course, but we are very far from being able to that in, say, algebraic geometry or analysis.

Maybe one first formalises the (statements of the) millennium prize problems

Riccardo Brasca (Feb 19 2025 at 15:45):

Chris Birkbeck said:

Riccardo Brasca said:

In my opinion if we want something that in some sense will make even more news than Fermat, the only possibility is to formally prove an open "big" conjecture. This already happened in certain field of course, but we are very far from being able to that in, say, algebraic geometry or analysis.

Maybe one first formalises the (statements of the) millennium prize problems

Riemann should be very easy, right?

Chris Birkbeck (Feb 19 2025 at 15:45):

We have Riemann in mathlib IIRC

Riccardo Brasca (Feb 19 2025 at 15:45):

Ah yes, RiemannHypothesis

Floris van Doorn (Feb 19 2025 at 15:46):

The statement of the classification of finite simple groups was indeed done in Lean 3: https://github.com/formalabstracts/formalabstracts/blob/master/src/group_theory/classification.lean (with sorry's for proofs used in the definition of many groups)

Martin Dvořák (Feb 19 2025 at 15:49):

Complexity Zoo in a unified framework.

Chris Birkbeck (Feb 19 2025 at 15:49):

BSD would be fun, can also prove at least 66% of it :P

Floris van Doorn (Feb 19 2025 at 15:50):

Patrick Massot said:

In the mean time I think formalizing Perelman’s proof of the Poincaré conjecture would still be nice.

Roughly how many preliminaries go into that proof? Is it significantly easier than proving FLT?

Jeremy Tan (Feb 19 2025 at 15:50):

Julian Berman said:

Is #general > Robertson–Seymour theorem perhaps another example?

Check out my list of 100 theorems somewhere in the mathlib issues list. It has R–S in it

Jeremy Tan (Feb 19 2025 at 15:51):

Also in my 100: the Heawood conjecture (4CT for arbitrary closed surfaces)

Ruben Van de Velde (Feb 19 2025 at 15:53):

(Has anything happened to formal abstracts in the last years? It seemed like a cool idea)

Jeremy Tan (Feb 19 2025 at 15:54):

Of course, 4CT has already been formalised in Rocq...

Jeremy Tan (Feb 19 2025 at 15:55):

#6091

suhr (Feb 19 2025 at 16:01):

My dream project is an unusual one: to make mathematics accessible for people without formal math education. There's a gap between people formally trained in math and self-taught mortals (e.g. programmers). Lean could help to bridge this gap by allowing learning by doing.

As an extra, it would be nice if programmers were more familiar with formal methods. A lot of people perceive formal methods as something PhD level hard, which is rather unjustified.

Oliver Nash (Feb 19 2025 at 16:03):

If we're really talking big then I think Perelman's Theorem / proof of the Geometrisation Conjecture might be the highest profile. I note that the Poincaré Conjecture is still the only Millennium Prize Problem for which we have a solution. Oh Patrick already said this :man_facepalming:

Kevin Buzzard (Feb 19 2025 at 16:07):

For me personally it's currently the most convincing answer.

Patrick Massot (Feb 19 2025 at 16:18):

I don’t know enough about FLT to compare. But Perelman’s proof would be very far beyond anything we’ve done so far. It would mean fantastic progress for Lean and Mathlib.

Yakov Pechersky (Feb 19 2025 at 16:30):

Verified CUDA kernels and a verified autodiff package similar to jax

Patrick Massot (Feb 19 2025 at 16:36):

Since someone mentioned that Mathlib contains a statement of the Riemann hypothesis, do we want a statement of the three-dimensional Poincaré conjecture? It would be very easy to do.

Michael Rothgang (Feb 19 2025 at 16:42):

Don't we have that already? file#Mathlib/Geometry/Manifold/PoincareConjecture, proof_wanted SimplyConnectedSpace.nonempty_diffeomorph_sphere_three?

Riccardo Brasca (Feb 19 2025 at 16:44):

Ah, we should decide how to treat those things. As a def as in Riemann and FLT or as a proof_wanted?

Riccardo Brasca (Feb 19 2025 at 16:44):

In my opinion a def seems better, so one can prove foo implies "big theorem"

Jireh Loreaux (Feb 19 2025 at 17:00):

You can have both :wink:

def BigTheorem : Prop := ...

proof_wanted BigTheorem

Shreyas Srinivas (Feb 19 2025 at 17:26):

Here's a more concrete goal I am looking to accomplish this year (and collaborators to do this): To port this repo from Isabelle to Lean: https://github.com/mabdula/Isabelle-Graph-Library

This is still ambitious but would be of interest for CS theory folks

Arthur Paulino (Feb 19 2025 at 17:53):

A featureful, robust, reliable and efficient HTTP library

Oliver Nash (Feb 19 2025 at 18:28):

I think we even considered this but it fell through the cracks during review
Jireh Loreaux said:

You can have both :wink:

A. (Feb 19 2025 at 18:44):

A headline-grabbing project would be the one that forces an LLM into unerring accuracy.

Antoine Chambert-Loir (Feb 19 2025 at 18:46):

Regarding the classification of simple groups, here are things I would like to see done in the next few years.

  • I have the impression that defining the geometric groups will require to have more linear algebra to be able to define all classical groups, over an arbitrary fields, and in their nonsplit forms.

  • Defining the sporadic groups is a question of finite permutation groups. The books on the subject should be formalizable in full, and it should not be too hard to do so.

  • Those books will prove that the relevant sporadic groups are indeed simple. What I did for the alternating group (not yet in mathlib) is to use the Iwasawa criterion, docs#MulAction.IwasawaStructure.isSimpleGroup, and this is a major tool.

  • The geometric groups are more interesting: up to elementary modifications, the special linear and symplectic groups are simple over any field. Orthogonal groups are more subtle, the isotropic ones are also simple over any field (with some counterexamples), but not the anistropic ones (the orthogonal group over a real closed field is not simple when there are infinitesimals).

  • Something related with simplicity is the classification of the maximal subgroups of these simple groups. For symmetric and alternating groups, this is the O'Nan-Scott theorem, with an important addition by Liebeck, Praeger and Saxl; for geometric groups, it's a theorem of Aschbacher. Indeed, knowing than some subgroups is maximal is a key to their simplicity.

Andrew Yang (Feb 19 2025 at 19:02):

About sporadic groups: moonshine is also a cool thing to have.

Geoffrey Irving (Feb 19 2025 at 19:07):

A. said:

A headline-grabbing project would be the one that forces an LLM into unerring accuracy.

We already have one AI safety result in this space (proving accuracy of a simple scalable oversight method) formalised in Lean (https://github.com/girving/debate) and we have an improved result that I am doing the Lean preparation for now (either Blueprint or formalisation). If someone is interested I’m happy to chat.

Bhavik Mehta (Feb 19 2025 at 20:03):

Yaël Dillies said:

A big project that's been on Bhavik and I's radar is Szemerédi's theorem.

An extension of this is the Green-Tao theorem, which is probably more famous. It uses Szemerédi, and uses it as a black box.

Adam Topaz (Feb 19 2025 at 20:38):

My vote: Milnor Bloch-Kato (I.e the Voevodsky-Rost theorem). Alternatively the proof of the Weil conjectures.

Derek Rhodes (Feb 19 2025 at 22:00):

  • I would like to see a math department step up and offer a new undergraduate degree, where as now there are typically "Applied" and "Pure", there would be a "Formalized" track, where all undergraduate courses are done with a proof assistant.
  • Documentation for best practices about how to build undergraduate libraries that wrap Mathlib that expose only foundational lemmas so learners can't exact? their way to solutions, so that there would be standard consistent interfaces for Undergrad.Algebra, Undergrad.Analysis, etc. Ideally those wrappers would be a collaboration by many educators to help lighten the load when it comes time to upgrade.
  • A tool that can port lean4 to lean5, along with documentation, projects and books without much human intervention.
  • Along the lines of what @suhr mentioned. There are many people around the world without access to universities who would like to do proof based math. There are lots of math courses on MOOCs, but none of them are proof based, yet (as far as I know).
  • More public discussion about the future of Lean, perhaps collaboration in the style of khronos, which coordinates with industrial partners who pay for membership that grants them the right to participate in steering committees. Software engineering daily had an enlightening interview with techs from the vulkan 3d graphics group about how that sausage is made.

Scott Carnahan (Feb 19 2025 at 22:02):

People have mentioned Perelman's proof of the 3d Poincaré conjecture/geometrization, but the 4-dimensional case is also quite interesting, in part because the smooth and topological questions are not equivalent. Freedman's proof of the topological question was widely seen as completely incomprehensible for about 30 years, but the "Bonn seminar" seems to have helped clarify the argument. The smooth 4d case is still open as far as I know. Smale's proof (in TOP) for dimension at least 5 seems to be much more accessible.

Patrick Massot (Feb 19 2025 at 22:12):

Of course I agree all cases of the Poincaré conjecture are great pieces of mathematics that would be amazing formalizations. But only Perelman’s story would be stunning for non-mathematicians.

Bjørn Kjos-Hanssen (Feb 19 2025 at 23:32):

Ruben Van de Velde said:

(Has anything happened to formal abstracts in the last years? It seemed like a cool idea)

I think the problem is that until you prove the theorem, you're not really sure if you've stated everything correctly.

Kim Morrison (Feb 20 2025 at 03:22):

I'm not sure anyone seriously tried to launch it... I wouldn't discount the idea.

Junyan Xu (Feb 20 2025 at 03:50):

Floris van Doorn said:

The statement of the classification of finite simple groups was indeed done in Lean 3: https://github.com/formalabstracts/formalabstracts/blob/master/src/group_theory/classification.lean (with sorry's for proofs used in the definition of many groups)

See also #general > The state of finite simple groups @ 💬

Rida Hamadani (Feb 20 2025 at 09:15):

"After 3 decades of uncertainty, the proof of the snark conjecture is finally verified" would probably be an attention grabbing headline. Bhavik talks more about it here: #general > Help me sell Lean to my professor @ 💬

Julian Külshammer (Feb 20 2025 at 09:24):

To suggest some topic not to seek attention from the general public, but to attract people from different areas of mathematics: Hodge theory. It comes with many different aspects, of course geometric, but the ideas have been used to prove famous conjectures: Elias-Williamson's proof of the Kazhdan-Lusztig conjecture in representation theory, or Adiprasito-Huh-Katz's proof of Rota's log concavity conjecture for matroids.

Ilmārs Cīrulis (Feb 20 2025 at 09:36):

My personal answer --- formal definition of Javascript (or its subset, at least) with tools to do verification of scripts in Lean. I like to do mathy stuff with it, and it would be nice if I could remove as much bugs as possible by formal verification.

Not sure if it is big, but at least it is a dream.

Chris Wong (Feb 20 2025 at 13:06):

This might not be very big, but my dream project is formalizing Modern Tetris. There's a lot of research and theory around the game (example), but it's fragmented across many documents and chat threads, and might contain mistakes. A Lean/Verso library could address both problems.

Mario Carneiro (Feb 20 2025 at 13:16):

Why would you say that formalizing Javascript is not big? There is literally a project called "Everest" which is trying to tackle something smaller than JS. Web is hard

Mario Carneiro (Feb 20 2025 at 13:17):

unless you mean just the core language semantics and not the web APIs it supplies

Chris Wong (Feb 20 2025 at 13:22):

The core language is pretty complicated too. I think it would be a significant project for someone to take on by themselves.

Jz Pan (Feb 20 2025 at 16:38):

Antoine Chambert-Loir said:

What I did for the alternating group (not yet in mathlib) is to use the Iwasawa criterion, docs#MulAction.IwasawaStructure.isSimpleGroup, and this is a major tool.

So you have the proof that A_n is a simple group if and only if n = 3 or n >= 5 in Lean but is not PR in mathlib yet?

Antoine Chambert-Loir (Feb 20 2025 at 16:52):

Yes, in my Jordan project, but I need to merge mathlib and make it compile again.

Ruben Van de Velde (Feb 20 2025 at 16:53):

Are you interested in help with that?

Jz Pan (Feb 20 2025 at 16:59):

Ruben Van de Velde said:

Are you interested in help with that?

Maybe a few days later, I think that shouldn't be hard, as the typical proof in textbook is about half a page.

Jz Pan (Feb 20 2025 at 17:00):

Oh I mean extract the proof that A_n is simple...

Antoine Chambert-Loir (Feb 20 2025 at 17:09):

Actually, I have two proofs. One was in lean3 and is the textbook proof. The other one proves much more, hence is much more complicated, and is more or less new.

Antoine Chambert-Loir (Feb 20 2025 at 17:11):

https://arxiv.org/abs/2303.12404

Tomas Skrivan (Feb 20 2025 at 18:05):

Yakov Pechersky said:

Verified CUDA kernels and a verified autodiff package similar to jax

This is one of the goals of SciLean, well the verified part will come only after it is sufficiently fast and user friendly. If anyone wants to help towards this goal it would be nice to add cuBLAS bindings to LeanBLAS.

For writing kernels, I have a tactic that can take a lean expression like fun (x : ℝ) => x + x * exp x and produce e : CExpr

inductive CExpr where
  | bvar (n : Nat) | add (x y : CExpr) | mul (x y : CExpr) | neg (x : CExpr) | exp (x : CExpr) | ...

together with a proof that interpretation of e back to ℝ → ℝ produces the original expression fun (x : ℝ) => x + x * exp x. We can then write fast functions like map or reduce using CExpr, probably by utilizing and extending lean-alloy to generate corresponding shims from CExpr. Feel free to reach out if you are interested in working on this.

Luigi Massacci (Feb 21 2025 at 10:21):

While not quite in the same league of FLT or Poincaré, o-minimality of Ran,exp\mathbb{R}_{an, exp} would be cool to have. It's a fairly landmark result in applied model theory, and o-minimality has garnered even more interest recently.

Ruben Van de Velde (Feb 21 2025 at 16:39):

Antoine Chambert-Loir said:

Yes, in my Jordan project, but I need to merge mathlib and make it compile again.

I didn't manage to stop myself from updating your code to latest stable: https://github.com/AntoineChambert-Loir/Jordan4/pull/1 :)

Ilmārs Cīrulis (Feb 21 2025 at 20:17):

Mario Carneiro said:

Why would you say that formalizing Javascript is not big? There is literally a project called "Everest" which is trying to tackle something smaller than JS. Web is hard

I just don't know if it is big enough compared to another dreams mentioned here. And it's done in Coq, I believe, so maybe porting it is an option. But there are no tools, as far as I know, like Verified Software Toolchain for C language in Coq, so it is, maybe, big in the end.
It also becomes not so big, if only subset of it is formalized, too.

Basically, I just don't know, is it big or not. Sorry. :sweat_smile:

Ilmārs Cīrulis (Feb 21 2025 at 20:18):

Ah, yes, I meant the language itself, without many additional things that's connected to it in the modern Web.

Tyler Josephson ⚛️ (Feb 22 2025 at 02:43):

Markov Chain Monte Carlo, and the Metropolis-Hastings Algorithm, in particular. The basis of all simulations involving random numbers, from physics to chemistry to biology to Wall Street to Bayesian inference. Imagine if we could generate samples from a distribution, prove correctness of the algorithm generating the samples (conditions like detailed balance), and prove things about the distribution from which the samples are being drawn.

Andrés Goens (Feb 22 2025 at 06:55):

Just dreaming of a big project too, I would love to see Lean have the first bootstraped (self-hosted) verified compiler: usually it's a big milestone of a compiler when it is self-hosting, i.e. when it can compile itself. No compiler has ever been self-verifiying, i.e. it can check the proof of its own correctness.

This is different from, say Coq-in-Coq or Lean4Lean in that it's about the compiler, i.e. going from Lean to C or LLVM in the case of Lean, not the semantics of the core language. Of course you need the semantics of Lean to prove that they're preserved, so it builds up on this.

I think Lean has a nice advantage in that it's a small core language and most of it is defined in itself, so having a self-verified compiler is not impsosible. The main challenge would be of course that this kind of big project would have to either re-build/fork a lot of core in a separate project, or coordinate well with core/the FRO, but at this stage I'm just dreaming :)

Eric Taucher (Feb 22 2025 at 12:05):

@Kevin Buzzard

While this is not one one about creating a specific proof using Lean, it is something that Kevin has noted from time to time and would help many here.

The ability to convert a proof or algorithms in PDFs into Lean.

Siddhartha Gadgil (Feb 22 2025 at 12:36):

Patrick Massot said:

There are not so many theorems comparable to FLT in terms of public recognition. The obvious candidate in pure algebra is the classification of finite simple groups. I know that very few people here (probably nobody?) feel excited about finite group theory, but we cannot avoid discussing this topic from time to time. In geometry and topology, I think the only comparable result is Perelman’s proof of the Poincaré conjecture.

Does Lean have exotic spheres? How about exotic $R^4$s? The latter may be striking and not so hard given fully combinatorial versions of "Heegaard Floer theory" and friends. Closely related is the Milnor conjecture on unknotting number - may not be well known but not hard to explain.

Patrick Massot (Feb 22 2025 at 18:11):

Of course we have nothing like that.

Patrick Massot (Feb 22 2025 at 18:14):

And I don’t think you can get anywhere using only combinatorics. You can compute Heegaard-Floer using combinatorial methods, but then you need to relate the result to actual information.

Alex Meiburg (Feb 24 2025 at 18:55):

I think a proof of MIP*=RE would be awesome. Especially because some people still cast some doubt on it; putting it on firm footing would be great. It's a fairly deep proof. It doesn't have a lot of "consequences" for math in the particular sense that it mostly disproved some other nice conjectures, so it's not something that will be used in later proofs. Of course it had consequences for research and the state of math knowledge, but it's not something that will likely be needed for formalizing other theorems.

As a more tractable problem, I think a proof of Onsager's solution to the 2D Ising model would be a very nice mathematical result + derivation of a phase transition. This is something I vaguely hope to attempt sometime in the future...

Winston Yin (尹維晨) (Feb 25 2025 at 06:57):

I'd like to see foundational theories from a standard undergraduate physics curriculum formalised, to reproduce the success of the Xena Project. This includes Newtonian mechanics (free-body diagrams and laws of motion), classical electromagnetism (Maxwell's equations), thermodynamics and statistical mechanics, axiomatised quantum mechanics, and more.

To get some attention from working theoretical physicists, I dream of: the formalisation of Einstein's field equations, a few of its famous exact solutions (various black hole solutions in asymptotically Minkowski / dS / AdS space, FRW cosmology), and a few famous approximate solutions (gravitational waves). On the high energy physics side (@Joseph Tooby-Smith ), a geometric statement of the Yang-Mills Lagrangian and conformal field theory.

Geoffrey Irving (Feb 26 2025 at 08:06):

Winston Yin (尹維晨) said:

I'd like to see foundational theories from a standard undergraduate physics curriculum formalised, to reproduce the success of the Xena Project. This includes Newtonian mechanics (free-body diagrams and laws of motion), classical electromagnetism (Maxwell's equations), thermodynamics and statistical mechanics, axiomatised quantum mechanics, and more.

To get some attention from working theoretical physicists, I dream of: the formalisation of Einstein's field equations, a few of its famous exact solutions (various black hole solutions in asymptotically Minkowski / dS / AdS space, FRW cosmology), and a few famous approximate solutions (gravitational waves). On the high energy physics side (Joseph Tooby-Smith ), a geometric statement of the Yang-Mills Lagrangian and conformal field theory.

How about the Penrose-Hawking singularity theorems? :)

Notification Bot (Apr 15 2025 at 14:50):

Alex Meiburg has marked this topic as resolved.

Notification Bot (Apr 15 2025 at 14:54):

Alex Meiburg has marked this topic as unresolved.

Alex Meiburg (Apr 15 2025 at 14:54):

(oops)

Timothy Chow (Oct 10 2025 at 03:47):

Creating formal abstracts for the six remaining Millennium Prize problems could be good as far as publicity is concerned. If that project is carried out in partnership with the Clay Foundation, and the Clay Foundation makes an announcement when the project is finished, then I suspect that will grab some public attention for the simple reason that "money talks."

I believe that Tom Hales is, at present, not actively working on the formal abstracts project, so creating formal abstracts for the Millennium Prizes seems to be up for grabs. A couple of them are easy to do (and maybe have already been done), but others are more challenging, and making formal abstracts for them could drive improvements in mathlib that might not otherwise happen.

Such formal abstracts might also assist with filtering out cranks. If there is an official formal abstract, then that will block spurious claims by cranks that they have a Lean proof of one of the prize problems. As matters currently stand, the only thing the Clay Foundation can say in response to such a crank claim is, "You formalized the wrong theorem," which is not as satisfactory as, "You failed to prove the official Lean version of the problem."

Lawrence Wu (llllvvuu) (Oct 10 2025 at 05:18):

The Millenium problems are esteemed enough that statements would probably go directly into Mathlib. e.g. docs#RiemannHypothesis is in mathlib. My understanding is that several of them are very hard to state though.

Yan Yablonovskiy 🇺🇦 (Oct 10 2025 at 05:36):

Timothy Chow said:

Creating formal abstracts for the six remaining Millennium Prize problems could be good as far as publicity is concerned. If that project is carried out in partnership with the Clay Foundation, and the Clay Foundation makes an announcement when the project is finished, then I suspect that will grab some public attention for the simple reason that "money talks."

I believe that Tom Hales is, at present, not actively working on the formal abstracts project, so creating formal abstracts for the Millennium Prizes seems to be up for grabs. A couple of them are easy to do (and maybe have already been done), but others are more challenging, and making formal abstracts for them could drive improvements in mathlib that might not otherwise happen.

Such formal abstracts might also assist with filtering out cranks. If there is an official formal abstract, then that will block spurious claims by cranks that they have a Lean proof of one of the prize problems. As matters currently stand, the only thing the Clay Foundation can say in response to such a crank claim is, "You formalized the wrong theorem," which is not as satisfactory as, "You failed to prove the official Lean version of the problem."

You might be interested in https://github.com/lean-dojo/LeanMillenniumPrizeProblems

Kevin Buzzard (Oct 10 2025 at 06:41):

Speaking as a moderator of this forum I am highly skeptical that this or indeed anything would help filter out cranks. There are plenty of people posting code which doesn't compile which "proves" RH; the point is that if the LLM says it compiles then the crank doesn't even need to figure out what lean is or how to run a lean file, they just post the slop code anyway. If anything, formalisations of the six formalisable problems (it's not clear that Yang-Mills is mathematically well-defined as a formal conjecture) might just increase interest in slop proofs! Of course the benefits will outweigh the problems here but I'm just pointing out that I don't think it will help solve any crank issues.

Timothy Chow (Oct 10 2025 at 13:05):

Kevin Buzzard is probably right about cranks. I guess I was thinking about cases where the person isn't a total crank but where there are serious doubts about the correctness of the argument (think Randall Holmes, Louis de Branges, Wu-Yi Hsiang, Shinichi Mochizuki, ...).

I have in the past heard rumors that it's not completely clear what it would take to win the Yang-Mills prize, but it surprises me to hear that there may not be any formalizable statement extractable from the natural-language statement. I was under the impression that what was ambiguous was precisely which of several different possible formal conjectures was intended. Is there somewhere I can read more about this alleged non-formalizability of the Yang-Mills problem?

Kevin Buzzard (Oct 10 2025 at 13:18):

Here is an old discussion about formally stating Yang-Mills, initiated by Tom Hales. I absolutely agree with you about formalisation benefitting non-cranks and benefitting the area in general.

Timothy Chow (Oct 11 2025 at 00:06):

Hmmm...that's surprising and unfortunate. I doubt the Clay Foundation would ever make an announcement about formal abstracts for the Millennium Prize Problems if one of them doesn't have a formal abstract.

Sina Hazratpour (Oct 11 2025 at 17:31):

PDEs

Timothy Chow (Oct 26 2025 at 02:35):

I emailed Arnold Neumaier, who was collaborating with Tom Hales on formalizing the Yang-Mills problem back when Hales was actively working on that project. He said that he's not really interested in learning the technicalities of Lean, but he did say this: "It takes someone with a degree in theoretical physics to do the job. I would be ready to give advice to anyone trying who has the relevant background; my estimate for the time needed is 6 months."

Snir Broshi (Oct 26 2025 at 04:56):

  • Robertson-Seymour theorem (graph minors)
  • Weak Goldbach conjecture
  • Tao's result on Collatz
  • Rubik's cube "God's number" (diameter of the group) is 20
  • Jordan curve theorem
  • Four color theorem
  • Odd order theorem (FWIW mark me down as someone who's interested in finite groups)

Timothy Chow (Oct 27 2025 at 02:05):

The Rubik's cube example illustrates something I've wondered about since at least 2014, when I posted some musings to the Foundations of Mathematics mailing list. Namely, how should mathlib or other libraries of formalized mathematics deal with highly computational proofs?

Verifying that God's number is 20 reportedly took about 35 CPU years. There are other results in recreational mathematics that took a massive amount of computation (e.g., "checkers is a draw" and "Othello is a draw"), not to mention results in combinatorics with certificates in petabyte territory. In some of these cases, formal verification might be nice. AFAIK the proofs that checkers is a draw and Othello is a draw are not publicly available in any form (in the case of checkers, there used to be a website where you could interact with the perfect player, but last time I checked, the website was no longer working). A fairly recent effort to confirm the nonexistence of a projective plane of order 10 uncovered some mistakes in earlier work.

Two questions naturally arise. The first question is whether a formal proof of these results is even feasible. How much of a performance hit does formalization introduce? The second question is, even if a formal proof is feasible, one doesn't want to run a gigantic computation every time the library is loaded, so how does one manage the situation? Maybe there needs to be a version of "sorry" that indicates that a proof is available but won't be run unless you insist, because it will require vast computing resources? And for the person who does insist, should petabytes of storage be set aside for an exported formal proof, or will it be better to implement a low-memory version that takes longer to run?

Yoh Tanimoto (Oct 27 2025 at 11:54):

About Yang-Mills, it is not an easy task even to state the problem formally, for a combination of reasons.

  • The official description of the problem asks to "prove that ...a non-trivial quantum Yang-Mills theory exists..." either in the Wightman axioms or the Osterwalder-Schrader axioms. However, these axioms are actually not for pointlike fields in gauge theories such as the quantum Yang-Mills theories. Instead, one would have to consider extended objects callled Wilson loops, as in Seiler's book. As one can see, Seiler's axioms in Section 8 (an analogue of the OS axioms for loops) look rather ad hoc, and there isn't a definite version of "OS axioms for Wilson loops" as far as I know. See the comments in Section 3.7 of Summers' survey.
  • Even if we had a fixed axiomatic setting, to "prove that ... a non-trivial quantum Yang-Mills theory exists" is not yet a well-defined problem, as long as we don't have a characterization of the quantum YM theory. In practice, people use different specific techniques to obtain partial results. But the problem doesn't refer to such techniques, so it is not, at least formally, stated what constitutes a Yang-Mills theory.
  • Because different people use different techniques, how plausible the "technical conditions" in the axioms are differs from one to other.

Now, the techniques which are widely considered most plausible are the ones by Balaban, cited in the problem description. I think it is possible just to fix some technical conditions in Seiler's axioms and to state formally that the construction using Balaban's techniques gives objects satisfying Seiler's axioms, and that would be one formalizable statement. But that would be a very long way, and would exclude other approaches.

Yet, apart from defining the Millennium problem, I think it's very important just to state the OS axioms (also for Wilson loops) and Balaban's approach.

(I'm not sure about asking someone with a degree in theoretical physics. This is a problem in mathematical physics, and requires a firm mathematical background even to state the traditional OS axioms)

Snir Broshi (Oct 27 2025 at 13:23):

Timothy Chow said:

even if a formal proof is feasible, one doesn't want to run a gigantic computation every time the library is loaded, so how does one manage the situation?

btw these also require gigantic computation:

  • Weak Goldbach conjecture
  • Four color theorem
  • All Sudokus with less than 17 givens have no unique solution

Notably the four-color theorem in Rocq takes a while to build, but their trick is to parallelize (by splitting into multiple independent files which the main file imports, they have a 2 level hierarchy) and to be a separate library, so that they don't hurt the build times of the main math library.
idk how big are .oleans of huge proofs, but being able to download them to save 35 CPU years is cool.

Here's a similar computation-heavy proof in Lean.

For the Rubik's cube, my hope is that 35 CPU years in 2010 has dropped and will take a feasible time.
Also, they only needed enough formal arguments to make the computation feasible, so it might be possible to bring the time down further, especially given fancier group theory results. Emphasis on might :upside_down:

Snir Broshi (Oct 27 2025 at 13:34):

There's a Matt Parker video about a proof with a computation time of 3 hours (paper), which is both not that long and the computation results in 1.5 MiB of data so it probably doesn't even have to be rerun (although we do need to formally verify the data which might take computation time).

This might be a cool weekend project.

Roman Bacik (Oct 28 2025 at 01:58):

Kevin Buzzard said:

Speaking as a moderator of this forum I am highly skeptical that this or indeed anything would help filter out cranks. There are plenty of people posting code which doesn't compile which "proves" RH; the point is that if the LLM says it compiles then the crank doesn't even need to figure out what lean is or how to run a lean file, they just post the slop code anyway. If anything, formalisations of the six formalisable problems (it's not clear that Yang-Mills is mathematically well-defined as a formal conjecture) might just increase interest in slop proofs! Of course the benefits will outweigh the problems here but I'm just pointing out that I don't think it will help solve any crank issues.

One may be able to solve crank issues by running a service, which would authenticate valid proofs by signing the hash of the proof with priv key. Then unauthenticated posts with missing validation or not matching hash would get automatically rejected. By having authenticated proof, which can be verified using pub key, one could not only fix the forum but also use it when submitting to journals and arxiv so people know they can trust the results.

Timothy Chow (Oct 28 2025 at 03:44):

I emailed Arnold Neumaier and he basically agrees with Yoh Tanimoto. In my opinion, it would still be valuable to have one formal statement whose proof would be sufficient to win the prize, even if it's not necessary.

Neumaier also clarified that he meant that his collaborator, who would be doing the actual formalization, would need at least a degree in theoretical physics. Mathematical physics would probably be better, but would be harder to find, and Neumaier thinks he can probably explain the necessary mathematical physics to someone who is proficient both in Lean and in theoretical physics.

Timothy Chow (Oct 28 2025 at 03:47):

On the topic of Rubik's cube and computational proofs, I don't think that CPU clock speeds have gotten much faster since 2010. That doesn't mean the 35 CPU years can't be reduced, but I would expect that the performance improvements would come from algorithmic improvements and/or finding a way to exploit GPUs to do the heavy lifting.

Kevin Buzzard (Oct 28 2025 at 08:34):

Roman Bacik said:

One may be able to solve crank issues by [doing something which requires some understanding]

I'm afraid you're not seeing the junk posts which moderators are deleting which show absolutely no understanding or ability to run any code and just instead comprise of people posting a lean file and saying "the model says that this is a Lean proof of the Riemann Hypothesis so that's the end of it" and making no attempt to, or even knowing how to, run it themselves. There is no technological solution to this.

suhr (Oct 28 2025 at 09:02):

In a way, "it does not even compile" is already the best technological solution you can get. Everything else cannot be automated reliably.

One could hope some people will get over "it doesn't even compile" stage, but this basically does not happen. Cranks are really all the same.

Vasilii Nesterov (Oct 28 2025 at 10:00):

On Rubik's Cube, I formalized a recent result on the Hadwiger–Nelson problem, which essentially reduces graph coloring to SAT and verifies the SAT solver's certificate. I think it's impossible to verify such large computations using Lean's kernel directly. We can only verify an efficient checker and then run it (thereby adding Lean's compiler to the trusted code base).

Alok Singh (Oct 28 2025 at 18:30):

Snir Broshi said:

  • Robertson-Seymour theorem (graph minors)
  • Weak Goldbach conjecture
  • Tao's result on Collatz
  • Rubik's cube "God's number" (diameter of the group) is 20
  • Jordan curve theorem
  • Four color theorem
  • Odd order theorem (FWIW mark me down as someone who's interested in finite groups)

One of my dreams is Jordan curve by nonstandard analysis https://arxiv.org/abs/math/9608204

Yoh Tanimoto (Oct 28 2025 at 18:41):

I agree that it would be nice to have one formal statement. Actually I thought about stating the OS axioms (standard, for pointlike fields), but the definition of Schwartz distrubutions is missing in mathlib. So I think it's still a long way.

On the other hand, as for a characterization of the YM theory, the problem description seems to ask that the correlation functions should agree asymptotically with those in the perturbation theory. But that would be a huge statement, because one would first have to calculate the all-order perturbation theory, fully formalizing something like this... or does Neumaier have something else in mind?

Moritz Doll (Oct 29 2025 at 01:04):

Yoh Tanimoto said:

I agree that it would be nice to have one formal statement. Actually I thought about stating the OS axioms (standard, for pointlike fields), but the definition of Schwartz distrubutions is missing in mathlib. So I think it's still a long way.

On the other hand, as for a characterization of the YM theory, the problem description seems to ask that the correlation functions should agree asymptotically with those in the perturbation theory. But that would be a huge statement, because one would first have to calculate the all-order perturbation theory, fully formalizing something like this... or does Neumaier have something else in mind?

There is a wip PR for tempered distributions, and I am working on getting everything review-ready soon.

Moritz Doll (Oct 29 2025 at 01:04):

As for general distributions, we are not super far away either

Michael Rothgang (Oct 29 2025 at 07:13):

Moritz Doll said:

As for general distributions, we are not super far away either

The definition exists on a branch already, it's getting reviewed now. The first one is awaiting-author, I'll ask them.

Moritz Doll (Oct 29 2025 at 07:16):

Oh really? Do you have a link? I couldn't find it

Michael Rothgang (Oct 29 2025 at 08:34):

#30309 is the current supremum of mathlib PRs, but there is further work in a branch. Let me find it...

Michael Rothgang (Oct 29 2025 at 08:36):

There's also #30329

Michael Rothgang (Oct 29 2025 at 08:43):

This implicitly contains the space of distributions (look at the very end), but AFAICT no API for it yet.

Nikolas Kuhn (Dec 17 2025 at 20:46):

The proof of Theorem 6.8 of this paper would be very nice (if true):
https://arxiv.org/pdf/2508.05105
Quanta article
The statement shouldn't be too hard to formalize: A "very general" cubic hypersurface in C^5 is irrational. (I.e. there exists a countable union of hypersurfaces in the space of polynomials outside of which this is true).

This would definitely impress any algebraic geometers out there :)

Kevin Buzzard (Dec 17 2025 at 20:57):

See also #mathlib4 > New formalization target @ 💬 . I agree it would be an amazing goal!

Mirek Olšák (Dec 18 2025 at 13:34):

Alok Singh said:

One of my dreams is Jordan curve by nonstandard analysis https://arxiv.org/abs/math/9608204

That is related that I would like to see "logical quotation" in Lean, which is probably also a relatively big project (not as much as FLT though). Wrapping a Lean expression into lq(...) instead of q(...) which would capture all the metadata to make sense of the expression, and by that it could be possible to do logic on top of it. So for example,

theorem independent_CH : IndependentOnLean lq(
  Nonempty (  (Ordinal.omega.{0} 1).ToType)) := sorry

would be a valid provable statement. I know people have done CH in a ZFC model in Lean but I find it long-term impractical to be manually rebuilding mathematics again inside a model.

Mirek Olšák (Dec 18 2025 at 14:34):

Chris Wong said:

This might not be very big, but my dream project is formalizing Modern Tetris. There's a lot of research and theory around the game (example), but it's fragmented across many documents and chat threads, and might contain mistakes. A Lean/Verso library could address both problems.

Would be also nice to implement PuzzleScript, and formally reason about its games.

Violeta Hernández (Dec 18 2025 at 15:03):

Tangential remark, but the easiest way to state CH in Lean is surely:

import Mathlib.SetTheory.Ordinal.Aleph

open Cardinal

def CH : Prop := ℵ_ 1 = ℶ_ 1

Violeta Hernández (Dec 18 2025 at 16:04):

(Perhaps it'd be a good idea to make ContinuumHypothesis, ContinuumHypothesisFor, and GeneralizedContinuumHypothesis, to at least have a canonical way to spell this out)

Timothy Chow (Dec 19 2025 at 22:38):

I'm not familiar with the nuances of how set theory is formalized in Lean, but it might be worth pointing out that in ZF (without the axiom of choice), there are a couple of different candidates for "the continuum hypothesis," which are all provably equivalent in ZFC but not equivalent in ZF. So if "the continuum hypothesis" is going to be enshrined in the library, it may be worth thinking about these nuances first.

Aaron Liu (Dec 19 2025 at 22:49):

well we have choice so

James E Hanson (Dec 19 2025 at 23:20):

Timothy Chow said:

I'm not familiar with the nuances of how set theory is formalized in Lean, but it might be worth pointing out that in ZF (without the axiom of choice), there are a couple of different candidates for "the continuum hypothesis," which are all provably equivalent in ZFC but not equivalent in ZF. So if "the continuum hypothesis" is going to be enshrined in the library, it may be worth thinking about these nuances first.

I feel like that ship sailed around the time Mathlib used the axiom of choice to define π\pi:

protected noncomputable def pi :  :=
  2 * Classical.choose exists_cos_eq_zero

Snir Broshi (Dec 20 2025 at 01:55):

James E Hanson said:

I feel like that ship sailed around the time Mathlib used the axiom of choice to define π\pi

October 2018, in case anyone is curious like I was

Kevin Buzzard (Dec 20 2025 at 01:57):

Yes it was part of my 50th birthday PR

Ruben Van de Velde (Dec 20 2025 at 07:33):

Who doesn't like pi for their birthday?

Geoffrey Irving (Dec 20 2025 at 09:20):

^ Choice is a much better birthday present than pie. :)


Last updated: Dec 20 2025 at 21:32 UTC