Zulip Chat Archive

Stream: general

Topic: Numina 3mil XTX grant to make 1 million formal problems

Jason Rute (Dec 17 2024 at 14:45):

Project Numina (a nonprofit which won the recent AIMO progress prize) was awarded a 3 million euro grant.

Numina is embarking on a groundbreaking initiative to collect and publicly release a comprehensive database of up to one million formal mathematical problems and proofs

Jason Rute (Dec 17 2024 at 14:46):

From this tweet thread, I think they are targeting Lean. https://x.com/jiali52524397/status/1868990508749701462?s=46

Jason Rute (Dec 17 2024 at 14:47):

@Jia Li said:

We pray everyday for mathlib to be backward compatible for the next release and try to get things up to date as we can. But it is a huge challenge :)

Joseph Myers (Dec 18 2024 at 00:10):

Some questions about this:

Given the number of problems involved, is it intended that most of the formalization of both statements and proofs will be autoformalization (with human review of correct correspondence of formal statements to informal ones being only on the statistical level)?
Will the project involve contribution to mathlib of any new lemmas or definitions relevant to making better formal statements or proofs of the problems, or if it's mainly autoformalization will that be left for potential later data mining?
What conventions will be used in the formal statements, especially regarding (a) the use of separate auxiliary definitions referenced in a problem statement (I feel various of the AlphaProof statements ended up as rather unidiomatic Lean through forcing all such definitions from the informal statements into hypotheses rather than allowing separate definitions before the main theorem) and (b) how "determine" problems are stated?

Joseph Myers (Dec 18 2024 at 00:12):

If there is anything mathlib-appropriate, then, as for any Lean formalization project, I strongly encourage contributing new material as you go along, as soon as a single new lemma, or definition + API, is ready - not waiting until later and accumulating a backlog that may conflict with or duplicate other work or need major changes from review affecting all the uses you've made of it in proofs.

Jason Rute (Dec 18 2024 at 00:16):

My opinion (which I’ve stated elsewhere) is that even a large library of autoformalized mathematics which is quite messy compared to Mathlib would have tremendous value even if it is was filled with duplicate definitions, unidiomatic proofs, multiple versions of the same theorem (either intentionally or unintentionally). I don’t know if this library is just competition-level math or “real” math, but if it is more ambitious in its coverage even better. Of course it could get put into mathlib later after being cleaned, but I think just the process of trying large scale autoformalization of real math for the first time is value alone.

Kim Morrison (Dec 18 2024 at 00:18):

Jason Rute said:

Jia Li said:

We pray everyday for mathlib to be backward compatible for the next release and try to get things up to date as we can. But it is a huge challenge :)

@Jia Li , I'd be interested to hear your perspective on what could make this challenge easier. Obviously the firehose of Mathlib changes is not going to slow down. Hopefully the changes in the language and basic tactics will slow down over the next year or two...

Hopefully deprecations help, but what else would be useful? Unfortunately there doesn't appear to be much availability of developer time to work on migration tools, so it would need good to have suggestions about what to prioritise.

Joseph Myers (Dec 18 2024 at 00:19):

There is a certain tension here between building mathlib as a highly general and integrated library, and building very large amounts of AI training material. When focused on building mathlib, it's natural to lean into the extra work of building APIs in proper generality every time you discover an API gap in mathlib, always trying to add things to mathlib whenever there is a reasonable argument there is something appropriate to add there. When you want bulk training material (and AIs need much more training material at present than humans do), trying to get everything fully general and in mathlib may slow things down a lot.

Joseph Myers (Dec 18 2024 at 00:22):

I mentioned data mining as something that might be done with large amounts of autoformalized mathematics - develop tools to find places where some part of a proof can be extracted as a more generally useful lemma missing from mathlib (probably AI-assisted tools, since "more generally useful" is a rather fuzzy concept but one you might be able to train a machine learning model to recognize).

Jason Rute (Dec 18 2024 at 00:23):

Maybe think of it as Wikipedia (organized, methodical, slow growing) vs Google (a messy dump of all available information organized only by machine learning algorithms). They are compatible and both useful.

Jason Rute (Dec 18 2024 at 00:23):

Obviously Mathlib is Wikipedia in this analogy.

Joseph Myers (Dec 18 2024 at 00:23):

More traditional tools for cleaning up proofs should also have a lot of scope with such a dump of messy material (even if the data as released starts off more readable than AlphaProof output is).

Jason Rute (Dec 18 2024 at 00:24):

(By the way I have no idea what Numina is building or why, so this is just speculation, but I hope it is more than just training data.)

Joseph Myers (Dec 18 2024 at 00:34):

(I've recently started writing up my thoughts on what uniform formalization conventions for competition problem statements should look like, but that's very much oriented to the mathlib end of things not to the huge messy pile of autoformalized statements end of things. I also now have rough notes on what definitions I think should be added to mathlib - in appropriate generality and with corresponding API - for clean and idiomatic statements of all 392 past IMO problems, again oriented to the mathlib end of things on the basis of always erring on the side of adding a definition for any standard concept for which some API makes sense, rather than trying to get things stated quickly using just existing mathlib definitions.)

Yaël Dillies (Dec 18 2024 at 15:17):

Kim Morrison said:

Jia Li said:

We pray everyday for mathlib to be backward compatible for the next release and try to get things up to date as we can. But it is a huge challenge :)

Jia Li , I'd be interested to hear your perspective on what could make this challenge easier. Obviously the firehose of Mathlib changes is not going to slow down.

Actually, I have noticed that the basic algebra/order theory files I have refactored many times over do tend to stabilize now. It has taken a while, but we seem to have reached an optimal API in those parts of mathlib

Yaël Dillies (Dec 18 2024 at 15:20):

So my claim is that the firehose of mathlib changes is not going to slow down, but that it will move further down the import tree as time passes

Jia Li (Dec 20 2024 at 05:46):

Hi everyone, I am Jia. Thank you for your messages ! Would love to discuss with some of you to see how to solve these challenges :)
I am of course a bit joking with my complaint on twitter, i do realize how difficult it is to maintain such a huge library. To facilitate our work, we have fixed our version of lean to 4.11 and waiting for a good moment to update it. However, here are some recurrent requests from our contributors who are helping us formalizing some olympiads problems:

my code compiles on live lean but not on your platform (4.11)
can numina update the lean 4 version because i cannot find the newest theorem (yesterday it was pi is irrational), then we say we are not ready yet because update will break 10% of our existing formalised data
can numina put all the frequently used theorem on a fork of mathlib ? We are thinking about it but apparently we do not have the bandwidth and know how to do it.

@Yaël Dillies we are also trying to build a combinatorics lib to help formalising combinatorics in olympiads, would love to discuss with you.

Really thank you for your quick feedbacks. Sorry for not being very reactive. I would love to discuss with some of you

Jia Li (Dec 20 2024 at 05:51):

Kim Morrison said:

Jason Rute said:

Jia Li said:

We pray everyday for mathlib to be backward compatible for the next release and try to get things up to date as we can. But it is a huge challenge :)

Jia Li , I'd be interested to hear your perspective on what could make this challenge easier. Obviously the firehose of Mathlib changes is not going to slow down. Hopefully the changes in the language and basic tactics will slow down over the next year or two...

Hopefully deprecations help, but what else would be useful? Unfortunately there doesn't appear to be much availability of developer time to work on migration tools, so it would need good to have suggestions about what to prioritise.

currently our "approche" is to freeze the version then every x month, update mathlib togather with fixing the proofs

Jia Li (Dec 20 2024 at 05:58):

Jason Rute said:

My opinion (which I’ve stated elsewhere) is that even a large library of autoformalized mathematics which is quite messy compared to Mathlib would have tremendous value even if it is was filled with duplicate definitions, unidiomatic proofs, multiple versions of the same theorem (either intentionally or unintentionally). I don’t know if this library is just competition-level math or “real” math, but if it is more ambitious in its coverage even better. Of course it could get put into mathlib later after being cleaned, but I think just the process of trying large scale autoformalization of real math for the first time is value alone.

This is what we have in mind currently. There are definitively a lot of lemmas worth putting into the mathlib. Someone mentioned he had to copy paste Cauchy's functional equation because it is not in mathlib.

Unfortunately, we have limited ressources to manage data to train llm as well as trying to contribute properly to mathlib. I would love to discuss with you guys to see if we manage to find a simple solution

Kim Morrison (Dec 20 2024 at 06:21):

Jia Li said:

my code compiles on live lean but not on your platform (4.11)

For this bullet point, perhaps we could just support old versions on live.lean-lang.org. @Joachim Breitner does that sound plausible?

Kim Morrison (Dec 20 2024 at 06:22):

Jia Li said:

currently our "approche" is to freeze the version then every x month, update mathlib togather with fixing the proofs

Do you minimize imports, or do all your files effectively have import Mathlib? (This will affect whether theorems moving to different files causes problems for you.)

Kim Morrison (Dec 20 2024 at 06:24):

@Jia Li, I'm guess you haven't yet done a version bump (and in particular x is unknown?).

Even if x > 1 for the sake of maintaining context, I recomment that when you do bump you try to do it progressively, i.e. updating Mathlib to each intermediate tag of the form v4.14.0-rc1 or v4.14.0. This will separate out the changes in Mathlib vs the changes in Lean into batches for you. It would be great to hear any report you have of the relative difficulty of these.

Jia Li (Dec 20 2024 at 06:54):

Kim Morrison said:

Jia Li, I'm guess you haven't yet done a version bump (and in particular x is unknown?).

Even if x > 1 for the sake of maintaining context, I recomment that when you do bump you try to do it progressively, i.e. updating Mathlib to each intermediate tag of the form v4.14.0-rc1 or v4.14.0. This will separate out the changes in Mathlib vs the changes in Lean into batches for you. It would be great to hear any report you have of the relative difficulty of these.

Great idea ! We will try this idea during our first migration next week (we keep saying this since 1 month, but this time it is true). We need to train an llm to do this for us
For the import, for some reason, most of our contributors just do

import Mathlib
import Aesop
set_option maxHeartbeats 0
open BigOperators Real Nat Topology Rat

Johan Commelin (Dec 20 2024 at 07:03):

That will definitely give you a working environment with all the tools (-;

Joachim Breitner (Dec 20 2024 at 09:15):

Kim Morrison said:

Jia Li said:

my code compiles on live lean but not on your platform (4.11)

For this bullet point, perhaps we could just support old versions on live.lean-lang.org. Joachim Breitner does that sound plausible?

Very plausible. I am not sure if it is worth having all stable versions, which will inflate the version selector a lot, but a handful of versions upon manual request is doable. Is 4.11 such a “version of interest”?

Eric Wieser (Dec 20 2024 at 10:05):

It would certainly be helpful to have at least the last 2 or so major versions in the web editor, as that can make it easy to create mwes when bumping from stable-n to stable

Joachim Breitner (Dec 20 2024 at 10:07):

Ok. This would either require manual work on live.lean-lang.org after releases, or a tag/branch on the mathlib side called, say, oldstable and oldoldstable that’s updated as part of the release process. But probably manual is better, because it means that the drop-down can have a good name.

Eric Wieser (Dec 20 2024 at 10:43):

I would imagine that live.lean-lang.org could do some git fu to find out the numeric tag attached to the stable branch

Eric Wieser (Dec 20 2024 at 10:43):

Or just query the last n such tags

Joseph Myers (Dec 20 2024 at 13:00):

Jia Li said:

Yaël Dillies we are also trying to build a combinatorics lib to help formalising combinatorics in olympiads, would love to discuss with you.

I think "a combinatorics lib" isn't really the right thing. Rather, each combinatorics problem is likely to need different things and show up API gaps in different areas of mathlib. (Olympiad combinatorics is to some extent a miscellany of things that don't fit into another topic. On the IMO 2019 PSC we spent some time discussing whether the problem that ended up as 2020 shortlist C7 - submitted but not shortlisted in 2019 - counted as combinatorics or algebra.)

So when I formalized IMO 2024 P5 that meant adding various lemmas about lists to mathlib (some resulting material since upstreamed further to core Lean), and IMO 2024 P3 (#19671) involved various lemmas about Nat.nth (a definition that doesn't appear in the statement, and isn't used much in mathlib - I don't know whether that would make it harder for AIs to realize its relevance to the problem). If you formalized ten other medium/hard IMO combinatorics problems, you might find yourself adding lemmas in ten other areas of mathlib. (It would be interesting to see if 1000+ lines is the natural length of such formalizations, given more examples. When I first formalized an olympiad combinatorics problem, from British MO 2020 round 2, the result was 3000 lines of Lean 3, but that was as a Lean beginner and I suspect I'd come up with something rather shorter now even though I think the high-level structure of that solution is still basically sound.)

@Yaël Dillies has given a talk about their approach to upstreaming lemmas to mathlib. Mine is:

At the end of each formalization session, as part of finishing for the day, along with committing work in progress to my own repository, I PR any mathlib-appropriate lemmas accumulated in the course of that session's work (that don't depend on anything awaiting review); I try not to leave any un-PRed lemmas overnight.
Whenever any one of those PRs is merged into mathlib, I update mathlib in my repository (removing the local copy of the lemma), preferably within a day or so, and then PR anything else that has become unblocked because all its dependencies have been merged. (I rarely use PR dependencies to PR lemmas to mathlib while they still have dependencies awaiting review; most of the time I don't feel the added complexity of doing so is justified. If review results in incompatible changes to a lemma's interface, I might occasionally backport those to my repository, but usually I wait for the PR to be merged before updating users in my repository for the new API.)
In the absence of any mathlib PRs originating from a given project being merged, I'll still update mathlib in my repository immediately before starting any work on it, if it's been more than a few days since I last updated mathlib there. (When I go longer periods without working on a particular repository, mathlib updates there may be less frequent.)
I try to keep a watch for relevant work by other people and engage in the discussions, both here and in mathlib PRs. So when I'm formalizing olympiad problems I may well also help review other people's IMO formalizations PRed to the mathlib archive, for example; likewise I'll watch our for Euclidean geometry PRs related to the theory I've added there (and may add more of in future) for use in formalizing olympiad problems. There's a (small) sub-community who contribute and review olympiad formalizations, here and in compfiles. When you're doing olympiad formalizations I urge you to join this community, help to review other people's formalizations, talk here on Zulip about what specific problems you're working on and what theory you're writing, post your work in public at an early stage especially when there's anything that could form a useful lemma for mathlib.

Last updated: May 02 2025 at 03:31 UTC