Zulip Chat Archive

Stream: general

Topic: a-dataset-of-modern-formalized-theorem-statements


Vasily Ilin (Dec 07 2025 at 22:15):

Hi, I just came across @Kevin Buzzard 's Ren Phil proposal. Is this list of statements published anywhere?

Kevin Buzzard (Dec 07 2025 at 22:36):

There's nothing public but right now we're working on statements from papers from the Annals of Mathematics published since 2020, although we might expand our horizons later. It seems a bit ridiculous to just publish 10 statements so we're waiting until we have a substantial amount which will maybe take us another year or so?

Vasily Ilin (Dec 08 2025 at 00:10):

Got it, thank you for the answer!
Is there just a list of already formalized research papers? Not the statements themselves but just the papers.

Kevin Buzzard (Dec 08 2025 at 07:50):

No.

Adrian Marti (Dec 08 2025 at 13:52):

How will dependencies between definitions be modeled, or papers that depend on each other? Different packages, one big package or statements in each paper individually?

One thing I'm really missing from current datasets, is that they take a "problem and proof" approach, each one self-contained. Of course, this is too rigid to model research-level mathematics, since the definitions need to be written first. Of course, you could simply take mathlib, the flt project or any other big lean project as a dataset. However, I'm not sure how well this works. I wonder if the tech companies are doing this and whether there are some important limitations.

Kevin Buzzard (Dec 08 2025 at 14:35):

When we are ready, which could well take a year, we will announce something, but it is to premature to talk about it now.


Last updated: Dec 20 2025 at 21:32 UTC