Zulip Chat Archive

Stream: mathlib4

Topic: 100 theorems: Ceva's theorem

Boris Alexeev (Dec 31 2025 at 21:46):

I recently looked at the list of Missing theorems from Freek Wiedijk's list of 100 theorems and saw that "61: Theorem of Ceva" was missing. As a result, I auto- formalized Ceva's theorem using Aristotle. I then attempted to add this proof to the 100 theorems YAML file.

In the process, I have learned that this was the Nth proof of Ceva's theorem, where "N is much bigger than 1". In particular, I learned that it was proven over 4 years ago way back in mathlib3 days. I don't know what the other N-2 proofs are. And now, there is proof N+1 that "Unlike various previous attempts", is "in appropriate generality for mathlib".

However, proof N+1 does not use the concepts of distance (signed or unsigned) or of the Euclidean plane. This was confusing to me, because I polled several professional academic mathematicians and they all thought Ceva's theorem was about distances in triangles in the plane. I also checked the first 20 or so statements of Ceva's theorem online (Wikipedia, Mathworld, etc) and I couldn't find a version that didn't mention distance. Accordingly, I was impressed that a version of "Ceva's theorem" of "appropriate generality for mathlib" was so general that it did not include Ceva's theorem itself as a special case.

As a discussion topic, I had the following related questions:

If one of the "100 theorems" (or 1000) like Ceva's theorem is proven, but not accepted into mathlib, should it be listed as proven in Lean or should it be missing?
What if it has been proven "N>>1" times? Can that be documented on the 100 theorems page?
If one of the "100 theorems" is proven, and the author attempts to contribute it to mathlib, should it be listed as proven in Lean while it's in process? What if it's been two years and it still hasn't been accepted yet? What if it's been 4 years, Lean has changed major version numbers, and the PR was abandoned? (These questions are not hypothetical.)
Suppose you have a version of "Ceva's theorem", whether in mathlib or not, that doesn't mention distances and in particular doesn't include Ceva's theorem as it appears stated almost anywhere, should that be listed as proving Ceva's theorem for the purposes of the 100 theorems?

Ruben Van de Velde (Dec 31 2025 at 21:52):

It's fine to add a reference to a theorem to the yaml even if it doesn't land in mathlib

Joseph Myers (Dec 31 2025 at 22:55):

Freek's list (deliberately, I think) doesn't come with any definition of which of many different but related statements that might informally be considered versions of the same theorem is supposed to be proved to count as having done something on the list (or, what the axiomatic basis is supposed to be for proving things - which affects e.g. whether Pythagoras is close to being true by definition or needs more complicated geometrical reasoning). This leads to such things as how I think for a long time what we listed for the triangle inequality was the definition of a metric space.

Joseph Myers (Dec 31 2025 at 22:58):

In the early days of formalization, when few of the theorems on the list had been formalized in any ITP, or indeed in the early days of Lean formalization, when few had been formalized in Lean, simply doing any version of a result on the list, regardless of library integration, helped to demonstrate the abilities of formalization in general, or Lean formalization in particular.

Joseph Myers (Dec 31 2025 at 23:04):

We're not in those early days any more (for any result on that list other than FLT). You can of course formalize something on the list for fun whether or not it's already shown on the list as being formalized in Lean. But just increasing the number of entries shown as being formalized in Lean (anywhere, whether or not in mathlib) seems of less value to me now. Most of the value from formalizing something new on the list isn't from increasing the number, it's from getting the formalization - and consequently, all the resulting smaller lemmas and definitions used in its proof - into mathlib, in mathlib-appropriate generality (with each lemma in the appropriate generality for that lemma). (And so getting a formalization into mathlib of something on the list previously formalized outside mathlib, without increasing the total number formalized, is as valuable as formalizing for mathlib something on the list not previously formalized at all - and more valuable than formalizing something new on the list but not getting it into mathlib.)

Joseph Myers (Dec 31 2025 at 23:07):

This means, for example, that if you're going for the isoperimetric theorem, as one example listed as not done in Lean (though I think someone had a project working on it at some point), doing it in n dimensions, aiming for mathlib, is much better than just doing a 2-dimensional version with a view to getting it done quicker (though the HOL formalization linked from Freek's list as the only one for that theorem is just a 2-dimensional version).

Joseph Myers (Dec 31 2025 at 23:11):

I'd prefer the headline number on the 100-theorems page to be the number that are done in mathlib or the mathlib archive, with a second figure for those formalized somewhere else and a prominent red flag on those entries indicating they haven't (yet) been integrated into mathlib.

Joseph Myers (Dec 31 2025 at 23:17):

As for Ceva, it's extremely common for a formal statement of a result to have gone through several stages of generalization so it doesn't look very much like the most familiar informal statements of that result. I think it's clear that Ceva is in essence an affine space result, and that other versions should be deduced from an affine space version, and whether you treat the main version for such a list as being an affine space statement, or as being something immediately derived from an affine space statement with docs#dist_left_lineMap and similar that say how a general affine statement can be converted to one with distances in the docs#NormedAddTorsor case (and then consider the affine space statement itself to be one of the intermediate lemmas that are often the main benefit of formalizing something for mathlib), isn't an interesting question.

Boris Alexeev (Jan 01 2026 at 03:21):

The question isn't which should be considered the main statement. The question is whether the "generalization of X" in mathlib should even include X at all.

Boris Alexeev (Jan 01 2026 at 03:36):

This was my first PR to mathlib. What I've learned is that in the past, people have apparently repeatedly come with proofs of theorems from the "100 theorems" list, attempting to get them into mathlib, and they have failed even after 2--4 years of trying. (I did not know this, and so I too "fail to engage with past Zulip discussions".)

Boris Alexeev (Jan 01 2026 at 03:36):

Mathematically, I've also learned the slightly surprising fact that multiplying some distances together for a point inside a triangle *isn't* Ceva's theorem but a version which doesn't even multiply distances for a point in a triangle IS Ceva's theorem. Someone should tell Wikipedia and Mathworld.

Boris Alexeev (Jan 01 2026 at 03:39):

Happy New Year!

Boris Alexeev (Jan 01 2026 at 03:40):

May the new year bring new contributors that prove a theorem from the "100 theorems" list and manage to get them into mathlib while it's still 2026.

Boris Alexeev (Jan 01 2026 at 04:07):

On a related note, I just found the comment will enter mathlib in 2024 in 1000.yaml. Maybe 2026 will be the year?

Violeta Hernández (Jan 01 2026 at 09:42):

I think we'd save on a lot of man-hours if we had some sort of unofficial 100 theorems list including all of the theorems that have done in Lean before but are not yet (or ever) in Mathlib

Ruben Van de Velde (Jan 01 2026 at 09:55):

My point is that this already exists and it's the 100.yml file

Ruben Van de Velde (Jan 01 2026 at 09:56):

PRs to add links to non-mathlib formalizations are welcome

Joseph Myers (Jan 01 2026 at 13:33):

I think the state of whether and how an entry on a list of theorems is "done" is at least two-dimensional, but the pages we have can't represent that.

On the "whether" axis, there are at least five options:

Done in mathlib.
Done in the mathlib archive.
Done outside mathlib / mathlib archive (possibly with an older version of Lean).
In progress but not yet sorry-free.
Not being worked on.

My disagreement with how things are presented on the current lists is that I think option 3 has more in common with options 4 and 5 than with options 1 and 2 - if something isn't integrated with the current version of mathlib and kept up to date as part of mathlib maintenance, it will probably bit rot and cease to be useful. (Or: I want "Lean has X and Y and Z" (implication that it's a single version that has them all), not "Lean has X" and "Lean has Y" and "Lean has Z" (possibly different versions for each of them).)

On the "how" axis, an informal name for a theorem may correspond to several different but related informal statements, each of which may correspond to several different but related formal statements, and not all of these may immediately look very close to being the same theorem. Sometimes one statement is clearly more general than the others, which can be deduced from it; sometimes different statements generalize in different directions. This is not actually unique to formalization; you often get cases in informal mathematics where people may refer to theorem X as being essentially the same as / a special case of theorem Y, even if the correspondence is not at all obvious without a more extensive knowledge of the area.

Snir Broshi (Jan 01 2026 at 13:54):

2.5. Done with an open Mathlib PR

Chris Henson (Jan 01 2026 at 14:49):

Joseph Myers said:

I think option 3 has more in common with options 4 and 5 than with options 1 and 2 - if something isn't integrated with the current version of mathlib and kept up to date as part of mathlib maintenance, it will probably bit rot and cease to be useful.

It seems a bit extreme to say a proof not being in Mathlib makes it in any sense closer to including sorry. Ideally a completed proof downstream of Mathlib with pinned revisions remains a valid, compiling proof over time. There are technical reasons this may not be true in practice, but this is a general failure of reproducibility in building software (not a problem unique to Lean). I fully agree that inclusion in Mathlib garuntees a level of maintaince to newer versions of Lean that is desirable, but I would find it inaccurate and limiting to say anything outside of Mathlib is inherently a less complete proof.

Joseph Myers (Jan 01 2026 at 15:22):

I'd say it's less complete as a project (the greater part of a formalization project is getting things into mathlib, not getting to sorry-free) rather than less complete as a proof.

Joseph Myers (Jan 01 2026 at 15:28):

Or: most of the value from mathlib comes from having all the lemmas available together and integrated, rather than thousands of separate small libraries each using a different version of Lean. If you have $N$ entries in the list that are in mathlib and $M$ that aren't, maybe that means that $N+M$ is a measure of value based on results in isolation, but $\binom{N}{2}$ (with $M$ not involved at all) is a measure of value based on network effects.

Boris Alexeev (Jan 01 2026 at 15:30):

ChatGPT tells me that 5 of the 6 highest-profile Lean formalizations aren't part of Mathlib. (The 6th was Mathlib itself, listed first.)

Boris Alexeev (Jan 01 2026 at 15:30):

I'm happy that there's a healthy world of formalization in Lean outside of Mathlib.

Joseph Myers (Jan 01 2026 at 15:34):

Since it's often the intermediate definitions and lemmas that are more valuable than the leaf results, a Lean formalization can get much of its value by contributing those to mathlib even if the leaf results don't end up there. (But "highest-profile" may contribute a different kind of value by attracting people to contribute rather than through the content of the formalization itself. It's less clear that increasing the number done on the 100-theorems list from 89 to 90, say, has that kind of value.)

Joseph Myers (Jan 01 2026 at 15:37):

Another significant form of value is the gain in understanding of the strengths and weaknesses of different approaches to formalizing an area of mathematics (different choices of possible definitions, intermediate lemmas that prove useful but might not appear in the informal literature, what kinds of little steps in the proofs were convenient or awkward to do, etc.) - the insights that may then appear in papers about the formalization. How to get that kind of insight when heavily using autoformalization is an open question - did you manage to extract such insights from your interactions with Aristotle to prove things on the 100-theorems list?

Joseph Myers (Jan 01 2026 at 15:42):

I'd like to see the entirety of FLT end up in mathlib - but that depends on all the known-in-1980s pieces that are outside the scope of the current project being formalized (and getting those more generally useful pieces into mathlib would also provide a large part of the value of getting FLT into mathlib). However, FLT is at a level of size and complexity where simply doing it at all, anywhere, should give a lot of insights into formalization, which isn't the case now for most of the rest of the 100 theorems list.

Joseph Myers (Jan 01 2026 at 15:47):

I don't object to including information about non-mathlib formalizations on the 100-theorems (or 1000-plus) list - it's treating them as done on the same level as mathlib formalizations that I think is misleading.

Chris Henson (Jan 01 2026 at 16:38):

I think we are more in agreement with your distinction between "project" and "proof", specifically that:

inclusion in Mathlib implies a high degree of review scrutiny, maintainability, and interoperability with future versions of Lean and Mathlib
projects outside Mathlib can have correct proofs, but probably more tied to a particular pinning of revisions

I think your idea of "project" still carries an implicit assumption of Mathlib as an end goal for all formalizations. To be clear, I fully agree with you on the immense value that upstreaming to Mathlib provides, but I also recognize that this not necessarily the goal of every formalization project, and that high quality work happens outside of Mathlib.

With that in mind, I don't find it particularly misleading to include non-Mathlib formalizations of sufficient quality on the 100-theorems list, though it is probably a bit of a historical quirk that these YAML files are hosted in the Mathlib repo. I find this especially true of the 1000-theorems list, as I think there are many theorems that would find difficultly finding a place in Mathlib, such as Church-Rosser that is now in CSLib.

Matthew Coke (Jan 01 2026 at 16:50):

@Chris Henson I think building a list of projects and proofs is a good place to start. I have two proofs that I built/looked at this morning. One that involves a discrepancy between quantity and distance (pigeonhole principle) and one that involves binary choice and the creation of a new object (proof that there are an infinite number of primes). I'm desirous of getting a better mapping of these techniques within proofs. In part because those different kinds of proof techniques take different forms in my "terrain" methodology and I want to use that to understand existing proofs and proof theory better

Ruben Van de Velde (Jan 01 2026 at 18:43):

Another note: as far as I'm aware, none of the other languages require a formalization to be part of an integrated library to be listed on Freek's page

Ruben Van de Velde (Jan 01 2026 at 18:55):

(Though landing in mathlib is obviously the best outcome for the ecosystem, assuming we have the contributors to maintain the proof)

Joseph Myers (Jan 01 2026 at 18:58):

1000-plus, however, does distinguish ("L" versus "X" status). And I'm more concerned with how we show what we have on the Lean community site (where we could make the difference between various forms of "done" a lot more obvious), than with the simplified representation on Freek's site.

Joseph Myers (Jan 01 2026 at 19:12):

Regarding the original "saw it was listed as missing, autoformalized it then discovered past formalizations and discussions" that started this discussion, maybe the lists of missing theorems from 100-theorems and 1000-plus should have a note at the top of the form "If you are interested in formalizing one of these theorems, and especially if you are interested in adding a formalization of it to mathlib, you should search Zulip for past discussions and previous work. It may be useful to start a discussion on Zulip if you don't find previous discussions of that theorem, since people on Zulip may know of other attempts at that theorem or have advice on the appropriate form of that theorem for inclusion in mathlib.". (The undergrad todo list already has such a note. Even after merging some form of the PR starting this discussion, the 1000-plus page should probably also have a note about the likelihood that various theorems there have in fact already been done in Lean without being listed there - and quite a few on 1000-plus are probably easy enough to do in mathlib-ready form that there may be less need to start a discussion first for those if you're confident in what the formalization should look like.)

Lawrence Wu (llllvvuu) (Jan 01 2026 at 21:27):

I would suggest that, while mathlib maintainers may be more interested in the list of theorems in mathlib, most readers who come across the webpage would be more interested in the list of formalization attempts overall. Given that the webpage purports to be generated from the .yaml files, it should be straightforward to have it generate qualifiers such as “in progress”, “outside mathlib”, “Lean 3”, etc, as desired.

Timothy Chow (Jan 01 2026 at 22:39):

Boris Alexeev said:

Mathematically, I've also learned the slightly surprising fact that multiplying some distances together for a point inside a triangle *isn't* Ceva's theorem but a version which doesn't even multiply distances for a point in a triangle IS Ceva's theorem. Someone should tell Wikipedia and Mathworld.

It's more common than you may realize to have some disagreement about what "X's theorem" is exactly. I recently learned that "Rolle's theorem," which nowadays we think of as a theorem of calculus, looks quite different from what Rolle originally proved, and that Rolle himself was skeptical of the logical validity of the infinitesimal calculus! Rolle's original theorem was more about the interlacing of the roots of a polynomial and the roots of its derivative, and what we would call derivatives he called "cascades," which he did not use calculus to define. Rolle would turn over in his grave to discover that a version which doesn't even mention polynomials explicitly, and which involves concepts that were anathema to Rolle, IS Rolle's theorem!
I think it's best not to be too dogmatic about nomenclature. For the purposes of mathlib, some decisions have to be made, but we don't have to go as far as to say that what goes by "X's theorem" in mathlib IS X's theorem.

Kevin Buzzard (Jan 02 2026 at 00:03):

(BTW

Joseph Myers said:

Freek's list (deliberately, I think) doesn't come with any definition of which of many different but related statements that might informally be considered versions of the same theorem is supposed to be proved to count as having done something on the list

Freek verbally confirmed to me that this was deliberate a few years ago)

Bryan Gin-ge Chen (Jan 02 2026 at 00:28):

Lawrence Wu (llllvvuu) said:

I would suggest that, while mathlib maintainers may be more interested in the list of theorems in mathlib, most readers who come across the webpage would be more interested in the list of formalization attempts overall. Given that the webpage purports to be generated from the .yaml files, it should be straightforward to have it generate qualifiers such as “in progress”, “outside mathlib”, “Lean 3”, etc, as desired.

I would be happy to review PRs to the website which would improve things on this front (or incorporate any of the other nice suggestions in this thread about how to communicate the status of formalizations).

Snir Broshi (Jan 02 2026 at 00:58):

Joseph Myers said:

maybe the lists of missing theorems from 100-theorems and 1000-plus should have a note at the top of the form "If you are interested in formalizing one of these theorems, and especially if you are interested in adding a formalization of it to mathlib, you should search Zulip for past discussions and previous work. It may be useful to start a discussion on Zulip if you don't find previous discussions of that theorem, since people on Zulip may know of other attempts at that theorem or have advice on the appropriate form of that theorem for inclusion in mathlib."

A nice coincidence #Is there code for X? > Pick's Theorem @ 💬

Boris Alexeev (Jan 02 2026 at 01:19):

I think the insistence on getting things into mathlib for a formalization "project" is over-stated. For example, I'm formalizing solutions to Erdős problems, and I'm not holding my breath for mathlib to even consider accepting those.

So what is the value, then, of these formalizations? For one, I have pedestrian concerns such as correctness of the mathematics.

Boris Alexeev (Jan 02 2026 at 01:20):

I would like to explicitly say one of the values of formalizing a theorem on the 100 theorems list outside of mathlib: it just lets people know that it's possible. As Kevin knows very well, not being able to define the terms in your theorems is a significant impediment to formalizing their proofs.

Joseph Myers (Jan 02 2026 at 01:23):

Formalization to ensure correctness is relevant for new results (in which case the quality of the formalization doesn't really matter - though if you write definitions and lemmas in a suitable form for mathlib, you can get that extra benefit out of it, and help the next person formalizing something in the same area of mathematics, even if the final leaf result doesn't go in mathlib - and the mathlib archive would often be a good place for such leaf results, once everything suitable for mathlib proper has gone there). It's relevant for results where the peer review process failed to reach a conclusion on the mathematics (say Con(NF) or the Kepler conjecture, as examples where formalization resolved such doubts). But for the results on the 100-theorems list, we don't have such doubts about correctness (though some people may have had such doubts about the four colour theorem before it was formalized, or at least before the improved proof by Robertson, Sanders, Seymour and Thomas).

Boris Alexeev (Jan 02 2026 at 01:25):

Consider, for example, "13. the polyhedron formula". Maybe in the version that eventually appears in mathlib, "in order to have greater generality", someone needs to prove something about the universal map from an abelian category to its Grothendieck group.

But for someone trying to just prove things about polyhedra, it's very useful to know that no one has proven V+F-E=2 for polyhedra in part because there's not even a definition of polyhedron.

Joseph Myers (Jan 02 2026 at 01:27):

Boris Alexeev said:

I would like to explicitly say one of the values of formalizing a theorem on the 100 theorems list outside of mathlib: it just lets people know that it's possible.

We already know it's possible (for everything on that list smaller than FLT).

As Kevin knows very well, not being able to define the terms in your theorems is a significant impediment to formalizing their proofs.

Which is why much of the value from one of these formalizations comes from doing it for mathlib - and thus getting lots of small intermediate definitions and API lemmas for those definitions into mathlib, even if the final leaf result only goes in the mathlib archive not mathlib proper.

Boris Alexeev (Jan 02 2026 at 01:28):

Okay, not "possible" but "a reasonable thing to do with the current state of Mathlib definitions"?

Joseph Myers (Jan 02 2026 at 01:31):

For the polyhedron formula, you'll find lots of past discussions on Zulip about planar graphs / combinatorial maps / the Jordan curve theorem. Sometimes, yes, you want V+F-E=2 as a leaf result (I'll want it eventually in AperiodicMonotilesLean, which means I want it in mathlib in a form that can be applied to maps drawn in the plane - albeit only I only need it for maps with polygonal regions that are all convex except for the infinite one). But much of the benefit would be from making it possible to talk about lots of other results involving planar graphs (or other things involving the Jordan curve theorem), rather than the specific result.

Boris Alexeev (Jan 02 2026 at 01:33):

I really should have stopped last year, or at least after this message, which I'll just say again:

Boris Alexeev (Jan 02 2026 at 01:33):

I'm happy that there's a healthy world of formalization in Lean outside of Mathlib.

Joseph Myers (Jan 02 2026 at 01:36):

As for Ceva, I'm happy to add more familiar versions in due course, but those would build on the affine versions (and there are lots more affine versions still to be done) - so first the material in that PR (as opposed to all the things that might be related but aren't in the PR) needs to be reviewed until it's ready to go in.

Joseph Myers (Jan 02 2026 at 01:44):

It would be interesting if we had AIs that could take non-mathlib formalizations and rework them for mathlib (including reworking definitions based on human feedback, making things more general, etc.), so that mathlib could benefit and grow more from the non-mathlib formalization world - but they aren't quite at that level yet (though if a human suggests the precise sequence of definitions and lemmas desired, they may be able to fill in many of the proofs).

Oliver Nash (Jan 02 2026 at 10:28):

@Boris Alexeev I'm sorry that you've had a bad experience here. I hope I can offer some slight consolation by suggesting that the discussion you have triggered here is very valuable and will be an input into future decisions regarding how we manage our documentation.

Regardless of the points discussed above related to mathematical generality and library maintenance, the human factor is also important and we want contributors to have a good experience. I hope that your next contribution will be more enjoyable.

Last updated: Feb 28 2026 at 14:05 UTC