Zulip Chat Archive
Stream: general
Topic: Discussion: Formal Conjectures
Moritz Firsching (May 28 2025 at 16:51):
Discussion thread for #announce > Formal Conjectures
Martin Dvořák (May 28 2025 at 16:54):
Great news! We were discussing on Discord the other day that this is exactly what the community needs!
Shreyas Srinivas (May 28 2025 at 20:24):
Is it open for some proposals from the TCS community?
Shreyas Srinivas (May 28 2025 at 20:25):
I am referring to mostly combinatorial conjectures
Shreyas Srinivas (May 28 2025 at 20:31):
For instance the union-closed set conjecture
Shreyas Srinivas (May 28 2025 at 20:40):
Or the existence of EFX allocations for any general monotone valuation functions in discrete fair division theory (I have a decent chunk of the pre-requisites already set up)
Moritz Firsching (May 29 2025 at 07:53):
Sure, I think those conjectures sound like they would be a good fit! If there pre-requisites are very involved it might be good to look into upstreaming parts of them if they are useful for mathlib.
Bhavik Mehta (May 29 2025 at 11:51):
Heads up to avoid duplicating work that I wrote union closed for (an early version of) this project back when I was employed at GDM, it should be in a PR today.
Bolton Bailey (May 29 2025 at 21:14):
Thanks for making this!
I see that besides conjectures, there is also a tag for research level problems. Is the repo interested in accepting those? Is there some kind of qualification for how "interesting" a problem needs to be for it to be something suitable to submit?
Kim Morrison (May 30 2025 at 00:14):
I'm sad that all these conjectures are stated directly using sorry, rather than:
def MyLovelyTheorem_statement : Prop := 1 + 2 = 5
proof_wanted MyLovelyTheorem : MyLovelyTheorem_statement
(the proof_wanted is optional, of course)
This would:
- enable reuse, e.g.
theorem myLovelyConsequence (h : MyLovelyTheorem_statement) : False := ... - enable a clean build without warning
Is there any chance there's interest in switching over?
Kim Morrison (May 30 2025 at 00:31):
I'd like to point out something very interesting I just noticed!
Consider the commit https://github.com/google-deepmind/formal-conjectures/commit/8ec658213ab9797e3f7fb4a1fe0e2bda37307897, made by AlphaProof, golfing a proof in the ForMathlib/ directory.
At first this looks completely innocuous, but actually it is a quite dangerous golf (specifically, it puts more work into the initial norm_num step, which is fragile because the simp set changing can result in changes of behaviour -- although the old proof is also fragile, with non-terminal simps).
Specifically, the new AlphaProof generated proofs break on v4.18.0, while the original proofs continue to work, so this golf actually blocks progress.
Eric Wieser (May 30 2025 at 00:49):
- enable reuse, e.g.
theorem myLovelyConsequence (h : MyLovelyTheorem_statement) : False := ...
In the meantime, you can use theorem myLovelyConsequence (h : type_of% my_lovely_theorem) : False, so this one isn't too big an issue
Kim Morrison (May 30 2025 at 00:50):
(I think you mean to omit _statement in that message.
Kim Morrison (May 30 2025 at 00:50):
Kim Morrison said:
Specifically, the new AlphaProof generated proofs break on v4.18.0, while the original proofs continue to work, so this golf actually blocks progress.
In fact, this is the only real obstacle to upgrading to v4.18.0:
https://github.com/google-deepmind/formal-conjectures/pull/111
Eric Wieser (May 30 2025 at 00:54):
(indeed, your naming convention using CamelCase for a proof_wanted confused me)
Eric Wieser (May 30 2025 at 01:06):
Kim Morrison said:
Is there any chance there's interest in switching over?
I think primarily the argument for the current form is inertia; most existing benchmarks are written in this form (miniF2F, PutnamBench, CombiBench, ...), and so releasing formal-conjectures in the same form should make it easier to integrate existing automated provers with.
Eric Wieser (May 30 2025 at 01:07):
If someone could write an automated tool to convert all these benchmarks into your proposed format, then perhaps we could agree as a community to shift all of them over to that style
Kim Morrison (May 30 2025 at 01:08):
Haha, doomed by miniF2F yet again. :-)
Eric Wieser (May 30 2025 at 01:13):
Kim Morrison said:
- enable a clean build without warning
Could we have a noSorryWarning lean option for benchmarks like these?
Eric Wieser (May 30 2025 at 01:15):
You could imagine it also being useful for running CI against student exercise templates, where they have to fill in the sorry
Kevin Buzzard (May 30 2025 at 07:17):
The myriad sorry warnings I get when building FLT drown out everything else and cause pages and pages of scrolling so +1 from me
Edward van de Meent (May 30 2025 at 07:44):
I suppose that instead, a single warning that this option is set would suffice?
Eric Wieser (May 30 2025 at 09:04):
I think no warning at all is best
Yaël Dillies (May 30 2025 at 09:10):
Previous discussion: #lean4 > Silent sorry
Junyan Xu (May 30 2025 at 10:17):
I wonder whether an "exercise" or "curiosity" category should be added to the repo for trivial little questions not expected to be hard and certainly not open or even stated in the literature such as #maths > Puzzles around finiteness and projectivity of modules @ 💬. I think they are good for AI systems to explore / come up with examples/constructions and are likely to offer more immediate reward signals.
Paul Lezeau (May 30 2025 at 12:14):
Junyan Xu said:
I wonder whether an "exercise" or "curiosity" category should be added to the repo for trivial little questions not expected to be hard and certainly not open or even stated in the literature
I think the high_school/undegraduate/graduate categories could make sense for this sort of question.
Paul Lezeau (May 30 2025 at 12:41):
In case anyone needs it, the full list of values that can be passed to the category attribute can be found here!
Junyan Xu (May 30 2025 at 12:47):
What directory should such problems go under? Other? Or shall we create HighSchool, Undergraduate and Graduate directories?
Paul Lezeau (May 30 2025 at 13:00):
Junyan Xu said:
What directory should such problems go under?
Other? Or shall we create HighSchool, Undergraduate and Graduate directories?
Our current convention would be to add such problems when they are related to a given conjecture (and then add them in the same file as the conjecture) - e.g. this problem
Notification Bot (May 30 2025 at 13:36):
5 messages were moved from this topic to #general > Benchmark formats by Eric Wieser.
Junyan Xu (May 30 2025 at 13:47):
I suspected that there isn't an established place for such questions so I asked two yesterday on MO and Math.SE respectively, and if there's no solution in say a month, I might PR the MO question to Formal Conjectures.
Eric Wieser (May 30 2025 at 13:55):
That seems like a nice approach; use MO as a litmus test for "is this an interesting question", and then add it for formal-conjectures in the MathOverflow directory
Eric Wieser (May 30 2025 at 13:55):
Which also means that there is a natural place to record human progress on the problem
Jason Rute (May 31 2025 at 01:10):
I'm a bit confused by the goal of this project. At first, I thought it was just open conjectures, so it served as a few things:
- really hard and valuable targets for AI theorem provers
- a benchmark where solving even one problem would be great and news worthy
- a repository of formal open problems for mathematicians (both as a list of open problems to look at and a formal target if a mathematician both solves (or partially solves) a problem and wants to formalize it)
Jason Rute (May 31 2025 at 01:10):
But what is the goal of having solved problems? Is it supposed to be like the formal abstract project? Is it also a benchmark of sorts (maybe an autoformalization benchmark like the idea I have talked about)? Is it a stepping stone on the way to formalizing open problems?
Moritz Firsching (May 31 2025 at 06:28):
Note that the repo's purpose is not to collect arbitrary solved problems, but rather only in two specific cases:
- when they are part of a collection (like the Erdős Problems) and
- when they are variants of open problems.
One reason for this is to have definitions stress tested, i.e. sanity checks for the formalization. Example here
Another is that we noticed it is relatively little effort to include solved variants, once the definitions are there for the unsolved problems.
This does however lead to exactly the points you mention:
- the collection of solved problem can be viewed as an autoformalization benchmark.
- an automated system attempting to provide formal proofs for unsolved problems should also be able to provide formal proofs some of the solved problem. In fact some of the open problems will already be solvable by automated systems, since some of the open variants can just be solved with a call to
decideor some other simple tactic. (This is what you call "stepping stone", I suppose)
Kevin Buzzard (May 31 2025 at 07:39):
Just to be clear -- many of these solved problems are very hard to solve! For some of them it would be a real achievement to have a machine-checked solution. For example FLT is a solved problem.
David Michael Roberts (May 31 2025 at 10:48):
https://meta.mathoverflow.net/questions/1820/mo-hard-questions instead of merely unanswered MO questions....
Junyan Xu (May 31 2025 at 14:47):
For solved problems whose statement is readily formalizable but whose proof is unlikely to be formalized soon, should we include them in mathlib as proof_wanted instead? If so I need to remove several items from my list (sorry I didn't read the README too carefully).
Jason Rute (May 31 2025 at 15:48):
As for MathOverflow, you can also use the open problems tag, sorted by score: https://mathoverflow.net/questions/tagged/open-problems?tab=Votes
Jason Rute (May 31 2025 at 15:50):
Either way, you are (possibly) missing the highest scored open problem on MO (which is also quite easy to formalize): https://mathoverflow.net/questions/21003/polynomial-bijection-from-mathbb-q-times-mathbb-q-to-mathbb-q
Bhavik Mehta (May 31 2025 at 16:05):
https://github.com/google-deepmind/formal-conjectures/pull/124
Yaël Dillies (May 31 2025 at 16:07):
FYI you can write google-deepmind/formal-conjectures#124 and it will render as google-deepmind/formal-conjectures#124
Bhavik Mehta (May 31 2025 at 16:13):
That's more key-presses than Ctrl-C, Ctrl-V ;)
Junyan Xu (May 31 2025 at 16:17):
maybe it'll move under leanprover-community some day?
Seewoo Lee (Jun 01 2025 at 19:21):
Not a serious question, but what's the definition of "significant" contributor to add name on AUTHORS?
Eric Wieser (Jun 01 2025 at 20:40):
Up to interpretation, but I think "added a new conjecture" is sufficient for inclusion, but "fixed a typo" usually is not
Jason Rute (Jun 01 2025 at 20:50):
When is answer(sorry) used? It seems to be used with some conjectures and not others?
Eric Wieser (Jun 01 2025 at 20:58):
I think there are some open PRs to standardize this. I believe the rule we settled on is "use it if the English version is in the imperative tense or otherwise ends with a question mark"
Eric Wieser (Jun 01 2025 at 20:59):
With the premise that many conjectures have an "expected" result, and this is often captured in the way they are phrased
Moritz Firsching (Jun 02 2025 at 05:14):
Eric Wieser said:
Up to interpretation, but I think "added a new conjecture" is sufficient for inclusion, but "fixed a typo" usually is not
Agreed. Fixing a misformalization would also be considered sufficient for inclusion, in my interpretation, even if this is done by fixing a typo.
Seewoo Lee (Jun 02 2025 at 16:17):
I may need NumberField.maximalRealSubfield which is merged to mathlib 2 weeks ago, and it does not exist in the current version (4.17.0 maybe?) of the repository yet. Should I wait until the next release and version up? I don't know what's the best practice for this in general.
Eric Wieser (Jun 02 2025 at 16:24):
For very small definitions (~10 lines), it's probably ok to back-port them to the ForMathlib directory, along with a link to the PR that added them.
Eric Wieser (Jun 02 2025 at 16:24):
We certainly plan to update the Lean version of the repository, but we can't commit to exactly when, and we certainly won't be supporting anything other than the monthly "major" lean versions.
Chris Birkbeck (Jun 02 2025 at 16:27):
Seewoo Lee said:
I may need
NumberField.maximalRealSubfieldwhich is merged to mathlib 2 weeks ago, and it does not exist in the current version (4.17.0 maybe?) of the repository yet. Should I wait until the next release and version up? I don't know what's the best practice for this in general.
Oh I was just thinking about this, in the context of adding Vandiver's conjecture to the formal conjecutures repo. Maybe you're already doing this?
Seewoo Lee (Jun 02 2025 at 16:45):
Chris Birkbeck 말함:
Seewoo Lee said:
I may need
NumberField.maximalRealSubfieldwhich is merged to mathlib 2 weeks ago, and it does not exist in the current version (4.17.0 maybe?) of the repository yet. Should I wait until the next release and version up? I don't know what's the best practice for this in general.Oh I was just thinking about this, in the context of adding Vandiver's conjecture to the formal conjecutures repo. Maybe you're already doing this?
Yes exactly, I already took it and have a code :sweat_smile:
Chris Birkbeck (Jun 02 2025 at 16:57):
Go for it! I also have a formalised statement of the modularity conjecture over Q, but I don't know if that's the sort of thing people want in the repo? For example, I couldn't see FLT in the repo (but maybe I missed it?).
Paul Lezeau (Jun 02 2025 at 18:25):
Chris Birkbeck said:
Go for it! I also have a formalised statement of the modularity conjecture over Q, but I don't know if that's the sort of thing people want in the repo? For example, I couldn't see FLT in the repo (but maybe I missed it?).
I think that would be a nice addition to the repo!
Alex Kontorovich (Jun 02 2025 at 22:29):
I was curious how hard formalization is for LLMs these days. See https://github.com/google-deepmind/formal-conjectures/pull/144. Claude 4.0 (nearly) one-shot the statement of Bateman-Horn. (I assume we don't already have that in Mathlib?)
Kevin Buzzard (Jun 03 2025 at 00:26):
I would never trust AI to state definitions and conjectures because there are no guardrails. Everything has to be extremely carefully human-checked here to the extent that it's not entirely clear if you are gaining anything.
Eric Wieser (Jun 03 2025 at 00:32):
A mathematician who "knows what the important questions are" but doesn't know lean and so delegates to AI to formalize is perhaps stillmore useful than that mathematician not getting involved at all.
Eric Wieser (Jun 03 2025 at 00:33):
Of course they could instead just open an issue flagging to skilled humans that it would be a good formalization target!
Junyan Xu (Jun 03 2025 at 00:45):
The familiarity with Lean 4 is remarkable. The renowned coding model has really seen a lot of code :)
Is this Sonnet or Opus? I can't tell from the shared convo.
How long did it think? (I assume a nontrivial period of time.)
Note that someone has taken Schinzel's Hypothesis so some coordination is in order.
Junyan Xu (Jun 03 2025 at 00:47):
Where does the habit of indenting hypotheses by only two spaces come from?
Eric Wieser (Jun 03 2025 at 00:59):
lean 3, perhaps?
Alex Kontorovich (Jun 03 2025 at 13:53):
Kevin Buzzard said:
I would never trust AI to state definitions and conjectures because there are no guardrails. Everything has to be extremely carefully human-checked here to the extent that it's not entirely clear if you are gaining anything.
Oh I agree completely that you can't ever trust what it produces (especially so for def's and conj's). But I disagree about the gain -- in my experience, it's way easier to edit than to compose. I let the AI compose, then edited it to my (and Lean's) liking. And now it's going through thorough PR review. So I agree that the gain may not be as large as advertised, but I think it's not nothing...
Alex Kontorovich (Jun 03 2025 at 14:06):
Junyan Xu said:
The familiarity with Lean 4 is remarkable. The renowned coding model has really seen a lot of code :)
Is this Sonnet or Opus? I can't tell from the shared convo.
How long did it think? (I assume a nontrivial period of time.)
Looks like I was running Sonnet 4.0. It responded basically instantaneously, I probably waited 15 secs for it to type out the formalization and its discussion.
Note that someone has taken Schinzel's Hypothesis so some coordination is in order.
Oops! Sorry to have gone out of turn... (It could now be fun to prove that Shinzel's H follows "immediately" from Bateman-Horn!...)
Paul Lezeau (Jun 03 2025 at 14:30):
Oops! Sorry to have gone out of turn... (It could now be fun to prove that Shinzel's H follows "immediately" from Bateman-Horn!...)
I think having both is fine! And we should certainly add the implication (formalising those sorts of proofs is often a good way to spot misformalisations)
Alex Kontorovich (Jun 03 2025 at 14:55):
I realized it was probably a faux pas to use Claude to formalize something ... for a Google repo. :sweat_smile: So I thought I'd rectify this with another experiment: let's state Deligne's theorem (just for Ramanujan) using Gemini 2.5 Pro. You can see the conversation here: https://gemini.google.com/share/e668efd40638
First it really insisted that was already defined in Mathlib, despite me repeatedly telling it that it was not. It even provided numerous links that went to 404 errors (I guess I would assume it would try to Google (!) the links before giving them?).
At some point it dawned on me to see if Deligne was already there, and indeed it was: https://github.com/google-deepmind/formal-conjectures/blob/main/FormalConjectures/Wikipedia/RamanujanTau.lean
But a couple of interesting things: Gemini had a good idea that if you just want to compute and not properly define the modular form, you only need to take a finite product. (This will surely cost you later, but as a pure definition of , it's not so bad: is the nth coefficient of the finite polynomial .) The human formalization in the library goes straight for the infinite product, def Δ := X * ∏' (n : ℕ+), (1 - X ^ (n : ℕ)) ^ 24, and I wonder how easy it is to compute from that?... (For example, is left as a sorry...)
Also, I notice that there's a human (I guess I'm presuming it was a human!) statement: lemma multipliable : Multipliable fun n : ℕ+ ↦ ((1 - X ^ (n : ℕ)) ^ 24 : PowerSeries ℤ), which I understand to be API (to compute ). But in terms of Ramanujan's original conjectures, I would've expected to see lemma tau_multiplicative : IsMultiplicative τ := sorry... Maybe worth a PR?
Kevin Buzzard (Jun 03 2025 at 15:06):
Yes when I give Ramanujan (and multiplicativity of ) as a problem to olympiad kids I define using exactly this finite product trick!
Junyan Xu (Jun 03 2025 at 15:14):
I proposed this definition but it wasn't adopted. We should probably add this and state the equality with the current definition as an API lemma.
Alex Kontorovich (Jun 03 2025 at 15:33):
Kevin Buzzard said:
Yes when I give Ramanujan (and multiplicativity of ) as a problem to olympiad kids I define using exactly this finite product trick!
I guess when I teach it (e.g.: Lecture 16 in https://sites.math.rutgers.edu/~alexk/2023S572) I do state as an infinite product, but then I compute a few values of to show how easy it is, because after , you're multiplying all 1's ("same thing" as having a finite product...)
Bhavik Mehta (Jun 03 2025 at 15:58):
Yup, we discussed the definition as you mention, but I think it's nice to have the modular form actually defined too, but having the extra API lemmas you mention is useful, and I think mathlib should eventually have API to make this kind of coefficient calculation easier
Junyan Xu (Jun 03 2025 at 16:12):
To close the Multipliable sorry multipliable_one_add_of_summable is probably useful :)
I'm not sure what the norm of X should be. 1 / Real.exp 1?
Chris Birkbeck (Jun 03 2025 at 16:16):
Just to say that I am working on some PRs that define as a modular form and obviously proves (a version of) this multipliable. But most of that is overkill for the formal conjectures
Bhavik Mehta (Jun 03 2025 at 16:16):
I'd be interested to hear @Antoine Chambert-Loir's thoughts on this. My guess is that we want to say that if coeff_m f_n -> 0 as n -> infty for all m > 0 (not necessarily uniformly in m) and coeff_0 f_n is eventually constantly 1, then f_n is multipliable
Bhavik Mehta (Jun 03 2025 at 16:17):
Chris Birkbeck said:
Just to say that I am working on some PRs that define as a modular form and obviously proves this multipliable. But most of that is overkill for the formal conjectures
I think your PRs show it's multipliable as a function H -> C, rather than as a power series?
Chris Birkbeck (Jun 03 2025 at 16:17):
Yes definitely, it's as a modular form
Seewoo Lee (Jun 03 2025 at 19:18):
Alex Kontorovich 말함:
I realized it was probably a faux pas to use Claude to formalize something ... for a Google repo. :sweat_smile: So I thought I'd rectify this with another experiment: let's state Deligne's theorem (just for Ramanujan) using Gemini 2.5 Pro. You can see the conversation here: https://gemini.google.com/share/e668efd40638
First it really insisted that was already defined in Mathlib, despite me repeatedly telling it that it was not. It even provided numerous links that went to 404 errors (I guess I would assume it would try to Google (!) the links before giving them?).
At some point it dawned on me to see if Deligne was already there, and indeed it was: https://github.com/google-deepmind/formal-conjectures/blob/main/FormalConjectures/Wikipedia/RamanujanTau.lean
But a couple of interesting things: Gemini had a good idea that if you just want to compute and not properly define the modular form, you only need to take a finite product. (This will surely cost you later, but as a pure definition of , it's not so bad: is thenth coefficient of the finite polynomial .) The human formalization in the library goes straight for the infinite product,def Δ := X * ∏' (n : ℕ+), (1 - X ^ (n : ℕ)) ^ 24, and I wonder how easy it is to compute from that?... (For example, is left as asorry...)
Also, I notice that there's a human (I guess I'm presuming it was a human!) statement:lemma multipliable : Multipliable fun n : ℕ+ ↦ ((1 - X ^ (n : ℕ)) ^ 24 : PowerSeries ℤ), which I understand to be API (to compute ). But in terms of Ramanujan's original conjectures, I would've expected to seelemma tau_multiplicative : IsMultiplicative τ := sorry... Maybe worth a PR?
I thought about this too, but my current thought is that the polynomial version might be more suitable for the actual mathlib, and for the formal conjecture repository I prefer to state as close as the original statement. Also as @Chris Birkbeck mentioned, the true (as a cusp form) will be in mathlib at some point, and I think it might be a good idea to have 1) polynomial version of it 2) prove that it converges in the pi-topology 3) use it to compute first few terms, where the last would be useful for the actual computation of -series of modular forms (and probably for more general -products, too).
Chris Birkbeck (Jun 03 2025 at 19:30):
I don't really know how much this would help compute the first few terms for modular forms. As a modular form we can just look at the q-expansion which we will have and that's not so bad (once we have the stuff in mathlib)
Chris Birkbeck (Jun 03 2025 at 19:35):
I agree it's the right definition for the formal conjectures. But I don't think I'd find much use for this in mathlib, since I only care for it as a modular form (but maybe that's just me).
Antoine Chambert-Loir (Jun 03 2025 at 19:52):
mkn
Bhavik Mehta said:
My guess is that we want to say that if coeff_m f_n -> 0 as n -> infty for all m > 0 (not necessarily uniformly in m) and coeff_0 f_n is eventually constantly 1, then f_n is multipliable
Of course, not. Take .
Antoine Chambert-Loir (Jun 03 2025 at 19:57):
Junyan Xu said:
I proposed this definition but it wasn't adopted. We should probably add this and state the equality with the current definition as an API lemma.
What cshould be proved is a congruence theorem: take an infinite product of power series such that ; then , where the latter product is finite.
Antoine Chambert-Loir (Jun 03 2025 at 20:00):
The infinite product of power series of power series with is a non archimedean business, it has little to do with the lemma that is used in complex analysis to ensure convergence of infinite products as holomorphic functions.
Antoine Chambert-Loir (Jun 03 2025 at 20:10):
With @María Inés de Frutos Fernández we defined infinite sums of power series, and I don't remember having pushed to mathlib the business of infinite products.
-
For powers series in where has the discrete topology, probably more generally a linear topology, infinite products can be defined in a very algebraic mood, as Bourbaki does in Algebra, Chapter 5 (Polynomials and rational functions), §4.
-
Results such as docs#multipliable_one_add_of_summable are interesting in other contexts, but mostly archimedean ones, for example power series with complex coefficients.
Antoine Chambert-Loir (Jun 03 2025 at 20:10):
@Junyan Xu can you explain your question mark?
Junyan Xu (Jun 03 2025 at 20:14):
I think docs#multipliable_one_add_of_summable can be used to prove lemma multipliable : Multipliable fun n : ℕ+ ↦ ((1 - X ^ (n : ℕ)) ^ 24 : PowerSeries ℤ), so I'm not sure about the "has little to do".
I think ℤ[[X]] doesn't currently have a norm on it, but there certainly exists (many) suitable ones.
Antoine Chambert-Loir (Jun 03 2025 at 20:16):
In that generality, I'm not so sure. But if you restrict to power series of radius of convergence , for , yes.
Antoine Chambert-Loir (Jun 03 2025 at 20:17):
On the other hand, with the existing topology on , the infinite product that you give converges.
Junyan Xu (Jun 03 2025 at 20:26):
You mean PowerSeries.WithPiTopology.instTopologicalSpace, which is only a scoped instance (and it's indeed used in the RamanujanTau file).
Antoine Chambert-Loir (Jun 03 2025 at 20:26):
Yes.
Antoine Chambert-Loir (Jun 03 2025 at 20:26):
It's a scoped instance because you don't necessarily want to have this topology on if you're doing complex analysis.
Alex Kontorovich (Jun 03 2025 at 22:06):
At lunch at IAS, I told some folks about the formal conjectures project and how good Claude was at (what I call) "quasi-autoformalization", that is, the LLM writes the code but is not expected to completely one-shot it (so a human is allowed to make small tweaks to fix the code, rather than iterating hoping to converge to perfect AI written code). And they started throwing out other conjectures to see what Claude could do (maybe it was only good at number theory or something). So... here's the next PR, Gromov's theorem that a finitely generated group has polynomial growth if and only if it's virtually nilpotent: https://github.com/google-deepmind/formal-conjectures/pull/153
Yaël Dillies (Jun 03 2025 at 22:08):
Ah, we have this one already in LeanCamCombi
Alex Kontorovich (Jun 03 2025 at 22:08):
Cool! Pardon my ignorance, can you please point me to LeanCamCombi?
Paul Lezeau (Jun 03 2025 at 22:23):
Alex Kontorovich (Jun 03 2025 at 22:30):
But where do I find Gromov's theorem? I poked around and didn't see it immediately... Thanks!
Junyan Xu (Jun 03 2025 at 22:31):
https://github.com/search?q=repo%3AYaelDillies%2FLeanCamCombi%20gromov&type=code
Alex Kontorovich (Jun 04 2025 at 03:58):
Thanks! I glanced at the statement (lemma theorem_1_2 : HasPolynomialGrowth G ↔ IsVirtuallyNilpotent G := showcased), but I don't see where it's assumed that G is finitely generated? (This is part of the trouble with just stating conjectures! The statements could be incomplete...) I see that it's baked into the definition of HasPolynomialGrowth, but that's not enough. E.g. the group is abelian (and hence IsVirtuallyNilpotent), but does not satisfy HasPolynomialGrowth..?
David Michael Roberts (Jun 04 2025 at 05:01):
Jason Rute said:
As for MathOverflow, you can also use the
open problemstag, sorted by score: https://mathoverflow.net/questions/tagged/open-problems?tab=Votes
This tag is a bit of a wonky one, there's been discussion about what it's actually for, and what it gets used for, and I couldn't say that you should take everything listed under that as raw material. It would need strong curation.
Yaël Dillies (Jun 04 2025 at 08:49):
Alex Kontorovich said:
I glanced at the statement (
lemma theorem_1_2 : HasPolynomialGrowth G ↔ IsVirtuallyNilpotent G := showcased), but I don't see where it's assumed thatGis finitely generated?
Aha, I was following those notes which claim the following. :sweat_smile:
Sébastien Gouëzel (Jun 04 2025 at 09:16):
A few lines above, the notes assume that G has a finite generating set, so everything is fine there.
Yaël Dillies (Jun 04 2025 at 09:23):
Ah! I did miss that. Anyway, updated in LeanCamCombi
Alex Kontorovich (Jun 04 2025 at 18:36):
And FYI Will Sawin kindly pointed out that Claude and I missed the product of degrees in the constant in Bateman-Horn. As I've said before, the responsibility for checking these things lies, as far as I'm concerned, solely with the human! So this is my fault, not Claude's. (I should expect it to make mistakes... It's up to me to correct them before putting them out.) Getting formalized conjectures right is hard!!... I've PRed a fix.
Alex Kontorovich (Jun 05 2025 at 03:12):
Further update: @Paul Lezeau pointed out that we didn't assume the polynomials weren't non-constant. (It's clear how to modify the conjecture, depending on whether the constant value is prime or not, but it ruins the asymptotics, and hence the conjecture as stated... AlphaProof can come along and say that it solved the sorry, in the negative! :grinning_face_with_smiling_eyes: ) So we had to add that, too.
And then we realized that at some point, the condition that the polynomials are distinct got dropped (which, to Claude's credit, it did have in the original formalization). Of course if two polynomials are the same, then on any input, they'll either both be prime or not. But again, it changes the asymptotic!! So that had to be added back in. [Edit: Nope, Finsets are automatically distinct, duh...]
I repeat: Getting formalized conjectures right is hard! (Hopefully now we're ok? Any more holes??)
Adam Marks (Jun 07 2025 at 14:36):
I will echo the assessment by @Alex Kontorovich that Claude Code is useful for "quasi-autoformalization". Of course of course of course, the human must carefully check the result for accuracy. It seems to me that the big improvement over older or even current chatbots is that Claude Code will proactively exec lean to check for errors, attempt to fix the errors, rinse and repeat. It will also synthesize and exec lean snippets to figure out how to use a particular mathlib definition, concoct commandline greps to find relevant code, etc. Outside of mathematics, for ordinary programming, I've had success with Claude Code, especially conducting it to perform small, incremental changes to a codebase, which is how I would conduct myself as a software engineer anyway (since time to debug a new error compounds with the number of features in the change, roughly). Keeping it on a short leash. Applying this same approach to Claude Code working on a lean codebase seems promising so far, although I'm only formalizing statements at the moment, not constructing proofs.
Adam Marks (Jun 07 2025 at 14:47):
It can also be useful to instruct Claude Code along the lines: "You are a pedantic mathematician. Find at least one critical flaw in the logical accuracy of the formalization of the statement." I've had this uncover flaws in statements that I formalized myself (newbie mistakes or typos highlighting my own failings; I'm still new to Lean). It seems to have a strong bias to "finish the job" and so if it can find something substantial it will produce flimsy complaints.
Junyan Xu (Jun 18 2025 at 23:02):
I'm adding some conjectures about ranks of elliptic curves over Q at #249 (bounded vs. unbounded + record holders, half rank 0 / half rank 1 + Bhargava--Shankar results). This 2013 paper seems not to be published in a journal, but I hope it's okay to mark the results therein as "solved".
Seewoo Lee (Jun 18 2025 at 23:05):
That reminds me simiar flavor but much easier ones to formalize, e.g. this paper on writing a number as sum of cubes
Tristan Stérin (Jun 20 2025 at 15:20):
Hello, when I run lake build on macos 14.2.1, after setting up a fresh fork in vscode, it stalls forever showing: [?/?] Computing build jobs. Would you have any ideas on how to fix this? I opened an issue
Tristan Stérin (Jun 20 2025 at 15:55):
It eventually finished, but failed, seemingly due to Mathlib
Tristan Stérin (Jun 20 2025 at 16:04):
It eventually worked by re-running lake build, I don't know why the first build failed.
Seewoo Lee (Jun 23 2025 at 17:33):
Is there anyone familiar with quantum gates? I formalized a conjecture 3.4 of this paper in this PR, and it seems that the variable in the conjecture is not used in the actual claim, but I want to make sure since I'm not an expert in this area.
Ivan S. (Jun 24 2025 at 07:20):
Seewoo Lee said:
Is there anyone familiar with quantum gates? I formalized a conjecture 3.4 of this paper in this PR, and it seems that the variable in the conjecture is not used in the actual claim, but I want to make sure since I'm not an expert in this area.
I skimmed the conjecture and the surrounding results, and it looks to me that you're right: $\epsilon$ does not appear in the statement or the proof of the conjecture. The only relation epsilon has to this that I can see is explained in the paragraph after the calculation: $k$ can always be chosen to be $t_{\epsilon}$. To be more certain would require getting the entire context from the 12 pages that come before this conjecture.
Have you tried contacting the authors of the preprint? I would imagine they would be happy to clarify this, especially because this is such a quick and specific question, and they already know all the context.
Seewoo Lee (Jun 24 2025 at 17:35):
I'll contact authors then, thank you for your confirmation!
Moritz Firsching (Aug 26 2025 at 06:56):
Can this channel be moved to "Projects" or can we get a dedicated thread in "Projects"? (This way there could be subtopics, which would make discussing individual conjectures easier and more organized)
Kim Morrison (Aug 26 2025 at 06:58):
It doesn't seem there is sufficient traffic here to warrant this --- it's been quiet for two months.
Moritz Firsching (Aug 26 2025 at 07:07):
Two points on that:
- I just had someone write to me writing questions on Formal Conjectures in a private zulip chat, when I suggested to use the main discussion thread they said it would be "a bit intimidating at least for me to post on
thethread. Maybe other beginners feel similarly" - There is a private non-zulip chat on the topic that is threaded an has traffic which could potentially move here.
So it might be a bit of a chicken-and-egg problem (and I didn't realize project channels is a scarce resource).
Yan Yablonovskiy 🇺🇦 (Aug 26 2025 at 07:39):
Kim Morrison said:
It doesn't seem there is sufficient traffic here to warrant this --- it's been quiet for two months.
Moritz Firsching said:
- I just had someone write to me writing questions on Formal Conjectures in a private zulip chat, when I suggested to use the main discussion thread they said it would be "a bit intimidating at least for me to post on
thethread. Maybe other beginners feel similarly"
Yea i will chime in to say that was me, though i appreciate the anonymity as it is a bit embarrassing. Naturally though the total amount of beginners may not be high..but i was feeling a bit strange chiming in asking a specific question both about the math, how to formulate my files for AI and how my examples should look (because its preferred to use theorems only and they have sorry, it makes things different when you state surrounding "results") etc. as these were at once too specific for that thread or any other thread/channel.
My other thought was that a thread has less modularity in terms of me (other beginners) asking my silly(beginner?) questions which will likely be answered immediately/uninterestingly/trivially, is it really worthwhile for every single person to need to see that while scrolling for other things.
Johan Commelin (Aug 26 2025 at 11:42):
If there's a lot of discussion happening in DMs that would move to a dedicated channel, let's create it. I'll do so now
Johan Commelin (Aug 26 2025 at 11:44):
Last updated: Dec 20 2025 at 21:32 UTC