Zulip Chat Archive
Stream: general
Topic: Exploring Autonomously Generated Theorems in Number Theory
Gerard Calvo Bartra (Feb 20 2024 at 17:05):
Hi! I'm working on a project inspired by AlphaGeometry, aiming to generate a synthetic dataset of valid lemmas and proofs in number theory. This approach differs from traditional theorem proving in that it doesn't start with a predefined lemma to prove (the goal). Instead, it explores by autonomously constructing valid proofs from axioms, by randomly sampling actions (e.g., applying "reversed" tactics, if that makes any sense), thereby creating a broad spectrum of lemmas and theorems. I'm curious whether Lean can be adapted for such an explorative, "reverse" methodology, focusing on discovery rather than verification. Does anyone have insights on leveraging Lean for autonomous exploration and dataset generation?
David Renshaw (Feb 20 2024 at 17:20):
The `propose` tactic is a small-scale example of forward reasoning: https://github.com/leanprover-community/mathlib4/blob/b82dd45e7e6dbb2ea112d2419d1c8033c2f5e235/Mathlib/Tactic/Propose.lean#L13-L33
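Roughly, a forward step derives new facts from the hypotheses before looking at the goal; `propose` automates the search for such steps. A minimal hand-written sketch of the same idea, using only core lemmas:

```lean
-- A forward-reasoning step done by hand: a new fact is derived from the
-- hypotheses with `have` before the goal is consulted at all.
example (a b : Nat) (h1 : a = b) (h2 : b = 0) : a = 0 := by
  have h3 : a = 0 := h1.trans h2  -- forward step: combine h1 and h2
  exact h3
```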
Kevin Buzzard (Feb 20 2024 at 17:20):
Not answering your question at all, but for alphageometry I had a clear idea of what the target was: an Olympiad style 2d geometry problem. What is your target here? Number theory is a far broader topic than 2d Euclid-style geometry (not least because it's very easy to write down easy-to-state but hard-to-prove problems such as Goldbach, twin prime, Fermat, Collatz etc).
Jason Rute (Feb 20 2024 at 17:50):
There are a few good discussions on this in the machine learning for theorem proving stream. There is also a project where someone did something like this already for Lean 3.
But you need to be careful. I think random sampling is going to get you nowhere. Also, Lean (and similar theorem provers like Coq) uses type checking during proof checking, so a proof like `rfl` could mean infinitely many theorems, such as `0 = 0`, `0 = 0 + 0`, `0 = 12 - 12`, and so on.
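Concretely, in Lean 4 all of the following are closed by the very same proof term:

```lean
-- One and the same proof term checks against many different statements,
-- because the kernel reduces each right-hand side to 0.
example : (0 : Nat) = 0       := rfl
example : (0 : Nat) = 0 + 0   := rfl
example : (0 : Nat) = 12 - 12 := rfl
```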
Jason Rute (Feb 20 2024 at 17:59):
I’m on my phone, but I can post more this afternoon.
Gerard Calvo Bartra (Feb 20 2024 at 18:11):
@David Renshaw the "propose" tactic seems to be a step in the direction I'm exploring. I'll delve deeper into how it generates forward reasoning steps. Thank you for sharing this!
Gerard Calvo Bartra (Feb 20 2024 at 18:26):
@Kevin Buzzard, thank you for your insight. The project's experimental nature aims at a long-term vision similar to AlphaGeometry—where a sufficiently diverse set of lemmas could train a language model to approach proofs across the entire domain of number theory. Yet, acknowledging the breadth of the field, the project might indeed benefit from narrowing the focus initially.
My guess is that I could constrain the algorithm's exploratory scope by beginning with fundamental number theory concepts—like natural numbers, prime numbers, and elementary operations such as modulo and division. By limiting the algorithm to established definitions and axioms without the capacity to introduce new definitions autonomously, I would be effectively setting up a controlled environment where any new lemma discovered remains strictly anchored to these core principles. This approach should ensure the explorations yield lemmas connected to natural and prime numbers, providing a structured yet rich dataset for further development.
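As a toy picture of what this restriction could look like in Lean (just a sketch; the core lemmas `Nat.le_refl` and `Nat.succ_le_succ` stand in for the allowed "actions"):

```lean
-- Toy illustration of a restricted action space: the only "moves" are
-- Nat.le_refl (a seed fact) and Nat.succ_le_succ (a forward step).
-- Every statement reachable this way stays inside the chosen fragment.
example : 2 ≤ 2 := Nat.le_refl 2
example : 3 ≤ 3 := Nat.succ_le_succ (Nat.le_refl 2)
example : 4 ≤ 4 := Nat.succ_le_succ (Nat.succ_le_succ (Nat.le_refl 2))
```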
Gerard Calvo Bartra (Feb 20 2024 at 18:33):
@Jason Rute Thanks for the insights! You are right; I'll think about how I can ensure meaningful contributions are created beyond trivial identities. Any further discussions or resources you could share on Lean 3's related projects would be highly valuable to my research.
Jason Rute (Feb 20 2024 at 21:54):
Here are a bunch of quick links and thoughts:
- Using AI to generate theorems in Metamath as training data for AI: https://proceedings.neurips.cc/paper/2020/file/d2a27e83d429f0dcae6b937cf440aeb1-Paper.pdf
- Metamath has a few advantages here, but ultimately it mostly depends on how good your AI generator is.
- PA.SE question about systems similar to the above paper.
- I give an answer there with similar systems.
- @Joe Palermo and co-authors had a short workshop abstract, recorded talk, and repo about generating synthetic theorems in Lean.
- The main idea is that one can use language models to synthetically generate valid Lean term proofs.
- We had a lot of discussions on Zulip about whether this was feasible. See various threads by @Joe Palermo, for example, #new members > Inferring Theorem from a Proof Term? and #Machine Learning for Theorem Proving > Reinforcement learning for ATP.
- There are two things to keep in mind with a project like this. First, if you generate the proof randomly (or systematically), there is a high chance the proof will be incorrect. Let's say I have a theorem `foo (n : Nat) (h : n > 0) : (n - 1) + 1 = n`. To use this theorem, I have to supply two arguments. The first is just a `Nat`, so maybe I randomly (or systematically) pick `n := 0`. That is fine, but now I have to supply for the second argument a proof of `n > 0`, which is false in this case (or maybe non-trivial in other cases). So AI probably helps to generate both interesting and valid proofs. (A concrete sketch of this failure mode follows this list.)
- The other thing to keep in mind is that a proof term does not uniquely specify the theorem. I already mentioned the example of `@rfl Nat 0`, which is a proof of both `0 = 0` and `0 = 100 - 50 - 40 - 10`.
- There was another recent PA.SE question about a similar topic.
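Here is the first caveat spelled out (a minimal sketch; `foo` is the theorem from the list above, proved here with core lemmas):

```lean
theorem foo (n : Nat) (h : n > 0) : (n - 1) + 1 = n :=
  Nat.succ_pred_eq_of_pos h

-- Randomly picking n := 0 leaves an unpayable debt: the second argument
-- would have to prove `0 > 0`, which is false, so this branch is a dead end.
-- #check foo 0 _   -- no proof of `0 > 0` exists

-- Picking n := 1 works, since `1 > 0` is provable:
#check foo 1 (Nat.succ_pos 0)   -- foo 1 (Nat.succ_pos 0) : 1 - 1 + 1 = 1
```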
Kevin Buzzard (Feb 20 2024 at 22:26):
One big difference between AlphaGeometry and number theory is that 2d Euclidean geometry is decidable, whereas the theory of Diophantine equations (which only needs addition, multiplication, integers, and equality) is undecidable, so it is extremely easy to run into research-level, unsolved, or even unsolvable problems very quickly. For example, "if n is a natural number then 2n+4 is the sum of two primes" is unsolved (but believed to be true). This is in stark contrast to 2d Euclidean geometry, where if you compose two or three basic notions (the analogue of the question above) you can only make trivial questions or, at best, ones which were resolved centuries ago. So the landscape is somehow extremely different.
Joseph Myers (Feb 21 2024 at 00:05):
Olympiad-style functional equations are a fairly narrow topic like geometry (at least if you ignore the more unusual examples such as IMO 2019 shortlist G8, a combinatorial geometry functional equation), but probably not decidable, and they don't really have a limited set of valid moves you can expect to suffice to solve all such problems (integer functional equations might sometimes use arbitrary bits of number theory, for example).
Gerard Calvo Bartra (Feb 21 2024 at 07:49):
@Kevin Buzzard The thing is that I am limiting the scope of the proofs, too, and the idea is to iteratively construct valid proofs and only then read off the lemmas (not the other way around!), so the proofs come before the statements. This means that, in theory, the algorithm should only discover lemmas with relatively straightforward proofs if I limit the scope, and, more importantly, I would have the associated proof for any lemma discovered. In other words, the "if n is a natural then 2n+4 is the sum of two primes" lemma will never appear in the dataset unless we also discover a proof of that statement. Therefore, I am not worried that easy-to-state statements in number theory might have complicated proofs. In fact, this is one of the motivations for exploring the number theory landscape! But please correct me if I have a wrong vision for this project.
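Concretely, this "proof first, statement second" direction is easy to mimic in Lean: build a term by composing known lemmas, then read the statement off its type (a minimal sketch; `discovered` is just an illustrative name):

```lean
-- build the proof term first, by composing known lemmas...
def discovered := Nat.add_comm 2 3

-- ...then read the "discovered" lemma off its type:
#check discovered   -- discovered : 2 + 3 = 3 + 2
```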
Gerard Calvo Bartra (Feb 21 2024 at 08:17):
@Jason Rute Thank you for sharing! I'll take a deep look at these resources.
Regarding the article by @Joe Palermo, I'd like to keep the synthetic data generation away from language models trained on human theorems to avoid any possible bias toward generating already known (or similar) theorems and proofs. Nonetheless, I am unsure if this task can be accomplished without language models.
Also, the Reinforcement Learning for ATP thread seems intriguing. I'll give my thoughts on that once I better grasp all the content you shared.
Finally, regarding:
The other thing to keep in mind is that a proof term does not uniquely specify the theorem. I already mentioned the example of `@rfl Nat 0`, which is a proof of both `0 = 0` and `0 = 100 - 50 - 40 - 10`.
You are right, but it seems to me that all the theorems generated from the same proof could collapse into a single, more general one, although I do not know whether this is always the case. In your example, `a = a` would subsume all the reflexivity instances, and one could regard your examples as just specific cases of `a = a`, once all the numbers are evaluated.
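For instance, a quick check (with an illustrative name `my_refl`) suggests the collapse works at least for closed numerals, by instantiation plus evaluation:

```lean
-- the general statement...
theorem my_refl (a : Nat) : a = a := rfl

-- ...specializes to the earlier instances via kernel reduction:
example : (0 : Nat) = 0 := my_refl 0
example : (0 : Nat) = 100 - 50 - 40 - 10 := my_refl 0
```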
Jason Rute (Feb 21 2024 at 11:23):
Maybe I gave too simple an example. Another one is `example (n : Nat) : n + 0 = n := @rfl Nat (n + 0)`, but the same proof term doesn't work for `0 + n = n`. Does this matter for your project? I don't know. But I just want to point it out.
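Spelled out (the commented-out line is the one Lean rejects):

```lean
-- `n + 0` reduces to `n` because Nat.add recurses on its second argument,
-- so the bare reflexivity term is accepted:
example (n : Nat) : n + 0 = n := @rfl Nat (n + 0)

-- `0 + n` does not reduce when `n` is a variable, so the same trick fails:
-- example (n : Nat) : 0 + n = n := @rfl Nat (0 + n)  -- kernel rejects this

-- this direction needs an actual lemma:
example (n : Nat) : 0 + n = n := Nat.zero_add n
```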