Zulip Chat Archive
Stream: Machine Learning for Theorem Proving
Topic: ICLR 2025 Submissions
Jason Rute (Oct 05 2024 at 02:25):
There are so many formal theorem proving submissions to this year's ICLR. (Submissions to ICLR are public.) If you search for keywords like "theorem proving", "autoformalization", "lean", "isabelle" (and maybe others) you will see a lot. I think I count 12 papers for just the keyword "Lean" (that seem to be about the Lean theorem prover) and 1 for "Isabelle". (None for "Coq". :sad: ) Most are new to me. I'll try to make a list of all the theorem proving papers soon.
fzyzcjy (Oct 05 2024 at 02:31):
Wondering where can we find the PDFs? It seems that there is no buttons for that... Googling the titles for an arxiv also fails (for many papers)
Jason Rute (Oct 05 2024 at 10:35):
fzyzcjy said:
Wondering where can we find the PDFs? It seems that there is no buttons for that... Googling the titles for an arxiv also fails (for many papers)
Oh, I might have jumped the gun, but I would guess that the PDFs will be available on OpenReview in about a week or so.
fzyzcjy (Oct 05 2024 at 10:36):
Thanks!
Sid (Oct 07 2024 at 03:50):
A subset of the relevant papers that I compiled is below -
Synthetic Theorem Generation in Lean - https://openreview.net/forum?id=EeDSMy5Ruj - tl;dr - forward reasoning to generate theorems
Formal Theorem Proving by Rewarding LLMs to Decompose Proofs Hierarchically - https://openreview.net/forum?id=D23JcXiUwf - tl;dr - RL-based training algorithm that encourages the model to decompose a theorem into lemmas, prove the lemmas, and then prove the theorem by using the lemmas
ImProver: Agent-Based Automated Proof Optimization - https://openreview.net/forum?id=dWsdJAXjQD - tld;r - proof rewriting to make proofs shorter, more modular, more readable
Alchemy: Amplifying Theorem-Proving Capability Through Symbolic Mutation - https://openreview.net/forum?id=7NL74jUiMg - tl;dr - sounds like forward reasoning on existing theorems to generate new ones? Not clear to me tho
CARTS: Advancing Neural Theorem Proving with Diversified Tactic Calibration and Bias-Resistant Tree Search - https://openreview.net/forum?id=VQwI055flA
FormalAlign: Automated Alignment Evaluation for Autoformalization - https://openreview.net/forum?id=B5RrIFMqbe
3D-Prover: Diversity Driven Theorem Proving With Determinantal Point Processes - https://openreview.net/forum?id=7gGVDrqVaz - tl;dr - DPP to subselect for a diverse set of tactics to reduce branching factor
Collaborative Theorem Proving with Large Language Models: Enhancing Formal Proofs with ProofRefiner - https://openreview.net/forum?id=y9xNQZjUJM
LeanAgent: Lifelong Learning for Formal Theorem Proving - https://openreview.net/forum?id=Uo4EHT4ZZ8 -
SubgoalXL: SUBGOAL-BASED EXPERT LEARNING FOR THEOREM PROVING - https://arxiv.org/pdf/2408.11172
Herald: A Natural Language Annotated Lean 4 Dataset - https://openreview.net/forum?id=Se6MgCtRhz
Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search - https://openreview.net/forum?id=I4YAIwrsXa - This is the DeepSeek Prover 1.5 paper I believe
Process-Driven Autoformalization in Lean 4 - https://openreview.net/forum?id=k8KsI84Ds7 - tl;dr Lean compilation feedback to enhance autoformalization
Lean-STaR: Learning to Interleave Thinking and Proving - https://arxiv.org/abs/2407.10040 - tl;dr interleave natural language and Lean
Lean-ing on Quality: How High-Quality Data Beats Diverse Multilingual Data in AutoFormalization - https://openreview.net/forum?id=Qdp7hlenr6
Luigi Massacci (Oct 07 2024 at 12:10):
Hopefully without abusing of your kindness since you already went through all this trouble, do you know if any of the above are actually usable (without too much trouble) for someone actually working with lean?
Adam Topaz (Oct 07 2024 at 12:19):
I don't really know how openreview works. Is it possible to see the papers at this point?
Jason Rute (Oct 07 2024 at 12:23):
@Adam Topaz see below. My guess is they will be available in a few days.
Jason Rute said:
Oh, I might have jumped the gun, but I would guess that the PDFs will be available on OpenReview in about a week or so.
Adam Topaz (Oct 07 2024 at 12:23):
Thanks!
Jason Rute (Oct 07 2024 at 12:53):
Luigi Massacci said:
Hopefully without abusing of your kindness since you already went through all this trouble, do you know if any of the above are actually usable (without too much trouble) for someone actually working with lean?
It is difficult to tell since they are just abstracts at the moment (and when the PDFs are visible in about a week, they still will be anonymous without, say, GitHub links). Moreover, usability hasn't been a concern of most researchers. There are no incentives for it. Frankly, even the current "usable" models aren't very good, and few (if any) people use them. But, having said that, I have three caveats:
- Some trained LLM models like DeepSeek-Prover v1.5 are publically available on huggingface, so it is possible to easily code something up trying to use it locally from within Lean. (But it may not be useful since it was optimized for competition math problems.)
- Some papers use existing LLM APIs. Even if not public, they could be reimplemented with minimal work. (The ImProver paper may be an interesting one, but again I can't tell from the abstract.)
- The many autoformalization papers this year may lead to a lot of possibilities for useful applications. It may be that someone on one of those papers made something useful, or if not, that it would be worth it to reimplement their ideas.
Sid (Oct 07 2024 at 19:04):
Luigi Massacci said:
Hopefully without abusing of your kindness since you already went through all this trouble, do you know if any of the above are actually usable (without too much trouble) for someone actually working with lean?
To add to Jason Rute's excellent answer, InternLM2-Step-Prover is another model I've found fairly usable. But ya, until the papers' text is released it's hard to judge. I think some of them are already on arxiv and for those, the Lean-Star paper is something that I think the DeepSeek team took some ideas from so that might be a good one to check out. The DeepSeek 1 and 1.5 Prover papers are very good reads as well. Lots of good ideas.
Luigi Massacci (Oct 07 2024 at 19:48):
Jason Rute said:
It is difficult to tell since they are just abstracts at the moment (and when the PDFs are visible in about a week, they still will be anonymous without, say, GitHub links).
My apologies, I typed my message from the phone so I didn’t look carefully at the website, and assumed it was some kind of preprint setup from the name.
Moreover, usability hasn't been a concern of most researchers. There are no incentives for it.
I’m sadly aware of that (you explained really well why that is the case in the past somewhere around here) but hope is the last to die…
As for the rest, anything beginning by “reimplement” doesn’t qualify as “usable” : (
Incidentally, @Kim Morrison, I seem to remember the FRO was working on ML integration, or did I hallucinate that? Arguably providing some kind of more plug and play pipeline is the kind of important but not incentivised by the system™ development where the FRO excels
Zheng Yuan (Oct 15 2024 at 00:55):
Paper pdfs are visible now
RexWang (Oct 18 2024 at 02:18):
After searching for terms like "Lean,Isabelle,miniF2F,ProofNet" and "autoformal", I found 20 papers related to formal proofs. All of these papers use Lean, except for 4-StepProof, 6-SubgoalXL, and 9-ProD-RL, which use Isabelle. (None for Coq)
Here's the list:
And Some other works:
-
FOL Reasoning:
-
SAT:
-
Automatic Geometry Problem Solving (GPS):
Kaiyu Yang (Oct 19 2024 at 17:51):
@RexWang That's really useful. Thanks! FYI, ProofRefiner is plagiarizing LeanCopilot and has been desk rejected.
Kevin Buzzard (Oct 19 2024 at 18:28):
Indeed the abstracts of those two papers are surprisingly similar.
fzyzcjy (Oct 21 2024 at 06:03):
A longer list for papers related to AI for mathematics in ICLR 2025 submissions: https://github.com/fzyzcjy/ai_math_paper_list. In total there are ~150 papers, since many of them are not formal methods. I mainly do it because of my personal need, and put it here since it might be useful for someone.
Quinn (Oct 28 2024 at 23:27):
i link to this thread in this month's Guaranteed Safe AI Newsletter https://gsai.substack.com/p/october-2024-progress-in-guaranteed
Last updated: May 02 2025 at 03:31 UTC