Zulip Chat Archive
Stream: Machine Learning for Theorem Proving
Topic: Autoformalisation
Cris Salvi (Aug 11 2024 at 08:39):
Are there any recent surveys on autoformalisation that include a description of the techniques used and a link to the relevant code? Honest accounts of both successes and failures would be particularly appreciated.
Jason Rute (Aug 11 2024 at 21:03):
I don’t think there is such a survey. Autoformalization is pretty recent, and I don’t think we have hit a stable point yet; any survey would quickly be out of date.
Jason Rute (Aug 11 2024 at 21:03):
Here are some key points that might help:
- Qingxiang Wang has a few papers trying out autoformalization using machine translation
- Christian Szegedy wrote A Promising Path Towards Autoformalization and General Artificial Intelligence, which is a manifesto of sorts on the topic. The idea is that autoformalization and automated theorem proving can strengthen each other in a virtuous reinforcement-learning loop (see the schematic sketch after this list).
- Christian Szegedy gave a talk at AITP 2020 showcasing how hard this topic is and the brick walls they were running into.
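A schematic of that virtuous loop; this is an illustrative sketch, not code from any of the papers above, and every component name (`autoformalize`, `attempt_proof`, `finetune`) is a hypothetical placeholder:

```python
# Illustrative only: autoformalized statements feed a prover, and whatever the
# prover manages to prove feeds back as training data for both components.
def virtuous_loop(informal_corpus, translator, prover, rounds=3):
    training_data = []
    for _ in range(rounds):
        # 1. Translate informal statements into candidate formal statements.
        candidates = [translator.autoformalize(s) for s in informal_corpus]
        # 2. Attempt each candidate; the proof checker filters out junk,
        #    since only machine-checked proofs survive this step.
        proved = []
        for c in candidates:
            proof = prover.attempt_proof(c)
            if proof is not None:
                proved.append((c, proof))
        # 3. Successful (statement, proof) pairs become new training data
        #    for both the translator and the prover.
        training_data.extend(proved)
        translator.finetune(training_data)
        prover.finetune(training_data)
    return training_data
```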
Jason Rute (Aug 11 2024 at 21:03):
Then everything changed in 2022 with the paper Autoformalization with Large Language Models. The main idea was that code LLMs (Codex at the time) could do autoformalization at a sufficiently useful level (about 25% correct for competition-problem theorem statements, IIRC). That was good enough for generating additional data for reinforcement learning of automated theorem provers.
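To make the idea concrete, here is a minimal sketch of few-shot statement autoformalization; the prompt text and `query_llm` are hypothetical placeholders, not the actual prompt or API from the paper:

```python
# Minimal few-shot autoformalization of a theorem *statement* (not a proof).
# `query_llm` stands in for whatever LLM completion API is being used.
FEW_SHOT_PROMPT = """Translate the natural language statement into a Lean theorem statement.

Natural language: The sum of two even integers is even.
Lean: theorem sum_even (a b : ℤ) (ha : Even a) (hb : Even b) : Even (a + b) := sorry

Natural language: {statement}
Lean:"""

def autoformalize_statement(statement: str, query_llm) -> str:
    """Return the model's candidate Lean statement for an informal statement."""
    prompt = FEW_SHOT_PROMPT.format(statement=statement)
    return query_llm(prompt).strip()
```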
Jason Rute (Aug 11 2024 at 21:04):
Shortly after that paper, a number of works came out (many talked about on this Zulip at length):
- LeanChat (a pre-ChatGPT chat tool for autoformalization into Lean)
- LeanAide: Autoformalization connected to Lean metaprogramming, so it can better find and fix translation mistakes
- Draft-Sketch-Prove: Draft informal proofs, formalize them into proof sketches, and fill in the remaining details with a hammer; many later papers on automated theorem proving for Isabelle extend Draft-Sketch-Prove (see the pipeline sketch after this list).
- ProofNet: A benchmark for Lean made by autoformalizing textbook exercises
- Don't Trust: Verify -- Grounding LLM Quantitative Reasoning with Autoformalization: I think the idea is to check informal LLM-generated answers by using autoformalization.
- Multilingual Mathematical Autoformalization: Using autoformalization in reverse (formal to informal) to generate training data for LLMs.
- DeepSeek-Prover and AlphaProof both use autoformalized competition problems at large scale (millions of problems) to train automated theorem provers.
(I’m sure I’m missing a lot of works.)
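As a concrete reading of the Draft-Sketch-Prove idea above, here is a rough pipeline sketch; `draft_llm`, `sketch_llm`, and `hammer` are hypothetical stand-ins for the paper's actual components:

```python
# Illustrative Draft-Sketch-Prove pipeline: draft an informal proof, turn it
# into a formal skeleton with open goals, then try to close each goal with a
# hammer-style automation call.
def draft_sketch_prove(theorem_statement, draft_llm, sketch_llm, hammer):
    # 1. Draft: produce an informal (natural language) proof of the statement.
    informal_proof = draft_llm(theorem_statement)
    # 2. Sketch: map the informal proof to a formal skeleton whose low-level
    #    steps are left as open goals (e.g. `sorry` placeholders).
    formal_sketch, open_goals = sketch_llm(theorem_statement, informal_proof)
    # 3. Prove: ask the hammer to discharge each remaining goal.
    closed = {goal: hammer(goal) for goal in open_goals}
    if all(proof is not None for proof in closed.values()):
        return formal_sketch, closed  # complete, machine-checkable proof
    return None  # some gap could not be closed
```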
Jason Rute (Aug 11 2024 at 21:05):
Also, many people have been converting Lean theorems into informal statements, either for better documentation (automatic documentation), for semantic search (Moogle, LeanSearch), or to make training data for automated theorem proving.
Jason Rute (Aug 11 2024 at 21:05):
I’m less aware of work pursuing autoformalization for larger projects (e.g. formalizing a book or a paper), but I think that would be the next big goal. Another big goal would be more systematic autoformalization, like formalizing a whole database or an encyclopedic reference of some sort, or trying to scrape and formalize all theorems of a particular form in some area of mathematics.
Jason Rute (Aug 11 2024 at 21:12):
Also just found this benchmark on autoformalization in Lean: An Evaluation Benchmark for Autoformalization in Lean4
Jason Rute (Aug 11 2024 at 21:14):
And this: Process-Driven Autoformalization in Lean 4
Jason Rute (Aug 11 2024 at 21:16):
And: Lean Workbook: A large-scale Lean problem set formalized from natural language math problems (discussed in #Machine Learning for Theorem Proving > Lean Workbook: A large-scale Lean formalized problem set)
Cris Salvi (Aug 12 2024 at 08:23):
This is great @Jason Rute thanks a lot for the response.
Cris Salvi (Jul 16 2025 at 07:57):
@Jason Rute — Would you say that approaches based entirely on LLMs are currently the default methods for autoformalisation? Also, as far as you know, has the combination of MCTS + LLMs + RL with Lean feedback been considered/explored?
Shashank Kirtania (Aug 26 2025 at 05:03):
I see a lot of effort (like some of the papers mentioned in this thread) on translating informal theorems to formal theorems, but not so much on informal proofs to formal proofs.
Why is that? Is it because formal and informal proof writing are quite different, because it’s a much harder problem to solve, or something else?
Phillip Harris (Aug 26 2025 at 19:27):
I think it's just a lot harder than merely translating a theorem statement, but there's a fair number of people working on this.
Jason Rute (Aug 28 2025 at 11:53):
Phillip Harris said:
there's a fair number of people working on this.
Who are you referring to? I think the Trinity stuff by Morph is the only thing of that sort released, especially focusing on the harder case where the proof has intermediate definitions and lemmas.
Jason Rute (Aug 28 2025 at 11:53):
(Although as theorem proving gets more advanced, there is a fair amount of proof formalization going on already in automated theorem proving, since usually the model first comes up with a separate natural language proof, either in its chain of thought or through another model. It seems easier for a model to do complex math reasoning in natural language than in formal language.)
Bas Spitters (Aug 28 2025 at 13:29):
@Jason Rute Only one proof has been published, hasn't it? Not the tool itself. At least, I could not find it here:
https://github.com/morph-labs
(deleted) (Aug 28 2025 at 14:13):
That's right. The tool is not publicly available
Jason Rute (Aug 28 2025 at 14:16):
Yeah, by “release” I mean releasing results, not tools.
Phillip Harris (Sep 02 2025 at 15:32):
Jason Rute said:
Phillip Harris said:
there's a fair number of people working on this.
Who are you referring to? I think the Trinity stuff by Morph is the only thing of that sort released, especially focusing on the harder case where the proof has intermediate definitions and lemmas.
Yeah, Morph and some other people who haven't released anything yet.
Yutong Wu (Sep 03 2025 at 05:43):
We evaluated common language models on the autoformalization task (just statement translation) and provided additional training focused on two key capabilities required for autoformalization models: formal domain knowledge and informal-to-formal reasoning.
Our paper is available at: https://arxiv.org/abs/2508.04440, and the model can be found at: https://huggingface.co/stepfun-ai/StepFun-Formalizer-32B
Although this work is still in its early stages, we hope it will be helpful to everyone.
Kelly Davis (Sep 03 2025 at 06:04):
Thanks for the evaluation. It is of help. Any plans on adding Goedel-Formalizer-V2 to the comparison?
PS: I know Goedel-Formalizer-V2 was released one day before your article, but maybe in 2508.04440v2?
Yutong Wu (Sep 03 2025 at 06:12):
Kelly Davis said:
Thanks for the evaluation. It is of help. Any plans on adding Goedel-Formalizer-V2 to the comparison?
PS: I know Goedel-Formalizer-V2 was released one day before your article, but maybe in 2508.04440v2?
Thank you for the reminder. We will add a comparison with Goedel-Formalizer-V2 in the revised version of the paper.
Last updated: Dec 20 2025 at 21:32 UTC