Zulip Chat Archive
Stream: Lean Together 2025
Topic: Siddhartha Gadgil: Real world Autoformalization
Jason Rute (Jan 15 2025 at 13:32):
It would be interesting to combine something like LeanAide (autoformalization) and leansearch.net (library search). It would be cool if, when you search for something on leansearch.net, it didn't just give you definitions/theorems from the library, but also showed how you could enter that definition/theorem yourself in Lean. (A web interface would make this a lot easier to use. Of course, cost is an issue.)
Jason Rute (Jan 15 2025 at 13:38):
@Siddhartha Gadgil How do you think LeanAide could help in the Lean Blueprint process? (I guess others can say where the pain points are, and where autoformalization could help.)
Jason Rute (Jan 15 2025 at 13:38):
Also, what are your thoughts on the feasibility of scaling something like this to formalizing a whole paper, say a paper with few definitions where all the background is standard math already in Lean? (Yes, this filters out most interesting papers.) Say you are willing to let something run in a loop for a week. Would there be any hope with current methods that this sort of thing could work? Or is this still science fiction?
Siddhartha Gadgil (Jan 15 2025 at 13:43):
I don't think this is science fiction, but work is needed to make it a reality. The hope would be that a language model does things in various stages:
- Add more details to the text if necessary (LLMs do this well).
- Break the text into small enough sections.
- Summarize the sections to act as context for other sections.
- Rewrite the section in a custom JSON/XML format, for example.
- Combine sentence-level translation with rule-based code generation to generate Lean code, with final proofs left to some form of automation (a sketch of such an end product is below). The automation is also given information about what results the text says are used in the proof.
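For instance (a hypothetical end product, not actual LeanAide output; it assumes Mathlib's lemma Odd.pow), the final stage might turn "the square of an odd number is odd" into:
import Mathlib

-- Statement from sentence-level translation; the proof is delegated to
-- automation, which is also told which library results the text cites
-- (here the Mathlib lemma `Odd.pow`).
theorem odd_sq (n : ℕ) (h : Odd n) : Odd (n ^ 2) := h.pow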
Siddhartha Gadgil (Jan 15 2025 at 13:44):
I believe the weak links are the sentence-level translation, the identification of the needed results, and the final automation - the last is getting better with grind, aesop, duper, etc.
Jason Rute (Jan 15 2025 at 13:46):
So you think the ability of the LLM to partition the proof into structured JSON is pretty good? That would be cool.
Siddhartha Gadgil (Jan 15 2025 at 13:48):
Generation of structured proofs seems very good, at least with GPT-4o. We have experimented a fair amount and it is generally correct.
Siddhartha Gadgil (Jan 15 2025 at 13:49):
@Anirudh Gupta is the one who has experimented the most. He can confirm how good.
Jason Rute (Jan 15 2025 at 13:50):
Ok, how expensive would it be to do this at scale, like 1000s of papers? Also, do you have a good example of what this structured output looks like? Does it consist of concrete mathematical steps which can be atomically formalized?
Siddhartha Gadgil (Jan 15 2025 at 13:52):
The structured output we use is based on very detailed instructions, with the current version at https://github.com/siddhartha-gadgil/LeanAide/blob/main/resources/JsonTemplate.md (some experimentation was needed). LLMs largely follow this.
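For illustration, a made-up fragment in the spirit of that template (invented here, not copied from it) might look like:
{
  "type": "theorem",
  "claim": "The square of an odd natural number is odd",
  "proof": {
    "steps": [
      {"type": "assert", "claim": "n = 2k + 1 for some natural number k"},
      {"type": "assert", "claim": "n^2 = 2(2k^2 + 2k) + 1, which is odd"}
    ]
  }
}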
Siddhartha Gadgil (Jan 15 2025 at 13:53):
The result is precise enough that my Lean code handles it in a rule-based way. There are some inaccuracies though.
Siddhartha Gadgil (Jan 15 2025 at 13:54):
Jason Rute said:
Ok, how expensive would it be to do this at scale, like 1000s of papers? Also, do you have a good example of what this structured output looks like? Does it consist of concrete mathematical steps which can be atomically formalized?
If it turns out (as seems plausible) that Gemini-2.0-flash can do this reasonably, I think the cost will be quite modest.
Siddhartha Gadgil (Jan 15 2025 at 13:56):
Also there are open models that are very good. But even for them there is the cost of compute.
Jason Rute (Jan 15 2025 at 13:59):
Back to my blueprint questions: have you ever thought about how you could speed up the process of creating blueprints, or filling them in? It seems your tools are creating your own form of blueprint.
Siddhartha Gadgil (Jan 15 2025 at 14:02):
I have not experimented. Giving similar detailed instructions for blueprints may give good results (I would expect at least reasonable results).
For my purposes JSON/XML are easier to post-process (LaTeX is hard). Also the models have been trained heavily for JSON output, so LaTeX may not be as good.
Jason Rute (Jan 15 2025 at 14:07):
You can always do a pipeline: text to JSON to other JSON to LaTeX?
Siddhartha Gadgil (Jan 15 2025 at 14:09):
Yes, indeed JSON to LaTeX should be straightforward. And the initial JSON instructions can be optimized for the Blueprint format if needed.
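As a minimal sketch of what such rule-based post-processing could look like (hypothetical code, not LeanAide's; the field names "claim" and "proof" are made up):
import Lean
open Lean

-- Hypothetical rule-based JSON-to-LaTeX step for a blueprint-style entry,
-- assuming the JSON carries string fields "claim" and "proof".
def toBlueprint (j : Json) : Except String String := do
  let claim ← j.getObjValAs? String "claim"
  let proof ← j.getObjValAs? String "proof"
  return s!"\\begin\{theorem}\n  {claim}\n\\end\{theorem}\n\\begin\{proof}\n  {proof}\n\\end\{proof}"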
Siddhartha Gadgil (Jan 15 2025 at 15:09):
My slides are at https://siddhartha-gadgil.github.io/presentations/LeanTogether25.html#/
Pietro Monticone (Jan 16 2025 at 00:27):
:video_camera: Video recording: https://youtu.be/bTueWi6OdSg
Siddhartha Gadgil (Jan 16 2025 at 00:27):
Thanks
Alok Singh (Jan 16 2025 at 02:02):
DeepSeek v3 has a good cost-to-performance ratio too.
Rob Lewis (Jan 16 2025 at 03:42):
Finally getting around to watching most of today's talks! Thanks for the great presentation. Have you all experimented much with autoformalizing definitions instead of theorem statements? How did it go? It seems like a much much harder task.
Siddhartha Gadgil (Jan 16 2025 at 03:46):
Rob Lewis said:
Finally getting around to watching most of today's talks! Thanks for the great presentation. Have you all experimented much with autoformalizing definitions instead of theorem statements? How did it go? It seems like a much much harder task.
Thanks. Not tried on a large scale, but there is code in LeanAide that translates definitions (essentially the same prompt) and it works fine at least for simple definitions.
The real split I see is between translating to a single command versus a block. The latter is work in progress.
Rob Lewis (Jan 16 2025 at 03:49):
Do you have an example of a simple definition?
Rob Lewis (Jan 16 2025 at 03:51):
I'm thinking about @Oliver Nash 's talk yesterday about the difficulty of crafting the perfect definition, which rang very true, and wondering how an (eventual) full autoformalization pipeline would deal with these difficulties.
Siddhartha Gadgil (Jan 16 2025 at 04:04):
Here is an example I tried just now: "A natural number is said to be square-free if it is not divisible by the square of a prime number."
def squarefree (n : ℕ) : Prop :=
∀ p : ℕ, p.Prime → ¬(p ^ 2 ∣ n)
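For comparison (an aside, assuming current Mathlib), Mathlib's built-in notion is phrased via units rather than primes, though for ℕ the two agree (cf. Nat.squarefree_iff_prime_squarefree):
import Mathlib

-- Mathlib's `Squarefree r` unfolds to `∀ x, x * x ∣ r → IsUnit x`.
#print Squarefree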
Siddhartha Gadgil (Jan 16 2025 at 04:06):
Because square-free is a standard term I tried a perturbation (not in the same chat, so the previous one was not a hint): "A natural number is said to be cube-free if it is not divisible by the cube of a prime number."
def CubeFree (n : ℕ) : Prop :=
∀ p : ℕ, p.Prime → ¬p ^ 3 ∣ n"
Siddhartha Gadgil (Jan 16 2025 at 04:10):
Another example of a different sort: "A group is said to be crazy if every normal subgroup is cyclic."
def IsCrazyGroup (G : Type u) [Group G] : Prop :=
∀ (H : Subgroup G), H.Normal → IsCyclic H
Siddhartha Gadgil (Jan 16 2025 at 04:13):
And a variant: "A group is said to be sane if every proper normal subgroup is cyclic."
def IsSaneGroup (G : Type u_1) [Group G] : Prop :=
∀ (N : Subgroup G), N.Normal → N ≠ ⊤ → IsCyclic ↥N
In this case, I am not sure if the use of top is correct.
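For what it's worth, a quick check (assuming Mathlib's simp lemma Subgroup.coe_top) that ⊤ is the whole group viewed as a subgroup, so that N ≠ ⊤ does encode "proper":
import Mathlib

-- `⊤ : Subgroup G` has carrier all of G; `simp` closes this via `Subgroup.coe_top`.
example (G : Type) [Group G] : ((⊤ : Subgroup G) : Set G) = Set.univ := by simp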
Siddhartha Gadgil (Jan 16 2025 at 04:28):
@Rob Lewis Currently I feel the best hope for handling more subtle definitions is to get a language model to rewrite them in a more verbose form, perhaps even splitting them into many definitions. Then one uses the "simple definition" translation to handle these.
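For instance (an invented illustration of the idea, not LeanAide output), the "sane group" definition above could be split as:
import Mathlib

-- The subtler notion split into simpler pieces, each easy to translate alone.
def IsProperSubgroup {G : Type} [Group G] (N : Subgroup G) : Prop :=
  N ≠ ⊤

def IsSaneGroup' (G : Type) [Group G] : Prop :=
  ∀ N : Subgroup G, N.Normal → IsProperSubgroup N → IsCyclic ↥N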
Siddhartha Gadgil (Feb 06 2025 at 12:10):
We have made some major changes since this talk, most notably a way to easily use LeanAide in your project (hopefully) without issues of matching toolchains, dependencies etc. There have also been many other improvements.
Below is the new README, including details of the setup.
LeanAide
LeanAide or LeanAIde (accidental pun) is a work in progress to build AI-based tools to help development with the Lean Theorem Prover. The core technique we use is autoformalization, the process of translating informal mathematical statements to formal statements in a proof assistant. This is used both directly to provide tools and indirectly to generate proofs by translating natural-language proofs produced by large language models.
We now (as of early February 2025) have a convenient way to use LeanAide in your own projects, using a server-client setup which we outline below.
Example usage
The most convenient way to use LeanAide is with syntax we provide that gives code-actions. We have syntax for translating theorems and definitions from natural language to Lean, and for adding documentation strings to theorems and definitions. For example, the following code in a Lean file (with correct dependencies) will give code actions:
import LeanAideTools
import Mathlib
#theorem "There are infinitely many odd numbers"
#def "A number is defined to be cube-free if it is not divisible by the cube of a prime number"
#doc
theorem InfiniteOddNumbers : {n | Odd n}.Infinite := by sorry
We also provide syntax for generating/completing proofs. For now this is slow and not of good quality. Experiments and feedback are welcome. The first of the examples below uses a command and the second uses a tactic.
#prove "The product of two successive natural numbers is even"
theorem eg₁ (n: Nat) : 2 ∣ n * (n + 1) := by
  #prove
Setting up LeanAide
To either experiment directly with LeanAide or use it via the server-client setup as a dependency in your own project, you need to first install LeanAide.
First clone the repository. Next, from the root of the repository, run the following commands to build and fetch pre-loaded embeddings:
lake exe cache get # download prebuilt mathlib binaries
lake build mathlib
lake build
lake exe fetch_embeddings
Our translation is based on GPT-4o from OpenAI, but you can use alternative models (including local ones). Note that we also use embeddings from OpenAI, so you will need an OpenAI API key unless you set up an example server as described below. Here, we assume you have an OpenAI API key.
To get started, please configure environment variables using the following bash commands, or the equivalent on your system, at the base of the repository (the angle brackets are not to be included in the command), and then launch VS Code.
export OPENAI_API_KEY=<your-open-ai-key>
code .
Alternatively, you can create a file with path private/OPENAI_API_KEY (relative to the root of LeanAide) containing your key.
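For example, with standard shell commands from the root of the repository:
mkdir -p private
echo "<your-open-ai-key>" > private/OPENAI_API_KEY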
After this, open the folder in VS Code (or equivalent) with Lean 4 and go to the file LeanCodePrompts/TranslateDemo.lean. Any statement written using syntax similar to #theorem "There are infinitely many primes" will be translated into Lean code. You will see examples of this in the demo files. Once the translation is done, a Try this hyperlink and code-action will appear. Clicking on this will add the translated code to the file.
Alternatively, you can translate a statement from the command line:
lake exe translate "There are infinitely many primes"
Using as a dependency: server-client setup
Adding LeanAide directly as a dependency to your project is brittle and slow. In particular, toolchains need to match exactly. Further, there may be collisions of transitive dependencies.
We provide a server-client setup that is more robust and faster, based on the zero-dependency project LeanAideTools. This acts as a client to the server in this repository.
The simplest (and we expect the most common) way to set up LeanAide to use in your project is to run a server locally. To do this, first set up LeanAide as described above. Then run the following command from the root of the repository:
python3 leanaide_server.py
We assume you have Python 3 installed; we use only built-in packages. If you wish to use a different model supporting the OpenAI API, you can pass parameters as in the following example, where we use the Mistral API. To use a local model, you can serve it with the OpenAI chat API using, for example, vLLM, and give a local URL.
python3 leanaide_server.py --url https://api.mistral.ai/v1/chat/completions --auth_key <Mistral API key> --model "mistral-small-latest"
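For instance, a model served locally by vLLM (which by default exposes an OpenAI-compatible API on port 8000) might be used as follows; the model name is a placeholder:
python3 leanaide_server.py --url http://localhost:8000/v1/chat/completions --model "<local-model-name>"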
Next, add LeanAideTools as a dependency to your project. This can be done by adding the following lines to your lakefile.toml file:
[[require]]
name = "LeanAideTools"
git = "https://github.com/siddhartha-gadgil/LeanAideTools.git"
rev = "main"
If you use a lakefile.lean file, you can instead add the following line:
require LeanAideTools from git "https://github.com/siddhartha-gadgil/LeanAideTools"@"main"
Now you can use the LeanAideTools package in your project. The following is an example of how to use it:
import LeanAideTools
#theorem "There are infinitely many primes"
Hosting a server
The server is by default launched at http://localhost:7654. This can be customized by setting the variables HOST and LEANAIDE_PORT. To use a shared server, simply set HOST to the IP of the hosting machine. In the client, i.e., the project with LeanAideTools as a dependency, add the configuration:
set_option leanaide.url "<host's IP address>:7654"
to use the host. Further, unless you want all clients to use the host's authentication for OpenAI (or whichever model is used), the host should not specify an authentication key (better still, start the server with --auth_key "A key that will not work"). The client should then supply the authentication key with
set_option leanaide.authkey? "<authentication-key>"
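Putting the client side together, a client file might begin as follows (the address and key are placeholders):
import LeanAideTools

set_option leanaide.url "<host's IP address>:7654"
set_option leanaide.authkey? "<authentication-key>"

#theorem "There are infinitely many primes"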
There are many other configurations at the server and client end.
Using different (non OpenAI) embeddings
If you wish to use embeddings different from the OpenAI embeddings, for example for a fully open-source solution, you can run an "examples server" and set the parameter examples_url to point to it. We provide a basic implementation of this in the example_server directory. To start this, run the following:
cd example_server
python3 prebuild_embeddings
flask run
The second command should be run only the first time you set up. Once you start the server, set examples_url = "localhost:5000/find_nearest" to use the embedding server.
The file example_server/README.md sketches the configuration options and also the protocol, in case you wish to make a fully customized embedding database.
Contributions and details
The principal author of this repository is Siddhartha Gadgil.
The first phase of this work (roughly June 2022-October 2023) was done in collaboration with:
- Anand Rao Tadipatri
- Ayush Agrawal
- Ashvni Narayanan
- Navin Goyal
We had a lot of help from the Lean community and from collaborators at Microsoft Research. Our server is hosted with support from research credits from Google.
More recently (since about October 2024) the work has been done in collaboration with:
- Anirudh Gupta
- Vaishnavi Shirsath
- Ajay Kumar Nair
- Malhar Patel
- Sushrut Jog
This is supported by ARCNet, ART-PARK, IISc.
Our work is described in a note at the 2nd Math AI workshop and in more detail (along with related work) in a preprint.