Zulip Chat Archive

Stream: general

Have people here already discussed the implications of the performance of Large Transformer models for Lean?
I believe applying this to Lean would immediately make formal proof verification the standard for mathematicians all over the world. It would absolutely abolish the rather steep learning curve Lean has.

Recently Openai Codex/GitHub Copilot has been released. It is afaik a GPT trained on large databases of code in various languages together with natural langlish descriptions/comments.
It shows remarkable capability of turning low to medium complexity natural language instructions into functioning code.
On their youtube channel they have some demonstrations.

Openai Codex has been said to excel at assisting with the small, fiddly details of programming.
For programmers it seems like so far this is not ground breaking. My impressions are that the reasons are the following:
1) Programmers DON'T REALLY MIND dealing with these small details. They are fundamentally used to writing syntactically correct and semantically precise code.
2) The Model might write code that contains bugs. The programmer can not immediately verify that the model has succeeded at its task.
3) The Model might write code that is not very readable and general doesn't follow the practices the programmer uses.

I believe that for mathematicians, all these three points are absolute non-issues!

1) Mathematicians are the exact opposite, they usually ONLY talk in natural language about mathematics. 99% of mathematicians never even learn how to use a formal language.
After I learned basic lean, the biggest hurdle towards actually using it in practice for me was the fact that doing very simple things is fiddly and difficult. Using a continuity lemma or just rewriting a fraction can be really tought if you don't have the practice. Most mathematicians won't want to put in this practice.
But in terms of understanding exactly these things are rather simple tasks for a Model like Openai Codex.

2) In formal mathematics it can be immediately verified if the Model has managed to produce code that achieves the subgoal. The programmer does have to carefully check if the theorem and definition statements are correct, but if any proof works, then it just works. There are no bugs in Lean in this sense.

3) I had the impression so far that code readability is not exactly seen as having a high priority in the lean community.

So, what is needed to make this work? Obviously, a large database of formalized code.
From what I understand, it is important that the theorems and proof sit in juxtaposition with comments that describe in natural language what is going on. This helps the model learn to translate between natural language mathematics, and formal lean mathematics.
I'm sure Openai Codex/Github copilot was trained on an absolutely stupendously large codebase, but I think that with a few years of more data in the mathlib and more data efficient transformers, a first version can be running.

Another extremely important point as to why Lean is in my opinion the perfect application for these large transformer models is that automatic data generation is possible.
We have a small d of database of lean code but an enormous database of LaTeX files.
Having a first version of such a transformer model working somewhat reliably, one can start to feed in textbooks and papers in LaTeX or some other typesetting language to turn them iinto formalized code. Correct formalization of theorems and definitions ahs to be manually verified, but proofs can be automatically checked. If the model manages to correctly formalize most of the proofs, a human only has to formalize the rest of them. In this sense the model can be used to generate more data on which it can be trained.

I may be a little optimistic here, but I genuinely think that in terms of complexity, given a reasonable amount of data juxtaposing natural language mathematics, latex mathematics, and formal lean mathematics, the amount of understanding Openai Codex/Github copilot has demonstrated it can extract is easily sufficient to allow mathematicians to write lean code just by giving instructions like: "Let epsilon be greater than zero and let x in X. Apply the previous lemma on x and use continuity of F to take the limit."

What do you think?

Last updated: May 02 2025 at 03:31 UTC

leanprover-community / mathlib

Zulip Chat Archive

Stream: general

Topic: OpenAI Codex and Lean

Golol (Aug 24 2021 at 08:52):