Zulip Chat Archive

Stream: Machine Learning for Theorem Proving

Topic: Translating between ITPs with machine learning

Jason Rute (Mar 14 2021 at 21:02):

It is well known that

There are many different ITPs.
Each have their own libraries, some covering certain areas of mathematics more than others.
It is very difficult (near impossible) to use a theorem from one ITP library in another.

Jason Rute (Mar 14 2021 at 21:02):

I know there have been projects to transpile the theorems from one ITP into another, for example Mario’s project to transpile MetaMath’s Dirichlet’s theorem into Lean. There have also been projects to put all ITP theorems into a common framework.

Jason Rute (Mar 14 2021 at 21:02):

There however are a few problems with such approaches:

It is only possible to directly transpile from one ITP to another if the logic of the source ITP is weaker than the logic of the target ITP (possibly through the addition of axioms).
The translations aren’t much use without alignment. For example, a transpilation of Metamath’s (1 + 1) = 2 into Lean would be something of the form mm.nat_add mm.one mm.one = (mm.two : mm.nat). For this to be useful, one would have to also prove in Lean that mm.nat = nat, mm.one = 1, etc. (Or worst, prove that mm.nat is just isomorphic to nat, so it would be even more work to transfer results.)
The translations and alignments would be both ugly and brittle. They wouldn’t be worth putting into say mathlib.

Jason Rute (Mar 14 2021 at 21:02):

Here is my (possibly rhetorical) question:

Jason Rute (Mar 14 2021 at 21:02):

Is machine learning able to do ITP translation in a more robust way?

Jason Rute (Mar 14 2021 at 21:03):

It seems to me that we have the technology currently to solve many parts of this problem, if not the whole problem. Can we put all the parts together? Do we have enough data?

Jason Rute (Mar 14 2021 at 21:03):

Machine Translation: Facebook AI’s recent TransCoder uses ML translation models to accurately translate between C++, Python, and Java. These models are nice in that they work on unaligned corpora. Programming languages (and ITPs) share a lot of keywords which help to build such a translator. (Here is a nice video on how it works, and an interview with the authors.)

Jason Rute (Mar 14 2021 at 21:03):

It is not unreasonable, assuming equivalent vocabulary exists in both ITPs, to think that such a system could accurately translate the statement of a theorem, e.g. Stirling’s Formula, from, say, Isabelle code

fact ∼ (λn. sqrt (2*pi*n) * (n / exp 1) ^ n :: real)

into Lean code

filter.tendsto (λ (n : ℕ), ↑(n.factorial) - real.sqrt (2 * π * ↑n) * (↑n / real.exp 1) ^ n) filter.at_top (𝓝 0)

For short proofs, it might even be able to translate not only the theorem statement (and statement of the lemmas leading up to the theorem) but also the proofs as well. However, that is more questionable.

Jason Rute (Mar 14 2021 at 21:03):

Similarly, we should be able to translate definitions (and maybe even notation?) from one ITP to another.

Jason Rute (Mar 14 2021 at 21:03):

It should be noted, however, that ITP libraries have 1/1000th of the data used to train TransCoder, so that is definitely a consideration.

Jason Rute (Mar 14 2021 at 21:03):

Neural theorem proving
As it is well known now, there are a number of neural and machine learning theorem provers for most of the ITPs.

Jason Rute (Mar 14 2021 at 21:03):

As we translate lemmas from one ITP to another, we generate candidate lemmas in the target ITP, but we can’t be sure they are correct. First, we must elaborate the new theorem statements to make sure they type check. If they do, then we can use a neural theorem prover to try to find proofs. Depending on the setup and the prover, we can even use other proposed lemmas as premises. (For example, if the Isabelle library uses LemmaA to prove LemmaB, then we could try to use the Lean translation of LemmaA as a premise in the proof of the Lean translation of LemmaB.) This way we can both generate an outline of a new area of math in Lean as well as start to fill in the proofs.

Jason Rute (Mar 14 2021 at 21:04):

Reinforcement Learning of Theorem Proving It has been repeatedly shown that one can improve (or even train from scratch) a neural theorem prover with only statements of theorems. Our translations (even if not all are correct), would be great candidates for such a reinforcement learning loop. We would be effectively doubling (or more!) the size of our theorem training data. Also, some libraries have a lot of simple theorems, e.g. arithmetic facts, identities, and real function inequalities. These would make good training data.

Jason Rute (Mar 14 2021 at 21:04):

Co-training It has been shown that training on multiple related tasks can improve a model more than training on just one task at a time. Possibly one could get benefits out of training on all the ITPs at once, both for translation and theorem proving. Also, theorem proving, and similar tasks, would help our model learn the semantics of the symbols, thereby making the translation better.

Jason Rute (Mar 14 2021 at 21:04):

Clustering and alignment The same techniques for machine translation would also help to align theorems and definitions across libraries, which is of independent value. It can also be used to cluster similar theorems both within and across libraries.

Jason Rute (Mar 14 2021 at 21:04):

So many questions

Jason Rute (Mar 14 2021 at 21:04):

There are still many questions, but unlike auto-formalization (which seems further from happening, and will require many major research breakthroughs) this seems more likely to being on the edge of possibility:

Would this be useful to ITP library users if it was done?
How much theorem data is needed? Are there ways to get around data scarcity, such as co-training on related tasks, or pre-training the models?
How would the model handle definitions which are not in the library? For example, take the Isabelle version of the Prime Number Theorem, (λx. pi x * ln (real x) / real x) ⇢ 1 . The prime-counting function pi: nat -> real doesn’t exist in Lean (as far as I know), but it wouldn’t be difficult to define it. More subtly, Lean doesn’t use the same sort of notation for convergence. Ideally, one would not want a direct translation, but one in the Lean style. (One idea is that one would translate not only a theorem, but all of the definitions used in that theorem, etc, but this requires a lot of thought to get right.)
Would the translation be into fully elaborated Lean code or into human-style Lean source code? Would it generate just theorem statements or whole Lean file sketches filled with sorrys?
Would it make sense to also have a natural English mathematics as one of the languages being translated? One could use arXiv, wikipedia, proof-wiki, etc. The idea here is that the vocabulary of ITPs is borrowed from natural language math. However, the danger is that this starts to get into auto-formalization which is much harder problem.
How can type checking be used to improve the translations? (It seems to me that one of TransCoder’s largest opportunities for improvement is by rewarding the agent for translations which both (1) compile, and (2) produce the same results. Something similar can be done in theorem proving.)
Since there should be many theorems which actually align perfectly between libraries, should we do this in multiple steps? (1) Train a model to propose matching theorems between libraries, (2) have a human sort through the proposals for exact matches, (3) add those aligned examples to the training data.
To make this project a reality, how much ITP engineering would have to happen and is there enough motivation to do it?
To be clear, to just naively generate proposed translations of all theorems for all pairs of ITPs would require almost no ITP engineering (assuming there is enough data to run something like TransCoder), since the theorem data already exists and we know how to extract it. However, to check that the translations type-check, or running a neural theorem prover on the translations would require significant ITP integration, only some of which currently exists for the various ITPs?

Jason Rute (Mar 14 2021 at 21:04):

This is really just me publicly brainstorming, but if say Facebook AI, OpenAI, Google AI, or DeepMind (or an academic lab) wanted to pursue this project, I’d personally be quite ecstatic :smile:

Kevin Buzzard (Mar 14 2021 at 22:41):

I guess it would be interesting to be able to, in the middle of a Lean proof, say "please port this goal to Isabelle/HOL and then try sledgehammer on it". I don't think the fact that the logic is different should affect things here -- for basic concepts like sets and functions everything is going to be essentially the same.

Joe Hendrix (Mar 15 2021 at 05:17):

This is not quite the same problem, but it would be definitely reduce my skepticism of this stuff if machine learning was able to convert between much simpler proof formats used in the SAT community. e.g. automatically going between LRAT and DRAT proofs.

I know virtually nothing about ML, but this problem seems challenging. Another, more industrially relevant problems would be using ML to optimize combinatorial logic, e.g., to reduce the size or depth of terms. My intuition is that these problems would be easier than reliable automation to do translations between higher level logics, and to translate not just definitions and proofs, but also translate tactics.

Last updated: May 02 2025 at 03:31 UTC