Zulip Chat Archive
Stream: Equational
Topic: How could machine learning/AI be deployed on this project?
Terence Tao (Oct 13 2024 at 01:48):
In my recent blog post on this project, I noted that thus far machine learning/modern AI tools have only made peripheral contributions to the project, e.g., in writing supporting code. After talking a bit with @Pietro Monticone about this, we decided to open a discussion on possible ways in which these methods might be useful for this project. Some preliminary suggestions we came up with were:
- Requesting API access to advanced AI tools
- Testing out ML-based Lean-compatible tools such as LeanCopilot, ImProver, or LeanAgent (see also the survey at https://arxiv.org/abs/2404.09939 )
- Are there any graph-ML methods that might be relevant?
- Creating a benchmark of implications specifically designed for testing out ML-based tools
Any other suggestions would be welcome.
p.s. Is there some way to crosspost this thread to the "Machine Learning for Theorem Proving" channel?
Joe McCann (Oct 13 2024 at 04:35):
This is mostly already covered in your point (2), but: assistance in converting a standard human-written proof into valid Lean code. Lean has a learning curve as steep as a brick wall, so anything that helps someone who isn't super comfortable with it (or knows nothing at all) contribute could help realize the vision of allowing people with less background to participate.
An additional (albeit non-sexy) use case is unit-test writing, which can help ensure that, as more contributions are made, they don't break previous work. This could be critical, especially if the code used here is used as a template for future projects.
Jose Brox (Oct 13 2024 at 10:49):
Terence Tao said:
In my recent blog post on this project, I noted that thus far machine learning/modern AI tools have only made peripheral contributions to the project, e.g., in writing supporting code.
[...]
Any other suggestions would be welcome.
I had some plans for this (but I didn't realize how fast the project would develop!). I'm mostly interested in the application of ML tools to try to prove implications/anti-implications. What I'd like to do is:
1) As a first step, run different kinds of ML methods (decision trees, random forests, neural networks...) on (a big portion of) the current database of pairs of identities, and test the accuracy of the classification predictions (on a held-out test sample). Then get predictions for the pairs yet to be solved. (A rough sketch of this step appears at the end of this message.)
2) After that, apply methods of explainable AI (XAI) to the different models to see which patterns explain the different outcomes, then try to translate those patterns into theorems. E.g., I expect a deep enough neural network to be able to find our metatheorems about elements at the extremes of an identity, etc., but it should also pick up subtler patterns (if they exist). Here it will be important to discard hallucinations.
3) If possible, try to make the AI produce Lean code from the patterns, and see if it succeeds.
I have some basic knowledge of how to do 1) and 2) (btw, we could use the Kaggle platform for this), but since I could easily mess up, my plan is to contact an expert in XAI I know at my university tomorrow (Monday). I think he will be interested; he is an expert in SHAP (the most successful XAI method at present) and has modified it to apply it in new contexts. I will keep you updated on this (although I don't know if this is really needed; the collective of people in this project has a tremendous range of skills and wit!).
About 3), I don't know if my colleague will be able to help, or if he knows someone else.
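To make step 1 concrete, here is a minimal sketch of what the baseline could look like, assuming the pairs are exported as plain text with a known implication status. The file name, column names, and n-gram encoding are all hypothetical placeholders, not the project's actual data format:

```python
# Sketch of step 1: classify (equation, equation) pairs as implies / does not imply.
# Assumes a hypothetical CSV export with columns "lhs_eq", "rhs_eq", "implies" (0/1).
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

df = pd.read_csv("implication_pairs.csv")  # hypothetical export of the database

# Encode each pair as character n-gram counts of "eq1 || eq2"; crude, but a baseline.
vec = CountVectorizer(analyzer="char", ngram_range=(1, 3))
X = vec.fit_transform(df["lhs_eq"] + " || " + df["rhs_eq"])
y = df["implies"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = RandomForestClassifier(n_estimators=300, random_state=0)
model.fit(X_train, y_train)
print("held-out accuracy:", accuracy_score(y_test, model.predict(X_test)))
```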
Shreyas Srinivas (Oct 13 2024 at 11:07):
If you can embed access to the model as a Lean tactic, then you could be sure that the resultant proof is correct
Shreyas Srinivas (Oct 13 2024 at 11:07):
If Lean says it is, that is. I have added external checkers for good measure
Tim (Oct 14 2024 at 14:23):
If something like the AlphaProof tool chain were available, but in a more interactive fashion, I think that would really help. One could formulate problems in natural language and use AlphaZero to solve them overnight. For that, something similar to Leela Chess Zero would need to exist, and maybe some specialized instances of LLMs (which is quite similar to the LeanCopilot problem). The AlphaZero model would benefit from the distributed computation efforts (Lean proofs made by the model).
Michael Bucko (Oct 14 2024 at 14:56):
(I am using Cursor with Lean and Claude, and o1 in the browser. I've been using LeanCopilot and have read the LeanAgent paper.)
- I believe that enabling those lifelong learners through deployment, tooling, and UX can make a huge difference (LeanAgent has already solved 162 previously unsolved cases, and it'll most likely be millions very soon).
- As for graph ML, please have a look at: https://huggingface.co/blog/intro-graphml
- I was looking into graph transformers; check out https://github.com/microsoft/Graphormer/
Michael Bucko (Oct 14 2024 at 15:02):
Also, how about turning mathlib into a knowledge base using something like node2vec, and then adding curriculum learning agents on top of it?
It'd work like this:
- We'd need to turn the mathematical objects / relations into a graph,
- We'd store the graph in a graph DB (the edges could come from Lean-enabled scanners, or from LLMs),
- We could then use something like node2vec or graph transformers (see https://arxiv.org/pdf/2407.09777); a rough sketch of the embedding step follows below.
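Here is a minimal sketch of the embedding step, assuming the implication graph is available as an edge list. It uses uniform random walks fed to gensim's skip-gram Word2Vec, i.e., the DeepWalk special case of node2vec (p = q = 1); the file name and node labels are hypothetical:

```python
# Sketch: embed equations as nodes of the directed implication graph via
# uniform random walks + skip-gram (the p = q = 1 case of node2vec).
import random
import networkx as nx
from gensim.models import Word2Vec

G = nx.read_edgelist("implications.edgelist", create_using=nx.DiGraph)  # hypothetical file

def random_walk(start, length=20):
    """Walk along implication edges from `start`, stopping at sinks."""
    walk = [start]
    for _ in range(length - 1):
        nbrs = list(G.successors(walk[-1]))
        if not nbrs:
            break
        walk.append(random.choice(nbrs))
    return walk

# 10 walks per node, treated as "sentences" for skip-gram training.
walks = [random_walk(n) for n in G.nodes for _ in range(10)]
model = Word2Vec(walks, vector_size=64, window=5, min_count=0, sg=1)
# model.wv["Equation1"] is now a 64-dim vector usable for link prediction
# ("Equation1" being a hypothetical node label).
```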
Anand Rao Tadipatri (Oct 14 2024 at 20:47):
I think it would be interesting if machine learning could be used to come up with novel examples of magmas that settle open non-implications.
Terence Tao (Oct 14 2024 at 20:58):
One could imagine some adaptation of the "FunSearch" approach (where the magmas would be procedurally generated by LLM-generated Python code or something) possibly being effective.
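For concreteness, here is a minimal sketch of the inner loop such a search would drive: brute-force a small magma's multiplication table and check that it satisfies one equation while violating another, witnessing a non-implication. The two equations here (idempotence and commutativity) are arbitrary stand-ins, not a known open pair; a FunSearch-style loop would have an LLM propose the candidate tables (or code generating them) instead of brute force:

```python
# Sketch: search 2-element magmas for one satisfying x*x = x but violating x*y = y*x.
from itertools import product

N = 2  # magma size; real searches would go larger or use structured constructions

def is_idempotent(op):
    return all(op[x][x] == x for x in range(N))

def is_commutative(op):
    return all(op[x][y] == op[y][x] for x in range(N) for y in range(N))

# Enumerate all N*N multiplication tables as flat tuples, reshape, and test.
for table in product(range(N), repeat=N * N):
    op = [list(table[i * N:(i + 1) * N]) for i in range(N)]
    if is_idempotent(op) and not is_commutative(op):
        print("counterexample magma:", op)
        break
```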
Fan Zheng (Oct 15 2024 at 09:40):
@Michael Bucko What are the 162 cases solved by LeanAgent? That sounds interesting.
Michael Bucko (Oct 15 2024 at 09:55):
Fan Zheng wrote:
Michael Bucko What are the 162 cases solved by LeanAgent? That sounds interesting.
From the paper, " LeanAgent proves 162 sorry theorems across 23 Lean repositories, including from challenging mathematics, highlighting its potential to assist in formalizing complex proofs across multiple domains. For example, LeanAgent successfully proves advanced sorry theorems from the PFR repository and proves challenging theorems in abstract algebra and algebraic topology. It outperforms the ReProver baseline by up to 11×, progressively learning from basic to highly complex mathematical concepts. Moreover, LeanAgent shows significant performance in forgetting measures and backward transfer, achieving a near-perfect composite score of 94%. This explains its continuous generalizability and continuous improvement."
Rémy Degenne (Oct 15 2024 at 09:58):
There is a discussion about LeanAgent here: #Machine Learning for Theorem Proving > LeanAgent
Tim (Oct 15 2024 at 11:30):
Has anyone tried this out, or used its techniques to search for proofs in the project? https://github.com/deepseek-ai/DeepSeek-Prover-V1.5
Pietro Monticone (Oct 15 2024 at 11:46):
No, I don’t think so.
Jose Brox (Oct 15 2024 at 11:56):
Jose Brox said:
My plan is to contact an expert in XAI I know at my university tomorrow (Monday). I think he will be interested; he is an expert in SHAP (the most successful XAI method at present) and has modified it to apply it in new contexts. I will keep you updated on this (although I don't know if this is really needed; the collective of people in this project has a tremendous range of skills and wit!).
We have started this morning (I have briefed my colleague). We will begin with the fastest thing we can do, which is to apply already-existing text-classification models (to pairs of equations together with their implication status, not looking at the proofs) and see what happens. Next we will train our own models for this same problem (we expect this shouldn't take too long with this data). These we can do in a few days. If the success ratio is high, we will apply XAI to try to understand the patterns behind the predictions (this will take considerably more time).
There are details that worry me, like how much "obvious knowledge" we should strip from the database: if we apply transitivity, duality, etc., then we can get biased patterns (e.g., the model doesn't know a priori about invariance under change of variables).
Jose Brox (Oct 15 2024 at 12:01):
We think that between the two of us we don't have the skills and resources (including the time needed) to apply LLMs to the high-level applications being discussed in the project. We could try to form a group at our university, which has an AI Center (and we can access a supercomputer if needed), but perhaps there are already people better suited to do this who are closer to the project. We think that LLMs are needed in order to predict a proof for a new pair from the proofs of the training pairs.
Shreyas Srinivas (Oct 15 2024 at 12:01):
What's XAI? Explainable AI?
Shreyas Srinivas (Oct 15 2024 at 12:02):
In the context of this project, what would explainability look like?
Shreyas Srinivas (Oct 15 2024 at 12:03):
What would the AI produce as explanations?
Jose Brox (Oct 15 2024 at 12:07):
Shreyas Srinivas said:
What would the AI produce as explanations?
For example, for the equations-only (no proofs) models: asking a neural network to explain why a particular pair of equations (a=b, c=d) is an anti-implication, it could highlight the leftmost variables of a, b, c, d and "show" that they are equal in a, b but different in c, d. For decision trees it could be different. It is also important that XAI can be applied at different depths, so that, for example, an intermediate layer of the neural network may discriminate by the number of variables in each term, etc.
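A sketch of what this could look like in code, assuming the hypothetical random-forest baseline from my earlier message and the shap library's TreeExplainer (the `model`, `vec`, and `X_test` names come from that sketch):

```python
# Sketch: per-feature SHAP attributions for the trained tree model from step 1.
import shap

explainer = shap.TreeExplainer(model)  # model: the RandomForest trained earlier
shap_values = explainer.shap_values(X_test[:100].toarray())
# Each row attributes a prediction to individual n-gram features, which is how
# patterns like "leftmost variables agree/disagree" would surface.
shap.summary_plot(shap_values, X_test[:100].toarray(),
                  feature_names=vec.get_feature_names_out())
```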
Pietro Monticone (Nov 10 2024 at 18:19):
See #Equational > Graph ML: Directed link prediction on the implication graph
Last updated: May 02 2025 at 03:31 UTC