Zulip Chat Archive

Stream: lean4

Topic: What else can AI help


Shaonan Wu (Feb 23 2024 at 07:16):

Hi everyone,
I am an undergraduate student majoring in Artificial Intelligence, with a particular interest in Natural Language Processing (NLP). Recently, I have come across numerous research efforts that aim to automate theorem proving using neural provers and proof assistants like Lean. This has sparked my curiosity about AI4Math.
Besides the NTP task, are there any other roles that AI can play in the development of formal mathematics, and what are the specific requirements that mathematicians have in this context?

Kevin Buzzard (Feb 23 2024 at 07:49):

An even more science-fiction application is "computer proves Riemann hypothesis" but right now this looks a long way away (unless you're Christian Szegedy)

James Gallicchio (Feb 23 2024 at 07:56):

i know very little about NLP/FM crossover, but my impression is there's still a pretty huge gap in translating natural language descriptions of mathematical problems/structures/theorem statements into accurate/correct formal statements. this is entirely orthogonal to actually automating proofs once you have a formal statement, and yet seems to be very very hard in its own right.

I think there is some interest in that direction from mathematicians, because having tools to translate natural language into formal language would make it easier to onboard more mathematicians to the formal world.

Shreyas Srinivas (Feb 23 2024 at 08:05):

Given a mathematical claim, find counterexamples that disprove it. Currently, there are a few different methods of property testing. I don't know how many of those are integrated into Lean
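For reference, Mathlib does ship a QuickCheck-style property tester: the slim_check tactic, which randomly tests a goal and reports counterexamples. A minimal sketch (assuming a recent Mathlib; the exact counterexample reported will vary between runs):

```lean
import Mathlib.Tactic.SlimCheck

-- The statement is false because ℕ-subtraction truncates at zero.
-- `slim_check` randomly tests it and is expected to fail, reporting
-- a counterexample such as a := 0, b := 1.
example (a b : ℕ) : a - b + b = a := by
  slim_check
```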

Shreyas Srinivas (Feb 23 2024 at 08:05):

Also see: https://youtu.be/vMLVH6IEwlM?si=dOfUFOlZMML3uuHU

Shaonan Wu (Feb 23 2024 at 08:49):

James Gallicchio said:

i know very little about NLP/FM crossover, but my impression is there's still a pretty huge gap in translating natural language descriptions of mathematical problems/structures/theorem statements into accurate/correct formal statements. this is entirely orthogonal to actually automating proofs once you have a formal statement, and yet seems to be very very hard in its own right.

I think there is some interest in that direction from mathematicians, because having tools to translate natural language into formal language would make it easier to onboard more mathematicians to the formal world.

I understand. Translating natural language into formal language is indeed similar to recent work on autoformalization, such as this paper (https://arxiv.org/abs/2210.12283). These works start with an informal proof and use it to generate a formal proof sketch through a language model, which is then refined using tools like Sledgehammer.

Autoformalization is undoubtedly an important translation task. But what about the reverse, backtranslation? For example, given hard-to-read Lean code, generating mathematical comments or translating it back into a readable informal proof could be valuable. This would help bridge the gap between formal and informal mathematical representations, making formalized proofs more accessible and understandable to a wider audience.

Shaonan Wu (Feb 23 2024 at 08:52):

Shreyas Srinivas said:

Given a mathematical claim, find counterexamples that disprove it. Currently, there are a few different methods of property testing. I don't know how many of those are integrated into Lean

Thank you for your reply. This is a truly new topic for me, and I will delve into it later.

Shaonan Wu (Feb 23 2024 at 08:54):

Kevin Buzzard said:

An even more science-fiction application is "computer proves Riemann hypothesis" but right now this looks a long way away (unless you're Christian Szegedy)

Yes, it is indeed challenging, and so far, it has mostly been depicted in fictional movies.

Siddhartha Gadgil (Feb 23 2024 at 09:25):

Backtranslation is a lot easier, and GPT-4 can do it zero-shot rather well.

Gerard Calvo Bartra (Feb 23 2024 at 11:57):

Regarding translation comments, I guess you could (reasonably quickly) develop some Encoder-Decoder Transformer architecture, provided enough examples of (informal, formal) pairs of theorems or lemmas are available. Otherwise, you might need to devise some intelligent system (maybe using existing LLMs in a few-shot manner) to curate a base dataset.

Gerard Calvo Bartra (Feb 23 2024 at 12:02):

Also, thinking out loud: the task of characterizing/ranking lemmas as "useful" or "insignificant" (and anything in between) is not trivial at all, and maybe one could use AI for this.

Siddhartha Gadgil (Feb 23 2024 at 12:18):

There are plenty of (informal, formal) examples: the documentation strings of theorems and definitions in Mathlib. Including definitions, this comes to over 40,000 examples.

Gerard Calvo Bartra (Feb 23 2024 at 12:19):

Another task would be letting an AI create novel definitions of concepts based on TF-IDF (or similar) scores. For example, if it sees that some expression of the form m ∣ (a - b) "frequently" appears in many theorems, then maybe it is worth defining a ≡ b (mod m), if doing so can simplify anything.
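A toy Lean sketch of that kind of abstraction (the name ModEq' is made up for illustration; Mathlib's real counterpart is Int.ModEq):

```lean
import Mathlib.Tactic

-- Hypothetical definition giving the recurring pattern `m ∣ (a - b)` a name.
def ModEq' (m a b : ℤ) : Prop :=
  m ∣ (a - b)

-- Once named, basic facts about the pattern can be proved once and reused.
example (m a : ℤ) : ModEq' m a a := by
  simp [ModEq']
```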

Gerard Calvo Bartra (Feb 23 2024 at 12:23):

Conjecture generation is also an interesting topic.

Shreyas Srinivas (Feb 23 2024 at 22:26):

Possibly related to counterexample generation. Suppose I have a theorem statement in Lean, and I am not sure I have all the hypotheses I need. A useful AI could choose promising proof paths and produce a shortlist of hypotheses that it thinks you must add to your theorem statement to make it true. In a sense this can already be done using classical proof search methods. The AI application would be to filter out the most promising pathways and suggest the corresponding hypotheses. A sort of truncated search.
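A toy instance of the kind of repair being described: the first statement below is false as written (ℕ-subtraction truncates at zero), and adding the suggested hypothesis b ≤ a makes it provable.

```lean
-- False as stated: for a = 0, b = 1 the left-hand side is 1, not 0.
-- theorem broken (a b : ℕ) : a - b + b = a := sorry

-- With the missing hypothesis added, the proof goes through:
theorem repaired (a b : ℕ) (h : b ≤ a) : a - b + b = a :=
  Nat.sub_add_cancel h
```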

Shreyas Srinivas (Feb 23 2024 at 22:27):

Gerard Calvo Bartra said:

Also, thinking out loud: the task of characterizing/ranking lemmas as "useful" or "insignificant" (and anything in between) is not trivial at all, and maybe one could use AI for this.

For what definition of useful? API lemmas are very useful because life is infinitely simpler when they exist. At the same time, these are what mathematicians would consider trivial.

Jason Rute (Feb 24 2024 at 03:14):

@Shaonan Wu Welcome. If you haven't already, you should check out the stream: #Machine Learning for Theorem Proving

Shaonan Wu (Feb 24 2024 at 04:04):

Jason Rute said:

Shaonan Wu Welcome. If you haven't already, you should check out the stream: #Machine Learning for Theorem Proving

Yeah, I have explored some topics in that stream as well, and I must say that some of the work is truly impressive, especially AlphaGeometry, which showcases the power of synthetic data.

Tyler Josephson ⚛️ (Feb 26 2024 at 04:15):

Shreyas Srinivas said:

Possibly related to counterexample generation. Suppose I have a theorem statement in Lean, and I am not sure I have all the hypotheses I need. A useful AI could choose promising proof paths and produce a shortlist of hypotheses that it thinks you must add to your theorem statement to make it true. In a sense this can already be done using classical proof search methods. The AI application would be to filter out the most promising pathways and suggest the corresponding hypotheses. A sort of truncated search.

I’m glad you brought this up. I’m particularly interested in this, too; it would be useful in science/engineering derivations. These often don’t specify all the premises needed, such as avoiding division by zero, or requiring continuity and differentiability before taking a derivative.

Here are the issues I anticipate such a “hypothesis generation” scheme will need to avoid:

  1. Generate a hypothesis “FALSE” and prove the conjecture from it. (Maybe obviously avoidable)
  2. Generate a hypothesis that contradicts an existing hypothesis. (More subtle than above, need good ways to detect and avoid contradictions among hypotheses)
  3. Generate the conjecture itself as a hypothesis and prove with exact. (Maybe obviously avoidable)
  4. Generate something “too close for comfort” to the conjecture; it’s not the conjecture, but it’s close to it and renders the proof not useful.
  5. Generate hypotheses that narrow the scope too much.

The typical use of AI to generate proofs is well-posed because the theorem statement is a static problem. But make the theorem statement fluid, and many variants become possible; the space of these will include instances of high and low quality, and it will likely be much easier to generate low-quality ones than high-quality ones.
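Pitfalls 1 and 3 above are easy to make concrete in Lean: both "proofs" below type-check, yet prove nothing of value.

```lean
-- Pitfall 1: a False hypothesis lets you conclude anything.
example (P : Prop) (h : False) : P := h.elim

-- Pitfall 3: assuming the conjecture itself and closing with `exact`.
example (P : Prop) (h : P) : P := by exact h
```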

Shreyas Srinivas (Feb 26 2024 at 05:43):

These issues are exactly why I think this is an AI problem.

On the one hand, it has constraints which are somewhat well-defined but probably algorithmically hard to enforce, e.g. don't produce hypotheses that are identical to the conjecture or, worse, contradict other hypotheses. This means there has to be a provably correct algorithm that identifies whether a hypothesis does in fact contradict another.

On the other hand there are ill-defined quality metrics which determine what makes a hypothesis high quality.

Shreyas Srinivas (Feb 26 2024 at 05:47):

From an algorithmist's perspective, I find AI great when:

  1. The problem objective is hard to define or ill-defined. This is the case for computer vision.
  2. The problem is computationally hard to solve as it is, but advice from an oracle of some accuracy (say an ML model) can greatly improve the computational complexity on useful inputs.

namibj (Mar 08 2024 at 03:49):

@Shreyas Srinivas The work discussed at https://news.ycombinator.com/item?id=39604600 threw some classifiers at an elliptic curve property database (well, at some L-function Dirichlet coefficients computed from the curves) to predict whether a curve had complex multiplication, and found that they performed better than expected.
The authors investigated further with some related tasks, and after plotting some of the data they realized that there was indeed a pattern clear enough to see, explaining how the classifiers could score so high.

Finding somehow-predictable yet profoundly head-scratching patterns in discrete families of mathematical objects (i.e., demonstrating statistically significant non-linear correlations) is something ML models are well suited for.
Unexpectedly low perplexity is easy to spot and a clear sign of (unexpected) exploitable underlying structure.
Annotating yet-unsolved samples with classification probabilities could further help guide effort (the model is basically conjecturing when it makes high-confidence predictions).


Last updated: May 02 2025 at 03:31 UTC