Zulip Chat Archive

Stream: general

Topic: ATP approaches before 2019-2020


Anh Nguyễn (May 22 2025 at 07:47):

(deleted)

Anh Nguyễn (May 22 2025 at 07:51):

As we all know, Machine Assisted Theorem Proving is now a rapid research field. There are a lot of approaches (LLM, RL, …)
I am more concerned about previous approaches (what has been done before), how well they performed, and why it failed.
Could you give me the overview of it. Thank you so much

Henrik Böving (May 22 2025 at 08:08):

I don't think the ATP approaches before ML failed at all. SAT solvers like cadical, SMT solvers like z3 or cvc4, superposition systems like vampire and even ITPs without AI have seen great success all throughout computer science and sometimes even in mathematics. I would even dare say that in the areas they were successful in (predominantly automated bug finding in software/hardware systems) AI has not been able to match yet.

Anh Nguyễn (May 22 2025 at 08:18):

How good is it when applied to mathematics
I saw some papers about using SAT Solver to solve mathematic problems

Anh Nguyễn (May 22 2025 at 08:18):

Are there some other techniques

Shreyas Srinivas (May 22 2025 at 08:52):

We don’t have exact data for this. But we applied these systems and conventional ATPs in the equational theories project. Automation-wise, this should be as simple as is gets. Old ATPs completely blew modern AI out of the water. Vampire, prover9, and mace were basically the workhorses of finding anti implications

Shreyas Srinivas (May 22 2025 at 08:53):

I don’t think modern AI was successfully applied for even one of the implications/anti-implications, that a contributor hadn’t solved by hand themselves.

Anh Nguyễn (May 22 2025 at 09:06):

About the Robbin Conjecture, are there any problems in the same spirit have been solved by theorem prover. As I know, the Robbin Conjecture is a rare notable achievement of theorem prover (it is not quite interesting conjecture)
Are there some interesting mathematics can be formulate and solved in that way

Henrik Böving (May 22 2025 at 09:07):

Anh Nguyễn said:

Are there some other techniques

Sure as long as your problem is reducible to one of the theories supported by these tools they apply, with SAT that is propositional logic, with SMT that's a mix of linear and potentially non linear arithmetic on integers/reals, uninterpreted functions and a couple of other ones and with ATP that is most generally first order logic, potentially with special support for theories from the SMT world by involving SMT solvers under the hood. Note that they may of course end up timing out, that depends on the problem etc.

Anh Nguyễn (May 22 2025 at 09:20):

Thanks for your answer, I will read more about SAT and SMT

Anh Nguyễn (May 22 2025 at 12:44):

Type theory back then, is it strong enough

Shreyas Srinivas (May 22 2025 at 12:54):

To add to what Henrik said, sometimes you can also work with undecidable theories on these tools. They are less likely to succeed and more likely to timeout. But it is not entirely impossible that they succeed.

Anh Nguyễn (May 22 2025 at 15:56):

Have much done/tried/experimented with coq (now Rcoq). I could see that the first wave of formalizing mathematics (Four Color Theorem, Kepler Conjecture, Odd Order-Feit Thompson,…) do not get much attention as nowadays (2022-2025).
One reason why people were not interested in formalizing mathematics was formalizing math takes so much time, and at that time there was not much motivation to do it. Then came Lean.
How was “proof discovery” worked in Coq (before AI). What it was capable of (maybe IMO problems were out of reach at that time).

Anh Nguyễn (May 22 2025 at 16:01):

Like the methods used in “proof discovery” in Coq

Shreyas Srinivas (May 22 2025 at 17:55):

What got mathematicians into lean was one or two mathematicians choosing to invest effort into a new system and build math that is of interest to math department mathematicians.

Shreyas Srinivas (May 22 2025 at 17:57):

There are several Zulip messages and threads where the history and cultural aspects are discussed. I don’t think modern AI had anything to do with it. They didn’t exist in public accessible forms until recently

Niels Voss (May 23 2025 at 00:40):

Also, I think having a centralized and aggressively refactored math library helped a lot as well

Anh Nguyễn (May 23 2025 at 02:27):

How was proof discovery back then

Anh Nguyễn (May 23 2025 at 02:28):

What directions they experimented

Mario Carneiro (May 23 2025 at 02:41):

You don't discover proofs, you write them down by typing them in

Mario Carneiro (May 23 2025 at 02:42):

you really don't need AI to write proofs

Anh Nguyễn (May 23 2025 at 03:02):

Mario Carneiro said:

You don't discover proofs, you write them down by typing them in

I got it but I want to know more about the process of using computer to automatically discover it (before AI) (just like nowadays we use AI to try to write proofs in Lean)

Mario Carneiro (May 23 2025 at 03:09):

I don't know what you mean, because for me AI has never really been part of my workflow for writing proofs

Mario Carneiro (May 23 2025 at 03:09):

there is no automatic discovery involved. I have an idea for how the proof will go, and I write tactics to work my way through the proof

Mario Carneiro (May 23 2025 at 03:12):

the only AI tool I'm likely to use in practice is copilot, but I don't because (1) I don't want to contribute to an ecological disaster in the making and (2) I don't want to become dependent on the good will of a company that may decide to change their pricing model in the future

Mario Carneiro (May 23 2025 at 03:13):

I was already competent at writing proofs back before the AI hype train started so I just keep doing what I'm doing

Anh Nguyễn (May 23 2025 at 03:40):

Like searching for the solution of a problem or the proof of theorems

Anh Nguyễn (May 23 2025 at 03:40):

Sorry if I write not quite clearly. There are a lot of researches which use AI to solve competition problems. I wonder what have been tried before the use of AI

Mario Carneiro (May 23 2025 at 03:40):

searching for lemmas to use in the library, or searching the space of possible proofs?

Mario Carneiro (May 23 2025 at 03:41):

those are very different activities

Anh Nguyễn (May 23 2025 at 03:41):

Searching the space of possible proofs to give a proof for a statement

Anh Nguyễn (May 23 2025 at 03:41):

Or give the solution to a problem

Mario Carneiro (May 23 2025 at 03:41):

The state of the art there is sledgehammer, which has worked since long before 2019

Mario Carneiro (May 23 2025 at 03:42):

but it's an isabelle tool, not lean

Jeremy Tan (May 23 2025 at 03:58):

@Mario Carneiro What about Canonical? #announce > Canonical is available now! @ 💬

Mario Carneiro (May 23 2025 at 03:58):

canonical is not SOTA if the comparison is vampire and Z3

Mario Carneiro (May 23 2025 at 03:58):

the thing is that they solve very different classes of problem

Mario Carneiro (May 23 2025 at 03:59):

(also it definitely wasn't around before 2019)

Anh Nguyễn (May 23 2025 at 04:08):

Jeremy Tan said:

Mario Carneiro What about Canonical? #announce > Canonical is available now! @ 💬

How good it is

Kevin Buzzard (May 23 2025 at 04:10):

Why don't you try it?

Anh Nguyễn (May 23 2025 at 04:19):

Kevin Buzzard said:

Why don't you try it?

Thanks, I will try it later

Anh Nguyễn (May 23 2025 at 05:13):

Mario Carneiro said:

The state of the art there is sledgehammer, which has worked since long before 2019

Is sledgehammer good enough to fill in proofs like infinitude of primes
What specific kinds of proof it was good at

Mario Carneiro (May 23 2025 at 05:15):

if you give it the right lemmas, probably. A common situation there would be that the only thing you need is "Observe: N!+1" and it figures out the rest

Anh Nguyễn (May 23 2025 at 05:24):

Theoretically, it could enumerate all statements from the lemmas we give, right?
In the case of infinitude of primes, if we do not give sledgehammer the important hint N!+1, maybe it never find out the right proof? Are there any techniques could be used to “prune” the search space, or at least handle it

Mario Carneiro (May 23 2025 at 05:26):

it will also do term enumeration, so it will eventually consider that expression even without a hint, it just takes longer. Also, if you give it the right lemmas it will deduce the term more easily

Anh Nguyễn (May 23 2025 at 05:59):

About the Robbin Conjecture, are there any problems in the same spirit have been achieved by theorem prover. As I know, the Robbin Conjecture is a rare notable achievement of theorem prover (it is not quite interesting conjecture)
Are there any mathematics that could be formulated and solved in that way? (generating lemmas after lemmas until the solution is reached)

Anh Nguyễn (May 23 2025 at 06:10):

Mario Carneiro said:

it will also do term enumeration, so it will eventually consider that expression even without a hint, it just takes longer. Also, if you give it the right lemmas it will deduce the term more easily

That means sledgehammer is capable of “exploring and discovering” proof space if we interact with it by feeding more axioms, lemmas to the system, right?

Anh Nguyễn (May 23 2025 at 06:53):

Henrik Böving said:

I don't think the ATP approaches before ML failed at all. SAT solvers like cadical, SMT solvers like z3 or cvc4, superposition systems like vampire and even ITPs without AI have seen great success all throughout computer science and sometimes even in mathematics. I would even dare say that in the areas they were successful in (predominantly automated bug finding in software/hardware systems) AI has not been able to match yet.

Which aspects ML approaches are better than pure ATP approaches and which aspect pure ATP are better.
It seems that ML made many things possible/better in theorem proving that are actively researched these days (premise selection, proof search, autoformalization,…). I am curious what the picture looked like before ML/RL/LLM
Specifically, IMO Problems back in 2019 is thought not possible for computer to tackle. Which methods have been tried with IMO problems (such as DDAR and Wu’s method in Geometry)

Anh Nguyễn (May 23 2025 at 06:56):

(deleted)

Anh Nguyễn (May 23 2025 at 07:02):

What kinds of different methods were added/invented (from 2019) that made it possible to solve IMO problems (as we see right now)

Shreyas Srinivas (May 23 2025 at 07:50):

If you are asking what researchers are trying to accomplish with ML in theorem proving, you are likely to find more on arxiv and ML conference proceedings. If you want to ask which ones are practically useful at the moment to “search proofs”, beyond finding the right lemma to apply or rewrite or simplify with at the next step, I would say practically none.

Shreyas Srinivas (May 23 2025 at 07:51):

There is a gap between what’s shown in benchmark evaluations and 1. What’s actually usable in practice. 2. How effectively those benchmark results translate into real usage.

Shreyas Srinivas (May 23 2025 at 07:53):

From a user’s perspective (not from an ML researcher’s perspective maybe), this is one of the areas where the ML hype train has so far moved at the pace of a ox cart.

Anh Nguyễn (May 23 2025 at 08:05):

Shreyas Srinivas said:

There is a gap between what’s shown in benchmark evaluations and 1. What’s actually usable in practice. 2. How effectively those benchmark results translate into real usage.

I think what achieved in benchmarks require much computational power so it is not easy for users to reproduce

Anh Nguyễn (May 24 2025 at 05:59):

What about symbolic methods


Last updated: Dec 20 2025 at 21:32 UTC