Zulip Chat Archive

Stream: Machine Learning for Theorem Proving

Topic: Paper: Top Down Automated Theorem Proving


Jason Rute (Aug 10 2023 at 16:59):

There is a new paper on arXiv called Top-down Automated Theorem Proving (Notes for Sir Timothy) by C. E. Larson, N. Van Cleemput.

Jason Rute (Aug 10 2023 at 16:59):

I haven't read it in detail but it seems very critical of Lean and current ATP methods. While I do agree that ATP is hitting a wall and we need to find methods which are more big picture, I also find myself disagreeing with a lot of the stuff in this paper.

Jason Rute (Aug 10 2023 at 17:00):

One quick example:

Furthermore “human-style” ATP must also include the design principle of working directly in the domains of the various mathematical sub-fields, with the objects (integers, matrices, graphs, etc) of those sub-fields, without translation to any more fundamental sub-field (for instance, set theory), just as human mathematicians do.

I can't tell if they are talking about traditional FOL ATP systems like Vampire, E, etc. If so, I think this has a lot of merit. But if one is talking about Lean, Coq or Isabelle, then while those are not yet used as a general purpose ATPs, I would say the users of Lean/Coq/Isabelle are focused on working directly with integers, matrices, and graphs even if one first defines these objects via some logical foundation. Indeed, ITPs provide a natural and principled way to represent mathematic objects on a computer so one can work with them.

Jason Rute (Aug 10 2023 at 17:00):

I'll try to read the paper in more depth and summarize it if I have time.

Min-Hsien Weng (Aug 11 2023 at 01:11):

Jason Rute said:

One quick example:

Furthermore “human-style” ATP must also include the design principle of working directly in the domains of the various mathematical sub-fields, with the objects (integers, matrices, graphs, etc) of those sub-fields, without translation to any more fundamental sub-field (for instance, set theory), just as human mathematicians do.

I can't tell if they are talking about traditional FOL ATP systems like Vampire, E, etc. If so, I think this has a lot of merit. But if one is talking about Lean, Coq or Isabelle, then while those are not yet used as a general purpose ATPs, I would say the users of Lean/Coq/Isabelle are focused on working directly with integers, matrices, and graphs even if one first defines these objects via some logical foundation. Indeed, ITPs provide a natural and principled way to represent mathematic objects on a computer so one can work with them.

I totally agree with you.
When I used Lean, I used it not only as a tool but a library to help me construct proofs and theorems. This is because it has many useful definitions, such as natural numbers, that can be used. If we use a top-down approach, then we have to define everything ourselves. That would be very tedious.

Jason Rute (Aug 12 2023 at 17:51):

Ok, I've read up to page 8 and I have a much better picture. Overall, I think they have some good points, but they continue to use ITPs as a strawman.

Jason Rute (Aug 12 2023 at 17:51):

To illustrate their main idea this give this example. Suppose G is a graph with n vertices. Assume every vertex of G has degree at least n/2. Now show that G has a Hamiltonian cycle. They propose that the way to solve it is not to reason from basic logical rules (as they claim automated theorem provers do), but to pass through high level steps (intermediate lemmas if you well), such as:

  • the vertices in every longest path in graph G induce a subgraph of G which is Hamiltonian
  • every longest path in graph G contains all of the vertices of the graph

From these it is then much easier to solve the goal.

Jason Rute (Aug 12 2023 at 17:51):

I :100: percent agree. Much of classical ATP is built around basically cut-free proof systems where it is impossible to create lemmas. More modern neural ATP systems also are typically based around finding small next proof steps. While those next steps can in theory be lemmas (through things like have tactics or Isabelle's Isar style proof), this doesn't seem to be used much or done well yet. Language models like ChatGPT and Minerva actually seem better at producing high level proofs that anything I've seen in the formal math word, and that is certainly one approach. A lot of other approaches I've seen, and papers like Draft-Sketch-Prove and Decoding the Enigma start to exploit this by first having a language model give a high level proof in natural language and then translating it to a proof sketch in Isabelle (again progressively breaking the main goal into intermediate lemmas). In a different direction, papers like Subgoal search for complex reasoning tasks shows that there is a big advantage to predicting future states instead of next actions. But still, I think we have a long way to go still.

Jason Rute (Aug 12 2023 at 17:51):

What I don't agree with is that the authors seem to think that in ITPs like Lean that one writes a proof in a fundamentally different way. I'm sure a typical proof of the above theorem in Lean (or even Metamath) would again proceed in a series of high-level lemmas just like they gave. And further, as papers like Draft-Sketch-Prove has shown, ITPs can actually be a great component inside an automated theorem prover when combined with other high level tools like language models or conjecturing bots.

Jason Rute (Aug 12 2023 at 17:51):

Now, what I finally figured out is that the authors have their only research program which has been going on for half a decade of developing a conjecturing tool that is supposed to find conjectures (or intermediate lemmas) just like the ones in the Hamiltonian graph example above. It is called Conjecturing. I don't know much about it yet. It seems to be related to Sage.

Jason Rute (Aug 12 2023 at 17:51):

If it actually works well, then I could see it being a good addition to current efforts in automated theorem proving. It seems clear to me that the most powerful tools in the end will combine neural AI (for interfacing with natural language mathematics and human mathematicians), automated conjecturing (for automated ideation and discovery), computer algebra systems (for computation), and theorem provers (for checking details and building a common library of mathematical facts). I don't see much need for gate keeping here. (It is for example sad that there is still so much incompatibility between today's ITPs and CAS systems today.)

Jason Rute (Aug 12 2023 at 17:51):

I still haven't read the rest. I think the rest of the paper is mostly a systematic response to arguments to their "top-down" automated theorem proving program.

Min-Hsien Weng (Aug 14 2023 at 00:12):

THank @Jason Rute For good summary.
I like the paper describing their approach as top-down programming. Top-down programming is more like how we actually program: start with a function and update it as we learn and find out more problems. This lets us learn the details of the function and make design decisions. Bottom-up programming may hide such information because we are just calling it from a library.

The author suggested that this top-down approach will help us reproduce the proofs written in the publication. I think the authors would make a good point here, as translating proofs into Lean requires a bit of manual work, and the translation may lose some information, due to abstraction. With the help of LLM, the precision of translation may be improved?!

I am trying to understand the Conjecturing part of the paper.

Patrick Nicodemus (Aug 24 2023 at 04:27):

@Jason Rute Are you familiar with "proof planning"? It's a research program for automated theorem proving based around the idea of defining high level goal blocks with input preconditions and output postconditions, without necessarily having a proof of the goal block. The idea is to find a high level proof plan and then recursively decompose those goals into smaller goals until they can be solved by ATP. The proof plan is found by coding the plan in STRIPS or PDDP language and feeding it to an automated planner.

Patrick Nicodemus (Aug 24 2023 at 04:28):

it seems related to this.

Jason Rute (Aug 24 2023 at 18:05):

@Patrick Nicodemus I'm not. Does it work? Are there any good benchmarks on this topic?

Patrick Nicodemus (Aug 24 2023 at 23:18):

Jason Rute said:

Patrick Nicodemus I'm not. Does it work? Are there any good benchmarks on this topic?

It was pretty heavily studied for a while, but Afaik it's not an active research area right now. It's not clear if there's a good reason it died out, maybe people just shifted their focus? It sounds sensible to me, AI planning has a solid history and it makes sense to draw on it here. You'd have to talk to somebody who was actually part of one of those groups to get a good appraisal. Which you can do if you want. many of them are still active.

These research groups built pieces of software called "proof planners" where you would write "methods" which are like declarative versions of tactics but "coarser", the details of how it is to be filled in and implemented are left up to the search engine which combines them. Methods also differ from tactics in that you have to write clear pre and post conditions that the proof planner can use to figure out when they can be chained together to achieve valid proof.

Here is a paper where three of them are compared and contrasted.
"On the comparison of proof planning systems Lambda Clam, Omega and IsaPlanner."
https://www.cl.cam.ac.uk/~mj201/publications/comp_pp_final.pdf

The technique was first applied in a theorem prover called Oyster (Alan Bundy and the DREAM group at Edinburgh) based on extensional Martin-Lof dependent type theory, to plan proofs based on a technique called "rippling" for proving things by induction, it is a rewriting technique to prove the induction step by migrating the variable from the inside out until you reduce it to the inductive hypothesis. This first proof planner was called Clam, there is also Lambda Clam for higher order logic (this is getting off topic, but the point of Lambda Clam is that it is written in a higher order logic programming language called Lambda Prolog which is supposed to be particularly adequate for theorem proving, programming language implementation and similar tasks because it gives you very elegant ways of dealing with free and bound variables. For this reason Lambda Prolog was added as a semi-official tactic language for Coq, the Embeddable Lambda Prolog Interpreter or ELPI for short).

Another planner called Omega (different research group, Research Center for Artificial Intelligence at Saarbrücken) was integrated with traditional ATP's, knowledge-bases, and CAS's that it could query. Their papers are interesting.
https://www.sciencedirect.com/science/article/pii/S0004370299000764
https://www.sciencedirect.com/science/article/pii/S0004370207001968

Lucas Dixon, who is now at Google, wrote a planner called IsaPlanner for Isabelle.
https://era.ed.ac.uk/handle/1842/1250
A sample paper if you want to skim the abstract
https://ieeexplore.ieee.org/document/8131278
This is a philosophical argument that we should pay more attention to high level planning as an integral part of mathematical problem solving
https://www.cambridge.org/core/journals/review-of-symbolic-logic/article/abs/plans-and-planning-in-mathematical-proofs/6F84A9D231721E7B1B60056F5493D795


Last updated: Dec 20 2023 at 11:08 UTC