Zulip Chat Archive

Stream: Machine Learning for Theorem Proving

Topic: LEO

Julian Berman (Mar 30 2021 at 17:58):

@Jason Rute I also found https://leanprover.zulipchat.com/#narrow/stream/113488-general/topic/Lean.20code.20into.20AST.2FXML.2FJSON.20for.20parsing.2Fgeneration.20for.20ML.3F/near/232463466 quite interesting

I was wondering if you had any specific opinions on LEO -- http://page.mi.fu-berlin.de/lex/leo3/ -- there isn't really any intelligence behind that question other than "I saw it mentioned on HackerNews and made a note to look at how it works given it seems to make some lofty claims on that page", but yeah curious if you'd looked at it in any way (I'd searched through Zulip to see if it was mentioned before but it's hard to filter out responses that mention De Moura :C)

Mario Carneiro (Mar 30 2021 at 20:33):

I don't think LEO has ever been mentioned on zulip before

Jason Rute (Mar 31 2021 at 00:25):

@Julian Berman I don't know much about ATPs except that they exist, they are pretty good at what they do, and you still have to tell them what premises to use in the proof, so there is a lot of applications for ML in premise selection.

Jason Rute (Mar 31 2021 at 00:25):

Others here, like @Jasmin Blanchette , may have more opinions on LEO.

Jason Rute (Mar 31 2021 at 00:25):

I do think there is an interesting comparison to make between ATPs like LEO and the newer ML learning-based tools when it comes to automatic theorem proving, especially in ITPs. The times I've seen then compared head-to-head, the ATP-based Hammers and the neural network based neural theorem provers perform comparably, with each doing better at different types of problems. Moreover, hammers still need you to provide them with relevant theorems from the libraries, and that seems to be something neural networks and machine learning agents are quite good at so there is room for symbiosis there. On the other hand, hammers are fast and really effective for certain types of problems. For neural tactic-based theorem provers, a hammer could just be another tactic to choose from when it is the right tool for the job, so again room for symbiosis.

Jason Rute (Mar 31 2021 at 00:26):

I think in the end ML agents will have a larger impact on ITP-like automated theorem proving, just because they are more general. The same ML tools basically work on all the ITPs (and frankly on problems completely separate from theorem proving). ML agents can do both premise selection as well as logical reasoning, and in the next few years they will also be able to do basic conjecturing, lemma and definition creation, ITP translation, and auto-formalization, whereas ATPs are really specialized for one thing. On the other hand, in industrial settings and certain types of combinatorial problems I can see ATPs being increasingly important because of their speed and power.

Jason Rute (Mar 31 2021 at 00:26):

But I will again say, I don't know a lot about ATPs.

Simon Cruanes (Mar 31 2021 at 19:38):

Leo-3 has been a strong contender in the THF (higher-order logic) category of CASC in recent years. It's been developed by really few people and still got pretty good results, by reusing existing first-order provers (E, CVC4, …) in parallel to its higher-order calculus. Of course if you look at Leo you may also want to look at Zipperposition as it got better than it used to and won last CASC in THF.

I don't know how machine learning based solutions compare, I think they have a bright future for lemma selection and possibly some heuristics in provers, but a purely ML-based prover (for the kind of problem ATPs like Leo are good at) would surprise me. I think it's complementary.

Jason Rute (Mar 31 2021 at 22:46):

Simon Cruanes said:

(for the kind of problem ATPs like Leo are good at)

Do you have an idea what sort of problems those are? For example, if used as a hammer for HOL interactive theorem proving, what sort of problems does Leo excel at? Or is that not the point? Is it more designed for industrial style problems?

Simon Cruanes (Apr 01 2021 at 14:43):

I'm not sure of the precise answer :smile: . For first-order provers like E, the sweet spot is problem with a lot of non-trivial equational reasoning and not too many nested quantifiers (which can mess up the CNF). HO theorem proving is tougher because the search space explodes faster; however I think that at least for Zipperposition, tough unification problems are now pretty doable, the difficult thing is instantiating predicate variables.

I'm getting lost in the details. I think the Hammer papers (sledgehammer, holyHammer, etc.) may give more insight on what kind of problems the ATPs are good at. The prover have also improved recently (handling HO directly instead of requiring an encoding to FOL) so the papers are more of a lower bound. The detailed results of CASC also hold a lot of raw data.

Last updated: May 02 2025 at 03:31 UTC