Zulip Chat Archive

Stream: computer science

Topic: Superposition


Mirek Olšák (Sep 17 2025 at 15:08):

Following a discussion at AITP, I was curious -- how far are Lean libraries from proving completeness of superposition (basically ordered resolution & rewriting & unification)? Some basic pieces, such as first-order terms and variables, should be in ModelTheory (perhaps some compactness results could also come in handy). Is there anything regarding clauses (say, in a propositional setup)? Perhaps some term-rewriting theory? And in case someone were to look at it, would it rather belong to CSLib than Mathlib? What do you think?
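
(Just to illustrate the kind of object I have in mind, here is a toy propositional sketch; these are my own throwaway definitions, not anything from an existing library, and the names Lit, Clause and Satisfiable are made up.)

-- Toy sketch: signed atoms, clauses as lists of literals, and satisfiability of a clause set.
structure Lit where
  atom : Nat     -- propositional variable index
  pos  : Bool    -- polarity of the literal

abbrev Clause := List Lit

-- A literal holds under an assignment `v` if its atom's value matches its polarity.
def Lit.eval (v : Nat → Bool) (l : Lit) : Bool :=
  if l.pos then v l.atom else !(v l.atom)

-- A clause is a disjunction: it holds if some literal in it holds.
def Clause.eval (v : Nat → Bool) (c : Clause) : Bool :=
  c.any (Lit.eval v)

-- A clause set is satisfiable if some assignment makes every clause true.
def Satisfiable (cs : List Clause) : Prop :=
  ∃ v : Nat → Bool, ∀ c, c ∈ cs → Clause.eval v c = true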

By the way, the original motivation was whether we could convince Lean that a "satisfiable" verdict from a superposition-based solver is correct (in experiments with Terry Tao's equational theories). Of course the problems there were a bit specific (they only used unit clauses), so the theory should be easier. Although having superposition completeness would be a nice theoretical accomplishment... just perhaps too hard?

Mirek Olšák (Sep 17 2025 at 20:40):

Ok, I was pointed to an Isabelle formalization
https://drops.dagstuhl.de/storage/00lipics/lipics-vol309-itp2024/LIPIcs.ITP.2024.12/LIPIcs.ITP.2024.12.pdf

Henrik Böving (Sep 18 2025 at 07:49):

If Vampire gives us the superposition proof trace, there shouldn't really be a need for any proof theory on our side, I think? Recovering the proof from the trace should be a pretty doable Lean metaprogramming task, even in the non-unit case (of course, as simplifications come in, things might get a bit more difficult).

Mirek Olšák (Sep 18 2025 at 08:35):

I meant the "satisfiable case", that is, when Vampire concludes there are no more clauses that can be derived. This requires proving completeness, which is much harder than soundness.

Henrik Böving (Sep 18 2025 at 08:37):

If Vampire concludes there are no more clauses that can be derived and it hasn't reached a contradiction yet, the active set / processed set (depending on which literature you read) contains a model.

Mirek Olšák (Sep 18 2025 at 08:45):

We discussed this at AITP, and "contains a model" is quite a bold statement. The model could be infinite. I need to check whether Vampire at least decides all the unit equalities, but even then you need some rewriting theory (termination, confluence) to be sure these equalities define a model.

Mirek Olšák (Sep 18 2025 at 09:01):

Indeed, it simplifies to the unit equalities, so for this particular task, term-rewriting theory (KBO) should be enough. Does Lean have it?
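
To make the KBO part concrete, here is a toy sketch of the weight component over a minimal term type; these are my own definitions (as far as I know nothing like this exists in Mathlib), and a full KBO also needs a symbol precedence and the variable-occurrence condition.

-- Toy sketch of the KBO weight function over terms with only variables,
-- constants and unary symbols (my own definitions, not an existing library).
inductive Term (F : Type) where
  | var   : Nat → Term F          -- variables X0, X1, ...
  | const : F → Term F            -- constant symbols
  | app   : F → Term F → Term F   -- unary function symbols

-- KBO gives every variable a fixed weight w0 and every symbol f a weight w f;
-- the weight of a term sums the weights of all its symbol and variable occurrences.
def kboWeight {F : Type} (w : F → Nat) (w0 : Nat) : Term F → Nat
  | .var _   => w0
  | .const c => w c
  | .app f t => w f + kboWeight w w0 t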

Henrik Böving (Sep 18 2025 at 09:03):

Mirek Olšák said:

Ok, I was pointed to an Isabelle formalization
https://drops.dagstuhl.de/storage/00lipics/lipics-vol309-itp2024/LIPIcs.ITP.2024.12/LIPIcs.ITP.2024.12.pdf

I can only tell you that the people who wrote this paper taught me that it contains a model; you can check here to see how it is recovered in theory: https://www.tcs.ifi.lmu.de/lehre/ws-2024-25/atp/slides13-more-superposition.pdf

While there are indeed infinite models, superposition is a semi-decision procedure: it is only guaranteed to terminate if a proof (in this case a contradiction) exists; otherwise it can end up running forever. If it terminates, the saturated clause set can be used to recover a model. It might be that Vampire implements additional strategies on top of superposition (I've only read E papers and code before, I don't know about Vampire) that are able to identify certain situations where an infinite model exists and give up, but the basic superposition calculus has this property.
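
The saturation loop in question looks roughly like this; a heavily simplified sketch of mine, where Clause, isEmptyClause, infer and redundant are hypothetical placeholders rather than E's or Vampire's actual interface.

-- Rough sketch of a given-clause saturation loop (my own simplification, not a real prover's API).
abbrev Clause := List String   -- stand-in clause representation, just so the sketch is self-contained

partial def saturate
    (isEmptyClause : Clause → Bool)                -- did we derive the empty clause?
    (infer : Clause → List Clause → List Clause)   -- all inferences of a clause with the active set
    (redundant : Clause → List Clause → Bool)      -- redundancy check against the active set
    (active passive : List Clause) : Option (List Clause) :=
  match passive with
  | [] => some active           -- passive queue empty: `active` is the saturated set
  | given :: rest =>
    if isEmptyClause given then
      none                      -- empty clause derived: the input is unsatisfiable
    else if redundant given active then
      saturate isEmptyClause infer redundant active rest
    else
      saturate isEmptyClause infer redundant (given :: active) (rest ++ infer given active)

Only the contradiction branch is guaranteed to be reached in finite time; if the passive queue ever runs dry, the accumulated active set is the saturated set from which a model is recovered.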

Henrik Böving (Sep 18 2025 at 09:09):

Mirek Olšák said:

Indeed, it simplifies to the unit equalities, so for this particular task, term-rewriting theory (KBO) should be enough. Does Lean have it?

While you might need some theory to prove that something is a counterexample to your statement, there is no need to actually do that proof when presenting a counterexample to your user (one of the nice properties of counterexample-finding tools as opposed to proving ones). I guess one could still do it for fun, but for an interactive tool there is no need to produce proofs.

Mirek Olšák (Sep 18 2025 at 09:23):

Let's take the natural numbers with successor and predecessor, for example. If I am not mistaken, this theory only has infinite models:

fof('succ_nz', axiom, ![X] : (s(X) != z)).
fof('pred_succ', axiom, ![X] : (p(s(X)) = X)).
fof('succ_pred', axiom, ![X] : ((X != z) => (s(p(X)) = X))).
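
(Just to spell out the intended infinite model by hand: my own sketch, interpreting z as 0, s as the successor and p as the truncated predecessor on the natural numbers. The three axioms are then easy to check in Lean:)

-- Checking the axioms above in the intended model over Nat: z ↦ 0, s ↦ Nat.succ, p ↦ Nat.pred.
example (x : Nat) : Nat.succ x ≠ 0 :=              -- succ_nz
  Nat.succ_ne_zero x

example (x : Nat) : Nat.pred (Nat.succ x) = x :=   -- pred_succ
  rfl

example (x : Nat) (h : x ≠ 0) : Nat.succ (Nat.pred x) = x := by   -- succ_pred
  cases x with
  | zero   => exact absurd rfl h
  | succ n => rfl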

Vampire answers

% # SZS output start Saturation.
cnf(u7,axiom,
    s(p(X0)) = X0 | z = X0).

cnf(u6,axiom,
    p(s(X0)) = X0).

cnf(u5,axiom,
    s(X0) != z).

% # SZS output end Saturation.

E answers

# SZS status Satisfiable
# Processed positive unit clauses:
cnf(i_0_2, plain, (p(s(X1))=X1)).

# Processed negative unit clauses:
cnf(i_0_1, plain, (s(X1)!=z)).

# Processed non-unit clauses:
cnf(i_0_3, plain, (s(p(X1))=X1|X1=z)).

If I tried, I could get the provers to also show the term ordering they are using.
How would you use any of this to convince Lean of satisfiability?

Henrik Böving (Sep 18 2025 at 09:53):

I'm again not sure why you are interested in convincing Lean of satisfiability here; if you have a use case for getting satisfiability results into Lean apart from counterexample generation, please do tell, because I am not aware of one.

Anyway, this is likely one of the situations mentioned above where practical superposition systems implement additional mechanisms to detect that they are in a hopeless situation. If you just hand this to a completely plain superposition prover it will indeed end up looping (e.g. the one I wrote with the people from LMU does end up looping). Unfortunately I'm not deep enough into practical ATPs to say exactly how this works out, but I can ask around.

Mirek Olšák (Sep 18 2025 at 10:23):

I am not sure I agree with the "practical / theoretical" distinction. I admit I haven't implemented superposition on my own, but just doing it manually I don't see any meaningful rewrite I could do in the example problem (assuming z is the smallest symbol in the term ordering). The only rewrite I can see is combining the two inverses, but that results in a trivially true clause. Am I missing something? How does your solver loop?

Also yes, it is more of a curiosity than a critical need. As I mentioned above, the question arose from the equational theories project (which has already finished, but someone might want something similar in the future). In equational theories, people wanted to have both proofs and disproofs.

Henrik Böving (Sep 18 2025 at 11:06):

How does your solver loop?

It does end up in a situation where it repeatedly superposes clause 3 and its descendants with themselves; most of these inferences are detected as useless, but there are enough to keep the loop going. I would assume that E and Vampire have a smarter idea of which inferences might be redundant.

In equational theories, people wanted to have both proofs and disproofs.

If you are interested in a disproof, a superposition prover can do that for you: just ask it to prove the negation of your theorem instead of the theorem it found a counterexample for. Then you can just run the normal proof-recovery procedure on the trace.

Henrik Böving (Sep 18 2025 at 11:10):

And regarding the practical / theoretical distinction, here is, e.g., the eprover manual: https://wwwlehre.dhbw-stuttgart.de/~sschulz/WORK/E_DOWNLOAD/V_3.1/eprover.pdf
On page 8 and following they describe the specific inference system they are using: theoretical first-order superposition has 4 rules, while they have multiple pages of rules alone. A real-world superposition implementation takes a lot of shortcuts through additional tricks that are not described in formalisations like the Isabelle paper above. As a simple example:

fof(goal, conjecture,
  ((? [X] : ! [Y] : (p(X) <=> p(Y))) <=>
   ((? [X] : q(X)) <=> ! [Y] : p(Y)))
  <=>
  ((? [X] : ! [Y] : (q(X) <=> q(Y))) <=>
   ((? [X] : p(X)) <=> ! [Y] : q(Y)))
).

solving this query in any kind of reasonable time is basically hopeless if you don't implement the CLC simplification rule.

Henrik Böving (Sep 18 2025 at 11:44):

The trick seems to be that the theoretical algorithm for extracting the model, e.g. the one described here:
https://www.tcs.ifi.lmu.de/lehre/ws-2024-25/atp/slides13-more-superposition.pdf

can also end up looping, so if you were to let it run on what Vampire outputs, it would just start printing the infinite description of the model that you are looking for.

Mirek Olšák (Sep 18 2025 at 21:45):

Henrik Böving said:

How does your solver loop?

It does end up in a situation where it repeatedly superposes clause 3 and its descendants with themselves; most of these inferences are detected as useless, but there are enough to keep the loop going. I would assume that E and Vampire have a smarter idea of which inferences might be redundant.

I would be curious, for example, what the first clause it derives is, and why it doesn't realize it is trivial (if you can easily display it).

In equational theories, people wanted to have both proofs and disproofs.

If you are interested in a disproof, a superposition prover can do that for you: just ask it to prove the negation of your theorem instead of the theorem it found a counterexample for. Then you can just run the normal proof-recovery procedure on the trace.

Not really. The claim "every model of T satisfies X" cannot be negated within a first-order theory. You would need to provide the solver with an external theory that is on its own capable of constructing the model, and I am afraid FOL solvers are not great at dealing with such problems.


Last updated: Dec 20 2025 at 21:32 UTC