Zulip Chat Archive

Stream: Equational

Topic: Running the `vampire` solver on all of the equations

Aaron Hill (Sep 27 2024 at 16:07):

I'm currently running the 'vampire' solver over all the pairs of equations, to see what implications it can find. My current setup looks like this:

I modified generate_eqs_list.py to generate TPTP-formatted equations. For example, EQ5 (x = y * x) looks like this:
fof(eq5, hypothesis, ! [X] : ! [Y] : X = mul(Y, X)).
I made a script to generate all pairs of implications, changing 'hypothesis' to 'conjecture'. For example:

fof(eq587, hypothesis, ! [X] :  ! [Y] :  ! [Z] :  ! [W] :  ! [U] :  X = mul(Y, mul(Z, mul(Z, mul(W, U))))).
fof(eq4620, conjecture, ! [X] :  ! [Y] :  ! [Z] :  mul(mul(X, X), Y) = mul(mul(Z, Y), Z)).

When Vampire output contains 'Refutation' for a particular file in the above format, it's found a proof that EQA => EQB (the terminology is somewhat confusing: 'Refutation' means that Vampire refuted a negation of the conjecture, i.e. proved the implication)

While this is running, I'm currently working on extracting useful information from the proof outputed by Vampire. This is probably going to be difficult in general, but I'm hoping that a large number of cases will use the same kinds of steps, which will be suitable for (partial) automated conversion to Lean. At a minimum, we should be able to get a list of 'known true' implications that should eventually be proved in lean.

Here's the full list of TPTP-format equations: https://gist.github.com/Aaron1011/d94c2d0e2d4d124758a5e2fa0edd73e9

Aaron Hill (Sep 27 2024 at 16:24):

Here's a sample of an implication proven by vampire:

fof(eq587, hypothesis, ! [X] :  ! [Y] :  ! [Z] :  ! [W] :  ! [U] :  X = mul(Y, mul(Z, mul(Z, mul(W, U))))).
fof(eq4620, conjecture, ! [X] :  ! [Y] :  ! [Z] :  mul(mul(X, X), Y) = mul(mul(Z, Y), Z)).

% Running in auto input_syntax mode. Trying TPTP
% Refutation found. Thanks to Tanya!
% SZS status Theorem for eq587_imp_eq4620
% SZS output start Proof for eq587_imp_eq4620
1. ! [X0] : ! [X1] : ! [X2] : ! [X3] : ! [X4] : mul(X1,mul(X2,mul(X2,mul(X3,X4)))) = X0 [input]
2. ! [X0] : ! [X1] : ! [X2] : mul(mul(X0,X0),X1) = mul(mul(X2,X1),X2) [input]
3. ~! [X0] : ! [X1] : ! [X2] : mul(mul(X0,X0),X1) = mul(mul(X2,X1),X2) [negated conjecture 2]
4. ! [X0,X1,X2,X3,X4] : mul(X1,mul(X2,mul(X2,mul(X3,X4)))) = X0 [flattening 1]
5. ~! [X0,X1,X2] : mul(mul(X0,X0),X1) = mul(mul(X2,X1),X2) [flattening 3]
6. ? [X0,X1,X2] : mul(mul(X0,X0),X1) != mul(mul(X2,X1),X2) [ennf transformation 5]
7. ? [X0,X1,X2] : mul(mul(X0,X0),X1) != mul(mul(X2,X1),X2) => mul(mul(sK0,sK0),sK1) != mul(mul(sK2,sK1),sK2) [choice axiom]
8. mul(mul(sK0,sK0),sK1) != mul(mul(sK2,sK1),sK2) [skolemisation 6,7]
9. mul(X1,mul(X2,mul(X2,mul(X3,X4)))) = X0 [cnf transformation 4]
10. mul(mul(sK0,sK0),sK1) != mul(mul(sK2,sK1),sK2) [cnf transformation 8]
15. mul(X4,X3) = X5 [superposition 9,9]
17. X4 = X5 [superposition 9,9]
19. mul(mul(sK0,sK0),X0) != mul(mul(sK2,X0),sK2) [superposition 10,17]
35. $false [subsumption resolution 19,15]
% SZS output end Proof for eq587_imp_eq4620
% ------------------------------
% Version: Vampire 4.9 (commit 5ad494e78 on 2024-06-14 14:05:27 +0100)
% Linked with Z3 4.12.3.0 79bbbf76d0c123481c8ca05cd3a98939270074d3 z3-4.8.4-7980-g79bbbf76d
% Termination reason: Refutation

% Memory used [KB]: 428
% Time elapsed: 0.003 s
% Instructions burned: 5 (million)
% ------------------------------
% ------------------------------

Last updated: May 02 2025 at 03:31 UTC