Zulip Chat Archive

Stream: Equational

Topic: Equation 477

Fan Zheng (Oct 06 2024 at 20:26):

Here is a proof sketch showing that Equation 477 only implies equations that can be trivially simplified by it.

It suffices to show that different sequences of simplifications lead to the same simplest result.

First, we can defer any simplification of the whole term (as opposed to its proper factor) to the last step.

Then, if there is such a simplification in the sequence, turning Y(X(Y(YY))) to Y, we show that X(Y(YY)) cannot be simplified.

Since X is the final result, it cannot be simplified.

If Y can be simplified, we simplify it before applying the last simplification.

Now assume that X and Y are simplest. If YY can be simplified, then Y=y=x(y(yy)), but the lengths don't match.

Similarly if Y(YY) can be simplified, then Y=y and YY=x(y(yy)), but the lengths still don't match.

If X(Y(YY)) can be simplified, then X=y and Y(YY) = x(y(yy)). Then x=y=Y, and the lengths still don't match.

So X(Y(YY)) cannot be simplified. By induction before the possible final step applied to the whole term,
each of the two factors of the outermost product must be identical between two sequences of simplifications.
Then it is not hard to show that the same simplification (if any) applies in the last step to give the same result.

Fan Zheng (Oct 06 2024 at 20:46):

https://www.overleaf.com/read/rgfnfdfvxnnh#44f904

Here is a write-up of what I get along the lines above. Clearly it is no match for beastlike machines, but it does contain some currently unknown results (hopefully they are correct).

Terence Tao (Oct 06 2024 at 21:10):

Perhaps you could make a file of conjectures and upload it to the repository? If you aren't familiar with the Lean syntax we are using, you could just give a text list of implications your method seems to establish and I can upload it myself.

Terence Tao (Oct 06 2024 at 21:13):

I think sometimes some simplifications pass through one or more temporary expansion stages, and do not necessarily arise from simplifying a single large expression in two ways. For instance the proof of 1689->2 takes quite a long path from x = (y*x) * ((x*z)*z) to x=y that goes through quite long equations as intermediate steps, so one has to be careful in making sure one is not ruling out some unintuitive ways to simplify one equation to another.

Fan Zheng (Oct 07 2024 at 11:51):

I think it's true that if every expression has unique simplification, then the truth of any equation can be checked by direct simplification. First expanding then simplifying is making use of the fact that the expanded expression has two ways of simplification. For example, in the first step of 1689->2, we are using the following two ways of simplification:

((xx)((xy)y))((((xy)y)z)z)=x((((xy)y)z)z)

and

((xx)((xy)y))((((xy)y)z)z)=(xy)y

If this does not happen, then we can build a model (on a given alphabet A) whose elements are simple expressions (those which cannot be simplified) build from letters in A, and whose operation is x*y=simplification of xy. Then if two sides of an equation is different after simplification, then it gives a counterexample to the equation in th

Kevin Buzzard (Oct 07 2024 at 12:01):

Can you formalise your ideas in lean?

Fan Zheng (Oct 07 2024 at 12:06):

That's the question. I'm not familiar enough with Lean to know, for example, how to formalize the property "an expression cannot be simplified" (meaning that no rule of simplication can be applied to any of its subexpressions), etc.

Amir Livne Bar-on (Oct 07 2024 at 12:18):

Do we even know that calculation chains are complete? That is, that every "implies" relation has a proof by elementary calculation?

Daniel Weber (Oct 07 2024 at 12:27):

Yes, that's the completeness theorem

Amir Livne Bar-on (Oct 07 2024 at 13:45):

Since all the simple substitutions are already proved, we can add conjectures for every unknown implication whose lhs is an equation like 477, with x on one side and one the other an expression that does not unify with any sub-expression

Amir Livne Bar-on (Oct 07 2024 at 13:46):

@Fan Zheng do you have a list of these equations?

Terence Tao (Oct 07 2024 at 14:53):

I see now @Fan Zheng , thanks for explaining. It sounds like "unique simplification" is going to be a useful property, and worth trying to formalize its basic properties in Lean.

Relating to this, I started a chapter in https://teorth.github.io/equational_theories/blueprint/abstract-nonsense-chapter.html (inspired by the metatheory and invariants chapter) on the concept of a free magma $M_{X,\Gamma}$ on an alphabet $X$ subject to a theory $\Gamma$ (a collection of laws. For instance, we can talk about the free associative magma generated by $X$ , the free commutative and associative magma generated by $X$ , and so forth. This magma is a quotient of the free magma $M_X$ with alphabet $X$ , so to verify that a law $w\simeq w'$ is implied by $\Gamma$ turns out by abstract nonsense to be equivalent to $w$ and $w'$ having the same representative in $M_{X,\Gamma}$ . So for those theories for which the free magma $M_{X,\Gamma}$ is easy to describe, determining implication $\Gamma \vdash w \simeq w'$ is easy.

The point here seems to be that if a theory $\Gamma$ admits unique simplification, then its free magma $M_{X,\Gamma}$ is given by the set of all simplified words in $X$ , with the $M_{X,\Gamma}$ magma operation given by simplifying the $M_X$ magma operation.

If I have time later today, I will try to organize this chapter a bit more and set up these statements to be formalized in Lean. I had already started to notice at the bottom of the chapter that the idempotent law $x = x \diamond x$ seemed to have the unique simplification property, but now I see that the situation is more general.

At some point it might be worth trying to see if we can in fact classify which of the 4694 laws we have enjoy unique simplification, but that is a future project.

Daniel Weber (Oct 07 2024 at 14:54):

This "unique simplification" is usually called confluence, if I'm not mistaken

Andrés Goens (Oct 07 2024 at 15:02):

I think this construction (the free magma subject to a theory) is essentially what I meant in #Equational > Counterexamples by Enumerating Words in Quotient Magma and if I understand correctly, the basis for what @Anand Rao Tadipatri and @Hernan Ibarra are using also to find some counterexamples (with a cleverer use of more structure in some of the concrete such free magmas subject to a theory)

Andrés Goens (Oct 07 2024 at 15:03):

I wonder if it's worth unifying the terminology (e.g. phrasing some of the stuff from Chapter 8 based on these definitions)?

Will Sawin (Oct 07 2024 at 15:07):

This kind of proof strategy seems similar to what ATPs do to prove non-implications as discussed for example here. Whenever I try to think of an argument of this type I feel that anything simple enough for me to find would be possible for Vampire or whatever to find as well so it's not worth trying to fill in the details. Is it possible to describe what property an argument involving confluence would need to have to be hard to find for an ATP?

Daniel Weber (Oct 07 2024 at 15:08):

If it has an infinite family of rewriting laws, perhaps?

Hernan Ibarra (Oct 07 2024 at 15:38):

Funny this thread opened yesterday night; I was thinking about these issues all weekend. Here are some of my thoughts:

We are looking for free structures (say, free magmas modulo some equations) in which the word problem is solvable. For example, the word problem for groups is unsolvable in general, but it is solvable for free groups. Usually, but not always, the word problem is solved by proving that all words can be put into a unique normal form (here confluence means that all reductions lead to this normal form).

For any such free structure F we find, we can get a new metatheorem that says that if A -> B and F satisfies A (when the variables are instantiated as the generators of F) then F must satisfy B. The fact that we can solve the word problem makes it computationally tractable to verify A and B in F.

Hernan Ibarra (Oct 07 2024 at 15:40):

Equivalently, consider the fragment of first-order logic with only equality, conjunction, and universal quantifier as logical symbols. Then we are looking for decidable theories in this logic.

Hernan Ibarra (Oct 07 2024 at 15:41):

Word problems and decidability of theories are well-studied but I don't know of any references that discuss this specific case. I was planning to ask in stackexchange later today.

Hernan Ibarra (Oct 07 2024 at 15:45):

Andrés Goens said:

I wonder if it's worth unifying the terminology (e.g. phrasing some of the stuff from Chapter 8 based on these definitions)?

I plan to polish Chapter 8 at some point in the future. When I do, I'll make sure to update the terminology to match whatever is in use in the blueprint.

Anand Rao Tadipatri (Oct 07 2024 at 15:51):

I'd also like to add that the Knuth-Bendix algorithm might be of use in finding theories with decidable equality (this idea is described in slightly more detail in Remark 3 of Chapter 8).

Daniel Weber (Oct 07 2024 at 15:52):

Anand Rao Tadipatri said:

I'd also like to add that the Knuth-Bendix algorithm might be of use in finding theories with decidable equality (this idea is described in slightly more detail in Remark 3 of Chapter 8).

Is that the same as deriving models from saturation, as in #Equational > Constructing infinite models for non-implications?

Anand Rao Tadipatri (Oct 07 2024 at 16:25):

It possibly is, although I don't understand the superposition calculus deeply enough to be able to judge for certain.

In more detail, the idea I was suggesting is that if we know that a certain theory can be completed using the Knuth-Bendix procedure, then we know that there is a decision procedure for terms in the free model of that theory, and we can use the model to settle non-implications.

Daniel Weber (Oct 07 2024 at 16:33):

Is there a way to prove that such a set of equations is confluent in a simpler way than "run the Knuth-Bendix procedure"?

Andrés Goens (Oct 07 2024 at 17:03):

Daniel Weber said:

Anand Rao Tadipatri said:

I'd also like to add that the Knuth-Bendix algorithm might be of use in finding theories with decidable equality (this idea is described in slightly more detail in Remark 3 of Chapter 8).

Is that the same as deriving models from saturation, as in #Equational > Constructing infinite models for non-implications?

not quite, the termination of term rewrite systems and equality saturation are subtly different (don't have the reference on my phone but can pass it on later)

Andrés Goens (Oct 07 2024 at 17:53):

https://effect.systems/doc/egraphs-2023-theory/paper.pdf

Amir Livne Bar-on (Oct 07 2024 at 18:08):

I wrote a program that follows Zheng's logic for each of the equations in the list, and got some more that are confluent (have the unique simplification property). They are listed in this file:
like_477.txt

Amir Livne Bar-on (Oct 07 2024 at 18:09):

These should resolve the following unknown implications

Equation467 => Equation2847
Equation477 => Equation1426
Equation477 => Equation1519
Equation477 => Equation2035
Equation477 => Equation2128
Equation477 => Equation3050
Equation477 => Equation3150
Equation504 => Equation817
Equation504 => Equation1629
Equation504 => Equation1832
Equation504 => Equation1925
Equation1515 => Equation4590

and their duals

Terence Tao (Oct 07 2024 at 18:26):

Can you add that as a PR as a list of conjectures for now? I can start working on the blueprint later to set up the general theory of confluent theories, and then the specific confluence of 477 later (probably just cut and pasting from @Fan Zheng 's text to begin with).

Amir Livne Bar-on (Oct 07 2024 at 18:29):

Will do

Michael Bucko (Oct 07 2024 at 19:02):

Fan Zheng schrieb:

That's the question. I'm not familiar enough with Lean to know, for example, how to formalize the property "an expression cannot be simplified" (meaning that no rule of simplication can be applied to any of its subexpressions), etc.

My Lean is not good enough (I am learning), but I am wondering if such rules could be specified along the following lines

some sort of rules

inductive non_unique_simplify : expr → expr → Prop
| refl {e} : simplify e e
| symm {e₁ e₂} : simplify e₁ e₂ → simplify e₂ e₁
| trans {e₁ e₂ e₃} : simplify e₁ e₂ → simplify e₂ e₃ → simplify e₁ e₃
| rule₁ {e₁ e₂} : (* condition *) → simplify e₁ e₂
| rule₂ {e₁ e₂} : (* condition *) → simplify e₁ e₂

like these rules:

| assoc {a b c} : simplify (expr.op (expr.op a b) c) (expr.op a (expr.op b c))
| comm {a b}    : simplify (expr.op a b) (expr.op b a)

how to define derivations

example (e : expr) : ∃ e', non_unique_simplify e e' :=
begin
... some examples
end

finally, a combination of finding a common expansion and using transitivity to show the equivalence?

theorem is_equal_non_unique_simplify (E e₁ e₂ : expr)
  (h₁ : non_unique_simplify E e₁) (h₂ : non_unique_simplify E e₂) : non_unique_simplify e₁ e₂ :=
non_unique_simplify.trans (non_unique_simplify.symm h₁) h₂

Terence Tao (Oct 07 2024 at 19:20):

OK, I've updated the blueprint chapter at https://teorth.github.io/equational_theories/blueprint/abstract-nonsense-chapter.html to have a (very sketchy at present) section on confluent theories, with the assertion that 477 in particular is confluent. I think basically one needs to formalize all the theorems in this chapter and we will then be able to formalize a lot of the anti-implications in Fan's notes. (It's building within CI right now, but the new section should be visible in a few minutes.)

I've structured everything around the concept of a free magma $M_{X,\Gamma}$ generated by an alphabet $X$ subject to a theory $\Gamma$ of equational laws; this generalizes the free magmas $M_X$ (or FreeMagma X) we have already been using, which correspond to the case when the theory $\Gamma$ is the empty set. Basically, the free magma $M_{X,\Gamma}$ always exists by abstract nonsense, but is hard to work with in general, except in special cases such as the confluent case where there is a tractable explicit description. The free magmas give criteria for implications: if a law $w \simeq w'$ is such that w, w' have the same representation in the free magma $M_{X,\Gamma}$ , then the same property must hold for any law implied by that law, so any law that does not have this property is anti-implied by any law that does.

I've started the first issue at equational#412 to define, and establish existence and uniqueness of, free magmas subject to a theory. Once this is established it should be possible to proceed sequentially through the chapter until we reach the confluent portion.

If anyone has any suggestions on how to arrange the blueprint better (or to make it easier to formalize in Lean), please let me know!

Amir Livne Bar-on (Oct 07 2024 at 19:20):

Terence Tao said:

Can you add that as a PR as a list of conjectures for now? I can start working on the blueprint later to set up the general theory of confluent theories, and then the specific confluence of 477 later (probably just cut and pasting from Fan Zheng 's text to begin with).

https://github.com/teorth/equational_theories/pull/413
Takes us down from 695 unknown (net 336) to 675 unknowns (net 328)

Joachim Breitner (Oct 07 2024 at 20:02):

Is this a correct rephrasing of the idea: if an equation forms a confluent, terminating rewrite system, then the free magma quotiened by the equation (or equivalently, the set of normal forms) is a useful magma for counter examples, because the word problem is decidable and we can thus disprove lots of other laws easily.

Oh, and because the simplification rule clearly terminates (it shortens the word), local confluence is sufficient for confluence, and in the first post we see how self-overlap of the rule is ruled out by arguing about the length of the assignment to the non-linear arguments of the rule.

Fan Zheng (Oct 07 2024 at 20:33):

It turns out that I'm reinventing some wheels…

Good to see further developments of this line of attack. The notes above can be a starting point as it shows (if I didn't make mistakes) that every equation with at most 2 compositions, after adding some of its consequences (that is called saturation?), is confluent. For example, to Equation 9 [x = x(xy)] one needs to add Equation 3 [x=x(x(xy)))=xx], for otherwise x(x(xy)) has 2 simplifications. In the note I also worked out a few equations with 3 compositions (by hand) so if a program can be written to automate this, that would be fantastic.

Also, with a bit of extension (adding a constant symbol), I believe that Equation 481 (or rather the "saturation" {x=y(x(y0)) (the original equation), x=0(x0), 0=xx}) is confluent (see Theorem 54). That would close a lot of the remaining open implications, right?

Terence Tao (Oct 07 2024 at 20:58):

Joachim Breitner said:

Is this a correct rephrasing of the idea: if an equation forms a confluent, terminating rewrite system, then the free magma quotiened by the equation (or equivalently, the set of normal forms) is a useful magma for counter examples, because the word problem is decidable and we can thus disprove lots of other laws easily.

Oh, and because the simplification rule clearly terminates (it shortens the word), local confluence is sufficient for confluence, and in the first post we see how self-overlap of the rule is ruled out by arguing about the length of the assignment to the non-linear arguments of the rule.

Yes, this is my understanding of the theory. There could be additional subtleties when dealing with laws with an equal number of operations on both sides, as now rewriting preserves the length rather than shortens it, potentially leading to cycles of rewriting; but it seems the most interest right now is from unbalanced laws where we only rewrite in the direction of strictly shortening the length of a word, in which case I think things are fine and there is a unique minimal reduction. Otherwise one has to place the word in some sort of "normal form", e.g., in lexicographically shortest order among all rewritings.

Terence Tao (Oct 07 2024 at 21:00):

Abstract nonsense tells us that regardless of confluence, the free magma modulo the equation completely classifies the implication problem (an equation $w \simeq w'$ is implied by the law iff w, w' have the same normal form in this magma), but in general this would be an unusable classification due to the intractibility of the general word problem. But apparently it is very good in certain special cases!

Hernan Ibarra (Oct 07 2024 at 21:05):

I asked this question in MathOverflow https://mathoverflow.net/questions/480260/what-is-known-about-the-word-problem-on-free-algebraic-models , in hopes of obtaining more 'invariants' (terminology of chapter 8), i.e., equations we can apply this general strategy on.

Joachim Breitner (Oct 07 2024 at 21:15):

I couldn’t resist and hacked this into lean in a totally unprincipled and hard-to-efficiently-generalize way, but the proof is complete, and @Fan Zheng ’s justification for why there is no self-overlap was spot on:
https://github.com/teorth/equational_theories/pull/417

theorem Equation477_Facts :  ∃ (G : Type) (_ : Magma G), Facts G [477] [1426, 1519, 2035, 2128, 3050, 3150]

Joachim Breitner (Oct 07 2024 at 21:28):

For how many equations do we expect to apply this method? If it’s only three or four, then maybe it’s fine to just have three or four files as ugly as this one and move on?

(I fear a more principled approach amounts to formalizing the basic theory of rewriting systems, with critical pairs etc. :-))

Terence Tao (Oct 07 2024 at 21:34):

Wow, that was far faster than I expected! I imagine the extra compilation time is minimal compared with the tens of thousands of generated Lean implications that we already have, so I don't see much harm in accepting this proof for now, but continuing to work longer-term on a more systematic approach (which could potentially be useful for future projects beyond this one).

David Renshaw (Oct 07 2024 at 21:34):

In my opinion, getting that landed on main should take priority over beautifying it. Then the dashboard and equation explorer can show the true state of our knowledge, and folks who aren't following this particular thread won't accidentally start duplicating effort.

Joachim Breitner (Oct 07 2024 at 21:35):

Ok, I un-drafted it

Terence Tao (Oct 07 2024 at 21:38):

Once the AI contributions start rolling in, I don't think any human-written proof will come close to competing for "ugliest proof submission".

Joachim Breitner (Oct 07 2024 at 21:39):

Challenge accepted.

Joachim Breitner (Oct 07 2024 at 21:50):

Hmm, probably I can separate the “apply a rule bottom up” from “the concrete rule to look at” to apply this a bit more easily to other equations. And I'll see if the subtype of normal forms or the quotient type is more convenient.

Daniel Weber (Oct 08 2024 at 01:35):

Joachim Breitner said:

For how many equations do we expect to apply this method? If it’s only three or four, then maybe it’s fine to just have three or four files as ugly as this one and move on?

(I fear a more principled approach amounts to formalizing the basic theory of rewriting systems, with critical pairs etc. :-))

I think most of the Vampire counterexample conjectures are essentially this

Terence Tao (Oct 08 2024 at 01:57):

But Vampire wasn't able to settle the 477 implications, right?

Daniel Weber (Oct 08 2024 at 02:08):

It is now able to - I think I might've not run it on all equations correctly

Terence Tao (Oct 08 2024 at 02:23):

Is it worth doing a new Vampire run on the ~900 remaining equations (or ~300 after reductions)?

Terence Tao (Oct 08 2024 at 02:24):

now with a more generous run time allocation per equation

Daniel Weber (Oct 08 2024 at 02:36):

Terence Tao said:

Is it worth doing a new Vampire run on the ~900 remaining equations (or ~300 after reductions)?

My script is only seeing 675 implications :thinking:

lyphyser (Oct 08 2024 at 02:37):

does it resolve 3588 => ...? it looks simple and I just run the "Knuth-Bendix Completion Visualizer" tool and it claims that it can be restated as the confluent rule system f(z, f(f(x, y), z)) -> f(x, y), f(f(f(z, x1), f(x, y)), f(z, x1)) -> f(x, y)
but the Wikipedia page of Vampire claims that "reduction ordering on terms is the standard Knuth–Bendix ordering" so I wonder if that means Vampire also finds a Knuth-Bendix completion and it was somehow missed or if it's fruitful to try kbcv and any other Knuth-Bendix implementation that can be found on everything

Terence Tao (Oct 08 2024 at 02:38):

I think it's 675 modulo equivalence but like 991 not modulo equivalence

Daniel Weber (Oct 08 2024 at 02:40):

lyphyser said:

does it resolve 3588 => ...? it looks simple and I just run the "Knuth-Bendix Completion Visualizer" tool and it claims that it can be restated as the confluent rule system f(z, f(f(x, y), z)) -> f(x, y), f(f(f(z, x1), f(x, y)), f(z, x1)) -> f(x, y)
but the Wikipedia page of Vampire claims that "reduction ordering on terms is the standard Knuth–Bendix ordering" so I wonder if that means Vampire also finds a Knuth-Bendix completion and it was somehow missed or if it's fruitful to try kbcv and any other Knuth-Bendix implementation that can be found on everything

Now it found it, yes. I think I found a bug in find_powerful_theorems which caused it to miss some implications previously. I'll add the conjectures

Terence Tao (Oct 08 2024 at 02:41):

I'm particularly curious as to how it fares with 65 => *. That has defeated all the human challengers so far.

Daniel Weber (Oct 08 2024 at 02:42):

It can't handle it, I've run it specifically on that for a few minutes. I suspect that it doesn't have a finite rule system

Terence Tao (Oct 08 2024 at 02:42):

OK I'm glad to know that we haven't been completely replaced by Vampire lol

Daniel Weber (Oct 08 2024 at 04:20):

Daniel Weber said:

Now it found it, yes. I think I found a bug in find_powerful_theorems which caused it to miss some implications previously. I'll add the conjectures

equational#422 has the new conjectures (and two size 21 counterexamples)

Joachim Breitner (Oct 08 2024 at 06:12):

Daniel Weber said:

I think most of the Vampire counterexample conjectures are essentially this

Is it still useful for me to polish that manually written proof and extend it to the other facts in https://github.com/amirlb/equational_theories/blob/main/equational_theories/Generated/Confluence/ManuallySampled.lean, or will we have an automatically generated lean proof soon anyways?

Daniel Weber (Oct 08 2024 at 06:13):

I still don't know how to formalize the Vampire conjectures, I think this will be good to build a framework for that

Joachim Breitner (Oct 08 2024 at 09:39):

I cleaned up the lean setup to the point where I am no longer embarrased about it, and prove all the conjectures added by @Amir Livne Bar-on in https://github.com/teorth/equational_theories/pull/413:

https://github.com/teorth/equational_theories/pull/425

Daniel Weber (Oct 08 2024 at 09:44):

Is it possible to translate an equality saturation to a confluence-based proof? Consider for example 1110 -> 8. Vampire says

% # SZS output start Saturation.
cnf(u9,negated_conjecture,
    sK0 != mul(sK0,mul(sK0,sK0))).

cnf(u8,axiom,
    mul(X1,mul(mul(X1,mul(X0,X0)),X1)) = X0).

% # SZS output end Saturation.

I'm guessing that's not enough information to directly construct a model, seeing as that's just a rewriting of the input - what do you need for that?

Amir Livne Bar-on (Oct 08 2024 at 10:17):

Equation 1110 looks confluent to me... I'll check why my program didn't find it

Joachim Breitner (Oct 08 2024 at 10:28):

Indeed it is, added Facts G [1110] [8, 411, 1629, 1832, 3253, 3319, 3862, 3915] to equational#525.

Will Sawin (Oct 08 2024 at 10:54):

Is it possible to prove (ideally automatically) that most of the remaining equations are not confluent?

Will Sawin (Oct 08 2024 at 10:54):

Maybe for some we already know they cannot be confluent because this would imply already-proven implications are false?

Joachim Breitner (Oct 08 2024 at 11:29):

That sounds feasible, if the known implications happen to be useful enough.

Joachim Breitner (Oct 08 2024 at 11:31):

Alhough, looking at a random like like https://teorth.github.io/equational_theories/implications/?1076 it seems that it doesn't imply anything yet, so this doesn’t help us.

Amir Livne Bar-on (Oct 08 2024 at 11:39):

My naive approach wasn't general enough to find all failing unifications. I think I got correct code now, based on e-graphs. This raises the number of confluent equations from 53 to 93, and these account for 21 unknown equations. I don't fully trust my code yet, I'll go over it tomorrow, maybe with an existing e-graph library.

lyphyser (Oct 08 2024 at 11:41):

@Daniel Weber if I add --show_everything on to vampire_proofs.py I see that it produces a partial "proof" for the non-implication conjectures, that after deleting dead code just becomes a sequence of "superpose" tactics producing equations.
So maybe it would work to formalize that superposition calculus is complete, express that sequence as a formal sequence of superpositions instead of tactics, and showing by decide that no further superposition calculus moves are possible in that state, and thus there is no superposition calculus proof and thus by completeness there is a counterexample?

Daniel Weber (Oct 08 2024 at 11:44):

That sounds possible, but it's hard to see that working without native_decide. I was hoping that once you have the set of equations there might be a simpler argument, maybe using confluence or something like that

Joachim Breitner (Oct 08 2024 at 11:54):

Amir Livne Bar-on said:

My naive approach wasn't general enough to find all failing unifications. I think I got correct code now, based on e-graphs. This raises the number of confluent equations from 53 to 93, and these account for 21 unknown equations. I don't fully trust my code yet, I'll go over it tomorrow, maybe with an existing e-graph library.

Or just directly extend the lean proof in equational#525 with the newly found ones?

Amir Livne Bar-on (Oct 08 2024 at 12:06):

I'll try that too, but I'm hoping for a bug that caused me to miss some more confluent expressions

David Renshaw (Oct 08 2024 at 12:20):

is 1447 -> 1431 provable by confluence? https://teorth.github.io/equational_theories/implications/?1447

Joachim Breitner (Oct 08 2024 at 13:20):

For those who haven’t memorized the rules and are too lazy to click on the link: 1447 is x = (x ◇ y) ◇ (x ◇ (z ◇ x))

I doubt it. For suitable values of z and x it can apply at z ◇ x, so you have a critical pair. Same for x ◇ y. The confluent rules so far all were very “non-linear” (mentioning the same variable twice, often next to each other, e.g. y ◇ y). And this non-linearity allowed us to rule out that the rule applies anyywhere in its lhs.

Daniel Weber (Oct 08 2024 at 13:25):

Vampire can't find a saturation, FWIW

Joachim Breitner (Oct 08 2024 at 14:36):

(deleted)

Amir Livne Bar-on (Oct 09 2024 at 17:23):

[EDIT: wrong proof ahead]

While working on the e-graph based program to detect confluent expressions, I found a different class of expressions that seem confluent to me. An archetypical example is equation 11:
x = x * (y * y)
This does not have the property of independent reductions the previous rules had. That rule was, the possible loci of reductions in an expression are spread far apart, in the sense that the sets of nodes corresponding to multiplications in the reduction rule are disjoint between the loci.
But for equation 11 we have the expression
(a * a) * ((a * a) ** (a * a))
that can be reduced at the root or at the position marked with **. One way reduces to a * a directly, and the other to (a * a) * (a * a) and them to a * a.
I think this rule is confluent, since every ambiguity in reduction would correspond to reducing a part of the expression to (E * E) * (E * E) or to E * E. If this acts later as the y*y part of the reduction rule then it doesn't matter which is chosen. And if sub-expressions of the (E * E) * (E * E) are reduced, we can do the same reduction in the E * E form.

Amir Livne Bar-on (Oct 09 2024 at 17:46):

Oops, this is not the general case. Equation 11 is not confluent. A=A(BB*BB)=AB

Edward van de Meent (Oct 09 2024 at 17:49):

i think you mean A=A(BB*BB)=A(BB)?

Edward van de Meent (Oct 09 2024 at 17:50):

(so it isn't an example of non-confluence?)

Amir Livne Bar-on (Oct 09 2024 at 17:52):

Mmmm. I managed to get confused, probably since I'm considering equation 13 at the same time. Yes, this is an example of unique reduction. And I think it's possible to prove that the general case works, but the proof will be a bit different from above.

Amir Livne Bar-on (Oct 10 2024 at 08:13):

Proof that Eq11 [x = x(yy)] is confluent:
We show by induction on expression size that there is a unique simplification.
The only possible ambiguity in applying reductions is if some sub-expression looks like A(BB*BB). We want to show that the two possible reductions here give the same final result simp(A). By induction, the sub-expression (BB)(BB) has a unique simplification. To see what it is, we first fully simplify BB to get C. Then CC can be further simplified only if C=DD for some D, and simp(CC)=simp(DD). We can assume that D is fully simplified, and by infinite descent the fully-simplified form is some EE, and then the whole expression can be reduced to A.

Amir Livne Bar-on (Oct 10 2024 at 08:17):

We can even have confluence for laws with (x * y) appearing in the bottom of the expression tree.

Proof that Eq26 [x=(xy)y] is confluent:
Again we prove unique simplification by induction on expressions. Ambiguity means a sub-expression of the form ((AB)B)B. The two possible simplifications both result in A*B, which is smaller and therefore has a unique simplification.

Amir Livne Bar-on (Oct 10 2024 at 08:26):

Equation 1447 above though [x = (xy)(x(zx))] isn't confluent, here is a minimal counter-example:

((v ◇ u) ◇ (v ◇ (s ◇ v))) ◇ ((v ◇ u) ◇ (r ◇ (v ◇ u)))  -->  v ◇ u
((v ◇ u) ◇ (v ◇ (s ◇ v))) ◇ ((v ◇ u) ◇ (r ◇ (v ◇ u)))  -->  v ◇ ((v ◇ u) ◇ (r ◇ (v ◇ u)))

lyphyser (Oct 10 2024 at 10:35):

Note that it is sometimes possible to turn a non-confluent equation into a confluent rewrite system by applying what is known as Knuth-Bendix completion or an equivalent algorithm. For example I manually proved 3588 !-> and 115 ->! like that, although instead of proving confluence directly I simple prove that the hypothesis equation holds by showing that inner rule rewrites do nothing. I found a paper that defines everything - "Critical pair criteria for completion" by Bachmair, Nachum Dershowitz [https://www.sciencedirect.com/science/article/pii/S074771718880018X - has full pdf]

Amir Livne Bar-on (Oct 10 2024 at 11:27):

Thanks! This will definitely be my next step.

To report some negative results:
My search found 58 equations of the form x = E that are provably confluent, and found counter-examples for all other equations of this form except for 32 equations. None of these 90 equations however have unknown implications.

So what makes sense to try next is to run Knuth-Bendix on the remaining equations that have unknown implications, with a variety of orderings, and see if we can get a confluent system.

Daniel Weber (Oct 10 2024 at 11:33):

You can also try to utilize the completions from Vampire

Amir Livne Bar-on (Oct 10 2024 at 12:10):

Mmmm, speaking of Vampire, there are 16 non-implications from 3 confluent equations I found yesterday/today, in the Vampire conjectures file. I'll look into proving confluence in Lean.

lyphyser (Oct 10 2024 at 12:13):

I think all the Vampire conjectures might be from equations that are turned into confluent rule systems by it

lyphyser (Oct 10 2024 at 12:15):

there are however some, like 481 -> ... that are conjectured but for me running Vampire with a 1 minute timeout on it using vampire_proofs.py (modified to print to stdout) gives a time limit exceeded instead of saturated, so I wonder why they are conjectured (perhaps it was run with a longer timeout or faster CPU, or with different options?)

Daniel Weber (Oct 10 2024 at 12:18):

It uses vampire_proofs_cyc.py. 481 -> 1496, for instance, outputs:

cnf(u21,negated_conjecture,
    sK0 != sF6).

cnf(u26,axiom,
    mul(sK1,sK0) = sF3).

cnf(u31,axiom,
    mul(sK1,sF4) = sF5).

cnf(u36,axiom,
    mul(sK2,sK2) = sF4).

cnf(u41,axiom,
    mul(sF3,sF5) = sF6).

cnf(u79,axiom,
    sK1 = mul(sF4,sF5)).

cnf(u87,axiom,
    sF3 = mul(sK1,sF6)).

cnf(u43,axiom,
    mul(X0,mul(X1,mul(X0,sF4))) = X1).

cnf(u17,axiom,
    mul(X0,mul(X1,X2)) = sF8(X0,X1,X2)).

cnf(u55,axiom,
    mul(X0,X0) = sF4).

cnf(u72,axiom,
    mul(sK1,mul(X0,sF5)) = X0).

cnf(u65,axiom,
    mul(sF4,mul(X2,sF4)) = X2).

Daniel Weber (Oct 10 2024 at 12:19):

This looks more like an explicit description of a model then a confluence argument, though

lyphyser (Oct 10 2024 at 12:42):

so it seems that the vampire output and behavior greatly depends on the command line options, including unexpected ones like --cores and --proof_extra

Daniel Weber (Oct 10 2024 at 12:43):

I don't think so, the relevant part there is --mode casc

lyphyser (Oct 10 2024 at 12:44):

for me, it produces your output if I run it with --mode casc; however, if I run it without --mode casc like vampire_proofs.py does it instead triggers a time limit in saturation phase

lyphyser (Oct 10 2024 at 12:47):

on 3588 -> 3862 with --mode casc --cores 0, it gives this, which is the confluent rule system that Knuth-Bendix completion:
cnf(u7,axiom,
mul(X0,X1) = mul(X2,mul(mul(X0,X1),X2))).

cnf(u9,axiom,
mul(X0,X1) = mul(mul(mul(X2,X3),mul(X0,X1)),mul(X2,X3))).

lyphyser (Oct 10 2024 at 12:48):

(also prints the negated conjecture, but nothing else)

lyphyser (Oct 10 2024 at 13:02):

as for 481 -> 1496, I think it might be presenting this rule system, which I think is probably confluent:
cnf(u55,axiom,
mul(X0,X0) = sF4).

cnf(u43,axiom,
mul(X0,mul(X1,mul(X0,sF4))) = X1).

cnf(u65,axiom,
mul(sF4,mul(X2,sF4)) = X2).

here x * x = y * y for any x, y, so it introduces sF4 = sK2 * sK2 to represent that value, since to have a normal form one has to either add an additional symbol or pick one value to represent x * x

lyphyser (Oct 10 2024 at 13:05):

basically it seems a rule system might be extractable from the Vampire output by defining a set of "used" symbols that initially starts with {X0, X1, X2, X3, X4, X5} picking all equations where the LHS only contains them, then adding the symbols in the rhs to the "used" symbol set and recursing. Not really sure whether this works in general though.

lyphyser (Oct 10 2024 at 13:07):

in this case this algorithm also adds cnf(u17,axiom, mul(X0,mul(X1,X2)) = sF8(X0,X1,X2)), but it seems superfluous given that sF8 is not referenced anywhere else

lyphyser (Oct 10 2024 at 13:12):

interestingly Vampire mentions "twee goal transformation" and it turns out that Twee is a separate theorem prover for equational reasoning, so it might be worth to try running Twee directly on the problems

Joachim Breitner (Oct 10 2024 at 13:14):

Amir Livne Bar-on said:

Proof that Eq11 [x = x(yy)] is confluent:

Nice! But no unknown implications here, right? So no point putting this into Confluence.lean?

BTW, the proof can be rephrased as follows:

The rewrite rule x(yy) → x has the critical pair (i.e. an expression where the rule applies in different ways):

x ← x((zz)(zz)) → x(zz)

which can be joined by taking one step x(zz) → x on the right.

That gives local confluence, and because the rule is terminating, that’s confluence.

Bhavik Mehta (Oct 10 2024 at 13:15):

I've been using kbcv to run knuth-bendix, and it works quickly for the cases that it terminates on: http://cl-informatik.uibk.ac.at/software/kbcv/. (As a historical note, Knuth invented Knuth-Bendix in order to resolve questions specifically regarding equation 168 and equation 26302).

lyphyser (Oct 10 2024 at 13:16):

so running Twee on 481 -> 1496 gives this output:

Here is the input problem:
Axiom 1 (flattening): mul3 = mul(y, x).
Axiom 2 (flattening): mul2 = mul(z, z).
Axiom 3 (flattening): mul4 = mul(y, mul(z, z)).
Axiom 4 (lhs): X = mul(Y, mul(X, mul(Y, mul(Z, Z)))).
Axiom 5 (flattening): mul5 = mul(mul(y, x), mul(y, mul(z, z))).
Goal 1 (rhs): x = mul(mul(y, x), mul(y, mul(z, z))).

mul(y, x) -> mul3
mul(z, z) -> mul2
mul(y, mul2) -> mul4
mul(X, mul(Y, mul(X, mul(Z, Z)))) -> Y
mul(mul3, mul4) -> mul5
mul(X, X) ~> mul(?, ?)
mul(?, ?) -> mul2
mul(X, mul(Y, mul(X, mul2))) -> Y
mul(y, mul(X, mul4)) -> X
mul(y, mul5) -> mul3
mul(mul2, mul(X, mul2)) -> X
mul(mul2, mul4) -> y

Ran out of critical pairs. This means the conjecture is not true.
Here is the final rewrite system:
mul(X, X) ~> mul(?, ?)
mul(?, ?) -> mul2
mul(y, mul5) -> mul3
mul(y, x) -> mul3
mul(y, mul2) -> mul4
mul(mul3, mul4) -> mul5
mul(mul2, mul4) -> y
mul(y, mul(X, mul4)) -> X
mul(mul2, mul(X, mul2)) -> X
mul(X, mul(Y, mul(X, mul2))) -> Y

RESULT: CounterSatisfiable (the conjecture is false).

lyphyser (Oct 10 2024 at 13:20):

it looks like the lowercase x/y are actually constants, not variables

running on the output the same algorithm I described before gives this:
mul(X, X) ~> mul(?, ?)
mul(?, ?) -> mul2
mul(mul2, mul(X, mul2)) -> X
mul(X, mul(Y, mul(X, mul2))) -> Y

which is the same as the rule system described above. I don't quite understand why it introduces the extra symbols since they seem to be impossible to obtain from an expression not containing them

lyphyser (Oct 10 2024 at 13:23):

seems like it would be worth trying Twee on all unknowns, will do so once I manage to setup a suitable environment and compile it

lyphyser (Oct 10 2024 at 13:25):

(I have been running the precompiled binary, but it's from 2022 while there seem to be recent commits)

lyphyser (Oct 10 2024 at 13:25):

kbcv can't handle this case since it says it can't orient f(X, X) = f(Y, Y)

Bhavik Mehta (Oct 10 2024 at 13:27):

lyphyser said:

kbcv can't handle this case since it says it can't orient f(X, X) = f(Y, Y)

Indeed, but adding 1496 as well produces proofs of x = y, so any nontrivial model of 481 should give a counterexample to 481 => 1496

lyphyser (Oct 10 2024 at 13:32):

so apparently Vampire doesn't use twee, but instead just implements a preprocessing step inspired by Twee that it calls "twee goal transformation" - https://github.com/vprover/vampire/blob/master/Shell/TweeGoalTransformation.cpp

Amir Livne Bar-on (Oct 10 2024 at 14:50):

[Quoting…]

No new unknowns, but they prove some conjectures (unless I got confused with the files, and old confluent rules prove these conjectures)

Joachim Breitner (Oct 10 2024 at 15:35):

Hmm, then I misread https://teorth.github.io/equational_theories/implications/?11, I don't get unknowns even with “Treat conjectures as unknown” ticked.

Amir Livne Bar-on (Oct 10 2024 at 17:48):

Equation 11 being confluent doesn't add anything, but eqs 1086, 2497, 2541 together prove 16 conjectures
https://github.com/teorth/equational_theories/pull/497

Amir Livne Bar-on (Oct 10 2024 at 17:54):

@lyphyser does Twee solve any unknown implications?

lyphyser (Oct 10 2024 at 18:25):

so far I haven't found any, and it seems that it solves the same set that is currently conjectured by vampire although I haven't finishing checking that

Joachim Breitner (Oct 10 2024 at 19:17):

glad to see that my code was cargocultable :-)

lyphyser (Oct 10 2024 at 23:52):

I confirm that Twee solves precisely the same set of implications that have Vampire conjectures

lyphyser (Oct 10 2024 at 23:52):

(with a 10 seconds timeout)

Fan Zheng (Oct 13 2024 at 11:37):

Seems there is some existing theory behind this confluence argument. Are we planning to put it in the paper?

Terence Tao (Oct 13 2024 at 14:31):

It would be good to have someone write a blueprint chapter dedicated to confluence and how one can automatically search for it. Much of the theory can presumably be outsourced to textbooks and papers, but having the references would be good. That can then form the base of a subsequent section of the final paper. (A bare-bones definition of confluence is already at https://teorth.github.io/equational_theories/blueprint/abstract-nonsense-chapter.html#a0000000037 , so one can build upon that, or start a fresh chapter.)

I opened equational#555 for this task.

Fan Zheng (Oct 13 2024 at 19:42):

Well, then I guess it takes a while to align my terminology with the existing literature. Also is the abstract theory going to be formalized in Lean, or are we going to formalize individual non-implications on a ad hoc basis?

Terence Tao (Oct 13 2024 at 19:50):

That's a good question. I think the other contributors to this thread will need to weigh in on how their formalizations are going, and whether it is necessary to lay out a systematic theory (in blueprint and/or in Lean) first.

Fan Zheng (Oct 13 2024 at 20:39):

Just brushed up the basics of confluence, so here's a (maybe stupid) question: is the rewriting system corresponding to Equation 46 (xy=zw) terminating? It seems not (because a product can be rewritten to itself), but if we add a constant symbol (say 0) and turns that equation to xy=0 then it is terminating. Is this trick handled by any automated provers?

Joachim Breitner (Oct 13 2024 at 20:42):

Is that even a rewriting system? A normal rewrite rule would not mention variables on the right hand side that don't appear on the left hand side.

Fan Zheng (Oct 13 2024 at 21:35):

But Equation 46 does allow terms to have a "normal form" in some sense: if a term has at least 2 variables (not necessarily distinct) then its normal form is 0; if it has only 1 variable, then it is already in normal form.

Jose Brox (Oct 13 2024 at 22:02):

Daniel Weber ha dicho:

I think most of the Vampire counterexample conjectures are essentially this

For the record, Prover9 also implements the Knuth-Bendix algorithm (one has to choose "kbo" at Prover9 options/Basic options/order). In fact, I run KB in Prover9 for days with high resources on the Astérix/Obélix equations (and also Dupond/Dupont) just in case it found some "monstruous" implication between them, and also on Equation 5105 (the conjectured Austin law) implying the singleton law (with no success).

Fan Zheng (Oct 13 2024 at 22:40):

Is it possible to have a glimpse oof the rewriting rule generated? Some equation may generated an infinite set of rewrite rules with a certain pattern that can be made out by the hunan eye.

Cody Roux (Oct 15 2024 at 17:32):

So I am an "expert" with strong scare quotes, maybe I'll give a try to write such a section. Generally confluence is usually most easily proven on strongly normalizing (or terminatin) systems rather than weakly normalizing systems, though there are techniques which can account for both.

Cody Roux (Oct 15 2024 at 17:33):

The moratorium on fresh variables on the rhs is common, but it's reasonable to forgo it, and confluence makes sense in such a setting.

Amir Livne Bar-on (Oct 15 2024 at 18:31):

Is there a completeness result for Knuth-Bendix completion? That is, does there always exist an order on terms that makes the algorithm converge?

Michael Bucko (Oct 15 2024 at 19:59):

Amir Livne Bar-on schrieb:

Is there a completeness result for Knuth-Bendix completion? That is, does there always exist an order on terms that makes the algorithm converge?

I don't think so. It's not always possible to find a term order that makes the algorithm converge for every possible equation set.

Cody Roux (Oct 15 2024 at 20:44):

No there does not

Cody Roux (Oct 15 2024 at 20:46):

the subject of "unfailing completion" is quite well-developed though, where one orients some of the equations, then keeps the others around and uses an order which is "total on ground terms" to compute further critical pairs

Cody Roux (Oct 15 2024 at 20:46):

Still, of course, the algorithm may fail to converge

Jose Brox (Oct 15 2024 at 23:32):

Amir Livne Bar-on ha dicho:

Is there a completeness result for Knuth-Bendix completion? That is, does there always exist an order on terms that makes the algorithm converge?

For a simple example, consider KB on xy=yx. Then xy reduces to yx and yx reduces to xy, giving a loop. Hence the orientations/orderings generalizations of KB.

Cody Roux (Oct 15 2024 at 23:41):

A more reasonable question is: if a completion exists (a _finite_ rewrite system that is terminating and confluent which contains the equational theory) is there an order which guarantees the algorithm will find it?

I think the answer is yes? Just taking the order such that t > u iff t -> u in the hypothetical system. I've just learned that the completed system is somewhat unique if certain additional conditions are required, which seems crazy.

But it is possible that no such finite system exists, though KB finds an infinite system "in the limit".

Cody Roux (Oct 15 2024 at 23:46):

oh and here's a simple example (I love rewriting, all the counter-examples are tiny)

f(g(f(x))) = g(f(x))

Last updated: May 02 2025 at 03:31 UTC