Zulip Chat Archive

Stream: Equational

Topic: 1076 !=> 3

Pace Nielsen (Oct 11 2024 at 04:09):

I've written up a proof that 1076 does not imply 3, using Terry's combinatorial idea. I've now triple-checked everything, so hopefully I didn't miss any silly mistakes.

[Edited to fix a few minor case analysis errors.]
NonImplication-updated.pdf
NonImplication-updated.tex

Pace Nielsen (Oct 11 2024 at 04:10):

I don't know how to submit a request for someone to "Leanify" the proof. I'd be grateful if someone submitted the request (or explained to me how to do it).

Pace Nielsen (Oct 11 2024 at 04:28):

By the way, the whole use of $\aleph_1$ here is completely unnecessary, since each stage only needed us to adjoin finitely many more pairs. But I wanted to bring it up, since it might be useful if one needed to adjoin countably many pairs at a given stage under a different functional equation.

Zoltan A. Kocsis (Z.A.K.) (Oct 11 2024 at 04:38):

@Pace Nielsen : Nice argument!

If you have a Github account, you can create an issue - this would be a feature request.

For inspiration you can read a previous issue @Terence Tao wrote requesting the formalization of his Asterix proof here (which has now been completed).

If you don't have a Github account and you'd rather not create one, somebody else can create the issue for you. Just ask for it to be done here.

Terence Tao (Oct 11 2024 at 06:09):

I made equational#506 for this. (There are some technical bookkeeping steps that also have to be made here, because this formalization involves a novel equation and probably also a novel Lean file, so I listed them explicitly.)

Terence Tao (Oct 11 2024 at 06:20):

I guess this technique is promising in any potential anti-implication where the hypothesis involves just two variables. Maybe equation 73 is another candidate?

Martin Dvořák (Oct 11 2024 at 09:03):

What does [ ] mean?

Pace Nielsen (Oct 11 2024 at 09:15):

Martin Dvořák said:

What does [ ] mean?

It is just another form of parentheses, to help keep track of them (when there are a lot of iterated parentheses).

Martin Dvořák (Oct 11 2024 at 09:16):

Ah ok. I generally struggle with reading math written in LaTeX, as it has a lot of parentheses in weird places for me.

Pace Nielsen (Oct 11 2024 at 12:53):

Terence Tao said:

I guess this technique is promising in any potential anti-implication where the hypothesis involves just two variables. Maybe equation 73 is another candidate?

You were quite right. These methods completely handled 73. I've attached the details (but only double-checked them rather than triple-checked this time). It was a bit easier here, since there were only three operations instead of four.
Equation_73-updated.pdf
Equation_73-updated.tex

Kevin Buzzard (Oct 11 2024 at 12:54):

This is exactly the sort of low-level argument where the maths community should be asking the AI community to see if it can autoformalise the entire thing to save humans the bother.

Terence Tao (Oct 11 2024 at 14:09):

What i have found reasonably effective is to formalize one claim, eg 73 does not imply 99, putting the latex proof above it in comments. Then put the latex proof of a similar claim, eg 73 does not imply 203, as a comment in the lean file, preferably with similar wording to the previous comment. Then Github copilot can often magically autocomplete the statement and proof with reasonable reliability.

Terence Tao (Oct 11 2024 at 14:26):

Added equational#516 to formalize the 73 cluster as well.

Pace Nielsen (Oct 11 2024 at 15:01):

Kevin Buzzard said:

This is exactly the sort of low-level argument where the maths community should be asking the AI community to see if it can autoformalise the entire thing to save humans the bother.

I agree. And yet, in some important ways it was the opposite of a bother, because working out these few examples was what helped me feel more comfortable with (and believe I finally understood) Terry's linearization technique (as well as the added difficulty going from 65 to 359, which needed even more ideas).

However, if I needed to do many more by hand, it could indeed become tedium.

Terence Tao (Oct 11 2024 at 15:11):

If all the recent progress on Asterix-type arguments are formalized, this may actually be enough training data for a cutting edge AI to apply it to a list of promising outstanding implications. I can mention this to my contacts (I know that at least some of the large private AI research groups are following this project to some degree).

Terence Tao (Oct 11 2024 at 15:30):

I guess what this means is that deploying humans to test this method on the other outstanding implications should be on a strictly volunteer basis for anyone who wants to practice the method. Otherwise we can wait to see if more automated methods come online (for instance, the confluence group is making progress towards an automated sweep of the remaining implications, which may help focus where the human-guided efforts should concentrate on).

EDITED TO ADD: one possible option is to have someone seek out promising other implications to refute this way and just mark them as conjectures for now in some file with sketch notes on how to proceed.

Pace Nielsen (Oct 11 2024 at 16:07):

Terence Tao said:

I guess what this means is that deploying humans to test this method on the other outstanding implications should be on a strictly volunteer basis for anyone who wants to practice the method. Otherwise we can wait to see if more automated methods come online (for instance, the confluence group is making progress towards an automated sweep of the remaining implications, which may help focus where the human-guided efforts should concentrate on).

EDITED TO ADD: one possible option is to have someone seek out promising other implications to refute this way and just mark them as conjectures for now in some file with sketch notes on how to proceed.

I have a couple students who have been following this project with me, and so if someone has any outstanding implications where they'd like humans to be "deployed", just post them here and I'll see if the students are interested. [I've found this project to be quite fun, and so I might give my hand at a couple more too!]

Interestingly, there was one aspect of my write-ups that may take some work to be automated. In the Equation_73.pdf file there are 4 conditions defining membership in script-E. The first two, and the last one can easily be automated. But the third one took some thought when trying to make Lemma 2.1 work. Something similar happened when defining script-F at the end of the file when trying to prove Lemma 7.1.

Terence Tao (Oct 12 2024 at 17:23):

Here I think is a way to construct solutions to 73, x = y ◇ (y ◇ (x ◇ y)), via greedy filling of the multiplication table (so not using the translation-invariant ansatz), as discussed over at https://leanprover.zulipchat.com/#narrow/stream/458659-Equational/topic/Equation.2065.20-.3E.20left.20cancellability .

Define a partial solution to 73 to be a partial magma operation on ℕ, defined on only finitely many pairs, that obeys the following two rules:
(1) If y ◇ (x ◇ y) is defined and equal to z, then y ◇ z is defined and equal to x.
(2) The operation is right-cancellative whenever defined; thus if x ◇ y and x' ◇ y are defined and equal to each other, then x = x'. (This is a necessary consequence of 73 holding globally.)

The claim is that if we have a partial solution to 73, and we have an undefined operation a ◇ b, we can extend the partial solution to one where a ◇ b is now defined, by the following procedure.

(i) Set a ◇ b equal to a natural number c which is novel: not equal to a, b, or anything already appearing in the partial solution.
(ii) If there is an x such that x ◇ a = b (such an x is unique by rule (2)), also set a ◇ c equal to x. (This is needed to maintain rule (1).)

One can then check (by a somewhat tedious, but in principle automatable, case analysis) that this creates a new partial solution (no violations of (1) or (2) are generated by the above procedure). Iterating this, we can obtain a global solution to 73.

The semi-automated procedure to discover this algorithm is to start just with (1) (which is the naive rule coming from 73 itself), try to perform the greedy algorithm, and observe that one gets stuck unless one also has rule (2). Then one adds rule (2) to the ruleset, and runs the greedy algorithm again, and thankfully now it closes with no new rules being required.

Terence Tao (Oct 12 2024 at 18:52):

For 1076 x = y ◇ ((x ◇ (x ◇ y)) ◇ y), I think the situation is as follows. First observe that 1076 implies the following consequences:

(1) For any y, the map x ↦ x ◇ (x ◇ y) is injective. (Immediate from 1076.)
(2) If x ◇ (x ◇ y) = y, then y ◇ (y ◇ y) = x (by 1076), and hence y ◇ (x ◇ y) = y (by 1076 again).
(3) If x ◇ (x ◇ y) = y and z ◇ (z ◇ y) = x, then y = z (follows from (1) and (2)).

So now one should complete the multiplication table keeping the following rules in mind:
(0') If (x ◇ (x ◇ y)) ◇ y is defined and equal to z, then y ◇ z is defined and equal to x.
(1') If x ◇ (x ◇ y) and x' ◇ (x' ◇ y) are defined and equal to each other, then x = x'.
(2') If x ◇ (x ◇ y) = y, then y ◇ (x ◇ y) is defined and equal to y.
(3') If x ◇ (x ◇ y) = y and z ◇ (z ◇ y) = x, then y = z.

One can then extend any finite partial multiplication table on the natural numbers to define a ◇ b by the following procedure:
(i) Set a ◇ b equal to a novel c.
(ii) If x ◇ (x ◇ b) = a for some x (which is unique by (1')), also set b ◇ c equal to x. (This is forced in order to maintain (0').)

It is a tedious case analysis to check that this extension preserves the properties (0')-(3').

[Workflow: I first naively tried to work greedily with (0') and got stuck unless I assumed (1'). Then I tried to work greedily with (0')+(1') and got stuck unless I assumed (2'). Then I tried to work greedily with (0')+(1')+(2') and got stuck unless I assumed (3'). Fortunately when I worked greedily with (0')+(1')+(2')+(3'), no further obstructions appeared.]

CAVEAT: I have not double-checked the "tedious case analysis". I guess this is best done within Lean...

Terence Tao (Oct 12 2024 at 19:26):

I guess one could use an existing ATP to verify the greedy completion claim, in advance of actually creating a Lean formalization. Here is one framework, in which one abstracts the partial magma operation x ◇ y = z into a ternary relation R(x,y,z). The language is then that of a single sort with the ternary relation R and three constants a,b,c, obeying the following general base axioms:

c != a and c != b. (This is part of c being novel.)
If R(x,y,z) and R(x,y,z'), then z=z'. (i.e., the relation is given by a partial magma operation.)
If R(x,y,z), then either x != a or y != b. (i.e., a ◇ b is undefined.)
If R(x,y,z), then x != c, y != c, and z != c. (This is the other part of c being novel.)

Then, for each law one wants to study, there will be additional axioms placed on top of this. For instance, in the case of the 1076 analysis, one adds the following axioms:

(0') If R(x,y,w), R(x,w,u), and R(u,y,z), then R(y,z,x).
(1') If R(x,y,z), R(x,z,w), R(x',y,z'), R(x',z',w), then x = x'.
(2') If R(x,y,w) and R(x,w,y), then R(y,w,y).
(3') If R(x,y,w), R(x,w,y), R(z,y,u), and R(z,u,x), then y=z.

Then one defines a new ternary predicate R' extending R by some rule. In the case of 1076, the rules are

(a) If R(x,y,z), then R'(x,y,z).
(b) One has R'(a,b,c).
(c) If R(x,b,y) and R(x,y,a), then R'(b,c,x).

We set R'(x,y,z) to be false if none of (a), (b), (c) apply.

One could then use an ATP to check that if R obeys 1-4 and (0')-(3'), then R' also obeys 2 and (0')-(3').

Daniel Weber (Oct 12 2024 at 19:29):

R' is well-defined just means that it satisfies 1-4, right? Also, should a and b necessarily be distinct?

Terence Tao (Oct 12 2024 at 19:31):

Good points. I've corrected my previous text.

Daniel Weber (Oct 12 2024 at 19:31):

I'm going to sleep now, I'll try this tomorrow

Joachim Breitner (Oct 12 2024 at 19:37):

Do you expect this to eventually lead to a finite magma? Or that this process can be proven to continue without contradiction, thus filling the multiplication table for all of Nat?

Joachim Breitner (Oct 12 2024 at 19:41):

And, looking at 73, for example, why is there a reason to believe that this process will not eventually equate all numbers, this constructing the trivial magma?

Terence Tao (Oct 12 2024 at 19:54):

This will always generate magma on Nat. To create nontrivial magmas one needs to “seed” the original partial magma with values that already contradict some target law, so that the completion of that partial magma as constructed here will also contradict that law.

Terence Tao (Oct 12 2024 at 19:55):

at no point do we identify two natural numbers together (if we are forced to, the technique has failed)

Joachim Breitner (Oct 12 2024 at 19:55):

Ah, I see the other thread has more detail, sorry

Pace Nielsen (Oct 12 2024 at 20:00):

@Terence Tao In the greedy algorithm described above for 73, I think there is a minor problem when we start with the single product 0◇1=1 (which satisfies the two rules), and then try to greedily adjoin a new product for 1◇1. If we follow the given algorithm we will set 1◇1=2 and then set 1◇2=0 and then stop. However, rule (1) is then broken, since 1◇(1◇1) is defined and equals 0, but 1◇0 is undefined (when it should equal to 1).

This is probably just a minor side-case (which appears only when a=b) that could be handled by adding an extra generating rule in that case. Those automated methods you mentioned would be most welcome for finding such edge cases.

Terence Tao (Oct 12 2024 at 20:28):

Good catch! Pen and paper methods are making a lot of mistakes, we need either ATP or Lean I think to be absolutely sure here.

To fix 73, one needs to add an additional running property:

(3) If x ◇ y is defined and equal to y, then y ◇ x is also defined and equal to y. (This is necessary because 73 forces y ◇ (y ◇ y) = x, and a second application of 73 then forces y ◇ x = y.)

However it seems that (3) propagates under the existing rules (i), (ii), and also allows rules (1), (2) to persist, so now we should be OK (but it would need to be formally verified).

Martin Dvořák (Oct 12 2024 at 20:44):

Terence Tao said:

For 1076 x = y ◇ ((x ◇ (x ◇ y)) ◇ y), I think the situation is as follows. First observe that 1076 implies the following consequences:

(1) For any y, the map x ↦ x ◇ (x ◇ y) is injective. (Immediate from 1076.)
(2) If x ◇ (x ◇ y) = y, then y ◇ (y ◇ y) = x (by 1076), and hence y ◇ (x ◇ y) = y (by 1076 again).
(3) If x ◇ (x ◇ y) = y and z ◇ (z ◇ y) = x, then y = z (follows from (1) and (2)).

I've been trying to follow the reasoning here and I got stuck with (3).

import Mathlib.Tactic

class Magma (α : Type _) where
  op : α → α → α

infix:65 " ◇ " => Magma.op

abbrev Equation1076 (G: Type _) [Magma G] := ∀ x y : G, x = y ◇ ((x ◇ (x ◇ y)) ◇ y)

lemma Equation1076_step1 (G: Type _) [Magma G] (h: Equation1076 G) (y : G) :
    Function.Injective (fun x => x ◇ (x ◇ y)) := by
  intro a b hab
  dsimp only at hab
  exact (hab ▸ h a y).trans (h b y).symm

lemma Equation1076_step2 (G: Type _) [Magma G] (h: Equation1076 G) {x y : G} (hxy : x ◇ (x ◇ y) = y) :
    y ◇ (x ◇ y) = y := by
  have hyyyx : y ◇ (y ◇ y) = x := by
    rw [h x y, hxy]
  rw [←hyyyx, ←h]

lemma Equation1076_step3 (G: Type _) [Magma G] (h: Equation1076 G) {x y z : G}
    (hxy : x ◇ (x ◇ y) = y) (hz : z ◇ (z ◇ y) = x) :
    y = z := by
  apply Equation1076_step1 G h
  simp
  rw [hz]
  have := Equation1076_step2 G h hxy
  sorry

Can somebody please untangle me?

Terence Tao (Oct 12 2024 at 21:01):

I guess one is really using the substep hyyyx of Equation1076_step2, rather than Equation1076_step2 directly.

lemma Equation1076_step3 (G: Type _) [Magma G] (h: Equation1076 G) {x y z : G}
    (hxy : x ◇ (x ◇ y) = y) (hz : z ◇ (z ◇ y) = x) :
    y = z := by
  apply Equation1076_step1 G h y
  simp
  rw [hz, h x y, hxy]

Daniel Weber (Oct 13 2024 at 02:29):

Terence Tao said:

I guess one could use an existing ATP to verify the greedy completion claim, in advance of actually creating a Lean formalization. Here is one framework, in which one abstracts the partial magma operation x ◇ y = z into a ternary relation R(x,y,z). The language is then that of a single sort with the ternary relation R and three constants a,b,c, obeying the following general base axioms:

c != a and c != b. (This is part of c being novel.)

If R(x,y,z) and R(x,y,z'), then z=z'. (i.e., the relation is given by a partial magma operation.)

If R(x,y,z), then either x != a or y != b. (i.e., a ◇ b is undefined.)

If R(x,y,z), then x != c, y != c, and z != c. (This is the other part of c being novel.)

Then, for each law one wants to study, there will be additional axioms placed on top of this. For instance, in the case of the 1076 analysis, one adds the following axioms:

(0') If R(x,y,w), R(x,w,u), and R(u,y,z), then R(y,z,x).
(1') If R(x,y,z), R(x,z,w), R(x',y,z'), R(x',z',w), then x = x'.
(2') If R(x,y,w) and R(x,w,y), then R(y,w,y).
(3') If R(x,y,w), R(x,w,y), R(z,y,u), and R(z,u,x), then y=z.

Then one defines a new ternary predicate R' extending R by some rule. In the case of 1076, the rules are

(a) If R(x,y,z), then R'(x,y,z).
(b) One has R'(a,b,c).
(c) If R(x,b,y) and R(x,y,a), then R'(b,c,x).

We set R'(x,y,z) to be false if none of (a), (b), (c) apply.

One could then use an ATP to check that if R obeys 1-4 and (0')-(3'), then R' also obeys 2 and (0')-(3').

It managed to prove this. I imagine the process using this will be to search for a contradiction each time, then study it and figure out what rule you need to add? Is there a good equation to try that on?

Terence Tao (Oct 13 2024 at 02:37):

Yes, that was my proposed workflow, though I don't know to what extent the "figure out what rule you need to add" step can be automated from the ATP output. Roughly speaking one wants to find new conditions on R (not involving a, b, c) that are forced by requiring R' to obey all the relevant axioms, but are not already in the list of axioms one is prescribing for R.

Oberlix (1491) seems like a good place to start, since we were halfway through doing that anyway (even though @Pace Nielsen has a slightly different approach that settles the anti-implication to Asterix), so may be a good warmup. (Or one could revisit Asterix (65) or 73 just to calibrate the method.)

Just going by sheer yield of implications, 511 could be a good target - it involves only two variables, so the ability to proliferate side conditions seems somewhat under control, and I don't think it has been addressed by the confluent team. Understanding it could resolve 12 anti-implications and their duals, and there is also another equation 2338 that implies it that also looks like a reasonable target. EDIT: 1692 and 1648 are further high-yield candidates.

Daniel Weber (Oct 13 2024 at 04:03):

It seemed to work for 511, but it doesn't require injectivity in the initial conditions, so I must have a bug somewhere

Terence Tao (Oct 13 2024 at 04:05):

That is indeed weird. 511 implies injectivity of right multiplication: x ◇ y = x' ◇ y implies x = x'. This would have to be carried in the set of rules one is greedily trying to maintain.

Daniel Weber (Oct 13 2024 at 04:12):

Perhaps I encoded it incorrectly? The axiom is $R(x, y, m_1) \wedge R(y, m_1, m_2) \wedge R(y, m_2, m_3) \to R(y, m_3, x)$

Daniel Weber (Oct 13 2024 at 04:20):

Apparently it's drawing information from the definition of $R'$ , but that shouldn't be possible :thinking:

Terence Tao (Oct 13 2024 at 04:21):

That looks right. Though... I guess it would take a few iterations of this procedure to reach injectivity, because of the three left-multiplications. One should first encounter the injectivity of x ↦ y ◇ (y ◇ (x ◇ y)), then the injectivity of x ↦ y ◇ (x ◇ y), then finally the injectivity of x ↦ x ◇ y. [Of course, since we can see the injectivity of x ↦ x ◇ y clearly, we can skip the intermediate steps and just add the latter injectivity axiom to the greedy algorithm scheme and see what happens.]

Daniel Weber (Oct 13 2024 at 04:23):

I found the problem, there were missing parentheses around the quantifiers in the definition of $R'$

Terence Tao (Oct 13 2024 at 04:26):

I guess even an ATP-centric workflow is not immune to human error in setting problem specifications :laughing:

Daniel Weber (Oct 13 2024 at 04:27):

If we have $a \diamond a = b, b \diamond a = a$ and we set $a \diamond b = c$ then $b = a \diamond (a \diamond (a \diamond (b \diamond a))) = a \diamond c$ , and then $a = a \diamond (a \diamond (a \diamond (a \diamond a))) = a \diamond (a \diamond c) = c$ , which is a contradiction.
I think $(a \diamond a) \diamond a = a \to a \diamond (a \diamond a) = a$ works to exclude this?

Terence Tao (Oct 13 2024 at 04:30):

That looks good, and also necessary, since your argument shows that this implication is a consequence of 511

Daniel Weber (Oct 13 2024 at 04:38):

Another contradiction: if $a = b, a \diamond x = a, x \diamond a = x$ then $x = a \diamond (a \diamond (a \diamond (x \diamond a))) = a \diamond c$ , but then $a = a \diamond (a \diamond (a \diamond (a \diamond a))) = c$ , contradiction.
Is there a more general rule for this than $a \diamond x = a \wedge x \diamond a = x \to x = a$ ?

Terence Tao (Oct 13 2024 at 04:43):

I think you may want $a \diamond x = a \wedge x \diamond a = x \implies a \diamond a = a$ instead. EDIT: No, wait, your version is stronger.

Terence Tao (Oct 13 2024 at 04:48):

These look like pretty safe rules to add, by the way: inserting a new entry $a \diamond b = c$ into the multplication table with c novel looks very unlikely to trigger any violation of these side conditions since c only shows up once in the new table

Daniel Weber (Oct 13 2024 at 05:01):

We also need injectivity of $y \diamond (x \diamond y)$

Terence Tao (Oct 13 2024 at 05:17):

Are you discovering these by hand, or through the ATP, or some combination of both?

Daniel Weber (Oct 13 2024 at 05:17):

I'm running the ATP, seeing the contradiction, and then figuring out by hand what rule is needed to prevent it

Terence Tao (Oct 13 2024 at 05:20):

OK. Hopefully the process converges. It doesn't look like there is an explosion of conditions so far, the ones you have look relatively mild.

Terence Tao (Oct 13 2024 at 05:31):

You should probably also add injectivity of $x \mapsto y \diamond (y \diamond (x \diamond y))$ too.

Daniel Weber (Oct 13 2024 at 05:32):

If that's defined then we already have y⋄(y⋄(y⋄(x⋄y))) = x, so I don't think it's needed

Terence Tao (Oct 13 2024 at 05:32):

Ah right

Terence Tao (Oct 13 2024 at 05:35):

By the way I just added a file Conjectures.lean to hold the implications @Pace Nielsen worked out earlier. If you resolve some implications from 511 I can add them to that file also.

Daniel Weber (Oct 13 2024 at 07:12):

I added 45 rules which rule out all 4x4 counterexample and now it gives 5x5 ones, so I'm not very optimistic about it converging

Daniel Weber (Oct 13 2024 at 07:12):

I'll try the same process on 65 as a sanity check

Daniel Weber (Oct 13 2024 at 07:23):

Yeah, that converged quite quickly

Daniel Weber (Oct 13 2024 at 08:35):

Only the rules are important, right? Once we have them I think it should be immediate how $R'$ should be defined, if we restrict it to only differ from $R$ in $R'(a, b, c)$ and $R'(c, \_, \_), R'(\_, c, \_)$ (and any values we have to set to satisfy old rules should be possible to simply convert to new rules)

Daniel Weber (Oct 13 2024 at 09:47):

Are all of the relevant rules either of the form $R(a_1, a_2, a_3) \wedge R(a_4, a_5, a_6) \wedge \cdots \to R(a_i, a_j, a_k)$ (where some of $a_1, a_2, \dots$ may be equal) or $R(a_1, a_2, a_3) \wedge R(a_4, a_5, a_6) \wedge \cdots \to a_i = a_j$ , or can there be other relevant rules?

Daniel Weber (Oct 13 2024 at 13:36):

I wrote a program which can do this automatically, but because the models Vampire sometimes use more set values then necessary it adds useless rules, I'll try to see how to prevent that

Terence Tao (Oct 13 2024 at 14:28):

I think that's right, at least for the notion of greedy algorithm I had in mind. It is conceivable that there is some much fancier greedy algorithm that does something more than just fill in one entry at a time (together with whatever other entries are needed to ensure local consistency), but I can't conceive of it now (we'd probably have see a concrete example.)

So it looks like this method is not a panacea for all the outstanding equations (which is to be expected - from general undecidabiity results we know that no technique could possibly handle everything) - but it seems to handle a different set of equations than e.g., the confluence based methods, and seems to be not previously known in the literature, so it's already a good scientific contribution.

Daniel Weber (Oct 13 2024 at 14:32):

Daniel Weber said:

I wrote a program which can do this automatically, but because the models Vampire sometimes use more set values then necessary it adds useless rules, I'll try to see how to prevent that

I didn't mange to figure out how to do this yet. Without this it finds 102 rules for equation 1648, and didn't terminate so far for 511 and 1692. However, many of the rules could be more general, so I expect a better rule-finding approach could do significantly better

Terence Tao (Oct 13 2024 at 14:35):

How does it fare for 73 (which should close rapidly) and 1491 (which looked like it was on its way to closing)?

Daniel Weber (Oct 13 2024 at 14:38):

It didn't manage either of them

Terence Tao (Oct 13 2024 at 14:39):

Hmm. You think this is because of the "adding useless rules" issue?

Daniel Weber (Oct 13 2024 at 14:40):

I think so, yes. I thought of something which might fix that, I'll try it

Daniel Weber (Oct 13 2024 at 14:59):

It was definitely the problem, it now found a set of 8 rules for 73, and 1648 which previously had 102 rules now has 5

Terence Tao (Oct 13 2024 at 15:05):

So the previous code already found a (very complicated, but working) way to perform a greedy algorithm for 1648, and now one has a much shorter one?

Daniel Weber (Oct 13 2024 at 15:06):

Yes

Terence Tao (Oct 13 2024 at 15:06):

That's promising progress.

Terence Tao (Oct 13 2024 at 16:19):

Pace Nielsen said:

I have a couple students who have been following this project with me, and so if someone has any outstanding implications where they'd like humans to be "deployed", just post them here and I'll see if the students are interested. [I've found this project to be quite fun, and so I might give my hand at a couple more too!]

One place where your students could help is to find ways to use your 1076 construction to disprove the other 39 open implications from 1076. This could be added to the existing issue at equational#506 to establish 1076 !=> 3. (Presumably, it would be efficient to just refute the maximal anti-implications in this diagram.)

Terence Tao (Oct 13 2024 at 23:58):

I wrote an initial blueprint chapter on greedy algorithms at https://teorth.github.io/equational_theories/blueprint/greedy-chapter.html

Joe McCann (Oct 14 2024 at 01:11):

Daniel Weber said:

It was definitely the problem, it now found a set of 8 rules for 73, and 1648 which previously had 102 rules now has 5

Ahh rip I was tinkering around with 1648 by hand, think its still a worthwhile endeavor haha?
Fun fact, the coefficients of the linearization ax+by of 1648 are the golden ratio!

Terence Tao (Oct 14 2024 at 01:23):

If you have found some interesting features on 1648, such as the golden ratio observation, you can create a commentary/Equation1648.md file to record them, and they will then show up on the Equation Explorer page for the equation. They may possibly be of later use, even if there is another technique that could in principle resolve the implications here.

Daniel Weber (Oct 14 2024 at 02:45):

The rules produced for 1648 are:

If R(x, y, z) and R(z, y, w) then R(z, w, x)
If R(x, y, y) and R(z, y, y) then x = z
If R(x, x, y) and R(z, x, y) then x = z
If R(x, y, z) and R(w, y, z) then x = w
In particular, the rule minimization is still not working very well (as the second the third rules are special cases of the fourth). What would be a good way to generalize a rule? The best thing I can think of is following the proof and seeing which instances of the same variable don't actually have to be equal, but that'll take some effort to program and there might be a simple proof assuming equality and a more complicated one without it

Daniel Weber (Oct 14 2024 at 02:48):

Another approach might be to try all partitions of instances of each variable in the rule, but I'm not sure if there are too many

Terence Tao (Oct 14 2024 at 02:57):

Well if your program runs automatically on an equation, even if inelegantly, that's good enough to perform an automated run on all outstanding equations and get a shortlist of equations for which we can focus on with human effort, improving upon whatever messy output the program gives.

Terence Tao (Oct 14 2024 at 03:01):

Another place where automation could be helpful here is in designing the partial solutions which obey one law (such as 1648) but not another, e.g., 1426, to pick one at random, though the requirements for a partial solution are so "loose" that one can probably just do them by hand.

Terence Tao (Oct 14 2024 at 03:02):

By the way, I just PR'ed an update to the commentary on some equations (including 1648, 2000, and 1659) to link to the discussion in various threads including this one.

Terence Tao (Oct 14 2024 at 03:05):

If you have n rules and want to see if one is redundant, I guess you could just delete one and see if the ATP can easily recover it from the other n-1 rules.

Daniel Weber (Oct 14 2024 at 03:06):

The problem is that the generation is slow, because it generates all of those specific laws before the general one

Terence Tao (Oct 14 2024 at 03:07):

To be fair, a human implementation of this algorithm would likely also proceed like this. Often one discovers the special cases of what one needs before the general one

Terence Tao (Oct 14 2024 at 03:13):

This, by the way, could be a useful feature of a next-generation AI: to take an overly specialized proof that proves too narrow a conclusion from a given set of hypotheses, and generalize it to its "natural" strength, proving the best conclusion of the same rough form as one's original proposal, using basically the same proof. But I don't think current generation AI are quite there yet.

Terence Tao (Oct 14 2024 at 03:22):

In any case, it seems that we now have a path to resolve 1648, by greedily extending a ternary relation R just using the two rules

If R(x, y, z) and R(z, y, w) then R(z, w, x)
If R(x, y, z) and R(w, y, z) then x = w

(i.e., maintain right cancellativity as well as 1648 as one extends). This should hopefully resolve all 15 implications from this equation, plus their duals. I've opened equational#566 for this if someone wants to claim at least the mathematical portion of this task, if not the Lean portion.

Daniel Weber (Oct 14 2024 at 03:23):

From my experience with 65 the Lean part is just a lot of annoying case-checking, I'm hoping I could automatically formalize Vampire's proofs

Terence Tao (Oct 14 2024 at 03:25):

OK then feel free to claim the task yourself if you're already on track to accomplish it.

Terence Tao (Oct 14 2024 at 03:31):

The proof that the above two rules extend when adjoining R(a,b,c) for a novel c (and also adjoining R(a,c,x) if R(x,b,a)) actually looks quite short to do by pen and paper. 1648 may in fact ultimately be a little easier to work with than 65.

Daniel Weber (Oct 14 2024 at 03:39):

Daniel Weber said:

Another approach might be to try all partitions of instances of each variable in the rule, but I'm not sure if there are too many

I did this, and it managed to solve 511

Terence Tao (Oct 14 2024 at 03:47):

Is it automated enough to make a run on all the outstanding equations? There's about 40 left (up to duality)

Daniel Weber (Oct 14 2024 at 03:56):

I think so, but I'm still making some optimizations

Terence Tao (Oct 14 2024 at 04:31):

For future reference, the outstanding equations, quotiented by duality and equivalence, in decreasing order of yield, are

(1076), (1659), 1485, (1648), (511), 1692, 854, 1112, 707, 713, 1117, 1722, 883, 917, 1289, 1518, 3342, 1526, 476, 503, 1447, 1516, 1729, 1113, 1437, 3308, 677, 63, 118, 1323, 906, 879, 124, 1728, 3352, 3475

where the equations in parentheses are already claimed by previous posts.

Daniel Weber (Oct 14 2024 at 04:54):

I implemented a better minimization, it now found 13 rules for 511. I'll try it on all remaining equations

Daniel Weber (Oct 14 2024 at 05:08):

Daniel Weber said:

Only the rules are important, right? Once we have them I think it should be immediate how $R'$ should be defined, if we restrict it to only differ from $R$ in $R'(a, b, c)$ and $R'(c, \_, \_), R'(\_, c, \_)$ (and any values we have to set to satisfy old rules should be possible to simply convert to new rules)

This is false - in 1692 there's the rule $a \diamond a = x, b \diamond a = a, a \diamond b = c \to a \diamond x = c$ , I need to think for a bit how to account for that

Daniel Weber (Oct 14 2024 at 06:17):

It succeeded on 118, 124, 476, 503, 677, 707, 713*, 883, 906, 1112, 1113, 1289*, 1447. The * are equations it seemingly succeeded on, but couldn't prove that all rules are preserved. I'll try to run it on the remaining equations with more time

Terence Tao (Oct 14 2024 at 14:11):

Thats a pretty good success rate!

Terence Tao (Oct 14 2024 at 14:54):

When you are done with your run, can you update Conjectures.lean with all the implications that this procedure will likely resolve (presumably in the negative), so that it is clear to the other participants what remains unresolved even in advance of the Lean formalization? I imagine that unless an unknown equation ends up as part of the ruleset (which is very unlikely, since otherwise one of the ATPs would have resolved the implication already), it will be very easy to construct a partial solution R that obeys that equation (just pick the bare minimum of values of R to make the equation hold as "generically" as possible, then complete by the bare minimum for the ruleset), and in fact by taking "disjoint unions" of such examples one could even create a single infinite magma that disproves all of the implications at once, if one wished.

Daniel Weber (Oct 14 2024 at 15:00):

I'm not at home right now, I'll do it once I return, and I'll also check the rules with more time to confirm everything

Daniel Weber (Oct 14 2024 at 16:21):

It claims to have produced a rule set for all equations except 879, 1117, 1518, 1729, 3475, but for many rules it couldn't prove that they work, it just couldn't produce a counterexample in the alloted time, so they'll have to be rechecked (I think I'll just try to run proof-finding with more time)

Terence Tao (Oct 14 2024 at 16:28):

That's tentatively very encouraging. In particular that would knock out all the really "high yield" implications and leave us with as few as 14 outstanding implications (modulo duality), if everything goes right... :fingers_crossed:

Daniel Weber (Oct 14 2024 at 16:34):

Well, the rule set for 1485 was indeed invalid :sad: I'll try to rerun just for it and see if it eventually resolves

Pace Nielsen (Oct 14 2024 at 21:38):

As requested, here is a proof for the remaining 39 non-implications from Eqn1076. They all essentially follow the same format as 1076 =/> 3, and can be handled almost exactly the same way in each case.
Equation1076-corrected.pdf
Equation1076-corrected.tex

Edited: Needed to fix a few minor things in the files.

Terence Tao (Oct 14 2024 at 21:59):

I'm PR'ing this list to Conjectures.lean. So we should see a big chunk taken out of the "no conjecture" dashboard statistic soon (currently it stands at 342, with duality I project a drop to 264).

Daniel Weber (Oct 15 2024 at 08:26):

Daniel Weber said:

Well, the rule set for 1485 was indeed invalid :sad: I'll try to rerun just for it and see if it eventually resolves

I fixed some bugs and did a run which produces proofs, and it definitely found greedy rules for: 118, 124, 476, 503, 677, 707, 713*, 883, 906, 1112, 1113, 1289*, 1447*

the three equations with * are somewhat problematic in that their rule sets are quite large, and the proofs that they're preserved are very large and use a SAT solver as a subprocedure, so I don't think they could be formalized in Lean without native_decide, which means adding an axiom (and it will also require quite a bit of effort).
The rest of the equations have simpler proofs, but they'll still be fairly large (118 has ~5,000 logical steps involved, for instance), and I'm not sure how hard it'll be to adapt to existing Vampire->Lean formalization to them

Zoltan A. Kocsis (Z.A.K.) (Oct 15 2024 at 08:37):

This is vey exciting news despite the difficulties ahead! I can't wait to see an actual model refuting 1112->8 after I found out that my construction for the dual 2460->23 was fatally flawed.

Terence Tao (Oct 15 2024 at 14:52):

Very interesting! So we now have examples of extremely lengthy proofs of anti-implications.

Would you be comfortable with marking the open anti-implications coming from the non-asterisked equations as conjectures? If they come with small rulesets, then it should be possible to obtain a partial counterexample obeying those rulesets and not obeying any other law not know to be implied by the original law, since otherwise this would presumably mean there is a short derivation of that law from the original law.

On the other hand, the situation with the asterisked equations is more uncertain. It could be that the laws that are unknown anti-implications of these laws do actually happen to be forced by these massive rulesets, due to a lengthy proof of the positive implication that previous ATPs missed. I guess this could be checked by a further SAT-solver calculation, since we actually have the rulesets in hand. Maybe I'll mark those in commentary rather than as formal conjectures.

Daniel Weber (Oct 15 2024 at 14:58):

I'll search for assignments for all of the anti-implications

Daniel Weber (Oct 15 2024 at 15:01):

I've been thinking about how to formalize the proofs in Lean, and I'm not sure what's the best way to represent clauses. I could do P1 ∨ .. ∨ Pn, but if I need (in meta code) to reorder that to P2 ∨ (P1 ∨ P3 ∨ .. ∨ Pn), for example, it seems like it would be really annoying. Is there a better way to do that?

Terence Tao (Oct 15 2024 at 15:20):

Just added equational#589 to add comments to the "barely greedily satisfiable" laws 713, 1289, 1447. It may be that some other approach is more effective on them than the greedy approach.

I assume your search for assignments for auto-implications will be at least partially automated? This sounds like it could be quite tedious otherwise.

The general question of how to convert a SAT-solver proof to a Lean proof sounds like it would be of broader interest beyond the Equational Theories project. Perhaps there is some discussion of this already elsewhere in the Zulip?

Daniel Weber (Oct 15 2024 at 15:21):

The search for assignments will be fully automated, yes. There is bv_decide which invokes a SAT solver, but it depends on native_decide, which means adding an axiom.

Daniel Weber (Oct 15 2024 at 15:48):

Daniel Weber said:

I've been thinking about how to formalize the proofs in Lean, and I'm not sure what's the best way to represent clauses. I could do P1 ∨ .. ∨ Pn, but if I need (in meta code) to reorder that to P2 ∨ (P1 ∨ P3 ∨ .. ∨ Pn), for example, it seems like it would be really annoying. Is there a better way to do that?

Even turning (a ∨ b ∨ c) ∨ (d ∨ e ∨ f) to a ∨ b ∨ c ∨ d ∨ e ∨ f seems quite complicated

Daniel Weber (Oct 15 2024 at 17:12):

Daniel Weber said:

I'll search for assignments for all of the anti-implications

It instantly found one for all of them, I'll add the conjectures. Should it go in Conjectures.lean or in Generated?

Terence Tao (Oct 15 2024 at 17:15):

Huh, interesting question. I think it could fit in either; if one uses Facts then it should not blow up the Conjectures file beyond human readability. On the other hand, it is technically a computer generated result, so it could also fit in Generated. How about this: if you intend to also upload the computer code generating the conjectures, then put it in Generated; otherwise, just put the results in Conjectures and add a link to this discussion.

Terence Tao (Oct 15 2024 at 17:16):

So even the giant rulesets for the asterisked equations did not prevent counterexamples for their respective anti-implications?

Daniel Weber (Oct 15 2024 at 17:17):

They aren't that giant, the largest is 27 rules

Terence Tao (Oct 15 2024 at 17:17):

Oh okay then. It's only the SAT solver verification of greediness which is lengthy?

Daniel Weber (Oct 15 2024 at 17:17):

Yes

Terence Tao (Oct 15 2024 at 17:20):

Hmm. Well at least the high-level construction is (barely) human-readable then. I imagine the verification of greediness is going to be mind-numbingly tedious if done by hand. I hope it can be ported to Lean in a way that does not blow up the Lean compiler. It would make some of the individual implications computer-assisted in the classical sense (four color theorem etc.), but that's fine with me.

Terence Tao (Oct 15 2024 at 17:21):

Though perhaps the 27 rules can be condensed after some human inspection to a more manageable number.

Terence Tao (Oct 15 2024 at 17:22):

(Maybe this is a good use case for ImProver? https://arxiv.org/abs/2410.04753 )

Daniel Weber (Oct 15 2024 at 17:23):

equational#594

Daniel Weber (Oct 15 2024 at 17:30):

Even 118 (which it managed without a SAT solver) has 11 rules, and they look mostly opaque to me:

$R(x, y, z) \wedge R(x, y, z') \to z = z'$ (this is the operation being well defined)
$R(x, y, z) \wedge R(z, y, w) \to R(y, w, x)$ (this is the actual equation)
$R(x, y, z) \wedge R(x', y, z) \to x = x'$ (this is injectivity)
$R(x, y, z) \wedge R(y,y,x) \to R(z, y, y)$
$R(x, y, y) \to R(y, x, y)$
$R(x, x, y) \wedge R(y, x, z) \to R(x, y, z)$
$R(x, x, y) \wedge R(x, y, z) \to R(x, z, x)$
$R(x, x, y) \wedge R(x, y, z) \to R(z, x, x)$
$R(x, y, z) \wedge R(y, y, x) \wedge R(w, x, y) \to R(x, z, w)$
$R(x, x, y) \wedge R(z, x, y) \wedge R(z, y, x) \to y = z$
$R(x, x, y) \wedge R(z, x, x) \to R(y, x, z)$
$R(x, x, y) \wedge R(x, y, z) \to R(y, x, z)$

Terence Tao (Oct 15 2024 at 17:38):

All of them should be logical consequences of the original law 118 (assuming the magma is globally defined) via some finite number of rewrites. For instance, $R(x,y,y) \to R(y,x,y)$ , in original notation, asserts that if $x \diamond y = y$ , then $y \diamond x = y$ . This should therefore be derivable from the original law x = y ◇ ((x ◇ y) ◇ y) somehow.

The rule $R(x,x,y) \wedge R(y,x,z) \to R(x,y,z)$ is bothering me though. It can be written as $(x \diamond x) \diamond y = x \diamond y$ (Equation 375) which EquationExplorer lists as NOT being a consequence of 118. So something is wrong with my intuition, I thought that the greedy algorithm would only generate rules that are forced by the original law. Hmm, I don't have a good explanation for this currently.

Daniel Weber (Oct 15 2024 at 17:41):

Isn't that rule $x \diamond (x \diamond x) = (x \diamond x) \diamond x$ ? That's equation 4380, and it is implied by 118

Terence Tao (Oct 15 2024 at 17:43):

I really am quite error prone at pen and paper calculations. Thanks. OK, that's one down... perhaps the other seven laws won't be so opaque after all.

Terence Tao (Oct 15 2024 at 17:44):

These rulesets may actually shed a fair amount of light on what the 'local' behavior of a magma obeying a given rule like 118 is like.

Terence Tao (Oct 16 2024 at 01:43):

For what it's worth, I can now verify that all the rules that were autogenerated from 118 are, in fact, necessary consequences of 118. I'll abbreviate $x^2 := x \diamond x$ in what follows. The list here expands upon @Daniel Weber 's previous commentary.

$R(x,y,z) \wedge R(x,y,z') \to z=z'$ : this is $x \diamond y$ being well-defined.
$R(x,y,z) \wedge R(z,y,w) \to R(y,w,x)$ : this is the original equation 118 ( $x = y \diamond ((x \diamond y) \diamond y)$ .
$R(x,y,z) \wedge R(x',y,z) \to x=x'$ : this is right-cancellativity (an easy consequence of 118).
$R(x,y,z) \wedge R(y,y,x) \to R(z,y,y)$ : this is equation 255, $(y^2 \diamond y) \diamond y = y$ , a known consequence of 118.
$R(x,y,y)→R(y,x,y)$ : the assertion that $x \diamond y = y$ implies $y \diamond x = y$ . Proof: Equation 255 and right-cancellativity implies that $x = y^2 \diamond y$ . The claim then follows from equation 99, $y \diamond (y^2 \diamond y) = y,$ a known consequence of 118.
$R(x,x,y) \wedge R(y,x,z) \to R(x,y,z)$ : this is equation 4380, $x^2 \diamond x = x \diamond x^2$ , a known consequence of 118.
$R(x,x,y) \wedge R(x,y,z) \to R(x,z,x)$ : this is equation 47, $x = x \diamond (x \diamond x^2)$ , a known consequence of 118.
$R(x,x,y) \wedge R(x,y,z) \to R(z,x,x)$ : this is equation 255 again, $(x^2 \diamond x) \diamond x = x$ . It is redundant in view of 4.
$R(x,y,z) \wedge R(y,y,x) \wedge R(w,x,y) \to R(x,z,w)$ : if $w \diamond y^2 = y$ then $y^2 \diamond (y^2 \diamond y) = w$ . This follows from 118 with $x,y$ replaced by $w,y^2$ respectively, followed by equation 4380.
$R(x,x,y) \wedge R(z,x,y) \wedge R(z,y,x) \to y=z$ : If $zx = x^2$ and $z \diamond x^2 = x$ then $z = x^2$ . The first equation and right-cancellativity gives $z=x$ ,thus $x \diamond x^2 = x$ , and the claim now follows from equation 47. I think this is redunant because of 7.
$R(x,x,y) \wedge R(z,x,x) \to R(y,x,z)$ : If $z \diamond x = x$ then $x^2 \diamond x = z$ . This follows from 255 and right cancellativity. Not sure if it is completely redundant in the partial magma setting.
$R(x,x,y) \wedge R(x,y,z) \to R(y,x,z)$ This is 4380 again, though I think it doesn't actually follow from 6, rather 6 and 12 should be thought of jointly as the local analogue of 4380.

So while there is a small amount of trimming that could be done here, the rule set is pretty close to optimal, I think.

Daniel Weber (Oct 16 2024 at 01:52):

Hmm, I wonder if the rule sets could be more efficient if we start with adding all equations which are known to follow from the equation

Terence Tao (Oct 16 2024 at 01:54):

In terms of partial magma operations, the greedy extension rule seems to be this: suppose one wants to adjoin a new entry $a \diamond b = c$ to the partial multiplication table, with $c$ novel. If $a = b^2$ , we also set $b \diamond a = c$ and $c \diamond b = b \diamond c = b$ . If $b = a^2$ , we set $b \diamond a = c$ , $c \diamond a = a \diamond c = a$ . (It is possible that we perform both of these extensions, though note in those cases we must have $a \neq b$ since $a \diamond b$ was undefined.) Otherwise, we perform no other additions.

Terence Tao (Oct 16 2024 at 01:54):

I think one can add all implied equations which are not specializations of other known equations. There's no point adding specializations, those are truly redundant.

Terence Tao (Oct 16 2024 at 01:56):

More generally, any equation that can be derived from existing laws without using any new entries of the multiplication table beyond what was used to state the original equation, could also be safely omitted.

Terence Tao (Oct 16 2024 at 03:46):

What happened to 511 (or its dual 3116)? At one point it seemed that the greedy method could resolve it, but it seems to have dropped off the list. Was there a bug in the previous resolution? It's the second highest-yield equation remaining (it would resolve 24 implications); only 1485 (dual 2162) is higher at 38.

Daniel Weber (Oct 16 2024 at 03:47):

Terence Tao said:

For future reference, the outstanding equations, quotiented by duality and equivalence, in decreasing order of yield, are

(1076), (1659), 1485, (1648), (511), 1692, 854, 1112, 707, 713, 1117, 1722, 883, 917, 1289, 1518, 3342, 1526, 476, 503, 1447, 1516, 1729, 1113, 1437, 3308, 677, 63, 118, 1323, 906, 879, 124, 1728, 3352, 3475

where the equations in parentheses are already claimed by previous posts.

It was in parentheses here, so I didn't run on it. I'll do a run on the four equations in parentheses there, that might be interesting

Terence Tao (Oct 16 2024 at 03:48):

Oh, it was in parentheses because you seemed to have already solved it in a previous run :sweat_smile:

Daniel Weber (Oct 16 2024 at 04:01):

It succeeded on 511 and 1648, and 1076 with an asterisk

Terence Tao (Oct 16 2024 at 04:03):

1076 currently lists 5 unresolved implications, but that is a reporting error; @Pace Nielsen 's latest document conjecturally resolves all of them (and I just PR'd an update to fix this). 1648 still has one open implication, and 511 has 12. So this could (conjecturally at least) knock off 26 more implications, bringing us down to 154 I think. (Assuming that partial solutions for all the anti-implications can be found, which based on previous experience looks extremely likely.)

Daniel Weber (Oct 16 2024 at 04:05):

I can confirm that partial solutions can be found

Terence Tao (Oct 16 2024 at 04:05):

I guess you can PR an update to Conjectures.lean then. We're down to 19 unresolved equations (up to equivalence and duality)! In decreasing order of yield (in parentheses): 1485 (38), 1692 (22), 854 (16), 1117 (10), 1722 (10), 917 (8), 1518 (8), 3342 (8), 1526 (6), 1437 (4), 1516 (4), 1729 (4), 3308 (4), 63 (2), 879 (2), 1323 (2), 1728 (2), 3352 (2), 3475 (2).

Daniel Weber (Oct 16 2024 at 07:35):

(deleted)

Terence Tao (Oct 17 2024 at 01:33):

Just to confirm, all open implications from 1648 and 511 are conjectured to be resolved? If so i can go ahead and update Conjectures.lean accordingly

Daniel Weber (Oct 17 2024 at 15:29):

I mostly finished the Leanification (for the non-asterisked equations), but equations 118 and 511 are really long (more than 4000 lines for 511, in a single theorem), and take a really long time to be checked. I'm not quite sure what to do about that, perhaps I'd split it to one file per rule

Terence Tao (Oct 17 2024 at 15:32):

In the previous discussion, it was noted that it was possible to trim the 118 ruleset by one or two equations. Does this noticeably affect the length of the proof? (It's not obvious to me that it does, it might even make the verification longer, but perhaps it's worth checking.)

Daniel Weber (Oct 17 2024 at 15:38):

It increased the size of the proof :thinking:

Terence Tao (Oct 17 2024 at 15:39):

Interesting. It could be like AlphaGeometry - adding a key new point or line to the problem, despite ostensibly making the problem more complicated, can actually simplify the proof substantially.

Terence Tao (Oct 17 2024 at 15:40):

There is always the option to get a human to try to prove that the rulesets for 118 or 511 close, though it seems like an inversion of the natural order of things to get the human to do the tedious work to spare the computer assistant.

Terence Tao (Oct 17 2024 at 15:42):

Do you think the length is due to the inefficiency of converting a SAT solver calculation to a Lean program? In that, if SAT solving could be integrated more into Lean than it is currently, the proof would be significantly shorter?

Zoltan A. Kocsis (Z.A.K.) (Oct 17 2024 at 15:43):

Instead of checking the ruleset, it may be possible to
a.) use the ruleset to generate the actual countermodels we want
b.) study and describe those particular models some other way,
c.) give only that description to Lean.

Terence Tao (Oct 17 2024 at 15:44):

It's possible that by adopting @Pace Nielsen 's approach of selecting entries in batches, rather than one at a time, the countermodels become easier to describe. The vanilla greedy approach is extremely "local" by design, it cannot "see" any global structure to the model it is building.

Daniel Weber (Oct 17 2024 at 15:45):

Terence Tao said:

Do you think the length is due to the inefficiency of converting a SAT solver calculation to a Lean program? In that, if SAT solving could be integrated more into Lean than it is currently, the proof would be significantly shorter?

The length is approximately the number of logical steps the ATP takes, so I don't think it could be shortened much. However, it's definitely possible that a DSL could allow elaborating much more efficiently than currently. But I think the easiest approach currently would be to try and split the proofs to multiple files, that might work well enough

Terence Tao (Oct 17 2024 at 15:45):

Of course, these approaches don't scale as well as the vanilla greedy method, but we're down to two equations, so perhaps scaling is no longer a concern.

Terence Tao (Oct 17 2024 at 15:49):

@Daniel Weber Is the ruleset for 511 too long to post here? In any event it might be good to have the rulesets added (perhaps as a Lean comment, or a separate text file) to each of the generated files you create.

In the future I can imagine one type of mathematical activity being that of starting with a horrifically long and complex machine-written proof and trying to tease out an elegant and conceptual human-readable proof out of it using various proof analysis tools combined with good old fashioned human pen and paper mathematics. Perhaps 118 and 511 could serve as test cases for such an activity. (Or one could just try feeding these proofs into e.g., ImProver and seeing what happens.)

Daniel Weber (Oct 17 2024 at 15:52):

these are the rules in Lean (note that these are in clausal form, as that's what Vampire works with)

  rule_0 : (∀ X0 X1 X2 X3, ¬ R X0 X1 X2 ∨ ¬ R X0 X1 X3 ∨ X2 = X3)
  rule_1 : (∀ X0 X1 X2 X3 X4, ¬ R X0 X1 X2 ∨ ¬ R X1 X2 X3 ∨ ¬ R X1 X3 X4 ∨ R X1 X4 X0)
  rule_2 : (∀ X0 X1, ¬ R X0 X1 X1 ∨ R X1 X0 X1)
  rule_3 : (∀ X0 X1, ¬ R X0 X1 X0 ∨ ¬ R X1 X0 X1 ∨ R X0 X0 X0)
  rule_4 : (∀ X0 X1 X2 X3 X4, ¬ R X0 X1 X2 ∨ ¬ R X3 X0 X4 ∨ ¬ R X3 X1 X2 ∨ R X0 X3 X4)
  rule_5 : (∀ X0 X1 X2 X3, ¬ R X0 X0 X1 ∨ ¬ R X0 X1 X2 ∨ ¬ R X3 X0 X2 ∨ X1 = X3)
  rule_6 : (∀ X0 X1 X2 X3 X4, ¬ R X0 X1 X2 ∨ ¬ R X1 X2 X3 ∨ ¬ R X2 X1 X3 ∨ ¬ R X4 X0 X1 ∨ R X0 X4 X1)
  rule_7 : (∀ X0 X1 X2 X3 X4, ¬ R X0 X1 X2 ∨ ¬ R X0 X3 X0 ∨ ¬ R X1 X3 X0 ∨ ¬ R X4 X1 X3 ∨ R X1 X2 X4)
  rule_8 : (∀ X0 X1 X2 X3, ¬ R X0 X1 X2 ∨ ¬ R X3 X1 X2 ∨ X3 = X0)
  rule_9 : (∀ X0 X1 X2, ¬ R X0 X1 X0 ∨ ¬ R X0 X2 X0 ∨ ¬ R X2 X0 X1 ∨ X1 = X2)
  rule_10 : (∀ X0 X1 X2 X3, ¬ R X0 X1 X0 ∨ ¬ R X2 X0 X3 ∨ ¬ R X3 X0 X1 ∨ R X0 X0 X2)
  rule_11 : (∀ X0 X1 X2 X3 X4, ¬ R X0 X1 X2 ∨ ¬ R X3 X0 X1 ∨ ¬ R X4 X0 X2 ∨ R X0 X3 X4)
  rule_12 : (∀ X0 X1 X2 X3 X4 X5, ¬ R X0 X1 X2 ∨ ¬ R X0 X3 X2 ∨ ¬ R X4 X0 X1 ∨ ¬ R X5 X0 X3 ∨ X4 = X5)
  rule_13 : (∀ X0 X1 X2 X3 X4, ¬ R X0 X1 X2 ∨ ¬ R X0 X3 X1 ∨ ¬ R X0 X4 X0 ∨ ¬ R X3 X0 X4 ∨ X2 = X0)
  rule_14 : (∀ X0 X1 X2 X3, ¬ R X0 X1 X2 ∨ ¬ R X0 X3 X0 ∨ ¬ R X1 X0 X3 ∨ R X0 X2 X0)
  rule_15 : (∀ X0 X1 X2 X3 X4, ¬ R X0 X1 X0 ∨ ¬ R X0 X2 X3 ∨ ¬ R X3 X0 X1 ∨ ¬ R X4 X0 X2 ∨ X0 = X4)

Daniel Weber (Oct 17 2024 at 15:52):

The proofs use custom tactics extensively, so I'm not sure how well ImProver would work on them

Terence Tao (Oct 17 2024 at 15:56):

It's interesting that there are a relatively large number of rules for 511, since 511 doesn't actually imply any interesting equations in our list (other than trivial specializations). There must be a certain amount of latent structure to magmas obeying this law that can't quite be picked up just by EquationX -> EquationY implications. I might try to decipher the interpretation of this ruleset later, it looks relatively doable, though perhaps a bit tedious.

Terence Tao (Oct 17 2024 at 17:36):

OK, I can derive all 16 rules here from 511 (assuming it holds globally for simplicity), but the statements are definitely "math from Mars" here. Only a few I would have recognized a priori as being possible conclusions of 511. It seems that the equation 511 is surprisingly rich in structure, and yet still only has a finite (but large) number of local constraints. In particular I don't see an obvious way to shorten the verification of the greedy algorithm to a human-readable form, without some major insight into the structure of 511 magmas.

Some notation: $L_y x = yx$ and $R_y x = xy$ are the left and right multiplication operations (I omit the magma law for brevity). 511 asserts that $x = L_y^3 R_y x$ . In particular $x = L_y^4 x$ , which we will often use; also $R_y$ is injective. While 511 does not imply the commutative law, many of the rules below seem to indicate that 511 magmas "want" to be commutative somehow. Also there are hints that 511 magmas "want" to be of "exponent 4" in some sense, even if they need not be a group. (Note that any abelian group of exponent 4 obeys 511.)

rule_0: this asserts that the magma operation is well-defined.
rule_1: This is the original law 511.
rule_2: If $xy=x$ then $yx=x$ . Proof: $yx = L_y (L_y^3 R_y x) = L_y^4 y = y$ .
rule_3: If $xy=x$ and $yx=y$ then $x^2=x$ . Proof: $y = L_x^3 R_x y = L_x^2 x$ , hence $x = L_x^4 x = L_x^2 y = L_x x = x^2$ .
rule_4: If $xy=wy=z$ and $wx=u$ then $xw=u$ . Proof: from injectivity of $R_y$ , $x=w$ , hence the claim.
rule_5: if $x^2 = y$ , $xy = z$ , $wx=z$ , then $w=y$ . Proof: $w = L_x^3 R_x w = L_x^4 R_x x = L_x x = y$ .
rule_6: If $xy=z$ , $yz=w$ , $zy=w$ , $ux=y$ , then $xu=y$ . Proof: $yx = L_y(L_y^3 R_y x) = L_y^3 w = L_y^3 R_y z = z = xy$ , hence $xu = L_x(L_x^3 R_x u) = L_x^4 y = L_x^3 R_x y = y$ .
rule_7: If $xy=z$ , $xw=x$ , $yw=x$ , $uy=w$ , then $yz=u$ . Proof: by injectivity of $R_w$ , $x=y$ . Now $u = L_y^3 R_y u = L_y^2 y = y z$ .
rule_8: This is the injectivity of $R_y$ .
rule_9: If $xy = xz = x$ and $zx = y$ then $x=y$ . Proof: $z = L_x^3 R_x z = L_x^2 x$ , hence $x = L_x^4 x = L_x^2 z = x^2$ , then $z = L_x^2 x = x$ , hence $y = x^2 = x$ .
rule_10: If $xy=x$ , $zx=w$ , $wx=y$ , then $x^2=z$ . Proof: $w = L_x^3 R_x w = L_x^3 y = L_x^2 x$ , hence $z = L_x^3 R_x x = L_x^3 w = L_x^5 x = L_x x = x^2$ .
rule_11: If $xy = z$ , $wx=y$ , $ux=z$ , then $xw = u$ . Proof: $u = L_x^3 R_x u = L_x^4 y = L_x^4 R_x w = L_x w$ .
rule_12: If $xy = xw = z$ , $ux = y$ , $vx=w$ , then $u=v$ . Proof: $u = L_x^3 R_x u = L_x^2 z = L_x^3 R_x v = v$ .
rule_13: If $xy=z$ , $xw = y$ , $xu=x$ , $wx=u$ , then $z=x$ . Proof: $z = L_x y = L_x^2 w = L_x^2(L_x^3 R_x w) = L_x^4 x = x$ .
rule_14: If $xy=z$ , $xw = x$ , $yx=w$ , then $xz=x$ . Proof: $x = L_x^4 x = L_x^4 (L_x R_x y) = L_x^2 y = xz$ .
rule_15: If $xy=x$ , $xz=w$ , $wx=y$ , and $ux=z$ , then $x=u$ . Proof: $u = L_x^3 R_x u = L_x^2 w = L^2_x L_x^3 L_x w = L_x^4 x = x$ .

Daniel Weber (Oct 18 2024 at 03:25):

Daniel Weber said:

It increased the size of the proof :thinking:

Tweaking the proofs so they can use earlier (already proven) rules, it now decreased the size of the proof

Daniel Weber (Oct 18 2024 at 05:18):

511 and 118 now work, although they take a few minutes

Michael Bucko (Oct 18 2024 at 13:57):

Terence Tao schrieb:

Only a few I would have recognized a priori as being possible conclusions of 511. It seems that the equation 511 is surprisingly rich in structure, and yet still only has a finite (but large) number of local constraints. In particular I don't see an obvious way to shorten the verification of the greedy algorithm to a human-readable form, without some major insight into the structure of 511 magmas.

I was trying to find a major insight into 511 and (so far) did not find anything. I constructed clusters of implication patterns, and turned them into a network. From this perspective, I can only see it that 511-related patterns (marked orange) belong to a cluster close to the center of the network.
newimage.png

Daniel Weber (Oct 20 2024 at 05:53):

I finally finished, equational#678

Terence Tao (Oct 20 2024 at 14:04):

I noticed that the conjectures for 511 have been removed, but the proofs for 511 are not incorporated into the database. Are you withdrawing the claims for 511? I do see a ruleset for 511 in your PR.

Daniel Weber (Oct 20 2024 at 14:05):

They should be, at https://github.com/teorth/equational_theories/blob/main/equational_theories/Generated/Greedy/Eq511.lean

Terence Tao (Oct 20 2024 at 14:06):

Hmm. Equation Explorer is listing the 12 implications from 511 as open for some reason (and the dashboard is count is similarly too high). Not sure where the discrepancy is coming from yet.

Daniel Weber (Oct 20 2024 at 14:06):

Perhaps I forgot to import it in Greedy.lean?

Daniel Weber (Oct 20 2024 at 14:07):

Yep, that's the problem, I'll make a PR fixing it

Daniel Weber (Oct 20 2024 at 14:14):

I made equational#683

Last updated: May 02 2025 at 03:31 UTC