Zulip Chat Archive

Stream: Equational

Topic: Resolving >60% of cases with non-implications automatically


Nicholas Carlini (Sep 27 2024 at 06:41):

TL;DR: I believe I can resolve >60% of the 4694^2 cases with "A does not imply B" via counterexample. But I'm not a mathematician so maybe I'm terribly misguided. I also don't know how turn my counterexamples into lean, so if someone does that, they'd solve half of the problems in one go. Maybe this is obvious? Not sure, I figured I'd write it down in case it's helpful.

Let me first state my understanding of the problem that's being studied. If I'm wrong here you can stop reading here and everything else is incorrect. You have a list of 4694 possible equations that might hold true involving an operator op over some set G, regardless of what op and G you choose. You want to know which equations imply one another.

(Equation A) implies (Equation B) if whenever (Equation A) is true, (Equation B) is also true. Proving that this is true is hard, because you have to show this for all operators and sets. But proving the negation is easier: all you have to do is find a single operator and set where (Equation A) is true, and (Equation B) is false.

For example, if we take the operator to be op(a, b) = a*b, and the set GF(2), then the Equation 3, x = op(x, x) is true for all x,y \in [0,1]. But Equation 4, x = op(x, y) is false, because 1 != 1*0. Therefore (Equation 3) does not imply (Equation 4).

Is my understanding here correct? (I hope so...)

If you've got this far it means my math undergrad from ten years ago has served me well (yay!).

I wrote a program to try and automatically brute-force potential counterexamples. To do this, I randomly generate second degree polynomials of the form op(x,y) = ax^2 + by^2 + cxy + dx + ey + f. Then, I choose G = Z/kZ for some small k.

Now I can just run through each of the 4694 equations, and check which equations are true, and which equations are false. For example, if I take the equation

op(x, y) = 4 * x2 + 4 * y2 + 4 * y + 2 * x * y over GF(5)

then the following three equations are true:

def Equation1 (G: Type*) [Magma G] := ∀ x : G, x = x

def Equation151 (G: Type*) [Magma G] := ∀ x : G, x = (x ∘ x) ∘ (x ∘ x)

def Equation159 (G: Type*) [Magma G] := ∀ x y : G, x = (x ∘ y) ∘ (y ∘ x)

and all others are false. (This can be checked by brute force.) This lets me fill in 3*(4694-3) possible implications.

Ideally we'll find some polynomials that are true often to get as many equations as possible. Indeed, the best operator for ruling out the most possible implications is just

op(x, y) = 0 [1556 true]

op(x, y) = x mod 2 [1214 true]

op(x, y) = x+y mod 2 [663 true]

As confirmation that I think I have this all correct, I can correctly refute every one of the non-implications already proven, and can't refute any of the implications already proven.

Now here's where I'm stuck after a few hours. I've never used Lean before. And I have half a dozen papers I'm supposed to be writing, so I won't be able to learn it in the next week.

What I have are a bunch of lines of the form

Equations ['Equation1242', 'Equation211'] do not imply ['Equation3050', 'Equation99', 'Equation3862'] with op(x,y) = (8 * x + 3 * y + 9) % 11

Equations ['Equation1285'] do not imply ['Equation513', 'Equation411'] with op(x,y) = (5 * x + 2 * y + 2) % 7

Equations ['Equation1083'] do not imply ['Equation2441'] with op(x,y) = (1 * x + 2 * y + 4) % 9

This lets me show that all of these implications can't be true (ordered by the problem index)
fig1.png
or if I re-order in easiesit-to-hardest
fig2.png

All to say: if anyone would like to take this dump and convert it all to lean, it'll give over half of the problems all at once.

Joachim Breitner (Sep 27 2024 at 06:51):

Sounds fun! Guess what you are doing is a variant of what I suggested in https://leanprover.zulipchat.com/#narrow/stream/113488-general/topic/Proposing.20a.20universal.20algebra.20exploration.20using.20Lean/near/472926804 but with a more specific operation, and larger carrier sets.

I can have a look at how to move that to lean in an hour or so. Can you share your code?

Amir Livne Bar-on (Sep 27 2024 at 06:52):

How many operations do you have altogether? I wonder how much smaller the membership data (magma, equation) is than the implications data

Nicholas Carlini (Sep 27 2024 at 06:53):

It's a total of 482 equations that cover all the cases.

Nicholas Carlini (Sep 27 2024 at 07:04):

https://github.com/carlini/equational_theories/tree/main/brute_force

Has my code for doing this.

https://github.com/carlini/equational_theories/blob/main/brute_force/output.txt is a list of all of the invariances

Joachim Breitner (Sep 27 2024 at 07:05):

Also, could you include in your output data also the values for the equation that refutes the one equation - presumably you have it and it shortens the proof time in lean.

Nicholas Carlini (Sep 27 2024 at 07:06):

If I generate this for all of the problems it'll be a few hundred megabyte file. The python code generates it here https://github.com/carlini/equational_theories/blob/cc6a670252f1ce2ae5adfbca1630b8dbb646984d/brute_force/show_out.py#L59

Joachim Breitner (Sep 27 2024 at 07:07):

Hmm, ok , I'll try first with that we have and see if it would make a significant difference

Nicholas Carlini (Sep 27 2024 at 07:10):

Of maybe independent interest, here are the "hardest" equations that imply "almost nothing". (All of these imply fewer than 5 other equations with my current set of invariances. The true set may be even smaller.)

[   1 1832 4591 2238 4608   99   47 4598 4599 1020 4658 3253 4380 4605
 4276 4606  817 2441 1629 2847 4284 4343 4290 4321 4320 4293  203 4314
 3456 4065 3659 3862 2035 2644 4636 3050 1426  411  255 1223  151  614
 4275 3484 1238 1239 3458 4631  159 1248 1252 4635 3272 3079 4282 4283
 4270 3459 3457 2746 2736 3461 1075 3462 1036 1035 1028 4622 3464 1023
  167 3271 3472 3474 1109 4268 4269 4273 1038 4362 4316 3278  633 4585
 4584  632  622  620 4470 4472 4473 4480 4587  264 3112  513 3281 3316
 3315 3306  440  429  417  359 3068 3319 4307 4588 3343 4332 4341 4358
 4165  846  833  825  221 4369 2863 2866  639 2872 4398 4399 2903 4433
 4435 4442 4445 2946  669 3353 4590 4396 1315 1278 1322 2240 3887 2244
 2246 3915 2254 3925 1731 3675 3255 3674 3962 3880 3668 3256 3518 3667
 3665 3664 3662 4673   75 3661 3660 1657 1637 4677 3878 3877 3677 2050
 2063 3761 3759 2124 3752 3749 3748 3725 3724 3722 3721 3714 3712 3864
 3867 3868 3870 3687 1861 1860 3685 1857 3684 4684 1847 3678 2340 3258
  307 4666 1451 1429 1428 3261 4070 4073 4074 4083 4090 1454 4091 4118
 3522 4127 4647 3262 4128 4130 4131 3519 4093 4068 4071 3259 2457 2530
 2466 2444 2467 3152 2443 2506 2459 1631 4656 1488 2496  105 4067 4479
 1228 1026 3865 1721 1251 3482 3509 3511  427  414 3512 2088 4155 1313
 3318 4084 3481 2646 2293 3928 2266 3924 3918 3955 4405 4408  832 3556
  845 2910 2243 3888 1634 3881 1083  870 4080 1837 3475 4081 1848 1851
 2709 2503 1684 2043 3279 3139  107   50 3254  211  280 3269 1515 2543
 3115 2100 3058 1479 4583 1452 1049 2534  179 1444 1647 3346  108 3075
 3149 4272  378  160  127 2504 2128 2137  466 3308 1325 1934 3066 1519
 2699 4120 1122 1113 1850 1455 1658 2949 2939   56 4395 4389 4385 1434
  420 4383 4382  872  212  419 2294  842  117 3105 2044 1427 1229 2267
 2470  231 1279 1285 4436  323 2087 4432 2300  436  437 4158  906  209
 4409 4351 4406 4597 3726 1241 3728 3309  473 3723 3729 3680 4023 3943
 3935  879 3927 3414 3921 3670 3669 2256 2912 3666 3663 4291 1046 3873
 4629 3334 3467 3326  562 3323 2672 4512]

Obviously (1) being the top makes sense. Several of the next others, too. But some after that start looking a tiny bit more interesting.

Joachim Breitner (Sep 27 2024 at 07:31):

Ok, on it. I guess refutations.txt is the raw data I could convert to a lean file, and it lists those equations that the given magma does not satisfy (and implicitly, that it satisfies all others?)

Joachim Breitner (Sep 27 2024 at 07:50):

I manually wrote the code that I think I’d be creating here. How does this look?

-- While we explore things and the repo setup is in flux, let's copy some things here
-- Eventually these defintions should be provided by imports

universe u

class Magma (α : Type u) where
  /-- `a ∘ b` computes a binary operation of `a` and `b`. -/
  op : α  α  α

@[inherit_doc] infixl:65 " ∘ "   => Magma.op

def Equation1 (G: Type _) [Magma G] :=  x : G, x = x
def Equation2 (G: Type _) [Magma G] :=  x y : G, x = y
def Equation1629 (G: Type _) [Magma G] :=  x : G, x = (x  x)  ((x  x)  x)


-- The following line produced by the tool
-- '(4 * x**2 + 4 * y**2 + 1 * x + 0 * y + 0 * x * y) % 6' (0, 1628, 3252, 3260, 3305, 3318, 3333, 4064, 4117)
-- would be translated to the following output

def «FinitePoly 4x² + 4y² + x % 6» : Magma (Fin 6) where
  op x y := 4*x*x + 4*y*y + 6

theorem «FinitePoly 4x² + 4y² + x % 6 satisfies Equation1» :
  @Equation1 _ «FinitePoly 4x² + 4y² + x % 6» := by unfold Equation1; decide

-- and more

theorem «FinitePoly 4x² + 4y² + x % 6 refutes Equation1629» :
  ¬ @Equation1629 _ «FinitePoly 4x² + 4y² + x % 6» := by unfold Equation1629; decide

-- The following non-implications are now trivial combinations of the above,
-- we'll see if they are actually needed in the final repo setup

theorem Equation1_not_implies_Equation1629 :
     (G: Type) (_: Magma G), Equation1 G  ¬ Equation1629 G :=
  ⟨_, _, «FinitePoly 4x² + 4y² + x % 6 satisfies Equation1», «FinitePoly 4x² + 4y² + x % 6 refutes Equation1629» 

Daniel Weber (Sep 27 2024 at 07:51):

Perhaps this should use ZMod and not Fin?

Joachim Breitner (Sep 27 2024 at 07:52):

I intentionally used Fin here because it means the files don’t have to import Mathlib. When we have hundrets of these files the additional processing speed may well be worth it.

Joachim Breitner (Sep 27 2024 at 07:52):

And since all we care about are the final implications, it shouldn't matter how we construct the magma, should it?

Joachim Breitner (Sep 27 2024 at 07:54):

(This assumes that eventually the definition of Magma for this project, and the list of equations, are also in a import-free file. But I see no reason why they can't)

Vlad Tsyrklevich (Sep 27 2024 at 08:05):

I was working on something similar last night and just thought I'd note it here: instead of having X_not_implies_Y proofs, a general proof for a whole formula (e.g. Eq1 ∧ Eq2 ∧ ¬Eq3 ∧¬Eq4) takes O(M+N) time where there are M/N positive/negative terms, but proves M*N cases, so it scales very well if certain formulas are particularly general. (Perhaps this is obvious, but it was not how I initially started.)

Daniel Weber (Sep 27 2024 at 08:08):

Nicholas Carlini said:

Of maybe independent interest, here are the "hardest" equations that imply "almost nothing". (All of these imply fewer than 5 other equations with my current set of invariances. The true set may be even smaller.)

[  ... ]

Obviously (1) being the top makes sense. Several of the next others, too. But some after that start looking a tiny bit more interesting.

Do you happen to have data of how many magmas you found satisfying each of these?

Joachim Breitner (Sep 27 2024 at 08:14):

Ok, this doesn’t work :-D

~/build/lean/equational_theories $ du -sh  equational_theories/FinitePoly
5,8G    equational_theories/FinitePoly

Vlad Tsyrklevich (Sep 27 2024 at 08:19):

What is the size from? Individual not_implies theorems, or?

Joachim Breitner (Sep 27 2024 at 08:47):

Ok, the file “works” now, but it seems we are stressing lean a bit too much, I get

equational_theories/FinitePoly/Refutation0.lean:52813:8: error: (kernel) deep recursion detected

(note the line number)

Daniel Weber (Sep 27 2024 at 08:48):

How many magmas are there? Perhaps you can split it to files based on the magma?

Joachim Breitner (Sep 27 2024 at 08:48):

It’s already one file per magma

Joachim Breitner (Sep 27 2024 at 08:49):

It seems stupid to generate all the non-equations in every of these files. I think there needs to be a tool that identifies “interesting” non-implications (e.g. if a→b, then no point disproving c→a if we can disprove c→b), and also somehow tells the code generation which implications should be generated from which magma

Joachim Breitner (Sep 27 2024 at 08:51):

If I remove the implications from the generated file, then one takes ~1min to process with lean, and is 800k in size, with 218 files generated, so still 90MB. I could compress the files a lot with a bit of metaprogramming, but whether that is useful depends a bit on how they will be integrated into the wider picture here.

Daniel Weber (Sep 27 2024 at 08:54):

Joachim Breitner said:

It seems stupid to generate all the non-equations in every of these files. I think there needs to be a tool that identifies “interesting” non-implications (e.g. if a→b, then no point disproving c→a if we can disprove c→b), and also somehow tells the code generation which implications should be generated from which magma

What are the rules of inference here? Did I miss something?

  • If a implies b and b implies c then a implies c
  • If a implies b and a doesn't imply c then b doesn't imply c
  • If a doesn't imply b and c implies b then a doesn't imply c

Vlad Tsyrklevich (Sep 27 2024 at 08:54):

Joachim Breitner said:

If I remove the implications from the generated file, then one takes ~1min to process with lean, and is 800k in size, with 218 files generated, so still 90MB. I could compress the files a lot with a bit of metaprogramming, but whether that is useful depends a bit on how they will be integrated into the wider picture here.

Is it possible to give an example of such a file? I'm having difficulty understanding what it looks like exactly

Joachim Breitner (Sep 27 2024 at 08:59):

See the example above, just with 4000 implies and refutes lemmas, and then non-implication-lemmas for all combinations :-)

Joachim Breitner (Sep 27 2024 at 09:00):

Ok, as I expeced I can cut the processing time by half by using decide!, a tactic I quickly hacked up that works like decide but doesn’t check the predicate, so it’s only checked once by the kernel. It’s something I meant to write anyways, and will probably propose to lean.

Daniel Weber (Sep 27 2024 at 09:00):

Daniel Weber said:

Joachim Breitner said:

It seems stupid to generate all the non-equations in every of these files. I think there needs to be a tool that identifies “interesting” non-implications (e.g. if a→b, then no point disproving c→a if we can disprove c→b), and also somehow tells the code generation which implications should be generated from which magma

What are the rules of inference here? Did I miss something?

  • If a implies b and b implies c then a implies c
  • If a implies b and a doesn't imply c then b doesn't imply c
  • If a doesn't imply b and c implies b then a doesn't imply c

I think this means that if we look at the transitive reduction condensation graph of implications then "interesting" non-implications are non-implications from nodes to nodes with out-degree 0 which a direct ancestor of them can reach, but once we also add unknown relations that gets more complicated

Vlad Tsyrklevich (Sep 27 2024 at 09:01):

Joachim Breitner said:

See the example above, just with 4000 implies and refutes lemmas, and then non-implication-lemmas for all combinations :-)

What I'm trying to see, is if you perform refutations like I mentioned above that turns M*N theorems into M+N theorems.

Joachim Breitner (Sep 27 2024 at 09:16):

I put my current work in progress in https://github.com/teorth/equational_theories/pull/19

Shreyas Srinivas (Sep 27 2024 at 09:36):

I enabled CI on this PR to make sure you also have all the error info you might need from that

Shreyas Srinivas (Sep 27 2024 at 09:39):

Also I think from this point you will get CI automatically

Joachim Breitner (Sep 27 2024 at 10:43):

I still see “This workflow requires approval from a maintainer.”

Shreyas Srinivas (Sep 27 2024 at 10:50):

Running

Shreyas Srinivas (Sep 27 2024 at 10:53):

We need to set up a GitHub action for this and allow users to add labels like WIP and awaiting-review

Shreyas Srinivas (Sep 27 2024 at 10:53):

Currently I am doing this manually

Joachim Breitner (Sep 27 2024 at 10:56):

I put these extra file in to a non-default lake library, so that lake build doesn't take forever for unsuspecting users, and added it to the CI. can you allow it again?

Joachim Breitner (Sep 27 2024 at 11:06):

(Although it might blow up the github runners, memory consumption is rather high for these files.)

Shreyas Srinivas (Sep 27 2024 at 11:34):

Okay

Joachim Breitner (Sep 27 2024 at 11:54):

Hmm, according to profiling, the majority of the time is spent is synthesizing decidable instances, and not in actually running them. Guess there are some perf improvements to be gained, if they become the bottleneck. But maybe first see if this is actually a direction that's welcome.

Joachim Breitner (Sep 27 2024 at 12:44):

Couldn't resist. Replacing type class inference with simple MetaM code constructing the instance speeds up checking these files by about 2.5×. Now the kernel type checking is the bottleneck.

Shreyas Srinivas (Sep 27 2024 at 12:57):

CI run started

Vlad Tsyrklevich (Sep 27 2024 at 13:51):

@Nicholas Carlini Just to confirm that I'm not mis-parsing this data, I found that there are 4,794,933 unique refutation pairs , which is ~21.7% of 4694^2. I just wanted to double check this with you to make sure I'm not making an error here. For the >60% number I wonder if there could have been double counting to the inclusion of refutation pairs in multiple equations? Prior to running sort -u the number of cases was approximately 70% of 4694^2

Nicholas Carlini (Sep 27 2024 at 13:59):

@Vlad Tsyrklevich Sorry, I accidentally truncated a much shorter file than I intended to commit. Let me correct this.

Nicholas Carlini (Sep 27 2024 at 14:05):

Thanks for noticing; I've corrected this. Here's the bigger set that should work if you could please confirm. https://github.com/carlini/equational_theories/blob/main/brute_force/refutations_62percent.txt

Vlad Tsyrklevich (Sep 27 2024 at 14:25):

Confirmed that it's now ~60% of all pairs and that it's a strict superset of the previous data. There are 13,633,461 total pairs.

Nicholas Carlini (Sep 27 2024 at 14:26):

Daniel Weber said:

Do you happen to have data of how many magmas you found satisfying each of these?

Here are counts for how many unique magmas refute each of these. Each of these are "almost never true". For example, the last few are only refuted by a very small number of magmas.

Equation | num_uniqe_satisfying
4065 1483
3253 1470
3659 1162
3456 985
3862 961
4118 825
3319 819
[...]
2504 3
1279 3
2903 3
1313 2
2294 2

Here are the refutations for the bottom few:

1083 (0 * x2 + 0 * y2 + 1 * x + 2 * y + 0 * x * y) % 3
1083 (0 * x2 + 0 * y2 + 6 * x + 2 * y + 0 * x * y) % 7
1083 (0 * x2 + 3 * y2 + 4 * x + 2 * y + 6 * x * y) % 9

1279 (0 * x2 + 0 * y2 + 2 * x + 4 * y + 0 * x * y) % 5
1279 (0 * x2 + 0 * y2 + 3 * x + 1 * y + 0 * x * y) % 7
1279 (0 * x2 + 0 * y2 + 4 * x + 4 * y + 0 * x * y) % 7

1313 (0 * x2 + 0 * y2 + 1 * x + 2 * y + 0 * x * y) % 7
1313 (0 * x2 + 0 * y2 + 2 * x + 4 * y + 0 * x * y) % 5

2294 (0 * x2 + 0 * y2 + 2 * x + 5 * y + 0 * x * y) % 7
2294 (0 * x2 + 0 * y2 + 4 * x + 2 * y + 0 * x * y) % 5

2504 (0 * x2 + 0 * y2 + 2 * x + 1 * y + 0 * x * y) % 3
2504 (0 * x2 + 0 * y2 + 6 * x + 2 * y + 0 * x * y) % 7
2504 (6 * x2 + 0 * y2 + 2 * x + 4 * y + 3 * x * y) % 9

2903 (0 * x2 + 0 * y2 + 2 * x + 1 * y + 0 * x * y) % 11
2903 (0 * x2 + 0 * y2 + 2 * x + 4 * y + 0 * x * y) % 5
2903 (0 * x2 + 0 * y2 + 4 * x + 4 * y + 0 * x * y) % 7

2910 (0 * x2 + 0 * y2 + 1 * x + 4 * y + 0 * x * y) % 7
2910 (0 * x2 + 0 * y2 + 3 * x + 4 * y + 0 * x * y) % 7
2910 (0 * x2 + 0 * y2 + 4 * x + 2 * y + 0 * x * y) % 5

Nicholas Carlini (Sep 27 2024 at 14:27):

Is this interesting in and of itself? If so I can spin this out somewhere else. (I don't know how you all handle discussions on this thing; I've never used it before.)

Nicholas Carlini (Sep 27 2024 at 14:39):

For example, if you could show any of the following statement, we would have completely characterized the implications for these equations:

Equation2746[x = ((y ∘ y) ∘ (y ∘ y)) ∘ x] => Equation2644[x = ((x ∘ x) ∘ (x ∘ x)) ∘ x]
Equation4283[x ∘ (x ∘ y) = x ∘ (y ∘ x)] => Equation4358[x ∘ (y ∘ z) = x ∘ (z ∘ y)]
Equation4282[x ∘ (x ∘ y) = x ∘ (x ∘ z)] => Equation4268[x ∘ (x ∘ x) = x ∘ (x ∘ y)]
Equation3271[x ∘ x = y ∘ (x ∘ (y ∘ x))] => Equation3253[x ∘ x = x ∘ (x ∘ (x ∘ x))]
Equation3272[x ∘ x = y ∘ (x ∘ (y ∘ y))] => Equation3253[x ∘ x = x ∘ (x ∘ (x ∘ x))]
Equation3278[x ∘ x = y ∘ (y ∘ (x ∘ x))] => Equation3253[x ∘ x = x ∘ (x ∘ (x ∘ x))]
Equation3281[x ∘ x = y ∘ (y ∘ (y ∘ x))] => Equation3253[x ∘ x = x ∘ (x ∘ (x ∘ x))]
Equation1109[x = y ∘ ((y ∘ (x ∘ x)) ∘ x)] => Equation1020[x = x ∘ ((x ∘ (x ∘ x)) ∘ x)]
Equation1238[x = x ∘ (((y ∘ x) ∘ x) ∘ x)] => Equation1223[x = x ∘ (((x ∘ x) ∘ x) ∘ x)]
Equation1239[x = x ∘ (((y ∘ x) ∘ x) ∘ y)] => Equation1223[x = x ∘ (((x ∘ x) ∘ x) ∘ x)]
Equation4316[x ∘ (y ∘ x) = x ∘ (z ∘ x)] => Equation4269[x ∘ (x ∘ x) = x ∘ (y ∘ x)]
Equation1248[x = x ∘ (((y ∘ y) ∘ x) ∘ x)] => Equation1223[x = x ∘ (((x ∘ x) ∘ x) ∘ x)]
Equation4472[x ∘ (y ∘ y) = (x ∘ y) ∘ x] => Equation4380[x ∘ (x ∘ x) = (x ∘ x) ∘ x]
Equation1252[x = x ∘ (((y ∘ y) ∘ y) ∘ y)] => Equation1223[x = x ∘ (((x ∘ x) ∘ x) ∘ x)]
Equation4473[x ∘ (y ∘ y) = (x ∘ y) ∘ y] => Equation4380[x ∘ (x ∘ x) = (x ∘ x) ∘ x]
Equation4584[(x ∘ x) ∘ x = (x ∘ y) ∘ x] => Equation4631[(x ∘ y) ∘ x = (x ∘ z) ∘ x]
Equation4585[(x ∘ x) ∘ x = (x ∘ y) ∘ y] => Equation4656[(x ∘ y) ∘ y = (x ∘ z) ∘ z]
Equation3306[x ∘ y = x ∘ (x ∘ (x ∘ y))] => Equation3253[x ∘ x = x ∘ (x ∘ (x ∘ x))]
Equation4587[(x ∘ x) ∘ x = (y ∘ x) ∘ x] => Equation4666[(x ∘ y) ∘ y = (z ∘ y) ∘ y]
Equation620[x = x ∘ (x ∘ ((y ∘ x) ∘ y))] => Equation614[x = x ∘ (x ∘ ((x ∘ x) ∘ x))]
Equation622[x = x ∘ (x ∘ ((y ∘ y) ∘ x))] => Equation614[x = x ∘ (x ∘ ((x ∘ x) ∘ x))]
Equation3315[x ∘ y = x ∘ (y ∘ (x ∘ x))] => Equation3253[x ∘ x = x ∘ (x ∘ (x ∘ x))]
Equation3316[x ∘ y = x ∘ (y ∘ (x ∘ y))] => Equation3253[x ∘ x = x ∘ (x ∘ (x ∘ x))]
Equation4470[x ∘ (y ∘ y) = (x ∘ x) ∘ y] => Equation4380[x ∘ (x ∘ x) = (x ∘ x) ∘ x]
Equation632[x = x ∘ (y ∘ ((x ∘ y) ∘ x))] => Equation614[x = x ∘ (x ∘ ((x ∘ x) ∘ x))]
Equation633[x = x ∘ (y ∘ ((x ∘ y) ∘ y))] => Equation614[x = x ∘ (x ∘ ((x ∘ x) ∘ x))]
Equation1023[x = x ∘ ((x ∘ (x ∘ y)) ∘ y)] => Equation1020[x = x ∘ ((x ∘ (x ∘ x)) ∘ x)]
Equation4480[x ∘ (y ∘ y) = (y ∘ x) ∘ y] => Equation4380[x ∘ (x ∘ x) = (x ∘ x) ∘ x]

Nicholas Carlini (Sep 27 2024 at 14:41):

(There are 195 equations left with only one implication to resolve, and 55 with 2 to resolve, 76 with 3 to resolve.)

Daniel Weber (Sep 27 2024 at 14:41):

Isn't the first one completely trivial by setting y = x?

Nicholas Carlini (Sep 27 2024 at 14:42):

I haven't checked any of these by hand. It's just whether or not it's automatically found a refutation.

Daniel Weber (Sep 27 2024 at 14:42):

(deleted)

Nicholas Carlini (Sep 27 2024 at 14:45):

Should I make this its own thing for "statements that are almost finished resolving?" This approach already completely resolves 42 of the equations finding a counterexample for every one of them (except for x=>x and itself). Are these the "prime" equations?

Joachim Breitner (Sep 27 2024 at 15:40):

I’ve thrown in the larger files into my lean PR, but processing all these files does take a long time, now that we have Z/8Z, so an equation with four variables has 4096 instantiations.

I expect we need some selection of the data, i.e. which magmas tell us an interesting antiimplication (one not already implied by the known ones), and for a given magma only prove (or refute) those equations that help us to show these antiimplications. No need to prove that a certain equation holds or doesn’t hold if we are not using that result.

Terence Tao (Sep 27 2024 at 15:49):

I have work duties that will not allow me to get back to this project for several horus, but just wanted to say that this is fantastic progress, will try to catch up on it later!

Notification Bot (Sep 27 2024 at 17:52):

4 messages were moved from this topic to #Equational > Refutations using Z3 by Joachim Breitner.

Nicholas Carlini (Sep 28 2024 at 06:05):

I've tried various hacks, and can slightly increase the number of refutations, by maybe 1% or so (which is I guess a large number in absolute terms) but straightforward tricks aren't successfully brining me beyond 63%. So I think the initial run basically saturates what you can do without something more clever.

Joachim Breitner (Sep 28 2024 at 14:13):

Before merging this I’d like to reduce the amount of work done. In particular, for a magma that satisfies equations [1,2,3] and refutes [4,5,6], if we know the implication 2 → 3 and 4→5, I’d like to only include [1,2] and [5,6] in the set of equations to test in the proof.

For that I need a file with a list of implications that we know already. Is that stored or accessible already somewhere, or is that still coming?

(Probably this is much more useful after equational#23 is merged, which adds a huge number of implications.)

Terence Tao (Sep 28 2024 at 14:15):

I guess equational#29 has the latest on this

Joachim Breitner (Sep 28 2024 at 14:19):

Thanks! I’ll wait for that to resolve.

Edward van de Meent (Sep 28 2024 at 14:24):

relatedly, i think there are some implications in the Subgraph.lean file that are implied by other implications in the file. for example, the implications 2->3, 3->8 and 2->8 all exist... should we try and make sure these triangles don't occur in the Subgraph subproject?

Shreyas Srinivas (Sep 28 2024 at 14:26):

There is no PR on that issue yet

Daniel Weber (Sep 28 2024 at 14:26):

There could be a linter for that, as well as for making sure that conjectures don't create contradictions

Edward van de Meent (Sep 28 2024 at 14:29):

introducing an additional node will usually do this, should we aim to cleanup in the same PR that adds the implications?

Daniel Weber (Sep 28 2024 at 14:31):

How do we check for redundant non-implications?

Daniel Weber (Sep 28 2024 at 14:33):

we have implicit_refutations = implicit_implicationsᵗ explicit_refutations implicit_implicationsᵗ (this is matrix multiplication over the boolean semiring), if I'm not mistaken, but I'm not sure how to minimize explicit_refutations there

Edward van de Meent (Sep 28 2024 at 14:37):

if a non-implication is of the form not n -> m, then for the "smallest" k with k -> n and k -> m, k -> n must be minimal. (otherwise an implication must be missing)

Edward van de Meent (Sep 28 2024 at 14:38):

i'm not sure this is a sufficient condition to the not-implication being non-redundant, however...

Daniel Weber (Sep 28 2024 at 14:40):

We can create two copies of the equations, and build a graph by:

  • For an implication a -> b, add copy1(b) -> copy1(a) and copy2(b) -> copy2(a)
  • For an non-implication not (a -> b) add copy1(a) -> copy2(b)

And I'm pretty sure the implicit implications/non-implications are exactly the transitive closure of this (so for non-redundancy we want it to be transitively reduced)

Dániel Varga (Sep 28 2024 at 20:46):

Hi Team! I'm doing something very similar, namely, exactly what Joachim suggested. Here's a short writeup: https://github.com/teorth/equational_theories/issues/50

Have you settled on the best way for interfacing between the two worlds? My counterexamples take the form of small (currently 2x2 or 3x3) multiplication tables. How hard would it be to turn those into Lean proofs?

Nicholas Carlini (Sep 28 2024 at 20:50):

I also don't know Lean at all, and so what I did was just generate my counterexamples and say "Here they are!" and the wonderful people who understand this Lean stuff figured out a way to turn them into formalizable objects. One thing that will probably be useful to start is to split out refutations on a per-table basis, and keep only the tables that are needed for that.

Also: What do you mean by "multiplication tables"? You mean just the operator is a lookup into the table?

Dániel Varga (Sep 28 2024 at 21:04):

Nicholas Carlini said:

Also: What do you mean by "multiplication tables"? You mean just the operator is a lookup into the table?

Yes, if I understand your terminology correctly. For example this 3x3 matrix defines the operation x+y := x if y=0 else 0.

x = x + (y + x) is implied by x = x + (x + y) for 2-magmas, but not for 3-magmas.
counterexample:
[[0 0 0]
 [1 0 0]
 [2 0 0]]

Nicholas Carlini (Sep 28 2024 at 21:09):

Oh got it okay. I've brute forced up to all 4x4 tables and found they were almost completely a subset of the initial "generating random polynomials". Your ocde would be great way to double check that we've got the same values for n=3

Dániel Varga (Sep 28 2024 at 21:23):

My run has only included equations with up to 4 variables. I intend to re-run it with the full set of equations, and then we can directly compare, but in the meanwhile, here are some stats:

S matrix about which 3-magma satisfies which equation: (4347, 19683) matrix with 508228 true elements.
Number of true implications for 3-magmas: 7082298 out of 4347*4347=18896409, a ratio of 0.6252 falsified.

S matrix about which 2-magma satisfies which equation: (4347, 16) matrix with 9446 true elements.
Number of true implications for 2-magmas: 7785726 out of 4347*4347=18896409, a ratio of 0.5880 falsified.

Terence Tao (Sep 28 2024 at 21:25):

By the way are you only considering one magma from each isomorphism class? This may slightly speed up the runtime, and there is no point considering multiple magmas from a given class as they won't eliminate any new implications.

Nicholas Carlini (Sep 28 2024 at 21:25):

I just brute forced over all 4^(4*4) tables.... computers are fast, coding is slow(er)

Nicholas Carlini (Sep 28 2024 at 21:26):

(This will not work at n=5)

Terence Tao (Sep 28 2024 at 21:28):

I think the links in https://oeis.org/A001329 might lead to a list of magmas up to isomorphism up to a given size, but I haven't checked

Terence Tao (Sep 28 2024 at 21:33):

Though, given that even after isomorphism there are still 2483527537094825 different magmas of order 5, brute force here is never going to be feasible

Dániel Varga (Sep 28 2024 at 21:33):

Yes, I've brute forced n=3, and postponed n=4 because my computer is probably slower than Nicholas'. :) The 120-fold speedup at n=5 does not buy us anything.

Terence Tao (Sep 28 2024 at 21:35):

It could maybe be possible if one fixes the equation X first and then searches over magmas that obey X, then see what other equations Y they fail to obey. The constraint X may allow for quite a pruning of the tree of possible magmas.

Terence Tao (Sep 28 2024 at 21:36):

(This is hard to code though.)

Terence Tao (Sep 28 2024 at 21:38):

In any case, randomly generating say 10000 magmas may already get 95% of the anti-implications that are attainable with this approach. (It's the last 5% which are probably the most interesting, where there is a counterexample but it is highly structured and not visible from a random search.)

Terence Tao (Sep 28 2024 at 21:38):

At some point I could imagine reinforcement learning being deployed on the most stubborn anti-implications, but it will be a while before we have boiled off enough of the easy ones that we can locate those ones

Nicholas Carlini (Sep 28 2024 at 21:40):

I can try to find which of the relations are most interesting, in that they're only solved by a small number of magmas.

Dániel Varga (Sep 28 2024 at 21:54):

If Nicholas' operation is defined by a 2-variable polynomial, then an equation is equivalent to a system of polynomial equations over the coefficients, right? Could we solve those using Grobner bases?

Vlad Tsyrklevich (Sep 28 2024 at 22:54):

@Nicholas Carlini You mentioned earlier you generated another ~3% of refutations, is that correct? Could you share those new equations?

Nicholas Carlini (Sep 28 2024 at 22:55):

I think it's ~1%, but, these are all in very different weird formats though. Some have conditionals, some have some binary operators, some are from tables... I'm not sure if you want them all because they're not just polynomials like the current set

Vlad Tsyrklevich (Sep 28 2024 at 22:55):

For 1% and the additional work it's probably not worth it. Mostly I was going to just update my pairs to avoid iterating over while bruteforcing, and maybe trying to visualize some things

Nicholas Carlini (Sep 28 2024 at 22:56):

I'm trying to do a cleaner version of these (e.g., going through my brute forced 4x4 tables) and will try to collect these well at some point in the future

Nicholas Carlini (Sep 28 2024 at 23:32):

Terence Tao said:

(It's the last 5% which are probably the most interesting, where there is a counterexample but it is highly structured and not visible from a random search.)

If we define a relationship as "interesting" whenever there's only one magma that shows a=/>b, then there are a lot of "boring" "interesting" relationships. This is true because there are several equations where only one magma satisfies them. For example, Equation2613 x = (y ∘ ((z ∘ z) ∘ w)) ∘ x is only satisfied by f(a,b)=b. And so it has many interesting outgoing edges even though they're all just because this is a boring equation.

Maybe there's a better definition of interesting? If there are two equations A,B where A is true on many magmas, and B is false on many, but A=/>B is only true for a small number? Something else that makes it interesting?

Terence Tao (Sep 28 2024 at 23:45):

Yeah, as a first approximation I guess one wants P( A true and B false ) to be positive, but much smaller than P(A true ) * P(B false), to qualify as "interesting".

Nicholas Carlini (Sep 29 2024 at 00:44):

The C code to brute force the 4696 * 2^32 tables is considerably faster than the python code to just load the processed outputs... Preliminarily, here's a fun one. Equation3066 =/> Equation4585

Equation3066[x = (((x ∘ y) ∘ x) ∘ x) ∘ y]
Equation4585[(x ∘ x) ∘ x = (x ∘ y) ∘ y]

differ in exactly two tables, even though the Equation3066 is true in 10,053 cases

3 0 2 1
1 2 0 3
0 3 1 2
2 1 3 0

and

3 1 0 2
0 2 3 1
2 0 1 3
1 3 2 0

Vlad Tsyrklevich (Oct 14 2024 at 17:56):

Nicholas Carlini said:

I think it's ~1%, but, these are all in very different weird formats though. Some have conditionals, some have some binary operators, some are from tables... I'm not sure if you want them all because they're not just polynomials like the current set

The mention of conditionals here made me wonder if these old runs may have some current unknowns. The formula to solve https://leanprover.zulipchat.com/#narrow/stream/458659-Equational/topic/1648.20!.3D.3E.20206 was so trivial, that I was really quite surprised that it hadn't been discovered already. I remembered that you had mentioned conditionals, so it made me wonder if you may have hit this refutation yourself before. If the data is available and it's not too much trouble, you're welcome to upload it here or somewhere else and I'd be happy to comb through it if you're busy. No worries if it's lost to time, I may look at doing a simple tool myself to look for cases like this.

Terence Tao (Oct 14 2024 at 18:12):

The new example is an infinite model, though. Not sure if there is a finite version, it seems to be important that the underlying group is totally ordered.

Vlad Tsyrklevich (Oct 14 2024 at 18:13):

Indeed, I had forgotten that the original work was operating over finite magmas. That makes sense, otherwise it would be more involved to test for symbolic equality.


Last updated: May 02 2025 at 03:31 UTC