Zulip Chat Archive

Stream: Equational

Topic: visualization of graph closure


David Renshaw (Oct 01 2024 at 13:27):

outcomes0.png

David Renshaw (Oct 01 2024 at 13:27):

^ A rendering of the current output of lake exe extract_implications outcomes equational_theories

David Renshaw (Oct 01 2024 at 13:28):

explicitly proved implications are bright green. implicitly proved implications are dim green.
explicitly disproved implications are bridge red. implicitly disproved implications are dim red.
conjectures are purple. Not many of those.

Notification Bot (Oct 01 2024 at 13:30):

3 messages were moved here from #Equational > Some more implication visualizations by David Renshaw.

David Renshaw (Oct 01 2024 at 13:31):

The ordering of the rows and columns are not yet canonical. Fixing that is my next task.

Joachim Breitner (Oct 01 2024 at 13:35):

Nice to see so much area covered already -)

Amir Livne Bar-on (Oct 01 2024 at 13:38):

What do the (nearly) black columns mean? Identities that are probably very weak but we have no proof of that?

David Renshaw (Oct 01 2024 at 13:38):

put up a draft pull request: https://github.com/teorth/equational_theories/pull/167

David Renshaw (Oct 01 2024 at 13:39):

what I really want with this is some way that I can navigate the data!

David Renshaw (Oct 01 2024 at 13:39):

like, it would be cool if Amir could hover a mouse pointer over a black column and get information about it

David Renshaw (Oct 01 2024 at 13:50):

outcomes1.png

David Renshaw (Oct 01 2024 at 13:50):

^ ordering canonicalized

Johan Commelin (Oct 01 2024 at 14:01):

I've seen that picture before!

David Renshaw (Oct 01 2024 at 14:13):

It looks like the whole diagonal is black, meaning unknown

David Renshaw (Oct 01 2024 at 14:13):

rather than green, as it should be

David Renshaw (Oct 01 2024 at 14:14):

I think that means there's an easy set of results to add to equational_theories/Generated

Edward van de Meent (Oct 01 2024 at 14:14):

the difference between transitive closure and reflexive-transitive closure

Joachim Breitner (Oct 01 2024 at 14:14):

I don’t think we have explicit proofs of the diagonal, and I don’t think it’s useful to add these proofs. Just include them as implied implications in the picture.

Edward van de Meent (Oct 01 2024 at 14:15):

this could be a metatheorem :upside_down:

David Renshaw (Oct 01 2024 at 14:15):

why isn't it useful? I thought we wanted to prove everything.

David Renshaw (Oct 01 2024 at 14:16):

It's not like there's a huge number of them, compared to other sorts of things

David Renshaw (Oct 01 2024 at 14:17):

one could write a macro and have this take up very little source code space

Edward van de Meent (Oct 01 2024 at 14:17):

it's trivially true, that's why. and i imagine they clog up calculating the transitive closure.

Joachim Breitner (Oct 01 2024 at 14:19):

David Renshaw said:

why isn't it useful? I thought we wanted to prove everything.

I don’t think we want to prove everything. We want to prove enough so that we know the full graph, but just like we don’t prove transitive implications, we shouldn’t prove symmetrical implications.

Joachim Breitner (Oct 01 2024 at 14:20):

I’d also say we don’t prove implications or anti-implications where we already have the dual one, and instead move that information into the result-processing stage.

David Renshaw (Oct 01 2024 at 14:23):

makes sense

David Renshaw (Oct 01 2024 at 14:25):

outcomes2.png

David Renshaw (Oct 01 2024 at 14:26):

^ drew the diagonal as "implicit_proof_true"

David Renshaw (Oct 01 2024 at 14:28):

I think it would be very cool to render an up-to-date version of this on a webpage and to make it explorable somehow. (hover to get info about a pixel, maybe choose between different orders on indices, zoom in, etc)

David Renshaw (Oct 01 2024 at 14:31):

Hm... I wonder why red stripes can be both vertical and horizontal, but it seems that green stripes can only be vertical. What's the math explanation for that?

Daniel Weber (Oct 01 2024 at 14:33):

A horizontal green stripe would mean that something is implied by many equations, right? That seems like a very rare property, compared to a vertical green stripe (implies a lot of things), a vertical red stripe (doesn't imply a lot of things), and a horizontal red stripe (a lot of things don't imply it)

Terence Tao (Oct 01 2024 at 14:49):

Thanks! I just created a list of visualizations in README.md (similar to the "recent progress", please add to it when submitting your PR to link these files. (I have lists both for single-shot visualizations generated at one point in time, and continuously updated visualizations.)

David Renshaw (Oct 01 2024 at 15:07):

out.png

David Renshaw (Oct 01 2024 at 15:08):

now using blue instead of green, after being informed that might work better for colorblind folks

David Renshaw (Oct 01 2024 at 16:12):

ready for review equational#167

David Renshaw (Oct 01 2024 at 16:15):

I added a --scale parameter, so that a smaller visualization of something like Subgraph.lean looks good. (Zooming in on a tiny png typically looks bad due to antialiasing).

David Renshaw (Oct 01 2024 at 16:15):

subgraph.png

David Renshaw (Oct 02 2024 at 00:38):

output.gif

David Renshaw (Oct 02 2024 at 00:39):

^ zoomed in 300x300 pixel area showing some of the changes from @Vlad Tsyrklevich's equational#178

David Renshaw (Oct 02 2024 at 00:40):

It looks to me like all the changes (even when I zoom out) are filling in existing dashed vertical blue lines and making them solid.

David Renshaw (Oct 02 2024 at 00:40):

I wonder what the means about the math!

Vlad Tsyrklevich (Oct 02 2024 at 06:04):

I realize now that the original bruteforce scripts did not make use of symm, so there were certain identities it would never solve that were fairly trivial

Vlad Tsyrklevich (Oct 02 2024 at 06:13):

In particular, I increased the size of the Equation 2 equivalence class by another ~140 equations, and perhaps that's what's visible there?

David Renshaw (Oct 04 2024 at 02:48):

I fixed a bug in index order canonicalization (see equational#268), and now the status image looks much more regular:
outcomes.png

David Renshaw (Oct 04 2024 at 03:00):

Screenshot from 2024-10-03 23-00-31.png

David Renshaw (Oct 04 2024 at 03:01):

I like these secondary diagonals, but I don't know what they mean.

David Renshaw (Oct 04 2024 at 11:58):

oops, fixed another indexing bug (equational#274), and now the images look even more regular:
outcomes2.png

Johan Commelin (Oct 04 2024 at 11:59):

I guess we need formally verified visualisations :rofl:

David Renshaw (Oct 04 2024 at 12:02):

more practically, perhaps Closure.lean should at least sort the equation numbers in its indexing. (some related discussion over here)

Zoltan A. Kocsis (Z.A.K.) (Oct 04 2024 at 12:20):

@Johan Commelin You're joking, but the first release of the formally verified Pitts quantifier calculator -- an algorithm so difficult that nobody could implement it correctly without formal verification despite repeated attempts in the last 30 years -- gave wrong outputs for related reasons.

The output it gave was sometimes, but not always, wrong. I only noticed because I happened to know the Pitts interpolant of a specific formula. I double-checked everything, but "Dear Professor, This sounds crazy but I'm pretty sure your formally verified program is giving wrong outputs..." was still one of the harder emails of my career :) The issue turned out to be a bug in the unverified expression printing which put some parentheses in wrong places.

Johan Commelin (Oct 04 2024 at 13:12):

Great story :grinning:

Terence Tao (Oct 04 2024 at 14:26):

This is actually an important insight. Our project has a formally verified Lean backend but we are going to interface with it through a variety of front ends written in a variety of languages. We already had one major (but thankfully very temporary) issue with an off-by-one error temporarily scrambling the non-verified (conjectural) portion of the ground truth data set but there is also the possibility of even the verified data set being incorrectly rendered when converted to human-accessible form. I guess “unit tests” are a common validation here? We could try to see if various formally verified (or verifiable) phenomena, such as duality symmetry, actually show up in our data visualizations.

David Renshaw (Oct 04 2024 at 14:29):

In this particular case, I tried zooming in on some black ("unknown") pixels, grabbed their pixel coordinates, translated those to equation numbers (just add 1 to each) and then checked their status in https://nicholas.carlini.com/tmp/view.html.

David Renshaw (Oct 04 2024 at 14:30):

... and they seemed to line up.

Terence Tao (Oct 04 2024 at 14:32):

Here is a concrete suggestion. At https://github.com/teorth/equational_theories/blob/main/data/dual_equations.md is a list of equations and their dual, generated by https://github.com/teorth/equational_theories/blob/main/scripts/find_dual.py . This can be used to order the 4694 equations so that duality becomes reflection symmetry (e.g., the self-dual equations would be in the middle, and then dual pairs added around that self-dual core). Then the above visualizations should also exhibit a reflection symmetry (this will hopefully be more manifest if one figures out a nice ordering of the self-dual equations and dual pairs that gives the graph a bit more structure, e.g., putting simpler equations on one end. The visual symmetry might also give more pleasing images as well as being a nontrivial unit test. (Note: our graph is not currently guaranteed to exhibit duality symmetry, as we have not implemented equational#174, but it should be pretty close to such, and this visualization is in fact a very nice way of seeing how close we are to being closed under duality, and whether we can get a quick win from implementing equational#174 .)

Terence Tao (Oct 04 2024 at 14:39):

Here is another concrete suggestion, starting with a heuristic. To specify a finite magma of size n one needs to specify n^2 quantities, each of which have n choices, leading to n^(n^2) such magmas. Intuitively there are n^2 "degrees of freedom" here, one for each entry of the multiplication table. If one has a law involving k variables, this is basically specifying about n^k "constraints" on these n^2 variables. So the total "degrees of freedom" a law provides should be roughly n^2 - n^k, ignoring the extremely complex issue of what dependencies between the constraints there are. Still, this suggests the following predictions:

  • k=1 laws are underdetermined - they should have a ton of finite magmas obeying them.
  • k=2 laws are determined - some of them may have very few finite instances, some may have a moderately large number, but nowhere near what the k=1 type equations would have
  • k>2 laws are overdetermined - most of them should be trivial, particularly for very large k, but some may have some sporadic solutions, and a few wil have exceptional algebraic structure and have a lot of solutions.

So if we order the laws by k (as opposed to the current system of ordering lexicographically by length of LHS and RHS) we may get a more structurally interesting visualization.

It might also be worth visualizing the finite magma data somehow using this classification by the number k of variables to see to what extent this heuristic is valid.

More generally, we have enough data now that experimental mathematics is in play: we can try to start testing heuristic hypotheses empirically. It is perhaps ironic (but potentially quite wonderful) that formal proof assistants can in fact enable one of the least formal (but still very useful) modes of mathematics that we have...

Terence Tao (Oct 04 2024 at 14:40):

Let me go ahead and make formal project tasks out of these two assertions. One sec. UPDATE: equational#277 and equational#278

David Renshaw (Oct 04 2024 at 15:14):

a lot less bright red after the transitive closure pruning that just landed:
outcomes3.png

Amir Livne Bar-on (Oct 04 2024 at 15:20):

Is there still interest in sorting the equations in the program output?
https://github.com/teorth/equational_theories/pull/280

Zoltan A. Kocsis (Z.A.K.) (Oct 04 2024 at 15:54):

Terence Tao said:

t might also be worth visualizing the finite magma data somehow using this classification by the number k of variables to see to what extent this heuristic is valid.

Note that there might not be all that much exhaustive finite magma data for this project yet. We have a full enumeration of 4x4 magmas, but it's not checked in, and we only use 858 finite magmas as counterexamples. Still, let me get this started with the following histograms, based on these 858 finite magmas.

858-finite-magmas-histograms.png

The histograms show how many equations there are which are satisfied in a given number of magmas. The x axis does not actually go below 0, that's just how a bin that includes 0 is visualized.

Note that this particular dataset is not just small, but might be quite skewed. E.g. a lot of magmas satisfying Equation3 (x=xx) won't feature.

Nicholas Carlini (Oct 04 2024 at 16:02):

Zoltan A. Kocsis (Z.A.K.) said:

We have a full enumeration of 4x4 magmas, but it's not checked in

I thought it was checked in, right? Or at least it looks like some of them are, are others not
https://github.com/teorth/equational_theories/blob/main/equational_theories/Generated/All4x4Tables/Refutation176.lean

Zoltan A. Kocsis (Z.A.K.) (Oct 04 2024 at 16:05):

@Nicholas Carlini : Sorry, I misspoke. Your refutations4x4.txt is of course checked in and used, but that's not the list of all 4x4 magmas (which is massive), it's a close-to-minimal subset which proves the same set of anti-implications as the full ist.

Terence Tao (Oct 04 2024 at 16:05):

That preliminary data already indicates that the number of variables plays a role. One may wish to first reduce by equivalence to get sharper correlations; there are some equations with many variables that are actually just low-variable equations in disguise. And x=x and x=y are somewhat degenerate and should probably be removed from the analysis. Still it does indicate that we should get some meaningful empirical observations by exploring the relation between finite magma solvability and number of "degrees of freedom" (where one may need to update the naive metric of "number of variables" to something more subtle).

Nicholas Carlini (Oct 04 2024 at 17:11):

Zoltan A. Kocsis (Z.A.K.) said:

Nicholas Carlini : Sorry, I misspoke. Your refutations4x4.txt is of course checked in and used, but that's not the list of all 4x4 magmas (which is massive), it's a close-to-minimal subset which proves the same set of anti-implications as the full ist.

Oh okay yeah that's right. We could potentially check in one magma per equivalence class (meaning: proves the exact same set of equations). That's only ~500k I think of them. But that still is quite a lot of text to be storing in the repo.

David Renshaw (Oct 05 2024 at 02:37):

outcomes5.png
things are looking very filled in now that equational#281 has landed

Terence Tao (Oct 05 2024 at 03:05):

Nice! You might consider uploading a gallery of some images (marked with dates) to the repo for visitors to browse through.

Nicholas Carlini (Oct 05 2024 at 03:07):

Do I have it right that this has brought down the number of unknown equations to ~30k?

Nicholas Carlini (Oct 05 2024 at 03:54):

Here's (currently) the longest chain of implications where A=>B=>C=>D where at no point does B=>A, or C=>B etc

Equation 2: x = y
Equation 4: x = x ∘ y
Equation 1037: x = x ∘ ((y ∘ (x ∘ x)) ∘ z)
Equation 1237: x = x ∘ (((x ∘ y) ∘ z) ∘ w)
Equation 103: x = x ∘ ((x ∘ y) ∘ z)
Equation 824: x = x ∘ ((x ∘ y) ∘ (x ∘ z))
Equation 9: x = x ∘ (x ∘ y)
Equation 418: x = x ∘ (x ∘ (y ∘ (x ∘ z)))
Equation 48: x = x ∘ (x ∘ (x ∘ y))
Equation 412: x = x ∘ (x ∘ (x ∘ (x ∘ y)))
Equation 3: x = x ∘ x
Equation 3715: x ∘ y = (x ∘ x) ∘ (y ∘ y)
Equation 375: x ∘ y = (x ∘ x) ∘ y
Equation 4118: x ∘ y = ((x ∘ x) ∘ x) ∘ y
Equation 4065: x ∘ x = ((x ∘ x) ∘ x) ∘ x
Equation 1: x = x

Note that this chain could either get longer (if a new intermediate equation is found) or shorter (if we discover one of these equations implies one of the earlier ones).

Daniel Weber (Oct 05 2024 at 03:57):

What's the latest equation each equation in this chain is known to not imply?

Nicholas Carlini (Oct 05 2024 at 03:59):

Oh I didn't even check

Nicholas Carlini (Oct 05 2024 at 04:01):

The only possible backwards edges are 418 =?> 9 and 412=?>48

Nicholas Carlini (Oct 05 2024 at 04:01):

The rest are all filled in (proven | conjectured)

Johan Commelin (Oct 05 2024 at 04:39):

412 can't imply 48, by a similar string-concat-with-a-twist example as I gave in another thread (sorry, on mobile)

Pace Nielsen (Oct 05 2024 at 05:32):

418 can't imply 9, by the following 5x5 counterexample: [[0,2,3,0,0],[4,1,4,4,1],[2,2,2,2,2],[3,3,3,3,3],[4,4,4,4,4]]. (In case this notation isn't clear, the underlying set of the magma is {0,1,2,3,4}, and the product 0*1=2 shows up in the first line of the array.) Equation 9 doesn't hold taking x=0 and y=1.

Amir Livne Bar-on (Oct 05 2024 at 08:27):

If we only take implications which are proven to be strict, this is a maximally long chain:
2 => 5 => 2499 => 2415 => 238 => 2716 => 28 => 2973 => 270 => 3 => 3715 => 375 => 359 => 4065 => 1
(one item shorter than the one above)

Amir Livne Bar-on (Oct 05 2024 at 08:35):

I looked into the composition of the 29651 unknown implications.
These statistics include the order-5 and order-6 laws, which we have almost no data about. Without them there are 9520 unknowns, or 4819 up to equivalence.

Daniel Weber (Oct 05 2024 at 08:37):

I've thought a bit about that, because they only hold nontrivially in infinite models, one direction of the implication should be fairly simple to resolve, by just adding them to some Fact statements. The other direction seems more complex

Will Sawin (Oct 05 2024 at 11:31):

The sequence 4, 9, 48, 412 continues to an infinite sequence of longer equations which all imply the next one but all are distinct. Probably Johan's example works for this. I did it with n-bit strings where x ∘ y is obtained by taking bits from x until after the first bit that differs in x and y and then taking bits from y. I wonder if there is any equation that is between every member of that sequence and 3, my guess is not.

Johan Commelin (Oct 05 2024 at 11:41):

@Will Sawin Would you mind explaining your example in a bit more detail?

Pace Nielsen (Oct 05 2024 at 15:49):

Will Sawin said:

The sequence 4, 9, 48, 412 continues to an infinite sequence of longer equations which all imply the next one but all are distinct. Probably Johan's example works for this. I did it with n-bit strings where x ∘ y is obtained by taking bits from x until after the first bit that differs in x and y and then taking bits from y. I wonder if there is any equation that is between every member of that sequence and 3, my guess is not.

Here is another way to construct an example. Define a magma with underlying set {0,1,2,...,n}, with the binary operation given by the rule that when a\neq 0 we simply take ab=a , but when a=0 we take ab=b+1 for 0<b<n, and finally we take 00=0n=0. (So, essentially, only the top row is nontrivial, and that row would be a simple +1 shifting except that 0*0 is not 1.)

(Another similar example that also works is to set ab=b+1 (mod n) when a\neq b, and to set aa=a.)

Terence Tao (Oct 05 2024 at 16:05):

Is there a fixed n for which this example already rules out implications? We could add it to the list of finite magmas.

Will Sawin (Oct 05 2024 at 18:05):

Pace's examples are probably simpler, but the basic idea is for x circ y to have every bit agree with y except one, which is shifted to the corresponding bit of x. Specifically, we take the first bit where x and y differ and flip that bit in y. Iterating this process enough n times, with n the total number of bits, you get x, but iterating it n-1 times doesn't always work if x and y differ by n bits.

Will Sawin (Oct 05 2024 at 18:11):

Terence Tao said:

Is there a fixed n for which this example already rules out implications? We could add it to the list of finite magmas.

Either of Pace's examples on n+1 elements takes n operations of x to converge. So the n=1 example with 2 elements shows Equation 4 does not imply Equation 2 , the n=2 example with 3 elements shows Equation 9 does not imply Equation 4, the n=3 example with 4 elements shows Equation 48 does not imply Equation 9, and the n=4 example with 5 elements shows equation 412 does not imply equation 48. The next example has five multiplications so isn't part of the project, but the magmas with 2 through 5 elements could be added to the list of finite examples if they aren't already.

Isak Colboubrani (Oct 06 2024 at 17:31):

Pace Nielsen said:

418 can't imply 9, by the following 5x5 counterexample: [[0,2,3,0,0],[4,1,4,4,1],[2,2,2,2,2],[3,3,3,3,3],[4,4,4,4,4]].

Verified:

% scripts/explore_magma.py '[[0,2,3,0,0],[4,1,4,4,1],[2,2,2,2,2],[3,3,3,3,3],[4,4,4,4,4]]' --ids 9,418 --verbose

0 1 2 3 4
0 0 2 3 0 0
1 4 1 4 4 1
2 2 2 2 2 2
3 3 3 3 3 3
4 4 4 4 4 4
    9. x = x  (x  y) 
       0  0  (0  1) # counterexample with x=0, y=1
       0  0  2
       0  3

  418. x = x  (x  (y  (x  z))) 
       3 = 3  (3  (2  (3  0))) # example with x=3, y=2, z=0
       3 = 3  (3  (2  3))
       3 = 3  (3  2)
       3 = 3  3
       3 = 3

Last updated: May 02 2025 at 03:31 UTC