Zulip Chat Archive

Stream: Equational

Topic: Pruning counter examples based on implications


Joachim Breitner (Sep 30 2024 at 10:44):

Things are falling into place. Now that the equations_result attribute exists and is added to the autogenerated files, including those that prove trivial implications, I can prune the auto-generated counter-examples using that data, and it’s very effective:

Pruning by implication removed 476896 facts to check, down from 511646, leaving 34750.

That’s down by 93%!

(Assuming I didn’t make a mistake here, or get the implications wrong.)

So I strongly suggest that everyone working on counter-examples prunes the generated Lean files this way, to keep project processing time down.

Joachim Breitner (Sep 30 2024 at 10:48):

See https://github.com/nomeata/equational_theories/blob/finite_poly_refutation/equational_theories/Generated/FinitePoly/src/finite_poly_generate_lean.py for how I read the implications.json, calculate the transitive closure relatively efficiently using adjacency sets (not lists!), and then prune the data using this logic:

def prune_row(data):
    satisfied = set()
    for i in data["satisfied"]:
        # already implied
        if implying[i].intersection(satisfied):
          continue
        # remove all implied by this
        satisfied = satisfied - impliedBy[i]
        satisfied.add(i)
    refuted = set()
    for i in data["refuted"]:
        # already ruled out
        if impliedBy[i].intersection(refuted):
          continue
        # remove all kept equations that this refutation rules out
        refuted = refuted - implying[i]
        refuted.add(i)
    data["satisfied"] = sorted(satisfied)
    data["refuted"] = sorted(refuted)
    return data
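By way of illustration, here is a minimal self-contained sketch of how the two passes behave (the chain 1 ⇒ 2 ⇒ 3 and all equation numbers are made up; the closure sets are written out by hand rather than computed):

# toy transitive closure for the made-up chain 1 => 2 => 3
impliedBy = {1: {2, 3}, 2: {3}, 3: set()}  # impliedBy[i]: everything i implies
implying  = {1: set(), 2: {1}, 3: {1, 2}}  # implying[j]: everything that implies j

row = prune_row({"satisfied": [2, 3], "refuted": []})
print(row["satisfied"])  # [2] -- 3 is dropped, it already follows from 2

row = prune_row({"satisfied": [], "refuted": [1, 2]})
print(row["refuted"])    # [2] -- 1 is dropped, refuting 2 already refutes 1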

Joachim Breitner (Sep 30 2024 at 11:10):

Also, with the generated implications the data is now so large that the code I found in the scripts directory to calculate the transitive closure was too slow. The following worked well, and yields a perhaps more convenient representation as adjacency sets in both directions:

# we have 4694 equations
full = range(1,4694+1)

with open(f"{dir}/src/implications.json") as f:
  implications = json.load(f)["implications"]
implications = [ (int(i["lhs"].removeprefix("Equation")), int(i["rhs"].removeprefix("Equation"))) for i in implications ]

print("Number of implications:", len(implications))

impliedBy = { i : set() for i in full }
implying = { j : set() for j in full }

for i, j in implications:
  impliedBy[i].add(j)
  impliedBy[i] |= impliedBy[j]
  implying[j].add(i)
  implying[j] |= implying[i]

Daniel Weber (Sep 30 2024 at 11:11):

There's also code in equational#126 for transitive closure

Vlad Tsyrklevich (Sep 30 2024 at 11:19):

Joachim Breitner said:

Also, with the generated implications the data is now so large that the code I found in the scripts directory to calculate the transitive closure was too slow. The following worked well, and yields a perhaps more convenient representation as adjacency sets in both directions:

# we have 4694 equations
full = range(1,4694+1)

with open(f"{dir}/src/implications.json") as f:
  implications = json.load(f)["implications"]
implications = [ (int(i["lhs"].removeprefix("Equation")), int(i["rhs"].removeprefix("Equation"))) for i in implications ]

print("Number of implications:", len(implications))

impliedBy = { i : set() for i in full }
implying = { j : set() for j in full }

for i, j in implications:
  impliedBy[i].add(j)
  impliedBy[i] |= impliedBy[j]
  implying[j].add(i)
  implying[j] |= implying[i]

If you're talking about the Ruby code, it's fast enough if you first run the transitive reduction (I should have made that mandatory, but I just left it as a comment in case others wanted to avoid running the reduction twice)
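For context, a rough sketch of what running the transitive reduction first amounts to (this is not the repository's Ruby code, just a toy version that assumes the graph is acyclic, e.g. after collapsing equivalence classes):

from collections import defaultdict

def transitive_reduction(pairs):
    # drop every edge (a, b) where b is still reachable from a
    # via some longer path (assumes the input graph is acyclic)
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)

    def reachable_indirectly(src, dst):
        # reachability from src to dst without using the direct edge src -> dst
        stack = [s for s in graph[src] if s != dst]
        seen = set(stack)
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            stack.extend(graph[node] - seen)
            seen |= graph[node]
        return False

    return {(a, b) for a, b in pairs if not reachable_indirectly(a, b)}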

Joachim Breitner (Sep 30 2024 at 11:44):

Not the Ruby code; I found this elegant snippet to be too slow:

def transitive_closure(pairs):
    new_pairs = closure = set(pairs)
    while new_pairs:
        new_pairs = {
            (a, d)
            for a, b in new_pairs
            for c, d in pairs
            if b == c
        } - closure
        closure |= new_pairs
    return closure

Daniel Weber (Sep 30 2024 at 11:48):

How fast do you need it to be? I'm currently trying to optimize the Lean code, and I'm not sure how much to optimize it---the current bottleneck is collecting the edges, not computing the closure itself

Vlad Tsyrklevich (Sep 30 2024 at 11:51):

Gotcha. One more question: I'm not actually clear on what you're doing to reduce the number of counterexamples to search. Is it just that you're subtracting the closure of implications, or something else? Just based on the numbers you gave above, I would have thought the search space would be greater than that, e.g. down from ~7m -> ~3.5m. Or perhaps I am just not up to date with how many refutations are now proven.

Joachim Breitner (Sep 30 2024 at 12:00):

I have now extended the script to prune some more: it does not include equations in the Facts to check that would not tell us anything new, taking the previous examples into account. This is my code. I guess the first half is “subtracting the closure of implications”?

def prune_row(data):
    stats["total"] += len(data["satisfied"]) + len(data["refuted"])

    # prune by implications
    satisfied = set()
    for i in data["satisfied"]:
        # already implied
        if implying[i].intersection(satisfied):
          continue
        # remove all implied by this
        satisfied = satisfied - impliedBy[i]
        satisfied.add(i)
    refuted = set()
    for i in data["refuted"]:
        # already ruled out
        if impliedBy[i].intersection(refuted):
          continue
        # remove all kept equations that this refutation rules out
        refuted = refuted - implying[i]
        refuted.add(i)
    stats["removed_by_implication"] += len(data["satisfied"]) + len(data["refuted"]) - len(satisfied) - len(refuted)

    # prune by earlier examples
    satisfied_ = {i for i in satisfied if refuted - notImpliedBy[i] }
    refuted_   = {j for j in refuted   if satisfied - notImplying[j] }

    stats["removed_by_covering"] += len(satisfied) + len(refuted) - len(satisfied_) - len(refuted_)
    satisfied, refuted = satisfied_, refuted_

    for i in satisfied:
      for j in refuted:
        notImpliedBy[i].add(j)
        notImplying[j].add(i)

    data["satisfied"] = sorted(satisfied)
    data["refuted"] = sorted(refuted)
    return data
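To see what the covering prune does, here is a toy run (assuming the prune_row above is in scope; made-up equation numbers with no implications between them, so the implication prune is a no-op and all lookup tables start empty):

from collections import defaultdict

stats = defaultdict(int)
impliedBy = defaultdict(set)
implying = defaultdict(set)
notImpliedBy = defaultdict(set)
notImplying = defaultdict(set)

prune_row({"satisfied": [7], "refuted": [9]})        # records the fact 7 =/> 9
row = prune_row({"satisfied": [7], "refuted": [9]})  # same facts again
print(row)  # {'satisfied': [], 'refuted': []} -- fully covered, nothing new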

Joachim Breitner (Sep 30 2024 at 12:00):

The second check was less effective than I was hoping for:

Out of 619608 facts to check, pruning by implication removed 577899 facts (93.27%), and pruning by covering removed a further 3219 (0.52%), leaving 38490 facts to check.

Joachim Breitner (Sep 30 2024 at 12:02):

I think we can merge https://github.com/teorth/equational_theories/pull/19 now. If we then also merge https://github.com/teorth/equational_theories/pull/130, then these anti-implications will show up in the script that collects our results, and that will answer your question of how fast it needs to be – fast enough to handle that data :-)

Amir Livne Bar-on (Sep 30 2024 at 12:03):

Joachim Breitner said:

Not the Ruby code; I found this elegant snippet to be too slow:

def transitive_closure(pairs):
    new_pairs = closure = set(pairs)
    while new_pairs:
        new_pairs = {
            (a, d)
            for a, b in new_pairs
            for c, d in pairs
            if b == c
        } - closure
        closure |= new_pairs
    return closure

For what it's worth, the reason it's slow is the iteration "for c, d in pairs if b == c". If we had defined

from collections import defaultdict

pairs_idx = defaultdict(list)
for a, b in pairs:
    pairs_idx[a].append(b)

this could be replaced with "for d in pairs_idx[b]".

I made this change locally when the number of implications grew, but thought it was too minor to push. If anyone is interested, this is called the semi-naive algorithm, at least in the Datalog literature.
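Putting the two pieces together, the indexed variant of the snippet would look something like this (a sketch of the change described above, not code from the repository):

from collections import defaultdict

def transitive_closure(pairs):
    # index each node's direct successors once
    pairs_idx = defaultdict(list)
    for a, b in pairs:
        pairs_idx[a].append(b)
    new_pairs = closure = set(pairs)
    while new_pairs:
        # only extend the pairs discovered in the previous round
        new_pairs = {
            (a, d)
            for a, b in new_pairs
            for d in pairs_idx[b]
        } - closure
        closure |= new_pairs
    return closure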

Joachim Breitner (Sep 30 2024 at 12:05):

Right, a dict-of-dict (or dict-of-list) data structure works better, which is probably also why my code is fast enough. Feel free to push it before someone else uses or copies that code :-)

Michael Bucko (Sep 30 2024 at 12:11):

Joachim Breitner said:

Not the Ruby code; I found this elegant snippet to be too slow:

def transitive_closure(pairs):
    new_pairs = closure = set(pairs)
    while new_pairs:
        new_pairs = {
            (a, d)
            for a, b in new_pairs
            for c, d in pairs
            if b == c
        } - closure
        closure |= new_pairs
    return closure

Did you try something closer to DFS?

from collections import defaultdict

def transitive_closure(pairs):
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
    closure = set()
    for a in graph:
        visited = set()
        # seed with a's successors, so (a, a) is only recorded when a lies on a cycle
        stack = list(graph[a])
        while stack:
            node = stack.pop()
            if node not in visited:
                visited.add(node)
                closure.add((a, node))
                stack.extend(graph[node] - visited)
    return closure

Daniel Weber (Sep 30 2024 at 12:13):

Joachim Breitner said:

I think we can merge https://github.com/teorth/equational_theories/pull/19 now. If we then also merge https://github.com/teorth/equational_theories/pull/130, then these anti-implications will show up in the script that collects our results, and that will answer your question of how fast it needs to be – fast enough to handle that data :-)

Currently it takes about 6 seconds, let's see what happens after equational#19

Joachim Breitner (Sep 30 2024 at 12:13):

I’m using this, which works well enough for now and is quite elegant (I think), so no need to discuss variants here :-)

impliedBy = { i : set() for i in full }
implying = { j : set() for j in full }

for i, j in implications:
  impliedBy[i].add(j)
  impliedBy[i] |= impliedBy[j]
  implying[j].add(i)
  implying[j] |= implying[i]

Update: This is not only elegant and fast, but also wrong. Ignore me.

Joachim Breitner (Sep 30 2024 at 12:19):

Daniel Weber said:

Currently it takes about 6 seconds, let's see what happens after equational#19

There is certainly a lot of data there now:

~/build/lean/equational_theories $ .lake/build/bin/extract_implications |wc -l
88066

Daniel Weber (Sep 30 2024 at 12:20):

That's only the direct implications, isn't it?

Daniel Weber (Sep 30 2024 at 12:21):

The transitive closure had about 8M implications/non-implications

Joachim Breitner (Sep 30 2024 at 12:22):

Yes, the extract_implications script just collects the “raw data” of the proven theorems; further downstream scripts should calculate the transitive closure or reduction as needed.

Joachim Breitner (Sep 30 2024 at 12:33):

Just merged equational#19. Enjoy :-)

Leo Shine (Sep 30 2024 at 12:48):

I think it's equational#19 you mean

Joachim Breitner (Sep 30 2024 at 12:48):

Keep getting that wrong :-)

Daniel Weber (Sep 30 2024 at 12:50):

Joachim Breitner said:

Just merged equational#19. Enjoy :-)

Now it finds 18M implications/refutations in 14 seconds

Oliver Nash (Sep 30 2024 at 13:37):

Joachim Breitner said:

Yes, the extract_implications script just collects the “raw data” of the proven theorems; further downstream scripts should calculate the transitive closure or reduction as needed.

Does tooling to consume the JSON which this emits already exist?

Oliver Nash (Sep 30 2024 at 13:38):

(E.g., if I wanted to witness the 18M figure quoted above.)

Daniel Weber (Sep 30 2024 at 13:39):

You can use equational#126 to see it, although it doesn't output as JSON because Lean's toJson stack-overflowed

Kim Morrison (Sep 30 2024 at 13:42):

If someone could provide a standalone example showing the stack overflow, that would be great (unless someone is already on fixing it!)

Daniel Weber (Sep 30 2024 at 13:51):

@Kim Morrison said:

If someone could provide a standalone example showing the stack overflow, that would be great (unless someone is already on fixing it!)

import Lean

open Lean

def main : IO Unit := IO.println (toJson (Array.mkArray 70000 "string1"))

outputs

Stack overflow detected. Aborting.

I assume that might be system-dependent, I'm on Ubuntu 22.04.5

Amir Livne Bar-on (Sep 30 2024 at 14:10):

I analysed the implications graph. We have 3557 equivalence classes. Of the 12M implications between them, 2.5% are known to hold and 72% are known not to hold.
I reduced the remaining conjectures and got 15501 irreducible implications.

Joachim Breitner (Sep 30 2024 at 14:11):

So we are ¾ done? Great, so one more day to go.

Daniel Weber (Sep 30 2024 at 14:13):

Amir Livne Bar-on said:

I analysed the implications graph. We have 3557 equivalence classes. Of the 12M implications between them, 2.5% are known to hold and 72% are known not to hold.
I reduced the remaining conjectures and got 15501 irreducible implications.

Do you happen to have the graph in a form Nauty is able to process? I wonder if there are interesting symmetries

Amir Livne Bar-on (Sep 30 2024 at 14:18):

Do you mean after reducing to equivalence classes? The original graph can be parsed directly from lake exe extract_implications

Daniel Weber (Sep 30 2024 at 14:19):

Yes

Amir Livne Bar-on (Sep 30 2024 at 14:20):

Do you have a link that explains what the format is? Anyway, it should be easy to generate.
If anyone wants to have a look, my code is here:
https://github.com/teorth/equational_theories/pull/144

Daniel Weber (Sep 30 2024 at 14:22):

There's https://pypi.org/project/pynauty/ , although I'm not sure what the output format is

Amir Livne Bar-on (Sep 30 2024 at 14:43):

It's a bit strange to think about this as a graph. The implications form a poset; we only use graphs as a lightweight representation

Daniel Weber (Sep 30 2024 at 14:57):

Posets are graphs :smile:

Amir Livne Bar-on (Sep 30 2024 at 18:44):

I ran some more analyses on the data. I'm less certain of these, but:

  • 3557 is the current number of equivalence classes, but when more implications are proved this may go down. The lower limit we have established is 845
  • The current longest provable path of non-equivalent implications is of length 10. An example of such a path is 2 => 41 => 374 => 4117 => 38 => 311 => 309 => 3467 => 3458 => 3456 => 1
  • If there are no other true implications besides what's proven in the repo now, the longest path of non-equivalent implications has length 19
  • I don't know what the maximum length would be if arbitrary subsets of the unknown implications hold; it's plausible to get longer chains than 19

[EDIT: changed the numbers after bugfixes]

Edward van de Meent (Sep 30 2024 at 18:46):

so in theory we might be able to make do with 18 counterexamples? :upside_down:

Amir Livne Bar-on (Sep 30 2024 at 18:48):

Actually I think my conclusions are inconsistent... There must be fewer equivalence classes if the longest path gets shorter.

EDIT: On third thought, it's possible to imagine situations where adding implications would lengthen the path without joining equivalence classes. If the Hasse diagram of A, B, C, D is a diamond with B and C incomparable, adding an edge between B and C would increase the longest path.

Amir Livne Bar-on (Sep 30 2024 at 20:46):

Amir Livne Bar-on said:

I analysed the implications graph. We have 3557 equivalence classes. Of the 12M implications between them, 2.5% are known to hold and 72% are known not to hold.
I reduced the remaining conjectures and got 15501 irreducible implications.

The script had a bug: it chose the wrong subset of implications. If anyone relied on it, please re-run the script from the branch. You should get 24283 irreducible implications.

Kim Morrison (Oct 01 2024 at 01:20):

Daniel Weber said:

Kim Morrison said:

If someone could provide a standalone example showing the stack overflow, that would be great (unless someone is already on fixing it!)

import Lean

open Lean

def main : IO Unit := IO.println (toJson (Array.mkArray 70000 "string1"))

outputs

Stack overflow detected. Aborting.

I assume that might be system-dependent, I'm on Ubuntu 22.04.5

Tracking at lean#5548, but I don't immediately understand what is wrong.

Daniel Weber (Oct 01 2024 at 01:20):

I found out that when using .compress it doesn't happen, so that's what I'll do for now
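For reference, a minimal sketch of that workaround applied to the example above (rendering the JSON with Json.compress instead of the pretty-printer):

import Lean

open Lean

-- same example as above, but rendered with Json.compress,
-- which avoids the pretty-printing path that overflows the stack
def main : IO Unit := IO.println (toJson (Array.mkArray 70000 "string1")).compress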

Vlad Tsyrklevich (Oct 01 2024 at 06:23):

@Joachim Breitner My processing of Nicholas' original data set had ~13.6m refutations, and Daniel posted in another thread that his tool calculated implicit_proof_false: 12764328. I don't have a way to calculate this myself right now, and was curious if we proved less than what Nicholas originally calculated, or if my or Daniel's numbers are wrong

Daniel Weber (Oct 01 2024 at 06:26):

I think you just have to add explicit_proof_false from my numbers, as they're counted separately

Vlad Tsyrklevich (Oct 01 2024 at 06:28):

@Daniel Weber Thanks! So this gets me up to 13,503,535, whereas my list of the refutations Nicholas generated has 13,633,461

Joachim Breitner (Oct 02 2024 at 10:12):

Joachim Breitner said:

I’m using this, which works well enough for now and is quite elegant (I think), so no need to discuss variants here :-)

impliedBy = { i : set() for i in full }
implying = { j : set() for j in full }

for i, j in implications:
  impliedBy[i].add(j)
  impliedBy[i] |= impliedBy[j]
  implying[j].add(i)
  implying[j] |= implying[i]

The reason it’s elegant and fast is that it’s wrong: the single pass only propagates along implications that were already processed, so chains whose edges arrive in the wrong order are missed. A bit embarrassing.

Vlad Tsyrklevich (Oct 02 2024 at 10:19):

Is this the source of the discrepancy in refutation numbers?

Joachim Breitner (Oct 02 2024 at 10:20):

I don’t think so; this only caused overly large Facts statements in the generated Lean code, but didn’t affect the raw numbers reported by various people.

Joachim Breitner (Oct 02 2024 at 10:25):

I am now using this code, based on @Amir Livne Bar-on’s code, optimized a bit further:

from collections import defaultdict

def transitive_closure(pairs):
    pairs_idx = defaultdict(list)
    for a, b in pairs:
        pairs_idx[a].append(b)
    closure = set()
    new_pairs = list(pairs)
    while new_pairs:
        (a, b) = new_pairs.pop()
        closure.add((a, b))
        for c in pairs_idx[b]:
            if (a, c) not in closure:
                new_pairs.append((a, c))
    return closure

https://github.com/teorth/equational_theories/pull/196
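For reference, a quick sanity check of this version on a toy chain (made-up equation numbers):

print(sorted(transitive_closure([(1, 2), (2, 3), (3, 4)])))
# [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]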

Vlad Tsyrklevich (Oct 02 2024 at 10:53):

Vlad Tsyrklevich said:

Daniel Weber Thanks! So this gets me up to 13,503,535, whereas my list of the refutations Nicholas generated has 13,633,461

So do you know what this discrepancy is from? Is this expected, or not?

Terence Tao (Oct 02 2024 at 13:31):

It would be nice if the statistics portion of README.md were updated with some of these stats. Presumably we are close to having this done automatically, but having some early data points recorded will be nice for the time-series visualizations.

Nicholas Carlini (Oct 02 2024 at 14:46):

Vlad Tsyrklevich said:

So do you know what this discrepancy is from? Is this expected, or not?

Not an answer to your question, but I'm finding it difficult to keep track of my own sets of "A =/> B": as I process them, I'm losing some refutations. This sadly isn't something Lean can help with (it will only show that the set we give it is correct, not that others could have been given), and I wonder if there's some process that would help make this happen less often. E.g., right now I'm going to try to find some problem where it looks like I've lost some 4x4 magmas along the way, and I'm not sure where they went.


Last updated: May 02 2025 at 03:31 UTC