Zulip Chat Archive

Stream: Equational

Topic: Lean script for extracting current set of implications

David Renshaw (Sep 27 2024 at 16:12):

I opened a PR adding a new Lean script extract_implications.lean: https://github.com/teorth/equational_theories/pull/22

David Renshaw (Sep 27 2024 at 16:13):

The idea is to move the logic that's currently in process_implications.py into Lean, so that we can be much more precise about the analysis.

Joachim Breitner (Sep 27 2024 at 16:13):

Ah, so basically what’s discussed in https://leanprover.zulipchat.com/#narrow/stream/458659-Equational/topic/Refactoring.20the.20Lean.20file.20structure/near/473148427. Nice.

David Renshaw (Sep 27 2024 at 16:14):

yeah, it's a minimal first step in that direction

Joachim Breitner (Sep 28 2024 at 14:07):

Somehow running the script seems to require a working local rust installation, because of egg. Is it possible to set it up so that the script works without that?

Daniel Weber (Sep 28 2024 at 14:10):

See the discussion in #Equational > Adding the egg repository as a dependency

Daniel Weber (Sep 28 2024 at 14:13):

Do you know how we can extract proof_wanted statements? They're run in withoutModifyingEnv, so I'm not sure how to see them

Joachim Breitner (Sep 28 2024 at 14:13):

With a python script scraping the lean files :smiling_devil:

Joachim Breitner (Sep 28 2024 at 14:14):

Or by writing out own implication_wanted command that stores the data in an environment extension.

Bryan Gin-ge Chen (Sep 28 2024 at 14:15):

(I think we might actually eventually want something like that in proof_wanted itself so that we could e.g. have a page with all proof_wanted statements in the mathlib docs.)

Terence Tao (Sep 28 2024 at 14:18):

I didn't expect that our first contribution to Mathlib from this project was an update to proof_wanted :joy:

Daniel Weber (Sep 28 2024 at 14:18):

I think it's in batteries, actually

Edward van de Meent (Sep 28 2024 at 14:19):

indeed, docs#proof_wanted

David Renshaw (Sep 28 2024 at 14:45):

Joachim Breitner said:

Somehow running the script seems to require a working local rust installation, because of egg. Is it possible to set it up so that the script works without that?

Is that only if you use lake exe? What if you run the executable directly?

./.lake/build/bin/extract_implications

Joachim Breitner (Sep 28 2024 at 15:19):

Ah, good idea! Away from the keyboard until tomorrow, but I'll try then.

Leo Shine (Sep 28 2024 at 16:41):

Is there an easy way to add rust to the gitpod instance so that this just works?

Daniel Weber (Sep 28 2024 at 17:04):

Joachim Breitner said:

Or by writing out own implication_wanted command that stores the data in an environment extension.

I'm not sure what data should be saved there, and how to get it from the macro. Is someone able to do this?

David Renshaw (Sep 28 2024 at 17:10):

Stashing a fully-qualified name should be straightforward.

David Renshaw (Sep 28 2024 at 17:11):

I can do this, right now I feel stuck on naming, because implication_wanted seems not quite right.

Daniel Weber (Sep 28 2024 at 17:11):

The type is also needed to figure out what it implies/refutes

David Renshaw (Sep 28 2024 at 17:12):

I was thinking we would make a new command_elab (i.e. not just a macro that expands to proof_wanted).

David Renshaw (Sep 28 2024 at 17:12):

Then we should get full access to the environment and be able to grab type information.

Terence Tao (Sep 28 2024 at 17:13):

conjecture perhaps? I believe that's not already reserved in Lean. Or claim

David Renshaw (Sep 28 2024 at 17:14):

or maybe just proof_wanted', on the theory that eventually the functionality will get upstreamed.

David Renshaw (Sep 28 2024 at 17:14):

OK, I'll choose one of these and prepare a PR.

Vlad Tsyrklevich (Sep 28 2024 at 18:42):

Hi David, this tool currently only picks up implications in Subgraph.lean, but there are now also many auto-generated theorems that will be under .Generated once https://github.com/teorth/equational_theories/pull/73 lands, it would be nice to add those. It would also be nice to print the transitive closure of all implications, but that can probably be done externally.

David Renshaw (Sep 28 2024 at 20:07):

New conjecture command added here: equational#85

David Renshaw (Sep 28 2024 at 20:08):

Extracting the type of the declaration was trickier than I anticipated! It took me a while to figure out that I needed to expandModifiers and then expandDeclId to get a name that I could pass to getConstInfo.

David Renshaw (Sep 28 2024 at 20:16):

@Vlad Tsyrklevich yeah, adding those should be straightforward. I can see two options for how to specify which implications should be included:

iterate through the whole environment, as proposed here: https://github.com/teorth/equational_theories/pull/22#issuecomment-2379778521
iterate through a given set of files

David Renshaw (Sep 28 2024 at 20:46):

add another command like conjecture with a similar environment extension, to hold proven facts about implications. This feels like the nicest option to me.

David Renshaw (Sep 29 2024 at 00:52):

Here's a proposal for option 3: https://github.com/teorth/equational_theories/pull/93

David Renshaw (Sep 29 2024 at 00:52):

This feels to me like the right way to do it.

David Renshaw (Sep 29 2024 at 00:53):

I'm not sure if result is the best name for the command; I'm interested in any other suggestions.

Terence Tao (Sep 29 2024 at 02:42):

David Renshaw said:

I'm not sure if result is the best name for the command; I'm interested in any other suggestions.

proposition, perhaps?

David Renshaw (Sep 29 2024 at 12:43):

Maybe edge_theorem? There are theorems about edges in our graph of interest.

David Renshaw (Sep 29 2024 at 12:48):

or graph_theorem

Joachim Breitner (Sep 29 2024 at 12:50):

Why is this an command and not an attribute? It seems that would fit the bill nicely, more compositional, and without having users learn new commands?

As for the name, maybe @[equational_theories], i.e. simply the name of the project?

David Renshaw (Sep 29 2024 at 12:51):

Attributes are more limited than full commands... but maybe we can make it work.

David Renshaw (Sep 29 2024 at 12:51):

In an attribute handler, we get access to the theorem name and the attribute syntax

David Renshaw (Sep 29 2024 at 12:52):

so we could accumulate the theorem names, and then do all of the analysis later, I think

Joachim Breitner (Sep 29 2024 at 12:52):

What do you have to do that you cannot do in an attribute? I'd expect you check the theorem statement whether it's the given shape and then register it with a env extension to easily find it later?

David Renshaw (Sep 29 2024 at 12:52):

that approach did not work for conjecture, because nothing was saved in the environment

David Renshaw (Sep 29 2024 at 12:52):

but I think it can work here

David Renshaw (Sep 29 2024 at 12:54):

As far as I know, it's not possible to check the type of a theorem via an attribute before it's added to an env extension

David Renshaw (Sep 29 2024 at 12:54):

all logic would need to happen upon retrieval. Which is maybe fine.

Joachim Breitner (Sep 29 2024 at 12:54):

I thought the attribute handler is the code that's adding something to the env extension. So you can do whatever you want there.

Joachim Breitner (Sep 29 2024 at 12:56):

I can have a closer look later today when no longer at the phone. I hope I'm not misleading here :-)

David Renshaw (Sep 29 2024 at 12:56):

https://github.com/leanprover/lean4/blob/3a46fd0fde0f6ae8093e3303061d312e502cee82/src/Lean/Attributes.lean#L50-L54

David Renshaw (Sep 29 2024 at 12:57):

ah, but there's context information in AttrM...

Joachim Breitner (Sep 29 2024 at 12:57):

Right - use the name, look up the type in the environment, do your checks, add it to the env extension

Joachim Breitner (Sep 29 2024 at 12:58):

Existing attributes like ext could serve as guidelines

David Renshaw (Sep 29 2024 at 12:58):

Unclear to me that AttributeApplicationTime.afterCompilation definitely means "the const is in the environment now", but I can check experimentally.

David Renshaw (Sep 29 2024 at 13:07):

FunProp is a nice example for me to copy: https://github.com/leanprover-community/mathlib4/blob/62eccbcb33b8a9fc8e7cc61cd546298b52f0c29b/Mathlib/Tactic/FunProp/Attr.lean#L23-L40

David Renshaw (Sep 29 2024 at 14:02):

I remember how I got the wrong idea about attributes -- a while back I tried to write an attribute for Compfiles that would need access to syntax of the declaration body. That turned out to be impossible (as far as I could tell), and from that experience I internalized the lesson "attributes are underpowered".

David Renshaw (Sep 29 2024 at 15:14):

I marked the PR as "ready to review": https://github.com/teorth/equational_theories/pull/93

David Renshaw (Sep 29 2024 at 15:14):

I named the attribute @[equational_result], which seems reasonable to me. Open to other suggestions, but it's probably best not to bikeshed too much.

David Renshaw (Sep 29 2024 at 15:15):

I also added some justification in the PR summary for "why not just scan the environment for theorems that have the right shape?"

Joachim Breitner (Sep 29 2024 at 15:36):

David Renshaw said:

I also added some justification in the PR summary for "why not just scan the environment for theorems that have the right shape?"

Maybe include it also in the module docstring of equational_theories/EquationalResult.lean, for posterity.

David Renshaw (Sep 29 2024 at 16:28):

added: https://github.com/teorth/equational_theories/pull/93/commits/5a705fdca75fbf2246d3af7e3c4f9d43b52cb92c

Joachim Breitner (Sep 30 2024 at 09:09):

In https://github.com/teorth/equational_theories/pull/130 I am adjusting the script to handle the (expanded form of the) Facts syntax. @David Renshaw do you want to review?

Joachim Breitner (Sep 30 2024 at 12:03):

(deleted)

Vlad Tsyrklevich (Sep 30 2024 at 18:11):

lake exe extract_implications is still blocked for others by egg failing to build, right? Should it be removed for now?

David Renshaw (Sep 30 2024 at 18:19):

It works for me.

David Renshaw (Sep 30 2024 at 18:20):

does it work if you don't use lake exe?
Like this:

$ lake build extract_implications
$ ./.lake/build/bin/extract_implications

David Renshaw (Sep 30 2024 at 18:21):

ah, I understood you to be suggesting removal of extract_implications, but maybe you're suggesting removing egg?

Vlad Tsyrklevich (Sep 30 2024 at 18:26):

Nono, removing egg if it's breaking the build

Vlad Tsyrklevich (Sep 30 2024 at 18:41):

Is it expected that the output has duplicates, e.g.
{"rhs": "Equation856", "lhs": "Equation1004"},
{"rhs": "Equation856", "lhs": "Equation1004"},

Vlad Tsyrklevich (Sep 30 2024 at 18:42):

(deleted)

Vlad Tsyrklevich (Sep 30 2024 at 18:50):

Vlad Tsyrklevich said:

Is it expected that the output has duplicates, e.g.
{"rhs": "Equation856", "lhs": "Equation1004"},
{"rhs": "Equation856", "lhs": "Equation1004"},

This is obviously because two proofs exist in the code base, I hadn't thought of it.

Joachim Breitner (Sep 30 2024 at 19:55):

Vlad Tsyrklevich said:

Nono, removing egg if it's breaking the build

As long as it isn't used, I'd be in favor of removing it. (I tend to remove it locally, build, and try to remember to revert before committing something)

Shreyas Srinivas (Sep 30 2024 at 20:14):

I added egg in the beginning because at that time it wasn't clear how the project would evolve and it still seemed like a lot of handproving would be involved. My hope was to get play around with egg. Looking at the list of forks, it seems the Rosser group's GitHub account has also forked the project. I wonder if they had the same idea.

Nevertheless, since then we have moved on to massive sets of automatically generated networks of implications. I'd ask @Andrés Goens and if he says so, removing it might be the right thing to do.

Shreyas Srinivas (Sep 30 2024 at 20:16):

That being said, it could be one of the explorations carried out in parallel, if we can fix the build. I believe @Marcus Rossel maintains the egg repository we use.

Vlad Tsyrklevich (Sep 30 2024 at 20:32):

Joachim Breitner said:

Vlad Tsyrklevich said:

Nono, removing egg if it's breaking the build

As long as it isn't used, I'd be in favor of removing it. (I tend to remove it locally, build, and try to remember to revert before committing something)

Yeah, I am already maintaining a lot of local changes so the additional git stash / git stash pop to move around adds friction

Andrés Goens (Sep 30 2024 at 21:33):

yeah, I think we should remove the dependency if we're not using it. Might be interesting to explore in the future, but then we can have a special target for that or something similar, so it doesn't break people's builds if they didn't have a rust toolchain

Andrés Goens (Sep 30 2024 at 21:34):

it does seem like not ideal behavior for lake to pull the dependency when it's not used, but that's a separate point I guess

Marcus Rossel (Sep 30 2024 at 21:34):

If it's causing problems for people I'd just remove it. I think it's sufficient for people to add it locally if they want to play around with it.

On that note, I've been working on improving the tactic for this project and as a small experiment managed to prove 80% of the theorems in Subgraph.lean using by set_option egg.explosion true in egg [*] (more improvements to come). I haven't been able to keep up with how the project is being handled at a larger scale, so I don't know how to put the egg tactic to better use - but if anyone has ideas let me know :)

Michael Bucko (Oct 02 2024 at 04:15):

David Renshaw schrieb:

does it work if you don't use lake exe?
Like this:

$ lake build extract_implications
$ ./.lake/build/bin/extract_implications
````

I am getting
```v@Fenek equational_theories % lake exe extract_implications
✖ [13/30] Running egg/egg_for_lean.static
trace: ././.lake/packages/egg/Rust> cargo rustc --release -- -C relocation-model=pic
info: stderr:
could not execute external process 'cargo'
error: external command 'cargo' exited with code 255
Some required builds logged failures:
- egg/egg_for_lean.static
error: build failed
v@Fenek equational_theories % lake build extract_implications
✖ [15/30] Running egg/egg_for_lean.static
trace: ././.lake/packages/egg/Rust> cargo rustc --release -- -C relocation-model=pic
info: stderr:
could not execute external process 'cargo'
error: external command 'cargo' exited with code 255
Some required builds logged failures:
- egg/egg_for_lean.static
error: build failed

Joachim Breitner (Oct 02 2024 at 07:29):

Oh, we still haven’t removed that unused dependency? Well, here we go: https://github.com/teorth/equational_theories/pull/190

Joachim Breitner (Oct 02 2024 at 07:46):

Hmm, how do I have to call the script now if I want the raw data (no transitive closure, facts still as compact facts and no expanded into antiimplications)? The 33MB file I get from

.lake/build/bin/extract_implications --json equational_theories

is hard to use.

(Adding a --only-imlications flag as a easy and possibly useful work-around, as I only care about those at the moment.)

Joachim Breitner (Oct 02 2024 at 07:56):

https://github.com/teorth/equational_theories/pull/192

Amir Livne Bar-on (Oct 02 2024 at 09:05):

If we still want the output in the old format, this PR adds a subcommand for it:
https://github.com/teorth/equational_theories/pull/194

Joachim Breitner (Oct 02 2024 at 09:09):

Thanks! While you are at it, can you make equational_theories the default filename if none is given?

Amir Livne Bar-on (Oct 02 2024 at 09:44):

Added to the same PR

Michael Bucko (Oct 02 2024 at 14:27):

That already works for me. Had to reinstall Rust.

Michael Bucko schrieb:

David Renshaw schrieb:

does it work if you don't use lake exe?
Like this:

$ lake build extract_implications
$ ./.lake/build/bin/extract_implications
````

I am getting
```v@Fenek equational_theories % lake exe extract_implications
✖ [13/30] Running egg/egg_for_lean.static
trace: ././.lake/packages/egg/Rust> cargo rustc --release -- -C relocation-model=pic
info: stderr:
could not execute external process 'cargo'
error: external command 'cargo' exited with code 255
Some required builds logged failures:
- egg/egg_for_lean.static
error: build failed
v@Fenek equational_theories % lake build extract_implications
✖ [15/30] Running egg/egg_for_lean.static
trace: ././.lake/packages/egg/Rust> cargo rustc --release -- -C relocation-model=pic
info: stderr:
could not execute external process 'cargo'
error: external command 'cargo' exited with code 255
Some required builds logged failures:
- egg/egg_for_lean.static
error: build failed

Nicholas Carlini (Oct 03 2024 at 00:18):

I'm having a somewhat hard time understanding the output of this script.

If I run "extract_implications outcomes --hist" it tells me there are 2826 unknown.
If I run "extract_implications outcomes" I get a list of lists, with 2826 entries tagged as "unknown". So far so good.
If I run "extract_implications unknowns" then I get a list of length just 1696.

Why is extracting unknowns smaller than outcomes and counting? Further, why are 1683 of these 'unknowns' involving Equation2315?

Finally, if I look at the outcomes list of lists, why is outcomes[0][0] 'unknown' (as is [37,37], [61, 61] and a bunch other). Shouldn't the diagonal be always true?

David Renshaw (Oct 03 2024 at 00:22):

Yes, arguably Closure.lean should mark the diagonal as implictly true. Currently it does not do that.

David Renshaw (Oct 03 2024 at 00:23):

I'll open an issue for that...

Nicholas Carlini (Oct 03 2024 at 00:26):

Okay great. But these two lists are still different sizes even if I account for that

David Renshaw (Oct 03 2024 at 00:26):

equational#215

David Renshaw (Oct 03 2024 at 00:28):

@Nicholas Carlini do you have a full command that I can run to see what you're seeing?

David Renshaw (Oct 03 2024 at 00:28):

or is this on your own local branch?

Nicholas Carlini (Oct 03 2024 at 00:28):

No if I'm just running this on upstream main. Let me write a quick reproducer

David Renshaw (Oct 03 2024 at 00:29):

what if you add the --proven flag to unknowns?

David Renshaw (Oct 03 2024 at 00:29):

it looks like outcomes and unknowns have different default behavior with respect to conjectures

Nicholas Carlini (Oct 03 2024 at 00:30):

No if I add --proven then it goes to 1904808

Nicholas Carlini (Oct 03 2024 at 00:31):

lake exe extract_implications unknowns | grep -o "Equation2315" | wc gives 1683, for example. This seems wrong

David Renshaw (Oct 03 2024 at 00:33):

the help message of unknowns says something about "modulo equivalence"

Nicholas Carlini (Oct 03 2024 at 00:34):

Yeah, but if I look at the indexs of the unkonwns in outcomes, then I see exactly 1800 instances of Equation2990.

Nicholas Carlini (Oct 03 2024 at 00:34):

So the counts aren't lining up

David Renshaw (Oct 03 2024 at 00:38):

It's very believable that there's a bug here, but I'm not following your argument. What are you expecting to line up to what, and why?

Nicholas Carlini (Oct 03 2024 at 00:38):

(And 'outcomes' in theory should be listing the entire N^2 set of equations. So modulo equivalence shouldn't matter here.)

David Renshaw (Oct 03 2024 at 00:41):

if outcomes lists N^2 and unknowns reduces modulo equivalence, then it makes sense to me that unknowns reports a smaller number

Nicholas Carlini (Oct 03 2024 at 00:41):

Hm sorry I'm typing as I get more confused. Maybe here's the clearest question that's confusing to me right now.

If I look at extract_implications outcomes, then I see the entire N^2 table. If I look at the major diagonal, I would expect either unknown or true. But I see a bunch of implicit_conjecture_false. This should not happen right?

David Renshaw (Oct 03 2024 at 00:43):

i think that would mean that we have some bad conjectures

Nicholas Carlini (Oct 03 2024 at 00:43):

$ lake exe extract_implications outcomes > out.json
$ python3
>>> json.load(open("out.json"))['outcomes'][5][5]
'unknown'
>>> json.load(open("out.json"))['outcomes'][6][6]
'implicit_conjecture_false'

Nicholas Carlini (Oct 03 2024 at 00:44):

And then of the 1815 'unknown' entries in the matrix, all but 1801 involve Equation 2990. Which I guess is possible? But that seems suspicious. (If I ignore entries on the the diagonal.)

David Renshaw (Oct 03 2024 at 00:44):

yep, I'm seeing the same thing here

Nicholas Carlini (Oct 03 2024 at 00:45):

I was going to try and get started on (https://github.com/teorth/equational_theories/issues/209) but before I claimed I wanted to make sure I knew how to read the dataset of what implies what.... and currently I do not

David Renshaw (Oct 03 2024 at 00:48):

yeah, equation 2990 doesn't look all that special: https://github.com/teorth/equational_theories/blob/3b11388f8a933b00d92c6bb883ef2c0b458d564e/equational_theories/AllEquations.lean#L3011

David Renshaw (Oct 03 2024 at 00:49):

equation 2990 := x = ((y ∘ (z ∘ y)) ∘ x) ∘ x

David Renshaw (Oct 03 2024 at 00:49):

my guess is that there may be some bad conjectures, and that there could possibly be a bug in the closure computation

Daniel Weber (Oct 03 2024 at 00:52):

David Renshaw said:

my guess is that there may be some bad conjectures, and that there could possibly be a bug in the closure computation

There's a bug in equational#206, apologies. It should be reverted

Daniel Weber (Oct 03 2024 at 00:52):

I'll make extract_implications warn if there are inconsistent conjectures

Nicholas Carlini (Oct 03 2024 at 00:53):

Can we make the build break if that happens?

Nicholas Carlini (Oct 03 2024 at 00:54):

Preventing commits from introducing things that cause errors seems safer

Daniel Weber (Oct 03 2024 at 00:54):

Nicholas Carlini said:

Can we make the build break if that happens?

That should be possible, but I don't know how to do CI

Nicholas Carlini (Oct 03 2024 at 00:55):

OK for now warn. I'll add an issue for making it break. I don't either :)

Nicholas Carlini (Oct 03 2024 at 16:11):

I think I'm still misunderstanding something about this tool.

$ lake exe extract_implications outcomes > out.json
$ python3
>>> import json
>>> outcomes = json.load(open("out.json"))['outcomes']
>>> (outcomes[461][612], outcomes[612][461])
('implicit_proof_false', 'implicit_proof_false')

Now I'm not sure which way the table is meant to be read (a=>b or b=>a) but this one is symmetric. So this seems to tell me that Equation462 =/> Equation613 and Equation613 =/> Equation 462.

But we know that 613 => 462. So something's not right in my understanding of this tool.

theorem Equation613_implies_Equation462 (G : Type*) [Magma G] (h : Equation613 G) : Equation462 G := λ x y z w u => h x x y z w u

Could someone help me out?

Daniel Weber (Oct 03 2024 at 16:17):

outcomes[461][612] is the outcome of json.load(open("out.json"))['equations'][461] and json.load(open("out.json"))['equations'][612], not necessarily Equation462 and Equation613

Nicholas Carlini (Oct 03 2024 at 16:18):

Oooohhhhhh

Cody Roux (Oct 04 2024 at 01:22):

That's a little unfortunate tbh, is this to be fixed?

David Renshaw (Oct 04 2024 at 01:25):

the idea is to allow for not all equations to be present, e.g. for Subraph.lean

David Renshaw (Oct 04 2024 at 01:26):

so there needs to be a mapping between array indices and equation indices

Nicholas Carlini (Oct 04 2024 at 01:28):

Perhaps there could be a "user-friendly" output default that has {"Equation5": {"Equation6": "A_IMPLIES_B", "Equation 7": "A_NOT_IMPLIES_B"...}, ...} or something like that which is very hard to incorrectly understand. It will be a bit bigger (because you're duplicating keys) but easier to parse

David Renshaw (Oct 04 2024 at 02:47):

Relatedly: here is a fix to index canonicalization in outcomes_to_image.py: equational#268

Last updated: May 02 2025 at 03:31 UTC