Zulip Chat Archive
Stream: Equational
Topic: Lean script for extracting current set of implications
David Renshaw (Sep 27 2024 at 16:12):
I opened a PR adding a new Lean script extract_implications.lean
: https://github.com/teorth/equational_theories/pull/22
David Renshaw (Sep 27 2024 at 16:13):
The idea is to move the logic that's currently in process_implications.py
into Lean, so that we can be much more precise about the analysis.
Joachim Breitner (Sep 27 2024 at 16:13):
Ah, so basically what’s discussed in https://leanprover.zulipchat.com/#narrow/stream/458659-Equational/topic/Refactoring.20the.20Lean.20file.20structure/near/473148427. Nice.
David Renshaw (Sep 27 2024 at 16:14):
yeah, it's a minimal first step in that direction
Joachim Breitner (Sep 28 2024 at 14:07):
Somehow running the script seems to require a working local rust installation, because of egg
. Is it possible to set it up so that the script works without that?
Daniel Weber (Sep 28 2024 at 14:10):
See the discussion in #Equational > Adding the egg repository as a dependency
Daniel Weber (Sep 28 2024 at 14:13):
Do you know how we can extract proof_wanted
statements? They're run in withoutModifyingEnv
, so I'm not sure how to see them
Joachim Breitner (Sep 28 2024 at 14:13):
With a python script scraping the lean files :smiling_devil:
Joachim Breitner (Sep 28 2024 at 14:14):
Or by writing out own implication_wanted
command that stores the data in an environment extension.
Bryan Gin-ge Chen (Sep 28 2024 at 14:15):
(I think we might actually eventually want something like that in proof_wanted
itself so that we could e.g. have a page with all proof_wanted
statements in the mathlib docs.)
Terence Tao (Sep 28 2024 at 14:18):
I didn't expect that our first contribution to Mathlib from this project was an update to proof_wanted
:joy:
Daniel Weber (Sep 28 2024 at 14:18):
I think it's in batteries, actually
Edward van de Meent (Sep 28 2024 at 14:19):
indeed, docs#proof_wanted
David Renshaw (Sep 28 2024 at 14:45):
Joachim Breitner said:
Somehow running the script seems to require a working local rust installation, because of
egg
. Is it possible to set it up so that the script works without that?
Is that only if you use lake exe
? What if you run the executable directly?
./.lake/build/bin/extract_implications
Joachim Breitner (Sep 28 2024 at 15:19):
Ah, good idea! Away from the keyboard until tomorrow, but I'll try then.
Leo Shine (Sep 28 2024 at 16:41):
Is there an easy way to add rust to the gitpod instance so that this just works?
Daniel Weber (Sep 28 2024 at 17:04):
Joachim Breitner said:
Or by writing out own
implication_wanted
command that stores the data in an environment extension.
I'm not sure what data should be saved there, and how to get it from the macro. Is someone able to do this?
David Renshaw (Sep 28 2024 at 17:10):
Stashing a fully-qualified name should be straightforward.
David Renshaw (Sep 28 2024 at 17:11):
I can do this, right now I feel stuck on naming, because implication_wanted
seems not quite right.
Daniel Weber (Sep 28 2024 at 17:11):
The type is also needed to figure out what it implies/refutes
David Renshaw (Sep 28 2024 at 17:12):
I was thinking we would make a new command_elab
(i.e. not just a macro that expands to proof_wanted
).
David Renshaw (Sep 28 2024 at 17:12):
Then we should get full access to the environment and be able to grab type information.
Terence Tao (Sep 28 2024 at 17:13):
conjecture
perhaps? I believe that's not already reserved in Lean. Or claim
David Renshaw (Sep 28 2024 at 17:14):
or maybe just proof_wanted'
, on the theory that eventually the functionality will get upstreamed.
David Renshaw (Sep 28 2024 at 17:14):
OK, I'll choose one of these and prepare a PR.
Vlad Tsyrklevich (Sep 28 2024 at 18:42):
Hi David, this tool currently only picks up implications in Subgraph.lean, but there are now also many auto-generated theorems that will be under .Generated once https://github.com/teorth/equational_theories/pull/73 lands, it would be nice to add those. It would also be nice to print the transitive closure of all implications, but that can probably be done externally.
David Renshaw (Sep 28 2024 at 20:07):
New conjecture
command added here: equational#85
David Renshaw (Sep 28 2024 at 20:08):
Extracting the type of the declaration was trickier than I anticipated! It took me a while to figure out that I needed to expandModifiers
and then expandDeclId
to get a name that I could pass to getConstInfo
.
David Renshaw (Sep 28 2024 at 20:16):
@Vlad Tsyrklevich yeah, adding those should be straightforward. I can see two options for how to specify which implications should be included:
- iterate through the whole environment, as proposed here: https://github.com/teorth/equational_theories/pull/22#issuecomment-2379778521
- iterate through a given set of files
David Renshaw (Sep 28 2024 at 20:46):
- add another command like
conjecture
with a similar environment extension, to hold proven facts about implications. This feels like the nicest option to me.
David Renshaw (Sep 29 2024 at 00:52):
Here's a proposal for option 3: https://github.com/teorth/equational_theories/pull/93
David Renshaw (Sep 29 2024 at 00:52):
This feels to me like the right way to do it.
David Renshaw (Sep 29 2024 at 00:53):
I'm not sure if result
is the best name for the command; I'm interested in any other suggestions.
Terence Tao (Sep 29 2024 at 02:42):
David Renshaw said:
I'm not sure if
result
is the best name for the command; I'm interested in any other suggestions.
proposition
, perhaps?
David Renshaw (Sep 29 2024 at 12:43):
Maybe edge_theorem
? There are theorems about edges in our graph of interest.
David Renshaw (Sep 29 2024 at 12:48):
or graph_theorem
Joachim Breitner (Sep 29 2024 at 12:50):
Why is this an command and not an attribute? It seems that would fit the bill nicely, more compositional, and without having users learn new commands?
As for the name, maybe @[equational_theories]
, i.e. simply the name of the project?
David Renshaw (Sep 29 2024 at 12:51):
Attributes are more limited than full commands... but maybe we can make it work.
David Renshaw (Sep 29 2024 at 12:51):
In an attribute handler, we get access to the theorem name and the attribute syntax
David Renshaw (Sep 29 2024 at 12:52):
so we could accumulate the theorem names, and then do all of the analysis later, I think
Joachim Breitner (Sep 29 2024 at 12:52):
What do you have to do that you cannot do in an attribute? I'd expect you check the theorem statement whether it's the given shape and then register it with a env extension to easily find it later?
David Renshaw (Sep 29 2024 at 12:52):
that approach did not work for conjecture
, because nothing was saved in the environment
David Renshaw (Sep 29 2024 at 12:52):
but I think it can work here
David Renshaw (Sep 29 2024 at 12:54):
As far as I know, it's not possible to check the type of a theorem via an attribute before it's added to an env extension
David Renshaw (Sep 29 2024 at 12:54):
all logic would need to happen upon retrieval. Which is maybe fine.
Joachim Breitner (Sep 29 2024 at 12:54):
I thought the attribute handler is the code that's adding something to the env extension. So you can do whatever you want there.
Joachim Breitner (Sep 29 2024 at 12:56):
I can have a closer look later today when no longer at the phone. I hope I'm not misleading here :-)
David Renshaw (Sep 29 2024 at 12:56):
David Renshaw (Sep 29 2024 at 12:57):
ah, but there's context information in AttrM
...
Joachim Breitner (Sep 29 2024 at 12:57):
Right - use the name, look up the type in the environment, do your checks, add it to the env extension
Joachim Breitner (Sep 29 2024 at 12:58):
Existing attributes like ext
could serve as guidelines
David Renshaw (Sep 29 2024 at 12:58):
Unclear to me that AttributeApplicationTime.afterCompilation
definitely means "the const is in the environment now", but I can check experimentally.
David Renshaw (Sep 29 2024 at 13:07):
FunProp
is a nice example for me to copy: https://github.com/leanprover-community/mathlib4/blob/62eccbcb33b8a9fc8e7cc61cd546298b52f0c29b/Mathlib/Tactic/FunProp/Attr.lean#L23-L40
David Renshaw (Sep 29 2024 at 14:02):
I remember how I got the wrong idea about attributes -- a while back I tried to write an attribute for Compfiles that would need access to syntax of the declaration body. That turned out to be impossible (as far as I could tell), and from that experience I internalized the lesson "attributes are underpowered".
David Renshaw (Sep 29 2024 at 15:14):
I marked the PR as "ready to review": https://github.com/teorth/equational_theories/pull/93
David Renshaw (Sep 29 2024 at 15:14):
I named the attribute @[equational_result]
, which seems reasonable to me. Open to other suggestions, but it's probably best not to bikeshed too much.
David Renshaw (Sep 29 2024 at 15:15):
I also added some justification in the PR summary for "why not just scan the environment for theorems that have the right shape?"
Joachim Breitner (Sep 29 2024 at 15:36):
David Renshaw said:
I also added some justification in the PR summary for "why not just scan the environment for theorems that have the right shape?"
Maybe include it also in the module docstring of equational_theories/EquationalResult.lean
, for posterity.
David Renshaw (Sep 29 2024 at 16:28):
Joachim Breitner (Sep 30 2024 at 09:09):
In https://github.com/teorth/equational_theories/pull/130 I am adjusting the script to handle the (expanded form of the) Facts
syntax. @David Renshaw do you want to review?
Joachim Breitner (Sep 30 2024 at 12:03):
(deleted)
Vlad Tsyrklevich (Sep 30 2024 at 18:11):
lake exe extract_implications
is still blocked for others by egg failing to build, right? Should it be removed for now?
David Renshaw (Sep 30 2024 at 18:19):
It works for me.
David Renshaw (Sep 30 2024 at 18:20):
does it work if you don't use lake exe
?
Like this:
$ lake build extract_implications
$ ./.lake/build/bin/extract_implications
David Renshaw (Sep 30 2024 at 18:21):
ah, I understood you to be suggesting removal of extract_implications
, but maybe you're suggesting removing egg
?
Vlad Tsyrklevich (Sep 30 2024 at 18:26):
Nono, removing egg if it's breaking the build
Vlad Tsyrklevich (Sep 30 2024 at 18:41):
Is it expected that the output has duplicates, e.g.
{"rhs": "Equation856", "lhs": "Equation1004"},
{"rhs": "Equation856", "lhs": "Equation1004"},
Vlad Tsyrklevich (Sep 30 2024 at 18:42):
(deleted)
Vlad Tsyrklevich (Sep 30 2024 at 18:50):
Vlad Tsyrklevich said:
Is it expected that the output has duplicates, e.g.
{"rhs": "Equation856", "lhs": "Equation1004"},
{"rhs": "Equation856", "lhs": "Equation1004"},
This is obviously because two proofs exist in the code base, I hadn't thought of it.
Joachim Breitner (Sep 30 2024 at 19:55):
Vlad Tsyrklevich said:
Nono, removing egg if it's breaking the build
As long as it isn't used, I'd be in favor of removing it. (I tend to remove it locally, build, and try to remember to revert before committing something)
Shreyas Srinivas (Sep 30 2024 at 20:14):
I added egg in the beginning because at that time it wasn't clear how the project would evolve and it still seemed like a lot of handproving would be involved. My hope was to get play around with egg. Looking at the list of forks, it seems the Rosser group's GitHub account has also forked the project. I wonder if they had the same idea.
Nevertheless, since then we have moved on to massive sets of automatically generated networks of implications. I'd ask @Andrés Goens and if he says so, removing it might be the right thing to do.
Shreyas Srinivas (Sep 30 2024 at 20:16):
That being said, it could be one of the explorations carried out in parallel, if we can fix the build. I believe @Marcus Rossel maintains the egg repository we use.
Vlad Tsyrklevich (Sep 30 2024 at 20:32):
Joachim Breitner said:
Vlad Tsyrklevich said:
Nono, removing egg if it's breaking the build
As long as it isn't used, I'd be in favor of removing it. (I tend to remove it locally, build, and try to remember to revert before committing something)
Yeah, I am already maintaining a lot of local changes so the additional git stash
/ git stash pop
to move around adds friction
Andrés Goens (Sep 30 2024 at 21:33):
yeah, I think we should remove the dependency if we're not using it. Might be interesting to explore in the future, but then we can have a special target for that or something similar, so it doesn't break people's builds if they didn't have a rust toolchain
Andrés Goens (Sep 30 2024 at 21:34):
it does seem like not ideal behavior for lake to pull the dependency when it's not used, but that's a separate point I guess
Marcus Rossel (Sep 30 2024 at 21:34):
If it's causing problems for people I'd just remove it. I think it's sufficient for people to add it locally if they want to play around with it.
On that note, I've been working on improving the tactic for this project and as a small experiment managed to prove 80% of the theorems in Subgraph.lean
using by set_option egg.explosion true in egg [*]
(more improvements to come). I haven't been able to keep up with how the project is being handled at a larger scale, so I don't know how to put the egg
tactic to better use - but if anyone has ideas let me know :)
Michael Bucko (Oct 02 2024 at 04:15):
David Renshaw schrieb:
does it work if you don't use
lake exe
?
Like this:$ lake build extract_implications $ ./.lake/build/bin/extract_implications ```` I am getting ```v@Fenek equational_theories % lake exe extract_implications ✖ [13/30] Running egg/egg_for_lean.static trace: ././.lake/packages/egg/Rust> cargo rustc --release -- -C relocation-model=pic info: stderr: could not execute external process 'cargo' error: external command 'cargo' exited with code 255 Some required builds logged failures: - egg/egg_for_lean.static error: build failed v@Fenek equational_theories % lake build extract_implications ✖ [15/30] Running egg/egg_for_lean.static trace: ././.lake/packages/egg/Rust> cargo rustc --release -- -C relocation-model=pic info: stderr: could not execute external process 'cargo' error: external command 'cargo' exited with code 255 Some required builds logged failures: - egg/egg_for_lean.static error: build failed
Joachim Breitner (Oct 02 2024 at 07:29):
Oh, we still haven’t removed that unused dependency? Well, here we go: https://github.com/teorth/equational_theories/pull/190
Joachim Breitner (Oct 02 2024 at 07:46):
Hmm, how do I have to call the script now if I want the raw data (no transitive closure, facts still as compact facts and no expanded into antiimplications)? The 33MB file I get from
.lake/build/bin/extract_implications --json equational_theories
is hard to use.
(Adding a --only-imlications
flag as a easy and possibly useful work-around, as I only care about those at the moment.)
Joachim Breitner (Oct 02 2024 at 07:56):
https://github.com/teorth/equational_theories/pull/192
Amir Livne Bar-on (Oct 02 2024 at 09:05):
If we still want the output in the old format, this PR adds a subcommand for it:
https://github.com/teorth/equational_theories/pull/194
Joachim Breitner (Oct 02 2024 at 09:09):
Thanks! While you are at it, can you make equational_theories
the default filename if none is given?
Amir Livne Bar-on (Oct 02 2024 at 09:44):
Added to the same PR
Michael Bucko (Oct 02 2024 at 14:27):
That already works for me. Had to reinstall Rust.
Michael Bucko schrieb:
David Renshaw schrieb:
does it work if you don't use
lake exe
?
Like this:$ lake build extract_implications $ ./.lake/build/bin/extract_implications ```` I am getting ```v@Fenek equational_theories % lake exe extract_implications ✖ [13/30] Running egg/egg_for_lean.static trace: ././.lake/packages/egg/Rust> cargo rustc --release -- -C relocation-model=pic info: stderr: could not execute external process 'cargo' error: external command 'cargo' exited with code 255 Some required builds logged failures: - egg/egg_for_lean.static error: build failed v@Fenek equational_theories % lake build extract_implications ✖ [15/30] Running egg/egg_for_lean.static trace: ././.lake/packages/egg/Rust> cargo rustc --release -- -C relocation-model=pic info: stderr: could not execute external process 'cargo' error: external command 'cargo' exited with code 255 Some required builds logged failures: - egg/egg_for_lean.static error: build failed
Nicholas Carlini (Oct 03 2024 at 00:18):
I'm having a somewhat hard time understanding the output of this script.
If I run "extract_implications outcomes --hist" it tells me there are 2826 unknown.
If I run "extract_implications outcomes" I get a list of lists, with 2826 entries tagged as "unknown". So far so good.
If I run "extract_implications unknowns" then I get a list of length just 1696.
Why is extracting unknowns smaller than outcomes and counting? Further, why are 1683 of these 'unknowns' involving Equation2315?
Finally, if I look at the outcomes list of lists, why is outcomes[0][0] 'unknown' (as is [37,37], [61, 61] and a bunch other). Shouldn't the diagonal be always true?
David Renshaw (Oct 03 2024 at 00:22):
Yes, arguably Closure.lean
should mark the diagonal as implictly true. Currently it does not do that.
David Renshaw (Oct 03 2024 at 00:23):
I'll open an issue for that...
Nicholas Carlini (Oct 03 2024 at 00:26):
Okay great. But these two lists are still different sizes even if I account for that
David Renshaw (Oct 03 2024 at 00:26):
David Renshaw (Oct 03 2024 at 00:28):
@Nicholas Carlini do you have a full command that I can run to see what you're seeing?
David Renshaw (Oct 03 2024 at 00:28):
or is this on your own local branch?
Nicholas Carlini (Oct 03 2024 at 00:28):
No if I'm just running this on upstream main. Let me write a quick reproducer
David Renshaw (Oct 03 2024 at 00:29):
what if you add the --proven
flag to unknowns
?
David Renshaw (Oct 03 2024 at 00:29):
it looks like outcomes
and unknowns
have different default behavior with respect to conjectures
Nicholas Carlini (Oct 03 2024 at 00:30):
No if I add --proven then it goes to 1904808
Nicholas Carlini (Oct 03 2024 at 00:31):
lake exe extract_implications unknowns | grep -o "Equation2315" | wc
gives 1683, for example. This seems wrong
David Renshaw (Oct 03 2024 at 00:33):
the help message of unknowns
says something about "modulo equivalence"
Nicholas Carlini (Oct 03 2024 at 00:34):
Yeah, but if I look at the indexs of the unkonwns in outcomes, then I see exactly 1800 instances of Equation2990.
Nicholas Carlini (Oct 03 2024 at 00:34):
So the counts aren't lining up
David Renshaw (Oct 03 2024 at 00:38):
It's very believable that there's a bug here, but I'm not following your argument. What are you expecting to line up to what, and why?
Nicholas Carlini (Oct 03 2024 at 00:38):
(And 'outcomes' in theory should be listing the entire N^2 set of equations. So modulo equivalence shouldn't matter here.)
David Renshaw (Oct 03 2024 at 00:41):
if outcomes
lists N^2 and unknowns
reduces modulo equivalence, then it makes sense to me that unknowns
reports a smaller number
Nicholas Carlini (Oct 03 2024 at 00:41):
Hm sorry I'm typing as I get more confused. Maybe here's the clearest question that's confusing to me right now.
If I look at extract_implications outcomes
, then I see the entire N^2 table. If I look at the major diagonal, I would expect either unknown or true. But I see a bunch of implicit_conjecture_false
. This should not happen right?
David Renshaw (Oct 03 2024 at 00:43):
i think that would mean that we have some bad conjectures
Nicholas Carlini (Oct 03 2024 at 00:43):
$ lake exe extract_implications outcomes > out.json
$ python3
>>> json.load(open("out.json"))['outcomes'][5][5]
'unknown'
>>> json.load(open("out.json"))['outcomes'][6][6]
'implicit_conjecture_false'
Nicholas Carlini (Oct 03 2024 at 00:44):
And then of the 1815 'unknown' entries in the matrix, all but 1801 involve Equation 2990. Which I guess is possible? But that seems suspicious. (If I ignore entries on the the diagonal.)
David Renshaw (Oct 03 2024 at 00:44):
yep, I'm seeing the same thing here
Nicholas Carlini (Oct 03 2024 at 00:45):
I was going to try and get started on (https://github.com/teorth/equational_theories/issues/209) but before I claimed I wanted to make sure I knew how to read the dataset of what implies what.... and currently I do not
David Renshaw (Oct 03 2024 at 00:48):
yeah, equation 2990 doesn't look all that special: https://github.com/teorth/equational_theories/blob/3b11388f8a933b00d92c6bb883ef2c0b458d564e/equational_theories/AllEquations.lean#L3011
David Renshaw (Oct 03 2024 at 00:49):
equation 2990 := x = ((y ∘ (z ∘ y)) ∘ x) ∘ x
David Renshaw (Oct 03 2024 at 00:49):
my guess is that there may be some bad conjectures, and that there could possibly be a bug in the closure computation
Daniel Weber (Oct 03 2024 at 00:52):
David Renshaw said:
my guess is that there may be some bad conjectures, and that there could possibly be a bug in the closure computation
There's a bug in equational#206, apologies. It should be reverted
Daniel Weber (Oct 03 2024 at 00:52):
I'll make extract_implications
warn if there are inconsistent conjectures
Nicholas Carlini (Oct 03 2024 at 00:53):
Can we make the build break if that happens?
Nicholas Carlini (Oct 03 2024 at 00:54):
Preventing commits from introducing things that cause errors seems safer
Daniel Weber (Oct 03 2024 at 00:54):
Nicholas Carlini said:
Can we make the build break if that happens?
That should be possible, but I don't know how to do CI
Nicholas Carlini (Oct 03 2024 at 00:55):
OK for now warn. I'll add an issue for making it break. I don't either :)
Nicholas Carlini (Oct 03 2024 at 16:11):
I think I'm still misunderstanding something about this tool.
$ lake exe extract_implications outcomes > out.json
$ python3
>>> import json
>>> outcomes = json.load(open("out.json"))['outcomes']
>>> (outcomes[461][612], outcomes[612][461])
('implicit_proof_false', 'implicit_proof_false')
Now I'm not sure which way the table is meant to be read (a=>b or b=>a) but this one is symmetric. So this seems to tell me that Equation462 =/> Equation613 and Equation613 =/> Equation 462.
But we know that 613 => 462. So something's not right in my understanding of this tool.
theorem Equation613_implies_Equation462 (G : Type*) [Magma G] (h : Equation613 G) : Equation462 G := λ x y z w u => h x x y z w u
Could someone help me out?
Daniel Weber (Oct 03 2024 at 16:17):
outcomes[461][612]
is the outcome of json.load(open("out.json"))['equations'][461]
and json.load(open("out.json"))['equations'][612]
, not necessarily Equation462
and Equation613
Nicholas Carlini (Oct 03 2024 at 16:18):
Oooohhhhhh
Cody Roux (Oct 04 2024 at 01:22):
That's a little unfortunate tbh, is this to be fixed?
David Renshaw (Oct 04 2024 at 01:25):
the idea is to allow for not all equations to be present, e.g. for Subraph.lean
David Renshaw (Oct 04 2024 at 01:26):
so there needs to be a mapping between array indices and equation indices
Nicholas Carlini (Oct 04 2024 at 01:28):
Perhaps there could be a "user-friendly" output default that has {"Equation5": {"Equation6": "A_IMPLIES_B", "Equation 7": "A_NOT_IMPLIES_B"...}, ...} or something like that which is very hard to incorrectly understand. It will be a bit bigger (because you're duplicating keys) but easier to parse
David Renshaw (Oct 04 2024 at 02:47):
Relatedly: here is a fix to index canonicalization in outcomes_to_image.py: equational#268
Last updated: May 02 2025 at 03:31 UTC