Zulip Chat Archive
Stream: general
Topic: Make `simp` more powerful than `dsimp`?
Scott Morrison (Sep 05 2020 at 04:08):
Would it be possible to change the behaviour of simp
(i.e. the C++ implementation), so that whenever it is trying to use a _refl_lemma
, it behaves as dsimp
?
Scott Morrison (Sep 05 2020 at 04:09):
It would be great to avoid the need for dsimp, simp
which we see so often in the category theory library, and explained in this library note:
https://github.com/leanprover-community/mathlib/blob/master/src/category_theory/category/default.lean#L249
Scott Morrison (Sep 05 2020 at 04:09):
(I've just found myself writing simp, dsimp, simp
.)
Mario Carneiro (Sep 05 2020 at 04:20):
Ideally, simp
should traverse the term, recursing into all subterms, but when the subterm is not one provided by the congr lemma it should switch to "definitional mode" where it only does things dsimp would do. That algorithm yields simp
and dsimp
as special cases
Mario Carneiro (Sep 05 2020 at 04:21):
Alternatively, we can just make simp'
a wrapper around repeat {simp <|> dsimp}
which is a lot easier to do. :)
Scott Morrison (Sep 05 2020 at 04:22):
I'd certainly be happy with that. Obviously it's slower.
Scott Morrison (Sep 05 2020 at 04:23):
Presumably it would need to support simp' [X, Y, Z]
and simp' only [X, Y, Z]
and so on.
Simon Hudon (Sep 05 2020 at 04:24):
One of the overhead there is building the simp_lemmas
lists multiple times which we can skip. Where is the rest of the inefficiency coming from?
Scott Morrison (Sep 05 2020 at 04:24):
Multiple traversals of the expression.
Simon Hudon (Sep 05 2020 at 04:27):
But that's already part of what simp
does. Do we cut down the number if we merge the two tasks? I'm not sure because simp
and dsimp
have distinct tasks. simp
has to be careful not to break the type correctness of the expression while dsimp
can be more aggressive.
Mario Carneiro (Sep 05 2020 at 04:28):
the approach I described doesn't break type correctness
Mario Carneiro (Sep 05 2020 at 04:28):
two traversals is more expensive than one
Mario Carneiro (Sep 05 2020 at 04:29):
the repeat actually has more failed traversals than that
Simon Hudon (Sep 05 2020 at 04:29):
Your estimate is that you'd roughly cut down the number of traversals by half?
Mario Carneiro (Sep 05 2020 at 04:30):
at the very least the repeat must do 3 traversals for a successful run
Scott Morrison (Sep 05 2020 at 04:30):
Don't simp
and dsimp
by themselves only ever do a single traverse? Sometimes they act multiple times on the same subexpression, but they never "go back".
Simon Hudon (Sep 05 2020 at 04:31):
No, on their own they also repeat. They traverse bottom up and then they try again. They stop when they can't make any more changes
Simon Hudon (Sep 05 2020 at 04:31):
(unless you turn on the single_pass
option)
Mario Carneiro (Sep 05 2020 at 04:32):
consider a case where we want to do simp, dsimp, simp
:
simp -> succeed
simp -> fail
dsimp -> succeed
simp -> succeed
simp -> fail
dsimp -> fail
that's 6 traversals where 3 were done before and 1 could do the job in principle
Scott Morrison (Sep 05 2020 at 04:32):
I thought single_pass
just prevented it from rewriting in an already rewritten subexpression
Scott Morrison (Sep 05 2020 at 04:33):
I've been trying to read https://github.com/leanprover-community/lean/blob/master/src/library/tactic/simplify.cpp looking for the spot where simp
stops looking inwards because it can't find an appropriate congr
. And failing. :-)
Simon Hudon (Sep 05 2020 at 04:34):
Look for the function visit_app
Simon Hudon (Sep 05 2020 at 04:35):
@Mario Carneiro I believe you. I don't have a firm grip on the number of passes that simp
performs on average. If you have the time to implement your idea, I'd love to see the difference
Mario Carneiro (Sep 05 2020 at 04:36):
It is possible to get fewer than 6 calls with better state management even if you only treat simp
and dsimp
as black boxes
Mario Carneiro (Sep 05 2020 at 04:37):
if you assume that simp
never succeeds twice in a row (which is false, but close enough) then you can get it down to 4 calls in the example
Simon Hudon (Sep 05 2020 at 04:38):
That would be worth experimenting with. This is something that can be done without going into the C++ code I think
Scott Morrison (Sep 05 2020 at 04:38):
So is Mario's suggestion to revert to dsimp
mode if congr lemmas aren't available a suggestion to add a line before
https://github.com/leanprover-community/lean/blob/master/src/library/tactic/simplify.cpp#L769
Scott Morrison (Sep 05 2020 at 04:39):
that does definitional simplification on arg
?
Mario Carneiro (Sep 05 2020 at 04:40):
I think we want to enhance the defeq_canonize
step at the beginning of that function
Simon Hudon (Sep 05 2020 at 04:42):
For comparison, you can look at what dsimplify.cpp
does in visit_app
Mario Carneiro (Sep 05 2020 at 04:43):
https://github.com/leanprover-community/lean/blob/master/src/library/tactic/dsimplify.cpp#L188
Simon Hudon (Sep 05 2020 at 04:43):
I think you're right Mario, defeq canonize is a good place to look
Mario Carneiro (Sep 05 2020 at 04:44):
what's not clear to me is whether to basically call dsimp
on the expression, or just do a head simplification; the first sounds like it might have exponential blowup but the second is probably incomplete
Mario Carneiro (Sep 05 2020 at 04:46):
In order to incorporate it in the traversal, we need to analyze the congr lemma to find all arguments that don't change, and dsimp
in them before traversing the hypotheses of the congr lemma
Scott Morrison (Sep 05 2020 at 04:51):
Is it really necessary to do this analysis? I feel like we'd get most of the benefit just by doing the dsimp
on the arg
of a dependent function, when simp
has fallen back just using congr_fun
.
Mario Carneiro (Sep 05 2020 at 04:52):
It comes up pretty frequently in automatically generated congr lemmas, for example a = b -> c = d -> @add A inst a c = @add A inst b d
Mario Carneiro (Sep 05 2020 at 04:52):
here if we loop over just the hypotheses we will miss A
and inst
Mario Carneiro (Sep 05 2020 at 04:55):
We can probably modify try_auto_eq_congr
to do something else in the Fixed
and FixedNoParam
cases
Scott Morrison (Sep 05 2020 at 05:15):
Going back to the repeat {simp <|> dsimp}
suggestion, I think
import tactic.core
namespace tactic
meta def alternate : tactic unit → tactic unit → tactic unit
| t₁ t₂ := t₁ >> try (alternate t₂ t₁)
meta def alternate' (t₁ t₂ : tactic unit) : tactic unit :=
(t₁ >> try (alternate t₂ t₁)) <|> (t₂ >> try (alternate t₁ t₂))
meta def simp' := alternate' `[dsimp] `[simp]
end tactic
would work.
Scott Morrison (Sep 05 2020 at 05:15):
Should I bother dressing this up (e.g. argument handling an interactive simp'
)?
Scott Morrison (Sep 05 2020 at 05:17):
(Of course I'd be perfectly happy if making a PR of this provoked someone into working out how to improve simp
in core, thereby making it obsolete!)
Scott Morrison (Sep 05 2020 at 05:17):
But on the other hand if people wouldn't want to allow simp'
as a valid terminal step in proofs I'm not keen to do the work.
Simon Hudon (Sep 05 2020 at 05:26):
That looks like a reasonable workaround to me. Given how prevalent the pattern is in category theory code, that looks like a nice improvement. And we could support it with squeeze_*
too
Gabriel Ebner (Sep 05 2020 at 10:11):
The hsimp
tactic did this in '18: https://github.com/gebner/hott3/blob/7ead7a8a2503049eacd45cbff6587802bae2add2/src/hott/init/meta/simp.lean
Last updated: Dec 20 2023 at 11:08 UTC