Zulip Chat Archive

Stream: general

Topic: Make `simp` more powerful than `dsimp`?

Scott Morrison (Sep 05 2020 at 04:08):

Would it be possible to change the behaviour of simp (i.e. the C++ implementation), so that whenever it is trying to use a _refl_lemma, it behaves as dsimp?

Scott Morrison (Sep 05 2020 at 04:09):

It would be great to avoid the need for dsimp, simp which we see so often in the category theory library, and explained in this library note:
https://github.com/leanprover-community/mathlib/blob/master/src/category_theory/category/default.lean#L249

Scott Morrison (Sep 05 2020 at 04:09):

(I've just found myself writing simp, dsimp, simp.)

Mario Carneiro (Sep 05 2020 at 04:20):

Ideally, simp should traverse the term, recursing into all subterms, but when the subterm is not one provided by the congr lemma it should switch to "definitional mode" where it only does things dsimp would do. That algorithm yields simp and dsimp as special cases

Mario Carneiro (Sep 05 2020 at 04:21):

Alternatively, we can just make simp' a wrapper around repeat {simp <|> dsimp} which is a lot easier to do. :)

Scott Morrison (Sep 05 2020 at 04:22):

I'd certainly be happy with that. Obviously it's slower.

Scott Morrison (Sep 05 2020 at 04:23):

Presumably it would need to support simp' [X, Y, Z] and simp' only [X, Y, Z] and so on.

Simon Hudon (Sep 05 2020 at 04:24):

One of the overhead there is building the simp_lemmas lists multiple times which we can skip. Where is the rest of the inefficiency coming from?

Scott Morrison (Sep 05 2020 at 04:24):

Multiple traversals of the expression.

Simon Hudon (Sep 05 2020 at 04:27):

But that's already part of what simp does. Do we cut down the number if we merge the two tasks? I'm not sure because simp and dsimp have distinct tasks. simp has to be careful not to break the type correctness of the expression while dsimp can be more aggressive.

Mario Carneiro (Sep 05 2020 at 04:28):

the approach I described doesn't break type correctness

Mario Carneiro (Sep 05 2020 at 04:28):

two traversals is more expensive than one

Mario Carneiro (Sep 05 2020 at 04:29):

the repeat actually has more failed traversals than that

Simon Hudon (Sep 05 2020 at 04:29):

Your estimate is that you'd roughly cut down the number of traversals by half?

Mario Carneiro (Sep 05 2020 at 04:30):

at the very least the repeat must do 3 traversals for a successful run

Scott Morrison (Sep 05 2020 at 04:30):

Don't simp and dsimp by themselves only ever do a single traverse? Sometimes they act multiple times on the same subexpression, but they never "go back".

Simon Hudon (Sep 05 2020 at 04:31):

No, on their own they also repeat. They traverse bottom up and then they try again. They stop when they can't make any more changes

Simon Hudon (Sep 05 2020 at 04:31):

(unless you turn on the single_pass option)

Mario Carneiro (Sep 05 2020 at 04:32):

consider a case where we want to do simp, dsimp, simp:

simp -> succeed
simp -> fail
dsimp -> succeed
simp -> succeed
simp -> fail
dsimp -> fail

that's 6 traversals where 3 were done before and 1 could do the job in principle

Scott Morrison (Sep 05 2020 at 04:32):

I thought single_pass just prevented it from rewriting in an already rewritten subexpression

Scott Morrison (Sep 05 2020 at 04:33):

I've been trying to read https://github.com/leanprover-community/lean/blob/master/src/library/tactic/simplify.cpp looking for the spot where simp stops looking inwards because it can't find an appropriate congr. And failing. :-)

Simon Hudon (Sep 05 2020 at 04:34):

Look for the function visit_app

Simon Hudon (Sep 05 2020 at 04:35):

@Mario Carneiro I believe you. I don't have a firm grip on the number of passes that simp performs on average. If you have the time to implement your idea, I'd love to see the difference

Mario Carneiro (Sep 05 2020 at 04:36):

It is possible to get fewer than 6 calls with better state management even if you only treat simp and dsimp as black boxes

Mario Carneiro (Sep 05 2020 at 04:37):

if you assume that simp never succeeds twice in a row (which is false, but close enough) then you can get it down to 4 calls in the example

Simon Hudon (Sep 05 2020 at 04:38):

That would be worth experimenting with. This is something that can be done without going into the C++ code I think

Scott Morrison (Sep 05 2020 at 04:38):

So is Mario's suggestion to revert to dsimp mode if congr lemmas aren't available a suggestion to add a line before
https://github.com/leanprover-community/lean/blob/master/src/library/tactic/simplify.cpp#L769

Scott Morrison (Sep 05 2020 at 04:39):

that does definitional simplification on arg?

Mario Carneiro (Sep 05 2020 at 04:40):

I think we want to enhance the defeq_canonize step at the beginning of that function

Simon Hudon (Sep 05 2020 at 04:42):

For comparison, you can look at what dsimplify.cpp does in visit_app

Mario Carneiro (Sep 05 2020 at 04:43):

https://github.com/leanprover-community/lean/blob/master/src/library/tactic/dsimplify.cpp#L188

Simon Hudon (Sep 05 2020 at 04:43):

I think you're right Mario, defeq canonize is a good place to look

Mario Carneiro (Sep 05 2020 at 04:44):

what's not clear to me is whether to basically call dsimp on the expression, or just do a head simplification; the first sounds like it might have exponential blowup but the second is probably incomplete

Mario Carneiro (Sep 05 2020 at 04:46):

In order to incorporate it in the traversal, we need to analyze the congr lemma to find all arguments that don't change, and dsimp in them before traversing the hypotheses of the congr lemma

Scott Morrison (Sep 05 2020 at 04:51):

Is it really necessary to do this analysis? I feel like we'd get most of the benefit just by doing the dsimp on the arg of a dependent function, when simp has fallen back just using congr_fun.

Mario Carneiro (Sep 05 2020 at 04:52):

It comes up pretty frequently in automatically generated congr lemmas, for example a = b -> c = d -> @add A inst a c = @add A inst b d

Mario Carneiro (Sep 05 2020 at 04:52):

here if we loop over just the hypotheses we will miss A and inst

Mario Carneiro (Sep 05 2020 at 04:55):

We can probably modify try_auto_eq_congr to do something else in the Fixed and FixedNoParam cases

Scott Morrison (Sep 05 2020 at 05:15):

Going back to the repeat {simp <|> dsimp} suggestion, I think

import tactic.core

namespace tactic

meta def alternate : tactic unit → tactic unit → tactic unit
| t₁ t₂ := t₁ >> try (alternate t₂ t₁)

meta def alternate' (t₁ t₂ : tactic unit) : tactic unit :=
(t₁ >> try (alternate t₂ t₁)) <|> (t₂ >> try (alternate t₁ t₂))

meta def simp' := alternate' `[dsimp] `[simp]

end tactic

would work.

Scott Morrison (Sep 05 2020 at 05:15):

Should I bother dressing this up (e.g. argument handling an interactive simp')?

Scott Morrison (Sep 05 2020 at 05:17):

(Of course I'd be perfectly happy if making a PR of this provoked someone into working out how to improve simp in core, thereby making it obsolete!)

Scott Morrison (Sep 05 2020 at 05:17):

But on the other hand if people wouldn't want to allow simp' as a valid terminal step in proofs I'm not keen to do the work.

Simon Hudon (Sep 05 2020 at 05:26):

That looks like a reasonable workaround to me. Given how prevalent the pattern is in category theory code, that looks like a nice improvement. And we could support it with squeeze_* too

Gabriel Ebner (Sep 05 2020 at 10:11):

The hsimp tactic did this in '18: https://github.com/gebner/hott3/blob/7ead7a8a2503049eacd45cbff6587802bae2add2/src/hott/init/meta/simp.lean

Last updated: May 02 2025 at 03:31 UTC