Zulip Chat Archive

Stream: general

Topic: Time complexity of simp with an associativity lemma


Bolton Bailey (Dec 07 2023 at 19:39):

What is the time complexity of simp when I run it with a single associativity lemma, say add_assoc? Clearly some simp lemmas will, abstractly speaking, loop infinitely if you run them (like running add_comm on a + b). But if I run simp [add_assoc] on some expression of length n, is it guaranteed to run in O(n) time?
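(For concreteness, a minimal instance of the kind of call I mean, using Nat.add_assoc; the trace option just shows which rewrites fire, and the example closes once both sides are right-associated:)

set_option trace.Meta.Tactic.simp.rewrite true in
example (a b c d : Nat) : ((a + b) + c) + d = a + (b + (c + d)) := by
  simp only [Nat.add_assoc]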

Bolton Bailey (Dec 07 2023 at 19:55):

I had a simp call time out with an associativity lemma that I didn't expect to time out if it really was a linear-time operation. This has left me wondering if there is room for some kind of simp_assoc lemma that simplifies in an asymptotically optimal way.

Yaël Dillies (Dec 07 2023 at 19:57):

assoc_rw?

Floris van Doorn (Dec 07 2023 at 19:59):

Note that simp [add_comm] doesn't loop! simp has loop detection:

example (a b : Nat) : a + b = 0 := by simp only [Nat.add_comm] -- no progress
example (a b : Nat) : b + a = 0 := by simp only [Nat.add_comm] -- rewrites once, then stops

Bolton Bailey (Dec 07 2023 at 20:01):

Yaël Dillies said:

assoc_rw?

Potentially very cool! However, I don't really understand what is meant by the docstring "behaves like rewrite [h₀,← h₁] at ⊢ h₂ with the exception that associativity is used implicitly to make rewriting possible."

Floris van Doorn (Dec 07 2023 at 20:04):

simp only [add_assoc] will rewrite using add_assoc eagerly, starting at the leaves of the expression and then moving outwards. So that is O(n log n), I think?

Bolton Bailey (Dec 07 2023 at 20:05):

It seems like the assoc_rw tactic is less about simplification and more about being able to do other rewrites on a horribly parenthesized goal.

Floris van Doorn (Dec 07 2023 at 20:06):

Oh, no, probably O(n^2) if it's maximally associated in the other direction.

Bolton Bailey (Dec 07 2023 at 20:08):

Ok that's good to know, that could totally be causing the problem.

Bolton Bailey (Dec 07 2023 at 20:10):

Floris van Doorn said:

Oh, no, probably O(n^2) if it's maximally associated in the other direction.

Yeah that makes sense, because it first simplifies the left subtree, and then it needs an additional n steps to move the last addition to the other side.
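(Roughly: for a maximally left-associated sum of n terms, simplifying bottom-up costs something like T(n) ≈ T(n-1) + (n-2) rewrites with T(2) = 0, i.e. T(n) = (n-1)(n-2)/2 = O(n^2).)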

Bolton Bailey (Dec 07 2023 at 20:11):

I wonder if there's a way of automatically identifying lemmas/lemma sets that have this property and making a simplification procedure for them that's linear time.

Floris van Doorn (Dec 07 2023 at 20:12):

What is the optimal strategy for doing the rewrites? simp also has the capability to start rewriting the full term and only then go into subexpressions.

Floris van Doorn (Dec 07 2023 at 20:13):

section
attribute [local simp] Nat.add_assoc
set_option trace.Meta.Tactic.simp.rewrite true in
example (a b c d e : Nat) : a + b + c + d + e = 3 := by simp; sorry -- 6 rewrites
end

section
attribute [local simp ↓] Nat.add_assoc
set_option trace.Meta.Tactic.simp.rewrite true in
example (a b c d e : Nat) : a + b + c + d + e = 3 := by simp; sorry -- 4 rewrites
end

Floris van Doorn (Dec 07 2023 at 20:17):

Or shorter example:

set_option trace.Meta.Tactic.simp.rewrite true
example (a b c d e : Nat) : a + b + c + d + e = 3 := by simp [Nat.add_assoc]; sorry -- 6 rewrites
example (a b c d e : Nat) : a + b + c + d + e = 3 := by simp [↓Nat.add_assoc]; sorry -- 4 rewrites

Bolton Bailey (Dec 07 2023 at 20:19):

If my only simp lemma is add_assoc, I could

  1. Construct a binary tree of subexpressions where + is not the top level operation of any leaf
  2. Flatten the tree into a singly-linked list.
  3. Recursively search leaf subexpressions for more +.

Which I think is linear time, at least if everything is reference-oriented enough that I don't have to copy anything.
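Something like this toy version of steps 1 and 2 (a sketch over a stand-in expression type, not Lean's actual Expr, just to show the single linear pass):

-- A toy expression type, only for illustration.
inductive E where
  | leaf : Nat → E
  | add  : E → E → E
  deriving Repr

-- `go e acc` builds the right-associated sum of the leaves of `e`, followed by `acc`.
-- Each node is visited once, so the whole pass is linear in the size of `e`.
def go : E → Option E → E
  | .leaf n,  none     => .leaf n
  | .leaf n,  some acc => .add (.leaf n) acc
  | .add a b, acc      => go a (some (go b acc))

def reassoc (e : E) : E := go e none

-- ((1 + 2) + 3) + 4  ↦  1 + (2 + (3 + 4))
#eval reassoc (.add (.add (.add (.leaf 1) (.leaf 2)) (.leaf 3)) (.leaf 4))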

Bolton Bailey (Dec 07 2023 at 20:22):

This might be equivalent to always rewriting at the top of the syntax tree first and then working downwards.

Bolton Bailey (Dec 07 2023 at 20:23):

(Is that what simp↓ does? I haven't seen that before) Yes
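(If I'm reading it right, there are two ways to get the top-down behaviour; my_mul_zero' below is just a made-up name:)

-- as an attribute on a (hypothetical) lemma:
@[simp↓] theorem my_mul_zero' (n : Nat) : n * 0 = 0 := Nat.mul_zero n

-- or per simp call, with a `↓` prefix in the lemma list:
example (x : Nat) : (x + 1) * 0 = 0 := by simp only [↓Nat.mul_zero]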

Bolton Bailey (Dec 07 2023 at 20:30):

Hmm, simp↓ is used exactly once in mathlib; it makes me wonder if there are efficiency gains we are missing out on.

Yaël Dillies (Dec 07 2023 at 20:30):

Worth investigating, for sure!

Bolton Bailey (Dec 07 2023 at 20:33):

Floris van Doorn said:

Or shorter example:

set_option trace.Meta.Tactic.simp.rewrite true
example (a b c d e : Nat) : a + b + c + d + e = 3 := by simp [Nat.add_assoc]; sorry -- 6 rewrites
example (a b c d e : Nat) : a + b + c + d + e = 3 := by simp [↓Nat.add_assoc]; sorry -- 4 rewrites

If you add f to the first expression, the number of rewrites goes up to 10. If you add g it goes up to 15.
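(Those counts, 6, 10, and 15, are (n-1)(n-2)/2 for n = 5, 6, and 7 summands, so the growth does look quadratic.)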

Bolton Bailey (Dec 07 2023 at 20:51):

Yaël Dillies said:

Worth investigating, for sure!

For example, reading the documentation of simp↓, it seems like a no-brainer to me that a lemma like mul_zero should be simp↓ rather than simp. Why should I simplify subexpressions when I am going to eliminate them anyway?
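Something like this toy comparison is what I have in mind (the factor is tiny here, but with the trace on, the add_zero rewrites inside it should disappear in the second call):

set_option trace.Meta.Tactic.simp.rewrite true
-- post only: the left factor is simplified before being thrown away
example (x : Nat) : (x + 0 + 0 + 0) * 0 = 0 := by simp only [Nat.add_zero, Nat.mul_zero]
-- with `↓`, mul_zero fires on the way down and the factor is never visited
example (x : Nat) : (x + 0 + 0 + 0) * 0 = 0 := by simp only [Nat.add_zero, ↓Nat.mul_zero]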

Bolton Bailey (Dec 07 2023 at 20:51):

And yet docs#Int.mul_zero seems to be simp, not simp↓. Why is this?

Yaël Dillies (Dec 07 2023 at 20:55):

In that case it might be worth trying out tagging some such lemmas across mathlib, then spending some time thinking about how to make @[simp] automatically decide how a lemma should be tagged.

Bolton Bailey (Dec 07 2023 at 20:57):

I mean I feel like any affine simplification that eliminates at least one subexpression should be simp↓, right? But aren't simp lemmas almost always affine anyway? I feel like you were the first one to point this out to me: that a rewrite lemma being affine is a good rule of thumb for whether it should be simp.

Floris van Doorn (Dec 07 2023 at 20:58):

Bolton Bailey said:

Yaël Dillies said:

Worth investigating, for sure!

For example, reading the documentation of simp↓, it seems like a no-brainer to me that a lemma like mul_zero should be simp↓ rather than simp. Why should I simplify subexpressions when I am going to eliminate them anyway?

Does that matter much? If my expression is x * 0, then presumably there is little (nothing) to simplify in 0, and I don't see why it matters whether I simplify with mul_zero first or simplify x first.
You might be right that it's quicker (we don't visit 0 if we simplify with ↓mul_zero), but I expect the gains to be tiny.

Bolton Bailey (Dec 07 2023 at 20:59):

Well, the point is that x could be some huge expression that could take forever to simplify, only to find out that it was eliminated anyway.

Yaël Dillies (Dec 07 2023 at 21:00):

Bolton Bailey said:

I mean I feel like any affine simplification that eliminates at least one subexpression should be simp↓, right? But aren't simp lemmas almost always affine anyway?

Yes, although I think a good deal of simp lemmas (maybe half? maybe a third?) don't eliminate any subexpression but instead shuffle things around.

Floris van Doorn (Dec 07 2023 at 21:00):

Oh wait, you're saying mul_zero, not add_zero. You're right.

Bolton Bailey (Dec 07 2023 at 21:00):

Maybe what I'm not understanding is the reason leaf-first is the default to begin with.

Bolton Bailey (Dec 07 2023 at 21:02):

Yaël Dillies said:

Bolton Bailey said:

I mean I feel like any affine simplification that eliminates at least one subexpression should be simp↓, right? But aren't simp lemmas almost always affine anyway?

Yes, although I think a good deal of simp lemmas (maybe half? maybe a third?) don't eliminate any subexpression but instead shuffle things around.

Yeah, clicking around in the grep, this seems to be the case.

Yaël Dillies (Dec 07 2023 at 21:06):

Bolton Bailey said:

Maybe what I'm not understanding is the reason leaf-first is the default to begin with.

You might want to investigate how complicated simp calls work in e.g. category theory. Usually, you can only simplify a function call once you know its input is of a certain shape.

Bolton Bailey (Dec 07 2023 at 21:08):

Ok, thinking about it more, I think I see the reason it's sometimes better to go leaf-first: some simp lemmas like sub_self reference an input twice, and so we would like to normalize the subexpressions to make it more likely that the lemma fires the first time we investigate it.
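A small made-up case of that pattern (the leaf rewrite happens first, and only then does sub_self match at the parent):

example (a : Nat) : (a + 0) - a = 0 := by
  simp only [Nat.add_zero, Nat.sub_self]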

Floris van Doorn (Dec 07 2023 at 21:10):

Bolton Bailey said:

For example, reading the documentation of simp↓, it seems like a no-brainer to me that a lemma like mul_zero should be simp↓ rather than simp. Why should I simplify subexpressions when I am going to eliminate them anyway?

Btw, what documentation did you read?

Bolton Bailey (Dec 07 2023 at 21:11):

I guess I don't understand how simp↓ works? If I make mul_zero a simp↓ lemma and then I simplify an expression a * b with mul_zero and a bunch of other leaf-first lemmas, is mul_zero investigated again each time b changes?

Floris van Doorn (Dec 07 2023 at 21:11):

Even with something like mul_zero you get more hits if you simplify the leaves first. You might have a * b when going down, and b might simplify to 0.

Bolton Bailey (Dec 07 2023 at 21:12):

Floris van Doorn said:

Bolton Bailey said:

For example, reading the documentation of simp↓, it seems like a no-brainer to me that a lemma like mul_zero should be simp↓ rather than simp. Why should I simplify subexpressions when I am going to eliminate them anyway?

Btw, what documentation did you read?

The docstring itself here https://github.com/leanprover/lean4/blob/c474dff38c401723226277240cd3baec0437de15/src/Init/Tactics.lean#L785

Sometimes searching github works better than searching https://leanprover-community.github.io/mathlib4_docs/

Floris van Doorn (Dec 07 2023 at 21:13):

Bolton Bailey said:

I guess I don't understand how simp↓ works? If I make mul_zero a simp↓ lemma and then I simplify an expression a * b with mul_zero and a bunch of other leaf-first lemmas, is mul_zero investigated again each time b changes?

I expect (but wanted to read it in some documentation) that it will visit the expression both before and after simplifying the leaves (and not at any time in between), but I'm not sure.

Floris van Doorn (Dec 07 2023 at 21:13):

Ah, the documentation of the simp attribute (not the simp tactic) mentions ↓.

Bolton Bailey (Dec 07 2023 at 21:19):

There's definitely a fundamental problem where, if I have both ↓mul_zero and ↓zero_mul and I am simplifying a * b, I have no a priori reason to think a is more likely to simplify to zero than b, so whichever I simplify first, there is only a 50-50 shot that I simplify the one which speeds things up for me.

Floris van Doorn (Dec 07 2023 at 21:22):

I expect that if you have a * b and neither a nor b is definitionally equal to 0, then it doesn't matter whether you use ↓. Neither of them will be tried after you simplify only one of the arguments (at least, that is what I strongly suspect).

Bolton Bailey (Dec 07 2023 at 21:27):

Well, wouldn't I be imposing an additional initial check on every application of simp to a goal with a top level * though? I would be sacrificing a lot of time over the whole library to make a few applications faster.

Floris van Doorn (Dec 07 2023 at 21:28):

No, I expect not, through the magic of discrimination trees. If your expression is a * b where neither a nor b is reducibly-defeq to 0, then mul_zero and zero_mul will not even be considered.

Floris van Doorn (Dec 07 2023 at 21:56):

I was very confused that the following example simplifies using Nat.mul_zero:

example (a b c : Nat) (hb : b = 0) : a * b = c := by
  simp only [Nat.mul_zero, hb];
  -- goal: ⊢ 0 = c
  sorry

But I figured it out: simp by default does multiple passes on your expression, as long as it changes. If you disable that, then Nat.mul_zero is not used to simplify (as I expected):

example (a b c : Nat) (hb : b = 0) : a * b = c := by
  simp (config := {singlePass := true}) only [Nat.mul_zero, hb];
  -- goal: ⊢ a * 0 = c
  sorry

My take-aways:

  • If you make a lemma a pre-simp lemma (i.e. with ↓), then (using the default settings) it will also be used as a post-simp lemma (i.e. a normal simp lemma), since it will simplify in the second pass.
  • However, all other post-simp lemmas will be applied before the pre-simp lemma is tried on the second pass, so there might still be a small performance boost from also making it a post-simp lemma.
  • I think it is worth investigating making all simp lemmas that eliminate a variable also pre-simp lemmas. Maybe that improves performance a bit.

Floris van Doorn (Dec 07 2023 at 21:57):

and indeed, using set_option trace.Debug.Meta.Tactic.simp true, Lean will tell you

[Debug.Meta.Tactic.simp] no theorems found for pre-rewriting a * b = c

so there should be no performance loss by making mul_zero a pre-simp lemma.

Jireh Loreaux (Dec 07 2023 at 22:21):

wtf is a "pre-simp" lemma? and where do I read about it?

Bolton Bailey (Dec 07 2023 at 22:23):

In the docstrings of the code in the lean4 github repository, apparently :smiling_devil:.

Jireh Loreaux (Dec 07 2023 at 22:25):

Thanks. :mind_blown:

Bolton Bailey (Dec 07 2023 at 22:32):

I would also like to say that I'd love to read something giving an in-depth description of how the simplifier works, including this "discrimination tree magic", and how it relates to the precise order in which nodes of the syntax tree are examined and the order in which lemmas are applied.

Bolton Bailey (Dec 07 2023 at 22:33):

"Discrimination tree" is a little hard to google explanations for, here is what I eventually found.

Mario Carneiro (Dec 08 2023 at 09:30):

@Floris van Doorn Could I interest you to come up with some heuristics to use for marking things as simp↓ based on your investigations, and maybe we could get a linter out of it?

Mario Carneiro (Dec 08 2023 at 09:32):

alternatively, we could implement the heuristic in core, changing what @[simp] does. Might be worth a benchmark

Bolton Bailey (Dec 08 2023 at 09:45):

I am curious too, so I went and made #8895 to test out adding some pre-simps to one file. Do I need special permission to use the !bench command, or can I just use it?

Floris van Doorn (Dec 08 2023 at 11:42):

Mario Carneiro said:

Floris van Doorn Could I interest you to come up with some heuristics to use for marking things as simp↓ based on your investigations, and maybe we could get a linter out of it?

I think further investigation is needed, e.g. marking 1-10 lemmas as simp↓ and benchmarking that.
Oh, Bolton did just that. I'll remark further on that PR.

Floris van Doorn (Dec 08 2023 at 11:44):

Actually, let me continue here.

Looking at the changes, this looks like a 2.114% decrease in simp time across the whole library.

That is huge for marking 12 lemmas as simp↓, since most simp calls will not involve these lemmas (even though the bot thinks it's not significant).
I don't know what the uncertainty is for these percentages (maybe we should add another test making ~100 more simp lemmas pre-simp), but it seems that this heuristic has merit.

Floris van Doorn (Dec 08 2023 at 11:45):

I don't know if I have time for this soon, so if @Bolton Bailey is willing to continue this investigation, you have my blessing!

Mario Carneiro (Dec 08 2023 at 11:58):

I think 2% is typical noise for the simp metric, based on the recent history

Mario Carneiro (Dec 08 2023 at 12:01):

well, the average deviation seems to be more like 1%

Sebastian Ullrich (Dec 08 2023 at 12:18):

If you can figure out a stand-alone benchmark for something like this, please suggest one! The biggest issue with the various submetrics in the build benchmark is that we don't have instruction counts for them, which usually have ~100x less noise than time metrics.

Mario Carneiro (Dec 08 2023 at 12:20):

Does collecting heartbeats decrease noise?

Mario Carneiro (Dec 08 2023 at 12:20):

Or is that just a "time metric"

Sebastian Ullrich (Dec 08 2023 at 12:23):

Heartbeats, I would hope, are completely deterministic, but I have no idea whether it is a good metric. Perhaps, like instructions, it can at least be used to judge whether a time regression is "reasonable". And we could feasibly measure it for subtasks like simp, yes.
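(If I remember correctly, Mathlib's Mathlib.Util.CountHeartbeats has a count_heartbeats in command that reports the heartbeats used by a single declaration, so something like this might already give a deterministic per-call number:)

import Mathlib.Util.CountHeartbeats

count_heartbeats in
example (a b c d e : Nat) : a + b + c + d + e = 3 := by simp [Nat.add_assoc]; sorry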

Sebastian Ullrich (Dec 08 2023 at 12:23):

Perhaps it's even feasible to measure instructions per subtask, I haven't looked into that before

Mario Carneiro (Dec 08 2023 at 12:26):

why wouldn't it be a good metric? For simp it seems like it would be a very direct correlation

Mario Carneiro (Dec 08 2023 at 12:26):

are you concerned about loops that don't allocate?

Sebastian Ullrich (Dec 08 2023 at 12:27):

We did have such a bug at least once before, yes. But more generally, code changes that affect real time but not (as much) heartbeats.

Mario Carneiro (Dec 08 2023 at 12:27):

we can just collect both metrics

Mario Carneiro (Dec 08 2023 at 12:28):

probably a better use of metrics than the thousands of single-file benchmarks

Floris van Doorn (Dec 08 2023 at 12:28):

Another question: does simp cache failures to simplify? I'm wondering whether tagging lemmas with @[simp↓, simp high] might be worth it to try them both going down and first when going up. But if failures are not cached, this might be expensive (especially with multiple passes).

Bolton Bailey (Dec 08 2023 at 13:18):

Floris van Doorn said:

I don't know if I have time for this soon, so if Bolton Bailey is willing to continue this investigation, you have my blessing!

I'm happy to keep looking into it, though I feel I may be shooting in the dark since I am not familiar with this benchmarking framework. I agree with @Sebastian Ullrich that it would be nice to have a less noisy, deterministic benchmark to look at, so I can know what is noise and what is meaningful.

My strategy for the PR was very simple so far, since at this point my understanding of the internal workings of simp is pretty minimal: I looked for simp lemmas that didn't have any Prop arguments and referenced each variable only once, and made them pre-simp if they eliminated a variable and didn't duplicate any other variables.
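For example, just to illustrate the filter (not the actual list from the PR): mul_zero (a * 0 = 0) qualifies, since a occurs once and is eliminated; add_zero (a + 0 = a) doesn't eliminate a variable, only the literal 0; sub_self (a - a = 0) mentions a twice on the left, so it's out; and add_assoc eliminates nothing, it just reassociates.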

Floris van Doorn (Dec 08 2023 at 15:12):

Yes, that sounds like a good strategy.
I see you made a new commit; that is good. Hopefully we can make the signal big enough that it's clearly bigger than any noise.

Bolton Bailey (Dec 10 2023 at 12:20):

Seems the new commit isn't any faster than what we started with :sad:. I wouldn't want to go much further marking all these lemmas by hand, so I recommend postponing this investigation until either we have a less noisy speed metric or I/someone writes a script to identify good pre-simp candidates.


Last updated: Dec 20 2023 at 11:08 UTC