Zulip Chat Archive

Stream: mathlib4

Topic: request: discourage squeezing terminal simps?

Kim Morrison (Dec 19 2024 at 21:09):

A general request to reviewers. I've just been dealing with the Mathlib fallout of fixing a simp bug (cf lean#6397), and I've been encountering a lot of entirely unnecessary simp only in terminal positions, where a much shorter (and generally more robust) simp statement works fine.

Could reviewers please be on the watch for this, and discourage contributors from needlessly converting simp to simp only in terminal position? It's great to do this if otherwise there is a performance problem, but these should probably be explained with comments! Otherwise, it's just unnecessarily verbose and fragile.

Kim Morrison (Dec 20 2024 at 00:38):

c.f. #20097

Michael Rothgang (Dec 20 2024 at 10:05):

@Kim Morrison I've looked at the post-merge benchmarking output of #20097 and #20098, and Mathlib/ModelTheory/Algebra/Ring/Definability.lean regresses by a factor of 1.85 (5 -> 10*10⁹ instructions). That is the only such regression. Is there a passing testing branch on which I can squeeze this myself?/Can I entice you to doing so?

Johan Commelin (Dec 20 2024 at 10:07):

lean-pr-testing-6397 should be that branch

Johan Commelin (Dec 20 2024 at 10:07):

I haven't yet merged master into it. Was planning to do that soonish

Johan Commelin (Dec 20 2024 at 10:07):

But I hope the merge is clean

Johan Commelin (Dec 20 2024 at 10:09):

Done, and pushed

Yaël Dillies (Dec 20 2024 at 10:29):

Kim Morrison said:

converting simp to simp only in terminal position [...] it's just unnecessarily verbose and fragile.

I've been saying this... :sweat_smile:

Michael Rothgang (Dec 20 2024 at 10:39):

There's no mathlib cache yet, right? Then I'll wait for that before I try to re-squeeze.

Michael Rothgang (Dec 20 2024 at 15:34):

#20120

Michael Rothgang (Dec 20 2024 at 15:35):

CI for that fails, but in downstream files and for reasons that look clearly unrelated to the PR.

Filippo A. E. Nuccio (Dec 20 2024 at 16:24):

Kim Morrison said:

A general request to reviewers. I've just been dealing with the Mathlib fallout of fixing a simp bug (cf lean#6397), and I've been encountering a lot of entirely unnecessary simp only in terminal positions, where a much shorter (and generally more robust) simp statement works fine.

Could reviewers please be on the watch for this, and discourage contributors from needlessly converting simp to simp only in terminal position? It's great to do this if otherwise there is a performance problem, but these should probably be explained with comments! Otherwise, it's just unnecessarily verbose and fragile.

And I guess that the same suggestion applies to simpa calls?

Yaël Dillies (Dec 30 2024 at 23:42):

Kim, are you planning on documenting this discouragement somewhere, eg the contribution guide? If not, I am happy to do this myself in a few days.

Michael Stoll (Jan 01 2025 at 19:44):

In #20362 I followed @Kim Morrison's request and un-squeezed the terminal simps. This resulted in a slowdown by ~30%. (I don't think the refactoring part of the PR should have a relevant influence on speed, but I could check that if desired.)

Assuming that this is typical, is the prevailing opnion that the reduced maintenance burden is worth a global increase of build time by 30% or so? (In particular in view of Mathlib's continual growth...)

Michael Stoll (Jan 01 2025 at 19:48):

Maybe I should really check this. The PR contains some additional material...

Ruben Van de Velde (Jan 01 2025 at 19:59):

I don't think it's usually that bad

Michael Rothgang (Jan 01 2025 at 20:04):

How many LOC are we talking about?

Michael Stoll (Jan 01 2025 at 20:43):

OK, with only un-squeezing the terminal simp onlys, the slow-down is ~3.8% (see #20382) (with 25 new simpss in ~500 LOC).

So it is indeed not that bad -- sorry for the noise!

Kim Morrison (Jan 03 2025 at 09:34):

Yaël Dillies said:

Kim, are you planning on documenting this discouragement somewhere, eg the contribution guide? If not, I am happy to do this myself in a few days.

That would be great, thank you @Yaël Dillies!

Yaël Dillies (Jan 03 2025 at 17:00):

https://github.com/leanprover-community/leanprover-community.github.io/pull/567

David Loeffler (Jan 19 2025 at 16:02):

I'd like to make a counter-argument here. I do understand @Kim Morrison 's maintainability concerns, but I think there are cases where squeezing terminal simps is actually better for maintainability, not worse.

Consider the situation where a proof ends with simp only [foo, bar] where foo is part of the standard simp set but baz is not. If either foo or bar gets renamed, then the simp will break. The alternative simp [bar] is immune to failure caused by renaming foo. However, it can fail in another, much more pernicious reason, which is if a new lemma gets added to the standard simp set which works against bar; and this kind of failure, while probably rarer, is going to be a whole lot harder to debug.

An extreme case of this (which is the example that alerted me to the existence of this new rule) came up in #20813, where I (as reviewer) had suggested a simp only with 4 explicitly given simp lemmas, of which none are part of the standard simp set. So in this case, the prescription according to Kim's new rule – changing

simp only [foo, bar, baz, quux]

simp [foo, bar, baz, quux]

– seems (to me) to bring no actual maintainability benefit in terms of robustness against renames; and possibly even a loss in terms of the risk of new simp lemmas being added to the standard simp set which might block foo, bar etc from applying, because {standard simp set} ∪ {foo, bar, baz, quux} may well not be confluent. (The latter may not be an issue if simp prioritizes lemmas explicitly given as arguments over library simp lemmas; I don't know much about the internals of simp.)

Perhaps it might be better to take a middle path, and say that if simp without arguments works as a terminal tactic, it should be left unsqueezed, but that terminal simp only may still be appropriate when the simp call needs to use a mixture of standard simp lemmas + other extra ones?

Yaël Dillies (Jan 19 2025 at 17:38):

Your example seems to argue that the rule should rather be "if a terminal simp does not use any lemma from the standard simp set, then it can be squeezed"

David Loeffler (Jan 20 2025 at 07:27):

That was not the intent of my argument. I presented this example as an extreme case, but there's a spectrum here.

I'm arguing that: if a terminal simp uses no lemmas from the standard simp set, then it should always be squeezed (as in my example from #20813); if it uses only lemmas from the standard set, it should never be squeezed; and if it is a mix of the two, then it should be up to the author's judgement whether to squeeze it or not.

Yaël Dillies (Jan 20 2025 at 07:29):

I definitely agree with your first two points, but the third one (the "mix") can be interpreted in a variety of different ways and personal experience suggests that it would better be interpreted as "almost always leave unsqueezed"

David Loeffler (Jan 20 2025 at 07:30):

Your personal experience and mine differ here, and that is precisely why I am suggesting that it should be up to the code author's judgement.

Yaël Dillies (Jan 20 2025 at 07:30):

Eg I am currently adapting the API for conditional expectation to conditional variance and (after fixing the statements the obvious way) most fixes to broken proofs have been "unsqueeze the simp"

Yaël Dillies (Jan 20 2025 at 07:31):

Sure, but can we trust beginners to do the right call? They tend to oversqueeze

Yaël Dillies (Jan 20 2025 at 07:33):

David Loeffler said:

it can fail in another, much more pernicious reason, which is if a new lemma gets added to the standard simp set which works against bar; and this kind of failure, while probably rarer, is going to be a whole lot harder to debug.

This, to me, is a very hypothetical situation. When such a simp failure happens, you should turn diagnostics on and inspect which lemmas were used.

David Loeffler (Jan 20 2025 at 10:15):

I am concerned that the situation might become less hypothetical in future, if we start introducing lots of un-squeezed terminal simps to mathlib. But time will tell.

(I would also like to suggest that it might be better if changes to library style / conventions, which all contributors are expected to abide by, could be announced a bit more prominently than this one was.)

Eric Wieser (Jan 20 2025 at 10:18):

Yaël Dillies said:

This, to me, is a very hypothetical situation. When such a simp failure happens, you should turn diagnostics on and inspect which lemmas were used.

If the simp was not squeezed, you now have to do this twice, once on your branch at and once on matter.

Eric Wieser (Jan 20 2025 at 10:19):

(and wait for the corresponding cache download and unpack)

Riccardo Brasca (Jan 20 2025 at 11:09):

Also, can we make this discussion public? I would like to point it out to a new contributor.

Notification Bot (Jan 20 2025 at 13:15):

This topic was moved here from #mathlib reviewers > request: discourage squeezing terminal simps? by Kim Morrison.

Edward van de Meent (Jan 20 2025 at 15:06):

:looking: new lore just dropped

David Loeffler (Jan 26 2025 at 16:02):

Yaël Dillies said:

This, to me, is a very hypothetical situation. When such a simp failure happens, you should turn diagnostics on and inspect which lemmas were used.

Yael's "very hypothetical situation" – or something very close to it – seems to be happening already.

In the course of reviewing #20661, I ran into several terminal simp only's (or simpa only's) which fail when changed into simp's, for reasons which I have not been able to identify. Apparently some lemma from the default simp set is conflicting with the explicitly-provided lemmas, but I have not been able to work out which; and prefacing the theorem concerned with set_option diagnostics true in does not provide any relevant information at all.

David Loeffler (Jan 26 2025 at 16:07):

If it is considered a desirable goal to avoid squeezing terminal simps, can we have some guidance on how to diagnose and correct failures of this type?

Ruben Van de Velde (Jan 26 2025 at 16:53):

If you write simp? [existing explicitly provided lemmas], does the output show lemmas that are not listed? What happens if your exclude those?

David Loeffler (Jan 26 2025 at 17:26):

Good point! Of course simp? will show what lemmas are applied even if it doesn't succeed in closing the goal. It took me a little while to realise how to extend this to simpa only, but the solution seems to be to replace simpa only [...] using foo with have := foo; simp? [...] at this ⊢ and see what lemmas get applied.

Michael Rothgang (Jan 28 2025 at 19:23):

#21179 squeezes a few slow terminal simps (and speeds up the file by 17%/ ~40*10⁹ instructions). Initially, I put comments that these were squeezed. @Sébastien Gouëzel suggested I remove them (as they distract from the mathematics). I agree they do, but also would like to note the simps being squeezed for performance reasons. What do others think?

Michael Rothgang (Jan 28 2025 at 19:23):

(My comments also mention specific timings; I'd be happy to remove these before merging.)

Mauricio Collares (Jan 28 2025 at 19:28):

Bad idea: Periodically, someone could run a script which auto-unsqueezes terminal simps whenever that does not increase heartbeats by more than a certain threshold. Then it would be clear the remaining ones are there for performance reasons.

David Loeffler (Jan 28 2025 at 20:38):

It would be good to have some clarity on how slow is “too slow”, since we now have the slightly absurd situation that there are simultaneously PRs submitted to unsqueeze simps and others to squeeze them.

David Loeffler (Jan 28 2025 at 20:48):

As for @Mauricio Collares suggestion: we have tooling for automatically squeezing simps, but not for unsqueezing them; it’s not easy to automate - sometimes it takes a bit of manual tinkering when a lemma from the standard simp set conflicts with an explicitly provided one. So I think your “bad idea” script would be very hard to write.

Jon Eugster (Jan 28 2025 at 20:52):

didn't we once have simp? says simp only [...] as a syntax which @Kim Morrison made for exactly this situation? What happened to that?

Damiano Testa (Jan 28 2025 at 21:05):

I think that says did not really work as expected. My impression is that it made code harder to maintain, since, for instance, any change in what simp? would output would trigger an error of says.

Damiano Testa (Jan 28 2025 at 21:06):

So, a simp producing a valid proof could stop working, since the output of simp? changed.

Jon Eugster (Jan 28 2025 at 21:14):

Maybe we could make two modifications to says to make it more usable:

do not run set_option says.verify true in CI: maybe we don't care whether the result is outdated as long as it still compiles?
If the latter part (i.e. the simp only) fails to compile, try to elaborate the former and suggest the new squeezed version if successful. (I never looked at how reviewdog works, but maybe that suggestion can be posted to the PR breaking a says automatically?)

Mauricio Collares (Jan 28 2025 at 21:23):

David Loeffler said:

As for Mauricio Collares suggestion: we have tooling for automatically squeezing simps, but not for unsqueezing them; it’s not easy to automate - sometimes it takes a bit of manual tinkering when a lemma from the standard simp set conflicts with an explicitly provided one. So I think your “bad idea” script would be very hard to write.

It would be very hard to write properly, but a naïve "replace a terminal simp only [...] by just simp, and see if that succeeds in closing the goal" shouldn't be too bad (famous last words). If it results in an error, just don't do anything. This still has the advantage that if you notice a terminal simp only than can be manually replaced by simp, then you know it's there for perf reasons.

Kevin Buzzard (Jan 28 2025 at 21:25):

I should think that what we really want is to delete "only" and then start deleting elements of the list and putting them back if things break.

David Loeffler (Jan 28 2025 at 21:27):

replace a terminal simp only [...] by just simp, and see if that succeeds in closing the goal

This will almost always fail, if the examples I encountered when making #21133 are at all representative: most of them have one or two manually-added lemmas alongside a bunch more from the standard simp set. A slightly less naive "replace a terminal simp only [lemmas] with simp [subset of those lemmas which are not @simp flagged]" would have a much better chance of working, but would need to be able to interrogate the current context to find out which lemmas have @simp activated.

David Loeffler (Jan 28 2025 at 21:29):

I should think that what we really want is to delete "only" and then start deleting elements of the list and putting them back if things break.

This is pretty much what I did by hand to make #21133; but sometimes it already breaks as soon as you delete the only, without changing the content of the [...] at all, and then you need to work out which standard simp lemma you need to explicitly nerf in order to stop it conflicting with the rest (neg_add_rev was a frequent culprit)

David Loeffler (Jan 28 2025 at 21:32):

(Tangentially related question: how hard would it be to make the hover tooltip on lemma names in VSCode tell you whether those lemmas are @simp or not?)

Damiano Testa (Jan 28 2025 at 21:48):

I do not know how to display this in VSCode, but you can easily write a command that checks if a declaration is marked with simp or not:

import Lean

@[simp]
theorem X : True := trivial

open Lean Elab Command Meta in
elab "#is_simp " ids:ident+ : command => do
  let se := simpExtension.getState (← getEnv)
  for id in ids do
    let nm ← liftCoreM do realizeGlobalConstNoOverload id
    if se.lemmaNames.contains (.decl nm) then
      logInfoAt id m!"'{.ofConstName nm}' has the `simp` attribute!"
    else
      logWarningAt id m!"'{.ofConstName nm}' does not have the `simp` attribute!"

#is_simp X

attribute [local simp] Nat.succ_mul in
#is_simp X Nat Nat.succ_mul

#is_simp X Nat Nat.succ_mul

Kim Morrison (Jan 28 2025 at 23:36):

I think my preference is for lots of comments: squeeze terminal simps whenever you feel there is a performance benefit (probably should be at least 10,000 heartbeats to bother?), but then leave a comment to that effect.

David Loeffler (Jan 29 2025 at 11:09):

Kim Morrison said:

I think my preference is for lots of comments: squeeze terminal simps whenever you feel there is a performance benefit (probably should be at least 10,000 heartbeats to bother?), but then leave a comment to that effect.

Just to clarify: do you mean the simp call alone should be taking more than 10k heartbeats in order to be squeezed, or the whole proof that contains it? (I don't know if there's a one-shot way to count heartbeats in an individual tactic step, although of course one can count for the whole proof pre & post squeezing and compare.)

Jon Eugster (Jan 29 2025 at 12:45):

David Loeffler said:

I don't know if there's a one-shot way to count heartbeats in an individual tactic step

import Mathlib

example : True := by
  skip
  #count_heartbeats simp -- 624

Jon Eugster (Jan 29 2025 at 12:49):

that said, I don't know the implementation and/or don't understand fully what heartbeats are:

import Mathlib

#count_heartbeats in -- Used 6 heartbeats, which is less than the current maximum of 200000.
example : True := by
  skip
  #count_heartbeats simp -- 624

Damiano Testa (Jan 29 2025 at 17:33):

I don't know if this is the reason, but there are heartbeats and "kiloheartbeats", but they are both called the same.

Damiano Testa (Jan 29 2025 at 17:33):

So, sometimes, you have to divide by 1000 or multiply by 1000 to get the same scale.

Damiano Testa (Jan 29 2025 at 17:34):

(I just made up kiloheartbeats, in case it was not clear!)

David Loeffler (Jan 29 2025 at 18:53):

Yes, this is a recurring issue: the whole-proof-level count_heartbeats in … uses kiloHBt, while the hashed #count_heartbeats tacseems to use raw HBt. I assume Kim’s intended threshold was 10,000 kHBt, which is several tens of seconds on my machine, not 10 kHBt, which is less than even the most trivial tactic steps.

Alex Meiburg (Jan 30 2025 at 04:58):

No one has mentioned simp [-foo, bar] as an option for making things more robust?

Not infrequently I have something with one non-simp lemma bar, and that together with some ten standard simp lemmas will close the goal, but it's in conflict with a standard simp lemma foo.

My usual go-to is a long simp only, but I've been toying with using simp lemma exclusion instead and I think it's more readable for myself later.

If/when the simp later breaks, if I run simp? [-foo, bar]? at that place, I'm pretty much guaranteed to see one simp lemma in the output whose left hand side matches that of foo, and that's the culprit to exclude.

Kim Morrison (Jan 30 2025 at 04:59):

(Just an aside: when you find yourself using -foo, please feel encouraged to investigate the impact of removing the @[simp] from foo entirely!)

Johan Commelin (Jan 30 2025 at 07:54):

Damiano Testa said:

So, sometimes, you have to divide by 1000 or multiply by 1000 to get the same scale.

Just like with calories :rofl:

Last updated: May 02 2025 at 03:31 UTC