Zulip Chat Archive

Stream: mathlib4

Topic: noisy simp says in CI

Alex Meiburg (Sep 08 2025 at 03:49):

I can't be the only one who's been plagued by noisy simp? says causing CI to fail, right? e.g. https://github.com/leanprover-community/mathlib4/actions/runs/17538151883/job/49804893528?pr=26331

I think in basically any scenario where the simp is _confluent_ but there are _multiple paths_ to simplify into a common form, simp? says is bound to pseudorandomly vary. And it makes it very annoying to fix when it twiddles back and forth all over mathlib in files you're not touching

Anne Baanen (Sep 08 2025 at 11:10):

I would be in favour of moving everything back to a plain simp.

Kevin Buzzard (Sep 08 2025 at 11:24):

In another (possibly private) channel, @Kim Morrison suggested that simp? says was a "failed experiment" so I would imagine that nobody will be objecting to removing them from mathlib.

Mauricio Collares (Sep 08 2025 at 11:27):

Another reason for removing simp? says is that it causes differences between CI oleans and locally-built (i.e. without setting the CI environment variable to true) oleans.

Mauricio Collares (Sep 08 2025 at 11:30):

Kevin Buzzard said:

In another (possibly private) channel, Kim Morrison suggested that simp? says was a "failed experiment" so I would imagine that nobody will be objecting to removing them from mathlib.

#nightly-testing > Mathlib status updates @ 💬

Yaël Dillies (Sep 08 2025 at 12:38):

Kevin Buzzard said:

@Kim Morrison suggested that simp? says was a "failed experiment" so I would imagine that nobody will be objecting to removing them from mathlib.

Don't remove Kim from mathlib! :fear:

Anne Baanen (Sep 08 2025 at 12:59):

I'll get started on a PR to remove simp? says.

Sébastien Gouëzel (Sep 08 2025 at 13:09):

I'm a big fan of simp? says: when there are very heavy simps, it is very valuable information to know which lemmas are interesting (those not in the default simp set), and which are not (coming from the default simp set). Otherwise, when the simp only breaks, it is very hard to fix. This happens often in differential geometry, which is simp very heavy. I already said that in #mathlib4 > Lint error with `Simp says` @ 💬

Kim Morrison (Sep 08 2025 at 13:13):

I think the solution is to write:

a wrapper for simp that allows you to write something like simp only [foo, bar] ++ [X, Y], and just translates this to simp only [foo, bar, X, Y]
maybe this wrapper should verify that everything in the first set is not in the standard simp set, and everything in the second set is
then some simp?' that generates this sort of output

Kim Morrison (Sep 08 2025 at 13:15):

arguably all simp only should be converted to this form

Kim Morrison (Sep 08 2025 at 13:16):

maybe we could write simp only [foo, bar; X, Y] or similar to keep the noise down

Alex Meiburg (Sep 08 2025 at 13:17):

Kim Morrison said:

arguably all simp only should be converted to this form

I not infrequently write simp only [foo] with just one argument, because really I am doing a sort of rewrite, under binders, and applying foo in several distinct places. I feel like that would be an exception. Although maybe simp only [foo;] or simp only [;foo] isn't that much worse. (I think it would be one extra thing to scare newcomers though.)

Robin Carlier (Sep 08 2025 at 13:19):

Yes, there are many cases of simp only's where everything is in the standard simp set (usually, Category.assoc, Functor.map_comp, etc.), simp only [] ++ [...] would feel very weird...

Kim Morrison (Sep 08 2025 at 13:19):

One thing that is unfortunate about the simp only [foo, bar; X, Y] proposal is that often the order of the lemmas is actually pretty close to the order they are applied in (indeed, sometimes one can successfully replace a simp only with a simp_rw in Mathlib), and having to split things before and after a divider obscures this.

Kim Morrison (Sep 08 2025 at 13:20):

I agree that often everything is in the simp set. So these should just stay with the current syntax.

Anne Baanen (Sep 08 2025 at 13:33):

I consider there are three kinds of simp? says:

a terminal simp?/simpa?: I'm turning this into simp/simpa
simp? [foo] says simp only [foo]: I'm turning this into simp only [foo]
simp? [foo] says simp only [foo, ...] with a long list of arguments: here we seem to have the actual disagreements.

I'll make sure to batch the changes into these three kinds, so we can deal with them in turn.

An alternative approach for 3 is to make it more declarative: instead of simp?/simp only we can write a suffices with the intended goal type.

Kevin Buzzard (Sep 08 2025 at 14:04):

Indeed suffices was a common Lean 3 idiom in this setting.

Damiano Testa (Sep 08 2025 at 14:25):

Would keeping the order, but marking with a + the lemmas that appear before squeezing be reasonable? There is already - to remove from the simp set, so having a + to mean "really add this" could work?

Weiyi Wang (Sep 08 2025 at 15:35):

I have turned to suffices a few times when the simp only list looks long and ugly

Anne Baanen (Sep 08 2025 at 16:42):

Alright, PRs are in. I looked over all sayses excluding those in MathlibTest and Mathlib/Tactic/Says. The edits are split out by controversiality:

Replacing simp? says with a terminal tactic: #29440
Replacing simp? says with suffices/have: #29441
Replacing simp? says with simp only: #29442

Sébastien Gouëzel (Sep 08 2025 at 16:53):

Thanks! I'm very happy with the first two, still not completely convinced by the last one. Random example: going from

  simp? [HomologicalComplex₂.total_d, h₁.comm i₁, dFrom, fromNext, toPrev, dTo] says
    simp only [ι_mapBifunctorMap, h₁.comm i₁, dNext_eq_dFrom_fromNext, dFrom, fromNext,
      AddMonoidHom.mk'_apply, prevD_eq_toPrev_dTo, toPrev, dTo, Functor.map_add,
      Functor.map_comp, NatTrans.app_add, NatTrans.comp_app,
      Preadditive.add_comp, assoc, HomologicalComplex₂.total_d,
      Functor.mapBifunctorHomologicalComplex_obj_obj_toGradedObject, Preadditive.comp_add,
      HomologicalComplex₂.ι_D₁_assoc, Functor.mapBifunctorHomologicalComplex_obj_obj_X_X,
      HomologicalComplex₂.ι_D₂_assoc, add_left_inj]

  simp only [ι_mapBifunctorMap, h₁.comm i₁, dNext_eq_dFrom_fromNext, dFrom, fromNext,
    AddMonoidHom.mk'_apply, prevD_eq_toPrev_dTo, toPrev, dTo, Functor.map_add,
    Functor.map_comp, NatTrans.app_add, NatTrans.comp_app,
    Preadditive.add_comp, assoc, HomologicalComplex₂.total_d,
    Functor.mapBifunctorHomologicalComplex_obj_obj_toGradedObject, Preadditive.comp_add,
    HomologicalComplex₂.ι_D₁_assoc, Functor.mapBifunctorHomologicalComplex_obj_obj_X_X,
    HomologicalComplex₂.ι_D₂_assoc, add_left_inj]

looks like a worsening to me.

Alex Meiburg (Sep 08 2025 at 17:30):

What about a compromise of putting just a comment above?

-- simp [HomologicalComplex₂.total_d, h₁.comm i₁, dFrom, fromNext, toPrev, dTo]
simp only [ι_mapBifunctorMap, h₁.comm i₁, dNext_eq_dFrom_fromNext, dFrom, fromNext,
    AddMonoidHom.mk'_apply, prevD_eq_toPrev_dTo, toPrev, dTo, Functor.map_add,
    Functor.map_comp, NatTrans.app_add, NatTrans.comp_app,
    Preadditive.add_comp, assoc, HomologicalComplex₂.total_d,
    Functor.mapBifunctorHomologicalComplex_obj_obj_toGradedObject, Preadditive.comp_add,
    HomologicalComplex₂.ι_D₁_assoc, Functor.mapBifunctorHomologicalComplex_obj_obj_X_X,
    HomologicalComplex₂.ι_D₂_assoc, add_left_inj]

when we really want to preserve that information

Joël Riou (Sep 08 2025 at 17:51):

Yes, this is what I have been doing in my most recent code.

Patrick Massot (Sep 08 2025 at 18:13):

This kind of comment will automatically become a lie as the simp set changes.

Alex Meiburg (Sep 08 2025 at 18:22):

It's only a lie if you interpret it as a promise that "this code produces this output". It's not a lie if you interpret it as "this code was originally used to discover this other code", and they may well be all that is needed

Joël Riou (Sep 08 2025 at 19:01):

Indeed, I would usually write something like -- this was generated by simp? [...]. In case the proof breaks in the future, the information of the original "minimal" list of lemma can be very fulpful in order to fix the proof, and the comment can be updated.

Damiano Testa (Sep 08 2025 at 19:04):

I agree that the comment is better than nothing, but it saddens me that the final code gets obscured by squeezing the simp and that there is no good alternative to this.

Kim Morrison (Sep 09 2025 at 11:13):

Yes, let's not merge the 3rd PR.

Does anyone want to implement simp [X, Y; foo, bar] or simp [+X, +Y, foo, bar] as suggested above? Hopefully either of these can just be done in Mathlib itself.

Yaël Dillies (Sep 09 2025 at 12:07):

I like the second one a lot given the existence of the simp [-X, -Y, foo, bar] syntax

Eric Rodriguez (Sep 09 2025 at 13:51):

To throw another hat into the ring: before simp says, I wanted to implement a "simp caching" mechanism, where we would have the "short simps" of type 3 in the main codebase, whilst the long expanded simp onlys were either elsewhere or by default folded in the code editor. Importantly, the original command wasn't run [only type-checked for lemma renamings etc]; only this cached command. So how about we implement simp? said that does that?

Jireh Loreaux (Sep 09 2025 at 14:23):

Where would said cache be stored?

Eric Rodriguez (Sep 09 2025 at 14:41):

Back then, my vision was a separate file; but I think just keeping this as simp? said simp [...] would be fine

Alex Meiburg (Sep 09 2025 at 14:58):

I don't see how what you're describing is different from the current behavior of simp? says (when says.verifier is false)

Eric Rodriguez (Sep 09 2025 at 16:27):

Ah, I wasn't aware of says.verifer - I guess I'm then advocating for that to be off

Jireh Loreaux (Sep 09 2025 at 16:37):

It's off except in CI, where it's on so that we can make sure what it says still works before we merge. So it seems like you are advocating for the status quo.

Bryan Gin-ge Chen (Sep 09 2025 at 16:44):

There does seem to be some unfortunate interaction with the cache, where a PR can pass CI even if it causes says to fail if lake build ends up using the cache files for the files containing the outdated says. Then someone else who didn't cause the failure might end up having to fix it just because their PR happened to trigger a rebuild.

Eric Rodriguez (Sep 09 2025 at 17:39):

Jireh - I'm advocating that it should not cause CI failures even if the suggested text is completely wrong, as long as it's a valid lean tactic. Even something like exact h says simp [h, add_comm] should be fine.

To me , X says Y tells me: I wanted to apply tactic X, but this is not appropriate for mathlib due to [tactic stability/speed/otherwise], so therefore I must actually use tactic Y.

Eric Rodriguez (Sep 09 2025 at 17:44):

This would also allow for things like polyrith says XYZ, which could be nice

Alex Meiburg (Sep 09 2025 at 17:48):

That seems almost like having a comment, then, except that it gets checked for valid syntax, and gets correct syntax highlighting?

So is the idea that there would be a says present tense which means "this currently says this" and is checked in CI, and a said past tense which just means "this said this once but no promises now", and it ignores the verify flag?

Alex Meiburg (Sep 09 2025 at 17:51):

Or maybe instead of said, something like became , to remove the connotation that this was output (like in Eric's other examples)

Alex Meiburg (Sep 10 2025 at 03:22):

PR for this: #29488

Alex Meiburg (Sep 10 2025 at 03:23):

/poll Best way to express this?
X said Y
X became Y
This kind of tactic shouldn't exist

Thomas Murrills (Sep 10 2025 at 05:11):

I might propose once_said or says (on := "YYYY-MM-DD"), but I think it would be important to first collectively answer the following three questions:

Who exactly is the audience of this (effective) comment? What are they meant to get out of it? What are the costs?
When should it be used instead of says?
What should be the possible long-term fates of this code? What are the expectations for what to do about it when it becomes outdated? How should it be maintained?

My personal answer to (1) would be the same as any comment: "future maintainers of the proof, who want to know why it was written this way". Communicating the intent behind code, from past to future, seems useful in general, so this seems like a point in favor to me.

But this is a duty well-performed by an ordinary comment, so we've got the cost of new syntax and (arguably) readability. I think the only thing that would elevate said/once_said above an ordinary comment and warrant the syntax is if we could, actually, verify that it did say this, or check what it says now (even if, unlike says, we decline to do so during CI).

Are there other answers to (1)?

I'm not sure I see immediate, canonical answers to questions 2 and 3, and they're important questions for reviewing new code and maintaining existing code, which makes me pause.

I think a reasonable answer to (2) would be "only when both of the following are true: suggestions are liable to change or the result would likely not have been human-generated; the suggested tactic is a 'refined' or 'completed' version of the suggesting tactic (e.g. nothing similar to hint said ...)". The noisiness issue is mentioned in #29488, but not the other parts.

I think (3) presents a real ambiguity, though. If someone is cleaning up a proof and notices that we have tac1? once_said tac2 but tac1? now says tac3, when should they change it?

I'm tempted to suggest a "chore model" wherein we write says (on := "YYYY-MM-DD") such that an automated PR, made relatively infrequently (and independently of ordinary CI), could turn on an option, re-run the suggesting tactic for each dated says, and

if the result is the same, update the date to the current date
if the result is different, check if the resulting tactic state is (meaningfully) different than the one produced by the recorded suggestion. If so, report the change somewhere (e.g. on the PR description).

But that seems complex, and I'd be reluctant to introduce these sorts of long-term chores and extra automation without a sufficiently good reason.

Thomas Murrills (Sep 10 2025 at 05:12):

I wonder if we could fix the noisiness issue just by having some variant of says, like tac1? says +upToSyntax tac2, which still runs during every CI, but only compares the tactic state produced by tac1?'s current suggestion to the tactic state produced by tac2, ignoring any differences in syntax.

This protects against regressions (both still have to work) and alerts us to pro(?)gressions (where tac1? is able to suggest something better).

Are there cases where this wouldn't be what we want, and we'd still want something like said or says (on := "YYYY-MM-DD")?

Kim Morrison (Sep 10 2025 at 06:37):

I really do not like intentionally leaving unchecked code in the repository. It is just a recipe for accumulating rot.

Alex Meiburg (Sep 22 2025 at 13:54):

@Jovan Gerbscheid, re: your comment on #29488, there's more discussion here
At least one benefit for x said y over -- x \n y is that the former checks the syntax for correctness, and at least in principle stores some extra info in the syntax tree

Jovan Gerbscheid (Sep 22 2025 at 13:57):

I thought the point of said was that, as opposed to says, it does not verify the correctness of the tactic. So I don't understand why you would want it to check the syntax for correctness.

Jovan Gerbscheid (Sep 22 2025 at 14:01):

By the way I completely agree with you that simp? says is annoying in CI

Floris van Doorn (Nov 14 2025 at 11:41):

I'm coming late to this topic, after just realizing Mathlib doesn't contain any simp? says anymore.

IIUC, it was removed because CI was annoying, because unnecessary/unneeded changes to the RHS were suggested.
Did we consider changing the semantics of t says s (or similar syntax) to mean

For users: it executes s
For CI (or CI once a ~week): it checks that t and s both (independently) close the goal (assuming it's finishing). It does not check whether t produces the output s (which is noisy/fragile).

Damiano Testa (Nov 14 2025 at 11:53):

For that use, says seems less appropriate than as well as/alternatively.

Floris van Doorn (Nov 14 2025 at 13:51):

I'm happy to rename what the tactic is called, but I think the functionality is useful.

Last updated: Feb 28 2026 at 14:05 UTC