Zulip Chat Archive

Stream: mathlib4

Topic: using `Iff` rfl proofs in `dsimp`

Thomas Murrills (Oct 13 2023 at 05:45):

Update: this is a little more subtle than I thought. I have an idea for what's wrong, but on second thought, I'm not completely certain about it. I'll think more carefully about this. :)

Utensil Song (Oct 13 2023 at 06:01):

Thomas Murrills said:

Issue: #2678
Draft PR forthcoming...

It seems to be lean4#2678 actually.

Thomas Murrills (Oct 13 2023 at 06:47):

Ah, thank you, you're right—I always forget the prefix. Edited. :)

Thomas Murrills (Oct 13 2023 at 06:52):

So, I believe the issue was that simp preprocesses Iff lemmas by applying propext to them. As such, we can't just look for Iff.rfl and the like when determining if something is a rfl theorem (i.e. is fit for use by dsimp); we need to peel away propext as well. We'll see if this notion holds up after the toolchain compiles and I test it. :)

Thomas Murrills (Oct 13 2023 at 19:32):

Good news: it works correctly! Bad news: std is now broken.

Thomas Murrills (Oct 13 2023 at 19:34):

But maybe only a little. :) It looks to be just a single proof (so far). Though I'm a bit worried as to what that means for mathlib...

Thomas Murrills (Oct 14 2023 at 21:59):

I think I've finally understood the issue with the std breakage. It's interesting!

The offending line is of the form simp [by simpa using foo]. This essentially means "simplify the type of foo, then use it as a simp lemma".

The simp in simpa here (and in general) doesn't include dsimp lemmas in its construction of the term. This PR makes more simp lemmas into dsimp lemmas, so when we elaborate by simpa using foo, we get a term that potentially has fewer lemmas in it. This means that when we now infer the type of that term, we might get one which is not as (d)simplified.

However, this is only the case if we instantiateMVars before inferring the type. Note that elab_type has some extra machinery in it to replicate some of the conditions of elaboration within simp (and this affects the result). (I'm guessing this is because something caches the specific type of the mvar being worked on by by simpa using foo, but I'm not sure; inferType is opaque.)

import Lean
import Std.Tactic.Simpa

open Lean Meta Elab Tactic

elab "elab_type" stx:term : tactic => withMainContext <| withReducible do
  let e ← Term.elabTerm stx none
  Term.synthesizeSyntheticMVars (mayPostpone := false) (ignoreStuckTC := true)
  let type ← inferType e
  logInfoAt stx m!"type: {indentD type}"
  let typeOfInstantiated ← inferType <|← instantiateMVars e
  logInfoAt stx m!"type of instantiated expr: {indentD typeOfInstantiated}"

structure Foo where a : Nat
instance : LE Foo where le x y := x.a ≤ y.a

-- Now counts as a `dsimp` theorem
@[simp] theorem Foo.mk_le_mk {a₁ a₂} : { a := a₁ : Foo } ≤ { a := a₂ } ↔ a₁ ≤ a₂ := Iff.rfl

theorem Foo.le_iff_zero (l r) : { a := l + r : Foo} ≤ { a := l } ↔ r = 0 := sorry

example := by
  elab_type by simpa using Foo.le_iff_zero
  exact 0
/-
Currently, without `Foo.mk_le_mk` as a `dsimp` theorem:
type:
  ∀ (l r : Nat), l + r ≤ l ↔ r = 0
type of instantiated expr:
  ∀ (a a_1 : Nat), a + a_1 ≤ a ↔ a_1 = 0

Now, with `Foo.mk_le_mk` as a `dsimp` theorem:
type:
  ∀ (l r : Nat), l + r ≤ l ↔ r = 0
type of instantiated expr:
  ∀ (l r : Nat), { a := l + r } ≤ { a := l } ↔ r = 0
-- ^ note that this is just `Foo.le_iff_zero`, since `simpa` doesn't use `Foo.mk_le_mk` to construct the term
-/

Note that this is not because of an outright defeq failure in Foo.mk_le_mk—if the two sides weren't defeq, we couldn't have used Iff.rfl. However, these are not reducibly defeq, which might contribute.

Note also that this is not specific to any details of how this PR handles Iff theorems—it only comes from the fact that more theorems are now able to behave this way. This behavior already happens with =—and, in fact, replacing ↔ with = in the above example causes current Lean to behave in the same way the PR behaves.

Thomas Murrills (Oct 14 2023 at 22:01):

So, here are the questions that this leaves me with:

Is the failure of reducible defeq really why simp can't use the new result of by simpa using ...?
Would simply inferring the type before instantiating mvars "fix" the situation? Or might it break many other things? (There's an easy way to find out, I guess...)
Should this simply be ignored for now/for this functionality? After all, it's extremely easy to patch over: simp [show <intended type> from by simpa using foo].

Thomas Murrills (Oct 14 2023 at 22:15):

(I've just realized I never linked the draft PR here directly. Here it is: lean4#2679)

Mario Carneiro (Oct 14 2023 at 22:18):

@Thomas Murrills said:

The simp in simpa here (and in general) doesn't include dsimp lemmas in its construction of the term. This PR makes more simp lemmas into dsimp lemmas, so when we elaborate by simpa using foo, we get a term that potentially has fewer lemmas in it. This means that when we now infer the type of that term, we might get one which is not as (d)simplified.

This seems counterintuitive to me. Why would simp not use a theorem because it is a dsimp lemma?

Mario Carneiro (Oct 14 2023 at 22:19):

I would expect simp to use @[simp] lemmas whether or not they are dsimp lemmas

Thomas Murrills (Oct 14 2023 at 22:22):

(Admittedly, I am still looking for where in the code that decision is actually made, but the only difference affecting a lemma's in- or exclusion seems to be its status as a dsimp lemma. It's possible the actual criterion is different.)

Mario Carneiro (Oct 14 2023 at 22:28):

Do you have a simpa-free example?

Thomas Murrills (Oct 14 2023 at 22:28):

Oh, I think the crucial thing might be here, in ..Simp.Rewrite:

private def tryTheoremCore ... (thm : SimpTheorem) ... : SimpM (Option Result) := do
  ...
      let proof? ← if thm.rfl then
        pure none

Thomas Murrills (Oct 14 2023 at 22:28):

Mario Carneiro said:

Do you have a simpa-free example?

I'll try to make one.

Mario Carneiro (Oct 14 2023 at 22:29):

that just says that if the lemma is a dsimp lemma then no proof is required

Mario Carneiro (Oct 14 2023 at 22:29):

it is still simplifying the expression itself

Thomas Murrills (Oct 14 2023 at 22:30):

Isn't saying no proof is required what we're talking about here, though? Like, simp(a) therefore doesn't insert the proof into the term it's constructing.

Thomas Murrills (Oct 14 2023 at 22:30):

Or is it used differently?

Thomas Murrills (Oct 14 2023 at 22:54):

Here's a simpa-free #mwe that doesn't require you to change your toolchain to observe the change in behavior.

import Lean

open Lean Meta Elab Tactic

elab "elab_term" stx:term : tactic => withMainContext <| withReducible do
  let e ← Term.elabTerm stx none
  Term.synthesizeSyntheticMVars (mayPostpone := false) (ignoreStuckTC := true)
  let type ← instantiateMVars <|← inferType e
  logInfoAt stx m!"term: {indentD e}"
  logInfoAt stx m!"type: {indentD type}"
  let typeOfInstantiated ← inferType <|← instantiateMVars e
  logInfoAt stx m!"type of instantiated expr: {indentD typeOfInstantiated}"

structure Foo where a : Nat
instance : LE Foo where le x y := x.a ≤ y.a

theorem Foo.le_iff_zero (l r) : { a := l + r : Foo } ≤ { a := l } ↔ r = 0 := sorry

-- NOT a dsimp lemma currently:
@[simp] theorem Foo.mk_le_mk {a₁ a₂} : ({ a := a₁ : Foo } ≤ { a := a₂ }) ↔ (a₁ ≤ a₂) := Iff.rfl

example (l r : Nat): l + r ≤ l := by
  simp only [by
    have h := Foo.le_iff_zero
    simp at h
    exact h]
  -- unsolved goal: ⊢ r = 0

-- display term and type used by `simp only`
example := by
  elab_term by
    have h := Foo.le_iff_zero
    simp at h
    exact h
  exact 0

attribute [-simp] Foo.mk_le_mk
-- A dsimp lemma:
@[simp] theorem Foo.mk_le_mk_eq {a₁ a₂} : ({ a := a₁ : Foo } ≤ { a := a₂ }) = (a₁ ≤ a₂) := rfl

example (l r : Nat): l + r ≤ l := by
  simp only [by
    have h := Foo.le_iff_zero
    simp at h
    exact h]
  -- simp made no progress

-- display term and type used by `simp only`
example := by
  elab_term by
    have h := Foo.le_iff_zero
    simp at h
    exact h
  exact 0

Thomas Murrills (Oct 14 2023 at 23:32):

Mario Carneiro said:

it is still simplifying the expression itself

I'm guessing the thing here is that when simp [bar] is called, it has to infer the type of the term bar. When bar is e.g. by simpa using foo, simp only sees the proof term created by by simpa using foo, and infers the type from there. But you wind up inferring different types depending on whether you instantiate the mvar that by simpa using foo was working on. If you do instantiate that mvar before inferring the type, the exclusion of dsimp lemmas from the proof term means the inferred type isn't dsimp'd. If you don't, you get to use the mvar context, I'm guessing—which is what remembers that simplified expression.

Thomas Murrills (Oct 15 2023 at 03:33):

Hmm, it looks like making some things rfl theorems which weren't before just...causes things to act differently. There are a few odd breakages over at the lean-pr-testing-2679 branch (#7686), including a whnf timeout (maybe it's working on a different expression now?) and an aesop_cat failure. If anyone would like to investigate these and add to the branch, feel free; otherwise I'll get to them soon. :)

Thomas Murrills (Oct 15 2023 at 04:21):

One of them might be indicative of something a little more serious...a simp invocation now fails. The following used to work (Mathlib.Order.Atoms):

theorem covby_iff_atom_Ici (h : a ≤ b) : a ⋖ b ↔ IsAtom (⟨b, h⟩ : Set.Ici a) := by simp

It now (technically) makes progress but doesn't close the goal. Squeezing, we see that now it uses Set.mem_Ici, whereas the goal-closing invocation used to be equivalent to simp only [Set.Ici.isAtom_iff].

I suspect a crucial fact is that dsimp now makes progress on the type of Set.Ici.isAtom_iff (via Set.mem_Ici)—it rewrites something in an instance argument, so pp.explicit is required to see it. But there are still details to work out; I'm not totally sure where in simp things go wrong.

Thomas Murrills (Oct 19 2023 at 21:22):

It seems that the reason this changes is that simp calls dsimp specifically at certain points, e.g. mkCongrFun r (← dsimp arg) (which is the specific difference in this case). I'm not sure what the best way to proceed here is—just fix the broken cases?

It'd be nice to have an actual "solution" (maybe something else needs to be dsimpd as well), but I'd guess it's not worth holding up the dsimp Iff feature.

Notification Bot (Oct 19 2023 at 21:29):

24 messages were moved here from #mathlib4 > Tactic state explosion by Thomas Murrills.

Thomas Murrills (Oct 25 2023 at 07:02):

@Jannis Limperg , I was just wondering—do you happen to have any opinion on Iff.rfl lemmas being used by dsimp? I ask because in the testing this change (#7686), an aesop invocation breaks. I'm not sure how aesop depends on simp, so I thought I'd just ask. :) (aesop may also possibly involved in a separate whnf category theory timeout.)

(There are two major effects of such a change (as you might be aware): (1) the proofs of these lemmas are now not inserted into the constructed proof term (2) invocations specifically of dsimp during certain parts of simp can change how simp behaves.)

Jannis Limperg (Oct 25 2023 at 08:09):

Aesop runs simp_all on every goal before it does anything else (more or less). It treats simp_all as a black box, so changing simp's behaviour shouldn't affect Aesop itself. As such, I don't mind Iff.rfl lemmas being included. Any breakage would most likely be due to interactions with the current CategoryTheory rule set, which can hopefully be fixed.

Thomas Murrills (Oct 26 2023 at 22:44):

Ok, sounds good!

Thomas Murrills (Oct 26 2023 at 22:52):

General update for those interested: this causes >30 breakages in mathlib so far, which I've been fixing in a back-and-forth with CI.

Some of them are because simp is now more powerful (and thus there are no goals to be solved), which is nice, but most are resolved only by turning off dsimp via the config. I'm not sure I love the notion of "try it again with dsimp := false" being a more common strategy (albeit still quite uncommon) for making simp work.

The most common issue seems to be rewriting under a fun in the argument of a Subtype, which occurs because simp specifically runs dsimp before mkCongrFun. This can prevent other simp lemmas from applying. It can also change the lemmas that simps (or at least simps!) generates.

I wonder if we should either

just accept the dsimp := false strategy, at least for now
try out removing dsimp before mkCongrFun in simp (might break things)
consider dsimping something else inside simp (or when we register the simp lemma) so that these simp lemmas still apply (might not actually solve the issue)
something else?

Thomas Murrills (Oct 26 2023 at 22:55):

(I think I might try (2) just out of curiosity, and see what breaks.)

Eric Wieser (Oct 28 2023 at 23:22):

Can you give an example of the type of thing that causes trouble?

Floris van Doorn (Nov 12 2024 at 12:43):

I was surprised to learn today that dsimp still doesn't use lemmas proven by Iff.rfl. It would be nice if it could...

Edward van de Meent (Nov 12 2024 at 12:51):

i imagine this doesn't always work, as it might cause loops?

Floris van Doorn (Nov 12 2024 at 18:13):

This is issue lean4#2678 btw, and @Thomas Murrills has two draft PRs related to this: lean4#3973 & lean4#2679. @Thomas: what is the status of these PRs?

Floris van Doorn (Nov 12 2024 at 18:15):

Edward van de Meent said:

i imagine this doesn't always work, as it might cause loops?

I don't see the issue. Sure, it could cause loops, but we'll be careful with it (and the simpNF lemma should check for this?). And a lemma makes dsimp loop only if it would make simp loop, I think. So I don't think that is a reason not to implement this.
Clarification: of course I only want dsimp to use lemmas proven by Iff.rfl and tagged by @[simp].

Thomas Murrills (Nov 12 2024 at 20:08):

The status is “it worked as intended, but in practice required so many config adjustments to simp calls in mathlib (e.g. dsimp := false) that, at the time, I gave up” :/

I’ll bump everything and take another look. It’s possible that changes to simp in the meantime just so happen to have mitigated things, or that I’d understand the practical issue better now. (Or, at worst, changing the simp config is at least now much nicer syntactically. :) )

Thomas Murrills (Nov 13 2024 at 07:17):

Weirdly, I'm hitting an error only in the Linux build. Might anyone have an idea as to why (1) the Linux job seems to be the only one to build stage 2, and (2) how to practically approach debugging this error? The issue is that since the Linux build failed, no toolchain is created for any platform, so I can't inspect what's wrong locally. Is my only option an entirely local compilation? (Note: I need release-ci because I'm on an Intel-based Mac. I'd like to avoid compiling Lean locally on that mac if possible! :) )

Kyle Miller (Nov 13 2024 at 07:23):

An answer to (1) is that the Linux build is the only one that builds stage 2 :-)

I only do local builds for debugging, so I don't have any tips there. But, you likely only need a stage1 build to test this. You can create a file in the tests directory and try to reproduce the failure in Init/Data/Option/Attach.lean, then hope you can come up with a proof that succeeds both before and after your changes to what can be dsimped.

Thomas Murrills (Nov 13 2024 at 07:27):

Hmm, right—I suppose one temporary change I could make is just copying the term-mode proof that by? shows me currently, which sidesteps simp entirely, and hope that this is enough to get me a functioning toolchain!

Thomas Murrills (Nov 18 2024 at 22:07):

I've been investigating this, and have a table of all the (~70 and counting) errors I've encountered so far (recorded together with their causes, involved expressions, problematic new dsimp lemmas, etc.) which paints a mostly straightforward picture.

simp calls out to dsimp when rewriting under certain binders in certain circumstances, such as certain type (or prop) families. For example, in @Subtype.val α (p : α → Prop) (self : Subtype p) : α with p := fun x => e, simp will use dsimp to rewrite e. The new dsimp lemmas used here cause later lemmas to not be able to be applied; it seems the DiscrTree just no longer matches them.

Often the issue is within a single simp call, but sometimes it causes subsequent issues, like a later rw failing; or some tactic which calls simp fails, such as aesop or nontriviality. Often, but not always, disabling dsimp in the config and/or removing a certain problematic lemma works; however, sometimes these lemmas (or other dsimp lemmas) are necessary in other spots.

It could be that this change to dsimp causes other simp lemmas to no longer be in simp normal form, but I suspect

that sometimes the failure is more complex, meaning that the other lemmas involved in a particular newly-failing simp call are in simp normal form, but the specific expression at hand is more complex and can be dsimped in a way which nonetheless changes its structure
that we would not want to change this normal form. The most frequent issue involves Subtype, and this would mean, for example, changing e.g. { x // x ∈ _ } to the rhs of whatever the relevant mem_* lemma is: for example, with o : Option A, { x // x ∈ o } would have to become { x // some x = o }.

Thomas Murrills (Nov 18 2024 at 22:09):

I can go further into the spectrum of different issues and some ideas for potential fixes, but first I wanted to check on what changes are already in the pipeline! :) For example, the PR description for Leo's lean4#6054 says

motivation: this is the first step to fix the mismatch between isDefEq and the discrimination tree indexing.

Does this mean that DiscrTree's will (somehow) be better at matching up to isDefEq? If so, that could fix this, I think. But I don't know if that really is the goal for these PRs.

Thomas Murrills (Nov 19 2024 at 23:06):

@Kyle Miller do you happen to know the full scope of what's meant by the PR description of lean4#6054 ? (I saw you discuss it briefly in #metaprogramming / tactics > whnf variant which preserves let-bindings)

Kyle Miller (Nov 19 2024 at 23:10):

It's something to do with making sure that when simp's configuration is in sync with Meta's configuration, so for example if zetaDelta is false then let decls aren't reduced when building a discrimination tree from the local context, and I think another example is to make sure that simp's configuration will control how simp reduces terms

Thomas Murrills (Jan 28 2025 at 17:05):

(I won’t be returning for another week, but wanted to outline some possible approaches in response to a GitHub comment!)

Hmm, ok. It seems like it’s not quite so broad as to eliminate this issue, I think.

I think that one way or another, we do not want to simply adapt these simp calls by e.g. adding -dsimp; there are enough to suggest that simp is actually made worse by naive inclusion of these lemmas in dsimp.

Here are four ways I can think of fixing this problem (for various senses of “fixing”).

Make simp not dsimp in the relevant place. This means either not dsimping arguments on which later arguments depend (a property simp already knows about) or something more surgical (e.g. changing dsimps configuration in that call). I’ll probably try this first; it’s easy. But it might cause other problems.
Change a bunch of lemmas to put them in the new simp normal form. I think this is not a great solution: often the new normal forms are worse (see above), and it probably does not account for all errors.
Target specific lemmas or types at the sites of breakages: either make sure problematic lemmas are not dsimp lemmas, or change how Subtype and DFunlike.coe are handled specifically somehow (a judicious no_index?). (This seems ad-hoc and hard to maintain to me.)
Create a mechanism for “remembering” defeq versions of an expression prior to dsimp rewrites to allow more DiscrTree matches. This would be the most complicated but the most general, and might help in other ways—or might be ridiculously costly. Implementation-wise, we could store the past defeq versions of expressions in an extension that DiscrTree operations could know about, or we could include it in the expression itself.

Last updated: May 02 2025 at 03:31 UTC