Zulip Chat Archive

Stream: lean4

Topic: dsimp simplifies proofs, which is slow

Jovan Gerbscheid (Feb 05 2025 at 17:50):

I was investigating some slowness of simp, and I found out that dsimp, unlike simp, simplifies proof terms. simp is very smart about handling proof terms: in some situations it simplifies the type of the proof term, and wraps the proof term in a cast.

On the other hand, dsimp does traverse proofs, which (particularly for Category Theory) causes significant slowdowns. It would be much better if dsimp did either of

leave proofs as-is, without simplification
simplify the type of the proof and wrap the proof in a cast so that it gets the new type.

The performance issue is still present in simp, because simp relies on dsimp for dependent arguments.

Jovan Gerbscheid (Feb 05 2025 at 17:51):

I think that the 1.4% speedup of Mathlib in #21330 can be (maybe partially) explained by this problem.

Eric Wieser (Feb 05 2025 at 18:03):

If we do leave proofs as is, I think there should be a config option that allows requesting proof simplification; as I understand it there are cases where it does matter.

Jovan Gerbscheid (Feb 05 2025 at 18:04):

Then the second option would be the best one I think.

Matthew Ballard (Feb 05 2025 at 18:10):

I'd :+1: if this were an RFC (and someone linked to it)

Jovan Gerbscheid (Feb 05 2025 at 20:35):

lean#6960

Damiano Testa (Feb 05 2025 at 20:36):

lean4#6960

Jovan Gerbscheid (Feb 21 2025 at 13:46):

My result is that with only a few modifications in Mathlib, disabling dsimp in proofs gives a speedup of 0.83% in Mathlib. (https://github.com/leanprover-community/mathlib4/pull/21519#issuecomment-2655534838)

This is great, but there are also some slowdowns. Looking more closely into this, I found that the simp lemma CategoryTheory.LiftRightAdjoint.constructRightAdjointEquiv_symm_apply, which is generated by simps!, has a RHS that contains a big proof term. This proof term would have been smaller if dsimp (which is called by simp!) had simplified it. And rw has the same issue as dsimp of trying to rewrite proof terms, causing the slowdown.

So this slowdown could be solved by either of

Not allowing rw to rewrite proofs. Although rw doesn't instantiate bound variables, we can still check for most proofs using Lean.Meta.isProofQuick.
Expand the function Lean.Elab.abstractNestedProofs so that it also abstracts nested proofs inside the type of a definition, not just inside the value.

Of course these can come with set_options to turn them off in the very rare case that you do want to rewrite a proof.

Jovan Gerbscheid (Feb 21 2025 at 13:51):

Interestingly, docs#RelSeries.ext has a proof that genuinely needs to simplify a proof, because it eliminates some variable x from appearing on the RHS of h : x = _, followed by subst h.

Eric Wieser (Feb 21 2025 at 16:45):

Jovan Gerbscheid said:

Interestingly, docs#RelSeries.ext has a proof that genuinely needs to simplify a proof, because it eliminates some variable x from appearing on the RHS of h : x = _, followed by subst h.

This sounds like simplification of the type of a proof, not of the proof itself. It would be great to distill an example of proof-rewrites-which-matter down to a mathlib-free example, but that's obviously a rather big ask.

Jovan Gerbscheid (Feb 21 2025 at 16:50):

Yes, but the simplification happens inside of a proof term

Kyle Miller (Feb 21 2025 at 16:50):

I'm wondering if generalize_proofs would help here, which is often the answer where there's a dependence in a proof.

Kyle Miller (Feb 21 2025 at 16:53):

Oh, interesting, in RelSeries.ext, the dsimp is causing the type of the proof to change.

Kyle Miller (Feb 21 2025 at 16:55):

dsimp isn't necessary, and neither is a fancy tactic like generalize_proofs:

@[ext (iff := false)]
lemma ext {x y : RelSeries r} (length_eq : x.length = y.length)
    (toFun_eq : x.toFun = y.toFun ∘ Fin.cast (by rw [length_eq])) : x = y := by
  rcases x with ⟨nx, fx⟩
  cases length_eq
  cases toFun_eq
  rfl

Somehow cases is working better than subst here.

Jovan Gerbscheid (Feb 21 2025 at 16:59):

with that, I get the error

dependent elimination failed, failed to solve equation
  (fun x => y.toFun (Fin.cast ⋯ x)) = fx

Kyle Miller (Feb 21 2025 at 17:06):

You're right. There's a bug where that error is now showing up on by and I didn't notice. Sorry.

Kyle Miller (Feb 21 2025 at 17:10):

This one works, and it looks like dsimp isn't simplifying proofs (unverified though).

@[ext (iff := false)]
lemma ext {x y : RelSeries r} (length_eq : x.length = y.length)
    (toFun_eq : x.toFun = y.toFun ∘ Fin.cast (by rw [length_eq])) : x = y := by
  rcases x with ⟨nx, fx⟩
  cases length_eq
  generalize_proofs pf at toFun_eq
  dsimp at pf toFun_eq
  cases toFun_eq
  rfl

Jovan Gerbscheid (Feb 21 2025 at 17:14):

I verified this does indeed work on the testing branch

Kyle Miller (Feb 21 2025 at 17:15):

Regarding options, I have a feeling that we do want a proof option (default false), but in Simp.Config, not a set_option. I think it should be a boolean (rather than something tri-state) and control both dsimp and simp. Right now, simp won't enter proofs at all, but for proofs it could dsimp them if proof is true. There's an application for this I believe too (in a @Joachim Breitner project).

Kyle Miller (Feb 21 2025 at 17:18):

Not allowing rw to rewrite proofs.

I think this shouldn't be problematic — this wouldn't cause "motive not type correct" errors, right?

Similarly, we could allow rewriting in proofs with a proof option in the rewrite config.

(I'm not saying that core Lean is accepting this change, but I'd like to think through the design for a start.)

Kyle Miller (Feb 21 2025 at 17:22):

Expand the function Lean.Elab.abstractNestedProofs so that it also abstracts nested proofs inside the type of a definition, not just inside the value.

I think this makes sense; to make sure I understand, you're saying that when a term's proofs are abstracted, when we make the theorem, we should make sure that the type of that theorem has recursively been abstracted too, right?

Jovan Gerbscheid (Feb 21 2025 at 17:24):

No, what I meant is that for any theorem, Lean.Elab.abstractNestedProofs should be ran on the theorem statement. in #22160, I did this manually for a few simp lemmas, and it gave a 55% speedup in Mathlib.CategoryTheory.Adjunction.Lifting.Right

Kyle Miller (Feb 21 2025 at 17:29):

I'm not following, the bullet point is about definitions, and now you're saying theorems. Are you just saying that it should be run on theorem statements too and that it is not currently being run on theorem statements?

Kyle Miller (Feb 21 2025 at 17:29):

And can you confirm that you're not saying anything about the theorems that abstractNestedProofs itself produces?

Jovan Gerbscheid (Feb 21 2025 at 17:30):

Kyle Miller said:

I'm not following, the bullet point is about definitions, and now you're saying theorems. Are you just saying that it should be run on theorem statements too and that it is not currently being run on theorem statements?

Yes sorry, I meant to say theorem

Jovan Gerbscheid (Feb 21 2025 at 17:31):

In #22160 I apply this manually to some simp lemmas, which speeds up Mathlib.CategoryTheory.Adjunction.Lifting.Right by 55%

Kyle Miller (Feb 21 2025 at 17:34):

I think it makes sense to abstract the proofs in theorem statements from first principles. The performance measurements though are helpful for making it a priority.

Kyle Miller (Feb 21 2025 at 18:25):

@Jovan Gerbscheid I made a draft PR at lean4#7181. Would you be interested in doing the performance measurements to help justify the change?

Independently of this, I'm not sure why Lean.Meta.AbstractNestedProofs.visit doesn't visit the functions in function applications.

Jovan Gerbscheid (Feb 21 2025 at 18:35):

Haha, I made a draft PR (lean4#7180) with the exact same modifications at lean47181, but I'll close mine.

Jovan Gerbscheid (Feb 21 2025 at 18:38):

Kyle Miller said:

Independently of this, I'm not sure why Lean.Meta.AbstractNestedProofs.visit doesn't visit the functions in function applications.

I think that is because f a₁ .. aₙ is a proof if and only if f is a proof. But I guess if f is a projection, then it could have subterms that are proofs, while f isn't a proof.

Jovan Gerbscheid (Feb 21 2025 at 18:42):

Kyle Miller said:

Would you be interested in doing the performance measurements to help justify the change?

Yes I can bench it when the corresponding mathlib branch gets created

Jovan Gerbscheid (Feb 23 2025 at 07:38):

I tried to fix the Mathlib branch for the PR, but I ran into some problems with the function restrictScalars, and its Prop argument IsScalarTower. There were some rewrites that required rewriting inside the proof of IsScalarTower, but that wasn't possible because these proofs were now being abstracted.

I might try to turn IsScalarTower into a Type instead of a Prop.

Ruben Van de Velde (Feb 23 2025 at 08:33):

No, that sounds like a bad idea

Kevin Buzzard (Feb 23 2025 at 09:04):

It's used all over commutative algebra which is essentially all classical and the moment you deprop it there will be diamonds everywhere

Jovan Gerbscheid (Feb 23 2025 at 21:40):

Ok, I managed to get the mathlib branch working, by using the rw! tactic for dependent rewriting, in this case for casting the type of the proof.

However, benchmarking shows a global slowdown instead of a speedup. In particular we see that

process pre-definitions    +26.5%

Somehow this modification has managed to slow down kernel type checking.

Jovan Gerbscheid (Feb 23 2025 at 21:41):

For example, type checking CategoryTheory.Abelian.Ext.covariant_sequence_exact₃' now takes the kernel 10s instead of the previous 0.06s. I'm having a hard time understanding why. I manually abstracted the proof in the type of this lemma, but that didn't make a difference. So it must be that some other theorems type check less easily with abstracted proofs in their type. set_option diagnostics true shows that the kernel is doing some hard work:

  [kernel] unfolded declarations (max: 126497, num: 66): ▼
    [] HAdd.hAdd ↦ 126497
    [] Add.add ↦ 126439
    [] Int.casesOn ↦ 94890
    [] Discrete.as ↦ 94766
    [] Nat.brecOn ↦ 78981
    [] Nat.add ↦ 78979
    [] Int.add.match_1 ↦ 47435
    [] Int.add ↦ 47422
    [] Int.rec ↦ 31705
    [] HomologicalComplex.X ↦ 31586
    [] Prefunctor.obj ↦ 23346
    [] Nat.cast ↦ 15807
    [] NatCast.natCast ↦ 15807
    [] Quotient.as ↦ 9997
    [] DFunLike.coe ↦ 155
    [] Functor.comp ↦ 129
    [] Quiver.Hom ↦ 107
    [] List.rec ↦ 80
    [] Ab ↦ 78
    [] ZeroHom.toFun ↦ 71
    [] AddCommGrp.carrier ↦ 69
    [] Equiv.toFun ↦ 68
    [] AddCommMonoid.toAddMonoid ↦ 65
    [] Function.comp ↦ 62
    [] SubNegMonoid.toAddMonoid ↦ 62
    [] OfNat.ofNat ↦ 58
    [] CategoryStruct.comp ↦ 56
    [] AddMonoidHom.toZeroHom ↦ 54
    [] Function.Injective.addMonoid ↦ 48
    [] List.casesOn ↦ 47
    [] AddMonoidHom.instFunLike ↦ 46
    [] Functor.toPrefunctor ↦ 41
    [] AddMonoid.toZero ↦ 38
    [] HSub.hSub ↦ 37
    [] Nat.casesOn ↦ 37
    [] Sub.sub ↦ 37
    [] AddMonoid.toAddZeroClass ↦ 36
    [] Prefunctor.map ↦ 35
    [] shiftFunctor ↦ 33
    [] ShiftMkCore.F ↦ 33
    [] AddGroup.toSubNegMonoid ↦ 32
    [] Nat.rec ↦ 30
    [] Quot.liftOn ↦ 30
    [] Int.negOfNat.match_1 ↦ 30
    [] AddMonoid.toAddSemigroup ↦ 28
    [] AddSemigroup.toAdd ↦ 28
    [] Classical.indefiniteDescription ↦ 28
    [] Classical.choose ↦ 27
    [] Exists.choose ↦ 27
    [] Option.casesOn ↦ 27
    [] Option.rec ↦ 27
    [] Subtype.val ↦ 27
    [] Functor.objPreimage ↦ 26
    [] Localization.lift ↦ 26
    [] Functor.essImage.witness ↦ 26
    [] outParam ↦ 25
    [] Equiv.instFunLike ↦ 25
    [] ShiftedHom ↦ 24
    [] Quotient.comp ↦ 24
    [] ModuleCat.carrier ↦ 23
    [] AddCommGrp.Hom.hom' ↦ 23
    [] AddCommGroup.toAddGroup ↦ 22
    [] ConcreteCategory.ofHom ↦ 22
    [] Int.neg.match_1 ↦ 22
    [] inferInstance ↦ 21
    [] Decidable.casesOn ↦ 21

Jovan Gerbscheid (Feb 24 2025 at 01:32):

I managed to somewhat minimize the slowness to this:

import Mathlib.Algebra.Homology.DerivedCategory.Ext.ExtClass
import Mathlib.CategoryTheory.Triangulated.Yoneda

universe w' w v u
namespace CategoryTheory.Abelian.Ext
open Opposite DerivedCategory Pretriangulated Pretriangulated.Opposite
variable {C : Type u} [Category.{v} C] [Abelian C] [HasExt.{w} C]
variable {X : C} {S : ShortComplex C} (hS : S.ShortExact)

lemma little_proof {n₀ : ℕ}
    {n₁ : ℕ} (h : n₀ + 1 = n₁) :
    (n₀:ℤ) + 1 = n₁ := by omega

lemma preadditiveCoyoneda_homologySequenceδ_singleTriangle_apply
    [HasDerivedCategory.{w'} C] {X : C} {n₀ : ℕ} (x : Ext X S.X₃ n₀)
    {n₁ : ℕ} (h : n₀ + 1 = n₁) :
    (preadditiveCoyoneda.obj (op ((singleFunctor C 0).obj X))).homologySequenceδ
      hS.singleTriangle n₀ n₁ sorry x.hom =
        (x.comp hS.extClass h).hom := by
  sorry

It takes the kernel about 1.5s to figure out that the type of sorry and the type of the lemma are the same. This is because the sorry gets abstracted. And replacing the sorry in the lemma statement with little_proof h fixes the slowness, because little_proof h will not be abstracted.

Jovan Gerbscheid (Feb 24 2025 at 01:32):

So I think the solution is to do the abstraction of nested proofs before elaborating the value/proof of a def/theorem. I believe it's only for theorems that the type is completely elaborated before the value is elaborated. So at least for theorems, it should be possible to abstract nested proofs before elaborating the proof. (@Kyle Miller)

Floris van Doorn (Feb 24 2025 at 13:26):

@Sven Manthe and I defined a tactic abstract that does the abstraction of proof in the type of a lemma before the proof is elaborated. It is here. Sven is working a lot with partial functions that depend on proof terms, and he got massive slowdowns without abstracting proof terms (cf. lean#5108).
It would be nice to put the abstract tactic into Mathlib (unless Lean will do this automatically by default).

Jovan Gerbscheid (Feb 24 2025 at 13:46):

abstract is just a tactic, i.e. it doesn't change the theorem statement that gets added. What I'm looking for is to abstract nested proofs in the theorem statement. And my hypothesis is that it's more efficient to do this step before rather that after elaborating the proof, so that the statement and proof match up better.

Jovan Gerbscheid (Feb 24 2025 at 13:57):

(The infrastructure for abstracting nested proofs is already present, it's just a matter of applying it in the right place)

Floris van Doorn (Feb 24 2025 at 14:40):

If you call abstract in the statement of a lemma, it will create an auxiliary declaration, and use that auxiliary declaration in the statement of the lemma.

Floris van Doorn (Feb 24 2025 at 14:43):

import Mathlib.Tactic

open Lean Meta Elab Tactic Term

elab "abstract" tacs:ppDedent(tacticSeq) : tactic => do
  if (← getGoals).length = 0 then return
  if (← getGoals).length > 1 then do
    throwError "more than one goal"
  let target ← getMainTarget
  let goal ← getMainGoal
  let newGoal ← mkFreshExprMVar target
  setGoals [newGoal.mvarId!]
  Elab.Tactic.evalTactic tacs
  setGoals [goal]
  goal.assign (← mkAuxTheorem ((← getDeclName?).get! ++ `abstract ++ (← mkFreshId)) target newGoal)

lemma foo (x : Fin 3) : x = ⟨2, by abstract omega⟩ := sorry

#print foo -- theorem foo : ∀ (x : Fin 3), x = ⟨2, foo.abstract._uniq.2632⟩
-- `foo.abstract._uniq.2632` is some auxiliary lemma stating `2 < 3`.

Floris van Doorn (Feb 24 2025 at 14:44):

But perhaps making Lean automatically do this is better. It would certainly have made life easier for Sven.

Jovan Gerbscheid (Feb 24 2025 at 14:52):

Oh, I see now. Yes that is what I'd like to happen automatically. One difference between that and the abstractNestedProofs implementation is that with by abstract, the type of the abstracted proof becomes the expected type of the proof term, whereas using abstractNestedProofs makes it have the inferred type of the proof term.

Jovan Gerbscheid (Feb 24 2025 at 14:55):

I ran into an example once where a proof stopped working because of this difference between the expected type and actual type of a proof (the proof was rfl)

Kevin Buzzard (Feb 24 2025 at 18:53):

I am confused by the example. How am I supposed to see the slowness? The file compiles instantly for me (I'm reviewing an old PR so I'm on v4.16.0-rc2, don't know if this makes a difference)

import Mathlib.Algebra.Homology.DerivedCategory.Ext.ExtClass
import Mathlib.CategoryTheory.Triangulated.Yoneda

universe w' w v u
namespace CategoryTheory.Abelian.Ext
open Opposite DerivedCategory Pretriangulated Pretriangulated.Opposite
variable {C : Type u} [Category.{v} C] [Abelian C] [HasExt.{w} C]
variable {X : C} {S : ShortComplex C} (hS : S.ShortExact)

lemma little_proof {n₀ : ℕ}
    {n₁ : ℕ} (h : n₀ + 1 = n₁) :
    (n₀:ℤ) + 1 = n₁ := by omega

set_option trace.profiler true in -- 0.06 seconds
lemma preadditiveCoyoneda_homologySequenceδ_singleTriangle_apply
    [HasDerivedCategory.{w'} C] {X : C} {n₀ : ℕ} (x : Ext X S.X₃ n₀)
    {n₁ : ℕ} (h : n₀ + 1 = n₁) :
    (preadditiveCoyoneda.obj (op ((singleFunctor C 0).obj X))).homologySequenceδ
      hS.singleTriangle n₀ n₁ sorry x.hom =
        (x.comp hS.extClass h).hom := by
  sorry

end CategoryTheory.Abelian.Ext

Jovan Gerbscheid (Feb 24 2025 at 22:32):

Ah sorry, the slowness is only there on the testing branch of lean4#7181, it's not in normal Lean

Jovan Gerbscheid (Feb 25 2025 at 17:10):

Should abstract be added to mathlib? It would be very handy for #22160, where I did what abstract does, but manually, to get a performance improvement.

Jovan Gerbscheid (Feb 25 2025 at 17:12):

Mathlib.CategoryTheory.Monad.Monadicity              -1.290⬝10⁹     (-5.90%)
Mathlib.CategoryTheory.Adjunction.Lifting.Right     -21.384⬝10⁹     (-55.59%)

Floris van Doorn (Feb 25 2025 at 17:54):

#22296

Jovan Gerbscheid (Feb 26 2025 at 15:48):

Jovan Gerbscheid said:

Ah sorry, the slowness is only there on the testing branch of lean4#7181, it's not in normal Lean

@Kevin Buzzard, actually, the slowness can be replicated in normal Lean. To trigger the slow unification, we can use the change tactic:

import Mathlib.Algebra.Homology.DerivedCategory.Ext.ExtClass
import Mathlib.CategoryTheory.Triangulated.Yoneda

universe w' w v u
namespace CategoryTheory.Abelian.Ext
open Opposite DerivedCategory Pretriangulated Pretriangulated.Opposite

set_option trace.profiler true in
lemma preadditiveCoyoneda_homologySequenceδ_singleTriangle_apply
    {C : Type u} [Category.{v} C] [Abelian C] [HasExt.{w} C]
    {X : C} {S : ShortComplex C} (hS : S.ShortExact)
    [HasDerivedCategory.{w'} C] {X : C} {n₀ : ℕ} (x : Ext X S.X₃ n₀)
    {n₁ : ℕ} (h : n₀ + 1 = n₁) :
  (preadditiveCoyoneda.obj (op ((singleFunctor C 0).obj X))).homologySequenceδ
    hS.singleTriangle n₀ n₁ sorry x.hom =
      (x.comp hS.extClass h).hom
  := by
  change (preadditiveCoyoneda.obj (op ((singleFunctor C 0).obj X))).homologySequenceδ
      hS.singleTriangle n₀ n₁ sorry x.hom =
        (x.comp hS.extClass h).hom
  sorry

Kevin Buzzard (Feb 26 2025 at 16:15):

The reason I asked was just to rule out the possibility that removing universes fixed the problem; this is another time I've seen the kernel take far too long to do something. I didn't expect universe-removal to fix the problem given your description and indeed it doesn't.

Jovan Gerbscheid (Mar 03 2025 at 00:52):

I've written a linter that complains if the RHS of a simp lemma contains a large proof term, which for example finds the lemma that slows down Mathlib.CategoryTheory.Adjunction.Lifting.Right (as seen in #22160). So when #22296 is merged, I'd like to use the abstract tactic to improve some simp lemmas with proof terms on the RHS.

Here's the code

import Mathlib

open Lean Meta
open Batteries Tactic Lint

def getRHS (e : Expr) : Option Expr :=
  if let some (_, _, rhs) := e.eq? then
    some rhs
  else if let some (_, rhs) := e.iff? then
    some rhs
  else
    none

@[env_linter] def simpNF' : Lint.Linter where
  noErrorsFound := "NOTHING"
  errorsFound := "FOUND SOME!"
  test := fun declName => do
    unless ← isSimpTheorem declName do return none
    let e := (← getConstInfo declName).type
    forallTelescope e fun _ e => do
    let some rhs := getRHS e | return none
    try
      _ ← Meta.transform rhs (pre := fun e => do
        if ← isProof e then
          if e.size > 200000 then
            throwError "too big proof term of {← inferType e}"
          return .done e
        return .continue)
      return none
    catch er =>
      return m!"{e} might be a bad lemma; {er.toMessageData}"

#lint only simpNF' in all

Jovan Gerbscheid (Mar 05 2025 at 20:46):

Kyle Miller said:

Independently of this, I'm not sure why Lean.Meta.AbstractNestedProofs.visit doesn't visit the functions in function applications.

I found out that for example commaMapEquivalenceInverse suffers from this, since it has a proof that sits inside a beta-unreduced function application. I'd guess this problem is quite common in functors, since the expected type of map usually contains the obj in beta-unreduced form.

Last updated: May 02 2025 at 03:31 UTC