Zulip Chat Archive

Stream: mathlib4

Topic: simp timeout at `whnf`


thielema (Jan 18 2025 at 10:02):

I have the following simplified code:

import Mathlib
import Lean

lemma injective_unit_pow
  {m : Nat} (is_prime : m.succ.Prime) (p : Units (Fin m.succ))
  : Function.Injective (p ^ ·)
  := by
  have p_order := orderOf_eq_card_powers (x := p)
  simp at p_order

Lean works several seconds on simp, then aborts with this error:

tactic 'simp' failed, nested error:
(deterministic) timeout at `whnf`, maximum number of heartbeats (200000) has been reached
Use `set_option maxHeartbeats <num>` to set the limit.
Additional diagnostic information may be available using the `set_option diagnostics true` command.

Kim Morrison (Jan 18 2025 at 10:46):

Have you tried the suggestion set_option diagnostics true that is shown in the error message yet? What does it say?

Kevin Buzzard (Jan 19 2025 at 15:07):

If I open this code in the web editor there's no problem with it (other than the fact that the lemma is false); there's no timeout. So maybe update your mathlib? You don't need import Lean by the way, you're not doing metaprogramming here.

thielema (Jan 19 2025 at 15:19):

It happened in Lean-4.14. Seems to have gone in 4.15.

Michael Stoll (May 07 2025 at 18:59):

Here is a similar occurrence:

import Mathlib

open Polynomial

lemma foo (f g : [X]) (hf : 0 < f.natDegree) (c c' : ) (k : ) :
    eval 0 ((aeval (c  X)) (c'  f)) = eval 0 (1 - X ^ (k + 1) * (1 + X * g)) := by
  simp
  sorry

times out (and set_option diagnostics true does not seem to give helpful output), but replacing the import by

import Mathlib.Algebra.Polynomial.AlgebraMap
import Mathlib.Data.Complex.Basic

makes it work. Adding import Mathlib.Tactic to that makes it time out again. Let me try to find the culprit...

Michael Stoll (May 07 2025 at 19:38):

Here is the minimized version:

import Mathlib.Algebra.Polynomial.AlgebraMap
import Mathlib.Data.Complex.Basic

-- from Mathlib.Algebra.Polynomial.GroupRingAction:

open Polynomial

variable {M R : Type*} [Monoid M] [Semiring R] [MulSemiringAction M R]

noncomputable instance [MulSemiringAction M R] : MulSemiringAction M R[X] := sorry

@[simp] -- remove the attribute to avoid the time out below
theorem Polynomial.smul_X (m : M) : (m  X : R[X]) = X := sorry

--

lemma foo (c : ) : eval 0 (c  (X : [X])) = 0 := by
  simp
  sorry

So it seems that docs#Polynomial.smul_X is a bad simp lemma?
(Without the simp attribute on that result, simp makes no progress here because docs#Polynomial.smul_eq_C_mul is not a simp lemma.)

Or maybe the problem is that lean goes on a wild goose chase to see whether c • X can come from a MulSemiringAction?

Michael Stoll (May 07 2025 at 19:45):

simp only [smul_X] already times out; there seems to be an infinite loop...

Michael Stoll (May 07 2025 at 19:47):

The typeclass synth trace shows lots of

[Meta.synthInstance] ❌️ MulSemiringAction ℂ ℂ ▶

Michael Stoll (May 07 2025 at 19:50):

The first two expanded:

[Meta.synthInstance] ❌️ MulSemiringAction ℂ ℂ ▼
  [] no instances for MulSemiringAction ℂ ℂ ▼
    [instances] #[]
  [] result <not-available>
[Meta.synthInstance] ❌️ MulSemiringAction ℂ ℂ ▼
  [] result <not-available> (cached)

(the following ones are like the second one).

Michael Stoll (May 07 2025 at 19:52):

set_option trace.Meta.Tactic.simp.all true does not show anything. Is there a way to get trace output for what simp does here?

Michael Stoll (May 07 2025 at 19:54):

(I have to :in_bed: :zzz: now, but maybe somebody can take the investigation up from here...)

Kevin Buzzard (May 07 2025 at 20:33):

MulSemiringAction ℂ ℂ is failing quickly, I don't think that's the issue. I am not often faced with debugging simp issues. The questions at this point are:
1) why does this

import Mathlib.Algebra.Polynomial.GroupRingAction
import Mathlib.Data.Complex.Basic

open Polynomial

lemma foo (c : ) : eval 0 (c  (X : [X])) = 0 := by
  simp only [smul_X] -- why does this time out?

time out? and (2) what debugging options are available for answering this kind of question?

aah, it's not even a simp issue:

lemma foo (c : ) : eval 0 (c  (X : [X])) = 0 := by
  rw [smul_X]

times out.

Oh, answer to (2): even though things are timing out, stuff like

set_option trace.Meta.synthInstance true in
set_option trace.profiler true in

works fine.

Kevin Buzzard (May 07 2025 at 21:37):

So here's an interesting trace:

                  [Meta.isDefEq] [7.259843] 💥️ c  X =?= ?m  X 
                    [] [7.259617] 💥️ instHSMul =?= instHSMul 
                      [delta] [7.259594] 💥️ instHSMul =?= instHSMul 
                        [] [7.259541] 💥️ Algebra.toSMul =?= SMulZeroClass.toSMul 
                          [] [7.259469] 💥️ SMulZeroClass.toSMul =?= SMulZeroClass.toSMul 
                            [delta] [7.259457] 💥️ SMulZeroClass.toSMul =?= SMulZeroClass.toSMul 
                              [] [7.259431] 💥️ smulZeroClass =?= smulZeroClass 
                                [synthInstance] [0.000024] ❌️ MulSemiringAction   
                                [synthInstance] [0.000007] ❌️ MulSemiringAction   
                                [synthInstance] [0.000003] ❌️ MulSemiringAction   
                                [synthInstance] [0.000002] ❌️ MulSemiringAction   
                                [synthInstance] [0.000002] ❌️ MulSemiringAction   
                                [synthInstance] [0.000002] ❌️ MulSemiringAction   
                                [synthInstance] [0.000002] ❌️ MulSemiringAction   
                                [synthInstance] [0.000002] ❌️ MulSemiringAction   
                                [synthInstance] [0.000029] ❌️ MulSemiringAction   
...

0.000002 seconds is tiny, it's not the reason it's slow, but I'm beginning to understand why Lean keeps asking this question hundreds of times. Let's look at the exploding typeclass instance search which is taking 7 seconds to give up. If you switch implicits on with pp.explicit true then there are holes in the right hand side terms in that exploding search. For example
[delta] [7.259594] 💥️ instHSMul =?= instHSMul ▼ expands to

                      [delta] [7.658977] 💥️ @instHSMul Complex (@Polynomial Complex Complex.instSemiring)
                            (@Algebra.toSMul Complex (@Polynomial Complex Complex.instSemiring) Complex.instCommSemiring
                              (@semiring Complex Complex.instSemiring)
                              (@algebraOfAlgebra Complex Complex Complex.instCommSemiring Complex.instSemiring
                                (@Algebra.id Complex
                                  Complex.instCommSemiring))) =?= @instHSMul Complex
                            (@Polynomial Complex Complex.instSemiring)
                            (@SMulZeroClass.toSMul Complex (@Polynomial Complex Complex.instSemiring)
                              (@zero Complex Complex.instSemiring)
                              (@smulZeroClass Complex Complex.instSemiring Complex
                                (@DistribSMul.toSMulZeroClass Complex Complex
                                  (@AddMonoid.toAddZeroClass Complex
                                    (@AddMonoidWithOne.toAddMonoid Complex
                                      (@AddCommMonoidWithOne.toAddMonoidWithOne Complex
                                        (@NonAssocSemiring.toAddCommMonoidWithOne Complex
                                          (@Semiring.toNonAssocSemiring Complex Complex.instSemiring)))))
                                  (@DistribMulAction.toDistribSMul Complex Complex ?m.230
                                    (@AddMonoidWithOne.toAddMonoid Complex
                                      (@AddCommMonoidWithOne.toAddMonoidWithOne Complex
                                        (@NonAssocSemiring.toAddCommMonoidWithOne Complex
                                          (@Semiring.toNonAssocSemiring Complex Complex.instSemiring))))
                                    (@MulSemiringAction.toDistribMulAction Complex Complex ?m.230 Complex.instSemiring
                                      ?m.233)))))

and the right hand side of that term has two metavariables in. Here ?m.230 : Monoid Complex (which is in mathlib) and ?m.233 : @MulSemiringAction Complex Complex ?m.230 Complex.instSemiring a.k.a. MulSemiringAction ℂ ℂ.

But note that there is no instance of MulSemiringAction ℂ ℂ in mathlib, the natural action of the complexes on itself doesn't satisfy the MulSemiringAction axiom.

So in fact what's happening is that with this line [] [7.567531] 💥️ instHSMul =?= instHSMul, Lean is trying to prove that two instances of HSMul ℂ ℂ[X] ℂ[X] are equal. The first is an instance which actually exists: it's coming from Algebra ℂ ℂ[X] which itself comes from algebraOfAlgebra [Algebra R A] : Algebra R A[X].

But the second is an instance which does not exist. It's coming from smulZeroClass [SMulZeroClass S R] → SMulZeroClass S R[X] which reduces us to SMulZeroClass ℂ ℂ, which does exist, but for some reason Lean tries to find it via MulSemiringAction ℂ ℂ which does not exist.

After trying to do this for a bit, Lean suddenly realises it needs this MulSemiringAction ℂ ℂ, fails to find it, and then makes the hideous mistake of deciding that perhaps the best way to work on smulZeroClass =?= smulZeroClass (which is where it had got up to; these are both instances of SMulZeroClass ℂ ℂ[X]) is to unfold it (despite the fact that the RHS has still got this ?m.233 : MulSemiringAction ℂ ℂ term in it). So it's now faced with

                                [] [7.465669] 💥️ { smul := fun r p ↦ { toFinsupp := r • p.toFinsupp },
                                      smul_zero :=
                                        ⋯ } =?= { smul := fun r p ↦ { toFinsupp := r • p.toFinsupp }, smul_zero := ⋯ } ▶

(with ?m.223 now deeply embedded in the RHS) and now the whole thing explodes into finsupps and decidability, all doomed to failure because ?m.233 will never be synthesized. The first bifurcation is

                                          [delta] [4.865229] ❌️ r • p.toFinsupp =?= r • p.toFinsupp ▶
                                          [] [2.695273] 💥️ instHSMul.1 r p.toFinsupp =?= instHSMul.1 r p.toFinsupp ▶

but nothing can ever work because the RHSs all have this unassigned ?m.233 : MulSemiringAction ℂ ℂ. Every now and then Lean remembers it and you get a superfast burst of

                                                      [synthInstance] [0.000004] ❌️ MulSemiringAction   
                                                      [synthInstance] [0.000005] ❌️ MulSemiringAction   
                                                      [synthInstance] [0.000003] ❌️ MulSemiringAction   
...

20 or so times, and then it goes back to unfolding.

Kevin Buzzard (May 07 2025 at 21:43):

set_option maxHeartbeats 800000 in
lemma foo (c : ) : eval 0 (c  (X : [X])) = 0 := by
  rw [smul_X] -- tactic 'rewrite' failed, did not find instance of the pattern in the target expression

Jovan Gerbscheid (May 09 2025 at 10:19):

This is a relatively common problem in unification of type classes in the presence of metavariables. The unification algorithm tries to unify the same expressions in different ways (e.g. unify the arguments directly, or first unfold the head constant and then unify the arguments), but because there is a metavariable, doesn't do enough caching. This causes unification to be exponentially slow. Usually this is not a problem, but only if the hierarchy of definitions becomes too large, the exponential slowness become noticeable.

In this particular case, it can be sped up by 50x with seal Finsupp.mapRange. This constant appears in the instance AddMonoidAlgebra.smulZeroClass

Kevin Buzzard (May 09 2025 at 12:32):

How did you find this fix @Jovan Gerbscheid ? I can confirm that it makes the rewrite fail much more quickly (just to be clear, the rewrite should fail because the lemma is not applicable, the conclusion of the lemma is c • X = X because the hypotheses of the lemma imply c • 1 = 1, whereas in this situation c • X = c * X; the correct lemma to be rewriting with is smul_eq_C_mul.)

Jovan Gerbscheid (May 09 2025 at 12:49):

To see what the unification does, use set_option trace.Meta.isDefEq true.

To also see the time of each node, use set_option trace.profiler true and set_option trace.profiler.threshhold 10000000. This turns on the profiler without actually adding more nodes to the trace.

Then you can keep clicking on the nodes that take the most time, and observe the exponential blowup, and at some point we get to the constant Finsupp.mapRange.

Matthew Ballard (May 09 2025 at 13:08):

smul_X (R := ℂ[X]) D'oh, I had R and M switched and this doesn't help
simp only [smul_X (R := ℂ) (M := ℂ)] should be right

Matthew Ballard (May 09 2025 at 13:09):

Elaboration tends to have a lot of trouble with polynomials

Kevin Buzzard (May 09 2025 at 14:04):

It shouldn't be right because the lemma doesn't apply

Matthew Ballard (May 09 2025 at 14:06):

Right as in the type of the expression is correct, not as in it applies

Matthew Ballard (May 09 2025 at 14:07):

Lean can't figure out which ?R we want when invoking (X : ?R[X]). Ideally it could.

Jovan Gerbscheid (May 09 2025 at 14:24):

No, the problem is the isntance that it can't find (and doesn't exist)

Matthew Ballard (May 09 2025 at 14:25):

That is where the metavariables come from which leads to the crazy searches

Jovan Gerbscheid (May 09 2025 at 14:27):

I wouldn't call it a search (like type class search). Instead it is exponentially slow unification.

Matthew Ballard (May 09 2025 at 14:28):

Fair, I was thinking of searching for the value to instantiate the metavariable

Matthew Ballard (May 09 2025 at 14:31):

import Mathlib.Algebra.Polynomial.GroupRingAction
import Mathlib.Data.Complex.Basic

open Polynomial

lemma foo (c : ) : eval 0 (c  (X : [X])) = 0 := by
  -- simp only [smul_X c] -- why does this time out?
  simp only [smul_X (R := ) (M := )] -- this doesn't time out and fails quickly

Generally if you throw very unconstrained metavariables into the machine, then you are going to have a bad time

Matthew Ballard (May 09 2025 at 14:33):

Another example is the declaration foo itself where you need to provide an explicit type ascription

Matthew Ballard (May 09 2025 at 14:36):

Also, I would not have thought that docs#Polynomial.smul_X had the type it does if I was just guessing from the name

Kevin Buzzard (May 09 2025 at 14:45):

Of course the problem is that smul_X is a simp lemma so typically we don't have the option to supply the type hints.

Matthew Ballard (May 09 2025 at 14:51):

There are unif_hint's

Matthew Ballard (May 09 2025 at 14:52):

No idea about their utility here at the moment

Michael Stoll (May 09 2025 at 15:32):

Thanks everybody for investigating!

I don't think having to (remember to) write seal Finsupp.mapRange before a simp call is a viable option. So I'm wondering if there is a way to avoid the problem (without doing much harm elsewhere).

Jovan Gerbscheid (May 09 2025 at 18:01):

The easiest is to disable the simp lemma with simp [-smul_X]

Michael Stoll (May 09 2025 at 18:18):

But this only helps if you realize that in the specific situation this specific lemma might cause a problem. (In particular you'd need to remember its name.)

Jovan Gerbscheid (May 09 2025 at 18:22):

Yeah, I agree it's not a great solution, but besides unsimping the lemma, there is no obvious way to fix the situation. And you'll notice something is wrong if simp takes 40 seconds ;).

Eric Wieser (May 15 2025 at 21:37):

Can we speed things up by extracting the support of docs#Finsupp.mapRange into a sealed definition?

Eric Wieser (May 15 2025 at 21:39):

And in fact, probably the support of all Finsupp arithmetic

Eric Wieser (May 15 2025 at 21:39):

The defeq of the function is important, but the defeq of the support is classical and irrelevant

Eric Wieser (May 15 2025 at 22:01):

Attempted in #24944

Eric Wieser (May 15 2025 at 23:43):

Mathlib.Algebra.Category.ModuleCat.Presheaf.Free becomes 4x faster!

Eric Wieser (May 16 2025 at 00:00):

The example above takes 48615 heartbeats on that PR vs 590152 at its diffbase.

Still not fast, but much better. Only 17743 with seal Finsupp.mapRange, but that isn't safe to apply globally.

Jovan Gerbscheid (May 16 2025 at 09:44):

Are there more functions that could use a similar refactor with irreducible_def?

Kevin Buzzard (May 16 2025 at 09:45):

I think tensor products? But the refactor will probably be tricky. And I don't really know what I'm talking about, just that you never want to unfold the definition of a tensor product.

Michael Stoll (May 16 2025 at 09:48):

There is #24752 that looks at docs#Real.sqrt . (Prompted by its use in the definition of the norm on complex numbers.)

Eric Wieser (May 16 2025 at 10:09):

Kevin Buzzard said:

I think tensor products? But the refactor will probably be tricky. And I don't really know what I'm talking about, just that you never want to unfold the definition of a tensor product.

But I really like having the defeq lift f (mk x y) = f x y :(. I don't think we can keep that if we make other things irreducible, though I'd be happy to be proven wrong.

Matthew Ballard (May 16 2025 at 12:13):

I took a run at making docs#TensorProduct irreducible and then at using at irreducible_def somewhere I think... The former helped some. I suspect the later would help more but my main conclusion is that more tooling is needed around irreducible_def before we deploy it in more sustained fashion.

Matthew Ballard (May 23 2025 at 15:02):

I took a look at #24944 and I think the original sin for the performance regression is docs@ModuleCat.free specifically free R-module with generators x : X, implemented as the type X →₀ R

Matthew Ballard (May 23 2025 at 15:15):

  • This seems common elsewhere too when one needs an implementation of a free module. Should there be some abstraction boundary?
  • I would like more transparency into the kernel. What is the SOTA? Can I run lean4lean as a proxy?
  • Even with the above, I am ok merging #24944 and will do so unless waived off.

Last updated: Dec 20 2025 at 21:32 UTC