Zulip Chat Archive
Stream: mathlib4
Topic: JacobiZariski is slow.
Kevin Buzzard (Jan 27 2025 at 22:53):
Matt Ballard pointed out to me the following stunning stat (the screenshot below is from the speedcenter):
Screenshot from 2025-01-27 22-25-41.png
This is all benchmarks sorted by "value" (which I think is "number of instructions"?). It turns out that Mathlib.RingTheory.Kaehler.JacobiZariski's value is about twice as large as that of the next file in mathlib. Christian just opened #21129, which looks to me like a huge hack, but even with the resulting massive speedup the file is still the most instruction-heavy file in mathlib, and it's under 500 lines.
I fired it up and watched the orange bars; there was a distinct pause at docs#Algebra.Generators.CotangentSpace.compEquiv on my fast machine. Putting set_option trace.profiler true on line 156, just before that declaration, gives [Elab.command] [2.141008]..., but when I unfold that trace node, the majority of the time is
[def.processPreDef] [1.543822] process pre-definitions ▼
  [Kernel] [1.516111] typechecking declaration
with no further unfolding available. My understanding of the kernel is primitive. Are we able to conjecture exactly what is making the kernel take 1.5 seconds here? The full declaration is
noncomputable
def CotangentSpace.compEquiv (Q : Generators.{w} S T) (P : Generators.{w'} R S) :
    (Q.comp P).toExtension.CotangentSpace ≃ₗ[T]
      Q.toExtension.CotangentSpace × (T ⊗[S] P.toExtension.CotangentSpace) :=
  (Q.comp P).cotangentSpaceBasis.repr.trans
    (Q.cotangentSpaceBasis.prod (P.cotangentSpaceBasis.baseChange T)).repr.symm
I noticed that removing the type and just writing
noncomputable
def CotangentSpace.compEquiv (Q : Generators.{w} S T) (P : Generators.{w'} R S) :=
  (Q.comp P).cotangentSpaceBasis.repr.trans
    (Q.cotangentSpaceBasis.prod (P.cotangentSpaceBasis.baseChange T)).repr.symm
made the time drop to 0.5 seconds, and the file still compiles, with
[Kernel] [0.036026] typechecking declaration
so somehow I've removed the problem (but I don't know what I'm doing). There seem to be many other problems with this file, but I thought I'd start by flagging that one because it's something I don't understand. In fact the file seems to be full of slow "typechecking declaration"s, e.g. CotangentSpace.fst_compEquiv has
[Kernel] [3.417108] typechecking declaration
I noticed that you can tell when this is happening because the orange bars jump up: Lean compiles the proof, and then when the kernel is typechecking the declaration the bars jump back up to the statement.
There is also a very slow-looking rfl on line 326.
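For anyone wanting to reproduce this kind of measurement, here is a sketch of the profiling recipe in the file's context (compEquiv' is a hypothetical copy to avoid clashing with the real declaration; trace.profiler.threshold, in milliseconds, is optional and just hides fast trace nodes):
-- profiling sketch: wrap the suspect declaration in profiler options
set_option trace.profiler true in
set_option trace.profiler.threshold 100 in
noncomputable
def CotangentSpace.compEquiv' (Q : Generators.{w} S T) (P : Generators.{w'} R S) :=
  (Q.comp P).cotangentSpaceBasis.repr.trans
    (Q.cotangentSpaceBasis.prod (P.cotangentSpaceBasis.baseChange T)).repr.symm
-- then look in the trace for nodes like
--   [def.processPreDef] [...] process pre-definitions ▼
--     [Kernel] [...] typechecking declaration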
Kevin Buzzard (Jan 27 2025 at 23:10):
The output of #check CotangentSpace.compEquiv with and without the def's type being given explicitly is different with pp.all on (< is with, > is without):
19c19
< @LinearEquiv.{uT, uT, max (max (max u uT) w) w', max (max (max (max u v) w') uT) (max uT v) w} T T
---
> @LinearEquiv.{uT, uT, max (max (max u uT) w) w', max (max (max (max u uT) v) w) w'} T T
28c28
< (@Algebra.Extension.CotangentSpace.{max (max u w) w', u, uT} R T inst✝ inst✝² inst✝⁵
---
> (@Algebra.Extension.CotangentSpace.{max u w w', u, uT} R T inst✝ inst✝² inst✝⁵
31c31
< (Prod.{max (max uT v) w, max (max (max u v) w') uT}
---
> (Prod.{max (max uT v) w, max (max (max u uT) v) w'}
...
There are loads of universe differences like this, and what look like different solutions to typeclass inference in other places. This could explain why the hack in #21129 has an effect.
Andrew Yang (Jan 28 2025 at 01:34):
cf #PR reviews > #21099 use unification hints for generators
Andrew Yang (Jan 28 2025 at 01:56):
So my conjecture is:
1. Lean does some universe normalization when assigning universes.
2. This causes universe discrepancies when some universe is normalized but we are trying to unify it with another term which produces a non-normalized universe. E.g. in the example above, (Q.comp P).toExtension has type Extension.{max u w w', u, uT} R T, but in (Q.comp P).toExtension.CotangentSpace, the inferred universe is now CotangentSpace.{max (max u w) w'}. We then try to unify this with (Q.comp P).cotangentSpaceBasis, which gives a basis for CotangentSpace.{max u w w'}. If one replaces (Q.comp P).toExtension.CotangentSpace with Extension.CotangentSpace.{max u w w'} (Q.comp P).toExtension, then the issue disappears.
3. When the kernel notices that the two universes don't match, it unfolds both sides before trying to unify them. I have no evidence at hand because I don't know how to get traces for the kernel, but I recall something similar happening in IsDefEq checks.
4. The normalization in step 1 sorts the universes alphabetically, which is why Christian's hack in #21129 changing the names of universe variables has an impact.
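To see concretely what the two max shapes look like, here is a minimal toy example (not from the file): a left-nested product carries the left-associated max, while a right-nested one carries the right-associated max.
universe u v w

set_option pp.universes true

#check fun (α : Type u) (β : Type v) (γ : Type w) ↦ (α × β) × γ
-- ... : Type u → Type v → Type w → Type (max (max u v) w)
#check fun (α : Type u) (β : Type v) (γ : Type w) ↦ α × β × γ
-- ... : Type u → Type v → Type w → Type (max u (max v w))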
Kevin Buzzard (Jan 28 2025 at 10:21):
Bleurgh sorry for starting a thread on this, I had thought the other one was a discussion about unification hints. I claim my title is better :-)
Christian Merten (Jan 28 2025 at 10:23):
Kevin Buzzard said:
Bleurgh sorry for starting a thread on this, I had thought the other one was a discussion about unification hints. I claim my title is better :-)
Thanks for starting it, the other started as a discussion about unification hints, but went on the performance tangent. I am still interested in the original design question for the unification hints.
Kevin Buzzard (Jan 28 2025 at 10:52):
Oh wow, the only file which imports JacobiZariski is RingTheory.Etale.Kaehler, which is universe-monomorphic without any explanation. @Andrew Yang do you know why this is? I've tried making JacobiZariski universe-monomorphic in #21165 just as an experiment to see what happens (I don't know of a better way of counting instructions than benchmarking).
Sebastian Ullrich (Jan 28 2025 at 10:55):
On Linux, lake env perf stat lean <file> should give you similar numbers independent of the hardware
Christian Merten (Jan 28 2025 at 11:04):
Kevin Buzzard said:
Oh wow, the only file which imports JacobiZariski is RingTheory.Etale.Kaehler, which is universe-monomorphic without any explanation. Andrew Yang do you know why this is? I've tried making JacobiZariski universe-monomorphic in #21165 just as an experiment to see what happens (I don't know of a better way of counting instructions than benchmarking).
This is because docs#Algebra.Smooth (and hence docs#Algebra.Etale) are only defined universe-monomorphically, which might change soonish though.
Kevin Buzzard (Jan 28 2025 at 11:08):
Looking forward to the resulting super-slowdowns ;-)
Kevin Buzzard (Jan 28 2025 at 11:14):
Taking a step back, I know the prevailing philosophy here is "we have universes, therefore of course we should be universe-polymorphic", and I could even imagine a justification of the form "an initial definition of etale cohomology involves taking a limit over a class of schemes and thus will be a group in Type 1, so it's important that we are polymorphic", but honestly, can you really envisage some kind of application of the specific results in this file where we really do have rings or presentations in universes other than Type, or at worst Type u? If it comes down to a choice between "this file can be made much faster by making things universe-monomorphic" or "this file can be made much faster by some PR to core making Algebra.Extension.CotangentSpace.{max (max u w) w', u, uT} unify with Algebra.Extension.CotangentSpace.{max u w w', u, uT} without unfolding all fields", then it might be a darn sight easier to roll back on our universe conventions. Our current super-universe-polymorphic design decisions at this level might be making our lives much worse with no actual applications ever.
Christian Merten (Jan 28 2025 at 11:22):
I am fine with restricting to R S T : Type u, because we plug this into CommRingCat.{u} and then into Scheme.{u} anyway, so it does not matter. But we should be more careful when restricting the universes of the vars field in Generators.{w, u, u}, because there both w = 0 and w = u appear in applications.
Kevin Buzzard (Jan 28 2025 at 11:26):
Hmm. Mario warned me years ago about structures such as Generators, which have a free universe variable in a field. I don't see why you need w = 0; you can just ULift. I agree it's a bit crappy though. As someone who believes in the set-theoretic foundations of mathematics, I don't see why we can't just take w = u = 0, because none of those higher universes actually exist anyway ;-)
Kevin Buzzard (Jan 28 2025 at 11:43):
Boom: universe monomorphism hack beats alphabetical order hack hands down:
https://github.com/leanprover-community/mathlib4/pull/21165#issuecomment-2618740450
+ ~Mathlib.RingTheory.Kaehler.JacobiZariski instructions -60.7%
Looks like you need to work more on your alphabet :-)
Damiano Testa (Jan 28 2025 at 11:54):
The oleans are also about 20% smaller with your version, Kevin.
Ruben Van de Velde (Jan 28 2025 at 12:01):
We can monomorphise now and still consider going back once Lean is smarter about it
Kevin Buzzard (Jan 28 2025 at 12:18):
I closed the PR because I want to discourage this solution in favour of actually thinking harder about the problem now.
Matthew Ballard (Jan 28 2025 at 13:15):
Kevin Buzzard said:
Hmm. Mario warned me years ago about structures such as Generators, which have a free universe variable in a field.
This was also my concern initially.
Christian Merten (Jan 28 2025 at 13:24):
One major point of the whole API around Generators, Presentation, Extension etc. is to have one type for all presentations of a given algebra. But we could still achieve this with a sigma type on top of a more unbundled Generators (etc.) type. For example:
import Mathlib

universe u v w

structure Generators' (R : Type u) (S : Type v) [CommRing R] [CommRing S] [Algebra R S]
    (ι : Type w) where
  val : ι → S
  σ : S → MvPolynomial ι R
  aeval_val_σ (s : S) : MvPolynomial.aeval val (σ s) = s

abbrev Generators (R : Type u) (S : Type v) [CommRing R] [CommRing S] [Algebra R S] :=
  Σ (ι : Type w), Generators' R S ι
Then possibly a lot of the theory can be developed using only Generators', and if we need the uniform type we can use the sigma type.
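As a sanity check of the shape of this API, here is a rough, hypothetical usage sketch: the tautological presentation of S over itself (with ι := S; compare docs#Generators.self mentioned below), packaged into the sigma type.
-- hypothetical usage of the proposal above
noncomputable example (S : Type u) [CommRing S] : Generators S S :=
  ⟨S, { val := id,
        σ := fun s ↦ MvPolynomial.X s,
        -- `aeval id (X s) = id s = s`, by `aeval_X` and `id_eq`
        aeval_val_σ := by simp }⟩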
Matthew Ballard (Jan 28 2025 at 13:29):
I also think this will suffer from the same issues. I have to teach from dawn to dusk today so won't be able to contribute more usefully until tomorrow morning at the earliest.
Matthew Ballard (Jan 28 2025 at 13:32):
There are two precise Lean issues so far:
- does normalization of levels not fully associate max in some situations? (Unknown)
- does the mismatch between the normal forms in the elaborator and the kernel materially cause problems? I think this file provides a definite yes to that.
Kevin Buzzard (Jan 28 2025 at 14:31):
Andrew Yang said:
...
If one replaces (Q.comp P).toExtension.CotangentSpace with Extension.CotangentSpace.{max u w w'} (Q.comp P).toExtension then the issue disappears.
Wooah. Indeed, just after CotangentSpace.compEquiv, dump this in the file:
section experiment

variable (Q : Generators.{w} S T) (P : Generators.{w'} R S)

set_option trace.profiler true in
example :
    Extension.CotangentSpace.{max u w w'} (Q.comp P).toExtension =
      Extension.CotangentSpace.{max (max u w) w'} (Q.comp P).toExtension := rfl
-- [Kernel] [2.126365] typechecking declaration

end experiment
Kevin Buzzard (Jan 28 2025 at 15:07):
Christian Merten said:
I am fine with restricting to R S T : Type u, because we plug this into CommRingCat.{u} and then into Scheme.{u} anyway, so it does not matter. But we should be more careful when restricting the universes of the vars field in Generators.{w, u, u}, because there both w = 0 and w = u appear in applications.
Given that #21165 compiled, I can only assume you mean "...in applications which have not yet appeared in mathlib"? But there are definitely such applications? (I can quite believe that there are, for example "a relation for every term in R" or "a relation for every term in Fin n"; I'm just saying that we don't seem to have seen them yet?)
Christian Merten (Jan 28 2025 at 15:22):
We have uses of Generators.{u, u, u} and uses of Generators.{0, u, u} in mathlib, the first by docs#Generators.self (this is your R example with R replaced by S) and the second by docs#AlgebraicGeometry.IsSmooth (this is your Fin n example). The first one is the more important one, since it is used to define docs#Algebra.H1Cotangent, which is then used in RingTheory.Etale.Kaehler.
I believe that everything we need goes through if we restrict to w = u = v in the JacobiZariski file, since docs#Algebra.H1Cotangent is independent of the presentation, so you can always choose w = u. And we are restricting to u = v in the AG applications anyway.
Mathematically, there is of course no obstruction, because where we want to use w = 0 is for docs#Algebra.IsStandardSmooth (where the universes should just be changed to .{0, 0, u, v}; I have been wanting to PR a fix for a while), but there the index type is finite so the universe does not matter.
Kevin Buzzard (Jan 28 2025 at 15:23):
Aah I see: so in fact what will have happened in my branch is simply that u was set to 0 in those examples.
Matthew Ballard (Jan 28 2025 at 15:54):
Drive-by thought between other things: it should be simple to write an elaborator that just (re-)normalizes the universe levels of constants declared in a file. Putting this at the top of problematic files would put all the universe levels in the same normalized form, and you can test the hypothesis that max association is itself the cause.
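To make the suggestion concrete, here is a minimal sketch (not Eric Wieser's actual code, which is linked below) of a term elaborator that re-normalizes every universe level in the elaborated term, assuming Level.normalize produces the desired normal form; clean_universes% is a hypothetical name.
import Lean

open Lean Elab Term

/-- Replace every universe level occurring in `e` by its normal form. -/
partial def normalizeLevels (e : Expr) : Expr :=
  match e with
  | .sort u           => .sort u.normalize
  | .const n us       => .const n (us.map Level.normalize)
  | .app f a          => .app (normalizeLevels f) (normalizeLevels a)
  | .lam n t b bi     => .lam n (normalizeLevels t) (normalizeLevels b) bi
  | .forallE n t b bi => .forallE n (normalizeLevels t) (normalizeLevels b) bi
  | .letE n t v b nd  => .letE n (normalizeLevels t) (normalizeLevels v) (normalizeLevels b) nd
  | .mdata m b        => .mdata m (normalizeLevels b)
  | .proj s i b       => .proj s i (normalizeLevels b)
  | _                 => e

/-- Elaborate `t`, then normalize all universe levels in the result. -/
elab "clean_universes% " t:term : term => do
  let e ← elabTerm t none
  let e ← instantiateMVars e
  return normalizeLevels e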
Kevin Buzzard (Jan 28 2025 at 15:59):
Did you see CotangentSpace.compEquiv? Cf. Andrew's point 2 here.
Andrew Yang (Jan 28 2025 at 16:06):
For the record, I am currently trying to unbundle P from Extension, but I am failing miserably and I don't see a good way out.
import Mathlib.LinearAlgebra.TensorProduct.RightExactness
import Mathlib.RingTheory.Ideal.Cotangent

universe w u v

open TensorProduct MvPolynomial

variable (R : Type u) (S : Type v) [CommRing R] [CommRing S] [Algebra R S]
variable (P : Type w) [CommRing P] [Algebra R P]

structure Algebra.Extension where
  [algebra : Algebra P S]
  [isScalarTower : IsScalarTower R P S]
  /-- A chosen (set-theoretic) section of an extension. -/
  σ : S → P
  algebraMap_σ : ∀ x, algebraMap P S (σ x) = x

namespace Algebra.Extension

variable {R S}
variable (Pe : Extension R S P)

/-- The underlying algebra of an extension. -/
abbrev Ring : Extension R S P → Type w := fun _ ↦ P

attribute [simp] algebraMap_σ

instance : Algebra Pe.Ring S := Pe.algebra
instance : IsScalarTower R Pe.Ring S := Pe.isScalarTower

noncomputable instance {R₀} [CommRing R₀] [Algebra R₀ R] [Algebra R₀ S] [IsScalarTower R₀ R S] :
    Algebra R₀ Pe.Ring := Algebra.compHom Pe.Ring (algebraMap R₀ R)

#synth Algebra Pe.Ring S -- understandably fails
Christian Merten (Jan 28 2025 at 16:09):
Why do we need Extension.Ring if P is unbundled?
Damiano Testa (Jan 28 2025 at 16:10):
With respect to max associativity, I think that this discussion also happened here.
Andrew Yang (Jan 28 2025 at 16:10):
Because we need the instance synthesizer to find Algebra P S.
.
Christian Merten (Jan 28 2025 at 16:11):
Andrew Yang said:
Because we need the instance synthesizer to find Algebra P S.
At this point you might as well unbundle [Algebra P S]
Andrew Yang (Jan 28 2025 at 16:12):
This is also very painful, because in practice these are some hand-constructed thing and we don't have these instances floating around (e.g. docs#Algebra.Extension.localization).
Matthew Ballard (Jan 28 2025 at 16:15):
@Damiano Testa excellent! Ask and you shall receive. Thanks to @Eric Wieser :) (Caveat: I didn’t actually read the code.)
Andrew Yang (Jan 28 2025 at 17:54):
Matthew Ballard said:
Drive-by thought between other things: it should be simple to write an elaborator that just (re-)normalizes the universe levels of constants declared in a file. Putting this at the top of problematic files would put all the universe levels in the same normalized form, and you can test the hypothesis that max association is itself the cause.
I'm not sure if this is what you are thinking of, but I am using this elab function that I modified from Eric's code. Adding clean_universe_reverse% to the type signature and the body of defs works until δAux_toAlgHom, which wants clean_universe%; after that, these elab functions have little effect and are slower compared to master (where clean_universe% wasn't added to any of the defs above).
Matthew Ballard (Jan 28 2025 at 19:40):
Between classes so I can't really read or think. Does "works" mean it gets faster? Since the elaborator and the kernel have different association conventions for the normal form, and the declaration you reference is the worst for the kernel in the file but is fine for the elaborator, I wouldn't be surprised by a phase transition in performance.
Can you give more details on your investigations?
Matthew Ballard (Jan 29 2025 at 14:02):
I plan on coming back to this but I need to figure out grading first. At worst, I will show up to office hours later today.
Andrew Yang (Jan 29 2025 at 15:57):
By "works" I meant that it does not get stuck in kernel. And I do not observe a huge increase in the other parts of elaboration.
e.g. in CotangentSpace.compEquiv_symm_inr
it was 1.5s on elaboration of type signature, 1.4s on elaboration on proof body and 7.4s of kernel typechecking, and it became 1.4s, 1.4s, 0.3s after adding the elaboration command.
Or in δAux_toAlgHom
it is 0.6s, 4.9s, 9.6s in the start, 0.6s, 5.0s, 11.6s with clean_universe_reverse%
and 0.6s, 5.6s, 1.0s with clean_universe%
.
Matthew Ballard (Jan 29 2025 at 21:02):
universe u v w

structure Foo where
  type : Type u

variable (α : Type u) (β : Type v) (γ : Type w)

def foo : Foo where
  type := (α ⊕ β) ⊕ γ

set_option pp.universes true in
#check foo -- foo.{u, v, w} (α : Type u) (β : Type v) (γ : Type w) : Foo.{max w v u}

def bar : Foo where
  type := α ⊕ β ⊕ γ

set_option pp.universes true in
#check bar -- bar.{u, v, w} (α : Type u) (β : Type v) (γ : Type w) : Foo.{max (max w v) u}
Should elaboration normalize the universes of the expected types of structure instances?
Johan Commelin (Jan 30 2025 at 14:11):
@Sebastian Ullrich I'm hoping you can help here.
Matthew Ballard (Jan 30 2025 at 14:15):
My comment is slightly obtuse:
- should we normalize universes as the final step in elaboration?
- should the universe normal forms for the elaborator and kernel match?
My opinion:
- Need to try it out to see if the penalty you pay each time you elaborate is worth the savings that unification gets.
- Yes. I think this is uncontroversial. @Eric Wieser already gave one suggestion in the ordinal thread.
Matthew Ballard (Jan 30 2025 at 14:19):
I could try out changes like this, but today is another teaching-from-dawn-to-dusk day.
Andrew Yang (Feb 05 2025 at 00:28):
Let me point out another behaviour that looks strange to me:
universe u v

def Foo (A : Type v) (B : Type (max u v)) : Type (max u v) := B
def Bar (A : Type u) : Type u := A

set_option pp.universes true

variable (A : Type v) (B : Type (max u v))

#check Foo A B -- Type (max u v)
#check Bar (Foo A B) -- Type (max v u)
I thought the universes are sorted alphabetically, but then why is this sorted the other way around? I ran into this while trying to speed up AffineSpace.lean.
If one switches u and v in the snippet, then u and v will switch in both of the outputs too. So Lean genuinely switches the universes instead of sorting them.
Jovan Gerbscheid (Feb 05 2025 at 03:06):
If you use
set_option trace.Elab.step true
set_option trace.Meta.isDefEq true
set_option trace.Meta.isLevelDefEq true
you can see where the two universe levels are swapped. Bar gets elaborated as Bar.{?u.34} and Foo gets elaborated as Foo.{?u.36, ?u.35}. Then the types are unified, so Type ?u.34 =?= Type (max ?u.36 ?u.35), which leads to the assignment ?u.34 := max ?u.35 ?u.36. The variables get swapped because it sorts ?u.35 and ?u.36.
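A self-contained version of that experiment (a sketch; the metavariable numbers vary from run to run):
universe u v

def Foo (A : Type v) (B : Type (max u v)) : Type (max u v) := B
def Bar (A : Type u) : Type u := A

set_option trace.Meta.isLevelDefEq true in
example (A : Type v) (B : Type (max u v)) := Bar (Foo A B)
-- the trace shows `Type ?u.34 =?= Type (max ?u.36 ?u.35)` (numbers vary) being
-- solved by the sorted assignment `?u.34 := max ?u.35 ?u.36`, i.e. the swap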
Matthew Ballard (Feb 05 2025 at 19:52):
So I got the total typechecking time down to 2.43s on my machine for the whole file by
1. associating max to the right for levels during normalization
2. normalizing levels consistently during elaboration
3. associating max to the right on docs#Algebra.Generators
4. replacing u with z for the names of declared levels for the "deeper" levels in Generators
Matthew Ballard (Feb 05 2025 at 19:54):
Currently master is at about 30s
Matthew Ballard (Feb 05 2025 at 19:56):
I should clean up 1 and 2 and do a broader test of the effect on mathlib.
Matthew Ballard (Feb 05 2025 at 19:57):
1+2 alone don't make a huge difference directly until you put in 4. (I am not sure how much 3 really matters because my process was not very scientific.)
Matthew Ballard (Feb 05 2025 at 19:59):
4 is still deeply unsatisfying as a solution
Christian Merten (Feb 05 2025 at 20:46):
And 4 alone is significantly worse than 1+2+4?
Matthew Ballard (Feb 05 2025 at 20:51):
Still 15-16s
Matthew Ballard (Feb 12 2025 at 21:56):
Mild update:
- I reversed the association of max's in the elaborator and benchmarked the effects in #21684
- normalized levels after instantiating metavariables: #21714
- then did both: #21722
Observations: this file is really an outlier.
Kevin Buzzard (Feb 12 2025 at 22:03):
So for two of your experiments, you changed some general set-up overall and the only file which had a significant change was JacobiZariski.lean?
Matthew Ballard (Feb 13 2025 at 14:25):
In two of them the only thing that really affected anything was the same change: use max u (max v w) instead of max (max u v) w during normalization in the elaborator. And the only thing it really affected was this file. Overall the impact was slightly negative.
This tells me something about the design of this file is suboptimal compared to the rest of mathlib. Whenever I can breathe for a bit, I am going to poke around the kernel, but I think the intuition that transparency is set too low, resulting in default unfolding of large amounts of stuff, is probably going to be the end conclusion regardless.
Christian Merten (Feb 13 2025 at 14:50):
But making CotangentSpace a def did not make the file faster at all, see . Do you think that combining your kernel changes with making it a def is significantly better than your kernel changes with the current abbrev?
Making CotangentSpace a def just means either duplicating a big chunk of the tensor product and Kaehler differential API specialized to CotangentSpace, immediately unfolding the definition at the beginning of most proofs, or introducing many erws. I think none of these is desirable.
I totally understand why making definitions like TensorProduct as irreducible as possible is sensible, because there we never want to unfold the definition but only work with the characterizing properties. But requiring that every type built on top of a tensor product can't be an abbrev, because of potential performance regressions, does not sound like a strategy that will scale.
Matthew Ballard (Feb 13 2025 at 14:55):
First, the changes to the elaborator do not result in a sufficient performance impact to really turn core's head. This is a low priority and I don't see anything we've learned here changing that.
I think def + duplication is where this will end up. It is the natural way. We are already in desperate need of better automation for this in many other settings.
Has anyone attempted making TensorProduct completely irreducible?
Kevin Buzzard (Mar 15 2025 at 22:15):
I got nerdsniped into thinking about this again. Maybe this CotangentSpace being a def is a red herring, in the sense that it might be one cause of a slowdown, but not the main cause. I went back to my joke branch where there was only one universe (the joke branch gave a 60% speed-up but the file is still super-slow!), and chose a random slowish result (in this case δAux_toAlgHom, line 269 or so) and tried to figure out why it was slow. The proof ends rw [add_left_comm]; rfl, and the rw takes nearly half a second on my fast machine. The reason is that Lean seems to be going through some unification hell, because the term it has before the rewrite, namely
⊢ eval₂ (algebraMap S T) (fun i ↦ Q.val i) p • (δAux R Q') (f.val n) +
      (Q.val n • (δAux R Q) p +
        Q.val n •
          (Finsupp.linearCombination T (⇑(δAux R Q') ∘ f.val))
            (Q.cotangentSpaceBasis.repr (1 ⊗ₜ[Q.Ring] (D S Q.Ring) p))) =
    Q.val n • (δAux R Q) p +
      ((aeval Q.val) p • (δAux R Q') (f.val n) +
        Q.val n •
          (Finsupp.linearCombination T (⇑(δAux R Q') ∘ f.val))
            (Q.cotangentSpaceBasis.repr (1 ⊗ₜ[Q.Ring] (D S Q.Ring) p)))
, has some issues. The Q.cotangentSpaceBasis.repr term in there is carrying around a TensorProduct.leftModule : Module T (T ⊗[Q.toExtension.Ring] Ω[Q.toExtension.Ring⁄S]). But it's being fed a term of type T ⊗[Q.Ring] Ω[Q.Ring⁄S]. Now it is true that Q.toExtension.Ring = Q.Ring by rfl, but not at reducible transparency. So what is happening in this proof is that whenever Lean tries to do any arithmetic, it is constantly having to check that the addition on T ⊗[Q.Ring] Ω[Q.Ring⁄S] is equal to the addition on T ⊗[Q.toExtension.Ring] Ω[Q.toExtension.Ring⁄S], and it does this by unfolding all of the instances, resulting in everything taking forever. Example of trace:
[] [0.466879] rewrite [add_left_comm] ▼
  [Meta.check] [0.458498] ✅️ fun _a ↦
      _a =
        Q.val n • (δAux R Q) p +
          ((aeval Q.val) p • (δAux R Q') (f.val n) +
            Q.val n •
              (Finsupp.linearCombination T (⇑(δAux R Q') ∘ f.val))
                (Q.cotangentSpaceBasis.repr (1 ⊗ₜ[Q.Ring] (D S Q.Ring) p))) ▼
    [isDefEq] [0.056118] ✅️ AddCommMonoid Q.toExtension.CotangentSpace =?= AddCommMonoid (T ⊗[Q.Ring] Ω[Q.Ring⁄S]) ▶
    [isDefEq] [0.047696] ✅️ Module T Q.toExtension.CotangentSpace =?= Module T (T ⊗[Q.Ring] Ω[Q.Ring⁄S]) ▶
    [isDefEq] [0.047258] ✅️ AddCommMonoid Q.toExtension.CotangentSpace =?= AddCommMonoid (T ⊗[Q.Ring] Ω[Q.Ring⁄S]) ▶
    [isDefEq] [0.042882] ✅️ Module T Q.toExtension.CotangentSpace =?= Module T (T ⊗[Q.Ring] Ω[Q.Ring⁄S]) ▶
    [isDefEq] [0.042163] ✅️ AddCommMonoid Q.toExtension.CotangentSpace =?= AddCommMonoid (T ⊗[Q.Ring] Ω[Q.Ring⁄S]) ▶
    [isDefEq] [0.046017] ✅️ Module T Q.toExtension.CotangentSpace =?= Module T (T ⊗[Q.Ring] Ω[Q.Ring⁄S]) ▶
    [isDefEq] [0.045918] ✅️ Basis Q.vars T Q.toExtension.CotangentSpace =?= Basis Q.vars T Q.toExtension.CotangentSpace ▶
    [isDefEq] [0.057254] ✅️ Q.toExtension.CotangentSpace =?= T ⊗[Q.Ring] Ω[Q.Ring⁄S] ▶
Note that Q.toExtension.CotangentSpace is an abbrev for T ⊗[Q.toExtension.Ring] Ω[Q.toExtension.Ring⁄S]. When typeclass inference can't see that the types are equal, but it knows they must be because of some defeq abuse which it has picked up, then it has to do lots of very long checks verifying that the typeclasses it finds on the two types match up. Is there a way of making these types more equaller?
Kevin Buzzard (Mar 15 2025 at 22:40):
Right now we have docs#Algebra.Extension:
structure Algebra.Extension where
  /-- The underlying algebra of an extension. -/
  Ring : Type w
  [commRing : CommRing Ring]
  [algebra₁ : Algebra R Ring]
  [algebra₂ : Algebra Ring S]
  [isScalarTower : IsScalarTower R Ring S]
  /-- A chosen (set-theoretic) section of an extension. -/
  σ : S → Ring
  algebraMap_σ : ∀ x, algebraMap Ring S (σ x) = x
with this type embedded as a field, and then docs#Algebra.Generators.toExtension taking (P : Generators R S) and defined as
@[simps]
noncomputable
def toExtension : Extension R S where
  Ring := P.Ring
  σ := P.σ
  algebraMap_σ := by simp
so we have P.toExtension.Ring = P.Ring, but with_reducible_and_instances rfl doesn't prove it. So, as far as typeclass inference is concerned, these are two different rings, and any defeq abuse will be very costly.
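One can test the two transparencies directly; a minimal sketch, assuming import Mathlib (the second example is commented out because it fails, which is the point):
universe u w

section
variable {R S : Type u} [CommRing R] [CommRing S] [Algebra R S]
  (P : Algebra.Generators.{w} R S)

-- succeeds: the two types are defeq at default transparency
example : P.toExtension.Ring = P.Ring := rfl

-- fails: at reducible transparency `toExtension` is never unfolded
-- example : P.toExtension.Ring = P.Ring := by with_reducible rfl

end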
Kevin Buzzard (Mar 15 2025 at 22:42):
Also, woe betide you if you don't remember P.toExtension.{u}, because then there will be another very long search as Lean tries to figure out the most general universe variable which makes everything work.
Kevin Buzzard (Mar 15 2025 at 23:02):
Similarly, the lemma map_comp_cotangentComplex_baseChange is slow even with one universe; the proof is ext; simp, but the simp takes over 5 seconds on my machine, and most of it is spent proving that things which look syntactically equal are defeq. However, if you put pp.all on (making the terms over 1000 lines long) and wade through the diff, you see
10,16c10,13
< (@Algebra.Extension.Ring.{u, u, u} R S inst✝⁶ inst✝⁵ inst✝⁴
< (@Algebra.Generators.toExtension.{u, u, u} R S inst✝⁶ inst✝⁵ inst✝⁴ P))
< (@CommRing.toCommSemiring.{u}
< (@Algebra.Extension.Ring.{u, u, u} R S inst✝⁶ inst✝⁵ inst✝⁴
< (@Algebra.Generators.toExtension.{u, u, u} R S inst✝⁶ inst✝⁵ inst✝⁴ P))
< (@Algebra.Extension.commRing.{u, u, u} R S inst✝⁶ inst✝⁵ inst✝⁴
< (@Algebra.Generators.toExtension.{u, u, u} R S inst✝⁶ inst✝⁵ inst✝⁴ P)))
---
> (@Algebra.Generators.Ring.{u, u, u} R S inst✝⁶ inst✝⁵ inst✝⁴ P)
> (@CommRing.toCommSemiring.{u} (@Algebra.Generators.Ring.{u, u, u} R S inst✝⁶ inst✝⁵ inst✝⁴ P)
> (@MvPolynomial.instCommRingMvPolynomial.{u, u} R
> (@Algebra.Generators.vars.{u, u, u} R S inst✝⁶ inst✝⁵ inst✝⁴ P) inst✝⁶))
18,20c15
< (@KaehlerDifferential.{u, u} R
< (@Algebra.Extension.Ring.{u, u, u} R S inst✝⁶ inst✝⁵ inst✝⁴
< (@Algebra.Generators.toExtension.{u, u, u} R S inst✝⁶ inst✝⁵ inst✝⁴ P))
---
> (@KaehlerDifferential.{u, u} R (@Algebra.Generators.Ring.{u, u, u} R S inst✝⁶ inst✝⁵ inst✝⁴ P)
(and hundreds more of these): again it's the case that Algebra.Extension.Ring and Algebra.Generators.Ring are somehow being unified by defeq abuse, and this is throwing typeclass inference into chaos (it proves that e.g. the ring instances on these two types match up by completely taking everything apart, which is very time-consuming).
Christian Merten (Mar 16 2025 at 13:13):
Kevin Buzzard said:
Is there a way of making these types more equaller?
I tried with a unification hint in #22975, but the performance did not improve.
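For readers unfamiliar with the mechanism, the hint is roughly of this shape (a sketch only; the binder syntax and the actual hint in #22975 may well differ):
universe u w

unif_hint (R S : Type u) [CommRing R] [CommRing S] [Algebra R S]
    (P : Algebra.Generators.{w} R S) where
  ⊢ P.toExtension.Ring ≟ P.Ring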
Kevin Buzzard (Mar 27 2025 at 18:53):
Christian is visiting London and we revisited this issue. It turns out that on master P.toExtension.Ring = P.Ring is true by rfl but not by with_reducible rfl (so typeclass inference is free to assign ring structures which might be different but aren't, and then spend ages checking that they agree). And on the unification hint branch #22975 they are equal by with_reducible rfl, but this doesn't solve the problem, because apparently (according to people who can read the source code) this only works because the usual with_reducible rfl check fails, and only then does Lean check for unification hints. So the simp calls are still slow, because they succeed slowly instead of failing, meaning that the unification hints never kick in!
Kevin Buzzard (Apr 24 2025 at 10:10):
Kyle Miller said:
Plus: Is there any case of a universe level that's used only in the body of a declaration that couldn't just be replaced by 0?
Yes. In fact Mathlib's slowest file, JacobiZariski.lean, is full of them (and indeed someone suggested to me that perhaps one reason this file is so slow is that every time a user forgets to explicitly state which universe they're using, there might be a huge time hit). The culprits are declarations such as docs#Algebra.Extension and docs#Algebra.Generators. Mathematically the latter definition is all about presenting ring extensions, and sometimes this is done under finiteness conditions where you want vars just to be Fin n for some appropriate n, which is of course in Type 0, but sometimes you are doing some universal argument and you want vars to be equal to the type underlying some ring in the argument, in which case it could well be in Type u.
Yaël Dillies (Apr 24 2025 at 10:28):
Why can't docs#Algebra.Extension.Ring live in Type (max u v)?
Kevin Buzzard (Apr 24 2025 at 11:06):
Answering the analogous question about Generators: because sometimes you want vars to be Fin n or whatever, and this lives in Type 0. And no, you don't want it to be ULift (Fin n), because that just introduces a lot of noise.
Yaël Dillies (Apr 24 2025 at 11:08):
But why? Surely any indexing type is just as good as another one
Yaël Dillies (Apr 24 2025 at 11:09):
What is it about Fin n that's less noisy than ULift (Fin n)?
Kevin Buzzard (Apr 24 2025 at 11:10):
For your Ring example, this would be like saying "any ring isomorphic to the ring you want would be just as good as the one you want" so this is like saying "adding loads of RingEquivs into a file which aren't currently there and then fixing all the proofs will be just as good as not having them there" which is clearly false.
Kevin Buzzard (Apr 24 2025 at 11:12):
In #21165 I changed all universes in JacobiZariski to be the same universe and I got a 61% speedup. Universes are definitely one of the big problems in that file but I don't think that people are prepared to pay the price which you want them to pay. You could try the refactor yourself I guess!
Yaël Dillies (Apr 24 2025 at 11:12):
Sure, for a ring I would understand, but for an indexing type?
Yaël Dillies (Apr 24 2025 at 11:14):
And a follow-up question on the ring example: why is Algebra.Extension.Ring a field of Algebra.Extension, rather than an index?
Kevin Buzzard (Apr 24 2025 at 11:15):
Sometimes the indexing type is one of the rings involved, which would then have to become a ULift of it, so again you are going to get a lot of noise. Regarding the design decisions, I quizzed @Christian Merten about this when he was in London last month and he tried a different design but ran into problems; I think he would be better placed to answer this than me.
Yaël Dillies (Apr 24 2025 at 11:15):
In my mind, either you care about the defeqs and that type should be an index, or you don't care about the defeqs and that type can be ulifted however the definition requires it to. Currently, you're saying "We care about the defeqs, but also it's not an index"
Kevin Buzzard (Apr 24 2025 at 11:16):
Mario suggested that these random extra universes in structure fields were a code smell and that we should try and rip them out altogether, but I think this is what Christian tried and failed to do.
Yaël Dillies (Apr 24 2025 at 11:17):
I stand with Mario here. I guess I will go bother Christian about this
Kevin Buzzard (Apr 24 2025 at 11:20):
It's one thing "standing with Mario" but unfortunately it's quite another to find a design which works and which doesn't have the smell.
Kevin Buzzard (Apr 24 2025 at 11:29):
Let me go through the mathematics of docs#Algebra.Generators to at least give you a feel for the issue. I'll simplify a little by dropping the section σ; that is not important for us. What is happening is that we have two commutative rings R and S, and S is an R-algebra, so we have some fixed ring homomorphism from R to S. Sometimes in commutative algebra it becomes necessary to have certain "presentations" of this homomorphism, i.e. you want to write R → S as a composite R → R[Xᵢ : i ∈ I] → S, where R[Xᵢ : i ∈ I] is a polynomial ring in variables indexed by an indexing set I, and the Lean issue is the universe which I lives in. Often you will not want just one presentation of the ring homomorphism; you want to be able to change the presentation on the fly during a proof (this is why it's a structure, not a class). Situations which I could imagine being needed in the real world: I could be finite (if S is finitely-presented), or it could be S itself (in S's universe), or it could be some type built out of both R and S (in the max of R's and S's universes), or it could be an arbitrary type which we get from the axiom of choice (and then maybe we don't even have much control over it: it could end up in max (max u (v+1)) v or whatever, depending on how much category theory was used to construct the index set).
Kevin Buzzard (Apr 24 2025 at 11:34):
One natural attempt to fix this is to make vars (this is I in the previous post) an input to the structure rather than a field, but I think this is precisely what Christian tried to do and had problems with.
Notification Bot (Apr 24 2025 at 13:38):
17 messages were moved here from #mathlib4>Task 26: Replace Type u by Type* wherever possible by Kevin Buzzard.
Jz Pan (Apr 24 2025 at 14:49):
Kevin Buzzard said:
...you want to write R → S as a composite R → R[Xᵢ : i ∈ I] → S, where R[Xᵢ : i ∈ I] is a polynomial ring in variables indexed by an indexing set I, and the Lean issue is the universe which I lives in.
Maybe we can try forcing I to be a Set S, just like what Module.rank does.
Andrew Yang (Apr 24 2025 at 14:54):
Yaël Dillies said:
What is it about Fin n that's less noisy than ULift (Fin n)?
There is a lot of API on tuples (Fin n → R) that is not there for ULift (Fin n). Or are you suggesting we develop it for ULift (Fin n) as well?
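For instance (a sketch, assuming import Mathlib):
universe u

-- the tuple API applies directly to `Fin n → R`-style arguments...
example (f : Fin 2 → ℕ) : ∑ i, f i = f 0 + f 1 := Fin.sum_univ_two f

-- ...while with `ULift (Fin n)` every use of an index goes through `.up`/`.down`
example (n : ℕ) (v : Fin n → ℕ) : ULift.{u} (Fin n) → ℕ := fun i ↦ v i.down
example (n : ℕ) (i : Fin n) : ULift.{u} (Fin n) := ⟨i⟩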
Antoine Chambert-Loir (Apr 24 2025 at 21:04):
In the ongoing work of @María Inés de Frutos Fernández and me on polynomial laws, we had to consider (partial) presentations of R-algebras S, namely R-algebra morphisms P → S where P is in the same universe as R and whose image contains given elements. Our solution was indeed to take for P a polynomial algebra over R in n indeterminates, for a given integer n. I have tried using docs#Small but this wasn't as easy.
Last updated: May 02 2025 at 03:31 UTC