Zulip Chat Archive

Stream: lean4

Topic: kernel universe normalization issue

Kevin Buzzard (Jan 28 2025 at 14:56):

Currently the slowest file in mathlib is Mathlib.RingTheory.Kaehler.JacobiZariski. In attempting to try and figure out why, we've stumbled upon the following phenomenon.
A summary: the proof Foo.{max u w w'} = Foo.{max (max u w) w'} := by rfl is elaborated very quickly but typechecking in the kernel takes several seconds. The slow mathlib file is full of stuff like that (implicitly). A possible reason: this is a gigantic term and the kernel is unfolding it before noticing that the universes can be unified. Here's some code demonstrating the problem (not mathlib-free of course, but if someone has a slick way to make a gigantic term involving several universes which would be costly to unfold then it might be possible to give a simple mathlib-free example):

import Mathlib.RingTheory.Kaehler.JacobiZariski

namespace Algebra.Generators

open KaehlerDifferential TensorProduct MvPolynomial

universe w' w u v uT

variable {R : Type u} {S : Type v} [CommRing R] [CommRing S] [Algebra R S]
  {T : Type uT} [CommRing T] [Algebra R T] [Algebra S T] [IsScalarTower R S T]
  (Q : Generators.{w} S T) (P : Generators.{w'} R S)

set_option trace.profiler true

-- quick
-- [Elab.command] [0.041051] example
example :
    Extension.CotangentSpace.{max u w w'} (Q.comp P).toExtension =
    Extension.CotangentSpace.{max u w w'} (Q.comp P).toExtension := rfl

-- slow
-- [Elab.command] [2.038793] example :
-- ...
-- [Kernel] [1.874395] typechecking declaration
example :
    Extension.CotangentSpace.{max u w w'} (Q.comp P).toExtension =
    Extension.CotangentSpace.{max (max u w) w'} (Q.comp P).toExtension := rfl

Currently the slow (500 line) mathlib file starts universe u₁ u₂ u₃ u₄ w' w u v uT. For an experiment I tried replacing this line with universe u and using one universe throughout the file: the result was a saving of nearly half a trillion instructions (see here ) and a 20% reduction in olean file size.

I would be happy to do further experiments if necessary, but the naive path to a mathlib-free minimisation will just involve building a full algebra hierarchy from scratch (which I've done before...). Are there people out there who understand how the kernel handles universes who can shed some light on what is actually happening here?

Eric Wieser (Jan 28 2025 at 16:55):

The other note here is that lean4#5695 makes the second "slow" example likely to occur, right?

Christian Merten (Jan 28 2025 at 19:54):

I minimized a bit, still imports RingTheory.TensorProduct.Basic, but I need to stop now. If it helps, here it is:

foo

import Mathlib.RingTheory.TensorProduct.Basic

set_option linter.style.setOption false
suppress_compilation

universe w w' u v uT x

class HasQuotient (A : outParam <| Type u) (B : Type v) where
  quotient' : B → Type max u v

abbrev HasQuotient.Quotient (A : outParam <| Type u) {B : Type v}
    [HasQuotient A B] (b : B) : Type max u v :=
  HasQuotient.quotient' b

notation:35 G " ⧸ " H:34 => HasQuotient.Quotient G H

namespace QuotientAddGroup

variable {G : Type u} [AddGroup G] (N : AddSubgroup G) [nN : N.Normal] {H : Type v} [AddGroup H]

def leftRel : Setoid G :=
  AddAction.orbitRel N.op G

protected def con : AddCon G where
  toSetoid := leftRel N
  add' := sorry

instance instHasQuotientSubgroup : HasQuotient G (AddSubgroup G) :=
  ⟨fun s => Quotient (leftRel s)⟩

instance Quotient.addGroup : AddGroup (G ⧸ N) := (QuotientAddGroup.con N).addGroup

instance Quotient.addCommGroup {G : Type*} [AddCommGroup G] (N : AddSubgroup G) :
      AddCommGroup (G ⧸ N) :=
  { toAddGroup := have := N.normal_of_comm; QuotientAddGroup.Quotient.addGroup N
    add_comm := sorry }

end QuotientAddGroup

instance {R M : Type*} [Ring R] [AddCommGroup M] [Module R M] :
    HasQuotient M (Submodule R M) :=
  ⟨fun p => Quotient (QuotientAddGroup.leftRel p.toAddSubgroup)⟩

instance {R M : Type*} [Ring R] [AddCommGroup M] [Module R M] (p : Submodule R M) :
    AddCommGroup (M ⧸ p) :=
  QuotientAddGroup.Quotient.addCommGroup p.toAddSubgroup

open TensorProduct

namespace Algebra

open TensorProduct

variable {R : Type u} {S : Type v} [CommRing R] [CommRing S] [Algebra R S]

structure Extension (R : Type u) (S : Type v) [CommRing R] [CommRing S] [Algebra R S] where
  Ring : Type w

structure Generators (R : Type u) (S : Type v) [CommRing R] [CommRing S] [Algebra R S] where
  vars : Type w

def Generators.toExtension (_ : Generators.{w} R S) : Extension.{max u w, u, v} R S where
  Ring := ULift ℤ

def Generators.comp {T} [CommRing T] [Algebra R T] [Algebra S T] [IsScalarTower R S T]
      (_ : Generators.{w} S T) (_ : Generators.{w'} R S) : Generators.{max w w'} R T where
  vars := ULift ℤ

end Algebra

-- removing quotients or tensorproducts here makes it fast
def KaehlerDifferential (R : Type u) (S : Type v) [CommRing R] [CommRing S] [Algebra R S] :
    Type v :=
  (⊥ : AddSubgroup (S ⊗[R] S)) ⧸ (⊥ : Submodule R (⊥ : Submodule S (S ⊗[R] S)))

instance (R : Type u) (S : Type v) [CommRing R] [CommRing S] [Algebra R S] :
    AddCommGroup (KaehlerDifferential R S) := by
  delta KaehlerDifferential
  infer_instance

-- removing tensorproduct here makes it fast
abbrev Algebra.Extension.CotangentSpace {R : Type u} {S : Type v} [CommRing R] [CommRing S]
    [Algebra R S] (_ : Extension.{w} R S) : Type _ :=
  ℤ ⊗[ℤ] (KaehlerDifferential ℤ (ULift.{w} ℤ))

section Experiment

open Algebra

variable {R : Type u} {S : Type v} [CommRing R] [CommRing S] [Algebra R S]
  {T : Type uT} [CommRing T] [Algebra R T] [Algebra S T] [IsScalarTower R S T]
  (Q : Generators.{w} S T) (P : Generators.{w'} R S)

set_option profiler true

example :
    Extension.CotangentSpace.{max u w w'} (Q.comp P).toExtension =
    Extension.CotangentSpace.{max u w w'} (Q.comp P).toExtension := rfl

example :
    Extension.CotangentSpace.{max u w w'} (Q.comp P).toExtension =
    Extension.CotangentSpace.{max (max u w) w'} (Q.comp P).toExtension := rfl

end Experiment

Kevin Buzzard (Jan 28 2025 at 22:22):

One could no doubt persevere down this line but if there is some slick way to perhaps autogenerate some term which takes forever to unfold and eats three universes then this might do just as well.

Just to stress that the phenomenon here is not artificial -- here is an explanation of an example of it actually happening in an algebraic geometry file in mathlib. Algebraic geometry is famously slow in mathlib and I wonder to what extent this is being caused by this funny universe issue.

Sebastian Ullrich (Jan 30 2025 at 22:38):

So one issue here seems to be that Extension.CotangentSpace is a relatively complex definition, but it is marked as abbrev. This essentially tells the kernel "don't bother comparing my (level/expr) arguments modulo defeq, just unfold me". Which may lead to the same thing happening with some other abbrev inside its definition, I haven't checked that far. Making it def makes it fast.

Christian Merten (Jan 30 2025 at 23:18):

Did you try this on the minimized version I posted above or in the actual file? Because I can confirm that it makes it fast in the minized version, but when I tried it on master, the performance did not change / got worse (I had sorried out some proofs that broke, but that should make it faster).

Sebastian Ullrich (Jan 31 2025 at 07:06):

I tried it in Mathlib directly below the actual definition, no other changes

Johan Commelin (Feb 03 2025 at 05:22):

@Christian Merten Would you mind trying again, to double check?

Christian Merten (Feb 03 2025 at 10:59):

Johan Commelin said:

Christian Merten Would you mind trying again, to double check?

I tried it again at #21364 sorrying out all proofs that break, which unfortunately the speed center does not like. But local lake env perf stat lean Mathlib/RingTheory/Kaehler/JacobiZariski.lean is still at 736 * 10^9 instructions which is very close to the 727 * 10^9 instructions on master.

Matthew Ballard (Feb 03 2025 at 13:52):

Things I investigated

Implementing Eric's suggestion to normalize max associativity the other way
Reversing the accumulator in the kernel to match the convention of the elaborator
Trying to sprinkle docs#Lean.Level.normalize in places to test during elaboration

I was successful with the first two but nothing really improved. The last one was not as successful as I would have liked. In particular, I could not get the universes in #mathlib4 > JacobiZariski is slow. @ 💬 normalized. I couldn't get them to change at all.

I will break off my confusion on that into a separate post shortly ( :fingers_crossed:)

Matthew Ballard (Feb 03 2025 at 14:03):

But, I will say that it seems that the level metavariable introduced in such patterns (free universe parameters in fields) doesn't get instantiated until really, really late in elaboration.

Sebastian Ullrich (Feb 03 2025 at 14:05):

Christian Merten said:

Johan Commelin said:

Christian Merten Would you mind trying again, to double check?

I tried it again at #21364 sorrying out all proofs that break, which unfortunately the speed center does not like. But local lake env perf stat lean Mathlib/RingTheory/Kaehler/JacobiZariski.lean is still at 736 * 10^9 instructions which is very close to the 727 * 10^9 instructions on master.

Let's focus on the slow example given above

import Mathlib.RingTheory.Kaehler.Polynomial
import Mathlib.RingTheory.Generators

open KaehlerDifferential TensorProduct MvPolynomial

namespace Algebra

universe w u v

variable {R : Type u} {S : Type v} [CommRing R] [CommRing S] [Algebra R S]

namespace Extension

variable (P : Extension.{w} R S)

/--
The cotangent space on `P = R[X]`.
This is isomorphic to `Sⁿ` with `n` being the number of variables of `P`.
-/
abbrev CotangentSpace : Type _ := S ⊗[P.Ring] Ω[P.Ring⁄R]
def CotangentSpace' : Type _ := S ⊗[P.Ring] Ω[P.Ring⁄R]

universe uT w'
variable {R : Type u} {S : Type v} [CommRing R] [CommRing S] [Algebra R S]
  {T : Type uT} [CommRing T] [Algebra R T] [Algebra S T] [IsScalarTower R S T]
  (Q : Generators.{w} S T) (P : Generators.{w'} R S)

set_option trace.profiler true
example :
    Extension.CotangentSpace.{max u w w'} (Q.comp P).toExtension =
    Extension.CotangentSpace.{max (max u w) w'} (Q.comp P).toExtension := rfl
example :
    Extension.CotangentSpace'.{max u w w'} (Q.comp P).toExtension =
    Extension.CotangentSpace'.{max (max u w) w'} (Q.comp P).toExtension := rfl

[Elab.command] [1.655671]
[Elab.command] [0.035150]

Matthew Ballard (Feb 03 2025 at 14:22):

NB: No provision of my statements shall be interpreted or enforced in a manner that alters, diminishes, or otherwise impacts my support for making that a def or structure.

Matthew Ballard (Feb 03 2025 at 15:11):

#lean4 > instantiation of level metavariables in structure instances is my precise question(s)

Last updated: May 02 2025 at 03:31 UTC