Zulip Chat Archive

Stream: mathlib4

Topic: Typeclass Synthesis Regression


Artie Khovanov (Dec 18 2025 at 03:17):

On Mathlib revision 381f27922fc7b639c2670d0039751dc603d5c55d

import Mathlib

variable {R : Type*} [CommRing R]

open Polynomial

#synth Mul (R[X]  ( : Ideal R[X])) -- times out (~20500 hb)

Kevin Buzzard (Dec 18 2025 at 09:48):

Even before the recent PRs which have pushed it over the line, mathlib was having trouble finding this instance. On a version of mathlib from last week we have this monster trace

first part of trace

(note "350 more entries" at the bottom). Highlights from the beginning are:

    [] [306.000000] new goal Mul (R[X] ⧸ ⊥) ▼
      [whnf] [25.000000] Non-easy whnf: Mul (R[X] ⧸ ⊥)
      [instances] #[@Lean.Grind.Semiring.toMul, @Semigroup.toMul, @CommMagma.toMul, @MulOne.toMul, @MulZeroClass.toMul, @Distrib.toMul, @GradedMonoid.GradeZero.mul]

i.e. "we're going to try and find Mul on a thing, and there are a ton of basic instances of this defined right at the start of mathlib such as Distrib.toMul, and then there's the completely exotic GradedMonoid.GradeZero.mul defined right at the end, and so we'll try this one first because that's the algorithm". So the next step is

    [] [576514.000000] ✅️ apply @GradedMonoid.GradeZero.mul to Mul (R[X] ⧸ ⊥) ▶

and now suddenly we're looking for a grading to get our mul (and note that this is happening in every file which imports GradedMonoid.GradeZero.mul and in every project or file which has a casual import Mathlib somewhere, every time someone asks for a multiplication) (I take that back -- there is an extra coincidence here).

[] [292.000000] new goal GradedMonoid.GMul (HasQuotient.Quotient R[X]) ▼
        [whnf] [25.000000] Non-easy whnf: GradedMonoid.GMul (HasQuotient.Quotient R[X])
        [instances] #[@GradedMonoid.GMonoid.toGMul, @DirectSum.GNonUnitalNonAssocSemiring.toGMul]

So now we're about to start looking for a graded ring structure on...wait...what? HasQuotient.Quotient R[X]?

    [] [4482.000000] ✅️ apply @DirectSum.GCommRing.toGCommSemiring to DirectSum.GCommSemiring (HasQuotient.Quotient R[X]) ▼
      [whnf] [25.000000] Non-easy whnf: DirectSum.GCommSemiring (HasQuotient.Quotient R[X])

and there it ultimately goes: to find a multiplication on the quotient we're going to look for a graded ring structure on HasQuotient.Quotient R[X], because HasQuotient.Quotient R[X] is indeed a function from the type Ideal R[X] to Type (sending II to R[X]/IR[X]/I) and indeed these quotients do have additive group structures, so perhaps this is a graded ring. And indeed it's not hard to find additive group structures on the quotient. So what's left to do before we ultimately fail to find this graded ring structure?

    [] [3984.000000] ✅️ apply @GradedMonoid.GMonoid.toGMul to GradedMonoid.GMul (HasQuotient.Quotient R[X]) ▼

Aah yes, a graded structure on a collection of additive groups indexed by a type which has addition (and indeed you can add ideals) needs a GMul, so now having started looking for Mul on R[X]/I we're now looking for GMul on the function sending an ideal II to R[X]/IR[X]/I, i.e. a map R[X]/I->R[X]/J->R[X]/(I+J). OK so it will suffice to find a GMonoid structure on this function. Now, how do we find GMonoid structures? Well we certainly need additive monoid structures on each factor, so how do we find an additive monoid structure on R[X]/IR[X]/I?

      [] [314.000000] new goal (i : Submodule R[X] R[X]) → AddCommMonoid (R[X] ⧸ i) ▼
        [whnf] [25.000000] Non-easy whnf: AddCommMonoid (R[X] ⧸ i)
        [instances] #[@AddCancelCommMonoid.toAddCommMonoid, @AddCommMonoid.ofIsAddCommutative, @SubtractionCommMonoid.toAddCommMonoid, @AddCommGroup.toAddCommMonoid, @AddCommMonoidWithOne.toAddCommMonoid, @NonUnitalNonAssocSemiring.toAddCommMonoid, @LinearOrderedAddCommMonoidWithTop.toAddCommMonoid, @ESeminormedAddCommMonoid.toAddCommMonoid]

Well there are a lot of choices, but we've imported everything and the last example of how to make an additive monoid is we can get one from an ESeminormedAddCommMonoid. Fabulous! Let's now start looking for norms on R[X]/I. Well, norms are something to do with topology so maybe we should be looking for a topology?

    [] [135646.000000] ✅️ apply @topologicalRingQuotientTopology to (i : Submodule R[X] R[X]) → TopologicalSpace (R[X] ⧸ i) ▶
    [] [1206.000000] ✅️ apply @UpgradedStandardBorel.toTopologicalSpace to TopologicalSpace R[X] ▶

etc etc.

Kevin Buzzard (Dec 18 2025 at 09:59):

attribute [instance 900] GradedMonoid.GradeZero.mul

divides the time taken to find this instance by a factor of three (and in particular fixes the timeout on this week's mathlib)

Kevin Buzzard (Dec 18 2025 at 10:01):

aargh lol looking at the new trace:

    [] [305.000000] new goal Mul (R[X] ⧸ ⊥) ▶
    [] [1500.000000] ✅️ apply @Distrib.toMul to Mul (R[X] ⧸ ⊥) ▶
    [] [1706.000000] ✅️ apply @NonUnitalNonAssocSemiring.toDistrib to Distrib (R[X] ⧸ ⊥) ▶
    [] [576477.000000] ✅️ apply @DirectSum.GradeZero.nonUnitalNonAssocSemiring to NonUnitalNonAssocSemiring (R[X] ⧸ ⊥) ▶
    [] [4152.000000] ❌️ apply @FirstOrder.Language.instDecidableEqRelations to DecidableEq (Submodule R[X] R[X]) ▶
    [] [4000.000000] ❌️ apply @FirstOrder.Language.instDecidableEqFunctions to DecidableEq (Submodule R[X] R[X]) ▶
    [] [3885.000000] ✅️ apply @LinearOrder.toDecidableEq to DecidableEq (Submodule R[X] R[X]) ▶
    [] [151392.000000] ✅️ apply ValuationRing.instLinearOrderIdealOfDecidableLE to LinearOrder (Submodule R[X] R[X]) ▶
    [] [5008.000000] ✅️ apply @ValuationRing.toPreValuationRing to PreValuationRing R[X] ▶
    [] [3056.000000] ✅️ apply of_isDiscreteValuationRing to ValuationRing R[X] ▶

and off we go again with randomness

Kevin Buzzard (Dec 18 2025 at 10:03):

attribute [instance 900] GradedMonoid.GradeZero.mul DirectSum.GradeZero.nonUnitalNonAssocSemiring

makes it 10x faster

Kevin Buzzard (Dec 18 2025 at 10:03):

I think we've got graded rings wrong somehow (or perhaps just their prios)

Kevin Buzzard (Dec 18 2025 at 10:05):

aargh dammit

    [] [307.000000] new goal Mul (R[X] ⧸ ⊥) ▶
    [] [1504.000000] ✅️ apply @Distrib.toMul to Mul (R[X] ⧸ ⊥) ▶
    [] [1706.000000] ✅️ apply @NonUnitalNonAssocSemiring.toDistrib to Distrib (R[X] ⧸ ⊥) ▶
    [] [1649.000000] ✅️ apply @NonUnitalNonAssocCommSemiring.toNonUnitalNonAssocSemiring to NonUnitalNonAssocSemiring
          (R[X] ⧸ ⊥) ▶
    [] [1647.000000] ✅️ apply @NonUnitalNonAssocCommRing.toNonUnitalNonAssocCommSemiring to NonUnitalNonAssocCommSemiring
          (R[X] ⧸ ⊥) ▶
    [] [1669.000000] ✅️ apply @NonUnitalCommRing.toNonUnitalNonAssocCommRing to NonUnitalNonAssocCommRing (R[X] ⧸ ⊥) ▶
    [] [1661.000000] ✅️ apply @NonUnitalNormedCommRing.toNonUnitalCommRing to NonUnitalCommRing (R[X] ⧸ ⊥) ▶
    [] [1644.000000] ✅️ apply @NonUnitalCommCStarAlgebra.toNonUnitalNormedCommRing to NonUnitalNormedCommRing (R[X] ⧸ ⊥) ▶
    [] [1637.000000] ✅️ apply CommCStarAlgebra.toNonUnitalCommCStarAlgebra to NonUnitalCommCStarAlgebra (R[X] ⧸ ⊥) ▶
    [] [1917.000000] ✅️ apply @NormedCommRing.toNonUnitalNormedCommRing to NonUnitalNormedCommRing (R[X] ⧸ ⊥) ▶

why is typeclass inference so stupid? Don't look for a norm! Clearly this question has nothing to do with norms! Don't look for a grading! Don't look for a topology! This is boring old ring theory!

Kevin Buzzard (Dec 18 2025 at 10:12):

Actually I am wrong about the severity of the issue. The funny coincidence which is happening here is this:

@GradedMonoid.GradeZero.mul : {ι : Type u_1}  (A : ι  Type u_1)  [inst : AddZeroClass ι]  [GradedMonoid.GMul A]  Mul (A 0)

i.e. "if I can find a GMul on a function A : ι → Type where ι has a zero and an addition, then I can find a Mul on A 0. So is (R[X] ⧸ (⊥ : Ideal R[X])) of the form A 0? Oh, coincidentally, it is! Because 0 = ⊥ and ι = Ideal (R[X]) has an addition, so this works great if A = HasQuotient.Quotient R[X] = I ↦ R[X]⧸I. So it's this weird coincidence which is causing the problems here; usually this GradedMonoid.GradeZero.mul would presumably be quick to fail.

Floris van Doorn (Dec 18 2025 at 11:40):

I think making the graded instances globally a lower priority might be nice to try.
A more severe option would be scoping them, but that might be annoying.

Floris van Doorn (Dec 18 2025 at 11:56):

Some other things I noticed when looking when looking at live.lean-lang (Mathlib commit dfccc70):

  • It spends a very long time trying to apply both docs#Ideal.Quotient.normedCommRing and docs#Ideal.Quotient.semiNormedCommRing . Unfortunately it spends the same amount of time for both of them, probably doing essentially the same steps (but not the same enough so that the cache applies) (roughly 1.2M heartbeats with attribute [-instance] GradedMonoid.GradeZero.mul DirectSum.GradeZero.nonUnitalNonAssocSemiring DirectSum.GradeZero.nonUnitalNonAssocRing)
  • It spends a lot of time unifying, before checking that there is actually a NormedCommRing instance on R[X]. Since elaboration is so complex, this is probably not something we can change.
  • It spends a very long amount of time on both unifying HasQuotient.quotient _ _ =?= HasQuotient.quotient _ _ and then on HasQuotient.quotient' _ _ =?= HasQuotient.quotient' _ _:
          [] [1283186.000000] ❌️ R[X]   =?= ?m.23  ?m.25 
            [delta] [434066.000000] ❌️ R[X]   =?= ?m.23  ?m.25 
            [] [848764.000000] ❌️ HasQuotient.quotient'  =?= HasQuotient.quotient' ?m.25 

I'm wondering if there is secretly a very high cost to the following very innocently-looking abbreviation that just changes the explicitness of an argument:

abbrev HasQuotient.Quotient (A : outParam <| Type u) {B : Type v}
    [HasQuotient A B] (b : B) : Type max u v :=
  HasQuotient.quotient' b

Note: it spends roughly the same number of heartbeats trying to unify arguments, so it seems it's actually doing a lot of work twice because of the existence of this abbrev:

              [] [433700.000000] ❌️  =?= ?m.25 
[...]
              [] [422278.000000] ❌️  =?= ?m.25 
  • When unifying these quotient instances, it spends a lot of time unfolding everything it can from the instance Polynomial.semiring. Adding attribute [local irreducible] Polynomial.semiring improves the heartbeat count by a lot. We should make more instances on Polynomial (and other types) irreducible.
    The only thing we have to ensure is that the different instances of Polynomial do unify (e.g. Polynomial.instAdd maybe should be irreducible, but Polynomial.semiring should use the same (irreducible) addition as Polynomial.instAdd. This might require some reworking of the instances.

Floris van Doorn (Dec 18 2025 at 13:08):

Attempt to fix the HasQuotient.quotient' issue: #33037

Artie Khovanov (Dec 19 2025 at 00:39):

@Kevin Buzzard tried some ablations

#count_heartbeats in
#synth Mul (R  ( : Ideal R)) -- 2951 hb

#count_heartbeats in
#synth Mul (R[X]  I) -- 2982 hb

#count_heartbeats in
#synth Mul (R  I) -- 335 hb

Looks like it's some sort of multiplicative blow up due to the combination of the two issues described

Kevin Buzzard (Dec 19 2025 at 08:38):

Yeah the issue is that you're looking for a multiplication on something which is easily checked to be of the form F(0)F(0) so the first thing it tries to find is a heterogeneous multiplication F(a)×F(b)F(a+b)F(a)\times F(b)\to F(a+b) and this takes a long time to fail.

Kevin Buzzard (Dec 19 2025 at 08:39):

Oh but actually then the second one surprises me. Did you look at the trace? I'm unlikely to be near lean for the next two days


Last updated: Dec 20 2025 at 21:32 UTC