Zulip Chat Archive

Stream: mathlib4

Topic: Typeclass Synthesis Regression

Artie Khovanov (Dec 18 2025 at 03:17):

On Mathlib revision 381f27922fc7b639c2670d0039751dc603d5c55d

import Mathlib

variable {R : Type*} [CommRing R]

open Polynomial

#synth Mul (R[X] ⧸ (⊥ : Ideal R[X])) -- times out (~20500 hb)

Kevin Buzzard (Dec 18 2025 at 09:48):

Even before the recent PRs which have pushed it over the line, mathlib was having trouble finding this instance. On a version of mathlib from last week we have this monster trace

first part of trace

[Meta.synthInstance] [12744412.000000] ✅️ Mul (R[X] ⧸ ⊥) ▼
    [whnf] [25.000000] Non-easy whnf: Mul (R[X] ⧸ ⊥)
    [whnf] [25.000000] Non-easy whnf: Mul (R[X] ⧸ ⊥)
    [] [306.000000] new goal Mul (R[X] ⧸ ⊥) ▼
      [whnf] [25.000000] Non-easy whnf: Mul (R[X] ⧸ ⊥)
      [instances] #[@Lean.Grind.Semiring.toMul, @Semigroup.toMul, @CommMagma.toMul, @MulOne.toMul, @MulZeroClass.toMul, @Distrib.toMul, @GradedMonoid.GradeZero.mul]
    [] [576514.000000] ✅️ apply @GradedMonoid.GradeZero.mul to Mul (R[X] ⧸ ⊥) ▶
    [] [3394.000000] ✅️ apply @DirectSum.GNonUnitalNonAssocSemiring.toGMul to GradedMonoid.GMul (HasQuotient.Quotient R[X]) ▶
    [] [397231.000000] ✅️ apply @DirectSum.GSemiring.toGNonUnitalNonAssocSemiring to DirectSum.GNonUnitalNonAssocSemiring
          (HasQuotient.Quotient R[X]) ▶
    [] [4279.000000] ✅️ apply @DirectSum.GRing.toGSemiring to DirectSum.GSemiring (HasQuotient.Quotient R[X]) ▶
    [] [10620.000000] ✅️ apply @DirectSum.GCommRing.toGRing to DirectSum.GRing (HasQuotient.Quotient R[X]) ▶
    [] [4766.000000] ✅️ apply @DirectSum.GCommSemiring.toGSemiring to DirectSum.GSemiring (HasQuotient.Quotient R[X]) ▶
    [] [4482.000000] ✅️ apply @DirectSum.GCommRing.toGCommSemiring to DirectSum.GCommSemiring (HasQuotient.Quotient R[X]) ▶
    [] [2477238.000000] ✅️ apply @Submodule.Quotient.addCommGroup to (i : Submodule R[X] R[X]) → AddCommGroup (R[X] ⧸ i) ▶
    [isDefEq] [87430.000000] ✅️ ?m.31 =?= Submodule.Quotient.addCommGroup ▶
    [resume] [308.000000] propagating (p : Submodule R[X] R[X]) →
          AddCommGroup
            (R[X] ⧸
              p) to subgoal (i : Submodule R[X] R[X]) →
          AddCommGroup (R[X] ⧸ i) of DirectSum.GCommSemiring (HasQuotient.Quotient R[X]) ▶
    [] [3984.000000] ✅️ apply @GradedMonoid.GMonoid.toGMul to GradedMonoid.GMul (HasQuotient.Quotient R[X]) ▶
    [] [3310.000000] ✅️ apply @DirectSum.GSemiring.toGMonoid to GradedMonoid.GMonoid (HasQuotient.Quotient R[X]) ▶
    [] [2669.000000] ✅️ apply @ESeminormedAddCommMonoid.toAddCommMonoid to (i : Submodule R[X] R[X]) → AddCommMonoid (R[X] ⧸ i) ▶
    [] [135646.000000] ✅️ apply @topologicalRingQuotientTopology to (i : Submodule R[X] R[X]) → TopologicalSpace (R[X] ⧸ i) ▶
    [] [1206.000000] ✅️ apply @UpgradedStandardBorel.toTopologicalSpace to TopologicalSpace R[X] ▶
    [] [1296.000000] ✅️ apply @UniformSpace.toTopologicalSpace to TopologicalSpace R[X] ▶
    [] [1744.000000] ✅️ apply @Valued.toUniformSpace to UniformSpace R[X] ▶
    [] [4651.000000] ✅️ apply @IsValuativeTopology.instValuedValueGroupWithZeroOfIsUniformAddGroup to Valued R[X]
          (ValuativeRel.ValueGroupWithZero R[X]) ▶
    [] [1532.000000] ✅️ apply @PseudoMetricSpace.toUniformSpace to UniformSpace R[X] ▶
    [] [1428.000000] ✅️ apply @TopologicalSpace.UpgradedIsCompletelyPseudoMetrizableSpace.toPseudoMetricSpace to PseudoMetricSpace
          R[X] ▶
    [] [1445.000000] ✅️ apply @SeminormedRing.toPseudoMetricSpace to PseudoMetricSpace R[X] ▶
    [] [1450.000000] ✅️ apply @SeminormedCommRing.toSeminormedRing to SeminormedRing R[X] ▶
    [] [1469.000000] ✅️ apply @NormedCommRing.toSeminormedCommRing to SeminormedCommRing R[X] ▶
    [] [1455.000000] ✅️ apply @CommCStarAlgebra.toNormedCommRing to NormedCommRing R[X] ▶
    [] [1481.000000] ✅️ apply @NormedField.toNormedCommRing to NormedCommRing R[X] ▶
    [] [1466.000000] ✅️ apply @DenselyNormedField.toNormedField to NormedField R[X] ▶
    [] [1626.000000] ✅️ apply @RCLike.toDenselyNormedField to DenselyNormedField R[X] ▶
    [] [1458.000000] ✅️ apply @NontriviallyNormedField.toNormedField to NormedField R[X] ▶
    [] [1291.000000] ✅️ apply @DenselyNormedField.toNontriviallyNormedField to NontriviallyNormedField R[X] ▶
    [] [1456.000000] ✅️ apply @NormedRing.toSeminormedRing to SeminormedRing R[X] ▶
    [] [1447.000000] ✅️ apply @CStarAlgebra.toNormedRing to NormedRing R[X] ▶
    [] [1426.000000] ✅️ apply @CommCStarAlgebra.toCStarAlgebra to CStarAlgebra R[X] ▶
    [] [1254.000000] ✅️ apply @NormedCommRing.toNormedRing to NormedRing R[X] ▶
    [] [1433.000000] ✅️ apply @NormedDivisionRing.toNormedRing to NormedRing R[X] ▶
    [] [1274.000000] ✅️ apply @NormedField.toNormedDivisionRing to NormedDivisionRing R[X] ▶
    [] [1482.000000] ✅️ apply @NonUnitalSeminormedRing.toPseudoMetricSpace to PseudoMetricSpace R[X] ▶
    [] [1456.000000] ✅️ apply @NonUnitalSeminormedCommRing.toNonUnitalSeminormedRing to NonUnitalSeminormedRing R[X] ▶
    [] [1291.000000] ✅️ apply @SeminormedCommRing.toNonUnitalSeminormedCommRing to NonUnitalSeminormedCommRing R[X] ▶
    [] [1469.000000] ✅️ apply @NonUnitalNormedCommRing.toNonUnitalSeminormedCommRing to NonUnitalSeminormedCommRing R[X] ▶
    [] [1444.000000] ✅️ apply @NonUnitalCommCStarAlgebra.toNonUnitalNormedCommRing to NonUnitalNormedCommRing R[X] ▶
    [] [1452.000000] ✅️ apply CommCStarAlgebra.toNonUnitalCommCStarAlgebra to NonUnitalCommCStarAlgebra R[X] ▶
    [] [1261.000000] ✅️ apply @NormedCommRing.toNonUnitalNormedCommRing to NonUnitalNormedCommRing R[X] ▶
    [] [1481.000000] ✅️ apply @NonUnitalNormedRing.toNonUnitalSeminormedRing to NonUnitalSeminormedRing R[X] ▶
    [] [1463.000000] ✅️ apply @NonUnitalCStarAlgebra.toNonUnitalNormedRing to NonUnitalNormedRing R[X] ▶
    [] [1267.000000] ✅️ apply @NonUnitalCommCStarAlgebra.toNonUnitalCStarAlgebra to NonUnitalCStarAlgebra R[X] ▶
    [] [1282.000000] ✅️ apply CStarAlgebra.toNonUnitalCStarAlgebra to NonUnitalCStarAlgebra R[X] ▶
    [] [1255.000000] ✅️ apply @NonUnitalNormedCommRing.toNonUnitalNormedRing to NonUnitalNormedRing R[X] ▶
    [] 350 more entries... ▶

(note "350 more entries" at the bottom). Highlights from the beginning are:

    [] [306.000000] new goal Mul (R[X] ⧸ ⊥) ▼
      [whnf] [25.000000] Non-easy whnf: Mul (R[X] ⧸ ⊥)
      [instances] #[@Lean.Grind.Semiring.toMul, @Semigroup.toMul, @CommMagma.toMul, @MulOne.toMul, @MulZeroClass.toMul, @Distrib.toMul, @GradedMonoid.GradeZero.mul]

i.e. "we're going to try and find Mul on a thing, and there are a ton of basic instances of this defined right at the start of mathlib such as Distrib.toMul, and then there's the completely exotic GradedMonoid.GradeZero.mul defined right at the end, and so we'll try this one first because that's the algorithm". So the next step is

    [] [576514.000000] ✅️ apply @GradedMonoid.GradeZero.mul to Mul (R[X] ⧸ ⊥) ▶

and now suddenly we're looking for a grading to get our mul (and note that this is happening in every file which imports GradedMonoid.GradeZero.mul and in every project or file which has a casual import Mathlib somewhere, every time someone asks for a multiplication) (I take that back -- there is an extra coincidence here).

[] [292.000000] new goal GradedMonoid.GMul (HasQuotient.Quotient R[X]) ▼
        [whnf] [25.000000] Non-easy whnf: GradedMonoid.GMul (HasQuotient.Quotient R[X])
        [instances] #[@GradedMonoid.GMonoid.toGMul, @DirectSum.GNonUnitalNonAssocSemiring.toGMul]

So now we're about to start looking for a graded ring structure on...wait...what? HasQuotient.Quotient R[X]?

    [] [4482.000000] ✅️ apply @DirectSum.GCommRing.toGCommSemiring to DirectSum.GCommSemiring (HasQuotient.Quotient R[X]) ▼
      [whnf] [25.000000] Non-easy whnf: DirectSum.GCommSemiring (HasQuotient.Quotient R[X])

and there it ultimately goes: to find a multiplication on the quotient we're going to look for a graded ring structure on HasQuotient.Quotient R[X], because HasQuotient.Quotient R[X] is indeed a function from the type Ideal R[X] to Type (sending $I$ to $R[X]/I$ ) and indeed these quotients do have additive group structures, so perhaps this is a graded ring. And indeed it's not hard to find additive group structures on the quotient. So what's left to do before we ultimately fail to find this graded ring structure?

    [] [3984.000000] ✅️ apply @GradedMonoid.GMonoid.toGMul to GradedMonoid.GMul (HasQuotient.Quotient R[X]) ▼

Aah yes, a graded structure on a collection of additive groups indexed by a type which has addition (and indeed you can add ideals) needs a GMul, so now having started looking for Mul on R[X]/I we're now looking for GMul on the function sending an ideal $I$ to $R[X]/I$ , i.e. a map R[X]/I->R[X]/J->R[X]/(I+J). OK so it will suffice to find a GMonoid structure on this function. Now, how do we find GMonoid structures? Well we certainly need additive monoid structures on each factor, so how do we find an additive monoid structure on $R[X]/I$ ?

      [] [314.000000] new goal (i : Submodule R[X] R[X]) → AddCommMonoid (R[X] ⧸ i) ▼
        [whnf] [25.000000] Non-easy whnf: AddCommMonoid (R[X] ⧸ i)
        [instances] #[@AddCancelCommMonoid.toAddCommMonoid, @AddCommMonoid.ofIsAddCommutative, @SubtractionCommMonoid.toAddCommMonoid, @AddCommGroup.toAddCommMonoid, @AddCommMonoidWithOne.toAddCommMonoid, @NonUnitalNonAssocSemiring.toAddCommMonoid, @LinearOrderedAddCommMonoidWithTop.toAddCommMonoid, @ESeminormedAddCommMonoid.toAddCommMonoid]

Well there are a lot of choices, but we've imported everything and the last example of how to make an additive monoid is we can get one from an ESeminormedAddCommMonoid. Fabulous! Let's now start looking for norms on R[X]/I. Well, norms are something to do with topology so maybe we should be looking for a topology?

    [] [135646.000000] ✅️ apply @topologicalRingQuotientTopology to (i : Submodule R[X] R[X]) → TopologicalSpace (R[X] ⧸ i) ▶
    [] [1206.000000] ✅️ apply @UpgradedStandardBorel.toTopologicalSpace to TopologicalSpace R[X] ▶

etc etc.

Kevin Buzzard (Dec 18 2025 at 09:59):

attribute [instance 900] GradedMonoid.GradeZero.mul

divides the time taken to find this instance by a factor of three (and in particular fixes the timeout on this week's mathlib)

Kevin Buzzard (Dec 18 2025 at 10:01):

aargh lol looking at the new trace:

    [] [305.000000] new goal Mul (R[X] ⧸ ⊥) ▶
    [] [1500.000000] ✅️ apply @Distrib.toMul to Mul (R[X] ⧸ ⊥) ▶
    [] [1706.000000] ✅️ apply @NonUnitalNonAssocSemiring.toDistrib to Distrib (R[X] ⧸ ⊥) ▶
    [] [576477.000000] ✅️ apply @DirectSum.GradeZero.nonUnitalNonAssocSemiring to NonUnitalNonAssocSemiring (R[X] ⧸ ⊥) ▶
    [] [4152.000000] ❌️ apply @FirstOrder.Language.instDecidableEqRelations to DecidableEq (Submodule R[X] R[X]) ▶
    [] [4000.000000] ❌️ apply @FirstOrder.Language.instDecidableEqFunctions to DecidableEq (Submodule R[X] R[X]) ▶
    [] [3885.000000] ✅️ apply @LinearOrder.toDecidableEq to DecidableEq (Submodule R[X] R[X]) ▶
    [] [151392.000000] ✅️ apply ValuationRing.instLinearOrderIdealOfDecidableLE to LinearOrder (Submodule R[X] R[X]) ▶
    [] [5008.000000] ✅️ apply @ValuationRing.toPreValuationRing to PreValuationRing R[X] ▶
    [] [3056.000000] ✅️ apply of_isDiscreteValuationRing to ValuationRing R[X] ▶

and off we go again with randomness

Kevin Buzzard (Dec 18 2025 at 10:03):

attribute [instance 900] GradedMonoid.GradeZero.mul DirectSum.GradeZero.nonUnitalNonAssocSemiring

makes it 10x faster

Kevin Buzzard (Dec 18 2025 at 10:03):

I think we've got graded rings wrong somehow (or perhaps just their prios)

Kevin Buzzard (Dec 18 2025 at 10:05):

aargh dammit

    [] [307.000000] new goal Mul (R[X] ⧸ ⊥) ▶
    [] [1504.000000] ✅️ apply @Distrib.toMul to Mul (R[X] ⧸ ⊥) ▶
    [] [1706.000000] ✅️ apply @NonUnitalNonAssocSemiring.toDistrib to Distrib (R[X] ⧸ ⊥) ▶
    [] [1649.000000] ✅️ apply @NonUnitalNonAssocCommSemiring.toNonUnitalNonAssocSemiring to NonUnitalNonAssocSemiring
          (R[X] ⧸ ⊥) ▶
    [] [1647.000000] ✅️ apply @NonUnitalNonAssocCommRing.toNonUnitalNonAssocCommSemiring to NonUnitalNonAssocCommSemiring
          (R[X] ⧸ ⊥) ▶
    [] [1669.000000] ✅️ apply @NonUnitalCommRing.toNonUnitalNonAssocCommRing to NonUnitalNonAssocCommRing (R[X] ⧸ ⊥) ▶
    [] [1661.000000] ✅️ apply @NonUnitalNormedCommRing.toNonUnitalCommRing to NonUnitalCommRing (R[X] ⧸ ⊥) ▶
    [] [1644.000000] ✅️ apply @NonUnitalCommCStarAlgebra.toNonUnitalNormedCommRing to NonUnitalNormedCommRing (R[X] ⧸ ⊥) ▶
    [] [1637.000000] ✅️ apply CommCStarAlgebra.toNonUnitalCommCStarAlgebra to NonUnitalCommCStarAlgebra (R[X] ⧸ ⊥) ▶
    [] [1917.000000] ✅️ apply @NormedCommRing.toNonUnitalNormedCommRing to NonUnitalNormedCommRing (R[X] ⧸ ⊥) ▶

why is typeclass inference so stupid? Don't look for a norm! Clearly this question has nothing to do with norms! Don't look for a grading! Don't look for a topology! This is boring old ring theory!

Kevin Buzzard (Dec 18 2025 at 10:12):

Actually I am wrong about the severity of the issue. The funny coincidence which is happening here is this:

@GradedMonoid.GradeZero.mul : {ι : Type u_1} → (A : ι → Type u_1) → [inst : AddZeroClass ι] → [GradedMonoid.GMul A] → Mul (A 0)

i.e. "if I can find a GMul on a function A : ι → Type where ι has a zero and an addition, then I can find a Mul on A 0. So is (R[X] ⧸ (⊥ : Ideal R[X])) of the form A 0? Oh, coincidentally, it is! Because 0 = ⊥ and ι = Ideal (R[X]) has an addition, so this works great if A = HasQuotient.Quotient R[X] = I ↦ R[X]⧸I. So it's this weird coincidence which is causing the problems here; usually this GradedMonoid.GradeZero.mul would presumably be quick to fail.

Floris van Doorn (Dec 18 2025 at 11:40):

I think making the graded instances globally a lower priority might be nice to try.
A more severe option would be scoping them, but that might be annoying.

Floris van Doorn (Dec 18 2025 at 11:56):

Some other things I noticed when looking when looking at live.lean-lang (Mathlib commit dfccc70):

It spends a very long time trying to apply both docs#Ideal.Quotient.normedCommRing and docs#Ideal.Quotient.semiNormedCommRing . Unfortunately it spends the same amount of time for both of them, probably doing essentially the same steps (but not the same enough so that the cache applies) (roughly 1.2M heartbeats with attribute [-instance] GradedMonoid.GradeZero.mul DirectSum.GradeZero.nonUnitalNonAssocSemiring DirectSum.GradeZero.nonUnitalNonAssocRing)
It spends a lot of time unifying, before checking that there is actually a NormedCommRing instance on R[X]. Since elaboration is so complex, this is probably not something we can change.
It spends a very long amount of time on both unifying HasQuotient.quotient _ _ =?= HasQuotient.quotient _ _ and then on HasQuotient.quotient' _ _ =?= HasQuotient.quotient' _ _:

          [] [1283186.000000] ❌️ R[X] ⧸ ⊥ =?= ?m.23 ⧸ ?m.25 ▼
            [delta] [434066.000000] ❌️ R[X] ⧸ ⊥ =?= ?m.23 ⧸ ?m.25 ▶
            [] [848764.000000] ❌️ HasQuotient.quotient' ⊥ =?= HasQuotient.quotient' ?m.25 ▼

I'm wondering if there is secretly a very high cost to the following very innocently-looking abbreviation that just changes the explicitness of an argument:

abbrev HasQuotient.Quotient (A : outParam <| Type u) {B : Type v}
    [HasQuotient A B] (b : B) : Type max u v :=
  HasQuotient.quotient' b

Note: it spends roughly the same number of heartbeats trying to unify arguments, so it seems it's actually doing a lot of work twice because of the existence of this abbrev:

              [] [433700.000000] ❌️ ⊥ =?= ?m.25 ▼
[...]
              [] [422278.000000] ❌️ ⊥ =?= ?m.25 ▶

When unifying these quotient instances, it spends a lot of time unfolding everything it can from the instance Polynomial.semiring. Adding attribute [local irreducible] Polynomial.semiring improves the heartbeat count by a lot. We should make more instances on Polynomial (and other types) irreducible.
The only thing we have to ensure is that the different instances of Polynomial do unify (e.g. Polynomial.instAdd maybe should be irreducible, but Polynomial.semiring should use the same (irreducible) addition as Polynomial.instAdd. This might require some reworking of the instances.

Floris van Doorn (Dec 18 2025 at 13:08):

Attempt to fix the HasQuotient.quotient' issue: #33037

Artie Khovanov (Dec 19 2025 at 00:39):

@Kevin Buzzard tried some ablations

#count_heartbeats in
#synth Mul (R ⧸ (⊥ : Ideal R)) -- 2951 hb

#count_heartbeats in
#synth Mul (R[X] ⧸ I) -- 2982 hb

#count_heartbeats in
#synth Mul (R ⧸ I) -- 335 hb

Looks like it's some sort of multiplicative blow up due to the combination of the two issues described

Kevin Buzzard (Dec 19 2025 at 08:38):

Yeah the issue is that you're looking for a multiplication on something which is easily checked to be of the form $F(0)$ so the first thing it tries to find is a heterogeneous multiplication $F(a)\times F(b)\to F(a+b)$ and this takes a long time to fail.

Kevin Buzzard (Dec 19 2025 at 08:39):

Oh but actually then the second one surprises me. Did you look at the trace? I'm unlikely to be near lean for the next two days

Kim Morrison (Jan 06 2026 at 23:04):

We really really need some standalone (i.e. Mathlib free!) demonstrations of these issues. We would love to fix these issues, but need the test cases. I have been trying with the new https://github.com/kim-em/lean-minimizer, but so far the interesting examples take too long (~days) to successfully minimize.

Jovan Gerbscheid (Jan 07 2026 at 09:15):

The above issues of slow unification are fixed in my PR lean#8883, and the corresponding issue lean#10414 shows a small standalone example of the problem of exponentially slow unification.

If I need to do anything about the PR such as improve the documentation, let me know.

Artie Khovanov (Jan 07 2026 at 22:40):

Kim Morrison said:

We really really need some standalone (i.e. Mathlib free!) demonstrations of these issues.

Is it feasible for someone who knows little about the internal workings of Lean to minimise something like this? Or are you saying we need better tooling?

Kim Morrison (Jan 07 2026 at 23:33):

It's definitely possibly to do minimizations only understanding Mathlib, not Lean. It's just very very tedious, on the order of day's work for examples like this. lean-minimizer linked above is even slower, but essentially automatic. I would love to have people try out lean-minimizer (there is also the https://github.com/kim-em/mathlib-minimizer repo, which just provides the convenience of a project importing both lean-minimizer and mathlib) on realistic examples, and let me know:

additional minimizations it should be performing
ideas for speeding it up.

Kevin Buzzard (Jan 08 2026 at 08:04):

I've minimised things like this manually before. It's quite tedious but definitely possible. You just have to copy out huge chunks of the algebra hierarchy. I have some skeleton files which would probably help.

Last updated: Feb 28 2026 at 14:05 UTC