Zulip Chat Archive

Stream: mathlib4

Topic: Typeclass performance improvement without selecting instance

Anne Baanen (Jan 28 2026 at 10:44):

While waiting for other code to compile, I came across an interesting slowdown in the Radar graph for Mathlib.CategoryTheory.Limits.Shapes.NormalMono.Equalizers: lake shake actually made this file 20% slower. Doing some bisection of imports, it looks like adding either of the following imports to Mathlib.CategoryTheory.Limits.Shapes.NormalMono.Equalizers reduces build time by about 10% and adding both does so by 20% on my machine:

import Mathlib.CategoryTheory.Limits.Types.Limits -- shake: keep
import Mathlib.CategoryTheory.Limits.Types.Colimits -- shake: keep

Some more minimization later, and in fact the speedup happens exactly when docs#CategoryTheory.Limits.Types.hasLimit / docs#CategoryTheory.Limits.Types.hasColimit are made available. But this is weird, because we never take a limit in Type u anywhere in the file! They are only used in failing instance synthesis. But their existence interestingly means that they allow the synthesis to fail quickly:

[Meta.synthInstance] 💥️ HasBinaryProduct ?m.417 ?m.418 ▼
  [] new goal HasLimit (pair ?m.417 ?m.418) ▶
  [] 💥️ apply @Types.hasLimit to HasLimit (pair ?m.417 ?m.418) ▼
    [tryResolve] 💥️ HasLimit (pair ?m.417 ?m.418) ≟ HasLimit ?m.424 ▶

compared to 34 different instances being tried without the existence of Types.hasLimit.

In other words, Lean tries to synthesize an instance before the required metavariables are assigned, but does not realize this is the case until it actually tries to apply a concrete instance. And this can take a long while if we have a complex instance hierarchy (like apparently we do for Has(Co)Limit.

This behaviour surprised me, and would be worth sharing widely. Apart from that, I would be interested in knowing of an actionable suggestion. Making one file 20% faster is nice, but can we do this in a thousand more files?

Last updated: Feb 28 2026 at 14:05 UTC