Zulip Chat Archive

Stream: mathlib4

Topic: Quotient slowdowns


Xavier Roblot (Aug 05 2023 at 01:12):

I am experiencing big slowdowns that I believe to be related to quotients and I would like to know how I can improve the situation. The following #mwe is extracted from the work on Dirichlet unit's theorem.

import Mathlib.GroupTheory.Torsion
import Mathlib.NumberTheory.NumberField.Basic

open NumberField BigOperators

variable (K : Type _) [Field K] (n : )

def torsion : Subgroup (𝓞 K)ˣ := CommGroup.torsion (𝓞 K)ˣ

set_option maxHeartbeats 300000 in
set_option synthInstance.maxHeartbeats 100000 in
example : Basis (Fin n)  (Additive ((𝓞 K)ˣ  (torsion K))) := sorry -- 8 sec.

set_option maxHeartbeats 800000 in
set_option synthInstance.maxHeartbeats 70000 in
example {ι : Type _} [Fintype ι] (f : ι  (𝓞 K)ˣ) :
    (( i, (f i) : (𝓞 K)ˣ) : (𝓞 K)ˣ  (torsion K)) =  i, (f i) := by -- 22 sec.
  simp_rw [ QuotientGroup.mk'_apply]
  convert map_prod (QuotientGroup.mk' (torsion K)) f Finset.univ

The first example takes about 8 sec. on my machine which is still okay, but that's only for the statement to compile. The second example takes more than 20 sec. and also is the only way I have found to prove this result (map_prod should work directly here, I think, without having to do the rw first and without using convert).

Is there anything I can do to speed up things?

Kevin Buzzard (Aug 05 2023 at 11:30):

There's a with -> let src in Algebra.Group.TypeTags which might be the cause: you could try this change

instance Additive.addCommMonoid [CommMonoid α] : AddCommMonoid (Additive α) :=
  { toAddMonoid := Additive.addMonoid
    add_comm := Additive.addCommSemigroup.add_comm }

and then you have to compile a few hundred files to see if it made any difference.

Kevin Buzzard (Aug 05 2023 at 11:32):

@Matthew Ballard do you have some PR where you methodically remove these? There are a bunch in Algebra.Group.TypeTags on master.

Matthew Ballard (Aug 05 2023 at 11:32):

Only in the Data folder.

Matthew Ballard (Aug 05 2023 at 11:34):

#6276 actually has some outside there but not this one

Kevin Buzzard (Aug 05 2023 at 11:43):

That change didn't speed it up: next one is

@[to_additive "Additive units of an additive monoid form an additive group."]
instance : Group αˣ :=
  { (inferInstance : MulOneClass αˣ) with
    one := 1,
    mul_assoc := fun _ _ _ => ext <| mul_assoc _ _ _,
    inv := Inv.inv, mul_left_inv := fun u => ext u.inv_val }

in Algebra.Group.Units. And then wait ages for it to compile again. etc etc.

Kevin Buzzard (Aug 05 2023 at 11:49):

Trying this:

instance : Group αˣ :=
  { toDivInvMonoid := {
      toMonoid := {
        toSemigroup := {
          mul := (· * ·)
          mul_assoc := fun _ _ _ => ext <| mul_assoc _ _ _
        }
        one_mul := (inferInstance : MulOneClass αˣ).one_mul
        mul_one := (inferInstance : MulOneClass αˣ).mul_one
      }
  }
    mul_left_inv := fun u => ext u.inv_val }

Kevin Buzzard (Aug 05 2023 at 12:16):

Oh I shouldn't have given mul explicitly. Now trying this (but also going out)

@[to_additive "Additive units of an additive monoid form an additive group."]
instance : Group αˣ :=
  { toDivInvMonoid := {
      toMonoid := {
        toSemigroup := {
          mul_assoc := fun _ _ _ => ext <| mul_assoc _ _ _
        }
        one_mul := (inferInstance : MulOneClass αˣ).one_mul
        mul_one := (inferInstance : MulOneClass αˣ).mul_one
      }
  }
    mul_left_inv := fun u => ext u.inv_val }

Kevin Buzzard (Aug 05 2023 at 19:18):

It's slowly getting quicker. Next problem is

protected def group [Group M₁] (f : M₁  M₂) (hf : Surjective f) (one : f 1 = 1)
    (mul :  x y, f (x * y) = f x * f y) (inv :  x, f x⁻¹ = (f x)⁻¹)
    (div :  x y, f (x / y) = f x / f y) (npow :  (x) (n : ), f (x ^ n) = f x ^ n)
    (zpow :  (x) (n : ), f (x ^ n) = f x ^ n) : Group M₂ :=
  { hf.divInvMonoid f one mul inv div npow zpow with
    mul_left_inv := hf.forall.2 fun x => by erw [ inv,  mul, mul_left_inv, one] }

in Algebra.Group.InjSurj

Kevin Buzzard (Aug 05 2023 at 19:19):

Now

  { toDivInvMonoid := hf.divInvMonoid f one mul inv div npow zpow
    mul_left_inv := hf.forall.2 fun x => by erw [ inv,  mul, mul_left_inv, one] }

and I'm going out again. Sorry things are so slow!

Anatole Dedecker (Aug 05 2023 at 19:20):

Kevin if you like playing with this kind of stuff, can we send you requests for speed up ? :pray:

Kevin Buzzard (Aug 05 2023 at 19:21):

no, these need to be sent directly to the Lean 4 repo so that the devs begin to understand the problem better. If possible can you make a mathlib-free version first? Thanks. ;-)

Kevin Buzzard (Aug 06 2023 at 13:18):

More seriously, here is the very long and protracted method I use to see if the let issue is causing slowdown problems in your file.

0) Make sure the problem is still there on current master (there have been several PRs recently which make stuff much quicker).

1) Minimise the problem down to a single short file. Xavier has done a great job of this in his first post but I deleted the second example just so I could concentrate on the first (hopefully progress in the first will also help the second). So the game is to try and find out why

example : Basis (Fin n)  (Additive ((𝓞 K)ˣ  (torsion K))) := sorry

is taking 8 seconds and needs raised heartbeats.

2) We want to look at the number of heartbeats the declaration is using, and also the wall clock time that the declaration is using, so I add some set_option stuff to trace this.

count_heartbeats in -- prints heartbeat count in the declaration (and sets `maxHeartbeats` to infinity)
set_option synthInstance.maxHeartbeats 100000 in
set_option trace.profiler true in -- prints wall clock times in the declaration
set_option pp.proofs.withType false in -- makes output much shorter
example : Basis (Fin n)  (Additive ((𝓞 K)ˣ  (torsion K))) := sorry -- 8 sec.

The pp.proofs.withType false line makes proofs all appear as _; the default in Lean 4 is that they appear as (_ : \forall a b c, a + b + c = a + (b + c)) and this clutters stuff up immensely, so we swich that off.

Check that by clicking on various blue lines in the file that you're seeing output such as

Used 256641 heartbeats, which is greater than the current maximum of 200000.

and

[Elab.command] [6.939679s] example : Basis (Fin n) ℤ (Additive ((𝓞 K)ˣ ⧸ (torsion K))) :=
      sorry ▶

(it was on 8 seconds but I've already started the process with this one; I'm writing this half way through the process). Unfold the triangle in sorry ▶ in the infoview, and keep unfolding and follow the biggest number. If you see stuff ilke this:

            [] [2.875129s] ✅ Additive.addCommMonoid =?= AddCommGroup.toAddCommMonoid ▶

then this means "Lean's typeclass inference system just took 2.8 seconds to verify (that's the tick) that two AddCommMonoid structures were equal by rfl" and that's the sort of problem which with is causing right now in mathlib.

3) Put your test file in the Mathlib directory, e.g. my copy is Mathlib/scratch37.lean; this makes it easier to compile on the command line.

4) Create a new branch off master e.g. I made kbuzzard-torsion-slowness.

Now exit VS Code for a while. We need to make sure all imports are building, so in this case that's

$ lake build Mathlib.GroupTheory.Torsion Mathlib.NumberTheory.NumberField.Basic

Now this is the start of the cycle: we keep repeating these steps until we get bored or strike gold.

5) Exit VS Code (this might not be necessary but I'm paranoid) and pipe the trace output to a file (remember to change the filename if you've already done steps 5-10 once)

$ lake build Mathlib.scratch37 > traceoutput1.txt

This should take about as long as it takes the file to be processed in VS Code, unless you have hundreds of millions of lines of trace, in which case you might want to do a better job of minimising the problem (note that apparently we have extract_goal now, or will have it soon, which will be a great help if you find yourself in this position; use the trace output to find which line of your 20 line tactic proof is the super-slow one and just profile that line instead). Xavier's output is about 4000 lines.

6) Now fire up VS Code again and open traceoutput1.txt and have a look around. Search for let src := , that's probably a bad with. I've got rid of all the lets in this particular trace with the edits I've already made. One thing that I see in this trace is

                          [Meta.isDefEq] [0.321421s]  CommGroup.toGroup =?= Units.instGroupUnits
                            [Meta.isDefEq] [0.321335s]  CommGroup.toGroup =?= Group.mk _
                              [Meta.isDefEq] [0.321273s]  Units.instCommGroupUnitsToMonoid.1 =?= Group.mk _
                                [Meta.isDefEq] [0.321246s]  Group.mk _ =?= Group.mk _
                                  [Meta.isDefEq] [0.313584s]  Group.toDivInvMonoid =?= DivInvMonoid.mk zpowRec
                                    [Meta.isDefEq] [0.313522s]  inferInstance.1 =?= DivInvMonoid.mk zpowRec
                                      [Meta.isDefEq] [0.313500s]  DivInvMonoid.mk zpowRec =?= DivInvMonoid.mk zpowRec
                                        [Meta.isDefEq] [0.061008s]  zpowRec =?= zpowRec
                                        ...
                                        [Meta.isDefEq] [0.108075s]  Monoid.mk _ _ npowRec =?= Monoid.mk _ _ npowRec
                                        ...
                                        [Meta.isDefEq] [0.129239s]  {
                                              div := DivInvMonoid.div' } =?= { div := DivInvMonoid.div' }
                                        ...

Note that by this point

[Meta.isDefEq] [0.313500s]  DivInvMonoid.mk zpowRec =?= DivInvMonoid.mk zpowRec

Lean is spending 0.3 seconds checking that two terms that by now look syntactically equal are equal. This is a symptom of the issue that we're trying to debug. It's only 0.3 seconds but now you have access to the full trace you can just search for DivInvMonoid.mk zpowRec =?= DivInvMonoid.mk zpowRec and see that it's showing up 13 times, so there is 4 seconds of your 8 seconds right now.

7) Now we have to try a change to see if we can fix it. A few lines above where the equalities become apparently syntactic, we have

[Meta.isDefEq] [0.295745s]  Units.instGroupUnits =?= CommGroup.toGroup

and Units.instGroupUnits =?= CommGroup.toGroup shows up 16 times in the trace, always costing 0.3 seconds; this is the claim that two group structures on the units of something are defeq and it's taking too long. Let's take a look at Units.instGroupUnits. Now this is where it all comes experimental as far as I am concerned, and people like Eric and Matt I think understand better what's going on. I'm going to try changing it to

instance : Group αˣ :=
  { toDivInvMonoid := {
      toMonoid := {
        toSemigroup := { mul_assoc := fun _ _ _ => ext <| mul_assoc _ _ _}
        one_mul := instMulOneClassUnits.one_mul
        mul_one := instMulOneClassUnits.mul_one } }
    mul_left_inv := fun u => ext u.inv_val }

and I'm unclear about whether I should make a toDivInvMonoid instance separately. But Ok I've made a change. So now let's compile.

8) Commit the changes to your branch.

9) Now we have to compile Lean because we've changed a file high up in the hierarchy. I'll do it on the command line; another possibility is pushing the branch to github, possibly opening a PR (tag it WIP) and letting CI do it (this takes longer if you have a half-decent machine to do the compiling on yourself). Building might take a long time, it all depends on how complex your MWE imports are and how far up in the import tree you were fiddling around)

$ lake build Mathlib.GroupTheory.Torsion Mathlib.NumberTheory.NumberField.Basic
[429/1189] Building Mathlib.Algebra.Group.Units
[433/1189] Building Mathlib.Algebra.Group.InjSurj
[477/1189] Building Mathlib.Algebra.Group.TypeTags
[559/1189] Building Mathlib.Algebra.Hom.Equiv.TypeTags
[559/1189] Building Mathlib.Algebra.Order.Monoid.TypeTags
...

You might want to do time lake buid... if you want to get a feel for how long this part of the process takes.

10) Go back to step 5. If the file is now quick, open a PR with the branch. If it's still slow, then repeat until either it's quick or you give up.

Kevin Buzzard (Aug 06 2023 at 13:54):

Occasionally you might want to compare diffs of traces. This is a pain with the timings in because it means the traces differ on basically every line. You can remove the traces by explaining the problem to ChatGPT, who came up with sed -E 's/\[[0-9]+\.[0-9]+s\]/[s]/g' input_file > output_file (which changes all [0.1234s] to [s]) and then the diffs are far more useful.

Eric Wieser (Aug 06 2023 at 13:58):

I only skimmed this, but my comment would be that when making these types of changes you should compare #print before and after: sometimes the messy nested syntax is identical to what is generated anyway

Eric Wieser (Aug 06 2023 at 13:59):

I think I general toFoo := ... is only worth writing if the ... isn't a { ... }

Kevin Buzzard (Aug 06 2023 at 14:04):

So for example do you think that changing this
from

instance {α} [CommMonoid α] : CommGroup αˣ :=
  { (inferInstance : Group αˣ) with
    mul_comm := fun _ _ => ext <| mul_comm _ _ }

to

instance {α} [CommMonoid α] : CommGroup αˣ :=
  { mul_comm := fun _ _ => ext <| mul_comm _ _ }

will have any effect? Oh -- what do you mean by "isn't a {}" -- I interpreted as meaning "isn't empty" but this might be the mathematician in me coming out.

Matthew Ballard (Aug 06 2023 at 14:06):

I think you want a clean projection onto the parent class. You don't want Lean trying to build it out of pieces from the with including if it is the list of instances specified before the with. If you don't have the parent class that cleanly, it is a sign you should break it out.

Matthew Ballard (Aug 06 2023 at 14:07):

I think you want to change that for multiple reasons

Kevin Buzzard (Aug 06 2023 at 14:16):

@Xavier Roblot that last change took number of heartbeats down from 268578 to 11541 (an order of magnitude) so your problems are now hopefully solved. If you want local access to a fast version quickly then just make the change in the definition of CommGroup αˣ above and your file should now be fast.

Kevin Buzzard (Aug 06 2023 at 14:19):

example {ι : Type _} [Fintype ι] (f : ι  (𝓞 K)ˣ) :
    (( i, (f i) : (𝓞 K)ˣ) : (𝓞 K)ˣ  (torsion K)) =  i, (f i) :=
  map_prod (QuotientGroup.mk' (torsion K)) f Finset.univ

works fine on my branch now.

So now we have to do some clean-up. I have made 4 changes in my branch and only one of them fixed the problem (the other changes all made the size of the trace go up slightly in fact). In the past I was just pushing all changes and then Eric was saying "waitwaitwait, are you sure we need to make all those changes?". So I'll now start again from master with just the last change I made and see if your file is still fast. If it is I'll make a PR.

Kevin Buzzard (Aug 06 2023 at 15:02):

Yes, it just needs that CommGroup fix and the example file is now much faster. Mathlib still compiles locally for me (completely) with the change. I've made a PR and branched off a commit which is currently being benchmarked, so we should be able to compare the result across all mathlib. #6398 (WIP for now)

Kevin Buzzard (Aug 06 2023 at 18:05):

Results are in. Several number theory files are faster :D

Xavier Roblot (Aug 06 2023 at 23:48):

Kevin, thanks a lot for all your work and excellent speedup! I can now work in this file without having to wait after each keystroke :smile: Thanks also for the very enlightening explanation. Next time, I hit a slowdown, I can now try to fix it myself thanks to you.

Scott Morrison (Aug 07 2023 at 04:56):

Can we please make sure that any changes to structures that fiddle with the with in this way come with clear comments about the performance implications, and a warning not to change it? This stuff will decay very quickly otherwise.

Kevin Buzzard (Aug 07 2023 at 06:30):

I don't understand the suggestion. Where am I supposed to put this warning in #6398 ?

Eric Wieser (Aug 07 2023 at 08:52):

Maybe -- TODO(lean4#XXX): once `with` is fixed, add back `(inferInstance : Group αˣ)` ?

Kevin Buzzard (Aug 07 2023 at 09:17):

Why do we want to add it back? It's been inferred by typeclass inference anyway. We don't write have hG : Group G := inferInstance in the middle of a proof about groups.

Scott Morrison (Aug 07 2023 at 09:29):

If removing the X with is the helpful thing, then that doesn't require a comment.

Any situation where adding something (either an X with or a field) that could be removed without breaking something, but would cause a slowdown downstream, needs to be guarded by a comment. That's all I meant.

Kevin Buzzard (Aug 07 2023 at 09:33):

Yeah I've got it now. I'll add a note

Eric Wieser (Aug 07 2023 at 10:36):

Why do we want to add it back?

Because removing withs greatly increases the chances of introducing accidental diamonds due to default fields

Kevin Buzzard (Aug 07 2023 at 10:53):

I am still a bit confused about how this can happen. The user would have to explicitly delete a nsmul field, or other field with a default, for this to happen, right?

Eric Wieser (Aug 07 2023 at 10:54):

I'll make a mwe

Matthew Ballard (Aug 07 2023 at 10:56):

Re: #6262 did you try changing the auto to synthesizing instead of deleting?

Eric Wieser (Aug 07 2023 at 10:59):

@Kevin Buzzard:

class A where
  a := 1

class B extends A where
  b := 2

class C extends A where
  c := 3

class D extends C, B where
  d := 4

instance instB : B where
  a := 10
  b := 20

instance instC : C where
  a := 10
  c := 30

instance instD : D :=
  { instB, instC with
    d := 40 }

-- careless `with` removal
instance instDBad : D :=
  { d := 40 }

example : instD.c = instC.c := rfl -- ok
example : instD.b = instB.b := rfl -- ok
example : instDBad.c = instC.c := rfl -- ok
example : instDBad.b = instB.b := rfl  -- fails

Matthew Ballard (Aug 07 2023 at 10:59):

Suppose boo is field of a class Mummy with an auto. If you currently have mummy with … and delete it, then Lean will fall back to the auto for boo instead whatever you carefully put in mummy

Matthew Ballard (Aug 07 2023 at 10:59):

Or what Eric wrote

Eric Wieser (Aug 07 2023 at 11:00):

Note that this isn't just caused by carelessly removing a with now; it's also caused by swapping the order of the parents in D in a future refactor (try it and see above)

Matthew Ballard (Aug 07 2023 at 11:09):

Changing the default to something like exact Monoid.npow and fallback to a try this suggestion for npowRec would help.

Kevin Buzzard (Aug 07 2023 at 11:11):

Yikes. Thanks for the explanation Eric.

Eric Wieser (Aug 07 2023 at 11:46):

Removing all the defaults would also fix this, but be annoying for teaching


Last updated: Dec 20 2023 at 11:08 UTC