Zulip Chat Archive
Stream: mathlib4
Topic: the number of instances is too dang high
Matthew Ballard (Jun 27 2024 at 14:51):
As a experiment, I tried to made assert_not_exists OrderedSemiring
succeed in RingTheory.Kaehler.Basic
in branch#mrb/working. (Note: it is far from building all of mathlib and there are many sorry
's)
The intent was to spot check the impact on performance from not having the projections from ordered algebra classes on typeclass synthesis.
Here its profile output
cumulative profiling times:
attribute application 12.2ms
compilation 59.6ms
dsimp 122ms
elaboration 859ms
import 380ms
initialization 16ms
interpretation 2.8s
linting 100ms
parsing 23.2ms
simp 2.41s
tactic execution 2.32s
type checking 2.15s
typeclass inference 15s
Command being timed: "lake env lean --profile Mathlib/RingTheory/Kaehler/Basic.lean"
User time (seconds): 25.95
System time (seconds): 0.43
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:26.45
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 1104592
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 110
Minor (reclaiming a frame) page faults: 221275
Voluntary context switches: 25
Involuntary context switches: 2144
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 16384
Exit status: 0
vs current master
cumulative profiling times:
attribute application 12.9ms
compilation 60.3ms
dsimp 128ms
elaboration 888ms
import 498ms
initialization 16ms
interpretation 3.14s
linting 105ms
norm_num 0.653ms
parsing 23.4ms
simp 2.59s
tactic execution 2.39s
type checking 2.23s
typeclass inference 20s
Command being timed: "lake env lean --profile Mathlib/RingTheory/Kaehler/Basic.lean"
User time (seconds): 31.47
System time (seconds): 0.60
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:32.26
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 1348592
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 602
Minor (reclaiming a frame) page faults: 247703
Voluntary context switches: 131
Involuntary context switches: 3941
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 16384
Exit status: 0
The main difference is 15s in typeclass synthesis vs 20s depending on whether you have ordered algebra floating around.
Compared other files that had to be rid of ordered algebra to reach here, this on the low end. LinearAlgebra.TensorProduct.RightExactness
dropped by about 40%.
Matthew Ballard (Jun 27 2024 at 14:54):
If you've stared a winding trace of instances synthesis, then you can viscerally appreciate the problem when there are too many projections to, say, Ring
, in the context and the file itself doesn't deal with orders or analysis.
Matthew Ballard (Jun 27 2024 at 14:59):
This problem will only get worse as more complicated classes are built which intertwine algebra with other hierarchies so I want to get a sense of what people think should be done:
- Nothing -- not a great option but often the choice
- Core fix -- I don't know what avoids this without seriously touching instance synthesis which is a no-go currently.
- Carefully refactor to use the import structure to cut down on the instances in scope.
- Manually scope the instances in some way, eg write an attribute that erases the instance and scopes it on declaring.
- Unbundle algebra from everything else.
Are there other options?
Michael Stoll (Jun 27 2024 at 15:01):
I think @Yaël Dillies is planning on doing a version of 5 at some point.
Matthew Ballard (Jun 27 2024 at 15:01):
For ordered algebra, yes
Matthew Ballard (Jun 27 2024 at 15:03):
The disadvantage of 5 is that you will need a [Ring R]
every time you want an OrderedRing
.
Matthew Ballard (Jun 27 2024 at 15:04):
But nothing is hidden from the user as with 4.
Michael Stoll (Jun 27 2024 at 15:05):
I tried a version of 4 (for algebra vs. order) in #12778.
Matthew Ballard (Jun 27 2024 at 15:09):
Hmm. I would expect more benefits than that
Matthew Ballard (Jun 27 2024 at 15:12):
Yeah, it looks like some things leak through. For example,
variable {α : Type*} [OrderedRing α] in
#synth Ring α -- OrderedRing.toRing
succeeds in the Kaehler.Basic
Matthew Ballard (Jun 27 2024 at 16:08):
Ok, I started on 5. It is something I can nibble at in between other commitments
Josha Dekker (Jun 27 2024 at 16:16):
Matthew Ballard said:
Ok, I started on 5. It is something I can nibble at in between other commitments
I'm sure you're already aware of @Yaël Dillies thoughts and work on this, but let me an issue that I believe is relevant just in case: https://github.com/leanprover-community/mathlib4/issues/11757
Floris van Doorn (Jun 27 2024 at 18:07):
I have gotten to similar conclusion around this message
It would be great if we could decouple parts of the hierarchy more. It would be great if we can investigate option 4 more. I expect that if we scope all instances used <10 times in Mathlib, that will already result in a noticeable gain.
Matthew Ballard (Jun 27 2024 at 20:05):
If anyone wants something to do branch#mrb/unbundle_algebra_from_orders
Matthew Ballard (Jun 27 2024 at 20:05):
Not sure when I will touch it again today
Floris van Doorn (Jun 27 2024 at 21:24):
I tested my example from 2 months ago again, and it fails 30x faster now. So the combination of lean4#3996, removing some instances from Mathlib and whatever else we did in the meantime did have a big effect on this.
Yaël Dillies (Jun 27 2024 at 22:19):
Yes, I have a pretty clear picture of what I want to try. Nothing concrete quite yet, and I won't have time to work on this before another two weeks
Matthew Ballard (Jun 28 2024 at 10:10):
Matthew Ballard said:
If anyone wants something to do branch#mrb/unbundle_algebra_from_orders
And I now realize I ran off too quickly and didn't push this at post time... :man_facepalming:
Matthew Ballard (Jun 28 2024 at 16:37):
This has reached submodules now.
Matthew Ballard (Jul 01 2024 at 17:03):
Headline: only ~2% decrease in total instructions . Observations:
- it is fairly straightforward except for the few cases where we built algebra data using ordered info, eg
Nonneg
types. - writing the algebraic instances always is not terrible is pretty annoying (looking at
Valuation.Basic
) - the
simp
lemmas proving algebraic statements using ordered info are still in scope and drag down performance - some files really win, eg
Algebra.Star.NonUnitalSubalgebra
- some namespaces are kinda chaotic, eg
EllipticCurve
Floris van Doorn (Jul 01 2024 at 17:06):
-7.632 %
on typeclass inference is very good, IMO.
Matthew Ballard (Jul 01 2024 at 17:08):
I can make it better with lean#2905 but overall we regress
Matthew Ballard (Jul 01 2024 at 17:28):
I think the best solution is to comb the import ball of yarn. It is probably the most involved though.
Matthew Ballard (Jul 01 2024 at 17:48):
In this direction, #14121 is awaiting-review
now
Matthew Ballard (Jul 01 2024 at 17:58):
Also #14355
Matthew Ballard (Jul 01 2024 at 18:00):
Here is the graph from Algebra.Order.Group.Defs
to Algebra.Star.NonUnitalSubalgebra
.
Matthew Ballard (Jul 01 2024 at 18:14):
#14336 is another easy one avoiding importing Algebra.Order.BigOperators.Ring.Finset
in BigOperators.Finsupp
to talk about positive natural numbers.
Matthew Ballard (Jul 01 2024 at 18:26):
#14338 just deletes a redundant import. Waiting to see if it is needed downstream.
Matthew Ballard (Jul 01 2024 at 18:59):
Matthew Ballard said:
#14338 just deletes a redundant import. Waiting to see if it is needed downstream.
Nope. Builds just fine. Clearly easy
now
Matthew Ballard (Jul 01 2024 at 20:16):
#14341 avoids ordered algebra in Data.Rat.Lemmas
.
Matthew Ballard (Jul 01 2024 at 22:02):
#14129 avoids ordered algebra in Set.Pointwise.Basic
by moving one declaration to a new file Set.Pointwise.BoundedMul
Kevin Buzzard (Jul 01 2024 at 22:24):
Matthew Ballard said:
Matthew Ballard said:
#14338 just deletes a redundant import. Waiting to see if it is needed downstream.
Nope. Builds just fine. Clearly
easy
now
Why isn't shake
going nuts about this one?
Matthew Ballard (Jul 01 2024 at 22:26):
Not sure but rerunning locally it did. Then I had to do a shake hokey-pokey
Matthew Ballard (Jul 01 2024 at 22:26):
Oh wait. That was another one.
Matthew Ballard (Jul 01 2024 at 22:26):
I assume pollution of noshake
Matthew Ballard (Jul 01 2024 at 22:27):
The motivation for #mathlib4 > shake hangs with no config file
Mario Carneiro (Jul 01 2024 at 22:28):
what do you mean by pollution?
Matthew Ballard (Jul 01 2024 at 22:28):
People just run --update
reflexively
Matthew Ballard (Jul 01 2024 at 22:30):
Here is a question: how many import Mathlib.Logic.Function.Iterate
can be straight deleted?
Mario Carneiro (Jul 01 2024 at 22:58):
Is there something about Mathlib.Logic.Function.Iterate
that foils shake? Some spot checks suggest that it can indeed be deleted in some cases but it's not noshaken
Matthew Ballard (Jul 01 2024 at 22:58):
Oh, I assumed it was noshaken
Mario Carneiro (Jul 01 2024 at 23:00):
most likely it's transitively imported by sibling imports in many files
Matthew Ballard (Jul 01 2024 at 23:04):
#14122 moves ordered algebra instances on products to a separate file
Kevin Buzzard (Jul 01 2024 at 23:04):
I know it's been said before but that declarations diff thing is an absolute godsend.
Matthew Ballard (Jul 01 2024 at 23:04):
Yes, that is very nice
Kim Morrison (Jul 01 2024 at 23:41):
Matthew Ballard said:
Also #14355
@Matthew Ballard I think this was a typo, I can't find it.
Matthew Ballard (Jul 01 2024 at 23:47):
#14335 ?
Matthew Ballard (Jul 02 2024 at 10:46):
#14351 avoids bundled ordered algebra in unbundled ordered algebra
Matthew Ballard (Jul 02 2024 at 22:49):
#14371 splits Data.Nat.Cast.Order
into files depending on unbundled ordered algebra and bundled ordered algebra.
Following up from that #14370 splits Algebra.Order.Nonneg.Ring
similarly.
The next step is NNRat.Defs
.
Ralf Stephan (Jul 03 2024 at 08:04):
Could you also have a look at #14357?
Matthew Ballard (Jul 03 2024 at 12:35):
#14380 splits Algebra.Order.Group.Defs
moving the unbundled results to Algebra.Order.Group.Unbundled.Basic
Matthew Ballard (Jul 10 2024 at 14:04):
Some results coming from minimizing imports.
The goal was to cleave Algebra.Star.NonUnitalSubalgebra
from OrderedCommMonoid
. The benefits are the same as unbundling the algebra from bundled ordered algebra hierarchy.
Matthew Ballard (Jul 10 2024 at 14:15):
Behold the graph from bundled ordered rings to Module.PID
Kim Morrison (Jul 10 2024 at 16:02):
We need sober spaces to talk about modules over a PID? :-)
Matthew Ballard (Jul 10 2024 at 17:21):
Exactly how I was taught it in Algebra I
Yaël Dillies (Jul 10 2024 at 19:05):
Algebraic geometry in my PIDs? :face_with_raised_eyebrow:
Matthew Ballard (Jul 10 2024 at 19:56):
One thing you don't see compared to the unbundling is the increase in linting though. We are fairly flat for instructions.
Matthew Ballard (Jul 10 2024 at 20:00):
Two main things are cause for the tangles:
- At the bottom of the ordered algebra stuff, it is quite common to have a single file with unbundled classes/results and bundled classes/results together. This means when you want something basic to glue together
mul
and<
, for example, then the full bundled machinery comes for a ride. - There is tendency to put very small sections in
Basic
files where you prove your thing isLinearOrderedField
under assumptions that make you import things you don't otherwise need forBasic
.
Matthew Ballard (Jul 10 2024 at 21:45):
If you don't want modules over a PID to require Mathlib.Topology.Basic
#14631
Matthew Ballard (Aug 16 2024 at 12:33):
I am still on this kick with PRs that should be pretty uncontroversial: #15071, #15072, and #15152
Last updated: May 02 2025 at 03:31 UTC