Zulip Chat Archive
Stream: mathlib4
Topic: noisy `FilteredColimitCommutesFiniteLimit`
Matthew Ballard (Jan 11 2024 at 19:47):
This file seems very noisy when benchmarking for reasons I cannot figure out yet.
Repeatedly running
gtime -v lake env lean --profile Mathlib/CategoryTheory/Limits/FilteredColimitCommutesFiniteLimit.lean
on my machine shows the user time bouncing around by 10-20%. Elaboration is the only metric that varies significantly, but it does not account for the whole change. Does anyone have any thoughts on where to look for the missing variance?
Kevin Buzzard (Jan 11 2024 at 19:54):
Wouldn't one expect user time to bounce around like this anyway? I've certainly seen outputs of `set_option trace.profiler true` bounce around. Everything is super-quick in this file except for two difficult proofs.
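For concreteness, a minimal sketch of the kind of per-declaration profiling being described here; the `demo` theorem is just a placeholder, not anything from the file:

```lean
-- Enable the profiler trace for a single declaration and compare the
-- reported elaboration times across repeated compilations.
set_option trace.profiler true in
theorem demo : 2 + 2 = 4 := by decide
```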
Matthew Ballard (Jan 11 2024 at 19:57):
I am using user time on my machine as a proxy for instructions, which also bounce around a lot during benchmarking. (Aside: is there really no analog of `perf` on macOS?)
While things bounce around somewhat in general, this file is an outlier for benchmarking instructions.
Sebastian Ullrich (Jan 11 2024 at 20:53):
I took a quick look at the file as well; I agree that it's a perplexing outlier.
Kevin Buzzard (Jan 11 2024 at 21:43):
OK, I'm pointing my finger at the second long proof: docs#CategoryTheory.Limits.colimitLimitToLimitColimit_surjective. This is a subtle proof in category theory, formalised by @Scott Morrison. I'm pinging Scott because I once said to them that `count_heartbeats in` was giving me varying results, and I thought they'd claimed in response that it was supposed to be deterministic. So I'm claiming that it's highly non-deterministic in this declaration. I copied the file `FilteredColimitCommutesFiniteLimit.lean`, up to and including this declaration, into a scratch file, added `count_heartbeats in` above it, and then made the following stupid file (because I'm not good at computers):
lake build Mathlib.scratch.scratch10
rm .lake/build/lib/Mathlib/scratch/scratch10*
lake build Mathlib.scratch.scratch10
rm .lake/build/lib/Mathlib/scratch/scratch10*
lake build Mathlib.scratch.scratch10
rm .lake/build/lib/Mathlib/scratch/scratch10*
...
and let it run on my tube journey home. Grepping the output for `heartbeats` gives:
Used 113061 heartbeats, which is less than the current maximum of 200000.
Used 156769 heartbeats, which is less than the current maximum of 200000.
Used 119662 heartbeats, which is less than the current maximum of 200000.
Used 117448 heartbeats, which is less than the current maximum of 200000.
Used 102932 heartbeats, which is less than the current maximum of 200000.
Used 121102 heartbeats, which is less than the current maximum of 200000.
Used 121290 heartbeats, which is less than the current maximum of 200000.
Used 122790 heartbeats, which is less than the current maximum of 200000.
Used 97722 heartbeats, which is less than the current maximum of 200000.
Used 113103 heartbeats, which is less than the current maximum of 200000.
Used 110243 heartbeats, which is less than the current maximum of 200000.
Used 115873 heartbeats, which is less than the current maximum of 200000.
Used 114272 heartbeats, which is less than the current maximum of 200000.
Used 115779 heartbeats, which is less than the current maximum of 200000.
Used 108220 heartbeats, which is less than the current maximum of 200000.
Used 104048 heartbeats, which is less than the current maximum of 200000.
Used 119068 heartbeats, which is less than the current maximum of 200000.
Used 120746 heartbeats, which is less than the current maximum of 200000.
Used 110828 heartbeats, which is less than the current maximum of 200000.
Used 110804 heartbeats, which is less than the current maximum of 200000.
Used 103106 heartbeats, which is less than the current maximum of 200000.
Used 107728 heartbeats, which is less than the current maximum of 200000.
Used 112339 heartbeats, which is less than the current maximum of 200000.
Used 114127 heartbeats, which is less than the current maximum of 200000.
Used 109402 heartbeats, which is less than the current maximum of 200000.
Used 115403 heartbeats, which is less than the current maximum of 200000.
Used 108656 heartbeats, which is less than the current maximum of 200000.
Used 105474 heartbeats, which is less than the current maximum of 200000.
Used 128118 heartbeats, which is less than the current maximum of 200000.
Used 108604 heartbeats, which is less than the current maximum of 200000.
Used 115873 heartbeats, which is less than the current maximum of 200000.
Used 116891 heartbeats, which is less than the current maximum of 200000.
Used 117010 heartbeats, which is less than the current maximum of 200000.
Used 104934 heartbeats, which is less than the current maximum of 200000.
Used 113224 heartbeats, which is less than the current maximum of 200000.
Used 131571 heartbeats, which is less than the current maximum of 200000.
Used 122630 heartbeats, which is less than the current maximum of 200000.
Used 128894 heartbeats, which is less than the current maximum of 200000.
Used 120839 heartbeats, which is less than the current maximum of 200000.
Used 125893 heartbeats, which is less than the current maximum of 200000.
Used 111033 heartbeats, which is less than the current maximum of 200000.
Used 115798 heartbeats, which is less than the current maximum of 200000.
Used 142694 heartbeats, which is less than the current maximum of 200000.
Used 111420 heartbeats, which is less than the current maximum of 200000.
Used 116855 heartbeats, which is less than the current maximum of 200000.
Used 117161 heartbeats, which is less than the current maximum of 200000.
Used 115206 heartbeats, which is less than the current maximum of 200000.
Used 123498 heartbeats, which is less than the current maximum of 200000.
Used 129311 heartbeats, which is less than the current maximum of 200000.
Used 114465 heartbeats, which is less than the current maximum of 200000.
Used 108799 heartbeats, which is less than the current maximum of 200000.
Used 127767 heartbeats, which is less than the current maximum of 200000.
Used 111308 heartbeats, which is less than the current maximum of 200000.
Used 118953 heartbeats, which is less than the current maximum of 200000.
Used 114031 heartbeats, which is less than the current maximum of 200000.
Used 129164 heartbeats, which is less than the current maximum of 200000.
Used 135640 heartbeats, which is less than the current maximum of 200000.
Used 115803 heartbeats, which is less than the current maximum of 200000.
Used 116829 heartbeats, which is less than the current maximum of 200000.
Used 110955 heartbeats, which is less than the current maximum of 200000.
Used 128525 heartbeats, which is less than the current maximum of 200000.
Used 120924 heartbeats, which is less than the current maximum of 200000.
Used 132223 heartbeats, which is less than the current maximum of 200000.
Used 107426 heartbeats, which is less than the current maximum of 200000.
Used 112406 heartbeats, which is less than the current maximum of 200000.
Used 124224 heartbeats, which is less than the current maximum of 200000.
Used 109971 heartbeats, which is less than the current maximum of 200000.
Used 106555 heartbeats, which is less than the current maximum of 200000.
Used 108331 heartbeats, which is less than the current maximum of 200000.
Used 110922 heartbeats, which is less than the current maximum of 200000.
Used 120041 heartbeats, which is less than the current maximum of 200000.
Used 127298 heartbeats, which is less than the current maximum of 200000.
Used 117376 heartbeats, which is less than the current maximum of 200000.
Used 131497 heartbeats, which is less than the current maximum of 200000.
Used 120933 heartbeats, which is less than the current maximum of 200000.
Used 139723 heartbeats, which is less than the current maximum of 200000.
Used 101761 heartbeats, which is less than the current maximum of 200000.
Used 109955 heartbeats, which is less than the current maximum of 200000.
Used 113512 heartbeats, which is less than the current maximum of 200000.
Used 127074 heartbeats, which is less than the current maximum of 200000.
Used 106246 heartbeats, which is less than the current maximum of 200000.
Used 108073 heartbeats, which is less than the current maximum of 200000.
Used 112868 heartbeats, which is less than the current maximum of 200000.
Used 115719 heartbeats, which is less than the current maximum of 200000.
Used 108666 heartbeats, which is less than the current maximum of 200000.
Used 100549 heartbeats, which is less than the current maximum of 200000.
Used 121516 heartbeats, which is less than the current maximum of 200000.
Used 95127 heartbeats, which is less than the current maximum of 200000.
Used 107210 heartbeats, which is less than the current maximum of 200000.
Used 122246 heartbeats, which is less than the current maximum of 200000.
Used 124293 heartbeats, which is less than the current maximum of 200000.
Used 123156 heartbeats, which is less than the current maximum of 200000.
Used 113086 heartbeats, which is less than the current maximum of 200000.
so the number of heartbeats that declaration takes varies from 95127 to 156769, an increase of more than 60%. Because the file consists of two long proofs and then a bunch of super-quick stuff, and the proof we're profiling takes about twice as long to compile as the other one, this may explain the high percentage variation picked up by the benchmarking software.
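For reference, the wrapping pattern described above looks roughly like this (a toy sketch: the import path is the one Mathlib used at the time, and `demo` is a placeholder rather than the real declaration):

```lean
import Mathlib.Util.CountHeartbeats

-- `count_heartbeats in` makes elaboration of the following declaration report
-- a line of the form "Used NNN heartbeats, which is less than the current maximum of ...".
count_heartbeats in
theorem demo : 2 + 2 = 4 := by decide
```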
Kim Morrison (Jan 12 2024 at 02:26):
I read through the proof, and don't see anything suspicious.
Kim Morrison (Jan 12 2024 at 02:27):
If I were writing it again today I would take more effort to split it into subcomponents (in my defence I did think about this at the time; it's cumbersome to split).
Kim Morrison (Jan 12 2024 at 02:28):
The fact that there are some `erw`s is of course a red flag (although par for the course in the later parts of the category theory library...), but I really don't see anything that has an excuse for being nondeterministic!
Kevin Buzzard (Jan 12 2024 at 09:18):
I am still unclear on whether the answer to "is the number of heartbeats taken by a proof supposed to be the same each time you compile it?" is "yes" or "no". I tried it for a short proof at the beginning of that file and it was always the same.
Sebastian Ullrich (Jan 12 2024 at 09:22):
I think the short answer is that we have not tested this hypothesis in detail yet, especially across mathlib. It certainly would be a nice invariant to have, and it is not yet clear why it is broken here.
Matthew Ballard (Jan 12 2024 at 14:32):
It is now present in 4 of the last 5 significant runs, and in 3 of them it is the only change reported.
Kim Morrison (Jan 13 2024 at 02:34):
Should we just rewrite this proof?
Kevin Buzzard (Jan 13 2024 at 10:05):
I spent two hours yesterday trying to figure out which line, if any, was "causing the problem", by slowly moving a `sorry` down the proof, recompiling 20 times, and counting heartbeats. One of my conclusions, unfortunately, was that 20 recompilations probably wasn't enough to see the statistics reliably and that perhaps 50 would be better. One thing which would be helpful, but I'm not sure how easy it would be or even if it's a meaningful concept, would be to see the number of heartbeats taken by each tactic invocation in a long proof. That way one could see whether the large spikes happen because everything is taking a bit longer or because one tactic is occasionally taking a lot longer.
One thing I did learn from this experiment is that after a few lines the number of heartbeats stops being completely invariant: you start getting errors of ±1, which turn into ±2 and then ±3, etc. The error bar is typically about 1% of the total number of heartbeats. It stays at 1% until about halfway through the proof, then starts creeping up to about 2% around line 250 of the file. It was at this point that I realised that 20 iterations weren't enough, because when I switched to 50 I started getting error bars of 3%. But I didn't get as far as finding the 50% variation which we see at the bottom of the proof. I did manage to convince myself that it wasn't the `erw`s, though.
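A toy illustration of the bisection pattern described above (not the actual Mathlib proof): keep a prefix of the tactic script, close the rest with `sorry`, and compare heartbeat counts as the `sorry` moves down the proof.

```lean
import Mathlib.Util.CountHeartbeats

-- Toy stand-in for the real proof: only the first rewrite is kept and the
-- remainder is replaced by `sorry`, so the reported heartbeats measure the
-- prefix alone.
count_heartbeats in
example (a b c : Nat) : a + b + c = c + b + a := by
  rw [Nat.add_comm a b]
  sorry
```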
Mario Carneiro (Jan 13 2024 at 23:08):
Sounds like you should have done binary search instead
Kevin Buzzard (Jan 13 2024 at 23:40):
Yeah we don't get taught that in maths departments
Kim Morrison (Jan 14 2024 at 02:08):
I wrote a `heartbeat_variation in` wrapper that does 10 runs and reports the variation, and used this to poke at this proof a bit.
It seems some of the `simp only` calls are the perpetrators here, although I still don't understand why. I've taken three of these and replaced them by explicit `rw`s, and this reduces the standard deviation in heartbeats from the 25% range to the 5% range.
I'll commit the wrapper and the proof update separately.
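A toy sketch of the kind of change just described (not the actual Mathlib code): a `simp only` call swapped for the explicit `rw` steps that do the same rewriting.

```lean
-- The commented-out `simp only` and the `rw` below close the same goal, but
-- `rw` pins down exactly which lemmas fire and in what order.
example (a b : Nat) : (a + 0) + (b + 0) = a + b := by
  -- previously: simp only [Nat.add_zero]
  rw [Nat.add_zero, Nat.add_zero]
```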
Kim Morrison (Jan 14 2024 at 02:22):
Proof in #9732, `heartbeat_variation in` in #9733.
Kevin Buzzard (Jan 14 2024 at 12:02):
I agree with the `simp only` comment: one point in the file where I felt there seemed to be a jump in error bar percentages (from 1.5% to 3%) was at a `simp only` on line 255 or so, but then I switched from 20 to 50 compilations and became less confident in this assertion, which was when I decided to give up and post what I knew.
Matthew Ballard (Jan 15 2024 at 20:17):
Could the spirit of `FilteredColimitCommutesFiniteLimit` have possessed `Analysis.NormedSpace.Multilinear.Basic`?