Zulip Chat Archive
Stream: mathlib4
Topic: simp prefers CharP.cast_eq_zero over Nat.cast_zero
Michael Stoll (Mar 29 2024 at 14:29):
Is there a particular reason why simp uses CharP.cast_eq_zero
instead of Nat.cast_zero
to change casted zeros to actual zeros (when we happen to work in something of characteristic zero)? I find this mildly annoying...
import Mathlib
example : ((0 : ℕ) : ℤ) = 0 := by
simp? says simp only [CharP.cast_eq_zero]
example : ((0 : ℕ) : ZMod 37) = 0 := by
simp? says simp only [Nat.cast_zero]
Yaël Dillies (Mar 29 2024 at 14:33):
I don't think there's much reason besides "That depends on the order in which the two lemmas were declared in a way that is not supposed to be relied upon".
Eric Wieser (Mar 29 2024 at 22:08):
I think this is because the eq_zero
version matches a larger subexpression, so is tried first
Michael Stoll (Apr 07 2024 at 15:29):
Coming back to this: There is also a performance problem. Consider:
import Mathlib
count_heartbeats in
example : ((0 :) : ℤ) = 0 := by
simp only [CharP.cast_eq_zero] -- suggested by `simp?`
-- slow: 1703 heartbeats
count_heartbeats in
example : ((0 :) : ℤ) = 0 := by
simp only [Nat.cast_zero] -- fast (59 heartbeats)
count_heartbeats in
-- shortcut instance suggested by `#synth`, 68 heartbeats
instance : CharZero ℤ := StrictOrderedSemiring.to_charZero
count_heartbeats in
example : ((0 :) : ℤ) = 0 := by
simp only [CharP.cast_eq_zero] -- now fast (66 heartbeats)
This is related to the general problem that typeclass inference is often a peerformance bottleneck. Compare #11824 and the recent discussion here.
- Why is there no shortcut instance for
CharZero ℤ
(and similarly forℚ
andℝ
, but there is docs#Complex.charZero)?
The same question can be asked for many other cases. In my attempt to speed up #11824, I was looking for situations where type class synthesis takes more than 20ms (on my fairly fast laptop) and replaced them by shortcut instances (28 of them in the end, including for CharZero ℤ
and CharZero ℝ
, but there were still a few left). Some examples I would expect to be fast, but which I had to add explicitly:
SMul ℝ ℂ
,SMul ℂ ℂ
Module ℂ ℂ
,Module ℝ ℂ
,Module ℂ (ℂ × ℂ)
AddCommMonoid ℂ
,AddCommMonoid (ℂ × ℂ)
TopologicalSpace ℂ
,TopologicalSpace (ℂ × ℂ)
SMulCommClass ℂ ℂ ℂ
,SMulCommClass ℂ ℝ ℂ
ContinuousConstSMul ℂ ℂ
,ContinuousConstSMul ℝ ℂ
IsScalarTower ℂ ℂ ℂ
,IsScalarTower ℝ ℂ ℂ
(Adding these and a few more on ℂ × ℂ →L[ℂ] ℂ
reduces the time for typeclass inference from about 8 seconds to well below 4 seconds on my machine in TwoVariable.lean
from #11824.)
See also this thread, which resulted in the addition of a shortcut instance for T2Space ℂ
.
Then there are the instances like the one that @David Loeffler mentioned:
import Mathlib
count_heartbeats in -- 32817
#synth FiniteDimensional ℝ (ℂ × ℂ →L[ℂ] ℂ)
As Mathlib grows, I expect this to only get worse, unless there is a way to make typeclass synthesis reliably fast.
One idea (that may be naïve and/or complete nonsense):
Usually such a slow instance is needed several times in a file (think TopologicalSpace ℂ
, say). Would it make sense to cache successful instance searches cumulatively while compiling a file? My understanding is that currently, results are cached only per search query (but I may be wrong). So this would at least speed up things when instances are needed a second, third etc. time.
Kevin Buzzard (Apr 07 2024 at 15:43):
The SMulCommClass
instances were discovered to be a bottleneck in a recent example of David Loeffler's, and Sebastien Gouezel sped them up by changing instance priorities.
Michael Stoll (Apr 07 2024 at 15:50):
I know. Quoting David from a comment on #11824:
On my system, compilation of
JacobiTheta/TwoVariable.lean
(at the current version, commitbf62a22
) takes about 30 seconds. Merging #11980 brings this down to 26 seconds. This is not quite as much of a speedup as adding Michael's list of explicit custom instances (23 sec on my machine), but it's not too far off.
It helps, but it does not really solve the problem.
Yaël Dillies (Apr 07 2024 at 15:58):
Michael Stoll said:
- Why is there no shortcut instance for
CharZero ℤ
(and similarly forℚ
andℝ
, but there is docs#Complex.charZero)?
It should exist, so feel free to add it
Michael Stoll (Apr 07 2024 at 15:59):
OK, but where do I stop? (Looking at the list in my message above...)
Richard Osborn (Apr 07 2024 at 16:27):
Would it be possible to somewhat automatically generate some of these shortcut instances? Just looking at the synthesis of SMul ℝ ℝ
is somewhat surprising. It seems to be using @Algebra.toSMul ℝ ℝ Real.instCommSemiringReal Real.semiring Bialgebra.toAlgebra
, which is constructing a HopfAlgebra ℝ ℝ
. As ℕ
, ℚ
, ℝ
, and ℂ
are so heavily used, it would seem to make sense to automatically generate their instances in certain cases. For example, when Algebra.id
is defined. (and similarly for other structures defined over semirings, rings, fields, etc...).
Richard Osborn (Apr 07 2024 at 16:30):
Another potential solution is to have some sort of test file that makes sure common instances use a "canonical" path through the instance search and throws errors if later definitions cause unnecessary detours.
Kevin Buzzard (Apr 07 2024 at 17:02):
Just looking at the synthesis of
SMul ℝ ℝ
is somewhat surprising.
Right. If you import all of mathlib then it finds a silly route:
import Mathlib
set_option trace.Meta.synthInstance true
#synth SMul ℝ ℝ
/-
[Meta.synthInstance] ✅ SMul ℝ ℝ ▼
[] new goal SMul ℝ ℝ ▶
[] ❌ apply @GradedMonoid.GradeZero.smul to SMul ℝ ℝ ▶
[] ✅ apply @Algebra.toSMul to SMul ℝ ℝ ▼
[tryResolve] ✅ SMul ℝ ℝ ≟ SMul ℝ ℝ
[] new goal CommSemiring ℝ ▼
[instances] #[@CommRing.toCommSemiring, @OrderedCommSemiring.toCommSemiring, @StrictOrderedCommSemiring.toCommSemiring, @CanonicallyOrderedCommSemiring.toCommSemiring, @Semifield.toCommSemiring, @IdemCommSemiring.toCommSemiring, @DirectSum.GradeZero.commSemiring, Real.instCommSemiringReal]
[] ✅ apply Real.instCommSemiringReal to CommSemiring ℝ ▼
[tryResolve] ✅ CommSemiring ℝ ≟ CommSemiring ℝ
[resume] propagating CommSemiring ℝ to subgoal CommSemiring ℝ of SMul ℝ ℝ ▼
[] size: 1
[] new goal Semiring ℝ ▶
[] ✅ apply Real.semiring to Semiring ℝ ▼
[tryResolve] ✅ Semiring ℝ ≟ Semiring ℝ
[resume] propagating Semiring ℝ to subgoal Semiring ℝ of SMul ℝ ℝ ▼
[] size: 2
[] new goal Algebra ℝ ℝ ▼
[instances] #[@Algebra.complexToReal, Algebra.id, @NormedAlgebra.toAlgebra, @Bialgebra.toAlgebra]
[] ✅ apply @Bialgebra.toAlgebra to Algebra ℝ ℝ ▶
[] ✅ apply @HopfAlgebra.toBialgebra to Bialgebra ℝ ℝ ▶
[] ✅ apply CommSemiring.toHopfAlgebra to HopfAlgebra ℝ ℝ ▶
[resume] propagating HopfAlgebra ℝ ℝ to subgoal HopfAlgebra ℝ ℝ of Bialgebra ℝ ℝ ▶
[resume] propagating Bialgebra ℝ ℝ to subgoal Bialgebra ℝ ℝ of Algebra ℝ ℝ ▶
[resume] propagating Algebra ℝ ℝ to subgoal Algebra ℝ ℝ of SMul ℝ ℝ ▶
[] result Algebra.toSMul
-/
The issue (sorry if you know this already!) is that progress on the goal Algebra ℝ ℝ
can be made with the instances [@Algebra.complexToReal, Algebra.id, @NormedAlgebra.toAlgebra, @Bialgebra.toAlgebra]
, and they are tried in order from last-discovered to first-discovered. But the leaf files in mathlib (the last-discovered stuff) are the ones with the wackiest instances, so right now typeclass inference is doing a lot of goofy stuff.
My understanding is that the suggestions to fix this so far have been:
1) persuade the FRO to change the algorithm to try first-discovered instances first
2) persuade the FRO that certain forgetful inheritance instances should have lower priority (I still don't understand this option properly, but people like @Eric Wieser do)
3) manually fiddle with instance priorities in mathlib a la #11980 (a hard-to-maintain solution)
4) Add a bunch of typeclass shortcuts (see the suggestion above)
Note that the FRO are unlikely to want to start changing the typeclass algorithm, and I suspect that of their major users, mathlib will be the one with by far the largest typeclass instance system.
Vincent Beffara (Apr 07 2024 at 17:06):
Would it make sense to also allow to register non-instances (e.g. telling lean that there is no SMul C R
instance) to prevent the inference system from spending time in useless rabbit holes?
Kevin Buzzard (Apr 07 2024 at 17:07):
Again, that boils down to asking the FRO to change the way typeclass inference works, which is really unlikely to get anywhere.
Eric Wieser (Apr 07 2024 at 17:27):
I think your point 2) above is probably not very controversial, and just requires persuading the FRO to allocate time to it; perhaps @Kyle Miller can comment on whether it's on their radar, and how many other side-projects they have with which it would be competing for time allocation.
Richard Osborn (Apr 07 2024 at 17:28):
This may be a silly question, but is there a way to generate all the "paths" through the instance search (perhaps with some depth limit)? I find that as I get deeper into the mathlib library (especially working with tensors), the instance search becomes cumbersome, and these shortcut instances are almost necessary. If we were able to view the paths in the instance search, we may be able to have meta code suggest the shortcut instances.
Michael Stoll (Apr 07 2024 at 18:21):
I'm trying to figure out where to put a shortcut instance CharZero ℤ
.
It looks like the minimal imports required are
Mathlib.Algebra.CharZero.Defs
(forCharZero
) andMathlib.Data.Int.Basic
(forInt.ofNat.inj
).
There is no immediate common successor of these two. Except for Mathlib.Algebra.CharZero.Infinite
, which is a leaf file containing exactly one declaration, all other files importing Mathlib.Algebra.CharZero.Defs
involve order. But we don't actually need the order (on the naturals or integers) to obtain a CharZero
instance, so it feels a bit unnatural to go via these.
Kyle Miller (Apr 07 2024 at 18:22):
@Eric Wieser I think for now the typeclass inference algorithm should be considered to be more-or-less frozen, but potentially adjusting the structure
command's way of setting instance priorities might be a possibility, or the order in which it defines instances, if it's a well-motivated change.
Michael Stoll (Apr 07 2024 at 18:27):
It looks like Mathlib.Data.Int.Order.Basic
is a file with the right imports (still involving order, but it has some instances for ℤ
(and likely to be imported transitively by files that use that ℤ
has characteristic zero), so maybe this is a reasonable place.
Kyle Miller (Apr 07 2024 at 18:31):
@Vincent Beffara Do you know if anyone has explored non-instances in other systems? Without previous research evaluating such an idea, it might or might not make sense. The closest idea that I personally know about is the cut operator in Prolog, but I think with that power you lose some desirable properties (Datalog excludes it because of this).
Michael Stoll (Apr 07 2024 at 18:39):
Michael Stoll said:
It looks like
Mathlib.Data.Int.Order.Basic
is a file with the right imports (still involving order, but it has some instances forℤ
(and likely to be imported transitively by files that use thatℤ
has characteristic zero), so maybe this is a reasonable place.
I've just pulled Mathlib and the file is no longer there? It seems to have moved to Mathlib.Algebra.Order.Ring.Int
...
David Loeffler (Apr 07 2024 at 18:40):
This idea (of "non-instances") seems like a significantly more invasive change to the typeclass inference algorithm than either of Kevin's suggestions (1) or (2). So it seems strange to consider it when those other options have been dismissed because "typeclass inference should be considered more-or-less frozen".
Michael Stoll (Apr 07 2024 at 18:41):
...and this file no longer imports CharZero
:frown:
Michael Stoll (Apr 07 2024 at 18:43):
@Yaël Dillies if you are still around: any suggestion where to put this instance?
Kyle Miller (Apr 07 2024 at 19:42):
(@David Loeffler I don't mean to suggest that it's being considered, but rather to say "even if changes to the typeclass algorithm were being considered, this is unexplored territory." I'm personally curious to know if anyone's tried it before though.)
Vincent Beffara (Apr 07 2024 at 19:43):
Kyle Miller said:
Vincent Beffara Do you know if anyone has explored non-instances in other systems? Without previous research evaluating such an idea, it might or might not make sense. The closest idea that I personally know about is the cut operator in Prolog, but I think with that power you lose some desirable properties (Datalog excludes it because of this).
I have no idea if it makes sense or if it has been tried, it just felt from the discussion that it would be a plausible thing to do.
Eric Wieser (Apr 07 2024 at 20:05):
Non-instances sound like a reasonably thing to try, but they don't sound particularly scalable; they'd really just be a mechanism to prune out-of-control TC search on an ad-hoc basis
Yaël Dillies (Apr 09 2024 at 21:25):
@Michael Stoll said:
Yaël Dillies if you are still around: any suggestion where to put this instance?
docs#CharZero relies on docs#AddMonoidWithOne, which is a ring-like concept. Hence it does not belong in Algebra.Group.Int
/Algebra.Order.Group.Int
. It has pretty low imports apart from that, so Algebra.Ring.Int
is a good place.
Yaël Dillies (Apr 09 2024 at 21:26):
You will probably get errors about assert_not_exists NeZero
around Algebra.GroupPower
. The very existence of those files is bogus, so I would simply delete the assert_not_exists
(but please ping me on Github if you do so, just so that I can assess)
Michael Stoll (Apr 11 2024 at 09:42):
@Yaël Dillies Right now, docs#CharZero is unknown in Algebra.Ring.Int
(and Algebra.Ring.Nat
). Do I understand you correctly that it is OK to add import Mathlib.Algebra.CharZero.Defs
to this file?
Ruben Van de Velde (Apr 11 2024 at 09:55):
Yes, that's how I read the comment
Michael Stoll (Apr 11 2024 at 12:34):
Michael Stoll (Apr 11 2024 at 18:02):
SMul ℂ ℂ
still takes about 11000 heartbeats. This is caused by the algorithm trying Complex.instSMulRealComplex
first, leading it to trying to establish SMul ℂ ℝ
, which takes a very long time to fail. The correct thing to try is Algebra.toSMul
, but increasing its priority to 1000 does not seem to have any effect (I wonder why). However, setting the priority of Complex.instSMulRealComplex
to 90 brings it down to about 30 heartbeats(!).
import Mathlib
count_heartbeats in -- 10881 heartbeats
#synth SMul ℂ ℂ
attribute [instance 90] Complex.instSMulRealComplex
count_heartbeats in -- 31 heartbeats
#synth SMul ℂ ℂ
I think I will make a PR reducing this priority and see what the effect is.
Ruben Van de Velde (Apr 11 2024 at 18:40):
Independently, maybe also add the shortcut instance?
Michael Stoll (Apr 11 2024 at 18:41):
This shouldn't save much in addition, I would think.
Michael Stoll (Apr 11 2024 at 19:44):
#12070
It saves a few tens of (american) billions of instructions, mostly in files related to modular forms (total build instructions -67.551*10^9).
BTW, how can I sort the speedcenter output according to relative (or absolute) gain/loss? I can sort according to the numbers in the first or second run by clicking on the column heads, but I don't see how to do something similar with the comparison column.
Michael Stoll (Apr 11 2024 at 20:26):
NumberTheory.LSeries.Basic
and ...Linearity
also profit.
@David Loeffler @Chris Birkbeck
Michael Stoll (Apr 11 2024 at 20:28):
... and NumberTheory.NumberField.Embeddings
...
Michael Stoll (Apr 12 2024 at 08:20):
Michael Stoll said:
BTW, how can I sort the speedcenter output according to relative (or absolute) gain/loss? I can sort according to the numbers in the first or second run by clicking on the column heads, but I don't see how to do something similar with the comparison column.
Not sure who to ping about this. Perhaps @Sebastian Ullrich ?
Michael Stoll (Apr 12 2024 at 08:22):
I am now wondering whether we need instSMulRealComplex
as an instance at all.
Does anybody have a use case where we have SMul R ℝ
and not also (e.g.) Algebra R ℂ
?
Michael Stoll (Apr 12 2024 at 08:23):
I'll make a PR demoting it to a def
, to see if anything breaks.
Michael Stoll (Apr 12 2024 at 08:35):
It seems to be needed at least locally in Data.Complex.Basic
, to get stuff like nsmul
and zsmul
on ℂ
.
I'm making the instance local to the file; let's see...
Michael Stoll (Apr 12 2024 at 11:55):
#12080
I ended up scoping it, because it is also needed in Data.Complex.Module
.
Nothing breaks, and there is a positive speed effect on these two files.
Last updated: May 02 2025 at 03:31 UTC