Zulip Chat Archive

Stream: mathlib4

Topic: performance issues with `SubringClass`.


Jireh Loreaux (Sep 25 2024 at 14:48):

In #16975 I am trying to add missing instances (or, what would have a similar effect: make some hom classes class abbrev). In fact, there are only two of them, but the performance impact is annoyingly bad. These instances are entirely natural (essentially just SubringClass.toNonUnitalSubringClass and the semiring variant).

This came up as a consequence of #16953 where I'm trying to add classes of C*-algebras in order to improve performance there (which is basically successful). I'm soliciting suggestions for what I should do or attempt.

Eric Wieser (Sep 25 2024 at 14:58):

#12617 is very loosely related

Johan Commelin (Sep 26 2024 at 07:30):

Do you understand in more detail where the performance hit lands? The bench bot shows that certain files become much slower? But are there specific decls that become slower? Or is it just generically "death by a thousand cuts"?

Johan Commelin (Sep 26 2024 at 07:31):

If 1 or 2 decls become significantly slower, then we can try to start working on a MWE displaying the behaviour.

Jireh Loreaux (Sep 30 2024 at 17:50):

I took the four worst offending files and looked at all the declarations in each. It seems like for the worst performance, it is indeed mostly just a few declarations. I've tried looking at some of them, but this isn't my area of Mathlib, so I didn't make much progress. Help would be greatly appreciated.

NB: I compared against 30da628648, which is the first commit in the PR, not the most recent commit (as Eric suggested, and I agree, that the first commit is really the correct one to use; feel free to revert the PR to that commit if you choose to work on this).

NumberTheory.NumberField.Discriminant

NumberTheory.Cyclotomic.PrimitiveRoots

Algebra.Algebra.Subalgebra.Rank

FieldTheory.PurelyInseparable

Jireh Loreaux (Sep 30 2024 at 17:52):

Note that the performance hit in these 4 files accounts for about 1/3 of the entire impact of the PR. (by instruction count)

Jireh Loreaux (Sep 30 2024 at 17:53):

NumberTheory.NumberField.Discriminant alone accounts for 1/6 of the total impact, and in there, it's really just 2-3 declarations that cause the fallout.

Jireh Loreaux (Oct 02 2024 at 17:58):

Can I tempt a number theorist, say, @Filippo A. E. Nuccio or @Riccardo Brasca into taking a look at the slow declarations on 30da628648 in NumberTheory.NumberField.Discriminant that I linked in the fold above? It would be very nice to see if we can address this performance issue to unlock some other PRs.

Riccardo Brasca (Oct 02 2024 at 18:47):

I feel guilty for half of the files, so I will try to have a look. Not sure I will have time before next week though.

Riccardo Brasca (Oct 02 2024 at 18:48):

Well, actually, maybe @Xavier Roblot can have a look at the declarations in the discriminant file.

Xavier Roblot (Oct 02 2024 at 19:19):

Yes, I should have some time to take a look tomorrow or Friday.

Jireh Loreaux (Oct 02 2024 at 22:05):

Riccardo Brasca said:

I feel guilty for half of the files

There's no blame here! I'm not even sure why adding these instances has such a significant performance hit.

Xavier Roblot (Oct 03 2024 at 11:36):

I have posted some possible changes for the file Discriminant.lean on the PR page. With those, I was able to get compilation times comparable to master.

Filippo A. E. Nuccio (Oct 03 2024 at 17:46):

Late to the party, but it seems the problem has been solved :smile:

Jireh Loreaux (Oct 03 2024 at 22:10):

Thanks @Xavier Roblot ! I'll definitely incorporate those changes.

Jireh Loreaux (Oct 03 2024 at 22:15):

As for the Rank.lean file, it seems to me that what's going on is, because of the additional import of the NonUnitalSubring file, Lean spends more time looking for AddCommMonoid instances when it wants to find a Module R M instance in order to make sense of Module.rank R M. Even on current master, elaboration of the statement alone of docs#Subalgebra.rank_sup_eq_rank_left_mul_rank_of_free takes 25000 heartbeats and it spends 1.5 seconds on inferring those Module instances. On the PR, it spends about 31000 heartbeats on the same.

So, my take on that particular file is that inferring those Module instances is already slow, and this only makes it a bit worse. I'm fairly certain that it's slow because it's working with a lot of subtypes, and so, as we've seen before, Lean has to search all over the place because the head expression is a coercion to Sort and the indexing is not fine-grained enough.

Jireh Loreaux (Oct 03 2024 at 22:37):

I managed to speed up docs#IsPrimitiveRoot.powerBasis by replacing

protected noncomputable def powerBasis : PowerBasis K L :=
  PowerBasis.map (Algebra.adjoin.powerBasis <| (integral {n} K L).isIntegral ζ) <|
    (Subalgebra.equivOfEq _ _ (IsCyclotomicExtension.adjoin_primitive_root_eq_top )).trans
      Subalgebra.topEquiv

with

protected noncomputable def powerBasis : PowerBasis K L :=
  letI pb := Algebra.adjoin.powerBasis <| (integral {n} K L).isIntegral ζ;
  pb.map <| (Subalgebra.equivOfEq _ _ (IsCyclotomicExtension.adjoin_primitive_root_eq_top )).trans
    Subalgebra.topEquiv

it went from ~20000 heartbeats on master and ~38000 on this PR, to ~4700 with the new (identical! aside from elaboration) version.

Jireh Loreaux (Oct 03 2024 at 22:38):

@Heather Macbeth :up: given that you found something like this recently, I thought you would be interested.

Jireh Loreaux (Oct 04 2024 at 03:35):

docs#IsPrimitiveRoot.norm_pow_sub_one_of_prime_pow_ne_two was suffering from some serious defeq abuse, and being a bit more careful actually improves performance (on this branch) beyond current master by about 40%.

Jireh Loreaux (Oct 04 2024 at 03:56):

I've split off the performance enhancements into #17395

Jireh Loreaux (Oct 04 2024 at 04:15):

By the way, in the original code for docs#IsPrimitiveRoot.powerBasis above, the synthInstance trace shows repeated attempts at CommRing ?mvar :explosion: . I was under the impression that this should never happen because this is a doomed search. Do we know why that happens? This can even be witnessed on current master (and note that it's the same ?mvar every time).


Last updated: May 02 2025 at 03:31 UTC