Zulip Chat Archive

Stream: maths

Topic: Why is class_group.mk so slow?


David Ang (Dec 17 2022 at 14:44):

I was trying to define a map to the class group of a very explicit polynomial ring of a curve and prove some properties, but it quickly became incredibly annoying because elaborating the definition takes ~10s on my machine, while any theorems I prove would take several minutes to type-check and times out with the standard timeout settings.

import ring_theory.class_group
open polynomial
open_locale non_zero_divisors polynomial

variables (K : Type) [field K]
structure my_curve := (a : K)
variables {K} (M : my_curve K)
noncomputable def my_polynomial : K[X][X] := X ^ 2 - C (X ^ 3 + C M.a)
@[derive comm_ring] def my_ring := adjoin_root $ my_polynomial M
instance : is_domain $ my_ring M := sorry

set_option profiler true
set_option profiler.freq 1

noncomputable def my_ideal : ideal $ my_ring M := ideal.span {adjoin_root.mk (my_polynomial M) X}
noncomputable def my_units : (fractional_ideal (my_ring M) $ fraction_ring $ my_ring M)ˣ := my_ideal M, my_ideal M, sorry, sorry
noncomputable def my_class : class_group $ my_ring M := class_group.mk $ my_units M -- ~10s

The complete version of this is in the associativity branch, which has the theorem I want to prove but times out. Why is class_group.mk so slow? Is it because the construction just has too many layers for Lean to parse in time? Are there any speed-up tricks I can do to go around this problem?

This is a continuation of the speed issue in the elliptic curves thread that is still unresolved.

Eric Wieser (Dec 17 2022 at 15:13):

docs#class_group.mk

Junyan Xu (Dec 18 2022 at 18:44):

On my computer it's faster, less than 4s. I'm on commit 809e920edf, or maybe just my machine is faster (i9-9880H @ 2.30GHz).
Two observations: first,

noncomputable def my_class' : class_group $ my_ring M :=
class_group.mk ⟨(my_ideal M : fractional_ideal (my_ring M) $ fraction_ring $ my_ring M),
  my_ideal M, sorry, sorry

is fast enough (<0.7s).
Second, if I add a "shortcut" instance

noncomputable instance c : algebra (my_ring M) (fraction_ring $ my_ring M) := infer_instance

then the version above takes ~1.1s, while your original version times out (>1min).

David Ang (Dec 19 2022 at 02:19):

Junyan Xu said:

On my computer it's faster, less than 4s. I'm on commit 809e920edf, or maybe just my machine is faster (i9-9880H @ 2.30GHz).
Two observations: first,

noncomputable def my_class' : class_group $ my_ring M :=
class_group.mk ⟨(my_ideal M : fractional_ideal (my_ring M) $ fraction_ring $ my_ring M),
  my_ideal M, sorry, sorry

is fast enough (<0.7s).
Second, if I add a "shortcut" instance

noncomputable instance c : algebra (my_ring M) (fraction_ring $ my_ring M) := infer_instance

then the version above takes ~1.1s, while your original version times out (>1min).

The shortcut instance doesn't really do much for me, but replacing my_units M with its definition does elaborate the definition significantly faster. The problem with this is that each of these sorrys is very long when filled in properly, and it's not realistic to copy the entire unit structure everywhere. I did try replacing every definition/lemma in the result I'm trying to prove with their actual definition/proofs and it does elaborate within somewhat reasonable time, but the proof becomes extremely wordy i.e. essentially squeezing most of the definitions and lemmas from line 472 to line 623 in the branch into one proof.

I don't know why your version is significantly faster, but I'm guessing this is some reducibility or elaboration order issue that I don't know how to control at all. In the original branch I've tried making the analogue of my_ring reducible, and even introduced something like a reducible function_field := fraction_ring $ my_ring M, and it seems to make everything even slower.

Junyan Xu (Dec 19 2022 at 05:18):

The shortcut instance doesn't really do much for me

Yes, what I observed is that it makes things slower, which is puzzling. It's also puzzling that replacing my_units M with its definition makes things faster.

Anne Baanen (Dec 19 2022 at 11:03):

Let me take a look. If adding a shortcut instance makes it slower, it's probably the unifier that is being slow.

Anne Baanen (Dec 19 2022 at 11:05):

Indeed, set_option trace.class_instances true gives about 5x more output for my_units than my_class, while my_units is about 3x faster than my_class.

Anne Baanen (Dec 19 2022 at 11:13):

As I suspected: set_option trace.type_context.is_def_eq_detail true gives about 240 000 lines of output on my_class before timing out. From the trace, it seems we're triggering unification inside types, which Lean is very slow at:

[type_context.is_def_eq_detail] [1]: (fractional_ideal (my_ring M) (fraction_ring (my_ring M)))ˣ =?= (fractional_ideal ?m_3⁰ ?m_5)ˣ
[type_context.is_def_eq_detail] [2]: units =?= units
[type_context.is_def_eq_detail] [2]: fractional_ideal (my_ring M) (fraction_ring (my_ring M)) =?= fractional_ideal ?m_3⁰ ?m_5
...
[type_context.is_def_eq_detail] [3]: my_ring.comm_ring M =?= my_ring.comm_ring M
[type_context.is_def_eq_detail] [3]: localization.comm_ring =?= euclidean_domain.to_comm_ring
[type_context.is_def_eq_detail] unfold left: localization.comm_ring
[type_context.is_def_eq_detail] [4]: {add := comm_semiring.add localization.comm_semiring,
 add_assoc := _,
 zero := comm_semiring.zero localization.comm_semiring,
 zero_add := _,
 add_zero := _,
 nsmul := comm_semiring.nsmul localization.comm_semiring,
 nsmul_zero' := _,
 nsmul_succ' := _,
 neg := has_neg.neg localization.has_neg,
 sub := λ (x y : localization (my_ring M)), x + -y,
 sub_eq_add_neg := _,
 zsmul := has_smul.smul localization.has_smul,
 zsmul_zero' := _,
 zsmul_succ' := _,
 zsmul_neg' := _,
 add_left_neg := _,
 add_comm := _,
 int_cast := ring.int_cast._default comm_semiring.nat_cast comm_semiring.add localization.comm_ring._proof_13
               comm_semiring.zero
               localization.comm_ring._proof_14
               localization.comm_ring._proof_15
               comm_semiring.nsmul
               localization.comm_ring._proof_16
               localization.comm_ring._proof_17
               comm_semiring.one
               localization.comm_ring._proof_18
               localization.comm_ring._proof_19
               has_neg.neg
               (λ (x y : localization (my_ring M)), x + -y)
               localization.comm_ring._proof_20
               has_smul.smul
               localization.comm_ring._proof_21
               localization.comm_ring._proof_22
               localization.comm_ring._proof_23
               localization.comm_ring._proof_24,
 nat_cast := comm_semiring.nat_cast localization.comm_semiring,
 one := comm_semiring.one localization.comm_semiring,
 nat_cast_zero := _,
 nat_cast_succ := _,
 int_cast_of_nat := _,
 int_cast_neg_succ_of_nat := _,
 mul := comm_semiring.mul localization.comm_semiring,
 mul_assoc := _,
 one_mul := _,
 mul_one := _,
 npow := comm_semiring.npow localization.comm_semiring,
 npow_zero' := _,
 npow_succ' := _,
 left_distrib := _,
 right_distrib := _,
 mul_comm := _} =?= euclidean_domain.to_comm_ring
[type_context.is_def_eq_detail] [5]: {add := comm_semiring.add localization.comm_semiring,
 add_assoc := _,
 zero := comm_semiring.zero localization.comm_semiring,
 zero_add := _,
 add_zero := _,
 nsmul := comm_semiring.nsmul localization.comm_semiring,
 nsmul_zero' := _,
 nsmul_succ' := _,
 neg := has_neg.neg localization.has_neg,
 sub := λ (x y : localization (my_ring M)), x + -y,
 sub_eq_add_neg := _,
 zsmul := has_smul.smul localization.has_smul,
 zsmul_zero' := _,
 zsmul_succ' := _,
 zsmul_neg' := _,
 add_left_neg := _,
 add_comm := _,
 int_cast := ring.int_cast._default comm_semiring.nat_cast comm_semiring.add localization.comm_ring._proof_13
               comm_semiring.zero
               localization.comm_ring._proof_14
               localization.comm_ring._proof_15
               comm_semiring.nsmul
               localization.comm_ring._proof_16
               localization.comm_ring._proof_17
               comm_semiring.one
               localization.comm_ring._proof_18
               localization.comm_ring._proof_19
               has_neg.neg
               (λ (x y : localization (my_ring M)), x + -y)
               localization.comm_ring._proof_20
               has_smul.smul
               localization.comm_ring._proof_21
               localization.comm_ring._proof_22
               localization.comm_ring._proof_23
               localization.comm_ring._proof_24,
 nat_cast := comm_semiring.nat_cast localization.comm_semiring,
 one := comm_semiring.one localization.comm_semiring,
 nat_cast_zero := _,
 nat_cast_succ := _,
 int_cast_of_nat := _,
 int_cast_neg_succ_of_nat := _,
 mul := comm_semiring.mul localization.comm_semiring,
 mul_assoc := _,
 one_mul := _,
 mul_one := _,
 npow := comm_semiring.npow localization.comm_semiring,
 npow_zero' := _,
 npow_succ' := _,
 left_distrib := _,
 right_distrib := _,
 mul_comm := _} =?= {add := has_add.add (distrib.to_has_add ?m_3),
 add_assoc := _,
 zero := 0,
 zero_add := _,
 add_zero := _,
 nsmul := field.nsmul ?m_4,
 nsmul_zero' := _,
 nsmul_succ' := _,
 neg := has_neg.neg (sub_neg_monoid.to_has_neg ?m_3),
 sub := field.sub ?m_4,
 sub_eq_add_neg := _,
 zsmul := field.zsmul ?m_4,
 zsmul_zero' := _,
 zsmul_succ' := _,
 zsmul_neg' := _,
 add_left_neg := _,
 add_comm := _,
 int_cast := field.int_cast ?m_4,
 nat_cast := field.nat_cast ?m_4,
 one := 1,
 nat_cast_zero := _,
 nat_cast_succ := _,
 int_cast_of_nat := _,
 int_cast_neg_succ_of_nat := _,
 mul := has_mul.mul (distrib.to_has_mul ?m_3),
 mul_assoc := _,
 one_mul := _,
 mul_one := _,
 npow := field.npow ?m_4,
 npow_zero' := _,
 npow_succ' := _,
 left_distrib := _,
 right_distrib := _,
 mul_comm := _}

Kevin Buzzard (Dec 19 2022 at 11:18):

Thanks so much Anne!

Anne Baanen (Dec 19 2022 at 11:25):

Okay, so one solution is to add attribute [irreducible] my_ring.comm_ring, which at least will block the type-level unification at a doable level. That cuts elaboration time on my machine by a factor 6 or so.

Anne Baanen (Dec 19 2022 at 11:26):

Making other things irreducible either doesn't help much or causes errors because diamond inheritance doesn't line up anymore.

Anne Baanen (Dec 19 2022 at 11:42):

I was looking for another thread where we discuss this, but can't find anything good. In short: under some circumstances (which might just be "always"), checking the definitional equality of two terms requires checking the equality of their types, which can require checking the types' types' equality, etc. And under some circumstances, Lean does not cache these equalities of types, which means it has to recheck the types are equal, and the types' types, and so on. So Lean will do something like (term size)^(type complexity) defeq checks in this case, which is quite slow.

I recall that there are cases where this checking is indeed needed, and definitional structure eta in Lean 4 might shrink the set of circumstances that trigger this bad behaviour. Still I'd like to see this fixed in these common cases, especially where making things irreducible speeds it up so much.

Anne Baanen (Dec 19 2022 at 11:44):

Until then your best bet is to make as many definitions that are defined in terms of other types irreducible or structures.

Anne Baanen (Dec 19 2022 at 11:46):

Just to be clear, by type complexity I'm talking about the nesting depth in units (fractional_ideal (localization (non_zero_divisors (quotient (ideal (polynomial (polynomial K))))))), each head of which depends on some sort of ring instance on the type inside it.

David Ang (Dec 19 2022 at 13:04):

Thank you so much for this! The result I wanted to prove now elaborates within reasonable time at around ~30 seconds on my machine.

Junyan Xu (Dec 19 2022 at 14:06):

Thanks! I guess the blowup issue fixed at lean4#1102 is related?


Last updated: Dec 20 2023 at 11:08 UTC