## Stream: maths

### Topic: Slow instance

#### Thomas Browning (Jan 12 2021 at 01:57):

The following code is very slow. However, swapping the order of the two letI statements make the code much quicker. What's going on here?

import field_theory.normal

theorem slow_theorem {F E : Type*} [field F] [field E] [algebra F E] {p : polynomial F}
[polynomial.is_splitting_field F E p] : normal F E :=
begin
intro x,
haveI hFE : finite_dimensional F E := polynomial.is_splitting_field.finite_dimensional E p,
have H : is_integral F x := is_integral_of_noetherian hFE x,
refine ⟨H, or.inr _⟩,
rintros q q_irred ⟨r, hr⟩,
let C := adjoin_root (minimal_polynomial H),
letI : algebra C E := ring_hom.to_algebra
(adjoin_root.lift (algebra_map F E) x (minimal_polynomial.aeval H)),
letI : algebra C D := ring_hom.to_algebra
((algebra_map E D).comp (algebra_map F E))
(by { rw [←polynomial.eval₂_map, hr, adjoin_root.algebra_map_eq, polynomial.eval₂_mul,
sorry,
end


#### Johan Commelin (Jan 12 2021 at 15:38):

I don't really understand what's going wrong. But the final rw seems to be the culprit. It must be having a hard time finding some instance, but I don't see how the algebra C E instance would cause such a problem.
You can use

set_option trace.class_instances true


to see how lean is searching for instances.

#### Thomas Browning (Jan 12 2021 at 19:23):

I don't think that it's the rw specifically though, because the same problem occurs with this code:

import field_theory.normal

theorem slow_theorem {F E : Type*} [field F] [field E] [algebra F E] {p : polynomial F}
[polynomial.is_splitting_field F E p] : normal F E :=
begin
intro x,
haveI hFE : finite_dimensional F E := polynomial.is_splitting_field.finite_dimensional E p,
have H : is_integral F x := is_integral_of_noetherian hFE x,
refine ⟨H, or.inr _⟩,
rintros q q_irred ⟨r, hr⟩,
let C := adjoin_root (minimal_polynomial H),
letI : algebra C E := ring_hom.to_algebra
(adjoin_root.lift (algebra_map F E) x (minimal_polynomial.aeval H)),
letI : algebra C D := ring_hom.to_algebra
((algebra_map E D).comp (algebra_map F E))
(by sorry)),
sorry,
end


#### Thomas Browning (Jan 13 2021 at 02:38):

It seems like the q_irred is causing problems. This code is slow, but commenting out the q_irred line makes the code fast.

import field_theory.normal

theorem slow_theorem {F E : Type*} [field F] [field E] [algebra F E] {p : polynomial F}
{q : polynomial E} {x : E} [polynomial.is_splitting_field F E p] : false :=
begin
have q_irred : irreducible q := sorry,
haveI hFE : finite_dimensional F E := polynomial.is_splitting_field.finite_dimensional E p,
have H : is_integral F x := is_integral_of_noetherian hFE x,
let C := adjoin_root (minimal_polynomial H),
letI : algebra C E := ring_hom.to_algebra
(adjoin_root.lift (algebra_map F E) x (minimal_polynomial.aeval H)),
letI : algebra C D := ring_hom.to_algebra
((algebra_map E D).comp (algebra_map F E))
(by sorry)),
sorry,
end


#### Thomas Browning (Jan 13 2021 at 03:04):

The weird thing is that q_irred is not an instance, so I have trouble seeing why it would make things slower

#### Thomas Browning (Jan 13 2021 at 23:49):

Here's a simpler example of the problem. Commenting out either hyp1 or hyp2 makes the code run fast.

import field_theory.normal

theorem slow_theorem {F : Type*} [field F] {p q : polynomial F} : false :=
begin
have hyp1 : irreducible q := sorry,
haveI hyp2 : finite_dimensional F F := sorry,
letI : algebra C D := ring_hom.to_algebra
sorry,
end


#### Kevin Buzzard (Jan 14 2021 at 00:30):

Just to be clear -- the code isn't slow, it's broken. The theorem isn't taking a long time to compile -- it's timing out in the presence of hyp1 and hyp2.

#### Thomas Browning (Jan 14 2021 at 00:32):

For me, it's just slow (23 seconds), rather than timing out

#### Kevin Buzzard (Jan 14 2021 at 00:43):

This is not a contradiction -- I might have fiddled with VS code timeout settings

#### Kevin Buzzard (Jan 14 2021 at 00:44):

The debugging output is 47K :-(

#### Kevin Buzzard (Jan 14 2021 at 00:52):

I would be tempted to try and spot which instances are being created in the three runs.

#### Thomas Browning (Jan 14 2021 at 00:56):

I can't spot any significant differences between the outputs (using set_option trace.class_instances true, and a diff tool).

#### Thomas Browning (Jan 14 2021 at 20:45):

Here's a more simplified example. Deleting the hyp1 line or deleting the hyp2 line or deleting : algebra F C makes the code run fast.

I still have no idea what's going on.

import field_theory.normal

theorem slow_theorem {F : Type*} [field F] {q : polynomial F} : false :=
begin
have hyp1 : irreducible q := sorry,
haveI hyp2 : finite_dimensional F F := sorry,
have key : algebra F C := ring_hom.to_algebra (algebra_map F C),
sorry,
end


#### Kevin Buzzard (Jan 14 2021 at 20:58):

If you want to get the experts interested then the next thing to try is removing the dependency on mathlib. I'll take another look at this later.

#### Thomas Browning (Jan 14 2021 at 21:17):

I'll try. Here's something really strange though: swapping hyp1 and hyp2 makes the code fast.

#### Sebastien Gouezel (Jan 14 2021 at 21:23):

I had a look (setting set_option trace.class_instances true and trying to decipher the output). When you add the fact that q is irreducible, you open a new possibility to get that C is a semiring, because it can deduce that from the fact that C is a field. And when you have several paths in the hierarchy, and Lean can not see for itself that they lead to the same thing, then it is lost.

A solution (which is not a satisfactory one, normally Lean should not get lost) is to add the line

attribute [instance, priority 1001] adjoin_root.comm_ring


before the lemma.

#### Thomas Browning (Jan 14 2021 at 21:31):

I do think that there is some semiring/field confusion going on. However, the significance of finite_dimensional eludes me.

#### Alex J. Best (Jan 14 2021 at 21:38):

I don't think finite_dimensional is significant, if you change that line to  haveI hyp2 : monoid ℕ := by apply_instance, its still slow, so really its the fact you are resetting the instance cache thats slowing you down.

#### Sebastien Gouezel (Jan 14 2021 at 21:39):

Yes, it is adding hyp1 to the instance cache (note that irreducible is a class!)

#### Sebastien Gouezel (Jan 14 2021 at 21:40):

You can get the same effect with haveI hyp1 : irreducible q := sorry.

#### Thomas Browning (Jan 14 2021 at 21:45):

Alex J. Best said:

I don't think finite_dimensional is significant, if you change that line to  haveI hyp2 : monoid ℕ := by apply_instance, its still slow, so really its the fact you are resetting the instance cache thats slowing you down.

That's a relief. I was banging my head searching through mathlib trying to find the weird interaction between noetherian modules and quotient rings.

#### Thomas Browning (Jan 14 2021 at 21:47):

So is there any way to avoid this problem? In my use case, it seems inevitable that I will have a irreducible hypothesis lying around and will eventually write a haveI statement.

#### Sebastien Gouezel (Jan 14 2021 at 21:48):

Debuggued a little bit further. This is a case of a slow rfl between two semiring instances on C:

theorem slow_theorem {F : Type*} [field F] {q : polynomial F} : false :=
begin
haveI hyp1 : irreducible q := sorry,
have : @ring.to_semiring C (@division_ring.to_ring C (@field.to_division_ring C (@adjoin_root.field F _inst_1 q hyp1)))
= @ring.to_semiring C (@comm_ring.to_ring C
(@adjoin_root.comm_ring F (@euclidean_domain.to_comm_ring F (@field.to_euclidean_domain F _inst_1)) q)) :=
rfl,
sorry
end


The rfl is taking ages.

#### Johan Commelin (Jan 14 2021 at 21:54):

Sebastien Gouezel said:

Yes, it is adding hyp1 to the instance cache (note that irreducible is a class!)

I guess maybe this shouldn't be a class...

#### Sebastien Gouezel (Jan 14 2021 at 21:56):

Even if it were not a class, the rfl problem above would show up whenever you would want to use the field structure on an adjoin_root type.

#### Sebastien Gouezel (Jan 14 2021 at 21:57):

I wonder if it's an instance of the performance problem of old type structure (against which Leo has warned us strongly) or if we can do something to improve the situation.

#### Sebastien Gouezel (Jan 14 2021 at 22:07):

A work-around to prevent expansion of structures too far is to add the line

attribute [irreducible] ideal.quotient.comm_ring


(or maybe local attribute) before your statement. On my computer, the elaboration time for the lemma goes from 18s to 3s. Still long, but much more manageable.

#### Sebastien Gouezel (Jan 15 2021 at 08:27):

This issue is really frightening me, as this is not a typeclass search issue. I have already asked this before, and I know @Mario Carneiro is not enthusiastic about the idea, but let me ask it again (since in any case we will need to redesign a little bit the algebraic hierarchy in Lean 4 since it does not support old style structures directly).

What about having classes that contain the data (zero, one, add, sub, neg, mul, and so on), and then classes taking the first ones as parameters, being Prop-valued (so that defeqness would be automatic) and asserting that the data is well behaved (defining a ring, a field, and so on). One thing to avoid would be having one class for each data (because it would lead to everything depending on a lot of parameters, and create blow-up when defining products for instance), but what about having classes HasZeroOneAddSubNegMul and all the natural variations (there wouldn't be that many, and we would set them up and their inheritance properties by hand -- this one would just be extends HasZero HasOne HasAdd HasSub HasNeg HasMul, and we would record an instance from HasZeroOneAddSubNegMul to HasZeroOneAddMul for example, and even to HasZero since looping is not a problem in Lean 4) and then Ring would take as an instance parameter HasZeroOneAddSubNegMul, while semiring would take HasZeroOneAddMul.

I have the impression that it would solve all the long rfl problems, and all the diamond issues in the algebraic hierarchy, at a minimal cost. Of course, for this to be usable we would need the syntax [[Ring R]] that declares automatically an instance of HasZeroOneAddSubNegMul R if none can be found, to keep things transparent for the user.

#### Mario Carneiro (Jan 15 2021 at 08:30):

The core of my complaint with this design is that you can replace HasZeroOneAddSubNegMul with Ring and everything still works the same

#### Mario Carneiro (Jan 15 2021 at 08:30):

on the assumption that we don't care about instances of HasZeroOneAddSubNegMul that don't satisfy weaker constraints than Ring, otherwise make a class for those weakest assumptions

#### Mario Carneiro (Jan 15 2021 at 08:31):

With that modification, the main difference from the current approach would be in classes with no new data, for example integral_domain, which would become parameterized over ring

#### Mario Carneiro (Jan 15 2021 at 08:33):

That said, I haven't looked into the problem in this thread

#### Mario Carneiro (Jan 15 2021 at 08:35):

One thing I think is a contributing factor if not main factor in long rfl proofs of typeclass diamonds is that with the old structure approach every instance has to be unfolded to see all of its fields, and then these have to be compared pairwise with the fields in the other structure for defeq

#### Mario Carneiro (Jan 15 2021 at 08:37):

I'm not even sure the new structure command will solve this (even in the cases where it applies), you still have to unfold a bunch of things. What we need is a way to prove defeqs as lemmas and get them to be used automatically when they come up. In principle unification hints can help with the second part, and it's not clear to me if they solve the first part

#### Mario Carneiro (Jan 15 2021 at 08:39):

(Even better would be if we could prove equalities that aren't defeqs and get the typeclass system to use them. Alas, I don't think DTT can handle it...)

#### Sebastien Gouezel (Jan 15 2021 at 08:47):

Here is the problem I see with adding Prop fields to basic data structures. For instance, instead of HasZeroOneAddSubNegMulInvDiv, you would like to have just Field. But then you have long paths from Field to EuclideanDomain to IntegralDomain to Ring to Semiring, keeping the data but removing props one afther the other. If the fields in Field are complicated, all the data has to be copied several times along the path. If there is just a basic path Field -> Ring -> SemiRing and all the others are mixins, I agree this issue disappears (and also we don't need any old style structure command to implement this). Looks like a good solution to me.

#### Mario Carneiro (Jan 15 2021 at 08:54):

Something that we might be able to achieve along the lines of the old/new structure design, is if all instances are of the form ring.mk <semiring> <neg> <ring_mixin> where <ring_mixin> is either a conjunction or a list of props, then checking defeq here means checking the semirings (and if we're lucky and these are the syntactically equal then we're done), the neg, and then the ring mixin part is trivial because it's props (unless lean checks that the types match and then this could be quite expensive)

#### Mario Carneiro (Jan 15 2021 at 08:56):

This seems to be the main idea behind the new structure command, where by additional levels of bundling you increase your odds of getting lucky and happening on a syntactically equal segment before you have to unfold another layer, rather than having one big ring.mk <add> <one> <zero> <mul> <prop> <prop> <neg> <prop (with a type involving the previous)>

#### Mario Carneiro (Jan 15 2021 at 08:58):

However, the new structure command doesn't support diamonds, and this is an issue. As Leo et al have said, it's not hard to implement a diamond manually and/or automate them, but this doesn't solve this underlying problem, these diamond instances are still going to have to splat out all the fields and put them back together, meaning that defeq problems involving them will have to compare all the fields, we don't ever get lucky

#### Sebastien Gouezel (Jan 15 2021 at 09:00):

If we just have a flat inheritance structure for data (Field -> Ring -> Semiring) and Prop mixins that may involve as many diamonds as you want, I don't see why you would get a problem since Props are automatically defeq.

#### Mario Carneiro (Jan 15 2021 at 09:01):

Is the data diamond free? I don't see why it would be

#### Mario Carneiro (Jan 15 2021 at 09:01):

For example OrderedRing is going to be involved in a data diamond with OrderedAddGroup and Ring

#### Sebastien Gouezel (Jan 15 2021 at 09:03):

I want a flat inheritance structure for algebraic data, a flat inheritance structure for orders (in fact there should just be preorder) and mixins.

#### Mario Carneiro (Jan 15 2021 at 09:05):

I mean it might not be horrible but there will need to be some unpacking and repacking any way you have it

#### Mario Carneiro (Jan 15 2021 at 09:09):

so in the ordered ring example when you build one you need OrderedRing.mk <order> <ring> which is fine, but then the "diamond instance" will be OrderedRing -> OrderedAddGroup  since it's not a field in this arrangement, which has to be defined as OrderedAddGroup.mk R.to_order R.to_ring.to_group, and then when you get defeq problems about this instance you will have to unfold projections and so on

#### Johan Commelin (Jan 15 2021 at 09:09):

Mario Carneiro said:

on the assumption that we don't care about instances of HasZeroOneAddSubNegMul that don't satisfy weaker constraints than Ring, otherwise make a class for those weakest assumptions

Note that ring homs don't need the prop fields... in the Witt vector project we had to build some_fun and prove that it preserves the ring data on Witt (R), after that, we used it to check the ring axioms for Witt (R), making it into an actual ring, and finally we could bundle some_fun into a ring_hom.
If we have "lawless rings", such code would also become easier. But of course this is only one data point, and admittedly a rare one.

#### Sebastien Gouezel (Jan 15 2021 at 09:10):

There is something I don't know about the inner workings, and that would be important for this discussion. Suppose you have a class foo (with data) and a Prop class bar taking an argument [foo], maybe involving ten times the operation foo with different arguments. And then you have two instances of bar, depending on two instances of foo (that are defeq, but not obviously). Will Lean unfold everything in bar (i.e., unfold ten times the foo) to check that the two instances of bar are defeq, or will it be clever and say: I know it's enough to check that the two foo instances coincide, so let's not expand bar and just work on foo.

#### Mario Carneiro (Jan 15 2021 at 09:13):

It will try the first, but very often the two instances of foo aren't defeq, only all their fields are, and in the latter case lean will have to unfold bar (and any other derived functions bar uses...) to discover that in fact only fields off foo and not foo itself are used

#### Mario Carneiro (Jan 15 2021 at 09:14):

For example, if you have a ring variable and destructure it and put it back together you will end up with something that isn't defeq to the variable you started with

#### Mario Carneiro (Jan 15 2021 at 09:14):

and this is exactly what happens in parent instances

#### Mario Carneiro (Jan 15 2021 at 09:15):

(which, by the way, is a very good argument for "eta for structures")

#### Sebastien Gouezel (Jan 15 2021 at 09:29):

@Mario Carneiro , I'd be interested in your diagnosis on what is happening in the example that started this thread.

#### Mario Carneiro (Jan 15 2021 at 10:13):

I narrowed it down to the same refl problem as you.

import field_theory.normal

example {F : Type*} [field F] {q : polynomial F} [hyp1 : irreducible q] :
(@adjoin_root.comm_ring F (@euclidean_domain.to_comm_ring F (@field.to_euclidean_domain F _inst_1)) q)) :=
rfl


But the next question is: why is this particular refl problem hard? Presumably it has something to do with adjoin_root, since this is the type that is producing all the fields. On the one hand we have adjoin_root.field and on the other we have adjoin_root.comm_ring, so let's look at those:

instance : comm_ring (adjoin_root f) := ideal.quotient.comm_ring _

noncomputable instance field : field (adjoin_root f) :=
ideal.quotient.field (span {f} : ideal (polynomial K))


Uh oh. Those instances aren't obviously related at all! We have to go unfold a bunch more things to find out that the ring instance extends the field instance.

Let's make it a bit more manifest:

noncomputable instance field : field (adjoin_root f) :=
..ideal.quotient.field (span {f} : ideal (polynomial K)) }


And now the rfl example at the start is instant.

#### Mario Carneiro (Jan 15 2021 at 10:49):

By the way, one reason why using ideal.quotient.field directly is a bad idea here is because it's a (semireducible) def, meaning that when lean is trying to prove instances are equal it will unfold things with instance reducibility but stop at defs, which is usually a good thing but means here that it's first going to unfold everything else, the entire ring hierarchy on the other side, before it reluctantly unfolds ideal.quotient.field when it has no other options

#### Mario Carneiro (Jan 15 2021 at 10:50):

@Gabriel Ebner Is there a way to mark definitions as instance reducible without making them instances? I don't know any keyword combination for doing so

#### Gabriel Ebner (Jan 15 2021 at 10:51):

No, the definition of instance reducible is literally "has attribute instance".

dang

#### Gabriel Ebner (Jan 15 2021 at 10:51):

Luckily instance reducibility is gone in :four_leaf_clover: .

#### Mario Carneiro (Jan 15 2021 at 10:54):

I wonder if we should have a general (lintable?) rule that all instances should start with a structure constructor or another instance. That should make these situations much less likely

#### Gabriel Ebner (Jan 15 2021 at 10:55):

I believe we have some helper definitions to construct instances like docs#complete_lattice_of_Inf that violate this rule.

#### Mario Carneiro (Jan 15 2021 at 10:56):

The idea is that in such cases you use {..constructor}

#### Gabriel Ebner (Jan 15 2021 at 10:56):

Should we also do this for docs#complete_lattice.copy?

#### Gabriel Ebner (Jan 15 2021 at 10:56):

This is often used directly.

#### Mario Carneiro (Jan 15 2021 at 10:57):

actually in that particular case it would be {inf := inf, ..CL_of_inf}

#### Mario Carneiro (Jan 15 2021 at 10:57):

and similarly for copy

#### Mario Carneiro (Jan 15 2021 at 10:57):

you write explicitly all the data fields that you have something interesting to say about, and use ..struct for the rest

#### Mario Carneiro (Jan 15 2021 at 10:59):

We could make complete_lattice.copy @[reducible] to make it possible to do this setup inside the definition itself

#### Mario Carneiro (Jan 15 2021 at 11:00):

in fact maybe that's sufficient for this kind of problem: "instances" that are def should always be marked @[reducible]

#### Mario Carneiro (Jan 15 2021 at 11:00):

that sounds a little dangerous as a general rule though

#### Sebastien Gouezel (Jan 15 2021 at 13:01):

Does it mean that the instance docs#pi_Lp.emetric_space is a code smell, for instance? Or docs#pi_Lp.metric_space, for what it's worth?

#### Mario Carneiro (Jan 15 2021 at 13:11):

As long as you don't care about the particular values the instance provides, it's okay... but since I see a replace_uniformity I assume you at least care about the uniformity

#### Mario Carneiro (Jan 15 2021 at 13:12):

The second one is constructing data by tactics, which is also not good. I would suggest putting these in an auxiliary def and then using the trick I showed to extract the parts you care about in the instance

#### Mario Carneiro (Jan 15 2021 at 13:15):

Also FWIW it doesn't surprise me that there are many instances of this in mathlib, I'm sure I've written plenty. It's only just now coming to light as a pattern we should try to avoid

#### Mario Carneiro (Jan 15 2021 at 13:16):

and I'm not even sure my recommendations above are correct; splatting all the fields has some downsides (we're deliberately making the term bigger) and it remains to be seen if this is useful for lean

#### Thomas Browning (Jan 15 2021 at 18:58):

Just saw this thread. Should I PR this?

noncomputable instance field : field (adjoin_root f) :=
..ideal.quotient.field (span {f} : ideal (polynomial K)) }


#### Thomas Browning (Jan 15 2021 at 19:02):

(It does speed things up. I have a big proof which takes 20 seconds with local attribute [irreducible] ideal.quotient.comm_ring and 4 seconds with the above change)

#### Floris van Doorn (Jan 15 2021 at 19:47):

I think it's good to PR this. And we should probably try to do this systematically in mathlib for similar definitions.

One interesting thing I noticed it that it makes a huge difference whether the arguments are provided in the example, or as variables beforehand. Making them variables is much faster.

import field_theory.normal

-- slow
example  {K : Type*} [field K] {f : polynomial K} [irreducible f] : @ring.to_semiring (adjoin_root f) (@division_ring.to_ring (adjoin_root f)
(@adjoin_root.comm_ring K (@euclidean_domain.to_comm_ring K (@field.to_euclidean_domain K _inst_1)) f)) :=
rfl

variables {K : Type*} [field K] {f : polynomial K} [irreducible f]

-- fast
(@adjoin_root.comm_ring K (@euclidean_domain.to_comm_ring K (@field.to_euclidean_domain K _inst_1)) f)) :=
rfl


#5759

#### Bryan Gin-ge Chen (Jan 15 2021 at 20:11):

Floris van Doorn said:

And we should probably try to do this systematically in mathlib for similar definitions.

Is this something a linter could catch?

#### Floris van Doorn (Jan 15 2021 at 22:40):

I'm not sure, since I don't know what exactly is bad and what is good.

For example, @Mario Carneiro proposed the rule

I wonder if we should have a general (lintable?) rule that all instances should start with a structure constructor or another instance.

However, the following is bad (has worse rfl performance for me than the current instance), but satisfies Mario's rule:

noncomputable instance field : field (adjoin_root f) :=
{ ..ideal.quotient.field (span {f} : ideal (polynomial K)) }


#### Mario Carneiro (Jan 15 2021 at 22:43):

What's bad about that is that it adds new fields on top of a diamond without restating them. The lint would make you think twice about this but it doesn't know which fields you care about or what defeq expression you want to exhibit

#### Thomas Browning (Feb 10 2021 at 17:47):

Lean times out on the following code:

import field_theory.splitting_field

open polynomial

example {F : Type*} [field F] (n : ℕ) (x : F) : false :=
begin
let K := (X ^ n - C (1 : F)).splitting_field,
let L := (X ^ n - C x).splitting_field,
have key : (X ^ n - C (1 : F)).splits (algebra_map F (X ^ n - C x).splitting_field) := sorry,
let f : K →ₐ[F] L := splitting_field.lift (X ^ n - C (1 : F)) key,
letI : algebra K L := f.to_ring_hom.to_algebra,
--haveI := is_scalar_tower.of_ring_hom f,
--haveI : is_scalar_tower F K L := sorry,
haveI : is_scalar_tower F K L := is_scalar_tower.of_ring_hom f,
sorry,
end


However, Lean is fine with either of the two commented lines of code. Does anyone know what's going on?

#### Thomas Browning (Feb 22 2021 at 18:23):

@Patrick Lutz

Last updated: May 09 2021 at 09:11 UTC