Zulip Chat Archive

Stream: general

Topic: Slow definition

Sebastien Gouezel (Apr 08 2021 at 07:47):

I have a little riddle, from the file exterior_algebra. Consider the following two definitions:

import linear_algebra.exterior_algebra

variables (R : Type*) [comm_semiring R]
variables (M : Type*) [add_comm_monoid M] [semimodule R M]

set_option profiler true

def foo1 {S : Type*} [comm_ring S] [semimodule S M] : semiring (exterior_algebra S M) :=
let i : semiring (tensor_algebra S M) := infer_instance in
@ring_quot.semiring (tensor_algebra S M) i (exterior_algebra.rel S M)

def foo2 {S : Type*} [comm_ring S] [semimodule S M] : ring (exterior_algebra S M) :=
let i : ring (tensor_algebra S M) := infer_instance in
@ring_quot.ring (tensor_algebra S M) i (exterior_algebra.rel S M)

Here foo1 elaborates in 18ms on my computer, and foo2 in 18s. So a ratio of 1 to 1000...

Eric Wieser (Apr 08 2021 at 07:49):

This one was a mystery to me too

Eric Wieser (Apr 08 2021 at 07:50):

I was kinda fed up with not having a ring structure at all, and had such a hard time adding one that I lost interest in making it fast too!

Eric Wieser (Apr 08 2021 at 07:51):

~~src#exterior_algebra.ring has a few more lets than that, doesn't it?~~

Sebastien Gouezel (Apr 08 2021 at 07:51):

I noticed it because it gets even slower in #7084

Sebastien Gouezel (Apr 08 2021 at 07:52):

Eric Wieser said:

src#exterior_algebra.ring has a few more lets than that, doesn't it?

No, I copied the exact definition.

Eric Wieser (Apr 08 2021 at 07:52):

I thought that might be where this was going

Eric Wieser (Apr 08 2021 at 07:53):

Does #7084 change ring_quot (docs#ring_quot.semiring?) to add a non-auto_param nsmul?

Eric Wieser (Apr 08 2021 at 07:56):

I just noticed that ring_quot.ring uses the automatic has_sub - I'll see if changing that makes things better.

Sebastien Gouezel (Apr 08 2021 at 07:56):

It didn't initially, although it does now on my local branch (because there is also an algebra instance below, so to kill the potential diamond one needs to add the nsmul field). But it becomes too slow even without adding it.

Sebastien Gouezel (Apr 08 2021 at 09:19):

I found a way to speed up things, but I don't understand what's going on. The current ring structure on a quotient is:

instance {R : Type u₁} [ring R] (r : R → R → Prop) : ring (ring_quot r) :=
{ neg           := quot.map (λ a, -a)
    rel.neg,
  add_left_neg  := by { rintros ⟨⟩, exact congr_arg (quot.mk _) (add_left_neg _), },
  .. (ring_quot.semiring r) }

With this definition, my test example foo2 takes 18s to compile. If I change it to

instance {R : Type u₁} [ring R] (r : R → R → Prop) : ring (ring_quot r) :=
{ add           := quot.map₂ (+) rel.add_right rel.add_left,
  add_assoc     := by { rintros ⟨⟩ ⟨⟩ ⟨⟩, exact congr_arg (quot.mk _) (add_assoc _ _ _), },
  zero          := quot.mk _ 0,
  zero_add      := by { rintros ⟨⟩, exact congr_arg (quot.mk _) (zero_add _), },
  add_zero      := by { rintros ⟨⟩, exact congr_arg (quot.mk _) (add_zero _), },
  add_comm      := by { rintros ⟨⟩ ⟨⟩, exact congr_arg (quot.mk _) (add_comm _ _), },
  mul           := quot.map₂ (*) rel.mul_right rel.mul_left,
  mul_assoc     := by { rintros ⟨⟩ ⟨⟩ ⟨⟩, exact congr_arg (quot.mk _) (mul_assoc _ _ _), },
  one           := quot.mk _ 1,
  one_mul       := by { rintros ⟨⟩, exact congr_arg (quot.mk _) (one_mul _), },
  mul_one       := by { rintros ⟨⟩, exact congr_arg (quot.mk _) (mul_one _), },
  left_distrib  := by { rintros ⟨⟩ ⟨⟩ ⟨⟩, exact congr_arg (quot.mk _) (left_distrib _ _ _), },
  right_distrib := by { rintros ⟨⟩ ⟨⟩ ⟨⟩, exact congr_arg (quot.mk _) (right_distrib _ _ _), },
  neg           := quot.map (λ a, -a) rel.neg,
  add_left_neg  := by { rintros ⟨⟩, exact congr_arg (quot.mk _) (add_left_neg _) } }

(So, doing everything by hand instead of referring to the semiring structure), then this goes down to 8s.

If I trim it down to

instance {R : Type u₁} [ring R] (r : R → R → Prop) : ring (ring_quot r) :=
{ add           := quot.map₂ (+) rel.add_right rel.add_left,
  zero          := quot.mk _ 0,
  mul           := quot.map₂ (*) rel.mul_right rel.mul_left,
  one           := quot.mk _ 1,
  neg           := quot.map (λ a, -a) rel.neg,
  add_left_neg  := by { rintros ⟨⟩, exact congr_arg (quot.mk _) (add_left_neg _) },
  .. (ring_quot.semiring r) }

(Keeping only the data fields, and leaving the Prop fields to the semiring structure), this goes up again to 21s.

So the content of Prop-fields matters, which I don't get. And all this is to be compared to the 18ms for the semiring version.

TLDR: I don't understand anything to the inner workings of structure elaboration. Any advice on how to make things right / good practice rule would be most welcome!

Eric Wieser (Apr 08 2021 at 11:23):

Eric Wieser said:

I just noticed that ring_quot.ring uses the automatic has_sub - I'll see if changing that makes things better.

Done in #7112, although I didn't end up profiling it because the behavior seemed like something we'd want anyway

Sebastien Gouezel (Apr 08 2021 at 11:41):

I had done profiling, and it doesn't make any difference. Unfortunately, the speedup I mentioned earlier in the discussion is not enough for #7084, which is therefore stuck for now :-(

Sebastien Gouezel (Apr 08 2021 at 14:11):

I'm wondering if the solution here would not be to make more things irreducible: here the exterior algebra is a quotient of the tensor algebra, which itself is already a complicated quotient, so if Lean tries to unfold anything in exterior algebras the terms are astronomically big right away, which is probably not reasonable (and essentially pointless). Thoughts, anyone?

Sebastien Gouezel (Apr 08 2021 at 14:12):

(by "irreducible", I mean use structures just like in the definition of reals, not mark things irreducible after the fact).

Eric Wieser (Apr 08 2021 at 14:16):

Note that originally things were marked irreducible and that actually broke typeclass search

Kevin Buzzard (Apr 08 2021 at 15:07):

(So, doing everything by hand instead of referring to the semiring structure), then this goes down to 8s.

This makes no difference for me. On my desktop it takes around 9 seconds to elaborate foo2 below, and if I uncomment all the comments then it still takes around 9 seconds.

import linear_algebra.exterior_algebra
import algebra.ring_quot

variables (R : Type*) [comm_semiring R]
variables (M : Type*) [add_comm_monoid M] [semimodule R M]

set_option profiler true

def foo1 {S : Type*} [comm_ring S] [semimodule S M] : semiring (exterior_algebra S M) :=
let i : semiring (tensor_algebra S M) := infer_instance in
@ring_quot.semiring (tensor_algebra S M) i (exterior_algebra.rel S M)

universe u₁

open ring_quot

-- local attribute [-instance] ring_quot.ring

-- instance inst37 {R : Type u₁} [ring R] (r : R → R → Prop) : ring (ring_quot r) :=
-- { add           := quot.map₂ (+) rel.add_right rel.add_left,
--   add_assoc     := by { rintros ⟨⟩ ⟨⟩ ⟨⟩, exact congr_arg (quot.mk _) (add_assoc _ _ _), },
--   zero          := quot.mk _ 0,
--   zero_add      := by { rintros ⟨⟩, exact congr_arg (quot.mk _) (zero_add _), },
--   add_zero      := by { rintros ⟨⟩, exact congr_arg (quot.mk _) (add_zero _), },
--   add_comm      := by { rintros ⟨⟩ ⟨⟩, exact congr_arg (quot.mk _) (add_comm _ _), },
--   mul           := quot.map₂ (*) rel.mul_right rel.mul_left,
--   mul_assoc     := by { rintros ⟨⟩ ⟨⟩ ⟨⟩, exact congr_arg (quot.mk _) (mul_assoc _ _ _), },
--   one           := quot.mk _ 1,
--   one_mul       := by { rintros ⟨⟩, exact congr_arg (quot.mk _) (one_mul _), },
--   mul_one       := by { rintros ⟨⟩, exact congr_arg (quot.mk _) (mul_one _), },
--   left_distrib  := by { rintros ⟨⟩ ⟨⟩ ⟨⟩, exact congr_arg (quot.mk _) (left_distrib _ _ _), },
--   right_distrib := by { rintros ⟨⟩ ⟨⟩ ⟨⟩, exact congr_arg (quot.mk _) (right_distrib _ _ _), },
--   neg           := quot.map (λ a, -a) rel.neg,
--   add_left_neg  := by { rintros ⟨⟩, exact congr_arg (quot.mk _) (add_left_neg _) } }

def foo2 {S : Type*} [comm_ring S] [semimodule S M] : ring (exterior_algebra S M) :=
let i : ring (tensor_algebra S M) := infer_instance in
@ring_quot.ring (tensor_algebra S M) i (exterior_algebra.rel S M)

Kevin Buzzard (Apr 08 2021 at 15:09):

Oh in fact if I just make ring_quot.ring not an instance then it still takes 9 seconds -- in my set-up it doesn't seem to be using ring_quot.ring at all (this is on a relatively recent master -- are you doing something else?)

Sebastien Gouezel (Apr 08 2021 at 15:09):

You need to do this in the original file, before tensor_product is defined.

Kevin Buzzard (Apr 08 2021 at 15:09):

Got it!

Eric Wieser (Apr 08 2021 at 15:11):

Did you actually mean tensor_product and not tensor_algebra?

Kevin Buzzard (Apr 08 2021 at 15:32):

It's a universe issue.

import linear_algebra.exterior_algebra

variables (R : Type*) [comm_semiring R]
variables (M : Type*) [add_comm_monoid M] [semimodule R M]

set_option profiler true

def foo1 {S : Type*} [comm_ring S] [semimodule S M] : semiring (exterior_algebra S M) :=
let i : semiring (tensor_algebra S M) := infer_instance in
@ring_quot.semiring (tensor_algebra S M) i (exterior_algebra.rel S M)

-- Change {S : Type*} to {S : Type}
-- 455 ms
def foo2 {S : Type} [comm_ring S] [semimodule S M] : ring (exterior_algebra S M) :=
let i : ring (tensor_algebra S M) := infer_instance in
@ring_quot.ring (tensor_algebra S M) i (exterior_algebra.rel S M)

Eric Wieser (Apr 08 2021 at 15:33):

What if you put {S : Type*} in a separate variables declaration?

Kevin Buzzard (Apr 08 2021 at 15:34):

import linear_algebra.exterior_algebra

universes u v w
variables (R : Type u) [comm_semiring R]
variables (M : Type v) [add_comm_monoid M] [semimodule R M]

set_option profiler true

def foo1 {S : Type w} [comm_ring S] [semimodule S M] : semiring (exterior_algebra S M) :=
let i : semiring (tensor_algebra S M) := infer_instance in
@ring_quot.semiring (tensor_algebra S M) i (exterior_algebra.rel S M)

-- 455 ms
def foo2 {S : Type w} [comm_ring S] [semimodule S M] : ring (exterior_algebra S M) :=
let i : ring (tensor_algebra S M) := infer_instance in
@ring_quot.ring (tensor_algebra S M) i (exterior_algebra.rel S M)

This is also instant. It's {Type*} v {Type u} again.

Kevin Buzzard (Apr 08 2021 at 15:35):

import linear_algebra.exterior_algebra

variables (R : Type*) [comm_semiring R]
variables (M : Type*) [add_comm_monoid M] [semimodule R M]
variable {S : Type*}

set_option profiler true

def foo1 [comm_ring S] [semimodule S M] : semiring (exterior_algebra S M) :=
let i : semiring (tensor_algebra S M) := infer_instance in
@ring_quot.semiring (tensor_algebra S M) i (exterior_algebra.rel S M)

-- 484ms
def foo2 [comm_ring S] [semimodule S M] : ring (exterior_algebra S M) :=
let i : ring (tensor_algebra S M) := infer_instance in
@ring_quot.ring (tensor_algebra S M) i (exterior_algebra.rel S M)

As Eric suggests: if you strip it out it's quick even with Type*.

Eric Wieser (Apr 08 2021 at 15:36):

This is annoying, because I constantly run into universe issues where someone wrote Type u and mean Type something_other_than_u because guessing the universe variables already in the lemma was a pain for them.

In the past I've ended up stripping out universe variables entirely and replacing things with Type*. Only now we find that introduces subtler problems...

Damiano Testa (Apr 08 2021 at 15:38):

I have not encountered this issue much, but the first time it came for me was also in linear algebra. Could there be something specific to the linear algebra part of the library that makes these issues more prominent?

Kevin Buzzard (Apr 08 2021 at 15:44):

I think this was why universes were causing problems with some previous linear algebra thread:

universes u v w
-- this is the type of `flip`
#check Π {α : Type u} {β : Type v} {φ : Type w}, (α → β → φ) → β → α → φ
-- Type (max (u+1) (v+1) (w+1) (max u v w) v u w)

This type can be simplified, but you can't run the simplifier on universe levels :-)

Damiano Testa (Apr 08 2021 at 15:46):

Indeed, it was flip! Maybe it is the fact of having 3 interlaced Types that causes problems?

Kevin Buzzard (Apr 08 2021 at 15:46):

My instinct, probably not the correct one but it's what I tell students, is that whilst it's important to stay universe polymorphic, can you really imagine a situation where you have your ring in one universe and your module in another one? Put everything into one universe! And then of course people want to do modules over $\Z$ and they realise they have to do everything in Type.

Damiano Testa (Apr 08 2021 at 15:47):

... which is fine as it might even be inconsistent to have more than Type! :upside_down:

Kevin Buzzard (Apr 08 2021 at 16:21):

Here is the non-diamond which I think is causing the issue:

import linear_algebra.exterior_algebra

universes u v w
variables (R : Type*) [comm_semiring R]
variables (M : Type*) [add_comm_monoid M] [semimodule R M]

set_option profiler true

-- 10 seconds
def foo {S : Type*} [comm_ring S] [semimodule S M] :
  @tensor_algebra.semiring S (@comm_ring.to_comm_semiring S _inst_4) M _inst_2 _inst_5 =
  @ring.to_semiring (@tensor_algebra S (@comm_ring.to_comm_semiring S _inst_4) M _inst_2 _inst_5) (@tensor_algebra.ring M _inst_2 S _inst_4 _inst_5)
:= rfl

-- 1/2 second
def bar {S : Type w} [comm_ring S] [semimodule S M] :
  @tensor_algebra.semiring S (@comm_ring.to_comm_semiring S _inst_4) M _inst_2 _inst_5 =
  @ring.to_semiring (@tensor_algebra S (@comm_ring.to_comm_semiring S _inst_4) M _inst_2 _inst_5) (@tensor_algebra.ring M _inst_2 S _inst_4 _inst_5)
:= rfl

(foo and bar are the same apart from {S : Type*} changed to {S : Type w})

Eric Wieser (Apr 08 2021 at 16:30):

What does foo infer for the *?

Kevin Buzzard (Apr 08 2021 at 16:31):

{S : Type u_2}

Sebastien Gouezel (Apr 08 2021 at 16:32):

Thanks a lot for debugging this and noticing it was a universe issue! In the meantime, I have made this quotient ring construction irreducible, which I think is a good idea in any case because it hides some irrelevant complexity.

Kevin Buzzard (Apr 08 2021 at 16:33):

Thanks a lot for your diamond-killing! You have put in a lot of work recently with some really technical and messy issues and I am all too happy to help.

Sebastien Gouezel (Apr 08 2021 at 16:33):

After making things irreducible, foo takes 2 seconds and bartakes 0.5s, so it's better but the universe issues are still in the way.

Kevin Buzzard (Apr 08 2021 at 16:35):

@Eric Wieser as far as I can see foo and bar are identical even up to universes.

Kevin Buzzard (Apr 08 2021 at 16:38):

For some reason Lean is spending a long time solving for * in foo, and ultimately deciding that it can be anything.

Kevin Buzzard (Apr 08 2021 at 16:44):

If someone had told me 5 years ago I'd be working with an analyst on universe issues, I would have been extremely skeptical.

Eric Wieser (Apr 08 2021 at 16:45):

Kevin Buzzard said:

If someone had told me 5 years ago I'd be working with an analyst on universe issues, I would have been extremely skeptical.

Sounds like you're in a different universe now to the one you thought you were in

Sebastien Gouezel (Apr 08 2021 at 16:46):

I'm not really an analyst, you know, more of a dynamical systems/probability person. But still in a different universe, yes :-)

Sebastien Gouezel (Apr 08 2021 at 16:47):

And I've already written a paper relying on the classification of irreducible unitary representations of $SL(2,\R)$ , so that almost makes me an automorphic guy also :-)

Kevin Buzzard (Apr 08 2021 at 16:48):

They're defeq to projective 2-d reps of the Weil group of $\R$ I guess.

Sebastien Gouezel (Apr 08 2021 at 18:49):

Another weirdness I encountered while fixing the slowness. In the file data.mv_polynomial.equiv, on Line 255, there is

change algebra_map R (mv_polynomial S₁ (mv_polynomial S₂ R)) r with C (C r),

which is already slow, and becomes slower after the refactor. You can replace the change with an equivalent

  have A : algebra_map R (mv_polynomial S₁ (mv_polynomial S₂ R)) r = C (C r) := rfl

which takes 64s on my branch. With := by refl, same thing. With , by refl, it becomes instant (less than 200ms).

Sebastien Gouezel (Apr 08 2021 at 20:46):

Another weird example.

@[simp] lemma finite_field.char_poly_pow_card {K : Type*} [field K] [fintype K] (M : matrix n n K) :
  char_poly (M ^ (fintype.card K)) = char_poly M :=
begin
  by_cases hn : nonempty n,
  { haveI := hn,
    cases char_p.exists K with p hp, letI := hp,
    rcases finite_field.card K p with ⟨⟨k, kpos⟩, ⟨hp, hk⟩⟩,
    haveI : fact p.prime := ⟨hp⟩,
    dsimp at hk, rw hk at *,
    apply (frobenius_inj (polynomial K) p).iterate k,
    repeat { rw iterate_frobenius, rw ← hk },
    rw ← finite_field.expand_card,
    unfold char_poly, rw [alg_hom.map_det, ← is_monoid_hom.map_pow],
    apply congr_arg det,
    apply mat_poly_equiv.injective, swap, { apply_instance },
    rw [← mat_poly_equiv.coe_alg_hom, alg_hom.map_pow, mat_poly_equiv.coe_alg_hom,
          mat_poly_equiv_char_matrix, hk, sub_pow_char_pow_of_commute, ← C_pow],
    { exact mat_poly_equiv_eq_X_pow_sub_C (p ^ k) M },
    { exact (C M).commute_X } },
  { congr, apply @subsingleton.elim _ (subsingleton_of_empty_left hn) _ _, },
end

takes 30s on mathlib master. If I change the last three lines to

    { exact (id (mat_poly_equiv_eq_X_pow_sub_C (p ^ k) M) : _) },
    { exact (C M).commute_X } },
  { apply congr_arg, apply @subsingleton.elim _ (subsingleton_of_empty_left hn) _ _, },

it goes down to 2s.

This PR is getting me to meet all the corners of mathlib where weird things are going on!

Sebastien Gouezel (Apr 09 2021 at 19:42):

#7084 is finally building without timeouts. There is an issue with the tests for transport, though, since it can not transport just by itself the nsmul field (or it can not transport the proof that it is equal to the inductive nsmul). Can this be fixed in transport, or should we use another playground for transport tests (or do you think this PR was a bad idea to begin with)? @Scott Morrison , you wrote transport, right?

Scott Morrison (Apr 09 2021 at 23:11):

Yes. Sorry, it's been a while since I worked on transport. Last year I had a long todo list of further improvements, but I stopped working on it for a while. I'd like to get back to it, but I think for now we should just change the tests to avoid this situation.

Scott Morrison (Apr 09 2021 at 23:14):

Hmm.. Actually, I guess this fundamentally breaks transport... :-) I'll see how hard a fix would be.

Scott Morrison (Apr 09 2021 at 23:26):

@Sebastien Gouezel, do you know if there is some way to ask refine_struct to make use of optional values? This would solve the problem.

Scott Morrison (Apr 09 2021 at 23:47):

Ah, it was meant to all along, but #2319 accidentally broke it.

Scott Morrison (Apr 09 2021 at 23:50):

Okay #7143 will fix this problem. I'd suggest just merging it into #7084.

Scott Morrison (Apr 10 2021 at 12:51):

Unfortunately #7143 doesn't work at all, so I've marked it WIP, and am hoping for some help from Simon.

Sebastien Gouezel (Apr 10 2021 at 13:08):

I think I've found a workaround, switching to better requirements for nsmul that should be more in line with what transfer can do.

Sebastien Gouezel (Apr 11 2021 at 13:54):

I've now changed the definition of add_monoid in #7084. It is now

class add_monoid (M : Type u) extends add_semigroup M, add_zero_class M :=
(nsmul : ℕ → M → M := nsmul_rec)
(nsmul_zero' : ∀ x, nsmul 0 x = 0 . try_refl_tac)
(nsmul_succ' : ∀ (n : ℕ) x, nsmul n.succ x = nsmul n x + x . try_refl_tac)

so, just using addition and multiplication to express the properties one wants of nsmul. This is the kind of thing that transfer should like more, and indeed in the tests transfer has no difficulty transferring a semiring instance with this new definition. Some tests are failing, but just because the definition is not the same so there are more fields to fill. @Scott Morrison , I could fix the tests myself but I am not sure what they are testing, so it is maybe better if you have a look yourself. If you trust me to fix them, tell me and I'll do it!

Scott Morrison (Apr 11 2021 at 13:54):

You mean transport not transfer, right?

Scott Morrison (Apr 11 2021 at 13:55):

I'll look now.

Scott Morrison (Apr 11 2021 at 14:01):

I fixed the tests for equiv_rw, but am a bit stumped by the failure of the test for transport. It works for semiring at the top of the file, but then not at the bottom.

Scott Morrison (Apr 11 2021 at 14:12):

I've also fixed the tests for refine_struct. transport itself really shouldn't care how many fields there are: it works that out for the structure itself.

Scott Morrison (Apr 11 2021 at 14:16):

Oh!

Scott Morrison (Apr 11 2021 at 14:17):

The problem is that the second test for transport is transporting between nat and mynat. When it gets to the npow field, it can't tell the difference between the copies of nat that it is meant to transport to mynat (because they are the carrier type of the semiring), and the copies of nat that are referring to powers, and just be left invariant.

Scott Morrison (Apr 11 2021 at 14:19):

It's late here, but I will try to think about a good solution to this. Not sure at this point.

Sebastien Gouezel (Apr 11 2021 at 17:22):

Isn't it just that the test is unfortunate here? Doing the same with $\R$ instead of $\N$ should avoid this issue, right?

Scott Morrison (Apr 11 2021 at 23:09):

Yes.

Sebastien Gouezel (Apr 12 2021 at 11:20):

I have fixed the test by using the general semiring.map defined at the top of the file (which is defined for a generic ring and therefore not perturbed by the coincidental fact that our ring is $\N$ ). I hope it's ok.

Sebastien Gouezel (Apr 12 2021 at 11:25):

There is another test failing in #7084, for ring_exp (hopefully the last one!)

example (a b : ℚ) : (a * b) ^ 1000000 = (b * a) ^ 1000000 := by ring_exp
-- deep recursion was detected at 'replace' (potential solution: increase stack space in your system)

This is probably due to the fact that I changed the definition of powers (now, x ^ n.succ is definitionally x ^ n * x, while it used to be x * x^n). I don't understand the internals of ring_exp enough to be able to fix this. @Anne Baanen , do you think you could have a look or give me a hint where to look?

Anne Baanen (Apr 12 2021 at 11:30):

It's not supposed to use defeq to unfold pows, instead applying the pow_p_pf_succ family of lemmas. Those still typecheck, right?

Anne Baanen (Apr 12 2021 at 11:32):

Would it work to fix the previous commit, or should I wait for the new oleans to be built?

Sebastien Gouezel (Apr 12 2021 at 11:33):

Fixing the previous commit would be perfect.

Sebastien Gouezel (Apr 12 2021 at 11:34):

Yes, all the lemmas still typecheck, the problem only shows up in this specific test.

Sebastien Gouezel (Apr 12 2021 at 11:35):

Strangely enough, the test works if one replaces $\mathbb{Q}$ with $\N$ , or a general comm_ring.

Anne Baanen (Apr 12 2021 at 11:38):

The proof term looks like it typechecks too, so it is the last exact that fails.

Anne Baanen (Apr 12 2021 at 11:45):

~~I think it's a general bug that ring_exp happens to expose:~~

def my_op {α : Type*} [comm_semiring α] : α → α := λ a, a ^ 10000000000
example (a : ℚ) : my_op a = a ^ 100000000000 := rfl -- deep recursion detected at replace

Anne Baanen (Apr 12 2021 at 11:46):

Nevermind, miscopied the number of 0's

Anne Baanen (Apr 12 2021 at 11:57):

But overall yes, seems that this is an artifact of bad unfolding of the definition of npow. The following works:

local attribute [irreducible] npow_rec

example (a b : ℚ) : (a * b) ^ 1000000 = (b * a) ^ 1000000 := by ring_exp -- works

Anne Baanen (Apr 12 2021 at 12:00):

Ah, and changing elaboration order by going from example to lemma breaks the my_op example:

def my_op {α : Type*} [comm_semiring α] : α → α := λ a, a ^ 10000
lemma fooQ (a : ℚ) : my_op a = a ^ 10000 := rfl -- deep recursion detected at replace
lemma fooN (a : ℕ) : my_op a = a ^ 10000 := rfl -- works

local attribute [irreducible] npow_rec

lemma fooQ' (a : ℚ) : my_op a = a ^ 10000 := rfl -- works

Anne Baanen (Apr 12 2021 at 12:04):

Is there an easy way to suggest to lean "when unifying @npow α inst1 x n =?= @npow α inst2 x n, try unifying inst1 with inst2 before unfolding the whole thing"?

Anne Baanen (Apr 12 2021 at 12:04):

(I guess making npow_rec irreducible is close enough.)

Anne Baanen (Apr 12 2021 at 12:46):

Here's my attempt at fixing this by making npow_rec irreducible (doesn't build yet because there are still some uses of reducible npow_rec): https://github.com/leanprover-community/mathlib/commit/7dc0d02c8610d881cd33f1605c3315518484b8fb

If we make npow_rec irreducible, we can't prove its properties anymore by try_refl_tac in the instances. I made a new npow_zero_tac and npow_succ_tac that try to apply the correct definitional lemmas. Unfolding npow_rec in try_refl_tac doesn't work always, since it may appear in a form like div_inv_monoid.npow._default.

Sebastien Gouezel (Apr 12 2021 at 14:00):

What I don't get is why it didn't show up before. Is it that one definition x^(n+1) = x * x^n is optimized by tail-recursion and the other x^(n+1) = x^n * x is not? If this is the case, I should definitely switch back to the original definition. I'd like to avoid putting npow_rec irreducible, because being able to compute small powers by rfl on $\mathbb{N}$ or $\mathbb{Q}$ seems desirable.

Anne Baanen (Apr 12 2021 at 14:37):

The issue is not with the definition of npow itself, it's the fact that there are many npow fields available, so to unify some_class_extending_monoid.npow with some_other_class_extending_monoid.npow we need to unfold both sides a couple of times. But if one side has a shorter path to npow_rec than another, it will unfold the definition of npow_rec, actually calculating the outcome.

Anne Baanen (Apr 12 2021 at 14:38):

So the better comparison is: why doesn't this happen for the (+) operator on ℕ?

Anne Baanen (Apr 12 2021 at 14:40):

Is it just that npow causes the terms to grow exponentially?

Sebastien Gouezel (Apr 12 2021 at 14:40):

I think it's really a tail-recursion optimization. Look at the following:

def mypow1 {α : Type*} [comm_semiring α] : ℕ → α → α
| 0 x := 1
| (n+1) x := mypow1 n x * x

def haspow1 : has_pow ℚ ℕ := ⟨λ x n, mypow1 n x⟩
def my_op1 {α : Type*} [comm_semiring α] : α → α := λ a, mypow1 10000 a
lemma foo1 (a : ℚ) : my_op1 a = @has_pow.pow ℚ ℕ haspow1 a 10000  := rfl -- deep recursion detected at replace

def mypow2 {α : Type*} [comm_semiring α] : ℕ → α → α
| 0 x := 1
| (n+1) x := x * mypow2 n x

def haspow2 : has_pow ℚ ℕ := ⟨λ x n, mypow2 n x⟩
def my_op2 {α : Type*} [comm_semiring α] : α → α := λ a, mypow2 10000 a
lemma foo2 (a : ℚ) : my_op2 a = @has_pow.pow ℚ ℕ haspow2 a 10000  := rfl -- works

Exact same definition for both, but one with left multiplication and the other one with right multiplication. One works, the other fails.

Sebastien Gouezel (Apr 12 2021 at 14:41):

I'll switch back to the other direction, even if it makes some things more painful.

Sebastien Gouezel (Apr 12 2021 at 14:41):

(And I'll add a comment explaining why it has to be done this way).

Anne Baanen (Apr 12 2021 at 14:41):

Huh, but the recursive call isn't in tail position for those definitions.

Sebastien Gouezel (Apr 12 2021 at 14:42):

It might depend on how multiplication is defined, no?

Anne Baanen (Apr 12 2021 at 14:44):

This is always a "proper" tail-recursive definition, but still fails:

def mypow3 {α : Type*} [comm_semiring α] : ℕ → α → α → α
| 0 x acc := acc
| (n+1) x acc := mypow3 n x (acc * x)

def haspow3 : has_pow ℚ ℕ := ⟨λ x n, mypow3 n x 1⟩
def my_op3 {α : Type*} [comm_semiring α] : α → α := λ a, mypow3 10000 a 1
lemma foo3 (a : ℚ) : my_op3 a = @has_pow.pow ℚ ℕ haspow3 a 10000 := rfl -- deep recursion detected at replace

Sebastien Gouezel (Apr 12 2021 at 14:48):

Really weird...

Anne Baanen (Apr 12 2021 at 14:50):

Looking at the set_option trace.type_context.is_def_eq_detail true, for some reason Lean really wants to unfold only the mypow3 on the right hand side:

[type_context.is_def_eq_detail] [2]: a ^ 10000 =?= my_op3 a
[type_context.is_def_eq_detail] unfold right: my_op3
[type_context.is_def_eq_detail] [3]: a ^ 10000 =?= mypow3 10000 a 1
[type_context.is_def_eq_detail] unfold right: mypow3
[type_context.is_def_eq_detail] [4]: a ^ 10000 =?= mypow3 (5000.add (2500.add (1250.add (62
5.add (624.add 0))))) a (1 * a)
[type_context.is_def_eq_detail] unfold right: mypow3
[type_context.is_def_eq_detail] [5]: a ^ 10000 =?= mypow3 (5000.add (2500.add (1250.add (62
5.add (312.add (156.add (78.add (39.add (38.add 0))))))))) a (1 * a * a)
[type_context.is_def_eq_detail] unfold right: mypow3
[type_context.is_def_eq_detail] [6]: a ^ 10000 =?= mypow3 (5000.add (2500.add (1250.add (62
5.add (312.add (156.add (78.add (39.add (19.add (18.add 0)))))))))) a
  (1 * a * a * a)
[type_context.is_def_eq_detail] unfold right: mypow3
[type_context.is_def_eq_detail] [7]: a ^ 10000 =?= mypow3 (5000.add (2500.add (1250.add (62
5.add (312.add (156.add (78.add (39.add (19.add (9.add (8.add 0))))))))))) a
  (1 * a * a * a * a)
[type_context.is_def_eq_detail] unfold right: mypow3
[type_context.is_def_eq_detail] [8]: a ^ 10000 =?= mypow3
  (5000.add
     (2500.add (1250.add (625.add (312.add (156.add (78.add (39.add (19.add (9.add (4.add (
2.add (1.add 0)))))))))))))
  a
  (1 * a * a * a * a * a)
[type_context.is_def_eq_detail] unfold right: mypow3
...

Anne Baanen (Apr 12 2021 at 14:52):

(From the Lean source: because mypow3 can be delta-reduced, and has_pow.pow cannot?)

Anne Baanen (Apr 12 2021 at 14:54):

So we actually want non-tail recursive definitions? It looks like, for foo2, they block the delta-reducing:

[type_context.is_def_eq_detail] [2]: a ^ 10000 =?= my_op2 a
[type_context.is_def_eq_detail] unfold right: my_op2
[type_context.is_def_eq_detail] [3]: a ^ 10000 =?= mypow2 10000 a
[type_context.is_def_eq_detail] unfold right: mypow2
[type_context.is_def_eq_detail] [4]: a ^ 10000 =?= a * mypow2 (5000.add (2500.add (1250.add (625.add (624.add 0))))) a
[type_context.is_def_eq_detail] [5]: (λ (x : ℚ) (n : ℕ), mypow2 n x) a 10000 =?= distrib.mul a (mypow2 (5000.add (2500.add (1250.add (625.add (624.add 0))))) a)
[type_context.is_def_eq_detail] after whnf_core: mypow2 10000 a =?= distrib.mul a (mypow2 (5000.add (2500.add (1250.add (625.add (624.add 0))))) a)
[type_context.is_def_eq_detail] unfold left: mypow2
[type_context.is_def_eq_detail] [6]: a * mypow2 (5000.add (2500.add (1250.add (625.add (624.add 0))))) a =?= distrib.mul a (mypow2 (5000.add (2500.add (1250.add (625.add (624.add 0))))) a)
[type_context.is_def_eq_detail] [7]: distrib.mul a (mypow2 (5000.add (2500.add (1250.add (625.add (624.add 0))))) a) =?= semiring.mul a (mypow2 (5000.add (2500.add (1250.add (625.add (624.add 0))))) a)
[type_context.is_def_eq_detail] [8]: semiring.mul a (mypow2 (5000.add (2500.add (1250.add (625.add (624.add 0))))) a) =?= comm_semiring.mul a (mypow2 (5000.add (2500.add (1250.add (625.add (624.add 0))))) a)
[type_context.is_def_eq_detail] [9]: comm_semiring.mul a (mypow2 (5000.add (2500.add (1250.add (625.add (624.add 0))))) a) =?= comm_ring.mul a (mypow2 (5000.add (2500.add (1250.add (625.add (624.add 0))))) a)
[type_context.is_def_eq_detail] [10]: comm_ring.mul a (mypow2 (5000.add (2500.add (1250.add (625.add (624.add 0))))) a) =?= euclidean_domain.mul a (mypow2 (5000.add (2500.add (1250.add (625.add (624.add 0))))) a)
[type_context.is_def_eq_detail] [11]: euclidean_domain.mul a (mypow2 (5000.add (2500.add (1250.add (625.add (624.add 0))))) a) =?= a * mypow2 (5000.add (2500.add (1250.add (625.add (624.add 0))))) a
[type_context.is_def_eq_detail] [12]: a * mypow2 (5000.add (2500.add (1250.add (625.add (624.add 0))))) a =?= distrib.mul a (mypow2 (5000.add (2500.add (1250.add (625.add (624.add 0))))) a)
[type_context.is_def_eq_detail] [13]: distrib.mul a (mypow2 (5000.add (2500.add (1250.add (625.add (624.add 0))))) a) =?= ring.mul a (mypow2 (5000.add (2500.add (1250.add (625.add (624.add 0))))) a)
[type_context.is_def_eq_detail] [14]: ring.mul a (mypow2 (5000.add (2500.add (1250.add (625.add (624.add 0))))) a) =?= division_ring.mul a (mypow2 (5000.add (2500.add (1250.add (625.add (624.add 0))))) a)
[type_context.is_def_eq_detail] [15]: division_ring.mul a (mypow2 (5000.add (2500.add (1250.add (625.add (624.add 0))))) a) =?= field.mul a (mypow2 (5000.add (2500.add (1250.add (625.add (624.add 0))))) a)
[type_context.is_def_eq_detail] [16]: field.mul a (mypow2 (5000.add (2500.add (1250.add (625.add (624.add 0))))) a) =?= a.mul (mypow2 (5000.add (2500.add (1250.add (625.add (624.add : ...

Eric Wieser (Apr 12 2021 at 16:26):

Would an id_delta help? edit: I can't see a way

Sebastien Gouezel (Apr 12 2021 at 16:30):

I'll just switch to the mypow2 definition, which works fine.

Eric Wieser (Apr 12 2021 at 16:33):

This is probably worth filing as a github issue since you have a MWE, just so it doesn't get forgotten

Anne Baanen (Apr 12 2021 at 19:49):

lean#563

Last updated: May 02 2025 at 03:31 UTC