Zulip Chat Archive

Stream: lean4

Topic: PR welcome?


Sebastien Gouezel (Jan 05 2021 at 19:52):

There is an annoyance in Lean 3, that nat multiplication n * p is defined by induction on the right variable p. This is an annoyance because a general monoid has an -module instance, where n • x is defined by induction on n, so on we get two -modules instance (the one for a general monoid, and the other one from the fact that any monoid is a module over itself) for which the multiplications are not defeq. And having two instances that are not defeq is something that we should avoid as much as possible. (For now, it means that we have to jump through hoops sometimes). The problem would disappear if nat multiplication were defined by induction on the left variable.

Lean 4 multiplication is defined in the same way by induction on the right variable p, see https://github.com/leanprover/lean4/blob/597304db959b5d1fb69d10d2a15d405f637d19f6/src/Init/Prelude.lean#L481. @Sebastian Ullrich , would you consider a PR changing this to induction on the left-variable if it doesn't create problems elsewhere? (I'll perfectly understand if the answer is no, but if the answer can be yes then the sooner the change is made, the better, as it means fewer fixes down the road).

Eric Wieser (Jan 05 2021 at 19:55):

I think the nat-semimodule diamond is inevitable whatever we do

Eric Wieser (Jan 05 2021 at 19:56):

It exists also for linear_map and tensor_product, and once I clean up my local changes will exist for subtype too

Sebastien Gouezel (Jan 05 2021 at 19:58):

Eric Wieser said:

I think the nat-semimodule diamond is inevitable whatever we do

When you say "diamond", I guess you mean two instances that are not defeq, right? Changing the definition of multiplication on nat should fix it, as you would use literally the same definition in the two instances, no?

Leonardo de Moura (Jan 05 2021 at 20:28):

This kind of modification can only be performed by developers. That being said, we are open to listening to requests. We also expect consensus between users before we even consider this kind of change.

Sebastien Gouezel (Jan 05 2021 at 20:36):

Leonardo de Moura said:

We also expect consensus between users before we even consider this kind of change.

Yes, of course. Thanks for your answer!

Mac (Mar 09 2021 at 19:07):

I think this is a result of the fact that Lean's Nat is based off the Peano naturals, whose multiplication operation is defined by induction on the right variable. So there is tension between the monoid definition and Peano definition.

Sebastien Gouezel (Mar 09 2021 at 19:44):

In the Peano naturals, you can define multiplication by induction on the right variable or on the left variable, as you prefer. In mathematical practice, we often write n + 1, n + 2, and so on (much more often than 1 + n, 2 + n and so on), but on the other hand we often write 1 * n, 2 * n, 3 * n and so on (much more often than n * 1, n * 2, n * 3 and so on), so in everyday practice there is a funny asymmetry between addition and multiplication. If Lean 4 could reflect this everyday practice in its definition of addition and multiplication, this would be nice (for defeq purposes with small numbers, and for defeqness of the natural monoid addition), but it's not a big deal in any case.

Leonardo de Moura (Mar 09 2021 at 19:58):

@Sebastien Gouezel Have you discussed this change with the rest of the mathlib community? For example, have anyone tried this change on the Lean 3 community branch and mathlib? If yes, did it improve things or not?

Johan Commelin (Mar 09 2021 at 20:00):

Related: Reid refactored the has_pow instance for nat to match the one for monoids. This required a change in core lean3. It generally improved things for mathlib.

Sebastien Gouezel (Mar 09 2021 at 20:08):

No, we haven't tested the change in Lean 3, although most people agreed it would probably be an improvement with respect to some issues we have.

Eric Wieser (Mar 09 2021 at 20:28):

I almost wonder if we should remove the global nat-module instance and just have lemmas that use nat-smul take a typeclass argument

Eric Wieser (Mar 09 2021 at 20:33):

(to be clear, I don't oppose the change, just the claim that it solves enough of these diamonds to be interesting)

Eric Wieser (Mar 09 2021 at 20:33):

Do we have a thread on this outside #lean4? I think I remember one

Sebastien Gouezel (Mar 09 2021 at 20:53):

If it only solves one diamond (and it does) and if it is more natural from the mathematical point of view (and it is), then it is a net improvement over the situation we have, although not a big one I agree. But this discussion is moot for Lean 4 while we haven't tried it on Lean 3.

Mac (Mar 10 2021 at 02:04):

Sebastien Gouezel said:

In the Peano naturals, you can define multiplication by induction on the right variable or on the left variable, as you prefer.

True, you could. You could also flip the module definition. My point was simply that the standard formulation of the Peano axioms uses induction on the right variable. Thus, if the nat multiplication uses induction on the left variable it would no longer be defeq to the Peano naturals. So changing it would create a similar problem elsewhere. This happened to be particularly notable to me, since I am currently doing some proof work in Lean using Peano logic.

Kevin Buzzard (Mar 10 2021 at 06:49):

I don't care about any "standard" definition of the Peano naturals, clearly both definitions work and satisfy the same theorems. This is an implementation issue and history doesn't help with this -- for all we know Peano just chose randomly. I think Sebastién has a valid point and it's worth experimenting. I don't know if it's related but to_additive has problems relating pow gng^n with smul ngn \cdot g because the order is switched. This makes me think the naturals are acting on the wrong side for one of these definitions.

Sebastien Gouezel (Mar 22 2021 at 16:03):

Finally we have tried the other nat (and int) multiplication in Lean 3. By "we", I mean @Yakov Pechersky , who's fixed everything in Lean core lean#551, and then in mathlib #6773. Thanks a lot for this!

To recap: currently, in Lean 3, nat addition x + y is defined by induction on y (so that x + 0 = x is definitional), and nat multiplication also (so that x * 1 = x is definitional -- well, this is not true but it could be true with a little tweak of the definition). This creates a few problems down the road, where we say that any type with 0 and + has a nat action n • x defined by induction on n, and this gives two different actions of nat on itself, where multiplication is defined by induction on the first variable in one case, on the second variable in the other case.

The PR redefines nat multiplication to be by induction on the first variable (but addition remains an induction on the second variable). With this, our diamond problem disappears, and also 1 * n = n becomes definitional, which is nice because in maths we write much more often 1 * n or 2 * n than n * 1 or n * 2, for cultural reasons or whatever. And also for int, in the same way. Also, the new definition is given by hand with nat.rec_on instead of using the equation compiler because the term produced by the equation compiler is a little bit messy.

If I understand correctly, the refactor broke a few proofs that were probably relying too much on defeqness of multiplication, so resulting in more robust proofs. A more serious issue is with a technique converting automatically proofs from addition to multiplication or vice versa, through the additive or multiplicative type synonyms. Some of these proofs broke, because the definition of addition and multiplication are now different. I am not sure it matters a lot because this additive and multiplicative stuff is more or less a hack, and it is not clear how it will work in Lean 4 (there is another thread on this, I am not sure we came to a conclusion about this).

For me, the pros of the change outweigh the cons, and by a rather big margin. I'd be curious to hear what others think. Of course, in the end, @Leonardo de Moura has the final call on what he does in Lean 4, and I think we will merge @Yakov Pechersky 's refactor if and only if Leo decides that changing the definition of nat and int multiplication is a good move for Lean 4. (With heterogeneous multiplication in Lean 4 it would be even easier to do it uniformly).

@Yakov Pechersky , maybe you want to comment more on how the refactor did go?

Yakov Pechersky (Mar 22 2021 at 16:16):

The motivating example @Sebastien Gouezel wanted to work was

import algebra.module.basic

example : @add_comm_group.int_module.{0} int (@ring.to_add_comm_group _ int.ring) =
  @semiring.to_semimodule.{0} int
    (@comm_semiring.to_semiring.{0} int (@comm_ring.to_comm_semiring.{0} int int.comm_ring)) := rfl

which is now true for this branch.

Regarding the refactor, it was almost entirely having to rephrase proofs that relied on n * m.succ being defeq to n * m + m.succ or proofs that relied on coercions being automatically applied to a particular level. More work was necessary to rephrase the additive/multiplicative proofs -- I was inspired by some of the formalization tooling @Gabriel Ebner has in #6045. I definitely got a large appreciation for proofs that rely on rewriting instead of "deep-rfl". The "deep-rfl" are the most brittle to such a refactor, particularly the ones that are used in a proof that is not part of the "foundational" API.

I think that this refactor would change what some users are familiar with in terms of which terms are defeq and what "cute" proof-terminating strategies one can use. But I think in general it can help unify the "nat"-action and "int"-action with regular nat-mul and int-mul. I did not do any of that sort of refactoring other than now utilizing those facts (rewriting, not defeq) and providing the instance test example above.

Mario Carneiro (Mar 22 2021 at 21:37):

Sebastien Gouezel said:

A more serious issue is with a technique converting automatically proofs from addition to multiplication or vice versa, through the additive or multiplicative type synonyms. Some of these proofs broke, because the definition of addition and multiplication are now different.

Can you say more about what the issue is?

Sebastien Gouezel (Mar 22 2021 at 21:42):

A random example from @Yakov Pechersky 's PR:

lemma sum_multiset_map_count [decidable_eq α] (s : multiset α)
  {M : Type*} [add_comm_monoid M] (f : α  M) :
  (s.map f).sum =  m in s.to_finset, s.count m  f m :=
-- @prod_multiset_map_count _ _ _ (multiplicative M) _ f
begin
  convert @prod_multiset_map_count _ _ _ (multiplicative M) _ f,
  try { ext, convert of_add_nsmul _ _ }
end

The old proof is the commented line. Here, of_add_nsmul is

lemma of_add_nsmul [add_monoid A] (x : A) (n : ) :
  multiplicative.of_add (n  x) = (multiplicative.of_add x)^n :=
-- rfl
begin
  induction n with n hn,
  { simp },
  { rw [succ_nsmul, of_add_add, hn, pow_succ] }
end

where the old proof was rfl, but the new proof is not. Maybe changing the definition of the power function could also fix this.


Last updated: Dec 20 2023 at 11:08 UTC