Zulip Chat Archive

Stream: new members

Topic: Contributing to mathlib3

Nick Pilotti (Sep 14 2021 at 21:36):

I want to try contributing an undergraduate topic into mathlib. So far I have been learning with Lean 4 and I've found that I prefer it a lot more than Lean 3. Is it still worth contributing to mathlib3 if everything is getting moved over into mathlib4?

Kevin Buzzard (Sep 14 2021 at 22:07):

mathlib3 a.k.a. mathlib is being moved over into lean 4, so sure it's worth contributing to that. mathlib4 is just some fun experimentation and the current plan is that this is not going to be Lean4 maths library.

Nick Pilotti (Sep 14 2021 at 22:14):

Ok, thanks. I noticed everything in mathlib is proved in the highest level of abstraction possible. Can I still contribute if I don't know enough abstract math to do that? Areas I was thinking of are matrices and complex analysis

Kevin Buzzard (Sep 14 2021 at 22:34):

Unfortunately complex analysis is being formalised in mathlib in quite a complicated way; there is a plan which is slowly being executed, and it involves a lot of abstract maths. Matrices on the other hand are in mathlib already.

Nick Pilotti (Sep 14 2021 at 22:42):

I've looked at the matrix files just a little bit. What about any of these topics that are listed on the undergrad todo page?
change of basis, rank of a matrix, elementary row operations, elementary column operations, Gaussian elimination, row-reduced matrices.

Nick Pilotti (Sep 14 2021 at 22:43):

Out of curiosity, how is complex analysis being formalized? Which math does it involve?

Kevin Buzzard (Sep 14 2021 at 22:43):

Matrices: Oh if these are not done then they're definitely fair game! But I would definitely discuss any specific plans you have for this with experts before you launch into anything.

Kevin Buzzard (Sep 14 2021 at 22:44):

complex analysis: the theory of integration and Cauchy's integral formula are being deduced from more general results involving some n-dimensional version of Stokes' theorem in real analysis.

Yakov Pechersky (Sep 14 2021 at 23:59):

Elementary operations and a form of Gaussian elimination were recently added in #8898

ccn (Feb 16 2022 at 05:40):

Hi, I'd like to contribute some things to mathlib here's what I'd like to add:

Generalized binomial coefficients:
image.png
Multinomial coefficients
image.png

Based on this conversation I have a good idea of what arguments the multinomial coefficient could take in:

image.png

Once we have these new types of coefficients I could possibly extend this to add in the multinomial theorem as well as newtons binomial theorem which extends the usual binomial theorem to work for real numbers as well.

I have a few questions:

I'm aware that we already have docs#nat.choose and docs#add_pow. Where would these new definitions go?

I was trying to get an idea of how to define the new coefficients, so I took a look at the current definition we have for nat.choose:

def choose : ℕ → ℕ → ℕ
| _             0 := 1
| 0       (k + 1) := 0
| (n + 1) (k + 1) := choose n k + choose n (k + 1)

Am I right that this is trying to pattern match in the following way?

Anything choose 0 is 1,
0 choose some number is 0
finally (n + 1 ) choose (k + 1) = n choose k + n choose (k + 1) (recursive definition)

Kyle Miller (Feb 16 2022 at 05:49):

What's your goal? Are you wanting to prove Newton's binomial series expansion or something like that?

Stuart Presnell (Feb 16 2022 at 06:05):

That’s the right way to read the definition, yes

Kyle Miller (Feb 16 2022 at 06:10):

Here's a branch containing the multinomial coefficient, using a list for [k1, k2, ..., kn]. https://github.com/leanprover-community/mathlib/blob/binomial/src/data/nat/choose/multinomial.lean#L12

ccn (Feb 16 2022 at 06:11):

Well right now I have questions I'm working with that use multinomial coefficients and the generalized binomial coefficients, once we have them then I could try to prove newtons expansion yes.

Kyle Miller (Feb 16 2022 at 06:35):

Ideally the generalized binomial coefficient could take as input anything that has multiplication, subtraction by naturals, and division by naturals. That way you could both plug in real numbers directly or get a polynomial out of it.

I'm not sure how to go about setting that up in that generality. Maybe it's not necessary, but I was trying to get it so that it could still take in natural numbers.

Kyle Miller (Feb 16 2022 at 06:39):

One thing I was experimenting with (but it doesn't help with division by naturals) is add-cancellative commutative semirings. They have a unique subtraction operation -- but giving a default instance for this seems problematic diamond-wise...

code

set_option old_structure_cmd true

@[protect_proj, ancestor comm_semiring add_left_cancel_monoid]
class add_cancel_comm_semiring (α : Type*) extends comm_semiring α, add_left_cancel_monoid α

@[priority 0]
noncomputable
instance comm_semiring.has_sub {α : Type*} [add_cancel_comm_semiring α] :
  has_sub α :=
⟨λ a b, by classical; exactI if h : ∃ c, a = b + c then h.some else 0⟩

lemma comm_semiring.sub_unique {α : Type*} [add_cancel_comm_semiring α]
  {a b c : α} (h : a = b + c) : c = a - b :=
begin
  classical,
  simp only [has_sub.sub],
  split_ifs with h',
  { subst h,
    have h'' := h'.some_spec,
    exact add_left_cancel h'', },
  { exact (h' ⟨c, h⟩).elim, },
end

instance : add_cancel_comm_semiring ℕ :=
{ add_left_cancel := begin
    apply add_left_cancel,
  end,
  ..(infer_instance : comm_semiring ℕ) }

Kyle Miller (Feb 16 2022 at 06:52):

Oh right, to avoid problems the subtraction can be required as additional data.

code

set_option old_structure_cmd true

@[protect_proj, ancestor comm_semiring add_left_cancel_monoid]
class add_cancel_comm_semiring (α : Type*)
  extends comm_semiring α, add_left_cancel_monoid α, has_sub α :=
(sub_eq_of_eq_add : ∀ (a b c : α), a = b + c → a - b = c)

instance : add_cancel_comm_semiring ℕ :=
{ sub_eq_of_eq_add := begin
    rintro a b c rfl,
    rw [add_comm, nat.add_sub_cancel],
  end,
  add_left_cancel := λ a b c, add_left_cancel,
  ..(infer_instance : comm_semiring ℕ),
  ..(infer_instance : has_sub ℕ)}

noncomputable
def comm_semiring.has_sub {α : Type*} [comm_semiring α] :
  has_sub α :=
⟨λ a b, by classical; exactI if h : ∃ c, a = b + c then h.some else 0⟩

lemma comm_semiring.sub_unique {α : Type*} [comm_semiring α]
  (h : ∀ {a b c : α}, a + b = a + c → b = c)
  {a b c : α} (h : a = b + c) : c = @has_sub.sub _ comm_semiring.has_sub a b :=
begin
  classical,
  change _ = dite _ _ _,
  split_ifs with h',
  { subst h,
    have h'' := h'.some_spec,
    exact h h'', },
  { exact (h' ⟨c, h⟩).elim, },
end

Kevin Buzzard (Feb 16 2022 at 08:32):

Why not just stick to ℚ-algebras? All the important examples like the reals and complexes and p-adics (hypergeometric functions etc) are ℚ-algebras.

Eric Wieser (Feb 17 2022 at 13:51):

Is "Q-algebra" and "char-zero division_ring" synonymous?

Riccardo Brasca (Feb 17 2022 at 14:08):

No reasons for a Q-algebra to be a division ring.

Kyle Miller (Feb 17 2022 at 16:25):

If anyone wants to work on this, here's a possible definition of the generalized binomial coefficient for ℚ-algebras. There's part of a proof that it coincides with nat.choose for natural numbers, though maybe it can be generalized to ℚ-algebras in general.

import algebra.ring.basic
import algebra.algebra.basic
import tactic

def binomial {α : Type*} [semiring α] [algebra ℚ α] : α → ℕ → α
| x 0 := 1
| x (n+1) := ((n+1)⁻¹ : ℚ) • x * binomial (x + (-1 : ℚ) • 1) n

lemma binomial_eq (m n : ℕ) : binomial (m : ℚ) n = nat.choose m n :=
begin
  induction n with n ih generalizing m,
  simp [binomial],
  cases m,
  { simp [binomial],
    -- (↑n + 1)⁻¹ • 0 = 0
    sorry, },
  { simp [binomial, ih],
    norm_cast,
    rw [nat.succ_mul_choose_eq],
    -- (↑(n + 1))⁻¹ • ↑(m.succ.choose n.succ * n.succ) = ↑(m.succ.choose n.succ)
    sorry, },
end

-- maybe?
lemma binomial_eq' {α : Type*} [comm_semiring α] [algebra ℚ α] (m n : ℕ) :
  binomial (m : α) n = nat.choose m n := sorry

Callum Cassidy-Nolan (Feb 20 2022 at 01:17):

Ok, I think that definition looks good, ( I can start with that) and then try to complete those lemmas. Since I'm still pretty new to contributing to mathlib, could you outline how I would get it in there?

So far what I know is that I would fork the project make a branch, but then would this content go into it's own file?

Callum Cassidy-Nolan (Feb 20 2022 at 01:18):

What would I call that file?

Yaël Dillies (Feb 20 2022 at 01:20):

What about data.nat.choose.multi, data.nat.choose.multinomial?

Callum Cassidy-Nolan (Feb 20 2022 at 01:21):

And for the generalized coefficients?

Yaël Dillies (Feb 20 2022 at 01:22):

And for the q-analog, I guess data.nat.choose.q_analog or data.nat.choose.q_binomial (to avoid confusion with other q-analogs) is fair.

Callum Cassidy-Nolan (Feb 20 2022 at 01:23):

Ok, as for the different ways of approaching the multinomial which do you think is better, passing in a list or passing in a function?

Callum Cassidy-Nolan (Feb 20 2022 at 01:23):

If we pass in a list we can do (m.map nat.factorial).prod to dividie it

Yaël Dillies (Feb 20 2022 at 01:24):

and if we pass in a function we can use docs#finset.prod

Callum Cassidy-Nolan (Feb 20 2022 at 01:24):

Ok, so they both seem good

Callum Cassidy-Nolan (Feb 20 2022 at 01:24):

It looks like some work was already done on it: https://github.com/leanprover-community/mathlib/blob/binomial/src/data/nat/choose/multinomial.lean

Callum Cassidy-Nolan (Feb 20 2022 at 01:25):

In this situation is there a way to "revive" the branch and then I can get it to work? I don't even know what's wrong with that branch anyways...

Yaël Dillies (Feb 20 2022 at 01:25):

I personally prefer the function version, because it emphasizes that there is no preferred order. And also it seems pretty convenient to prove the multinomial formula.

Callum Cassidy-Nolan (Feb 20 2022 at 01:26):

Ok, so in this case should I just make a new branch and work on it from there?

Yaël Dillies (Feb 20 2022 at 01:26):

Yes of course, just ask Kyle and merge master.

Yaël Dillies (Feb 20 2022 at 01:26):

(to your previous question)

Callum Cassidy-Nolan (Feb 20 2022 at 01:26):

Do you think it would be worth it to convert it to be working for the function version?

Callum Cassidy-Nolan (Feb 20 2022 at 01:26):

I'm open to trying that

Yaël Dillies (Feb 20 2022 at 01:27):

I think you could try both.

Yaël Dillies (Feb 20 2022 at 01:27):

Arguably, Kyle already did half the job.

Callum Cassidy-Nolan (Feb 20 2022 at 01:28):

When you make new branches like this, do you try and get the definition in and then prove basic facts about it all at once? Or first get the definition, wait till it merges in and then add them?

Callum Cassidy-Nolan (Feb 20 2022 at 01:29):

I guess I only wonder because in the latter then people can start using the definition before you're done proving everything else...

Yaël Dillies (Feb 20 2022 at 01:30):

Well, you should prove at least enough to demonstrate that your definition is reasonable.

Callum Cassidy-Nolan (Feb 20 2022 at 01:33):

Ok thanks for the tips

Kyle Miller (Feb 20 2022 at 17:31):

Yaël Dillies said:

I personally prefer the function version, because it emphasizes that there is no preferred order. And also it seems pretty convenient to prove the multinomial formula.

I suspect it's easiest to prove things initially with the list version (precisely because it has an order) then bolt the function version on top.

There are identities involving adding and removing elements of this list, which is harder with functions, but lists are designed for this.

Kyle Miller (Feb 20 2022 at 17:33):

Callum Cassidy-Nolan said:

Ok, so in this case should I just make a new branch and work on it from there?

I suggest making a new branch, ccm_multinomial or something. The current branch is old and has some stuff written by myself and @Kevin Buzzard mixed together, and once analogues of everything in the old branch end up in mathlib, it can be deleted.

Kyle Miller (Feb 20 2022 at 17:35):

One workflow I do: have a primary branch for development, then for each PR create a new branch with some small chunk (smaller is better). If anyone wonders where your PRs are going, you can point to the primary branch to help justify your design.

Kyle Miller (Feb 20 2022 at 17:36):

And then as those PRs are merged, you can keep merging master into your primary branch.

Kyle Miller (Feb 20 2022 at 17:40):

I should say what the reason was for using lists with the multinomial coefficient: it gives you reasonably good notation without any additional work. multinomial [a, b] is perfect for applications like ∑ p in nat.antidiagonal n, multinomial [p.1, p.2] * x^p.1 * y^p.2, while being able to generalize to larger numbers of terms. (Like, if we had higher antidiagonals, ∑ p in nat.antidiagonal₃ n, multinomial [p.1, p.2, p.3] * x^p.1 * y^p.1 * z^p.3)

Indexed families aren't so easily constructed.

Callum Cassidy-Nolan (Feb 21 2022 at 16:24):

Thanks for the tips about that. I managed to get my branch going and I'll try to rebuild using the list version first, Also I see stuff like this at the top of the files:

Copyright (c) 2021 Yaël Dillies. All rights reserved.
Released under Apache 2.0 license as described in the file LICENSE.
Authors: Yaël Dillies

How does this work?

Bryan Gin-ge Chen (Feb 21 2022 at 16:28):

Have you read the contribution guidelines? It links to a style guide which describes what should go in the header, etc.

Callum Cassidy-Nolan (Feb 21 2022 at 16:34):

Thanks, I just read it

Callum Cassidy-Nolan (Feb 21 2022 at 16:43):

Kyle Miller said:

Callum Cassidy-Nolan said:

Ok, so in this case should I just make a new branch and work on it from there?

I suggest making a new branch, ccm_multinomial or something. The current branch is old and has some stuff written by myself and Kevin Buzzard mixed together, and once analogues of everything in the old branch end up in mathlib, it can be deleted.

I put all that code in the old merge in and fix all the errors in it (not that many), should I put your and Kevin's names at the top of the file using apache lisence?

Callum Cassidy-Nolan (Feb 21 2022 at 16:45):

Kyle Miller said:

import algebra.ring.basic
import algebra.algebra.basic
import tactic

def binomial {α : Type*} [semiring α] [algebra ℚ α] : α → ℕ → α
| x 0 := 1
| x (n+1) := ((n+1)⁻¹ : ℚ) • x * binomial (x + (-1 : ℚ) • 1) n

lemma binomial_eq (m n : ℕ) : binomial (m : ℚ) n = nat.choose m n :=
begin
  induction n with n ih generalizing m,
  simp [binomial],
  cases m,
  { simp [binomial],
    -- (↑n + 1)⁻¹ • 0 = 0
    sorry, },
  { simp [binomial, ih],
    norm_cast,
    rw [nat.succ_mul_choose_eq],
    -- (↑(n + 1))⁻¹ • ↑(m.succ.choose n.succ * n.succ) = ↑(m.succ.choose n.succ)
    sorry, },
end

-- maybe?
lemma binomial_eq' {α : Type*} [comm_semiring α] [algebra ℚ α] (m n : ℕ) :
  binomial (m : α) n = nat.choose m n := sorry

What does the • symbol do?

Patrick Massot (Feb 21 2022 at 17:01):

This is multiplication by scalars. In Lean 3, multiplication is homogeneous: in a*b you need a and b to have the same type. So if you want to multiply, say a vector v with some real number a then you write a • v

MARIA JOSEPH (Nov 08 2023 at 10:54):

I would like to try contributing to an undergraduate topic in mathlib. Would it be worth doing it in Lean 3, and contributing to mathlib3 ? Is it still possible to do that ?

Arthur Paulino (Nov 08 2023 at 10:58):

No, the community has already migrated to Lean 4

Eric Wieser (Nov 08 2023 at 11:01):

@MARIA JOSEPH, can you point to any documentation that suggested that contributing to mathlib3 was still a good idea, so that we can adjust the wording?

Last updated: May 02 2025 at 03:31 UTC