Zulip Chat Archive

Stream: maths

Topic: thoughts on elliptic curves

Kevin Buzzard (Dec 15 2021 at 13:53):

Some students at Imperial are thinking about associativity of the group law on an elliptic curve.

Let's fix a field $k$ , and for simplicity let's assume char 0. An elliptic curve $E$ over $k$ can be written as $Y^2=f(X)$ where $f(X)=X^3+AX+B$ , with $A,B\in K$ (and some polynomial $D(A,B)$ is assumed non-zero to guarantee the cubic is smooth). The formalisation literature only ever does one thing now: defines $E(k)$ to be the union of a mysterious "point at infinity" plus the set of solutions $(x,y)\in k^2$ to the equation $y^2=f(x)$ , and then defines an addition on $E(k)$ via some strange geometric shenannigans involving drawing lines or just directly via "this is the equation, deal with it".

What we in mathlib are in a very good position to do now is to develop a basic theory of elliptic curves over number fields: things like a proof of the Mordell-Weil theorem (this group is finitely-generated if $k$ is a number field), the statement of the Taniyama--Shimura conjecture (the $L$ -function of the curve has an analytic extension to the complexes), and the statement of the Birch and Swinnerton--Dyer conjecture (in both its weak and strong forms) are within scope over the next few years and would be a really nice project. Nothing like this is in any other theorem prover.

However, before we get to those shiny things, we need to prove associativity of the group law!

My initial plan for this was "just grind out the algebra". The question is literally "here is a field k, here are 11 variables a1,a2,a3,a4,a6,x1,y1,x2,y2,x3,y3 in that field, here is some really big rational function in these 11 variables, we claim that the numerator can be written as a simple combination of y1^2-f(x1), y2^2-f(x2), y3^2-f(x3) and we're only allowed to divide by the discriminant". Apparently Lean 3 times out (according to a student -- I've not seen the code but the student is about to get push access to mathlib non-master branches so I'm about to).

There are two other proofs I know. One involves some projective geometry. It goes like this: say the three points are A,B,C and the point at infinity is O. Draw the line through A and B; it intersects the curve again at D. Reflect (change the sign of the y coordinate) to get E. Draw the line through E and C and get a new point F. Now go the other way around; the line through B,C meets the curve again at G; the reflection of G is H; the line through H and A meets the curve again at I. STP F=I ((A+B)+C is the reflection of F; A+(B+C) is the reflection of I). The proof goes like this. First show that for eight points P1,P2,...,P8 in projective 2-space in "general position" (e.g. no 4 on a line etc), any cubic through these 8 points also passes through a mysterious 9th point P9 (this is done by a dimension count) and now apply to the 8 points OABCDEGH. Using the elliptic curve and the degenerate (i.e. product of three linear factors) cubic ABD+ECF+OGH you deduce the 9th point is F, and using BCG+AHI+ODE
you deduce that the 9th point is I, so F=I. I don't really have a feeling as to how practical this method is in practice. One worry is that whatever "general position" actually does mean, there will be a ton of edge cases which will probably need doing; in the books you just say "oh blah blah blah, Zariski-closed set in a scheme containing the generic point so it's everything" but I'm not convinced that this would be any easier.

The other proof I know of associativity is a more high-powered one using the Picard group. Basically the idea is that you write down another object which is obviously an abelian group (it's a subquotient of the free abelian group generated by the points on the curve) and then you write down a bijection between this object and the points on the curve and prove that addition on the object and addition on the curve commutes with the bijection, and hence addition on the curve is associative. This proof looks at first sight like it's way out of reach because it looks like it might need a whole bunch of algebraic geometry (divisors, principal divisors etc) but actually when thinking about this this morning I realised that a lot of this material is closer than we might think. Let me say more about this.

$k$ is our ground field (fixed), equation is $y^2=f(x)$ . Let $R$ be the ring $k[X,Y]/(Y^2-f(X))$ . This ring is an integral domain, and indeed it's a Dedekind domain. I claim that the elliptic curve $E(k)$ is "canonically" isomorphic to the class group of $R$ ! The proof of this involves a little bit of diagram chasing etc but let's just assume it; the point is that Pic^0(E)=Pic(affine model). The map from E(k) to the class group sends the point at infinity to the zero class (the identity of the class group) and sends (x,y) to the ideal (X-x,Y-y). This is easily checked to be a maximal ideal. Now there are two things which need to be done to finish the proof of associativity:

1) Prove that this map is injective
2) Prove that addition on the curve corresponds to the group law on the class group

For (1) this boils down to the following assertion: prove that if (x1,y1) != (x2,y2) are two distinct points on the curve, then there does not exist non-zero f,g in R such that $(X-x1,Y-y1)*(f)=(X-x2,Y-y2)*(g)$ , this being equality of ideals in R.

For (2) this boils down to an identity of ideals -- if (x1,y1)+(x2,y2)=(x3,y3) addition on the curve, then we want to write down explicit f and g (linear) such that $(X-x1,Y-y1)*(X-x2,Y-y2)*(f)=(X-x3,Y-y3)*(g)$ . This looks to me to be much easier algebra.

I am a bit unclear about how to do (1). Again my head says "of course the divisor P-Q isn't principal because this would give an isomorphism of the curve with the projective line, which can't happen because genus is an isomorphism-invariant". If we could come up with a low-level translation then this would be a really cool application of the class group stuff. Do any of the more algebro-geometric people (@Damiano Testa , @Riccardo Brasca , @Antoine Chambert-Loir ) know if there is a proof of (1) which translates down into only ring theory somehow? I think that if we can do this then this would be a really nice way to prove associativity because we can claim that we're doing it the conceptual way which is of course the "correct" way.

Note that we need associativity (which works over a general field) before we can start on all of the arithmetic stuff (which works over a number field e.g. the rationals).

Riccardo Brasca (Dec 15 2021 at 13:58):

I am afraid I don't know a simpler proof, but I really like your idea of using the class group!

Filippo A. E. Nuccio (Dec 15 2021 at 14:01):

Since you would be interested in number fields, wouldn't the option through Weierstrass $\wp$ work? I have never tried the details myself, but what I mean is that you first prove that $\mathbb{C}/\Lambda$ is abelian (associative!); that's easy (coming up with the right lattice might be tricky). And then you prove "by hand" that $\wp$ and $\wp'$ produce a bijection, so you can transfer the group law, and it will be associative for elliptic curves over $\mathbb{C}$ . Then $E(k)$ is a substuff of $E(\mathbb{C})$ and associativity is preserved. Again, I wrote this without too much thinking, so I apologise if it does not work - but it would be nice to see meromorphic functions in action!

Filippo A. E. Nuccio (Dec 15 2021 at 14:02):

That being said, I agree with Riccardo that the idea with the class group is nice! We actually mentioned this example in our final version of our Dedekind paper with @Anne Baanen , @Ashvni Narayanan and @Sander Dahmen , to give an example of a non-finite class group.

Antoine Chambert-Loir (Dec 15 2021 at 14:04):

That's unclear to me that there is a low-tech proof, precisely for the reason that you indicate : it would prove that the cubic is not rational.

On the other hand, some simple cubics can be proved to be nonrational by applying Stothers/Mason, when their equation reduces to a sum of three terms. I remember it works for the Fermat_3 curve, and possibly for $y^2=x^3+1$ (I'm not sure), but in general, you have 4 terms…

Antoine Chambert-Loir (Dec 15 2021 at 14:08):

I remember Mestre telling that with a computer algebra system, associativity of the group law is a mere triviality.
I believe that one should try to understand why the formal computation does not work in Lean, or how to make a computer algebra system produce compilable Lean code.
I think that this is along these lines that Hales and coauthors formalized the associativity for Edwards models of elliptic curves, see https://arxiv.org/abs/2004.12030.

Kevin Buzzard (Dec 15 2021 at 14:31):

Yes associativity is a triviality with a computer algebra system. It is the statement that some element is in some ideal and if the computer algebra system has Groebner bases implemented for polynomial rings over fields it's easy. What we would have to do in Lean is to reduce the question to "element is in ideal" and then solve it using a computer algebra system, copy the explicit presentation of the element as a linear combination of generators of the ideal back into Lean and then hope that ring can handle it. Computer algebra systems can handle very large algebraic objects easily. Lean 3 is not so good at it. I did propose to a couple of people the project of doing this in Lean 4 as an interesting test for ring but someone objected that the "bash it out" proof was not "the proper proof" and so shouldn't be formalised at all!

Adam Topaz (Dec 15 2021 at 14:38):

But I would say the "proper proof" is to prove Riemann Roch first, and that's going to take a little while...

Adam Topaz (Dec 15 2021 at 14:40):

Can ring in lean3 not handle the explicit formulas for the group law?

Kevin Buzzard (Dec 15 2021 at 14:41):

I don't know yet. David Angdinata just pushed where he was at; https://github.com/leanprover-community/mathlib/commit/f9685f40de179cba4fb8f68b1a51c2913c0c7463

Adam Topaz (Dec 15 2021 at 14:42):

The definition of add is ~130 lines long...

Adam Topaz (Dec 15 2021 at 14:43):

Where is my copy of Silverman

Kevin Buzzard (Dec 15 2021 at 14:44):

This is the first time I've seen the code. I also tried something with Cremona at LFTCM: https://github.com/ImperialCollegeLondon/Example-Lean-Projects/blob/master/src/elliptic_curves/basic.lean

Kevin Buzzard (Dec 15 2021 at 14:44):

Looks like I got the definition of add down to 50

Kevin Buzzard (Dec 15 2021 at 14:46):

the definition involves a proof obligation, right? You have P=(x1,y1) and Q=(x2,y2) and the answer to P+Q in general is "the x coordinate is some messy function of x1,y1,x2,y2, as is the y coordinate, but now we have to prove they're a solution to the cubic"

Adam Topaz (Dec 15 2021 at 14:47):

Yeah, I see... I guess the ideal approach would have some custom auto param on that proof obligation that would (hopefully) discharge the goal automatically ;)

Johan Commelin (Dec 15 2021 at 14:48):

An elliptic curve is a smooth proper group scheme of dimension 1. Isn't that the way mathlib should approach this?

Kevin Buzzard (Dec 15 2021 at 14:48):

Jamie Bell's project does addition in 30 lines https://github.com/jamiebell2805/BSD-conjecture/blob/master/src/elliptic_curve_rationals.lean (one of which is very long) but he works with the simpler model y^2=cubic

Kevin Buzzard (Dec 15 2021 at 14:48):

@Johan no I don't think it is. Because then we're just saying "Mordell-Weil is 3 years away".

Adam Topaz (Dec 15 2021 at 14:49):

@Johan Commelin how would you prove that such a smooth proper group scheme of dimension 1 has a Weierstrass equation?

Filippo A. E. Nuccio (Dec 15 2021 at 14:49):

Well, but are we speaking of mathlib or Lean? Mordell--Weil can land in mathlib in 3 years, no? There might be a parallel project proving it in the meanwhile.

Kevin Buzzard (Dec 15 2021 at 14:49):

The proof of Mordell-Weil for elliptic curves over Q looks like this: first prove that a smooth proper group scheme of dimension 1 over a PID has a Weierstrass equation. Then do the theory of heights and a 2-descent.

Kevin Buzzard (Dec 15 2021 at 14:50):

For me the two parts are totally disjoint. The first step is some algebro-geometric argument which I sketch in the module doc for EllipticCurve. The second is arithmetic and is completely independent.

Kevin Buzzard (Dec 15 2021 at 14:51):

The algebro-geometric language buys you very little; this was why Mordell-Weil was proved way before schemes were even invented

Reid Barton (Dec 15 2021 at 14:52):

There is a formalization https://github.com/strub/elliptic-curves-ssr in Coq (paper) that takes this approach via the Picard group. The math-hard part is indeed (1) and unfortunately they don't discuss it at all in the paper. The formalization of that part is in https://github.com/strub/elliptic-curves-ssr/blob/main/src/ecrr.v and it uses elementary methods (Fermat descent). I had some difficulty following the formalization, but then I came across some notes by Eva Belmont which contain the same proof (pages ~6-8).

Kevin Buzzard (Dec 15 2021 at 14:52):

The reason I called the file EllipticCurve in mathlib is because you are supposed to think of it as the category of elliptic curves. Now your objection becomes irrelevant because defnitional equality, and equality in general, is evil in a category, and my category is equivalent to yours and furthermore actually exists so we can start using it.

Johan Commelin (Dec 15 2021 at 14:54):

Adam Topaz said:

Johan Commelin how would you prove that such a smooth proper group scheme of dimension 1 has a Weierstrass equation?

That's a consequence of Riemann-Roch, isn't it?

Kevin Buzzard (Dec 15 2021 at 14:55):

Sure

Kevin Buzzard (Dec 15 2021 at 14:55):

but to do anything like that we need a working theory of sheaf cohomology and projective space

Adam Topaz (Dec 15 2021 at 14:55):

You could use the classical approach with repartitions in this case.

Adam Topaz (Dec 15 2021 at 14:56):

E.g. as in Serre's book on class fields

Johan Commelin (Dec 15 2021 at 14:56):

so we can start using it

I agree. But for a whole bunch of arguments, it will be quite useful to know that an elliptic curve is a projective variety.

Johan Commelin (Dec 15 2021 at 14:57):

So even if you say "definitions don't matter" you are in some sense just post-poning the "problem"

Kevin Buzzard (Dec 15 2021 at 14:57):

My impression is that this is what Strub did (repartitions). They wrote down local uniformisers at all points on the curve, e.g. "it's Y-y unless y=0 in which case it's X-x"

Kevin Buzzard (Dec 15 2021 at 14:58):

Johan we are postponing what you think is a problem but we also postponing the ability to say "we have stated Shimura-Taniyama and BSD in a sorry-free way in a theorem prover" and the less that this is postponed the better.

Kevin Buzzard (Dec 15 2021 at 14:59):

Nobody is going to object that the statement of BSD refers to y^2=cubic -- this is the definition that Wiles gives in his write-up of BSD on the Clay website.

Kevin Buzzard (Dec 15 2021 at 15:00):

I view "generating the theory of the arithmetic of y^2=cubic" and "proving y^2=cubic can be re-interpreted up to equivalence using Grothendieck's machine as a smooth proper morphism" as two completely different problems.

Alex J. Best (Dec 15 2021 at 15:03):

Kevin Buzzard said:

For me the two parts are totally disjoint. The first step is some algebro-geometric argument which I sketch in the module doc for EllipticCurve. The second is arithmetic and is completely independent.

No idea if it still compiles but I did a very basic notion of heights once and proved a weak mw -> mw type result. One may want a more general notion of height in practice though.

import group_theory.index
import group_theory.subgroup.pointwise
import group_theory.quotient_group
import data.real.basic
import data.real.nnreal


lemma lt_of_mul_lt_mul_of_le {G : Type*} [linear_ordered_comm_group_with_zero G] {a b c d : G}
  (h : a * b < c * d) (hc : 0 < c) (hh : c ≤ a) : b < d :=
begin
  rw ←(inv_le_inv₀ (ne_of_gt hc) (ne_of_gt (lt_of_lt_of_le hc hh) : a ≠ 0)) at hh,
  -- rw ← div_lt_div₀ _ _ at h,
  sorry,
end

attribute [to_additive add_subgroup.smul_mem] subgroup.pow_mem
attribute [to_additive] subgroup.gpow_mem
open add_subgroup set
--TODO move
lemma not_mem_of_not_mem_closure {A : Type*} [add_comm_group A] {G : set A} {P : A}
  (hP : P ∉ closure G) : P ∉ G :=
begin
  intro h,
  have := subset_closure h,
  exact hP this,
end

open_locale nnreal
-- TODO generalize to valued in a linear_ordered_field
structure height_function (A : Type*) [add_comm_group A] :=
(h : A → ℝ≥0)
(height_smul' : ∀ (a : A) (m : ℕ), h (m • a) = m^2 * h a)
(parallelogram_law : ∀ (a b : A), h (a + b) + h (a - b) = 2 * h a + 2 * h b)

namespace height_function
variables {A : Type*} [add_comm_group A] (h : height_function A)

instance : has_coe_to_fun (height_function A) := ⟨λ _, A → ℝ≥0, λ h, h.h⟩
-- @[simp]
-- lemma height_gsmul (a : A) (m : ℤ) : h (m • a) = m^2 * h a := h.height_smul a m
@[simp]
lemma height_smul (a : A) (m : ℕ) : h (m • a) = m^2 * h a :=
  h.height_smul' a m

-- begin
--   convert h.height_smul a m using 1,
--   simp,
-- end
-- should be a version for any nontrivial target
@[simp]
lemma height_zero : h 0 = 0 :=
begin
  have := calc 2^2 * h 0 = h (2 • 0) : by simp only [int.cast_bit0, nat.cast_bit0,
                      int.cast_one, nat.cast_one, height_smul]
            ... = h 0 : by simp,
  exact eq_zero_of_mul_eq_self_left (by norm_num) this,
end
lemma zero_of_tors (a : A) {m : ℕ} (hm : m ≠ 0) (h : height_function A) (hma : m • a = 0) :
  h a = 0 :=
begin
  have := eq.symm (calc 0 = h 0 : by simp
                     ... = h (m • a) : by rw hma
                     ... = m^2 * h a : h.height_smul _ _),
  rw mul_eq_zero at this,
  norm_num at this,
  finish,
end

def northcott_property : Prop := ∀ x, {P | h P ≤ x}.finite

end height_function
-- TODO make mul and to_add
open_locale pointwise
open quotient_add_group

variables {A : Type*} [add_comm_group A] (h : height_function A) (hN : h.northcott_property)
variables {Ps : finset A} {m : ℕ} (hm : 2 ≤ m)
-- TODO maybe nicer to take any finite index subgroup
variables (hPs : ∀ x : quotient (m • (⊤ : add_subgroup A)), ∃ (p : A) (hp : p ∈ Ps), ↑ p = x)
include hPs hN hm

-- https://www.msri.org/attachments/workshops/301/HtSurveyMSRIJan06.pdf
-- https://ocw.mit.edu/courses/mathematics/18-782-introduction-to-arithmetic-geometry-fall-2013/lecture-notes/MIT18_782F13_lec25.pdf
lemma group_eq_span_bdd_height : closure {R | h R ≤ Ps.sup h} = ⊤ :=
begin
  rw add_subgroup.eq_top_iff',
  classical,
  by_contra h,
  push_neg at h,
  choose x hx using h,
  have C_finite : ((closure {R : A | h R ≤ Ps.sup h} : set A)ᶜ ∩ {P | h P ≤ h x}).finite,
  exact finite.inter_of_right (hN (h x)) _,
  have : ∃ P, P ∈ ((closure {R : A | h R ≤ Ps.sup h} : set A)ᶜ ∩ {P | h P ≤ h x})
                ∧ ∀ P' ∈ ((closure {R : A | h R ≤ Ps.sup h} : set A)ᶜ ∩ {P | h P ≤ h x}),
                  h P ≤ h P',
  { have := C_finite.to_finset.exists_min_image h _,
    { simp only [and_imp, exists_prop, mem_inter_eq, finite.mem_to_finset, mem_set_of_eq,
        set_like.mem_coe, compl_eq_compl, mem_compl_eq] at this ⊢,
      exact this, },
    rw finite.to_finset.nonempty,
    use x,
    simp only [hx, le_refl, mem_inter_eq, mem_set_of_eq, set_like.mem_coe, not_false_iff,
      and_self, mem_compl_eq], },
  have : ∃ P, P ∈ (closure {R : A | h R ≤ Ps.sup h} : set A)ᶜ
                ∧ ∀ P' ∈ (closure {R : A | h R ≤ Ps.sup h} : set A)ᶜ,
                  h P ≤ h P',
  { obtain ⟨P, hP, hPP⟩ := this,
    use P,
    simp_rw mem_inter_iff at hP hPP ⊢,
    split,
    exact hP.1,
    intros P' hP',
    by_cases hPx : h P' ≤ h x,
    { apply hPP,
      split,
      exact hP',
      rw mem_def,
      exact hPx, },
    { push_neg at hPx,
      exact calc _ ≤ h x : hP.2
               ... ≤ _ : hPx.le }, },
  choose P hP hPP' using this,
  have : ∃ (Pj) (hPj : Pj ∈ Ps) (Q), Pj + m • Q = P,
  { obtain ⟨Pj, hPj, hPPj⟩ := hPs P,
    use [Pj, hPj],
    erw [quotient_add_group.eq, mem_smul_set] at hPPj,
    simpa [eq_neg_add_iff_add_eq] using hPPj, },
  obtain ⟨Pj, hPj, Q, hPj'⟩ := this,
  simp only [set_like.mem_coe, compl_eq_compl, mem_compl_eq] at hP,
  apply hP,
  have : Q ∈ closure {R : A | h R ≤ Ps.sup h},
  { have : 2 * h Pj < 2 * h P,
    { rw mul_lt_mul_left (show (0 : ℝ≥0) < 2, by norm_num),
      have := not_mem_of_not_mem_closure hP,
      simp only [not_le, mem_set_of_eq] at this,
      calc _ ≤ Ps.sup h : finset.le_sup hPj
        ... < _ : this, },
    rw ← eq_sub_iff_add_eq' at hPj',
    have h_main := calc
    (m^2 : ℝ≥0) * h Q = h (m • Q) : (h.height_smul _ _).symm
                  ... = h (P - Pj) : by rw hPj'
                  ... ≤ h (P + Pj) + h (P - Pj) : le_add_self
                  ... = 2 * h P + 2 * h Pj : h.parallelogram_law P Pj
                  ... < 2 * h P + 2 * h P : add_lt_add_left this _
                  ... = 4 * h P : by {rw ← add_mul, norm_num},
    have := hPP' Q,
    rw ← not_imp_not at this,
    simp only [not_le, not_not, set_like.mem_coe, compl_eq_compl, mem_compl_eq] at this,
    apply this,
    have two_le_m : (2 : ℝ≥0) ≤ m := by exact_mod_cast hm,
    have four_le_m : (2^2 : ℝ≥0) ≤ m^2 := pow_le_pow_of_le_left (zero_le 2) two_le_m 2,
    norm_num at four_le_m,
    exact lt_of_mul_lt_mul_of_le h_main (by norm_num) four_le_m, },
  rw ← hPj',
  apply' add_mem,
  { apply subset_closure,
    rw mem_def,
    exact finset.le_sup hPj, },
  { exact smul_mem _ this m, },
end

Kevin Buzzard (Dec 15 2021 at 15:21):

Just trying to disentangle the notes that Reid sent (Tom Fisher's elliptic curves course): on page 8 when the notes refer to "Corollary 1.5(in green)" they mean "Corollary 1.8(with 1.5 written in green afterwards)" as opposed to the completely irrelevant Corollary 1.5 which had me utterly confused.

Alex J. Best (Dec 15 2021 at 15:25):

I just saw on page 2 it says

A note on the numbering
There are two concurrent numbering schemes in this document: numbers and headings
in green refer to Prof. Fisher’s numbering system; everything else refers to LATEX’s numbering, induced by setting section numbers = lecture numbers.

Kevin Buzzard (Dec 15 2021 at 16:12):

unfortunately I started reading on pages 6 to 8 ;-)

Johan Commelin (Dec 16 2021 at 07:09):

Kevin Buzzard said:

The reason I called the file EllipticCurve in mathlib is because you are supposed to think of it as the category of elliptic curves. Now your objection becomes irrelevant because defnitional equality, and equality in general, is evil in a category, and my category is equivalent to yours and furthermore actually exists so we can start using it.

I understand your point. At the same time, right now you have a type in mathlib that doesn't have a category instance yet.

Kevin Buzzard (Dec 16 2021 at 08:45):

That's because I don't know what to do for the morphisms. Are they supposed to be isogenies, or are you allowed the zero map, or are they supposed to be isomorphisms, or random morphisms of schemes? I know how to define all of these.

Kevin Buzzard (Dec 16 2021 at 08:46):

You don't need associativity of the group law to do any of them because of this theorem which says that preserving zero implies preserving +

Johan Commelin (Dec 16 2021 at 08:56):

Probably we need all those categories.

Kevin Buzzard (Dec 16 2021 at 09:02):

You know Jujian is working on the category where the objects are pairs (R,M) with R a ring and M an R-module? There are two kinds of morphisms you can choose for that category. If you want presheaves of modules to be a functor you let the rings and the modules both change. But if you want to do the moduli stack of elliptic curves then you allow the base ring to change but you demand that the map on modules induces an isomorphism after you tensor up. I don't fully understand what's going on here.

Marc Masdeu (Dec 16 2021 at 09:05):

Proving associativity using the explicit equations is not so much a triviality, at least in the computational sense. For the general Weierstrass form, it involves checking polynomial identities with terms having more than 130000 nonzero coefficients. I don't know Lean's IO system, but would be the kind of proof where one would load a separate input file containing the polynomials to be checked. And we would need a tactic for checking equalities of polynomials (ring didn't work last time I tried).

Kevin Buzzard (Dec 16 2021 at 09:10):

Oh interesting! I remember checking the generic case for y^2=x^3+Ax+B in pari-gp when I was a graduate student and it being huge but easily doable. But I wanted to use the 5 variable version in mathlib because I realised I could prove that it was valid for all PIDs and more generally all rings whose Picard group has trivial 12-torsion.

Kevin Buzzard (Dec 16 2021 at 09:15):

@Johan Commelin here's another way of thinking about the definition of elliptic curve in mathlib. There's something weird going on in the Taniyama-Shimura conjecture, because it relates modular forms to elliptic curves and it's supposed to be hard, but modular forms are just functions on elliptic curves so it all sounds like it should be easy. But those two uses of the word "elliptic curve" are completely different. The hard version specifically relates to elliptic curves over the rationals. The easier version is all about moduli spaces and hence is about elliptic curves defined over all schemes (and Wiles uses modular forms defined over the complexes and finite fields and the p-adic integers in his proof as well as arguments over general bases). We still don't have enough algebraic geometry to do the scheme side of the story (the geometric side) but we can start on the arithmetic side right now.

Johan Commelin (Dec 16 2021 at 09:19):

Fair enough

Johan Commelin (Dec 16 2021 at 09:19):

I just hope that the geometric side will catch up sooner rather than later.

Kevin Buzzard (Dec 16 2021 at 09:22):

These things can move independently I think. I don't think we need anything about schemes to prove Mordell-Weil for elliptic curves over number fields -- that's why it was done before schemes existed. There's still every chance that Ashvni will define the p-adic zeta function before the Riemann zeta function.

Johan Commelin (Dec 16 2021 at 09:25):

Mordell-Weil wasn't done before varieties existed, right?

Kevin Buzzard (Dec 16 2021 at 09:28):

@Marc Masdeu it's probably worth saying that even if it's true that when you expand it all out associativity involves terms with a hundred thousand monomials in, that doesn't mean that Lean ever needs to see one. For example it's true that $(x+y)^{100000}=(y+x)^{100000}$ but I can prove this in Lean without ever multiplying everything out. In general Lean does not expand things out unless it's told to, so perhaps some judicious variable name choices can come to our rescue. Associativity is a really annoying sorry right now because any statement you make about a group attached to the curve won't compile without warnings.

Kevin Buzzard (Dec 16 2021 at 09:30):

To do the abelian varieties case you need varieties (to even state it), but I suspect that Mordell had it for elliptic curves over Q before Weil's book. All you need for that is stuff about number fields and remember we had all of class field theory 100 years ago, just not with the cohomological language.

Johan Commelin (Dec 16 2021 at 09:31):

My history of mathematics starts around 1956 :rofl:

Marc Masdeu (Dec 16 2021 at 09:41):

@Kevin Buzzard I'm not clever enough to get those polynomials without calling a Groebner basis algorithm, and then it spits out the kind of monsters that I mentioned. I'd like to see it improved, here's what I am using: https://gist.github.com/mmasdeu/20d5c0fec521f03e30e4f68ccee21a77

Kevin Buzzard (Dec 16 2021 at 09:45):

Thanks Marc -- this is really helpful. Do you have a "brute force" proof of associativity in Sage? i.e. a computer algebra package version of algebraic_geometry.elliptic_curve.group_law? It's on the elliptic_curve_group_law branch.

Marc Masdeu (Dec 16 2021 at 09:53):

I'll look at what you mean later, but when I had to teach some undergraduates about elliptic curves I figured out that I could write a complete proof if I had these three identities for "generic" points, and this I did check in Sage. In this way, I can tell my students that the associativity proof is really "elementary" if you trust Sage to multiply polynomials for you.

Antoine Chambert-Loir (Dec 16 2021 at 09:55):

Johan Commelin said:

An elliptic curve is a smooth proper group scheme of dimension 1. Isn't that the way mathlib should approach this?

— hopefully adding the hypothesis that it has connected fibers

Antoine Chambert-Loir (Dec 16 2021 at 10:00):

Adam Topaz said:

Johan Commelin how would you prove that such a smooth proper group scheme of dimension 1 has a Weierstrass equation?

The first chapter of the book by N. Katz and B. Mazur, Arithmetic moduli of elliptic curves has a clear, high-level, recap of elliptic curves from this point of view. Over a field, Riemann Roch tells you that there is a non constant function $x$ in $|2·O|$, a function $y$ in $|3·O|$ which is not in $|2·O|$, and that $x^3, x^2, x, 1, y^2, xy,y$ are linearly related… Over a base, $|2·O|$, etc. are vector bundles over the base (by the theorems on the dimensions of cohomology groups), locally they are free, etc.

Antoine Chambert-Loir (Dec 16 2021 at 10:02):

Johan Commelin said:

Mordell-Weil wasn't done before varieties existed, right?

Mordell is 1922, it follows a 1901 paper of Poincaré, varieties here are complex varieties, and one can talk about their rational points. In particular, at that time, the group law was understood using the Weierstrass $\wp$-function.

Adam Topaz (Dec 16 2021 at 13:56):

Antoine Chambert-Loir said:

Adam Topaz said:

Johan Commelin how would you prove that such a smooth proper group scheme of dimension 1 has a Weierstrass equation?

The first chapter of the book by N. Katz and B. Mazur, Arithmetic moduli of elliptic curves has a clear, high-level, recap of elliptic curves from this point of view. Over a field, Riemann Roch tells you that there is a non constant function $x$ in $|2·O|$, a function $y$ in $|3·O|$ which is not in $|2·O|$, and that $x^3, x^2, x, 1, y^2, xy,y$ are linearly related… Over a base, $|2·O|$, etc. are vector bundles over the base (by the theorems on the dimensions of cohomology groups), locally they are free, etc.

Yes, this is the standard proof I know that a curve of genus 1 with a point (or a section, if you work over a more general base) has a Weierstrass equation. Well, Johan asked for a smooth proper group scheme of dimension 1. I don't think it's so easy to apply Riemann Roch in that case, since one first has to obtain the genus. In any case, my point was that we would need Riemann Roch regardless.

Adam Topaz (Dec 16 2021 at 14:21):

I guess one needs to show that if C is a proper curve endowed with a group structure, then the group structure forces the canonical class to be trivial, hence degree 0, so that the genus must be one. But again, these are at least two additional results that are nontrivial to formalize.

Damiano Testa (Dec 16 2021 at 14:50):

Dear All,

I have taken some distance from formalization, but I am still very intrigued by such questions!

I wanted to say that proving that cubics are irrational need not be hard. Every automorphism of $\mathbb{P}^1$ fixes at least one point, while translations by non-zero elements on an elliptic curve have no fixed points (you need not know that addition is associative for this, of course). If the ground field does not have characteristic two, you can also use the fact that negation on the elliptic curve is not the identity and fixes 4 points, while an automorphism of $\mathbb{P}^1$ that fixes 3 points is the identity.

Having said this, it is not clear to me how to go from the inclusion of function fields $k(f/g) \subset k(E)$ (notation as in Kevin's initial post in this thread) to an isomorphism $k(t) \simeq k(E)$ without using the fact that rational maps between smooth projective curves extend and that the degree of such a map can be computed by looking at any scheme-theoretic fiber.

Of course, these statements are not "hard", but I do not see an easy way to obtain them without some further machinery,

Adam Topaz (Dec 16 2021 at 15:00):

Wouldn't this argument be circular? I was under the impression that Kevin wanted to use the group structure on the class group to obtain the group structure on the elliptic curve, and it seems to me that you would need the group structure to see that the curve is not rational.

Damiano Testa (Dec 16 2021 at 16:14):

My thought was simply to use the definition of the group structure, not the fact that you know is associative. Once you have a map $E \times E \to E$ , you can fix one coordinate and check whether the map $E \to E$ that you obtain is injective or not. You do not need to know that this map arises from a group action.

Damiano Testa (Dec 16 2021 at 16:15):

Of course, I know that the map is injective because it is translation in a group, but I can check injectivity "by brute force".

Adam Topaz (Dec 16 2021 at 16:15):

I see. Of course, one should conclude by checking that the multiplication $E \times E \to E$ agrees with the multiplication induced by the inclusion in the class group :)

Damiano Testa (Dec 16 2021 at 16:17):

Yes, my argument was simply because I felt that proving irrationality of a cubic could be just as strong as proving associativity. The sketch above makes it more likely that it might be simpler.

Adam Topaz (Dec 16 2021 at 16:20):

Forget the IMO grand challenge. Let's teach lean how to pass a PhD qualifying exam in algebraic geometry ;)

Damiano Testa (Dec 16 2021 at 16:21):

Chances are, that what you describe is a shorter route to the IMO grand challenge...

David Michael Roberts (Dec 16 2021 at 21:57):

Kevin Buzzard said:

You know Jujian is working on the category where the objects are pairs (R,M) with R a ring and M an R-module? There are two kinds of morphisms you can choose for that category. If you want presheaves of modules to be a functor you let the rings and the modules both change. But if you want to do the moduli stack of elliptic curves then you allow the base ring to change but you demand that the map on modules induces an isomorphism after you tensor up. I don't fully understand what's going on here.

You don't need that last restriction, that's only if you want the stack of groupoids of elliptic curves. I've been saying for a long time that it's possibleto do the whole algebraic stack story with the full stack including non-invertible maps, in many cases. For elliptic curves and isogenies it's as about as nice as it can be.

I got pushback when I suggested this nine years ago, but now here and there people (eg Rezk, Gaitsgory) have started to slip the idea in, usually under weird neologisms.

David Michael Roberts (Dec 16 2021 at 22:22):

In particular, while the normal story builds a groupoid scheme, this extended version builds a "category scheme", where the subscheme of invertible arrows is the usual groupoid

Kevin Buzzard (Dec 17 2021 at 09:32):

I learnt about the isomorphism picture in Katz-Mazur. Maybe the idea is that historically we were very happy with this idea of "2-category but all the 2-morphisms are isomorphisms", it used to play an important role in the theory of stacks

David Michael Roberts (Dec 17 2021 at 10:03):

Yes, it's everywhere, but people still want to be able to descend isogenies (I.e. the hom sheaf, not just the isom sheaf), so this stuff is secretly there, but not talked about in a unified way...

Antoine Chambert-Loir (Dec 27 2021 at 12:26):

Adam Topaz said:

Yes, this is the standard proof I know that a curve of genus 1 with a point (or a section, if you work over a more general base) has a Weierstrass equation. Well, Johan asked for a smooth proper group scheme of dimension 1. I don't think it's so easy to apply Riemann Roch in that case, since one first has to obtain the genus. In any case, my point was that we would need Riemann Roch regardless.

Then you probably want to formalize Mumford's Abelian varieties book!

Elif Sacikara (Alumni) (Dec 28 2021 at 22:50):

Since I have been struggling learning both lean and algebraic geometry I was quite hesitant to say something but let me ask the following question anyway. Have you ever considered formalizing the approach for elliptic function fields given in Stichtenoth's Algebraic Function Fields and Codes? My motivation of studying this book is that the notions can be derived from some basics from linear algebra and algebra. For instance elliptic curve addition is defined with respect to equivalence on class group, as it is mentioned in (2). More explicitly we define $P\oplus Q=R$ (addition on elliptic curve) if $P+Q-R-O=(f)$ (addition and subtraction on Divisor group), where $O$ is the identity and $f$ can be chosen as a fraction of two lines relating P, Q and R. However, I am not sure if this can be a good approach for lean.

Kevin Buzzard (Dec 28 2021 at 23:20):

The difficulty with the function field approach is that one has to prove that the map from the points on the curve $E(K)$ to the divisor class group of $K[x,y]/(f(x,y))$ is injective. This is possible but might be messy. It might also be the nicest approach.

Kevin Buzzard (Dec 28 2021 at 23:29):

There's a pdf linked to above (Tom Fisher's Cambridge elliptic curves course notes) has a proof in corollary 1.8 (green 1.5) in the case of characteristic != 2.

Elif Sacikara (Alumni) (Dec 29 2021 at 12:45):

I see. Thanks for the response. For injectivity, instead of considering the class group of K[x,y]/(f(x,y)), I was thinking first to define a function field F, and then work with Cl(F) as the quotient group Div^0(F)/Princ(F) with degree zero divisors Div^0(F) and principal divisors Princ(F). Then by treating E(K) as a set of rational places of F, map each rational place P to the class of P-O. Then injectivity comes the fact that P=Q+(x) implies P=Q, because otherwise [F:K(x)]=1 (degree of zero divisor of x in F) gives us contradiction to the genus of F. However, as said, I should see if all definitions and theorems behind it are doable in Lean.

Kevin Buzzard (Dec 29 2021 at 14:34):

Right, so you have to prove F != K(x) so it boils down to the same calculation. I don't know a low-level way of doing this in characteristic 2

Elif Sacikara (Alumni) (Dec 29 2021 at 15:48):

When I look at the textbook, in theory it is claimed that this approach works for elliptic function fields F/K with an arbitrary field K. This is why may I kindly ask how characteristic 2 creates extra difficulty for Lean (under the assumptions that in Lean we can define an elliptic function as a genus 1 function field F/K containing a degree 1 divisor and if this approach is doable for any other characteristic).

Kevin Buzzard (Dec 29 2021 at 15:50):

It doesn't create extra difficulty for lean, it creates extra difficulty in the sense that there's a cheap proof which works in char > 2 but if you want to deal with char 2 you'll have to develop more of the theory

Kevin Buzzard (Dec 29 2021 at 15:52):

Basically I'm asking you "give me a complete proof that the field of fractions of y^2+axy+by=x^3+cx^2+dx+e over a field k is never isomorphic to k(t)" (provided a discriminant doesn't vanish, yes thanks Oliver :-) )

Kevin Buzzard (Dec 29 2021 at 15:52):

I can prove this using a bunch of algebraic geometry. I can prove this in a low-level way if char isn't 2

Oliver Nash (Dec 29 2021 at 15:52):

(provided a certain discriminant doesn't vanish)

Kevin Buzzard (Dec 29 2021 at 15:56):

The low-level way would be easier to formalise, the general way would take longer but of course needs to be done in the end. The issue is that there's currently no theory of the genus of a curve in algebraic geometry in Lean yet. It could be made, but these things take time. All of the theory needed to do the low-level proof is there already

Adam Topaz (Dec 29 2021 at 16:03):

To illustrate the state of things, for example, I don't think we currently have a good way to define a "regular function field of transcendence degree one" using what's currently in mathlib

Marc Masdeu (Mar 14 2022 at 16:57):

Sorry to bump this old thread, but I've given some more thought to proving associativity the brute force way. As @Kevin Buzzard pointed out above, it is indeed possible to check associativity without carrying out desperately large polynomial calculations. I've played around with Sage and all that is needed is a tactic that generalizes the ring tactic, and that allows one to work in the ring $R[x_1,x_2,x_3,y_1,y_2,y_3]$ modulo the ideal generated by the Weierstrass equation, which is satisfied by the points $(x_i,y_i)$ . That is, after each operation one would run the tactic and the result would be an expression that only contains $y_i$ at most to the power 1. A question then for the tactic writers: how hard would it be to cook up a quick tactic that does this?

Heather Macbeth (Mar 14 2022 at 17:04):

@Marc Masdeu The new linear_combination tactic would get part of the way there. Do you have a sample problem written up in Lean that I could use to write up an example of use?

Marc Masdeu (Mar 14 2022 at 17:15):

Hi @Heather Macbeth, thanks!
I'd like to close goals like this one, for example. Ideally, I'd use something like field_simp [h1, h2] for this.

variables {K : Type*} [field K]

example (a1 a2 a3 a4 a6 x1 x2 y1 y2 : K)
(h1 : y1^2 + a1*x1*y1 + a3*y1 = x1^3 + a2*x1^2 + a4*x1 + a6)
(h2 : y2^2 + a1*x2*y2 + a3*y2 = x2^3 + a2*x2^2 + a4*x2 + a6)
(h: x1 ≠ x2) :
(a2*x1^2 + x1^3 - 2*a2*x1*x2 - x1^2*x2 + a2*x2^2 - x1*x2^2
+ x2^3 - a1*x1*y1 + a1*x2*y1 + a1*x1*y2 - a1*x2*y2 - y1^2 + 2*y1*y2 - y2^2)/(-x1^2 + 2*x1*x2 - x2^2)
 = (-2*a2*x1*x2 - x1^2*x2 - x1*x2^2 + a1*x2*y1 +
 a1*x1*y2 - a4*x1 - a4*x2 + a3*y1 + a3*y2 + 2*y1*y2 - 2*a6)/(-x1^2 + 2*x1*x2 - x2^2) :=
begin
  sorry
end

Heather Macbeth (Mar 14 2022 at 17:35):

@Marc Masdeu You need to figure out the coefficients yourself (which I guess you can do in Sage), but it works pretty well!

import tactic.linear_combination
import tactic.field_simp

variables {K : Type*} [field K]

example (a1 a2 a3 a4 a6 x1 x2 y1 y2 : K)
  (h1 : y1^2 + a1*x1*y1 + a3*y1 = x1^3 + a2*x1^2 + a4*x1 + a6)
  (h2 : y2^2 + a1*x2*y2 + a3*y2 = x2^3 + a2*x2^2 + a4*x2 + a6):
  (a2*x1^2 + x1^3 - 2*a2*x1*x2 - x1^2*x2 + a2*x2^2 - x1*x2^2
  + x2^3 - a1*x1*y1 + a1*x2*y1 + a1*x1*y2 - a1*x2*y2 - y1^2 + 2*y1*y2 - y2^2)/(-x1^2 + 2*x1*x2 - x2^2)
  = (-2*a2*x1*x2 - x1^2*x2 - x1*x2^2 + a1*x2*y1
    + a1*x1*y2 - a4*x1 - a4*x2 + a3*y1 + a3*y2 + 2*y1*y2 - 2*a6)/(-x1^2 + 2*x1*x2 - x2^2) :=
begin
  have : -x1^2 + 2*x1*x2 - x2^2 ≠ 0 := sorry,
  field_simp,
  linear_combination (h1, -1) (h2, -1),
end

Marc Masdeu (Mar 14 2022 at 17:39):

It will need to be more automated, because there will be several dozens of applications of reductions like that. For example, in the expression there may be a term of the form y1^10 (times stuff) and this should be recusively simplified. This is how Hales (yes, that Hales) did it in https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7324045/

Heather Macbeth (Mar 14 2022 at 17:44):

I think it can still be a single application of linear_combination (because the coefficients in linear_combination can themselves be polynomial expressions). But it might look very long -- dozens of lines. You could get Sage to compute the polynomial coefficients and then copy-paste them.

Heather Macbeth (Mar 14 2022 at 17:45):

@Rob Lewis is the person to talk to, in any case, whether the process is this Sage-to-linear_combination method, or a direct implementation of this very special case of a Grobner basis tactic.

Yaël Dillies (Mar 14 2022 at 17:45):

I am planning on doing something similar, caching an absurd amount of oracle in the source code, in #10645

Heather Macbeth (Mar 14 2022 at 17:45):

@Yaël Dillies Wrong link?

Yaël Dillies (Mar 14 2022 at 17:46):

slipped!

Heather Macbeth (Mar 14 2022 at 17:52):

@Marc Masdeu So to be more explicit: If you have the polynomial goal $P = Q$ and the polynomial hypotheses h1 that $p_1=q_1$ , ... hn that $p_n = q_n$ :

get Sage to exhibit $P-Q$ as an element of the ideal generated by $p_1 - q_1$ , ... $p_n - q_n$ , say as
$P-Q = a_1 (p_1-q_1)+\ldots +a_n(p_n-q_n)$
for polynomials $a_1$ , ... $a_n$
give Lean the tactic instruction linear_combination (h1, a1) ... (hn, an).

Rob Lewis (Mar 14 2022 at 18:17):

Heather Macbeth said:

Rob Lewis is the person to talk to, in any case, whether the process is this Sage-to-linear_combination method, or a direct implementation of this very special case of a Grobner basis tactic.

There may be news about this in the not too distant future! Maybe @Dhruv Bhatia reads Zulip?

Kevin Buzzard (Mar 15 2022 at 08:12):

@Marc Masdeu there's a PhD student at Imperial @David Ang who is trying to prove Mordell-Weil but right now they're assuming associativity -- if you could make this happen this would be wonderful!

Marc Masdeu (Mar 15 2022 at 08:18):

Kevin Buzzard said:

Marc Masdeu there's a PhD student at Imperial David Ang who is trying to prove Mordell-Weil but right now they're assuming associativity -- if you could make this happen this would be wonderful!

I noticed that branch. I saw that they got the group law definition, which I had worked out in the ell_add_assoc branch. I like better the definition in that branch (I think you made it), and my approach to well-definedness (which is not super long, IMHO). Maybe you can "persuade" them to use the definition in that branch?

David Ang (Mar 15 2022 at 08:28):

The mordell_weil branch already has quite a few other definitions and lemmas in it, so it’s going to be a bit tough to switch over completely, but I’m happy to port over the definition in the other branch once it’s all done.

Dhruv Bhatia (Mar 16 2022 at 15:07):

Heather Macbeth said:

Marc Masdeu So to be more explicit: If you have the polynomial goal $P = Q$ and the polynomial hypotheses h1 that $p_1=q_1$ , ... hn that $p_n = q_n$ :

get Sage to exhibit $P-Q$ as an element of the ideal generated by $p_1 - q_1$ , ... $p_n - q_n$ , say as
$P-Q = a_1 (p_1-q_1)+\ldots +a_n(p_n-q_n)$
for polynomials $a_1$ , ... $a_n$

give Lean the tactic instruction linear_combination (h1, a1) ... (hn, an).

Thanks @Rob Lewis for the mention! This is actually verbatim what I am working on right now. If all goes well, I should have something ready to use in a few weeks!

Heather Macbeth (Jun 27 2022 at 15:59):

@Marc Masdeu Dhruv's tactic is now being PR'd: #14878. Do you have any more really nasty examples from your elliptic curves project that you could share as tests?

Marc Masdeu (Jul 10 2022 at 11:42):

Sorry for not replying before... Probably there are, but buried inside the proofs as I have them right now. Maybe it's time for an update. The branch ell_add_assoc contains the proof of associativity and it just has 3 sorrys left. They are corresponding to 3 computations corresponding to generic situations (that is, it is always clear which formula to use when adding). I have done them in Sage, checking that certain rational functions are the same (modulo replacing y^n for n > 1 using the Weierstrass equation). If you want to see that code I'd happy to post it here, too. If anyone has ideas on how to do these, please do share them!

(PS: below it's not meant to be a mwe, just to point at what's left in the EllipticCurveGroupLaw.lean file.)

lemma assoc_aux { P Q R : points E} (hPz : P ≠ 0) (hQz : Q ≠ 0) (hRz : R ≠ 0)
(hPQ1 : P ≠ Q) (hPQ2 : P ≠ -Q) (hQR1 : Q ≠ R) (hQR2 : Q ≠ -R) (hPQR1 : P + Q ≠ R) (hPQR2 : P + Q ≠ -R)
(hQRP1 : Q + R ≠ P) (hQRP2 : Q + R ≠ -P) : P + Q + R = P + (Q + R) := sorry

lemma assoc_aux' {P Q : points E} (hPz : P ≠ 0) (hQz : Q ≠ 0) (hneg : P + P ≠ 0)
(hPQ1 : P ≠ Q) (hPQ2 : P ≠ -Q) (h2PQ1 : (P + P) ≠ Q) (h2PQ2 : (P + P) ≠ -Q)
(hPQP1 : P + Q ≠ P) (hPQP1 : P + Q ≠ -P) : P + P + Q = P + (P + Q) := sorry

lemma assoc_aux'' {P : points E} (hz : P ≠ 0) (hneg : P + P ≠ 0)
(htwo : (P + P) + (P + P) ≠ 0) (hPP1 : P + P ≠ P) (hPP2 : P + P ≠ -P)
(hPPP1 : P + P + P ≠ P) (hPPP2 : P + P + P ≠ -P) : (P + P) + (P + P) = P + (P + (P + P)) := sorry

Kevin Buzzard (Sep 03 2022 at 19:54):

Aargh! @Heather Macbeth @Marc Masdeu @Mario Carneiro I had some time when away on holiday to think about associativity and how to do it this fancy class group way, and I discovered to my surprise that the method I'd been proposing does not work in characteristic 2!

Kevin Buzzard (Sep 03 2022 at 19:55):

Which is really annoying because for me that had been one of the selling points -- that it was a truly general argument.

Mario Carneiro (Sep 03 2022 at 19:55):

what goes wrong?

Kevin Buzzard (Sep 03 2022 at 19:55):

Just some Diophantine arithmetic doesn't work out because some extension of fields can be inseparable

Kevin Buzzard (Sep 03 2022 at 19:59):

You assume the field k is alg closed and then try to solve Y^2=X(X-1)(X-\lambda) in the field k(T) using arithmetic but the only solutions are constant (three factors on the right are all coprime so all squares up to units and then do a descent) which shows there's no rational function of X and Y with a simple zero and simple pole, giving you the injectivity of the map to the class group and then you win because the map from the curve to the class group plays well with the group laws

Kevin Buzzard (Sep 03 2022 at 20:00):

But in char 2 you can't complete the square on the left and I can't push through the descent because it's a 2-descent on the curve and in char 2 there inseparable degree 2 isogenies which would have to be dealt with separately.

Kevin Buzzard (Sep 03 2022 at 20:01):

tl;dr: the argument in Fisher's notes needs to complete the square to get rid of the Y term in the cubic equation, and this can't be done in char 2

Kevin Buzzard (Sep 03 2022 at 20:04):

I think that probably one could work hard and construct an analogous proof in char 2; one would perhaps at some stage split into two cases depending on whether a certain discriminant was a square or not, the inseparability issue would manifest itself in some way but I'm hazy on the details, because I then realised it didn't matter.

Kevin Buzzard (Sep 03 2022 at 20:06):

If the first line of the "fancy" proof I'm proposing is definitely going to be "OK let's split into the cases of whether char(k)=2 or not" then after that split you now have to ask "now what's the best method from this point" and I think that in both cases the answer is "not my method"

Kevin Buzzard (Sep 03 2022 at 20:07):

In fact if char(k)>2 then we already know the answer thanks to @Heather Macbeth : we say "WLOG $k$ is algebraically closed" (the step I had missed before when talking to her) and then say "so we can put it in Edwards form and now the machinery Heather and Rob and their team have developed can apparently do it, a result I was not excited about at the time but have now seen the importance of.

Mario Carneiro (Sep 03 2022 at 20:12):

Well, that's still of the form "reduce the problem a bit until it is accessible to a giant computer calculation" which is not what I was hoping from the "abstract" proof

Kevin Buzzard (Sep 03 2022 at 20:12):

What would be left would be translating Lean's "official" definition of an elliptic curve into Edwards form, but my instinct is that the algebra involved here is much easier: one can work out the maps using a computer algebra package and they're just going to be quadratic over cubic or whatever, it's not going to be too taxing I don't think.

And then there's char 2 and again there are apparently char 2 specific methods which can be used to prove associativity much more cheaply than the super-complicated thing I was waffling on about above.

Kevin Buzzard (Sep 03 2022 at 20:12):

It's "reduce the problem until it is amenable to two reasonable computer calculations"

Kevin Buzzard (Sep 03 2022 at 20:13):

These are calculations which one can do by hand if it were the year 1900

Kevin Buzzard (Sep 03 2022 at 20:41):

Dan Bernstein sent me a link to his treatment in the odd characteristic case: https://cr.yp.to/papers.html#completed . He said it was the most painless method he knew. In the char 2 case I thought someone mentioned a method -- I'll come back to this. If you want to do the general case then @Marc Masdeu didn't you conclude that the calculation was unfeasible? Someone at ANTS told me that Bosma--Lenstra was the way to do associativity if you wanted a computation which worked for general $k$ .

Kevin Buzzard (Sep 03 2022 at 20:44):

Here is Bernstein's sage notebook for Edwards:

R.<a,d,X1,Z1,Y1,T1,X2,Z2,Y2,T2,X4,Z4,Y4,T4> = PolynomialRing(ZZ,14)

curve1 = a*X1^2*T1^2+Z1^2*Y1^2-Z1^2*T1^2-d*X1^2*Y1^2
curve2 = a*X2^2*T2^2+Z2^2*Y2^2-Z2^2*T2^2-d*X2^2*Y2^2
curve4 = a*X4^2*T4^2+Z4^2*Y4^2-Z4^2*T4^2-d*X4^2*Y4^2

I = R.ideal([curve1,curve2,curve4])

def origadd(X1,Z1,Y1,T1,X2,Z2,Y2,T2):
  X3 = X1*Y2*Z2*T1+X2*Y1*Z1*T2
  Z3 = Z1*Z2*T1*T2+d*X1*X2*Y1*Y2
  Y3 = Y1*Y2*Z1*Z2-a*X1*X2*T1*T2
  T3 = Z1*Z2*T1*T2-d*X1*X2*Y1*Y2
  return X3,Z3,Y3,T3

def dualadd(X1,Z1,Y1,T1,X2,Z2,Y2,T2):
  X3 = X1*Y1*Z2*T2+X2*Y2*Z1*T1
  Z3 = a*X1*X2*T1*T2+Y1*Y2*Z1*Z2
  Y3 = X1*Y1*Z2*T2-X2*Y2*Z1*T1
  T3 = X1*Y2*Z2*T1-X2*Y1*Z1*T2
  return X3,Z3,Y3,T3

for a3 in origadd,dualadd:
  for a7 in origadd,dualadd:
    for a5 in origadd,dualadd:
      for a0 in origadd,dualadd:
        X3,Z3,Y3,T3 = a3(X1,Z1,Y1,T1,X2,Z2,Y2,T2)
        X7,Z7,Y7,T7 = a7(X3,Z3,Y3,T3,X4,Z4,Y4,T4)
        X5,Z5,Y5,T5 = a5(X1,Z1,Y1,T1,X4,Z4,Y4,T4)
        X0,Z0,Y0,T0 = a0(X2,Z2,Y2,T2,X5,Z5,Y5,T5)
        assert X0*Z7-X7*Z0 in I
        assert Y0*T7-Y7*T0 in I
        print([x.__name__ for x in (a3,a7,a5,a0)])

Michael Stoll (Sep 04 2022 at 10:08):

I have a fairly elementary proof for cubic curves that should be independent of the characteristic, which basically only uses Bézout's theorem for a line and a curve. See Section 9 in these notes (which are in German, but I'd be happy to help with translating :smile:).

Kevin Buzzard (Sep 04 2022 at 10:16):

You don't somewhere use Bezout for two cubics? That was what was scaring me with this approach.

Michael Stoll (Sep 04 2022 at 10:52):

No.

David Michael Roberts (Sep 08 2022 at 08:10):

I haven't followed this discussion, so it may have been mentioned, but if one is going to reduce to Edwards curves, then would https://arxiv.org/abs/2004.12030 be useful?

Kevin Buzzard (Sep 08 2022 at 08:21):

I think that @Heather Macbeth already did the Edwards case and claimed she could give a complete proof of associativity in that case. I feel bad because at the time I'd pointed out what I thought were two problems with the approach -- firstly she was assuming the curve had a point of order 4 and secondly that she'd avoided characteristic 2. But the first issue can be avoided by extending the base field and the second issue turned out to be a problem with my approach as well, unless I can prove my way around it (which I can't, right now)

David Michael Roberts (Sep 08 2022 at 08:41):

OK, thanks!

Kevin Buzzard (Sep 08 2022 at 12:51):

Dan Bernstein managed to prove associativity in char 2 in sage-math using some formulae due to David Kohel:

R.<c,X0,X1,X2,X3,Y0,Y1,Y2,Y3,Z0,Z1,Z2,Z3> = PolynomialRing(GF(2),13)

curveX1 = X0^2-X2^2-c^2*X1*X3
curveX2 = X1^2-X3^2-c^2*X0*X2
curveY1 = Y0^2-Y2^2-c^2*Y1*Y3
curveY2 = Y1^2-Y3^2-c^2*Y0*Y2
curveZ1 = Z0^2-Z2^2-c^2*Z1*Z3
curveZ2 = Z1^2-Z3^2-c^2*Z0*Z2

I = R.ideal([curveX1,curveX2,curveY1,curveY2,curveZ1,curveZ2])

def add0(X0,X1,X2,X3,Y0,Y1,Y2,Y3):
  return (X0*Y0+X2*Y2)^2,c*(X0*X1*Y0*Y1+X2*X3*Y2*Y3),(X1*Y1+X3*Y3)^2,c*(X0*X3*Y0*Y3+X1*X2*Y1*Y2)

def add1(X0,X1,X2,X3,Y0,Y1,Y2,Y3):
  return c*(X0*X1*Y0*Y3+X2*X3*Y1*Y2),(X1*Y0+X3*Y2)^2,c*(X0*X3*Y2*Y3+X1*X2*Y0*Y1),(X0*Y3+X2*Y1)^2

def add2(X0,X1,X2,X3,Y0,Y1,Y2,Y3):
  return (X3*Y1+X1*Y3)^2,c*(X0*X3*Y1*Y2+X1*X2*Y0*Y3),(X0*Y2+X2*Y0)^2,c*(X0*X1*Y2*Y3+X2*X3*Y0*Y1)

def add3(X0,X1,X2,X3,Y0,Y1,Y2,Y3):
  return c*(X0*X3*Y0*Y1+X1*X2*Y2*Y3),(X0*Y1+X2*Y3)^2,c*(X0*X1*Y1*Y2+X2*X3*Y0*Y3),(X1*Y2+X3*Y0)^2

additionlaws = add0,add1,add2,add3

for A in additionlaws:
  for B in additionlaws:
    for C in additionlaws:
      for D in additionlaws:
        A0,A1,A2,A3 = A(X0,X1,X2,X3,Y0,Y1,Y2,Y3)
        B0,B1,B2,B3 = B(A0,A1,A2,A3,Z0,Z1,Z2,Z3)
        C0,C1,C2,C3 = C(Y0,Y1,Y2,Y3,Z0,Z1,Z2,Z3)
        D0,D1,D2,D3 = D(X0,X1,X2,X3,C0,C1,C2,C3)
        assert B0*D1-B1*D0 in I
        assert B0*D2-B2*D0 in I
        assert B0*D3-B3*D0 in I
        assert B1*D2-B2*D1 in I
        assert B1*D3-B3*D1 in I
        assert B2*D3-B3*D2 in I
        print([x.__name__ for x in (A,B,C,D)])

Kevin Buzzard (Sep 08 2022 at 12:55):

But right now I guess I am confused about the general case. @Marc Masdeu you seem convinced that the polynomials involved would be astronomical but https://cr.yp.to/papers.html#completed seems to suggest that it will be less painful than we'd imagined.

Marc Masdeu (Sep 08 2022 at 14:58):

All the experiments I've done have been with the usual polynomials valid for the general Weierstrass equation. I would like a proof that worked in all characteristics at once, but I guess that it'd be okay if there was a separate case for characteristic 2. @Kevin Buzzard I've had this project slowly dying for long enough that maybe it'd be good to have a chat at some point and see how feasible it is to knock it off?
By the way, I can do the general case in Sage no problem, the denominators involved get large, and then they explode when you want to clear them out. I had prepared some scripts to share, but maybe it's better to see it live...

Riccardo Brasca (Sep 23 2022 at 13:31):

Maybe this has already been discussed, but I am at a conference where someone said elliptic curves have been formalized in HOL, but they're not sure how nor in which generality. Does anyone have more info about this?

Kevin Buzzard (Sep 23 2022 at 14:18):

They've been formalised in Coq (Weierstrass form, so not the most general form in char 2 or 3) with a conceptual proof of associativity valid in char != 2. There's an Isabelle/HOL formalisation in the AFP: https://www.isa-afp.org/entries/Elliptic_Curves_Group_Law.html again just of the group law (incl proof of associativity) for the Weierstrass curve (from last year), so again the proof is only valid in char != 2 and doesn't cover all elliptic curves in char 3.

Riccardo Brasca (Sep 23 2022 at 14:19):

I see

Adam Topaz (Sep 23 2022 at 14:41):

Maybe we should bite the bullet and start a big push for Riemann Roch so we can do all this stuff properly?

Riccardo Brasca (Sep 23 2022 at 14:47):

This is clearly the right thing to do if we can wait, but we are quite far, aren't we?

Adam Topaz (Sep 23 2022 at 14:50):

I still think the proof from Serre's book on class fields is doable

Adam Topaz (Sep 23 2022 at 14:50):

The "repartitions" proof

Riccardo Brasca (Sep 23 2022 at 14:51):

I don't know this proof, but is it for which "curves"? Schemes blah blah, varieties over algebraically closed field (defined somehow)?

Adam Topaz (Sep 23 2022 at 14:53):

One could say it's for function fields of trdeg 1

Adam Topaz (Sep 23 2022 at 14:54):

Maybe with a perfect base field? I don't remember

Riccardo Brasca (Sep 23 2022 at 14:55):

Oh, skipping algebraic geometry in some sense? Interesting!

Antoine Chambert-Loir (Sep 23 2022 at 15:29):

Kevin Buzzard said:

They've been formalised in Coq (Weierstrass form, so not the most general form in char 2 or 3) with a conceptual proof of associativity valid in char != 2. There's an Isabelle/HOL formalisation in the AFP: https://www.isa-afp.org/entries/Elliptic_Curves_Group_Law.html again just of the group law (incl proof of associativity) for the Weierstrass curve (from last year), so again the proof is only valid in char != 2 and doesn't cover all elliptic curves in char 3.

It is my understanding that the polyrithtactic implemented @Heather Macbeth should give a completely automatic proof for all elliptic curves represented as a plane cubic curve.

Heather Macbeth (Sep 23 2022 at 15:54):

@Antoine Chambert-Loir In fact this tactic was written by @Dhruv Bhatia, a student of @Rob Lewis!

Heather Macbeth (Sep 23 2022 at 15:56):

And regarding your question: I tried this out at some point (on Edwards curves). There are many cases and I skipped all of them except for the one which I guessed would be the hardest. Polyrith did send me some output here (rather than failing), but the output was so long that it got garbled when being passed back to Lean. So my guess is that it could be made to work, but it requires some effort and probably a more robust version of the Sage interaction.

Heather Macbeth (Sep 23 2022 at 15:57):

It was just a quick experiment for me but I'd be glad to see someone (else) push it further.

Antoine Chambert-Loir (Sep 23 2022 at 16:19):

Oh, I see!

Kevin Buzzard (Sep 23 2022 at 17:46):

The problem with the brute force approach is that there are problems if you use the long form because the polynomials involved apparently have hundreds of thousands of terms. I thought I could do a cheap class group proof but the argument I know doesn't work in char 2. Over an alg closed field (which you can assume wlog) in char not two you can prove that the only solutions x,y in k[t] to y^2=x(x-1)(x-a) (with a in k not 0 or ) are necessarily in k, by a descent argument using the fact that coprime stuff whose product is a square must all be squares etc. This is just a concrete proof that the function field of the curve isn't rational. But in char 2 I don't know of a low-level proof of this, and of course I'm worried about the 2-descent because the isogeny could be inseparable...

Junyan Xu (Sep 23 2022 at 19:05):

Adam Topaz said:

One could say it's for function fields of trdeg 1

reminds me that there's also a proof in Michael Rosen, Number Theory in Function Fields https://link.springer.com/book/10.1007/978-1-4757-6046-0

Anne Baanen (Sep 24 2022 at 04:55):

Riccardo Brasca said:

Maybe this has already been discussed, but I am at a conference where someone said elliptic curves have been formalized in HOL, but they're not sure how nor in which generality. Does anyone have more info about this?

Just to be clear, this was HOL Light, not HOL 4.

Kevin Buzzard (Sep 24 2022 at 08:18):

I was thinking about Riemann-Roch recently. Formally it's two completely different theorems -- one for line bundles on compact Riemann surfaces and one for invertible sheaves on smooth projective algebraic curves (in arbitrary characteristic). Neither implies the other (unless you want to develop the entire theory of GAGA first, which we probably don't). Similarly generalisations of the statement never make it clear about what they're generalising -- places like Wikipedia claim that the Atiyah-Singer index theorem is "a generalisation of Riemann-Roch" and that Grothendieck-Riemann-Roch is "a generalisation of Riemann-Roch" but they're two generalisations of two different theorems.

Kevin Buzzard (Sep 24 2022 at 08:20):

PS Jujian Zhang is working on GAGA but it's slow going, not least because as far as I know we still haven't decided how to implement sheaves of modules on a scheme...

Antoine Chambert-Loir (Sep 25 2022 at 09:04):

I don't have thought deeply about this, but I wonder whether one could adopt a high-level point of view and have topoi, ringed topoi… One reason is that many results on rings, modules… will be given for free once the traditional proofs are transferred in this generality.

Joël Riou (Sep 27 2022 at 22:57):

I would agree with @Antoine Chambert-Loir. Sheaves of rings could be identified to ring objects A in the category of sheaves of sets, and sheaves of modules M with "A-Module objects". This may imply redefining internal versions of [has_mul], [add_comm_group], [module], etc (for a symmetric monoidal structure, e.g. the one given by finite products). Then, there could be suitable notations and instances expressing that M(U) is a module over A(U), etc.

Kevin Buzzard (Sep 28 2022 at 16:58):

Some previous discussion on sheaves of modules is here and here.

Wrenna Robson (Sep 29 2022 at 10:44):

Kevin Buzzard said:

Dan Bernstein managed to prove associativity in char 2 in sage-math using some formulae due to David Kohel:

R.<c,X0,X1,X2,X3,Y0,Y1,Y2,Y3,Z0,Z1,Z2,Z3> = PolynomialRing(GF(2),13)

curveX1 = X0^2-X2^2-c^2*X1*X3
curveX2 = X1^2-X3^2-c^2*X0*X2
curveY1 = Y0^2-Y2^2-c^2*Y1*Y3
curveY2 = Y1^2-Y3^2-c^2*Y0*Y2
curveZ1 = Z0^2-Z2^2-c^2*Z1*Z3
curveZ2 = Z1^2-Z3^2-c^2*Z0*Z2

I = R.ideal([curveX1,curveX2,curveY1,curveY2,curveZ1,curveZ2])

def add0(X0,X1,X2,X3,Y0,Y1,Y2,Y3):
  return (X0*Y0+X2*Y2)^2,c*(X0*X1*Y0*Y1+X2*X3*Y2*Y3),(X1*Y1+X3*Y3)^2,c*(X0*X3*Y0*Y3+X1*X2*Y1*Y2)

def add1(X0,X1,X2,X3,Y0,Y1,Y2,Y3):
  return c*(X0*X1*Y0*Y3+X2*X3*Y1*Y2),(X1*Y0+X3*Y2)^2,c*(X0*X3*Y2*Y3+X1*X2*Y0*Y1),(X0*Y3+X2*Y1)^2

def add2(X0,X1,X2,X3,Y0,Y1,Y2,Y3):
  return (X3*Y1+X1*Y3)^2,c*(X0*X3*Y1*Y2+X1*X2*Y0*Y3),(X0*Y2+X2*Y0)^2,c*(X0*X1*Y2*Y3+X2*X3*Y0*Y1)

def add3(X0,X1,X2,X3,Y0,Y1,Y2,Y3):
  return c*(X0*X3*Y0*Y1+X1*X2*Y2*Y3),(X0*Y1+X2*Y3)^2,c*(X0*X1*Y1*Y2+X2*X3*Y0*Y3),(X1*Y2+X3*Y0)^2

additionlaws = add0,add1,add2,add3

for A in additionlaws:
  for B in additionlaws:
    for C in additionlaws:
      for D in additionlaws:
        A0,A1,A2,A3 = A(X0,X1,X2,X3,Y0,Y1,Y2,Y3)
        B0,B1,B2,B3 = B(A0,A1,A2,A3,Z0,Z1,Z2,Z3)
        C0,C1,C2,C3 = C(Y0,Y1,Y2,Y3,Z0,Z1,Z2,Z3)
        D0,D1,D2,D3 = D(X0,X1,X2,X3,C0,C1,C2,C3)
        assert B0*D1-B1*D0 in I
        assert B0*D2-B2*D0 in I
        assert B0*D3-B3*D0 in I
        assert B1*D2-B2*D1 in I
        assert B1*D3-B3*D1 in I
        assert B2*D3-B3*D2 in I
        print([x.__name__ for x in (A,B,C,D)])

oh my god this is the most Dan code ever

Wrenna Robson (Sep 29 2022 at 10:45):

I think morally I wouldn't call this a proof per se in the same way we use that word on Lean, but it seems to work on a cursory scan.

David Ang (Oct 26 2022 at 15:53):

API for addition of K-rational points

Junyan Xu (Nov 10 2022 at 06:49):

I just came across this proof and it seems to me that the computation involved is feasible for Lean 3, and the prerequisites are also available. Maybe someone could give it a try.

Andrew Yang (Nov 10 2022 at 06:56):

I think this is the "function field" approach mentioned by Kevin and others above. It seems like the "explicit calculation" doesn't really work when the characteristic is 2?

Riccardo Brasca (Nov 10 2022 at 07:00):

It uses short weirstrass equation it seems. In characteristic 2 and 3 it's not always possible to write the curve in that way

Kevin Buzzard (Nov 10 2022 at 07:29):

Yes the problem is precisely that this proof doesn't work in char 2 as it stands and neither does the Edwards curve approach. In fact it seems to me that there might be no formalisation of associativity in characteristic 2 in any system

Eric Rodriguez (Nov 10 2022 at 11:27):

is a char2-only proof just as tricky as a general proof?

David Ang (Nov 10 2022 at 14:39):

If we’re going by the brute-force approach, there are char2-specific simplifications e.g. removing two of the five coefficients, replacing (a-b)^2 with a^2+b^2, etc, but it still doesn’t reduce the computation to a level that’s feasible for Lean3’s ring tactic, especially because there’s no good way to work with expressions in char2

Yaël Dillies (Nov 10 2022 at 14:45):

Is the distinction really char 2 vs char not 2 or rather 2 is invertible vs 2 not invertible? I thought I understood it was the latter

Johan Commelin (Nov 10 2022 at 14:46):

I think we're working over a field for now

Adam Topaz (Nov 10 2022 at 14:54):

Trying to do elliptic curves over general rings (or schemes) would complicate things even further.

Riccardo Brasca (Nov 10 2022 at 14:59):

If we define an elliptic curve as a group scheme such that blah blah, at least associativity will be easy... (just joking)

Kevin Buzzard (Nov 10 2022 at 16:30):

Yes, we need all the definitions, but the point is that we have a lot of the tools necessary to start proving nontrivial theorems about this model. The concrete approach is important for applications, e.g. proving that certain explicit Diophantine equations have some explicit set of rational solutions. As Lean 4 becomes a reality then it will be interesting to see what we can do here.

Junyan Xu (Nov 10 2022 at 19:04):

I think the following argument works: to show injectivity, assume that $[I_P] = [I_Q]$ in the class group (see [MO answer] for the definition of $I_P$ ), then multiplying by $[I_{-Q}]$ we have $[I_{P+(-Q)}] = [I_P I_{-Q}] = [I_Q I_{-Q}] = [I_O] = 0$ . If $P\ne Q$ then $P+(-Q)\ne O$ so assume it's the point (x,y), then $[I_{P+(-Q)}] = 0$ says (X-x, Y-y) is a principal ideal in R, which means that (X-x, Y-y) = (X-x, Y-y, f) = (g, f) for some $g \in k[X,Y]$ . But the cubic f=0 cannot intersect g=0 at a single point, contradiction. (I think the only place where Δ ≠ 0 will be needed is in showing $[I_{P+Q}] = [I_P I_Q]$ .)

Kevin Buzzard (Nov 10 2022 at 20:52):

The cubic f=0 can certainly intersect g=0 at a single point, for example when if g is linear it can meet f at a point of order 2. I don't see how to finish easily from here. The only low level proof I know assumes char not 2.

Kevin Buzzard (Nov 10 2022 at 20:55):

Even if we assume some conceptual stuff then this argument boils down to the assertion that there's no function on E with a simple zero at a given finite point and a simple pole at infinity, so it's the statement that the cubic doesn't have genus 0 (for some definition of genus), but is there an easy way of seeing that? I don't know one

Kevin Buzzard (Nov 10 2022 at 20:56):

This is my understanding of the stumbling block in this approach but I might be missing something

Junyan Xu (Nov 10 2022 at 21:04):

Yes I was wrong, it's necessary to assume some regularity of f in the last part of the argument, for example if g is linear f=0 is one line intersecting g=0 union two lines parallel to g=0 then (g,f) could be a maximal ideal. But if f=0 is smooth it certainly follows from Bezout's theorem; the intersection multiplicity at infinity can be at most deg g, so the total intersection multiplicity in the affine chart is at least 2 * deg g, which cannot be 1. I imagine Bezout is still easier to prove than Riemann-Roch.

Kevin Buzzard (Nov 10 2022 at 22:20):

Even with bezout you will have this annoying issue about things meeting at infinity eg you'll have to rule out g meeting f once away from infinity and 3d-1 times at infinity

Junyan Xu (Nov 11 2022 at 01:19):

Let me note that since f has the form y^2 + p(x) y + q(x) and is in the ideal (f,g), we may assume that g is of the form p(x) y + q(x) and therefore solve y = -q(x) / p(x). I feel that polynomials in one variable should be easy to deal with, but I haven't worked out a proof.

Kevin Buzzard (Nov 11 2022 at 08:10):

The reason I'm nervous about believing there's an easy proof is that in the char not 2 case another way of saying that the ideal isn't principal is that the only solutions to y^2=x(x-1)(x-lambda) in k(t) are all in k, which is a delicate diophantine argument (product of three coprime polys is a square so they're all squares and now you're doing a 2-descent) -- it's arithmetic rather than algebraic.

Junyan Xu (Nov 12 2022 at 04:05):

Okay I hope the following argument now contains no gap, but it depends on ideal norms so I think we first need to show that $k[X,Y]/(f)$ is a Dedekind domain; we could go the regular + dimension 1 route which uses #16860. Ideal norms have been PR'd #17203 but only for finite extensions of ℤ; @Anne Baanen is it easy to generalize your developments to all Dedekind domains?

So our goal is to show the ideal $(X-x, Y-y) \subseteq k[X,Y]/(f)$ is not principal (when $f(x,y)=0$ ), and we consider the degree 2 extension $k[X] \to k[X,Y]/(f)$ and the associated norm. Every element in $k[X,Y]/(f)$ can be written as $p(X)Y+q(X)$ , whose conjugate is $p(X)(-Y-a_1 X-a_3)+q(X)$ , so $N(p(X)Y+q(X))=-(X^3+a_2 X^2+a_4 X+a_6)p(X)^2-(a_1 X+a_3)p(X)q(X)+q(X)^2$ . Let a and b be the degrees of p and q respectively, then the three terms are 2a+3, a+b+(0 or 1), and 2b respectively. When $a+1\ge b$ the first term has the highest degree (without tie), and when $a+1<b$ the last term has the highest degree (without tie), so the degree of the norm of an element is either 2a+3 or 2b and never 1. On the other hand, the norm of the ideal $(X-x, Y-y)$ is $(X-x)$ if x isn't 2-torsion and $(X-x)^2$ if x is. If $(X-x, Y-y)$ were generated by a single element then its norm $(X-x)$ would be generated by the norm of that element, so the norm is a constant multiple of $X-x$ (degree 1, impossible) or $(X-x)^2$ . If the norm of $p(X)Y+q(X)$ is $c(X-x)^2$ , since the degree is even it can't be 2a+3 and must be 2b=2 so we are in the second case, and we have b=1 and a+1<b means that p(X)=0, so the norm is $q(X)^2=c(X-x)^2$ and $q(X)=c'(X-x)$ . But it's easy to show $(X-x, f)=(X-x, (Y-y)^2)\ne (X-x, Y-y)$ in this case.

Indeed I'm solving the norm equation in polynomials here but fortunately it's easy in this case :) This argument is inspired by Exercise I.6.2 in Hartshorne's GTM52, and I checked out the exercise after seeing the comment here.

Junyan Xu (Nov 12 2022 at 08:43):

I think we probably want to develop this general version of ideal norms (notes by Sutherland) in mathlib. To show regularity of k[X,Y]/(f), we may develop stacks#07PF, or may be we can show that the localizations at all points are PID more directly and use the 7th characterization of Dedekind domains here.

Riccardo Brasca (Nov 12 2022 at 09:38):

I think @Anne Baanen and @Alex J. Best are working on the ideal norm.

Anne Baanen (Nov 14 2022 at 15:13):

At the moment we only have working code for finite extensions of ℤ indeed, but I did do some work on the norm definition using this general suggestion, [M_p : N_p]A_p := det φ̂_p. For example, nat_abs_det_equiv in #17299 links this determinant to the cardinality of the quotient. I don't know how developed the theory of localizations for Dedekind is now, to say how quickly we can get the full definition working nicely.

Anne Baanen (Nov 14 2022 at 15:14):

My work in progress can be found on branch#ideal_norm. It is somewhat duplicate with the PRs mentioned above, but should have some more results toward this determinant-based definition of the norm.

Riccardo Brasca (Nov 14 2022 at 15:26):

Let me review some of the PR involved!

David Ang (Nov 14 2022 at 17:04):

David Ang said:

API for addition of K-rational points

I made a PR last month giving a basic API for the addition of K-rational points, the type of which is defined as an inductive over the point at infinity and the rest of the affine points. There were some discussion with @Michael Stoll in the PR page that perhaps we don't want this to be the definition, because it might be unlikely that associativity would be proven easily with this definition. So far the current approaches discussed are (if I didn't miss any):

brute-force (Lean3's ring is ~4 times too slow for this, and even if it were fast enough I need a file with thousands of lines of polynomials),
injecting it into the class/Picard group (need to define the associated ring),
using some form of Cayley-Bacharach (need some form of Bezout), and
translating it into Edwards form (need to define Edwards curves, and this doesn't work for char 2).

The last three methods would involve translating the current definition of affine addition into their respective contexts, then showing that this addition is the same as the translated addition. The ideal definition is obviously to define the Weierstrass model via the Proj construction (or even as a group scheme etc), but I doubt we are close to doing that. I'm thinking that we should fix the affine definition, partly because it's the definition that's easiest to work with and prevalent in other theorem provers, and partly because I already proved a bunch of other things with this without needing associativity. For whatever proof we have for associativity that comes later, we can simply introduce the necessary translation and define an equiv with the affine definition. Thus I'm asking for opinions via a poll here.

David Ang (Nov 14 2022 at 17:08):

/poll What should I do for #17194?
The current affine definition should be approved, and should be the definition (at least until we can give coordinates via Riemann-Roch)
The current affine definition should be approved, but should be refactored as a theorem/equiv once we have a proof of associativity
The current affine definition should be rejected, and we should wait for a better definition with a proof of associativity

Junyan Xu (Nov 15 2022 at 01:31):

I just found that the Picard group proof @Michael Stoll proposed two weeks ago on Github is almost the same as the proof I worked out, except that it's stated in terms of the (smooth) projective curve while mine is totally restricted to the affine chart, working with a Dedekind domain; and in the last part of the argument, I use the norm to show the maximal ideals (with residue field k, i.e. inertial degree 1) can't be principal, while Michael uses that the orders of zeros/poles of a rational function sum up to 0, which is essentially equivalent to Bezout for the cubic and arbitrary plane curves defined by the numerator and denominator of the rational function. (It's also possible to work with abstract curves (formed by valuations of the tr.deg. 1 function field) but that would be more theory development, and some glue is necessary to connect to the concrete plane curve model.) So I think my approach is more elementary (purely commutative algebra). Although the approaches to this last part are distinct, they also share some similarity: showing a rational function cannot have a simple pole at infinity and no pole elsewhere in his approach is the same as showing an element in the Dedekind domain cannot have norm of degree 1. I should have seen his proof in email notification but had forgotten about it when I began to pursue my approach :)

Junyan Xu (Nov 15 2022 at 01:31):

Although a full theory of ideal norms is desirable (it unifies the number field and function field cases), it's also possible to take a shortcut if we just want to prove the maximal ideals (with residue field k) in the Dedekind domain are non-principal: for algebraic number rings we have a natural notion of size of an ideal, which is the cardinality of the quotient. In our case where all ideals are vector subspaces over the base field k, we may use the k-dimension of the quotient (which is additive rather than multiplicative, but we don't even need this fact): the quotient at a rational point has dimension 1, but we can show that the quotient by any nonzero principal ideal (pY+q) has dimension equal to the degree of N(pY+q), which is at least 2 from previous argument. (And when I write this I realize I made an error in that argument: there's no need to split out the 2-torsion case; the norm of the ideal is still (X-x) rather than its square.) My argument uses this result just merged into mathlib:

def ideal.quotient_equiv_pi_span
  (I : ideal S) (b : basis ι R S) (hI : I ≠ ⊥) :
  (S ⧸ I) ≃ₗ[R] Π i, (R ⧸ ideal.span ({I.smith_coeffs b hI i} : set R))

The determinant of the multiplication by pY+q on S=k[X,Y]/(f) as a k[X]-linear map (R=k[X]) is easily calculated to be equal to N(pY+q) using the basis {1,Y}. On the other hand, it's also associated to the product of the smith_coeffs: we may consider S → (pY+q) → S → S where the first map sends each element in the Smith basis to the corresponding smith_coeff times that element (which sends a basis to a basis so is a R-linear_equiv), the second map sends every element x ∈ (pY+q) to (pY+q)⁻¹ * x (which is also a R-linear_equiv when p,q are not both zero since S is a domain), and the third map is multiplication by pY+q. Then last two maps "cancel", so the net effect of the composition of the three maps is multiplying by smith_coeffs and so has determinant equal to the product of the smith_coeffs. On the other hand, the first two maps compose to an R-automorphism of S, whose determinant is a unit (docs#linear_equiv.det). Therefore, N(pY+q), which is the determinant of the third map, is associated to and hence has the same degree as the product of the smith_coeffs. According to ideal.quotient_equiv_pi_span, the dimension of S/(pY+q) over k is the same as the sum of the degrees of the smith_coeffs, which is the same as the degree of N(pY+q), and we are done.

With this shortcut, I don't think we even need to show S is Dedekind, only that S is a domain, i.e. the polynomial f in k[X,Y] is irreducible; if f splits into a product (Y+p(X))(Y+q(X)) it should be easy to show Δ = 0.

Junyan Xu (Nov 15 2022 at 01:32):

Regarding #17194, I think the definition of addition and the case splits are fine, and are quite amenable for showing $[I_P I_Q]=[I_{P+Q}]$ . However, I think def weierstrass should be made a polynomial f : k[X][X] since we're gonna define the Dedekind domain as k[X] adjoining a root of f; we can easily evaluate the polynomial at any point in the affine plane.

Kevin Buzzard (Nov 15 2022 at 05:23):

Yeah I believe that this argument should work even in char 2 and looks like the most promising approach currently

Kevin Buzzard (Nov 15 2022 at 05:23):

@David Ang

David Ang (Nov 15 2022 at 11:58):

Junyan Xu said:

Regarding #17194, I think the definition of addition and the case splits are fine, and are quite amenable for showing $[I_P I_Q]=[I_{P+Q}]$ . However, I think def weierstrass should be made a polynomial f : k[X][X] since we're gonna define the Dedekind domain as k[X] adjoining a root of f; we can easily evaluate the polynomial at any point in the affine plane.

I've defined weierstrass_polynomial for this - I'm guessing the ring is now just K[X][X] ⧸ ideal.span ({E.weierstrass_polynomial} : set K[X][X])?

David Ang (Nov 15 2022 at 15:53):

Probably not the right place to ask but I will ask here anyway since it's relevant: I was trying to define the map $P \mapsto [I_P]$ and I encountered a deterministic timeout, so I tried extracting a MWE:

import ring_theory.class_group
open polynomial
open_locale polynomial

variables (K : Type) [field K]

noncomputable def cusp : K[X][X] := X ^ 2 - C (X ^ 3) -- supposed to be weierstrass_polynomial
@[reducible] def cusp_ring := K[X][X] ⧸ ideal.span ({cusp K} : set K[X][X])
instance : is_domain $ cusp_ring K := sorry
instance : is_dedekind_domain $ cusp_ring K := sorry

noncomputable def to_class : K → class_group (cusp_ring K) := -- supposed to be E(K) -> class_group
λ x, class_group.mk0 ⟨ideal.span {quotient.mk' $ C (X - C x)}, sorry⟩

The last line takes ~10 seconds to compile for me, and making it more complicated (e.g. changing K to E(K) and making the definitions actually correct) makes it time out - any idea what's going on? I can change ... : K → ... := λ x, ... to ... (x : K) : ... := ... and it's fast, but for E(K) I'd rather pattern match the inductive...

Junyan Xu (Nov 15 2022 at 17:34):

It takes <4s on my machine and I'm not yet sure how to speed it up. By the way I think you want to use the default constructor which doesn't require is_dedekind_domain, and you can directly show $[I_{-P}]$ is the inverse of $[I_P]$ because $[I_P I_{-P}]=[I_O]=[R]$ . Showing is_dedekind_domain R would be more work, though certainly doable.

Junyan Xu (Nov 15 2022 at 17:36):

and K[X,Y]/(Y^2-X^3) is definitely not a Dedekind domain; (Y/X)^2=X so it's not integrally closed.

David Ang (Nov 15 2022 at 17:47):

Yeah the Y^2 = X^3 was just a stupid example to show that it’s very slow - once someone approves #17194 I can write down the actual definition that times out (and I don’t know how to fix)

Junyan Xu (Nov 15 2022 at 19:32):

Oooh I just realized the weierstrass polynomial can never be factored as (Y+p(X))(Y+q(X)), because pq needs to have degree 3 so p+q has degree at least 2 but needs to have degree 1. So k[X,Y]/(f) is always a domain no matter whether Δ = 0 and my argument actually shows that the nonsingular rational points in Spec k[X,Y]/(f) always form a group. Maybe we should do things in this generality, @Kevin Buzzard? It would be useful for bad (additive/multiplicative) reduction of elliptic curves, right?

Michael Stoll (Nov 15 2022 at 19:37):

Doing it in this generality would definitely be good!

Andrew Yang (Nov 15 2022 at 21:28):

David Ang said:

the ring should now be just K[X][X] ⧸ ideal.span ({E.weierstrass_polynomial} : set K[X][X])

docs#adjoin_root might be useful here

David Ang (Nov 15 2022 at 22:13):

Andrew Yang said:

David Ang said:

the ring should now be just K[X][X] ⧸ ideal.span ({E.weierstrass_polynomial} : set K[X][X])

docs#adjoin_root might be useful here

Thanks, this magically fixes literally all my current timeout issues as well :saluting_face:

Eric Rodriguez (Nov 15 2022 at 22:27):

adjoin_root has some diamond somewhere iirc, I have a fix on the splitting field diamond fix but I haven't kept that up to date sadly... I'll try finish that soon so that doesn't cause any issues here

Junyan Xu (Nov 16 2022 at 03:49):

Doing it in this generality would definitely be good!

So we need to introduce a new structure without the Δ : R× and make docs#EllipticCurve extends it, and define addition for this new structure more generally. What should the new structure be called? The constructor point.some (x y : K) (h : E.weierstrass_equation x y) would take a nonsingularity condition (or do we want to consider all rational points including the singular point?), and we may define another constructor for points on an EllipticCurve without that condition.

BTW @David Ang In #17194 I think it's better to use [comm_ring] instead of [field] whenever possible; I think weierstrass_equation/polynomial and neg don't need [field], only addition needs it.

P.S. I just learned that is the discriminant of $y^2+y=x^3-x$ src#EllipticCurve.inhabited :) Looks like it's not so random, but maybe just law of small number.

Andrew Yang (Nov 16 2022 at 04:00):

I think that is one of the reasons why 37 is 37 and not 42 or something else.

Junyan Xu (Nov 16 2022 at 04:06):

I think it's always advertised as a random choice, but I've now found another mention of its appearance in elliptic curves ...

Mario Carneiro (Nov 16 2022 at 04:28):

it's definitely not a random choice. It's meme status on this zulip originates from @Kevin Buzzard using it as his favorite number

Mario Carneiro (Nov 16 2022 at 04:28):

and I'm pretty sure the elliptic curve connection is part of it

Thomas Browning (Nov 16 2022 at 05:21):

I didn't know about the elliptic curve connection. I thought it just had to do with it being the first irregular prime.

Junyan Xu (Nov 16 2022 at 05:32):

and it's also the smallest prime not dividing the order of Monster ... and 59 is the only irregular prime dividing the order of Monster.

it's definitely not a random choice.

I always thought it's a randomly chosen prime number.

Johan Commelin (Nov 16 2022 at 06:54):

Never attribute to randomness that which is adequately explained by expertise in number theory.

Antoine Chambert-Loir (Nov 16 2022 at 08:58):

Just a remark on this amazing discussion: When you look at the first papers of Hasse on elliptic curves, 1936, the language was essentially field-theoretic. What made the theory work was that for curves, closed points correspond to places/valuations on the function field. This explains why @Junyan Xu it is not really necessary to develop the Dedekind-ring theory for these results to be proved. On the other hand, the Riemann-Roch theorem is a crucial component of Hasse's papers, it has been proved a few years earlier by F.K. Schmidt. Only with Weil would the theory start to be framed in a geometric language. (For Hasse and Deuring, the product of a curve $C/k$ by another one $E/k$ was constructed as the curve $C$ with scalars extended to the function field of $E$, and its places could be separated into constant one and non-constant ones, leading to an understanding of the divisor class group of the surface $C\times E$ modulo rational equivalence and its action on divisors of $C$, mapping them to divisors on $E$.)

David Ang (Nov 16 2022 at 10:53):

Junyan Xu said:

Doing it in this generality would definitely be good!

So we need to introduce a new structure without the Δ : R× and make docs#EllipticCurve extends it, and define addition for this new structure more generally. What should the new structure be called? The constructor point.some (x y : K) (h : E.weierstrass_equation x y) would take a nonsingularity condition (or do we want to consider all rational points including the singular point?), and we may define another constructor for points on an EllipticCurve without that condition.

BTW David Ang In #17194 I think it's better to use [comm_ring] instead of [field] whenever possible; I think weierstrass_equation/polynomial and neg don't need [field], only addition needs it.

P.S. I just learned that is the discriminant of $y^2+y=x^3-x$ src#EllipticCurve.inhabited :) Looks like it's not so random, but maybe just law of small number.

WeierstrassCurve? I will do the comm_ring change. I think Kevin was intentionally trying to find a curve with discriminant 37 to put in there

Kevin Buzzard (Nov 16 2022 at 11:28):

It's the first rank 1 elliptic curve.

David Ang (Nov 16 2022 at 12:18):

I can refactor both EllipticCurve.lean and EllipticCurve/point.lean to include singular Weierstrass equations, but I think it's better to have this approved first (since it's valid) and do that immediately in a separate PR

Michael Stoll (Nov 16 2022 at 16:14):

We could take a curve of conductor 5077 instead...

Oisin McGuinness (Nov 16 2022 at 19:57):

Michael Stoll said:

We could take a curve of conductor 5077 instead...

Of course 5077 is the smallest conductor of an elliptic curve of rank 3 (this has been proved), and also stands out for its role in the resolution of the class number problem for imaginary quadratic fields (Goldfeld and Gross-Zagier).

My personal favorite would be 19047851, conductor of smallest known (not yet proved, apparently searches haven't quite proved this yet) conductor of rank 5; entry in LMFDB here: Elliptic curve with LMFDB label 19047851.a1, only relatively recently added to LMFDB. See 5th page of this paper this curve's publication.

Junyan Xu (Nov 16 2022 at 23:28):

LMFDB doesn't seem to have Elkies' curves; at least the only known curve (over Q) of rank ≥ 28 y^2 + xy + y = x^3 − x^2 − 20067762415575526585033208209338542750930230312178956502x +34481611795030556467032985690390720374855944359319180361266008296291939448732243429 isn't among the results returned from searching bad prime 48463. (I found the bad prime from this page.) These curves are really special as it's conjectured that only finitely many elliptic curves over Q has rank > 21.

David Michael Roberts (Nov 17 2022 at 04:42):

Another fact: it's a factor of the conductor of what seems to be the oldest recorded curve on which someone (Diophantus, who else) found a rational point, 8732.b1.

David Michael Roberts (Nov 17 2022 at 04:59):

@Junyan Xu do we know the torsion subgroup of the Mordell–Weil group of Elkies' curve?

David Michael Roberts (Nov 17 2022 at 05:43):

Oh, I see that it has trivial torsion. Took a little search to track down that email...

Junyan Xu (Nov 17 2022 at 07:06):

I think the 123-digit number in Elkies's email must have been factored as it's not hard nowadays (at most several days according to this 2018 post). The paper I linked also mentioned "passing a list of bad primes" of the curve but I'm not sure if it means all of them; it doesn't mention how the global root numbers were obtained.

Apparently it gets harder to find elliptic curves of high ranks when there is more torsion: the records in this paper are {20,15,13,8,9,6} for Z/{2,3,4,5,6,7}Z, and heuristics suggest that there are only finitely many of ranks above {13,9,7,5,5,3}.

Elliptic curves in nature (mentioned in the MO thread) is also a nice read :) with 5077-a1 the last example.

Junyan Xu (Nov 17 2022 at 07:16):

Oh, someone added the factorization to factordb.com in 2019: 315574902691581877528345013999136728634663121 x 376018840263193489397987439236873583997122096511452343225772113000611087671413

Kevin Buzzard (Nov 17 2022 at 23:24):

So @David Ang and I went through @Junyan Xu 's approach today at work and convinced ourselves that it should work fine, even in char 2. I realise that I am still a bit confused about some of the story.

If $R$ is a commutative ring and $A$ is an $R$ -algebra which is finitely-generated and projective (or locally free, which might be slightly different in the non-Noetherian case) as an $R$ -module, then there's a norm map $N:A\to R$ , a monoid homomorphism, defined by considering multiplication by $a\in A$ as an $R$ -module endomorphism of $A$ and then taking its determinant (this and the trace are sufficiently "canonical" that they can be computed locally, meaning that $A$ doesn't have to be free, just locally free).

Now if $J$ is an ideal of $A$ we can define its norm to be the ideal of $R$ generated by the norms of all the elements. What I am slightly confused by is that we don't need to assume $A$ is a Dedekind domain in this definition, and going through the argument it seemed to me that all one needs is that the norm of the ideal $(g)$ is the ideal $N(g)$ . In particular we don't need to prove that $k[X,Y]/(Y^2-a_1XY-...)$ is a Dedekind domain. Did I miss something?

Is the idea that one can define the norm of an ideal of $A$ as an ideal of $R$ but one can't prove that this construction is multiplicative unless $R$ is a Dedekind domain and perhaps more too?

Kevin Buzzard (Nov 17 2022 at 23:24):

@Anne Baanen are you working on this?

Kevin Buzzard (Nov 17 2022 at 23:25):

I think David might want to "start at the top", deducing associativity from injectivity and then starting on injectivity and perhaps meeting others coming up from the bottom.

Anne Baanen (Nov 18 2022 at 10:14):

Yes, the idea is just that bundling the map as a homomorphism requires stricter assumptions

Anne Baanen (Nov 18 2022 at 10:16):

In particular we needed for our project that it returns a number, not ideal, and that the map is multiplicative, so we restrict to rings of integers of number fields

Anne Baanen (Nov 18 2022 at 10:17):

I'm pretty sure I have some code on the ideal generated by the norms, let me look it up

Kevin Buzzard (Nov 18 2022 at 12:12):

Aha -- the norm of an ideal is an ideal in my comments above, but then I suspect that you're in the special case where $R=\Z$ and so instead of an ideal you can just take its non-negative generator. Alternatively you can just take the size of $R/N(J)$ if it's finite.

Anne Baanen (Nov 18 2022 at 12:28):

In fact, I recall our development went span of norms → generator of span → cardinality of the quotient as we figured out exactly what we required :)

Alex J. Best (Nov 18 2022 at 12:55):

Yeah some things were way easier to prove with one version than the other in the end

Alex J. Best (Nov 18 2022 at 12:55):

And indeed we considered a version of relative norm, but taking the positive generator of a Z ideal was very convenient for now

Anne Baanen (Nov 18 2022 at 13:09):

Anne Baanen said:

I'm pretty sure I have some code on the ideal generated by the norms, let me look it up

WIP on branch#ideal-span-norm

Junyan Xu (Nov 18 2022 at 22:32):

Kevin Buzzard said:

In particular we don't need to prove that $k[X,Y]/(Y^2-a_1XY-...)$ is a Dedekind domain.

Yes, this is what I concluded in the latest version of my argument, and why I said the argument could be extended to Weierstrass curves with nonzero discriminant / domains that are not Dedekind. In fact I think this part of the argument is now even easier than the other part (showing $[I_{P+Q}]=[I_P I_Q]$ ).

Ideal norm is usually defined either as module index as in Sutherland's notes, or as the cardinality of the quotient ring (for rings of algebraic integers A relative to R=Z as in #17203 and in Conrad's notes). The former definition requires the base ring R to be Dedekind but and latter definition requires R=Z, but both definitions make sense without A being Dedekind. However, indeed multiplicativity may fail when A isn't Dedekind: see Example 8.2 in [Conrad] for an example where |A/P|=p but |A/P^2|=p^{n+1}. (Since A is of Krull dimension 1, It should be clear that for P nonzero prime the dimension of P/P^2 over A/P is 1 iff A is regular at P.) We also have

Remark 7.6. In a domain where all nonzero ideals have finite index, the multiplicativity of the ideal norm implies unique factorization of ideals.
[3] H. S. Butts and L. Wade, Two Criteria for Dedekind Domains, Amer. Math. Monthly 73 (1966), 14-21.

If we instead define ideal norm as the ideal generated by norms of elements as you do, it's also not obviously multiplicative (because the norm of a sum has no obvious relation with the norms of the summands), but N((g)) = N(g), N(gI) = N(g)N(I) and N(I)N(J) ≤ N(IJ) are obvious; curiously this inequality seems to be opposite to what holds with the usual definition. We also know that the two definitions agree when A is a Dedekind domain, see Theorem 7.10 in [Conrad] or Corollary 6.9 in [Sutherland], in which case both are multiplicative. When A isn't Dedekind they need not agree: if A=Z+2Z[√-1] and p=2Z[√-1] (like in Example 8.2), then |A/p|=2 but N(p)=4, and |A/p^2|=8 but N(p^2)=16, so multiplicativity fails for the usual ideal norm but not your ideal norm, and I struggle to come up with counterexample to multiplicativity of your ideal norm.

Notice that [Sutherland] develops the theory of module index for modules that are lattices over a Dedekind domain; lattices are submodules of vector spaces, so they are torsion-free and hence (beingn over a Dedekind domain) projective, which is exactly the setting under which you defined the norm of an element.

David Ang (Nov 19 2022 at 21:23):

This PR, which depends on #17194 and two small PRs, proves case-by-case that the Weierstrass polynomial is irreducible and hence that the quotient ring is a domain, and this branch defines the add_monoid_hom $E(K) \to \mathrm{Cl}(R)$ and a proof of the group law on $E(K)$ assuming injectivity and that it respects addition - feel free to make commits into it

Antoine Chambert-Loir (Nov 21 2022 at 21:01):

Kevin Buzzard said:

If $R$ is a commutative ring and $A$ is an $R$ -algebra which is finitely-generated and projective (or locally free, which might be slightly different in the non-Noetherian case) as an $R$ -module,

it depends on what you call “locally free”. It's equivalent if you define it as the existence of an open covering of $\mathrm{Spec}(A)$ by, say $D(f_i)$ such that each module $M_{f_i}$ is free and finitely generated. It is not equivalent if, by that, you mean that the localization $M_{p}$ at all prime ideals are free modules over the local rings $A_p$ .

… then there's a norm map $N:A\to R$ , a monoid homomorphism, defined by considering multiplication by $a\in A$ as an $R$ -module endomorphism of $A$ and then taking its determinant (this and the trace are sufficiently "canonical" that they can be computed locally, meaning that $A$ doesn't have to be free, just locally free).

Indeed.

Now if $J$ is an ideal of $A$ we can define its norm to be the ideal of $R$ generated by the norms of all the elements. What I am slightly confused by is that we don't need to assume $A$ is a Dedekind domain in this definition, and going through the argument it seemed to me that all one needs is that the norm of the ideal $(g)$ is the ideal $N(g)$ . In particular we don't need to prove that $k[X,Y]/(Y^2-a_1XY-...)$ is a Dedekind domain. Did I miss something?

What is not clear to me is what you expect of this norm, especially when the ideals are not locally principal.

Is the idea that one can define the norm of an ideal of $A$ as an ideal of $R$ but one can't prove that this construction is multiplicative unless $R$ is a Dedekind domain and perhaps more too?

There is one case where good things happen, namely when your ideals are locally principal.
In this case, the norm will be multiplicative: this reduces to the multiplicativity of the determinant in local charts.

Kevin Buzzard (Nov 21 2022 at 22:27):

Re what we expect of the norm: the only thing needed is the observation that the norm of $(X-x,Y-y)$ down to $k[X]$ is principal generated by a polynomial of degree 1, but that if $(g)$ is principal then its norm is generated by $N(g)$ which is never a polynomial of degree 1.

Junyan Xu (Nov 22 2022 at 07:42):

I've previously outlined on Github a unified way of proving that the point given by the addition and doubling formulas lies on the Weierstrass curve, and I think it's most convenient to use a multivariate chain rule, so I asked about it on Zulip, though it may be more straightforward to compute the derivative of $f(X,g(X))$ directly (where g is linear). I have now also worked out a unified way to show $[I_{P+Q}]=[I_P I_Q]$ whenever $P ≠ -Q$ and both are not $O$ :

Let L be the slope of the line connecting the two points $P=(x_1, y_1)$ and $Q=(x_2, y_2)$ on the Weierstrass curve $f(X,Y)=0$ , or of the tangent line when the two points are equal. This is called add_L or dbl_L in #17194. Define $g(X) := L(X-x_1) + y_1$ . As shown in the outline, $f(X, g(X))=(X-x_1)(X-x_2)(X-x_3)$ with $x_3$ the x-coordinate of $P+Q$ . It suffices to show that $(X-x_3) I_P I_Q = (Y - g(X)) I_{P+Q}$ .

Since P,Q both lie on the line $Y=g(X)$ , it is easy to show that $I_P = (X-x_1, Y-y_1) = (X-x_1, Y-g(X))$ and $I_Q = (X-x_2, Y-g(X))$ ; on the other hand, P+Q lies on $\sigma(Y)=g(X)$ so $I_{P+Q} = (X-x_3, \sigma(Y)-g(X))$ , where $\sigma(Y) := -Y - a_1 X - a_3$ is the conjugate of Y. (It would be most convenient to define these ideals via docs#ideal.map from K[X,Y] to the quotient ring.) Notice that we have $Y\sigma(Y) = -(X^3 + a_2 X^2 + a_4 X + a_6)$ in the quotient ring $R[X,Y]/(f(X,Y))$ , and it follows that $(Y-g(X))(\sigma(Y)-g(X))=f(X,g(X))$ in the quotient ring. Therefore $I_P I_Q = ((X-x_1)(X-x_2)) + (Y-g(X)) (X-x_1, X-x_2, Y-g(X))$ and $(X-x_3) I_P I_Q = (f(X,g(X))) + (Y-g(X))(X-x_3) (X-x_1, X-x_2, Y-g(X)) = (Y-g(X)) ((\sigma(Y)-g(X)) + (X-x_3)(X-x_1, X-x_2, Y-g(X)))$ , so it suffices to show $(\sigma(Y)-g(X)) + (X-x_3)(X-x_1, X-x_2, Y-g(X)) = (\sigma(Y)-g(X), X-x_3)=I_{P+Q}$ . We can subtract $\sigma(Y)-g(X)$ from $Y-g(X)$ to get $Y-\sigma(Y)$ , so it suffices to show $(X-x_1, X-x_2, Y-\sigma(Y), f(X,Y))$ is the unit ideal in K[X,Y]. But the vanishing locus of the ideal is nonempty only when $x_1=x_2$ and $f(x_1,Y)$ has a unique double root (since the derivative of $f(X,Y)$ w.r.t. Y is exactly $Y-\sigma(Y)$ ), in which case we must have $P=Q=-P=-Q$ . Thus when $P\ne Q$ the vanishing locus is empty and therefore the ideal is the unit ideal by Nullstellensatz docs#mv_polynomial.vanishing_ideal_zero_locus_eq_radical (and docs#ideal.radical_eq_top). (It'd be slightly annoying but manageable to switch between K[X][X] and the mv_polynomial ring K[X,Y].)

Junyan Xu (Nov 22 2022 at 08:45):

For the case $P=-Q$ , we want to show $I_P I_Q = (x-x_1) = (x-x_2)$ . We have $I_Q=(X-x_2, Y-y_2)=(X-x_1, Y-y_2)=(X-x_1, \sigma(Y)-y_1)$ and $I_P I_Q = (X-x_1)(X-x_1, Y-y_1, \sigma(Y)-y_1)+(f(X,y_1))$ because $f(X,y_1)=(Y-y_1)(\sigma(Y)-y_1)$ in the quotient ring. Notice that $f(X,y_1)$ always has the root $X=x_1$ and it's a multiple root iff the X-derivative of f is zero at $P$ . If $P\ne -P$ , it's easy to show $(X-x_1, Y-y_1, \sigma(Y)-y_1)$ is the unit ideal, and we're done. If $P=-P$ then the Y-derivative of f at $P$ vanishes, so P is a singular point iff the X-derivative also vanish, iff $X=x_1$ is a root of $f(X,y_1)/(X-x_1)$ . Moreover in this case $I_P I_Q = (X-x_1)(X-x_1, Y-y_1, f(X,y_1)/(X-x_1))$ and the three-generator ideal is the unit ideal (and we are done) iff $X=x_1$ isn't a root of $f(X,y_1)/(X-x_1)$ iff P is nonsingular, and is the maximal ideal $(X-x_1, Y-y_1)$ iff it is a root iff P is singular.

Since we know ideals I with R/I one dimensional are never principal (for all Weierstrass curves, no matter singular or not), we see that $[(1)]=[I_O]=[I_{P+Q}]=[I_P I_Q]$ indeed fails in the last case, and it's the only case where it fails. This is natural since the maximal ideal at a singular point should not be invertible, so $[I_P I]\ne [(1)]$ for all ideals $I$ , in particular for $I=I_Q=I_P$ .

To show that the nonsingular points form a group, we still need to show that the sum of two nonsingular points is nonsingular, but that should be easy.

To show that nonzero discriminant implies no singular points exist, Silverman uses a coordinate transform to get a simpler Weierstrass equation, and characteristic 2 requires special treatment.

Alex J. Best (Nov 22 2022 at 22:21):

We have some lean 4 proofs relating singularity and the discriminant in all characteristics at GitHub.com/kisarablue/ec-tate-lean the project is still wip, but that part is sorry free. It sounds like the one you want is the easier one though. It suffices to express the discriminant as a member of the ideal generated by the Weierstrass equation and the partial derivatives, which is just one huge ring call at the end of the day

Kevin Buzzard (Nov 22 2022 at 22:35):

It suffices, but it's not necessary and sufficient because it could have been e.g. that the 37th power of the discriminant is in the ideal, so in some sense we're lucky here.

Alex J. Best (Nov 22 2022 at 22:51):

Indeed, thats why this is the easier direction! Somehow geometry alone doesn't capture the fact the discriminant is the resultant of those polynomials

Junyan Xu (Nov 23 2022 at 08:12):

In fact we could also use a computation-light proof for Δ ≠ 0 ⇒ nonsingular. Mathlib already have docs#EllipticCurve.two_torsion_polynomial.disc_ne_zero which shows over a field of characteristic nonzero, the discriminant of the "two-torsion polynomial" of an elliptic curve (Δ ≠ 0) is nonzero, which means that the two-torsion polynomial has no multiple root, i.e. its derivative at any root is nonzero. But the two-torsion polynomial is exactly a multiple of $f(X, -(a_1 X+a_3)/2)$ (obtained by solving the Y-derivative=0 equation $2Y+a_1 X+a_3=0$ for Y and plugging into f), and its derivative at X=x is the X-derivative of f at $(x, -(a_1 x + a_3)/2)$ minus the Y-derivative of f at the same point times $a_1/2$ , according to the chain rule. For any point $(x,y)$ on the curve where the Y-derivative is zero, we have $y = -(a_1 x+a_3)/2$ , and $x$ is a root of the two-torsion polynomial because $f(x,y)=0$ , so the derivative at $x$ ( $=f_X-a_1 f_Y/2$ ) is nonzero; since $f_Y$ is zero there we conclude $f_X\ne 0$ , so $(x,y)$ is nonsingular. The characteristic 2 case does need special treatment but should be easy.

Kevin Buzzard (Nov 23 2022 at 08:19):

I am secretly hoping that we never have to treat char 2 as a special case but I know this is not always possible in general

David Ang (Nov 23 2022 at 14:33):

I plan to refactor the stuff on two-torsion polynomials (any name suggestions are highly appreciated) once we define n-division polynomials, but it’s there to demonstrate the usage of the cubic discriminant file

Marc Masdeu (Dec 01 2022 at 06:57):

I have a computational proof of discriminant nonzero implies nonsingular, for an appropriate definition of nonsingular (the one needed in the computational proof of associativity)...

David Ang (Dec 06 2022 at 16:47):

Marc Masdeu said:

I have a computational proof of discriminant nonzero implies nonsingular, for an appropriate definition of nonsingular (the one needed in the computational proof of associativity)...

Thank you for this - I originally copied the huge expressions over in #17631, but you can actually avoid them (and the community probably won't like it anyway), by first applying a variable change to (0, 0), then proving that the discriminant is unchanged and the same result holds. I called it a smooth point just because the terminology is shorter (is there actually a semantic difference between being smooth and being nonsingular?)

Kevin Buzzard (Dec 06 2022 at 18:52):

nonsingular is usually a property of objects and smooth a property of morphisms.

Jz Pan (Dec 07 2022 at 00:24):

I have a proof that if a Weierstrass equation over a field has non-zero discriminant, then it does not have rational singular point.

Conversely, if it does not have rational singular point, under the assumption that the field is not of char 2 or 3, or is perfect, then it has non-zero discriminant. (If the assumption on the field does not hold, then there can be singular point defined over a finite inseparable extension.)

Jz Pan (Dec 07 2022 at 00:26):

The computation does not involve lengthy explicit calculations if I remembered correctly.

Jz Pan (Dec 07 2022 at 00:42):

Check this https://github.com/acmepjz/my-lean-test/blob/master/gtm106/weierstrass_equation/non_singular_criterion.lean

Kevin Buzzard (Dec 07 2022 at 00:43):

Very nice!

Junyan Xu (Jan 23 2023 at 05:02):

I just pushed a commit to make #18000 sorry-free. Lean now knows rational points on an elliptic curve over a field form an abelian group! (Actually we show the same for the nonsingular rational points on a possibly singular curve defined by a long Weierstrass equation. As far as we know, associativity has not been proved for long Weierstrass equations in any theorem prover before.)

Thanks to @David Ang's hard work spread out in numerous PRs which reduced the whole argument to the final sorry. He intended to attack the final sorry with another approach, but it depends on branch#ideal-span-norm which is not yet merged into master, so after discussions he generously left the final sorry to me while he works on our joint paper, and I filled it using the original approach suggested by myself. It's a bit painful to fight timeouts and pi instance issues (in the auxiliary file) and to construct a certain isomorphism, but finally Lean accepts the proof!

Thanks for everyone who shared thoughts in this thread; this elementary new proof of associativity is really the fruit of online interchange of ideas. And thanks to all mathlib contributors, whose work our formalization builds upon.

Yaël Dillies (Jan 23 2023 at 07:36):

This is one of those things I thought I would never see solved. Congratulations!

Johan Commelin (Jan 23 2023 at 07:38):

@Junyan Xu @David Ang Huge congratulations!

Riccardo Brasca (Jan 23 2023 at 08:21):

Congratulations! I will review today one of the blocking PR... we really want this in mathlib!

Chris Birkbeck (Jan 23 2023 at 09:03):

Oh amazing work! I'm happy this is finally done :)

Kevin Buzzard (Jan 23 2023 at 09:53):

Just to remark that in particular (as far as I know) this is the first formalisation of associativity which works for all elliptic curves over all fields and in particular in characteristic 2. Furthermore it does not use the coordinate bash, which in this generality involves manipulating polynomials with tens of thousands of coefficients, and does not involve a case split on the characteristic either. The argument is much more conceptual and uses the work of @Anne Baanen et al on class groups of Dedekind domains. The work is a testament to what we are doing here in this community. The effort put into formalising harder and more conceptual commutative algebra means that we can use more powerful tools to take things further than before. Furthermore as far as I can make out, the formalised proof is actually mathematically novel. My understanding is that it has its roots in an exercise in Hartshorne's book on algebraic geometry which specifically excludes characteristic 2. Of course the proof of Fermat's last theorem needs a robust theory of elliptic curves which works in all characteristics.

Anne Baanen (Jan 23 2023 at 10:06):

Amazing! I didn't realize you needed the ideal norm work for this, so I'll definitely hurry up more and finish PR'ing soon :O

Riccardo Brasca (Jan 23 2023 at 10:13):

Concerning the ideal norm I've this comment. I am not sure it is relevant here, and everything is perfectly fine for ℤ, but I am sure it is going to cause troubles later on.

Riccardo Brasca (Jan 23 2023 at 10:14):

I've opened this issue, that is essentially what we need to replace the norm from 𝓞 K with the restriction of the norm from K.

Junyan Xu (Jan 23 2023 at 17:19):

@Anne Baanen To clarify, @David Ang told me he has an approach using your branch#ideal-span-norm (which he has not explained to me), but the way I filled the final sorry relies only on what's already in mathlib. However, our formalization definitely rely crucially on your work everywhere, including but not limited to:
docs#ideal.quotient_equiv_pi_span (Smith normal form stuff, used in the final sorry)
docs#linear_map.associated_det_of_eq_comp (used in the final sorry)
docs#algebra.norm (proof of non-rationality/injectivity from points to class group centers around this)
and of course docs#class_group. It's fortunate that the class group is defined in the generality of all domains, not just Dedekind domains, so that we can generalize from elliptic curves to possibly singular Weierstrass curves. (We use the constructor that takes an invertible (fractional) ideal in an arbitrary domain rather than the one that takes a nonzero ideal in a Dedekind domain.)

Anne Baanen (Jan 23 2023 at 17:26):

Oh I see. It makes me very happy to know my work is useful outside of my immediate area of concern! :D

David Ang (Jan 24 2023 at 00:45):

Junyan Xu said:

Anne Baanen To clarify, David Ang told me he has an approach using your branch#ideal-span-norm (which he has not explained to me), but the way I filled the final sorry relies only on what's already in mathlib. However, our formalization definitely rely crucially on your work everywhere, including but not limited to:
docs#ideal.quotient_equiv_pi_span (Smith normal form stuff, used in the final sorry)
docs#linear_map.associated_det_of_eq_comp (used in the final sorry)
docs#algebra.norm (proof of non-rationality/injectivity from points to class group centers around this)
and of course docs#class_group. It's fortunate that the class group is defined in the generality of all domains, not just Dedekind domains, so that we can generalize from elliptic curves to possibly singular Weierstrass curves. (We use the constructor that takes an invertible (fractional) ideal in an arbitrary domain rather than the one that takes a nonzero ideal in a Dedekind domain.)

The approach I had in mind was just to directly prove that the span_norm of (X-x, Y-y) is (c(X-x)) for some c, so it can’t be the span_norm of (f) for any f, by degree_norm_ne_one using span_norm (f) = (algebra.norm f) which is already in Anne’s branch. For the proof itself I was only left with showing that you can extract c(X-x) from linear combinations of products of X-x and Y-y with their conjugates, but that calculation was already embedded somewhere in XY_ideal_neg_mul, and I was just dreading to refactor that part.

Anne Baanen (Jan 25 2023 at 17:28):

Junyan Xu said:

it depends on branch#ideal-span-norm which is not yet merged into master

I'm happy to announce that I finished the last sorry in span_norm_mul :octopus: This is what I was waiting for before opening the PR, hopefully tomorrow I can get through all of this.

Anne Baanen (Jan 26 2023 at 13:21):

#18303 :tada:

Kevin Buzzard (Jan 26 2023 at 13:27):

Thanks so much!

David Michael Roberts (Jan 27 2023 at 06:23):

It would be nice if this proof was summarised for instance in a blog post, or I guess an article on the arXiv would be even more stable.

Stepan Nesterov (Feb 02 2023 at 21:24):

I know I'm late to the party, but why is characteristic 2 a problem?
I think it suffices to check associativity for the universal curve over Z[A_1,...,A_6], which boils down to the fact that two elements of Z[X,Y,A_1,...,A_6] are equal. But then it suffices to check that they are equal in C[X,Y,A_1,...,A_6], and we can use a short form (and Weierstrass pi even)

Kevin Buzzard (Feb 02 2023 at 21:45):

The approach using explicit polynomials was just too unwieldy

Kevin Buzzard (Feb 02 2023 at 21:47):

You're right that you just have to check that some very large polynomials are equal after applying the ring axioms many thousands of times. But lean 3 couldn't even prove that two equal polynomials of that size were equal. These things were gigantic when you used the full A1..A6 model. I had a PhD student try and fail to pull this off.

Kevin Buzzard (Feb 02 2023 at 21:49):

Furthermore it was decided that this was not the proof we wanted in mathlib anyway, because it would just be a diabolical file

Patrick Massot (Feb 02 2023 at 21:49):

And this way sounds so ugly compared to a conceptual proof. Would you get any understanding of why the theorem is true from such a proof?

Reid Barton (Feb 02 2023 at 21:53):

This argument is supposed to show that you can reduce to the char 0 case without doing any computation.

Mario Carneiro (Feb 02 2023 at 21:55):

I am definitely interested to know how to do this without doing any computation, since that sounds like the more efficient approach from a proof theoretic and compile time standpoint. Even after reducing to char 0 there is still a pretty hard problem remaining, no?

Reid Barton (Feb 02 2023 at 21:59):

Hmm, is there a problem with having to invert the discriminant?

Reid Barton (Feb 02 2023 at 22:06):

I guess not. Here is a geometric reformulation:
Every elliptic curve is pulled back from the universal elliptic curve $E$ over $\mathrm{Spec}\,\mathbb{Z}[a_1, \ldots, a_6, \Delta^{-1}]$ . So it is enough to check that the multiplication on $E$ is associative. Since $E$ is separated this equation has to hold on a closed subset, so it's enough to check it over the generic point $\mathrm{Spec}\,\mathbb{Q}(a_1, \ldots, a_6)$ .

Kevin Buzzard (Feb 02 2023 at 22:06):

I don't really follow the argument. The claim is that the difference between two polynomials in Z[A1,..A6] is in some principal ideal. Marc Masdeu worked out the coefficient which one needs to multiply the generator of the ideal by to make the difference and the polynomial had hundreds of thousands of coefficients. We now need to prove this identity. Lean 3 can't even state the identity without falling over (so that's why this method doesn't work) The observation being made as I understand it is that if we now invert 2 and 3 we can introduce new variables A and B which are messy polynomials in A1..A6 and it turns out that some of the polynomials in question can be rewritten as polynomials in A and B, but the problem is already in the form "prove these two polynomials are equal" and lean has already become unresponsive, it can't even prove that the coefficient is equal to itself by refl, it times out. Furthermore the polynomial with hundreds of thousands of coefficients needs to be explicitly entered into the proof.

Stepan Nesterov (Feb 02 2023 at 22:10):

But then how does the conceptual proof work if Lean times out just because you typed in "A * B * C"?

Mario Carneiro (Feb 02 2023 at 22:11):

because in the conceptual proof you never have to type that much

Mario Carneiro (Feb 02 2023 at 22:12):

just like how you can prove 1234 * 5678 = 5678 * 1234 by an explicit computation, or you can do it by an induction proof and never touch numbers as big as that

Kevin Buzzard (Feb 02 2023 at 22:12):

In the conceptual proof there is no "explicit" addition of three points on an elliptic curve anywhere

Kevin Buzzard (Feb 02 2023 at 22:13):

The largest explicit computation is showing that the class of P+Q is linearly equivalent to the product of the classes of P and Q

Stepan Nesterov (Feb 02 2023 at 22:13):

Reid Barton said:

I guess not. Here is a geometric reformulation:
Every elliptic curve is pulled back from the universal elliptic curve $E$ over $\mathrm{Spec}\,\mathbb{Z}[a_1, \ldots, a_6, \Delta^{-1}]$ . So it is enough to check that the multiplication on $E$ is associative. Since $E$ is separated this equation has to hold on a closed subset, so it's enough to check it over the generic point $\mathrm{Spec}\,\mathbb{Q}(a_1, \ldots, a_6)$ .

Yes, this is exactly what I had in mind. Even more so, because it suffices to check the equality for the three generic points, you don't have 30 cases, you automatically have that the three points are not on a line and nothing is on a line.

Stepan Nesterov (Feb 02 2023 at 22:13):

Although I guess Lean doesn't know about Zariski density yet

Mario Carneiro (Feb 02 2023 at 22:14):

but it sounds like "it suffices to check the equality" still boils down to equality of some huge polynomials in this case

Stepan Nesterov (Feb 02 2023 at 22:15):

Patrick Massot said:

And this way sounds so ugly compared to a conceptual proof. Would you get any understanding of why the theorem is true from such a proof?

I understand why the theorem is true, it's just a pedagogical probelm of teaching it to an undergrad named Lean, who doesn't know what a scheme is :)

Kevin Buzzard (Feb 02 2023 at 22:15):

Lean is well aware of what a scheme is ;-) docs#algebraic_geometry.Scheme

Patrick Massot (Feb 02 2023 at 22:16):

I didn't mean you don't know why it's true, I meant your understanding does not come from that computational proof.

Stepan Nesterov (Feb 02 2023 at 22:16):

Mario Carneiro said:

but it sounds like "it suffices to check the equality" still boils down to equality of some huge polynomials in this case

There were some slick ideas which were proposed for the proof in this topic but then discarded because they would not work in characteristic 2. My observation is just that the char 2 case follow formally from char 0 case.

Kevin Buzzard (Feb 02 2023 at 22:17):

In this chat we use the word "formally" to mean something a little different :-)

Mario Carneiro (Feb 02 2023 at 22:18):

I would use "by abstract nonsense" there

Stepan Nesterov (Feb 02 2023 at 22:18):

By concrete nonsense which is too large for Lean supposedly

Mario Carneiro (Feb 02 2023 at 22:19):

wait, I'm hoping it's not concrete nonsense since that's the method we want to avoid

Kevin Buzzard (Feb 02 2023 at 22:19):

I think that the proof we ended up with is absolutely the best proof. To check the char 0 case is still a horrible computatation if you want to do the computation route

Mario Carneiro (Feb 02 2023 at 22:19):

I'll definitely be reading the paper when it comes out :)

Stepan Nesterov (Feb 02 2023 at 22:19):

Kevin Buzzard said:

In the conceptual proof there is no "explicit" addition of three points on an elliptic curve anywhere

So does Lean have the theorem add_on_elliptic_curve_assoc: (E : elliptic_curve K, A B C : E(K)): A * B * C = A * (B * C) or it times out?

Kevin Buzzard (Feb 02 2023 at 22:19):

The alternative in the char 0 case is the Picard group proof but we have shown that the Picard group proof works in general

Kevin Buzzard (Feb 02 2023 at 22:20):

Yes we have that proof now, for K any field

Stepan Nesterov (Feb 02 2023 at 22:21):

Can you generalise to any ring by the method I proposed?

Kevin Buzzard (Feb 02 2023 at 22:21):

By "we" I mean David Angdinata and Junyan Xu; I was just cheering from the sidelines

Stepan Nesterov (Feb 02 2023 at 22:21):

I don't get what exactly times out

Kevin Buzzard (Feb 02 2023 at 22:22):

I don't think your question about rings makes sense

Mario Carneiro (Feb 02 2023 at 22:23):

As far as what was concretely shown, I see https://github.com/leanprover-community/mathlib/blob/bd7a6fecf38b4b606508eea4a81d7c1f708d7250/src/algebraic_geometry/elliptic_curve/point.lean#L684 , can someone confirm that this is the general group law for an elliptic curve in any characteristic?

Kevin Buzzard (Feb 02 2023 at 22:23):

What times out is that lean cannot cope with manipulating polynomials with hundreds of thousands of coefficients, and we have no guarantee that it will not time out when manipulating the simpler polynomials which come up in the char 0 case

Mario Carneiro (Feb 02 2023 at 22:23):

It doesn't look exactly like Stepan Nesterov 's sketch

Kevin Buzzard (Feb 02 2023 at 22:24):

Yes it's the Picard group proof.

Mario Carneiro (Feb 02 2023 at 22:25):

I thought that there was already a definition of elliptic curves, such that we could write (and sorry) the associativity statement. This W.point type is new with the PR

Kevin Buzzard (Feb 02 2023 at 22:26):

Yes, they prove something more general.

Kevin Buzzard (Feb 02 2023 at 22:26):

They allow curves with discriminant zero and prove that the nonsingular points on such curves form a group

Mario Carneiro (Feb 02 2023 at 22:27):

I see, so is this a candidate to replace the existing definition?

Kevin Buzzard (Feb 02 2023 at 22:27):

If the discriminant is nonzero then all points are nonsingular

Reid Barton (Feb 02 2023 at 22:28):

But there is a different proof also based on the Picard group, that only works in char $\ne 2$ (formalized in Coq).

Kevin Buzzard (Feb 02 2023 at 22:30):

I think they used the short form so it doesn't cover all cases in char 3 either

Stepan Nesterov (Feb 02 2023 at 22:35):

Anyway let me tell you the application that I have in mind for the ring generalization. I am looking for a beginner-friendly short project with elliptic curves now that mathlib knows about them and I am thinking maybe formalizing good reduction will be possible. Rough plan is:

Generalize the associativity to rings
Define base change for ellliptic curves.
Define good reduction (we have isomorphisms right?)
Prove that an elliptic curve over Q (do we have enough number fields and Dedekind domains story?) has good reduction at all but finitely many primes.
Does this look realistic and/or useful for something else?

Kevin Buzzard (Feb 02 2023 at 22:46):

1 doesn't make sense really. Let R be a commutative ring. What is your definition of an elliptic curve over R and what is your definition of the R-points on the curve?

Kevin Buzzard (Feb 02 2023 at 22:46):

3 and 4 look good though.

Stepan Nesterov (Feb 02 2023 at 22:46):

I only really care about R local to be honest

Kevin Buzzard (Feb 02 2023 at 22:47):

Yes we have number fields and Dedekind domains

Kevin Buzzard (Feb 02 2023 at 22:47):

The issue is that we only really have a definition of an elliptic curve over a field

Stepan Nesterov (Feb 02 2023 at 22:47):

So the definition is just the Weierstrass equation
And bad reduction will be potentially worked out using minimal Weierstrass equation, Silverman style

Kevin Buzzard (Feb 02 2023 at 22:48):

Are you happy to restrict to DVRs?

Kevin Buzzard (Feb 02 2023 at 22:48):

Alex Best has formalised Tate's algorithm

Stepan Nesterov (Feb 02 2023 at 22:50):

Yeah, sure, DVR’s are fine.

Kevin Buzzard (Feb 02 2023 at 22:50):

One issue is how much of the theory one really wants to develop for Weierstrass models. I had Diophantine applications in mind (David Ang is working on Mordell-Weil) but beyond some point we want to switch to the scheme definition

Stepan Nesterov (Feb 02 2023 at 22:52):

Where can I find dedekind domains and DVR’s in mathlib?

Kevin Buzzard (Feb 02 2023 at 22:53):

Right now we have the category of schemes, so we have isomorphisms there, and then we have this type of elliptic curves over a ring with trivial 12-torsion in the Picard group as a Weierstrass model but I think no concept of a morphism although it would certainly be possible to make it if it weren't there already

Kevin Buzzard (Feb 02 2023 at 22:53):

Do you know about the API docs? Just search them for Dedekind domain and discrete valuation ring

Kevin Buzzard (Feb 02 2023 at 22:53):

https://leanprover-community.github.io/mathlib_docs/

Kevin Buzzard (Feb 02 2023 at 22:55):

We have class groups of Dedekind domains, factorisation of ideals in extensions etc

Stepan Nesterov (Feb 02 2023 at 22:55):

Ok found them:)

Kevin Buzzard (Feb 02 2023 at 22:55):

I think we have less about DVRs

Stepan Nesterov (Feb 02 2023 at 22:58):

Is it a bad idea to try and use elliptic_curve.variable_change as a working definition for isomorphism of elliptic curves (over fields)?

Kevin Buzzard (Feb 02 2023 at 23:00):

docs#elliptic_curve.variable_change

Stepan Nesterov (Feb 02 2023 at 23:00):

The definition of good reduction I have in mind is:
“Let K be a field, R a sub ring of K such that R is a discrete valuation ring. An elliptic curve E over K has good reduction if there is an elliptic curve script E over R such that the base change of (script E) to K has a variable change to E and the discriminant of (script E) in invertible in R”

Kevin Buzzard (Feb 02 2023 at 23:02):

The "discriminant of \E is invertible in R" part is unnecessary because that's part of the definition of being an elliptic curve over R. Otherwise that looks like a good definition to me

Kevin Buzzard (Feb 02 2023 at 23:03):

It's a definition of "good reduction at the place corresponding to R", not "good reduction" (there could be many such Rs)

Stepan Nesterov (Feb 02 2023 at 23:04):

Ok cool!
I just found out that we have a Lean club here at Stanford which meets on Tuesdays so I can try and talk to people who can actually code in Lean and can help me with compliling :)
I’ll start writing something then.

Kevin Buzzard (Feb 02 2023 at 23:04):

We have the concept of a valuation subring of K, you might want to use that instead, in situations where you know the valuation is guaranteed to be discrete.

Kevin Buzzard (Feb 02 2023 at 23:05):

Oh that's good to hear about the lean club!

Stepan Nesterov (Feb 02 2023 at 23:06):

Yes, the next definition is “E over a number field has a good reduction at a place p, if it has good reduction corresponding to O_K,p”

David Ang (Feb 03 2023 at 00:20):

I think reduction types will be very very useful for a conceptual proof of Mordell-Weil, at least based on Silverman’s argument; back then this was something I skipped because it’s possible to take a giant shortcut to go straight to descent

David Ang (Feb 03 2023 at 00:48):

Stepan Nesterov said:

Anyway let me tell you the application that I have in mind for the ring generalization. I am looking for a beginner-friendly short project with elliptic curves now that mathlib knows about them and I am thinking maybe formalizing good reduction will be possible. Rough plan is:

Generalize the associativity to rings

Define base change for ellliptic curves.

Define good reduction (we have isomorphisms right?)

Prove that an elliptic curve over Q (do we have enough number fields and Dedekind domains story?) has good reduction at all but finitely many primes.
Does this look realistic and/or useful for something else?

2 has docs#weierstrass_curve.base_change

Antoine Chambert-Loir (Feb 03 2023 at 09:38):

Kevin Buzzard said:

We have the concept of a valuation subring of K, you might want to use that instead, in situations where you know the valuation is guaranteed to be discrete.

Why would you want to stick to discrete valuations?

Kevin Buzzard (Feb 03 2023 at 11:09):

I only put in that clause because I was scared that the definition of elliptic curve which we have ("long Weierstrass equation") was not mathematically correct for some random valuation ring (because I don't know anything about Picard groups of arbitrary valuation subrings of fields). I convinced myself that as long as Pic(R)[12]={1} the Weierstrass definition is OK and this condition holds for DVRs and fields.

Junyan Xu (Feb 03 2023 at 19:14):

Another issue is the definition of R-points for a general commutative ring R. Although people do talk about integral points on curves, if we want the R-points of E to form a group then we have to consider the scheme-theoretic points E(R), which are morphisms from Spec R to E. If E is defined by a Weierstrass equation (as is the case in mathlib), E(R) is a subset of P^2(R). If R=F is a field then Spec R is one point, so its image in P^n must lie in some affine chart, so our current definition weierstrass_curve.point (as the affine points in the (x,y) plane union a point at infinity) works, but in general P^n(R) classifies R-linear epimorphisms from R^{n+1} to an invertible R-module up to isomorphism, which doesn't admit a simple description using homogeneous coordinates (note that Eisenbud-Harris made an error here), and that's why docs#projectivization was only defined for division rings. When Pic(R)=0, all invertible modules are isomorphic to R, and by considering images of the free generators of R^{n+1} we get (n+1)-elements of R that together generates R (by surjectivity) up to simultaneous multiplication by a unit, and these give the familiar homogeneous coordinates. If R is local or a PID, all projective modules are free (finite case is easier), in particular the rank 1 (invertible) ones, so we do get coordinates. (When R is local surjectivity says at least one of the coordinates is a unit (which we may as well choose to be 1), and when R is a PID it says the GCD of the coordinates is 1.) I guess we should first develop these in mathlib before talking about E(R) over your favorite ring R. (Or maybe you can just work over the field of fractions?)

Junyan Xu (Feb 04 2023 at 07:15):

Some updates: Pic(R)=0 also holds for all UFDs (stacks#0BCH) and the argument here seems to show it holds more generally for all GCD domains. See this paper for a study of the group of points valued in finite rings, which has cryptographic applications. These rings are not necessarily domains, but since they're direct products of local rings, we still have Pic=0 and P^n(R) still admits the simple description via homogeneous coordinates.

In general, if R is a domain, K its field of fractions and X a separated scheme over R (in particular P^n_R), then X(R) → X(K) is injective, according to the argument here, because the generic point of R is dense in R. So any R-point can be represented as a K-point, though in the case of elliptic curves in Weierstrass form, it's not immediately clear X(R) is closed under addition in X(K). For surjectivity of P^n(R) → P^n(K), I think all nonzero ideals of R generated by (n+1) elements need to be invertible, which is true for valuation rings (where all f.g. ideals are principal, and surjectivity also follows from valuative criterion since P^n is proper) and for Dedekind domains. My conclusion is that for rings we care about it's probably good enough to work with P^n(K). (Maybe that has always been the plan?)

Kevin Buzzard (Feb 04 2023 at 07:42):

I guess the long term plan is to move to a scheme-theoretic definition of elliptic curve and use algebraic Riemann Roch to relate it to what we have. But for Diophantine applications there's no point going via this route

Antoine Chambert-Loir (Feb 04 2023 at 11:41):

Projective modules over a local ring are free (Kaplansky).

Stepan Nesterov (Feb 07 2023 at 02:55):

Is docs#weierstrass_curve.equation supposed to be the set of points of an elliptic curve? Because a function $R \to R \to Prop$ is definitionally the same as subset of $R^2$ or something?
But that can't be right, though, I need a set of all these plus a point at infinity. Is this not at docs#algebraic_geometry.elliptic_curve.weierstrass?

Junyan Xu (Feb 07 2023 at 03:21):

The type of points is defined here, it's not merged yet. (It's not quite the type of points when R isn't a field, in which case you can probably use W⟮fraction_ring R⟯.)

Junyan Xu (Feb 07 2023 at 03:28):

#18000 still needs some polishing, but we should be able to merge #17194 relatively quickly (which also contains the definition point), once we merge some latest changes into the branch. (Maybe @David Ang could do that soon; there are a few things I want to refactor, but that could be done in a later PR.)

Stepan Nesterov (Feb 07 2023 at 03:43):

I would like to define the reduction homomorphism in the following way:
Let $E$ be an elliptic curve over $K$ , let $v$ be a discrete valuation on $K$ , with its ring of integers $\mathcal{O}$ and residue field $k$ .

Define $E(\mathcal{O}) = \{ (X,Y,Z) \in \mathcal{O}^3: Y^2Z + a_1XYZ + a_3YZ^2 = X^3 + a_2X^2Z + a_4XZ + a_6Z^3, \{X,Y,Z \} \cap \mathcal{O}^{\times} \neq \varnothing \} / \mathcal{O}^{\times}$ .
Define a map $f : E(\mathcal{O}) \to E(K)$ by $(X,Y,Z) \mapsto some(X/Z, Y/Z)$ unless $Z=0$ , in which case $(X,Y,0) \mapsto zero$ .
Prove that $f : E(\mathcal{O}) \to E(K)$ is surjective (by clearing denominators) and injective (this is the most involved step and uses invertibility of the discriminant)
Define the reduction map $E(K) \to E(\mathcal{O}) \to E(k)$ as $f^{-1}$ followed by reduction.

Stepan Nesterov (Feb 07 2023 at 03:44):

Should I just make an auxilliary type for my $E(\mathcal{O})$ then? It's not mathematically correct, of course, but it works fine for dvrs.

Stepan Nesterov (Feb 07 2023 at 03:45):

Also ideas which would work less painfully in Lean are appreciated, of course :)

Johan Commelin (Feb 07 2023 at 03:53):

I think an aux type for $E(\mathcal{O})$ would indeed make sense

Junyan Xu (Feb 07 2023 at 05:12):

Sorry, this is a better link to the definition of point; the link to the PR automatically hides the point.lean file because it's too long.

1 & 2. We don't have homogenization API yet in mathlib (it's in the flt-regular project I think), so you probably have to define the polynomial by hand as a term of type O[X][X][X]. You can see here for a way to introduce notation in order to write O[X][Y][Z]. I'd suggest you define E(O) as a subset of P^2(O), for which you can mimic docs#projectivization and docs#projectivization_setoid; you'd then use docs#quotient.lift to descend the equation to a predicate on the quotient and to define the map from E(O) to E(K).

I don't see how injectivity uses the invertibility of the discriminant ... It's also pretty trivial, I think.

I suspect instead of working with the quotient type, it might be easier to work with a canonical representative for each equivalence class: if Z is a unit then set Z=1, otherwise if Y is a unit then set Y=1, and otherwise set X=1. This would also make the map from E(O) to E(k) easier to define; if Z=1 then it's some X Y and otherwise it's zero. (That doesn't seem pleasant to work with, unfortunately.)

Stepan Nesterov (Feb 07 2023 at 05:33):

Oh yes, never mind, weierstrass curves are always proper as we defined them, so of course $E(\mathcal{O}) \to E(K)$ is bijective.
The place where we actually need hypothesis on discriminant is actually in checking that $E(K) \to E(k)$ is a homomorphism, because otherwise something reduces to a singular point of $E_k$ and we can't even add it to anything. This might be tricky to check depending on how exactly the definition of addition works. Is it just by explicit addition formulas?

Junyan Xu (Feb 07 2023 at 05:49):

Addition is defined in affine coordinates in multiple steps in point.lean, with add_X, add_Y and slope as the key definitions: x₃ := W.add_X x₁ x₂ $ W.slope x₁ x₂ y₁ y₂and y₃ := W.add_Y x₁ x₂ y₁ $ W.slope x₁ x₂ y₁ y₂ as you can see here. Not sure how easy it is to show E(K)->E(k) is a homomorphism ...

David Ang (Feb 07 2023 at 15:58):

Junyan Xu said:

#18000 still needs some polishing, but we should be able to merge #17194 relatively quickly (which also contains the definition point), once we merge some latest changes into the branch. (Maybe David Ang could do that soon; there are a few things I want to refactor, but that could be done in a later PR.)

#17194 should be more-or-less ready now.

Junyan Xu (Feb 10 2023 at 08:53):

The proof that the reduction map is a homomorphism is apparently quite complicated (see also Silverman, Proposition VII.2.1, where the discriminant can be nonzero and we restrict to points that don't reduce to the singular point), so I tried and worked out a proof using ideal class groups, since we already have an injective homomorphism from the rational points into the ideal class group (isomorphic to the Picard group for domains). However, I have trouble making the argument work in residue characteristic two.

The Picard group is a contravariant functor from schemes to abelian groups, or a covariant functor from commutative rings, so we get homomorphisms between Picard groups of coordinate rings Pic(O[E]) → Pic(K[E]) and Pic(O[E]) → Pic(k[E]) given by base change of modules, and we can show the former is injective because O is a PID (moreover, it agrees with docs#ideal.map because O → K is a localization map). Therefore, if for each P = (x,y) in E(K), the ideal class of (X-x, Y-y) lies in the image of Pic(O[E]), then we can map the element of Pic(O[E]) to Pic(k[E]), and we just need to show the final result is the ideal corresponding to the reduction of P.

Let's write P=(x,y)=[x:y:1]=[u:v:w] with u,v,w in O and at least one of them a unit. If w is a unit, then x,y are in O and we just take the ideal (X-x, Y-y) of O[E]: its product with (X-x, Y+a1*x+a3+y) is equal to the ideal generated by the two partial derivatives of the Weierstrass polynomial, which contains the discriminant, so it's the unit ideal if the discriminant is a unit. And since (X-x, Y-y) is a free O-module, it's easy to see the canonical map (X-x,Y-y) ⊗O k → (X-x~,Y-y~) is an isomorphism, where (x~,y~) is the reduction of (x,y), i.e. the image of (X-x, Y-y) in Pic(k[E]) indeed corresponds to the point (x~,y~).

If w is not a unit, then u must also not be a unit, since in the homogeneous equation, every term except u^3 contains w as a factor. Therefore v must be a unit, so P = [u:v:w] reduces to [0:1:0], the point at infinity. If w=0, then u=0 as well, so P itself is the point at infinity, corresponding to the principal ideal class, so we can just lift it to the unit ideal of O[E], which reduces to the unit ideal of k[E].

Otherwise, we have val(u) > 0, val(v) = 0, val(w) > 0, and val(y) = val(v/w) < 0. Consider the inhomogeneous equation y^2 + a1 * x * y + a3 * y = x^3 + a2 * x^2 + a4 * x + a6. If val(x) ≤ val(y) < 0, then val(RHS) = 3val(x) < 2val(x) ≤ min{2val(y), val(x)+val(y), val(y)} ≤ val(LHS), contradiction. Therefore val(x) > val(y), which implies that val(LHS) = 2val(y) and val(RHS) = 3val(x) are both < 0 and equal.

We have the freedom to choose a nonzero factor t when lifting (X-x, Y-y) to an (fractional) ideal (t(X-x), Y-y) of O[E]. We can check its product with (t(X-x), Y+a1 * x+a3+y) is the principal ideal t(X-x) multiplied by (t(X-x), Y-y, Y+a1 * x+a3+y, (X^2 + (x + a2)X + (x^2 + a2 * x + a4) − a1 * Y)/t). This ideal clearly contains 2y+a1 * x+a3, whose valuation is equal to val(y) if 2 is a unit. So this ideal is generated by the single element y if it divides all coefficients: it suffices to have val(y) ≤ val(tx) and val(y) ≤ val(x^2/t). We can take t=y/x to make both equalities hold, because 2val(y)=3val(x) (if 2 is not a unit, 2val(2y+a1 * x+a3)>3val(x), so this doesn't work).

Now we've lifted the invertible ideal (X-x, Y-y) to the invertible ideal ((y/x)(X-x), Y-y) = y(X/x-1, Y/y-1), which is in the same class as I = (bX-1, cY-1) = (bX-1, bX-cY), where b=1/x, c=1/y. Tensoring with k=O/m we get I/mI. Since b and c are in the maximal ideal m, b=b * bX and c=c * bX modulo mI. Therefore bX-cY = bX * (bX-cY) lies in mI, so I/mI is generated by (bX-1) only, hence principal, corresponding to the point at infinity, which P reduces to.

If someone could figure out how to treat the case char k = 2, we would have a complete proof.

Kevin Buzzard (Feb 10 2023 at 09:46):

I haven't thought about this too hard but can you explain "We have the freedom to choose a nonzero factor t when lifting"? y isn't in O, I'm a bit confused about what (t(X-x),Y-y) means.

Junyan Xu (Feb 10 2023 at 14:39):

I wrote "(fractional) ideal", which is a kind of O[E]-submodule of Frac(O[E])=Frac(K[E]), so it makes sense to write generators in K[E]; of course we could replace (t(X-x), Y-y) by an integral ideal in the same class: just multiply it by some common denominator of tx and y, and that's what I did when I wrote ((y/x)(X-x), Y-y) = y(X/x-1, Y/y-1) and replaced it with the integral ideal (X/x-1, Y/y-1) in the same class.

Kevin Buzzard (Feb 10 2023 at 20:05):

I'm just missing something really stupid. Why isn't it (X - x, Y - y)? Where is this t coming from?

Junyan Xu (Feb 10 2023 at 22:51):

You can take (X-x, Y-y) (and it's what I took when w is a unit), but when w isn't a unit, I think the resulting fractional ideal isn't invertible. We want to extend the invertible sheaf / projective module (X-x, Y-y) on Spec K[E] to a projective module on Spec O[E], but not every extension works; actually we have the freedom to choose a,b,c,d in GL(2,K) and use (a(X-x)+b(Y-y), c(X-x)+d(Y-y)) which all boils down to (X-x, Y-y) when you tensor with K, but I think most choices won't give you projective modules.

I think the most natural choice is Hom_{O[E]}((X-x, Y+a1*x+a3+y), O[E]) (sort of double dual) and it should work if anything works at all, but I need to work out the details. (I think it amounts to computing a colon ideal, or the intersection of two principal ideals.) Another approach is to decompose K[E]^2 into a direct sum of (X-x, Y-y) with another summand, and see how we can lift it to O[E], or maybe try to lift the idempotent matrix representing the projection to a summand.

Another thought is that maybe it suffices to come up with a lift that becomes projective after tensoring with k (then it should be free, isomorphic to k[E] since it's supposed to be principal); since tensoring with K already produces an invertible module (X-x, Y-y), this probably implies the lift is itself projective, or else we might come up with a substitute for Pic(O[E]) that includes more than invertible modules.

Junyan Xu (Feb 11 2023 at 05:07):

Let me add if the discriminant is a unit in O then Spec O[E] should be regular, and so Hom_{O[E]}((X-x, Y+a1*x+a3+y), O[E]) should indeed be projective, according to stacks#0AV3 (dual modules over a Noetherian domain is reflexive), this MO answer (reflexive implies projective if global dimension <=2) and stacks#00OE (regular of Krull dimension 2 implies global dimension 2).

We don't really need to formalize these advanced results for our purpose, as it suffices to come up with the correct candidate for the lift and verify it's indeed invertible. But it would be nicer if we could prove the statement on Silverman which allows singularities, saying that the points of E(K) that reduces to nonsingular points of E(k) form a subgroup, and the reduction map is still a homomorphism.

Kevin Buzzard (Feb 11 2023 at 10:41):

It's important to allow singularities if you want to use these techniques to go on and formalise the formal group; in that approach you keep scaling the a_i by pi^Ni and then reduce, in order to understand the solutions to the equation near the origin

Junyan Xu (Feb 12 2023 at 20:29):

I've now computed an example (in mixed characteristic 2) of the dual module in Macaulay2, and verified it's indeed invertible. Hopefully it's not too hard to work out the general form of the lift from here. It might be necessary to compute an example in equal characteristic 2 to figure out the general form there.

Macaulay2, version 1.21
with packages: ConwayPolynomials, Elimination, IntegralClosure, InverseSystems, Isomorphism, LLLBases, MinimalPrimes, OnlineLookup, PrimaryDecomposition, ReesAlgebra, Saturation, TangentCone

i1 : R = ZZ[x,y]/(y^2-x^3)
o1 = R
o1 : QuotientRing
// This is supposed to be O[W], where O is a DVR with 2 non-invertible
// but for convenience we use Z rather than its localization at (2)

i4 : intersect(ideal(4*x-1), ideal(8*y-1))
// (4x-1,8y-1) corresponds to the point (1/4,1/8) on y^2=x^3
// we want to compute the dual of (4x-1,8y-1), which is isomorphic to the intersection of
// the fractional ideals generated by the inverse of (4x-1) and by the inverse of (8y-1),
// which is isomorphic to the intersection of the integral ideals (8y-1) and (4x-1).

o4 = ideal(64y2−1,8xy+16y2−x−2y)
// factorizes as (8y-1)*(8y+1,x+2y)
// so (8y+1,x+2y) is the candidate for an invertible lift of (x-1/4,y+1/8)
// it's clearly a lift: x-1/4 = -1/4*(8y+1)+(x+2y), y+1/8 = 1/8*(8y+1)
o4 : Ideal of R

i5 : ideal(8*y+1,x+2*y)*ideal(8*y-1,x-2*y)
// Take the product of candidate lift of (x-1/4,y+1/8) with lift of (x-1/4,y-1/8)
o5 = ideal(64y2−1,8xy−16y2+x−2y,8xy+16y2−x−2y,x2−4y2)
o5 : Ideal of R

i7 : mingens I
// simplify the ideal
o7 = (4x−1,32y2+2x−1,x2−4y2)
// This ideal is actually principal, generated by (4x-1): x^2-4y^2 = x^2-4x^3 = -x(4x-1),
// and 32y^2+2x-1 = 8x^2+2x-1 - 8(x^2-4y^2), 8x^2+2x-1 = (4x-1)(2x+1),
// but Macaulay2 isn't able to simplify it to a single generator, or I haven't found the way.
// Therefore, both the lifts (8*y+1,x+2*y) and (8*y-1,x-2*y) are invertible.

Junyan Xu (Feb 12 2023 at 21:00):

Actually I first tried Singular (on CoCalc) but didn't have as much luck:

                     SINGULAR                                 /  Development
 A Computer Algebra System for Polynomial Computations       /   version 4.1.1
                                                           0<
 by: W. Decker, G.-M. Greuel, G. Pfister, H. Schoenemann     \   Feb 2018
FB Mathematik der Universitaet, D-67653 Kaiserslautern        \  Debian 1:4.1.1-p2+ds-4build2
> ring R = integer,(x,y), dp;
> ideal w = y2 - x3;
> qring r = std(w);
> ideal i = 4x-1;
> ideal j = 8y-1;
> intersect(i,j);
_[1]=32xy-4x-8y+1
_[2]=8x2y+16xy2-x2-2xy
// factorizes as (8y-1)*(4x-1,x(x+2y))
// multiply the second factor with its conjugate and simplify:
> ideal a = 4x-1, x*(x+2y);
> ideal b = 4x-1, x*(x-2y);
> simplify(a*b,63);
_[1]=16x2-8x+1 = (4x-1)^2
_[2]=4x3-8x2y-x2+2xy = x(4x-1)(x-2y)
_[3]=x4-4x2y2 = x^2(x+2y)(x-2y)
// doesn't include the generator (4x-1)

// but Macaulay2 is able to find (4x-1) also in this ideal:
i9 : J = ideal(4*x-1, x*(x+2*y)) * ideal(4*x-1, x*(x-2*y))
o9 = ideal(16x2−8x+1,−8x2y−x2+2xy+4y2,8x2y−x2−2xy+4y2,−4x2y2+xy2)
o9 : Ideal of R

i10 : mingens J
o10 = (4x−1,16y2−x,x2−4y2)

Junyan Xu (Feb 12 2023 at 21:13):

~~I'm guessing (X/x-1, X-xY/y) may be invertible in any characteristic (in the w nonunit case)~~

Junyan Xu (Feb 21 2023 at 08:26):

Junyan Xu said:

We have the freedom to choose a nonzero factor t when lifting (X-x, Y-y) to an (fractional) ideal (t(X-x), Y-y) of O[E]. We can check its product with (t(X-x), Y+a1 * x+a3+y) is the principal ideal (X-x) multiplied by (t^2(X-x), t(Y-y), t(Y+a1 * x+a3+y), X^2 + (x + a2)X + (x^2 + a2 * x + a4) − a1 * Y).

I think I worked this out. It turns out t=y/x always works, even in residue characteristic 2. Since val(t^2)=val(x), we actually have xX=x^2 and X^2=x^2 modulo t^2(X-x), so the last generator can be replaced by 3x^2+2a2 * x+a4-a1 * y, whose valuation is 2val(x) if 3 is invertible, and it's easy to check this is the lowest valuation among all coefficients of all generators. The case where 2 is invertible was handled before (where we can get val(t(2 * y + a1 * x + a3)) = 2val(x)). Since one of 2 and 3 must be invertible (in any local ring, in particular DVR), we conclude this ideal is always generated by x^2.

Junyan Xu (Feb 21 2023 at 08:29):

Fun exercise: for a long time I thought (4x-1,8y-1) and (4x-1,x-2y) are different ideals, but in fact they're the same ...

Junyan Xu (Feb 21 2023 at 08:35):

I'll try to work out a skeleton of a formalization in the following days; the first step would be to define Picard group of a comm_ring, which I plan to call Pic_Spec, and will be defined to be ~~projective modules of rank 1 (free of rank 1 when localized at any prime ideal)~~ (update: let's use modules with a tensor-inverse) up to isomorphism (which lives in one universe higher), and then show that for an integral domain this is isomorphic to docs#class_group, and the functoriality of Pic_Spec.

David Ang (Feb 22 2023 at 01:35):

David Michael Roberts said:

It would be nice if this proof was summarised for instance in a blog post, or I guess an article on the arXiv would be even more stable.

https://arxiv.org/abs/2302.10640

Junyan Xu (Feb 22 2023 at 03:46):

Thanks to David for writing the paper (I mostly only provided suggestions based on his draft) and taking care of submission (to ITP 2023 and arXiv)!

It's not a purely mathematical exposition as some people might hope, but also includes formalization details. I might consider writing a mathematical exposition of the associativity proof together with the reduction homomorphism proof (or any other applications we find in the meantime), to be titled (tentatively) "Ideal treatment of elliptic curves".

Oliver Nash (Feb 22 2023 at 10:16):

It's great to see this paper. Congratulations @David Ang and @Junyan Xu ! I for one would be very interested to read the hypothetical companion "Ideal treatment of elliptic curves" paper.

Junyan Xu (Mar 07 2023 at 02:52):

Sorry for the belated update. I discovered two gaps when I started to formalize the arguments, but fortunately they can be fixed while keeping the argument elementary. The same constructions are used to fill both gaps, so the proof doesn't get much longer, and still seems simpler than Silverman's or Knapp's proof (on paper or to formalize).

Recall our goal is to show that if W is a Weierstrass curve over a DVR R with field of fractions K and residue field k=R/p, then the points in W(K) that reduce to nonsingular points in W(k) form a subgroup of the group of nonsingular points in W(K), and the reduction map is a homomorphism when restricted to this subgroup. (In particular, if we take R to be the localization of Z at a prime p, then we get the reduction homomorphisms from W(Q) to W(Z/pZ). More generally, for any integral domain R that is regular at a height 1 prime p, we get a reduction homomorphism W(Frac(R)) → W(Frac(R/p)).)

The first gap is the injectivity of Pic(R[W]) → Pic(K[W]) induced by the inclusion of coordinate rings. My original argument in mind is too simple to work, beause it doesn't even use invertibility of the R[W]-module. It's true according to a theorem of Raynaud in EGA IV addenda because Pic(R)=0. Fortunately, it is possible to extract an elementary argument in terms of rings and fractional ideals from the scheme-theoretic proof involving Cartier divisors.

The key theorem EGA IV 21.1.8 translates to the following: to check two fractional ideals are equal, it suffices to consider their localizations at the primes whose local rings have depth 1. In our situation, these are either the generic point of the special fiber Spec R[W], or contained in the generic fiber Spec K[W], so in particular we have that R[W] is the intersection of K[W] with the localization R[W]_p at the prime ideal p generated by the uniformizer of R (in which all polynomials with some coefficient not divisible by p are inverted). (p is prime because the quotient (R/p)[W] is an integral domain, corresponding to the condition that the fibers are integral in Raynaud's theorem.) This can be checked directly, and it's easy to see that R[W]_p is also a DVR with the same uniformizer as R, using the fact that R[W] is a free module over R, which implies the faithfully flat condition in Raynaud's theorem.

Now, for an invertible fractional ideal I of R[W] such that KI is a principal K[W]-module (i.e. its image in Pic(K[W]) is trivial), we may divide I by the generator and assume KI = K[W]. On the other hand, R[W]_p I must be generated by a (integral) power of the uniformizer, and we may divide I by the power and assume R[W]_p I is the unit ideal. The power is a nonzero constant and hence a unit in K[W], so we still have KI = K[W]; K[W] and R[W]_p are "orthogonal" in some sense. Therefore, I is contained in both KI = K[W] and R[W]_p I = R[W]_p and therefore contained in R[W]. On the other hand, I⁻¹ = R[W]/I ⊆ (K[W]/I) ∩ (R[W]_p/I) = (K[W]/KI) ∩ (R[W]_p/R[W]_p I) = KI⁻¹ ∩ (R[W]_p I)⁻¹ = K[W] ∩ R[W]_p = R[W], where M/N here means docs#submodule.has_div with R := R[W] and A := Frac(K[W]), so I = I⁻¹⁻¹ ⊇ R[W] as well. This closes the first gap.

As for the second gap, recall my earlier argument showed that if two points of W(K) both reduce to nonsingular points in W(k), AND if their sum also reduces to a nonsingular point, then the sum of the reductions equals the reduction of the sum. But in order for the reduction-nonsingular points to form a group, we need to show the second condition is unnecessary, i.e. the sum of two reduction-nonsingular points is always reduction-nonsingular. (If W/k is an elliptic curve or more generally if W(k) contains no singular point, then this is vacuously true.) This is a more general version of docs#weierstrass_curve.nonsingular_add.

The idea is to show that if (x,y) is nonsingular but reduction-singular, then the line bundle (X-x,Y-y) on Spec K[W] does not extend to a line bundle on Spec R[W], i.e. it's not in the image of Pic(R[W]) → Pic(K[W]). Once we have this, since we already constructed such an extension when (x,y) is reduction-nonsingular, if (x,y) is the sum of two reduction-nonsingular points, then (X-x,Y-y) extends to the tensor product of the extensions for those two points, so (x,y) can't be reduction-singular.

My proof of non-extendability proceeds by first realizing the hypothetical extension as an invertible fractional ideal I of R[W] using class_group R[W] ≃* Pic_Spec R[W], and then modify I like before so that KI is the K[W]-ideal (X-x,Y-y) and R[W]_p I = R[W]_p, while keeping I invertible. I is contained in the intersection of (X-x,Y-y) and R[W]_p, which is contained in K[W] ∩ R[W]_p = R[W]. But an element of K[W] lies in the K[W]-ideal (X-x,Y-y) iff it evaluates to 0 at (x,y), and the same is true for an element of R[W] to lie in the R[W]-ideal (X-x,Y-y). Therefore the intersection is the R[W]-ideal (X-x,Y-y) and it contains I. On the other hand, I⁻¹ ⊆ KI⁻¹ ∩ (R[W]_p I)⁻¹ = (X-x)⁻¹(X-x,Y-y') ∩ R[W]_p = (X-x)⁻¹((X-x,Y-y') ∩ R[W]_p), which equals the R[W]-submodule (X-x)⁻¹(X-x,Y-y'), where (x,y') is the conjugate of (x,y). Therefore, II⁻¹ is contained in (X-x)⁻¹(X-x,Y-y)(X-x,Y-y') = (X-x,Y-y,W_X(x,y),W_Y(x,y)) [*], but this can't be the unit ideal in R[W], because neither of the derivatives W_X(x,y) and W_Y(x,y) can be a unit in R, because the reduction of (x,y) to k=R/p is singular. This contradicts the invertibility of I.

[*] This computation is used in XY_ideal_neg_mul, and will be used to show (X-x,Y-y) is an invertible R[W]-module if (x,y) is reduction-nonsingular. A more general version with X-x replaced by t(X-x) was used to show ((y/x)(X-x),Y-y) is invertible when (x,y) reduces to O.

I am posting these in the hope that people might come up with some simplifications.

Junyan Xu (Mar 07 2023 at 03:05):

I came up with these proofs last Monday but didn't have much time to continue the formalization. I defined Pic_Spec last week, and plan to add facts about base change of modules and projective modules next (to un-sorry the Pic_Spec functor and the isomorphism with class_group), and then a robust API around the rings R[W], K[W] and R[W]_p and submodules inside Frac(K[W]).

Those needed facts about tensor product, base change of modules and projective modules are:

tensoring with an arbitrary module is right exact (two statements: surjectivity, and kernel = image), to show (R/I) ⊗[R] M ≃ₗ[R/I] M/IM.
tensoring with a projective module preserves injectivity (i.e. projective modules are flat), used to embed invertible R-modules into K, and to show the multiplication in Pic_Spec agrees with that in class_group.
invertible modules and invertible fractional ideals are projective (working on this)
A ⊗[R] R ≃ₗ[A] A and A ⊗[R] (M ⊗[R] N) ≃ₗ[A] (A ⊗[R] M) ⊗[A] (A ⊗[R] N) for [algebra R A], used to show Pic_Spec is a functor and K ⊗[R] M is 1-dimensional for an invertible R-module M. I'm using docs#category_theory.Module.extend_scalars by @Jujian Zhang so I'm wondering if you already have these somewhere.

Kevin Buzzard (Mar 07 2023 at 07:14):

In Cassels' book the fact that the sum of two things-with-nonsingular-reduction has nonsingular reduction is: the line through them reduces to the line through the reductions, and a line through two nonsingular points hits the cubic at a third nonsingular point.

Kevin Buzzard (Mar 07 2023 at 07:17):

PS seeing posts like this and other potential nontrivial mathematics which we're dealing with now in algebraic geometry was basically beyond my (and I think everyone in the community's) wildest dreams five years ago.

Junyan Xu (Mar 07 2023 at 08:10):

In Cassels' book the fact that the sum of two things-with-nonsingular-reduction has nonsingular reduction is: the line through them reduces to the line through the reductions, and a line through two nonsingular points hits the cubic at a third nonsingular point.

I think this is more or less what Knapp/Langlands and Silverman do as well. The problem is that you need to count multiplicities when the two points agree upstairs or downstairs. Silverman does a case bash and relegate many cases into exercises. Knapp/Langlands apparently avoids that but the proof also doesn't look trivial to formalize.

Bolton Bailey (Apr 09 2023 at 12:53):

Hello elliptic curve people. I have been going over my project thinking of ways I could extend it, and it occurred to me I've seen a lot of discussion about elliptic curves here. While the cryptography that is formalized there is based on elliptic curve pairings, I avoided formalizing the actual elliptic curve operations that would give you the ability to run the protocols. I have a few questions about what it would take to make that happen.

It seems mathlib doesn't have implementations of the Weil or Tate pairings. How much code would be needed to implement those and prove the bilinearity property? Would it be a big undertaking on the scale of your all's efforts to prove the group law, or would it be easier somehow? On computability for elliptic curves - is it actually possible to execute code that computes the elliptic curve, or are there issues with that like there are for polynomials? Has any of this code been ported to lean 4 yet?

Bolton Bailey (Apr 09 2023 at 12:54):

Sorry if this is the wrong place the ask this, I know elliptic curve pairings obviously are based on elliptic curves, but I don't know how familiar or interesting they are to people outside cryptography.

Kevin Buzzard (Apr 09 2023 at 12:58):

I would imagine that what we have for elliptic curves is optimised for proving rather than running. For example you probably don't want to use the full Weierstrass form if you actually want to do computations. On the other hand you can prove that every elliptic curve over every field can be put into this form, which is something we want.

Reid Barton (Apr 09 2023 at 12:59):

Computability is never a problem, you write the code you actually want to run and then prove it is equal to the mathlib definition. (You probably want to do this even if the mathlib definition is already computable.)

Kevin Buzzard (Apr 09 2023 at 13:03):

The problem is that already the mathlib definition of elliptic curve is suboptimal unless you actually care about characteristic <= 3

Yaël Dillies (Apr 09 2023 at 13:06):

Surely it's just a matter of writing a norm_num extension for the cases you care about?

Kevin Buzzard (Apr 09 2023 at 14:09):

The issue is that the very model is wrong for computation.

Reid Barton (Apr 09 2023 at 14:25):

Are you talking about the specification of the curve itself a1 ... a6?

Kevin Buzzard (Apr 09 2023 at 14:32):

Right. If you want to do fast computation then you use the Edwards curve model.

David Ang (Apr 09 2023 at 14:56):

There are implementations of pairing-based cryptography for elliptic curves over fields of characteristic two, and some of them do use the full Weierstrass form minus a few coefficients

Reid Barton (Apr 09 2023 at 15:09):

In this case, it's even more irrelevant whether the stuff in mathlib is computable.

David Loeffler (Apr 09 2023 at 15:28):

Just to pick up on a part of the question which didn't get (explicitly) answered:

Bolton Bailey: It seems mathlib doesn't have implementations of the Weil or Tate pairings. [...] Sorry if this is the wrong place the ask this, I know elliptic curve pairings obviously are based on elliptic curves, but I don't know how familiar or interesting they are to people outside cryptography.

The Weil pairing is definitely totally familiar and essential all over the theory of elliptic curves, it's not at all specific to cryptography. The Tate pairing (as understood in crypto) is perhaps a bit more specialised.

Bolton Bailey (Apr 09 2023 at 15:30):

Ok cool, well then maybe implementing it might have some more broad appeal. I'm not too concerned about making a super performant cryptographic implementation, I more just think it would be nice to round out my project and see if it would be useful to any of the people I know of who are interested in formal methods for cryptographic proof systems.

Jz Pan (Jul 11 2023 at 19:54):

Sorry to bump this message again. I see that the elliptic curve codes are already ported to mathlib4.
If possible I'd like to contribute something starting from it.

Jz Pan (Jul 11 2023 at 19:55):

More specifically, I'd like to find out if there are useful things in my previous attempts https://github.com/acmepjz/my-lean-test/blob/master/gtm106/ which worth to add to mathlib.

Jz Pan (Jul 12 2023 at 00:03):

Currently my work has the following which are not present in mathlib4 elliptic curves:

Weierstrass curve (over a field) which have a node or cusp, and the criteria of them (namely disc=0 and c4!=0, resp. c4=0)
change of variable defined in a structure (which allows to define the type of isomorphisms between two Weierstrass curves, of type of isomorphisms of a Weierstrass curve)
normal forms of Weierstrass curves of char not 2 or 3 (resp. char 3, and char 2); normal form of singular Weierstrass curve
for each j there is an elliptic curve with j-invariant j; elliptic curves with same j are isomorphic over algebraic closure

Unfortunately, some of them require lengthy calculations in char 2 or 3 or not-perfect case.

Scott Morrison (Jul 12 2023 at 00:18):

It sounds like you have a lot of material, and that it covers material Mathlib certainly wants to have. It also sounds like it would be many separate PRs (i.e. at least one PR per phrase in your bullet points above), so I would advise just starting with the simplest and easier chunk that would work as a standalone PR.

Do be aware that in starting the review process of PRs, very likely reviewers will ask for some design changes, and this may constitute a fair bit of work to adapt all your later material for this. Obviously do explain if proposed changes would cause problems downstream, but also try to take the attitude that big refactors are good, and that the review process, while sometimes a lot of work, does work!

Scott Morrison (Jul 12 2023 at 00:19):

@Jz Pan, I've just sent you a github invite, which should give you the ability to create a new branch on the mathlib4 repository.

David Ang (Jul 12 2023 at 00:26):

@Jz Pan I have had a quick glance at your old repository.

Weierstrass curve (over a field) which have a node or cusp, and the criteria of them (namely disc=0 and c4!=0, resp. c4=0)

I think these definitions would be nice in Weierstrass.lean. What eventual theorem do you have in mind?

change of variable defined in a structure (which allows to define the type of isomorphisms between two Weierstrass curves, of type of isomorphisms of a Weierstrass curve)

We already have this as WeierstrassCurve.variableChange, etc. unless I'm missing something.

normal forms of Weierstrass curves of char not 2 or 3 (resp. char 3, and char 2); normal form of singular Weierstrass curve

These are great! I have some normal forms for char 2 in my old mathlib3 branch but yours is a lot more complete.

for each j there is an elliptic curve with j-invariant j; elliptic curves with same j are isomorphic over algebraic closure

You should PR the former! I think benjeffers (from the Xena Discord) is working on the latter and I'll let them know.

Something I noticed is that your repo seems to have a good projective API, whereas mine only defines affine points. Maybe you could also add these (perhaps using the new Projectivization API), and a good way to transfer between the two? In what I'm working right now I need "Jacobian coordinates" (as Sutherland calls it) where the curve lives in weighted projective space (x, y, z) -> (l^2x, l^3y, lz), and I don't know a clean way to do this just yet, so I'm hoping your usual projective version would give me a better idea.

AFAIK other ongoing projects include: @Junyan Xu is defining the reduction map over a DVR, and I'm defining division polynomials which requires Jacobian coordinates to prove anything useful.

Jz Pan (Jul 12 2023 at 01:01):

Scott Morrison said:

It sounds like you have a lot of material, and that it covers material Mathlib certainly wants to have. It also sounds like it would be many separate PRs (i.e. at least one PR per phrase in your bullet points above), so I would advise just starting with the simplest and easier chunk that would work as a standalone PR.

Sure, I'll reorganize and upgraded the code to lean4, and divided them into small chunks.

Do be aware that in starting the review process of PRs, very likely reviewers will ask for some design changes, and this may constitute a fair bit of work to adapt all your later material for this. Obviously do explain if proposed changes would cause problems downstream, but also try to take the attitude that big refactors are good, and that the review process, while sometimes a lot of work, does work!

That's OK. In fact my code are _just works_ without much design consideration; I'm open to discussions and suggestions.

I've just sent you a github invite

Thank you very much, I'll look at them when I have spare time.

Jz Pan (Jul 12 2023 at 01:21):

David Ang said:

Weierstrass curve (over a field) which have a node or cusp, and the criteria of them (namely disc=0 and c4!=0, resp. c4=0)

I think these definitions would be nice in Weierstrass.lean. What eventual theorem do you have in mind?

Currently I have the following theorems (only for Weierstrass curve over fields, not sure if they are meaningful over a ring):

if disc is not zero then all points on the curve are non-singular (which already in mathlib4)
conversely, if all points are non-singular, under some field characteristic and perfectness assumptions, disc is not zero
there are at most one singular point on a Weierstrass curve
if disc=0 and c4 not zero (resp. c4=0 plus some field characteristic and perfectness assumptions), it has a rational node (resp. cusp)
if it has a rational node (resp. cusp), then disc=0 and c4 not zero (resp. c4=0)

change of variable defined in a structure (which allows to define the type of isomorphisms between two Weierstrass curves, of type of isomorphisms of a Weierstrass curve)

We already have this as WeierstrassCurve.variableChange, etc. unless I'm missing something.

I see. In my opinion, is it better if we define a structure which consists of u,r,s and t? Since variable change can be composed, inverted, etc. (which are useful to define the automorphism group of an elliptic curve), which is cumbersome to define if you don't define it as a structure, but mention u,r,s,t separately everywhere.

normal forms of Weierstrass curves of char not 2 or 3 (resp. char 3, and char 2); normal form of singular Weierstrass curve

These are great! I have some normal forms for char 2 in my old mathlib3 branch but yours is a lot more complete.

for each j there is an elliptic curve with j-invariant j; elliptic curves with same j are isomorphic over algebraic closure

You should PR the former! I think benjeffers (from the Xena Discord) is working on the latter and I'll let them know.

These two are the parts which use polynomial calculus heavily (especially the latter).

Something I noticed is that your repo seems to have a good projective API, whereas mine only defines affine points. Maybe you could also add these (perhaps using the new Projectivization API), and a good way to transfer between the two? In what I'm working right now I need "Jacobian coordinates" (as Sutherland calls it) where the curve lives in weighted projective space (x, y, z) -> (l^2x, l^3y, lz), and I don't know a clean way to do this just yet, so I'm hoping your usual projective version would give me a better idea.

I have defined projective points and the conversions between them and affine points. Current mathlib4 elliptic curve only has affine points. Nevertheless, it looks like affine points API is enough (I don't remember if I actually use projective points to prove anything). Maybe I'll experiment them later.

David Ang (Jul 12 2023 at 11:08):

I don’t mind changing variableChange into a structure. You can make a PR and we’ll see how it goes.

Jz Pan (Jul 12 2023 at 22:37):

I've just submitted the PR which changes variableChange into a structure: https://github.com/leanprover-community/mathlib4/pull/5841 . By the way, I created the branch acmepjz_ec_variable_change which turns out to be wrong. Could someone help me to delete that branch? Thanks.

Junyan Xu (Jul 12 2023 at 22:57):

You should be able to find your branch at https://github.com/leanprover-community/mathlib4/branches and delete it.

Jz Pan (Jul 16 2023 at 10:30):

David Ang said:

for each j there is an elliptic curve with j-invariant j; elliptic curves with same j are isomorphic over algebraic closure

You should PR the former! I think benjeffers (from the Xena Discord) is working on the latter and I'll let them know.

I have submitted a draft PR https://github.com/leanprover-community/mathlib4/pull/5935 for the first part.

Jz Pan (Jul 16 2023 at 10:31):

This is mostly done, but I meet a problem in the final theorem, and need some help. I'm trying to make an MWE, but due to if's in the code I failed to make it shorter:

import Mathlib.AlgebraicGeometry.EllipticCurve.Weierstrass

open scoped Classical

universe u

variable (R : Type u) [CommRing R] (F : Type u) [Field F]

namespace WeierstrassCurve

/-- The Weierstrass curve $Y^2 + Y = X^3$. -/
def ofJInvariant0 : WeierstrassCurve R :=
  ⟨0, 0, 1, 0, 0⟩

lemma ofJInvariant0_c₄ : (ofJInvariant0 R).c₄ = 0 := by
  simp only [ofJInvariant0, c₄, b₂, b₄]
  ring1

lemma ofJInvariant0_Δ : (ofJInvariant0 R).Δ = -27 := by
  simp only [ofJInvariant0, Δ, b₂, b₄, b₆, b₈]
  ring1

/-- The Weierstrass curve $Y^2 = X^3 + X$. -/
def ofJInvariant1728 : WeierstrassCurve R :=
  ⟨0, 0, 0, 1, 0⟩

lemma ofJInvariant1728_c₄ : (ofJInvariant1728 R).c₄ = -48 := by
  simp only [ofJInvariant1728, c₄, b₂, b₄]
  ring1

lemma ofJInvariant1728_Δ : (ofJInvariant1728 R).Δ = -64 := by
  simp only [ofJInvariant1728, Δ, b₂, b₄, b₆, b₈]
  ring1

end WeierstrassCurve

namespace EllipticCurve

/-- The elliptic curve $Y^2 + Y = X^3$ when $3$ is invertible. -/
def ofJInvariant0 [Invertible (3 : R)] : EllipticCurve R :=
  ⟨
    WeierstrassCurve.ofJInvariant0 R,
    ⟨-27, -(⅟3) ^ 3,
      by
        rw [calc (27 : R) = (3 : R) ^ 3 := by norm_num]
        simp only [mul_neg, neg_mul, neg_neg]
        exact pow_mul_pow_eq_one 3 (mul_invOf_self (3 : R))
      , by
        rw [calc (27 : R) = (3 : R) ^ 3 := by norm_num]
        simp only [mul_neg, neg_mul, neg_neg]
        exact pow_mul_pow_eq_one 3 (invOf_mul_self (3 : R))
    ⟩,
    by rw [WeierstrassCurve.ofJInvariant0_Δ R]
  ⟩

lemma ofJInvariant0_j [Invertible (3 : R)] : (ofJInvariant0 R).j = 0 := by
  simp only [j, ofJInvariant0, WeierstrassCurve.ofJInvariant0_c₄]
  ring1

/-- The elliptic curve $Y^2 = X^3 + X$ when $2$ is invertible. -/
def ofJInvariant1728 [Invertible (2 : R)] : EllipticCurve R :=
  ⟨
    WeierstrassCurve.ofJInvariant1728 R,
    ⟨-64, -(⅟2) ^ 6,
      by
        rw [calc (64 : R) = (2 : R) ^ 6 := by norm_num]
        simp only [mul_neg, neg_mul, neg_neg]
        exact pow_mul_pow_eq_one 6 (mul_invOf_self (2 : R))
      , by
        rw [calc (64 : R) = (2 : R) ^ 6 := by norm_num]
        simp only [mul_neg, neg_mul, neg_neg]
        exact pow_mul_pow_eq_one 6 (invOf_mul_self (2 : R))
    ⟩,
    by rw [WeierstrassCurve.ofJInvariant1728_Δ R]
  ⟩

lemma ofJInvariant1728_j [Invertible (2 : R)] : (ofJInvariant1728 R).j = 1728 := by
  simp only [j, ofJInvariant1728, WeierstrassCurve.ofJInvariant1728_c₄]
  field_simp
  ring1

noncomputable def ofJInvariant0' : EllipticCurve F := by
  by_cases hchar3 : (3 : F) = 0
  · have : Invertible (2 : F) := by
      apply invertibleOfNonzero
      rw [←hchar3]
      apply_fun (· - 2)
      norm_num
      exact zero_ne_one
    exact ofJInvariant1728 F
  · have : Invertible (3 : F) := invertibleOfNonzero hchar3
    exact ofJInvariant0 F

lemma ofJInvariant0'_j : (ofJInvariant0' F).j = 0 := by
  by_cases hchar3 : (3 : F) = 0
  · have : Invertible (2 : F) := by
      apply invertibleOfNonzero
      rw [←hchar3]
      apply_fun (· - 2)
      norm_num
      exact zero_ne_one
    simp only [ofJInvariant0', hchar3, dite_true]
    have := ofJInvariant1728_j F
    rw [calc (1728 : F) = (3 : F) * (576 : F) := by norm_num, hchar3, zero_mul] at this
    -- FIXME: does not work
    exact this
  · have : Invertible (3 : F) := invertibleOfNonzero hchar3
    simp only [ofJInvariant0', hchar3, dite_false]
    -- FIXME: does not work
    exact ofJInvariant0_j F

end EllipticCurve

Jz Pan (Jul 16 2023 at 10:33):

It produces weird error messages:

type mismatch
  this
has type
  j (ofJInvariant1728 F) = 0 : Prop
but is expected to have type
  j (ofJInvariant1728 F) = 0 : Prop

and

type mismatch
  ofJInvariant0_j F
has type
  j (ofJInvariant0 F) = 0 : Prop
but is expected to have type
  j (ofJInvariant0 F) = 0 : Prop

I don't understand where does it go wrong.

Jz Pan (Jul 16 2023 at 10:35):

Also, any suggestions on the proof that -27 and -64 are units of R? Currently my code works, but looks stupid.

Scott Morrison (Jul 16 2023 at 10:35):

You might try turning on more pretty printer options (e.g. set_option pp.all true for the most extreme version).

Scott Morrison (Jul 16 2023 at 10:35):

Also convert this rather than exact this can be helpful for diagnosing why Lean doesn't think the type matches.

Jz Pan (Jul 16 2023 at 10:39):

Ah, convert this works! Thank you very much.

Jz Pan (Aug 07 2023 at 10:12):

Is it possible to overwrite the ext lemma generated by the system? Currently it is

theorem EllipticCurve.ext {R : Type u} :
∀ {inst : CommRing R} (x y : EllipticCurve R),
  x.a₁ = y.a₁ → x.a₂ = y.a₂ → x.a₃ = y.a₃ → x.a₄ = y.a₄ → x.a₆ = y.a₆ → x.Δ' = y.Δ' → x = y

But clearly x.Δ' = y.Δ' is not necessary.

Scott Morrison (Aug 07 2023 at 10:13):

You should just write your own ext lemma rather than using the attribute.

Ruben Van de Velde (Aug 07 2023 at 10:14):

And in particular, you can stick @[ext] on that manually written theorem rather than on the structure

Jz Pan (Aug 07 2023 at 10:20):

Thanks. If I provide ext, can I still ask the system to generateext_iff?

Scott Morrison (Aug 07 2023 at 12:06):

Unfortunately no, you need to that yourself.

David Ang (Aug 07 2023 at 12:20):

I definitely wanted to do this originally, but I didn't want to write the ext_iff manually

Junyan Xu (Aug 08 2023 at 04:51):

Recently I had the chance to think deeply about the treatment of reduction on Cassels' book and work out many details, and have been able to convince myself the approach does work; in fact, both map_add and nonsingularity of reduction of the sum can be proven relatively easily with the approach (probably easier than my former Picard group approach), and the sophisticated multiplicity-based treatments by Silverman/Knapp-Langlands are unnecessary, contrary to what I previously thought. This approach involves homogenization of 2-variable polynomials, and defining reduction definitely requires projective coordinates, which people seem to be developing recently. Thus I'd like to write down a detail exposition of the proofs here to pin down what exactly are needed, so as to better coordinate the efforts.

Junyan Xu (Aug 08 2023 at 04:52):

The basic setting consists of a valuation ring R, its field of fractions K, its maximal ideal m, and the residue field k. Let W be a Weierstrass curve with equation f(X,Y)=0 with coefficients in R, and let F(X,Y,Z) be the homogenization of `f(X,Y). Explicitly,

$F(X,Y,Z) = Y^2 Z + a_1 XYZ + a_3 YZ^2 - (X^3 + a_2 X^2 Z + a_4 XZ^2 + a_6 Z^3).$

It's easy to verify the projective solutions of F(X,Y,Z)=0 exactly consists of affine points [x:y:1] (where (x,y) are solutions of f(X,Y)=0) and the point at infinity O=[0:1:0] (Lemma 1). Every point (singular or not) in W(K) has some projective coordinate $P=[x:y:z]$ ; one or more of x,y,z has minimum valuation and is a GCD of x,y,z. Dividing by the GCD, we may assume one of x,y,z is a unit (or actually 1, if necessary), and x,y,z all lie in R. Any two such choices of x,y,z differ by a unit in R, so the reduction $\tilde{P}=[\tilde{x}:\tilde{y}:\tilde{z}]$ is a well-defined point in $\mathbb{P}^2(k)$ that is a solution of the reduced homogeneous polynomial $\tilde{F}(X,Y,Z)$ (which is the homogenization of $\tilde{f}(X,Y)$ ), and by Lemma 1 again this is either an affine point or O.

Now let P1 and P2 be two arbitrary points in W(K). If both points are affine and P1 ≠ -P2, we know from docs#WeierstrassCurve.addPolynomial_slope that there exists a linear function l(X) = kX + b such that f(X, l(X)) = -(X-x1)(X-x2)(X-x3), where P1 = (x1,y1), P2 = (x2,y2) and P3 = -(P1 + P2) = (x3,y3). Homogenization of this factorization yields L(X,Z) = kX + bZ and $F(X, L(X,Z), Z) = -(X-x_1 Z)(X-x_2 Z)(X-x_3 Z)$ . To be able to reduce the LHS, we need L to have coefficients (k and b) in R, and when this is not the case, we can transform the equation Y = kX + bZ to X = (1/k)Y - (b/k)Z if k has minimal valuation, and Z = (1/b)Y - (k/b)X if b (but not k) has minimal valuation. In the former case, letting L'(Y,Z) = (1/k)Y - (b/k)Z, we obtain $F(L'(Y,Z), Y, Z) = F(L'(Y,Z), L(L'(Y,Z),Z), Z) = -\prod_{i=1}^3 (L'(Y,Z)-x_i Z)$ , and the latter case is similar. We need to check the following key fact is preserved: if we write $L'(Y,Z)-x_i Z = y_i Z - z_i Y$ , then $P_i = [L'(y_i,z_i):y_i:z_i]$ .

If both points are affine but P1 = -P2, or if exactly one of two points (wlog P1) is affine, then we take $L(Y,Z)=x_1 Z$ and check $F(L(Y,Z),Y,Z)=(Y-y_1 Z)(Y-y_i Z)Z$ (where i=2 in the former case and i=3 in the latter case); if $x_1 \notin R$ , we use $L(X,Y)=(1/x_1)X$ instead. Finally, if both points are O we take $L(X,Y)=0$ and check $F(X,Y,L(X,Y))=-X^3$ . In all cases we substitute one of the variable in the homogeneous polynomial by a R-combination of the other two variables, and factorize the resulting homogeneous polynomial in two variables as a product of three linear homogeneous polynomials, from which the projective coordinates of the three points $P_i$ can be read off. Without loss of generality, we'll only discuss the L(Y,Z) case below.

Consider the factorization $F(L(Y,Z), Y, Z) = c\prod_{i=1}^3 (y_i Z-z_i Y)$ ; the factor $c\in K^\times$ appears because we want to ensure that for each $i$ , one of $y_i$ and $z_i$ is 1 and the other also lies in R, in order to perform reduction. This $c$ actually lies in R, because the product of the three linear polynomials is primitive and therefore divides the LHS as polynomials in $R[Y,Z]$ by docs#Polynomial.IsPrimitive.dvd_of_fraction_map_dvd_fraction_map. (Note: the redundant Polynomial.IsPrimitive q condition should be removed, and NormalizedGCDMonoid should be replaced by GCDMonoid. Since mathlib doesn't have primitivity of multivariate polynomials, we may evaluate at Z=1 before applying this argument.) Now we can reduce both sides of the factorization modulo m, and it's crucial that $c$ doesn't reduce to 0 (i.e. $c$ is a unit), by the following argument: if $c=0$ then $\tilde{F}(\tilde{L}(Y,Z),Y,Z)$ is the zero polynomial, which is equivalent to $(X-\tilde{L}(Y,Z))\mid\tilde{F}(X,Y,Z)$ , so $\tilde{F}(X,Y,Z)$ is not irreducible. However, we know that $\tilde{f}(X,Y)$ is irreducible by docs#WeierstrassCurve.irreducible_polynomial, and the homogenization of an irreducible polynomial is again irreducible (Lemma 2), contradiction.

We conclude that $c$ is a unit in R and therefore can be incorporated into one of the factors $y_i Z-z_i Y$ while keeping $(y_i,z_i)=1$ for all $i$ . Reducing both sides of the factorization yields $\tilde{F}(\tilde{L}(Y,Z),Y,Z)=\prod_{i=1}^3 (\tilde{y_i}Z-\tilde{z_i}Y)$ , and $\tilde{P_i}=[\tilde{L}(\tilde{y_i},\tilde{z_i}):\tilde{y_i}:\tilde{z_i}]$ is exactly the reduction of $P_i$ . Because of the way we chose to apply the transformations X = (1/k)Y - (b/k)Z, Z = (1/b)Y - (k/b)X, and Z = (1/x)X, we've made sure that the reduced linear substitution polynomials are of one of three forms: $\tilde{L}(X,Z)=kX+bZ$ , $\tilde{L}(Y,Z)=xZ$ , or $\tilde{L}(X,Y)=0$ . Given the factorization of the substituted polynomial, it should be straightforward to establish that $\tilde{P_3}=-(\tilde{P_1}+\tilde{P_2})$ (i.e., map_add), and that $\tilde{P_3}$ is nonsingular if both $\tilde{P_1}$ and $\tilde{P_2}$ are (e.g. dehomogenize and apply docs#WeierstrassCurve.nonsingular_add' for the affine case).

Proof of Lemma 2: Assume that f(X,Y) is an irreducible polynomial with coefficient in a domain. It's easy to see the newly introduced variable Z doesn't divide the homogenization F(X,Y,Z). For any factorization of its homogenization F=GH, both G and H must be homogeneous (already formalized here) and not divisible by Z, and f=gh where g(X,Y)=G(X,Y,1) and h(X,Y)=H(X,Y,1), so wlog we may assume g is a unit by irreducibility of f. We can write $G(X,Y,Z)=\sum_{i=0}^n G_i(X,Y)Z^i$ with $G_i$ homogeneous of degree n-i, where n is the degree of $G$ , so $g(X,Y)=G(X,Y,1)=\sum_{i=0}^n G_i(X,Y)$ also has degree n because the $Z^0$ coefficient is nonzero. Since g is a unit, n=0 and therefore G is also a unit.

Johan Commelin (Aug 08 2023 at 06:38):

Thanks so much for writing down these observations. Feel free to post them as GH issue. That way it might be easier to preserve them. Here on zulip they might get lost by the flood of messages that gets posted each week.

David Ang (Aug 08 2023 at 18:36):

I have a basic working API of projective coordinates of Weierstrass curves (defined using MvPolynomial over an arbitrary ring, but some things are only correct over a field), but the group law I only have it over a field. The PR should be ready in a couple of weeks.

David Ang (Aug 08 2023 at 18:49):

Although I'm not sure how best to link it back to affine coordinates, since our current affine coordinates is defined as a Polynomial Polynomial. It's not realistic to add another layer of Polynomial since even the definition will time out, so I had to resort to MvPolynomial. We could prove that Polynomial Polynomial is equivalent to MvPolynomial over bool or fin 2, then define Projective.polynomial as the homogenisation of this, but I think that's going to make proving anything rather annoying. If I remember correctly, I think homogenisation exists in FLT-regular but not in mathlib yet?

Currently I just repeated the definition:

local notation "𝐗" => X 0
local notation "𝐘" => X 1
local notation "𝐙" => X 2

noncomputable def polynomial : MvPolynomial (Fin 3) R :=
  𝐘 ^ 2 * 𝐙 + C W.a₁ * 𝐗 * 𝐘 * 𝐙 + C W.a₃ * 𝐘 * 𝐙 ^ 2
    - (𝐗 ^ 3 + C W.a₂ * 𝐗 ^ 2 * 𝐙 + C W.a₄ * 𝐗 * 𝐙 ^ 2 + C W.a₆ * 𝐙 ^ 3)

Then have things like lemma nonsingular_iff_affine (x y : R) : W.nonsingular ![x, y, 1] ↔ W.toAffine.nonsingular x y, etc. without going through homogenisation at all. Is this sufficient?

Eric Rodriguez (Aug 08 2023 at 20:28):

cc @Alex J. Best - I think the homogenisation code would be nice to have in Mathlib - if you don't have time, should we port it?

Damiano Testa (Aug 08 2023 at 21:38):

Junyan, I wonder if the proofs will become simpler by defining the reduction for points on vanishing set of ideals in multivariate polynomials over a valuation ring.

The ideals part may take some of the pain away from dealing with the equation of a line. Moreover, you can then apply the reduction map both to points on E(K) and to collinear points, since those are also points on a closed subset.

Alex J. Best (Aug 08 2023 at 22:24):

Eric Rodriguez said:

cc Alex J. Best - I think the homogenisation code would be nice to have in Mathlib - if you don't have time, should we port it?

I think @Riccardo Brasca already ported it at https://github.com/riccardobrasca/flt-regular/blob/master/FltRegular/ReadyForMathlib/Homogenization.lean so it just needs a little bit of cleanup and then it should be ready to PR. help is always appreciated :smile:

Eric Rodriguez (Aug 08 2023 at 22:44):

I wonder if it'd be nice to have homogenization be defined over two index types, a map from the old to the new, and a place to homogenise at. It'd be weird but for example in this case I guess it'd be nice to have mv_poly fin2 R homogenized to mv_poly fin3 R

Alex J. Best (Aug 08 2023 at 22:50):

Yeah my original thought process is that the way to do that should be to simply move your polynomial to the bigger ring (as its functorial or monadic or whatever in the variable set) and then homogenise using the version I wrote. This seemed to be the more flexible basic set-up rather than baking in two index types with a possibly identity map into the core definition. We probably do want some api for that eventually.

Yaël Dillies (Aug 09 2023 at 19:49):

Let me @Mantas Baksys because we wanted homogenisation for Zsigmondy's theorem.

Junyan Xu (Aug 12 2023 at 04:00):

David Ang said:

I have a basic working API of projective coordinates of Weierstrass curves (defined using MvPolynomial over an arbitrary ring, but some things are only correct over a field), but the group law I only have it over a field. The PR should be ready in a couple of weeks.

Thanks! I won't need to use the group law written in projective coordinates; I just need to know the solutions (over a field) of F(X,Y,Z)=0 (up to simultaneous multiplication) are in 1-1 correspondence to the solutions of f(X,Y)=0 plus O.

Then have things like lemma nonsingular_iff_affine (x y : R) : W.nonsingular ![x, y, 1] ↔ W.toAffine.nonsingular x y, etc. without going through homogenisation at all. Is this sufficient?

Yeah I think a general API connecting R[X][X] and MvPolynomial (Fin 3) R is probably not of very broad interest, so anything we develop will be probably tailored to our specific need and not be very complete. The flt-regular homogenization API is nice, and we could map f(X,Y) into MvPolynomial (Fin 3) R and homogenize it, and I think we could perform the substitutions F(X,Y,1) and F(X,Y,L(X,Y)) all within MvPolynomial (Fin 3) R without the involvement of a bivariate polynomial ring.

Some API that I do want to develop here is for the "concrete projectivization", i.e. P^n(R) for a general ring, as suggested here. Rather than working with a tuple ![x,y,1], we'd work with the quotient of tuples by simultaneous (left) multiplication by a unit of R, which I tentatively call TupleQuotUnits n R. I think the equation and nonsingular predicates would be more convenient when descended to TupleQuotUnits n R. The map TupleQuotUnits n R → TupleQuotUnits n S can be defined for any MonoidHom via Quotient.lift, so the reduction map can be defined on this level. When applied to a RingHom, this map respects the predicate (again defined via Quotient.lift) on TupleQuotUnits n R that says the (left) ideal generated by the elements is the unit ideal, which cuts out the projective space P^n(R) from TupleQuotUnits (n+1) R. (Note for general commutative ring R this only gives a subset of the points in P^n(R), but if Pic(R)=0 we get all points, and this is good enough for our purpose.) We can then show P^n(R) → P^n(Frac(R)) is bijective when R is a valuation ring or a PID, or some more general class of rings. This would allow us to define the reduction map W(K) → W(k).

Some more words about the homogenization API: I originally thought that homogenization would take an arbitrary element in a Nat-graded algebra A and produce an element in A[X] homogeneous w.r.t. the induced Nat-grading (sum of grading on A and polynomial degree). I don't know any application of this generalized version, even though docs#AlgebraicGeometry.Proj.toLocallyRingedSpace does take a Nat-graded algebra. However, since there are the following three advantages of using MvPolynomial (Fin 3) R, I'm now in favor of the flt-regular API:

David has encountered timeout when trying to work with R[X][X][X] but MvPolynomial is OK
Homogeneity is already defined for MvPolynomial but not yet available for R[X][X][X] (also, Nat-grading on MvPolynomial has been PR'd in mathlib#10119 but not for R[X][X])
The three variables have equal status, which makes it easier to make general statements that apply to any of them

Junyan Xu (Aug 12 2023 at 04:04):

Damiano Testa said:

Junyan, I wonder if the proofs will become simpler by defining the reduction for points on vanishing set of ideals in multivariate polynomials over a valuation ring.

The ideals part may take some of the pain away from dealing with the equation of a line. Moreover, you can then apply the reduction map both to points on E(K) and to collinear points, since those are also points on a closed subset.

I'm afraid I have trouble making sense of your idea @Damiano Testa . You are saying there's another definition of reduction when restricted to points on vanishing set of ideals / closed subsets? What can we say about reduction defined this way, and wouldn't it still agree with the map P^n(K) → P^n(k)?

Damiano Testa (Aug 12 2023 at 06:03):

I expressed myself badly. What I meant is basically that once you prove that projective space satisfies the valuative criterion for properness, then you get essentially for free that closed subsets of projective space also satisfy it.

Thus, the reduction map on an elliptic curve, not only maps points of the elliptic curve over K to points over k, but it also maps triples of points summing to zero to triples of points summing to zero.

That was all the content of my message!

Kevin Buzzard (Aug 12 2023 at 08:43):

So we need projective morphisms are proper, which is elimination theory.

Re F[X][X] being slow: I saw an example yesterday where this was due to the ring1 tactic trying and failing to find a proof that F[X][X] had characteristic zero. Removing the instances it was trying made things faster.

David Ang (Aug 12 2023 at 10:49):

Something to note while writing my projective API over a field is that it was a lot easier to work with [(triple, nonsingular)], where [.] denotes the class under the usual equivalence relation, rather than ([(triple, nonzero)], nonsingular), which was why I didn’t reuse the existing projective space API

David Ang (Aug 12 2023 at 10:53):

In any case what I really needed is weighted homogenisation (x,y,z) ~ (u^2x, u^3y, uz), so I’m only writing this to get a feel on how to write my weighted projective API (which cryptographers usually call “Jacobian coordinates”), and I don’t know if flt-regular or anywhere else has this

Antoine Chambert-Loir (Aug 12 2023 at 14:10):

Mathlib knows about homogeneous polynomials with respect to any given weight on variables, docs#MvPolynomial.weightedDegree'

David Ang (Aug 12 2023 at 15:31):

Antoine Chambert-Loir said:

Mathlib knows about homogeneous polynomials with respect to any given weight on variables, docs#MvPolynomial.weightedDegree'

Would it be easy to generalise the homogenisation in flt-regular to this weighted context?

Junyan Xu (Aug 13 2023 at 03:26):

David Ang said:

Antoine Chambert-Loir said:

Mathlib knows about homogeneous polynomials with respect to any given weight on variables, docs#MvPolynomial.weightedDegree'

Would it be easy to generalise the homogenisation in flt-regular to this weighted context?

For the definition, we just need to change totalDegree here to weightedDegree' (w i needs to be 1 though) ...

Something to note while writing my projective API over a field is that it was a lot easier to work with [(triple, nonsingular)], where [.] denotes the class under the usual equivalence relation, rather than ([(triple, nonzero)], nonsingular), which was why I didn’t reuse the existing projective space API

That's good to know, though what I'm proposing is something like ([triple], [nonsingular]).

I expressed myself badly. What I meant is basically that once you prove that projective space satisfies the valuative criterion for properness, then you get essentially for free that closed subsets of projective space also satisfy it.

If I understand correctly, this is saying if a point in P^n(K) lies on some closed subset cut out by some equations like F(X,Y,Z)=0 or Z=L(X,Y) with coefficients in R, then its reduction also satisfies the equation. But this isn't hard to prove, right? (Just reduce the equations.) And we don't just want to know the reduction of the third point lies on the line: if the reduction of the three points are distinct then it's enough, but if not we need to know the third reduction is the negative of the sum of the other two reductions, for which it's crucial to use a factorization so we can express the coordinates of the third reduction linearly in terms of the other two reductions via Vieta's rule.

We can then show P^n(R) → P^n(Frac(R)) is bijective when R is a valuation ring or a PID, or some more general class of rings.

Just noticed Bezout domains are a common generalization.

Eric Rodriguez (Aug 14 2023 at 23:16):

David Ang said:

Antoine Chambert-Loir said:

Mathlib knows about homogeneous polynomials with respect to any given weight on variables, docs#MvPolynomial.weightedDegree'

Would it be easy to generalise the homogenisation in flt-regular to this weighted context?

this should be easy, the definition is finsupp.map_domain (λ j, j + finsupp.single i (p.total_degree - j.sum (λ _ m, m))) p; the most annoying thing would probably just be to make API linking this to the non-weighted case

Damiano Testa (Aug 16 2023 at 07:44):

Junyan Xu said:

I expressed myself badly. What I meant is basically that once you prove that projective space satisfies the valuative criterion for properness, then you get essentially for free that closed subsets of projective space also satisfy it.

If I understand correctly, this is saying if a point in P^n(K) lies on some closed subset cut out by some equations like F(X,Y,Z)=0 or Z=L(X,Y) with coefficients in R, then its reduction also satisfies the equation. But this isn't hard to prove, right? (Just reduce the equations.) And we don't just want to know the reduction of the third point lies on the line: if the reduction of the three points are distinct then it's enough, but if not we need to know the third reduction is the negative of the sum of the other two reductions, for which it's crucial to use a factorization so we can express the coordinates of the third reduction linearly in terms of the other two reductions via Vieta's rule.

I need to think more about how to formalize this, but in the "informal" case, you would get the "correct distribution of multiplicities" by functoriality. I suspect that this kind of functoriality would come from the "functor of points", which I think is actually not in mathlib. So maybe your approach with explicit factorisations might be quicker for your goal.

David Ang (Nov 17 2023 at 22:31):

I've been busy with other things, but here is finally some basic API for projective coordinates. @Junyan Xu only wanted the definition and some predicates, but I needed to write out the full group law for other purposes. I'm requesting for comments on the code before going further and adding documentation. In particular I have not defined the map induced by a MonoidHom but this should be easy once this PR is OK.

Jz Pan (Apr 09 2024 at 22:44):

I have some random remarks on the design of the elliptic curve API, which may be nonsense.

Contrary to current design of EllipticCurve which extends WeierstrassCurve with a discriminant unit assumption, I think maybe it's better to only have WeierstrassCurve structure, and we can make discriminant unit assumption as a typeclass WeierstrassCurve.NonDegenerate. In this way the results requiring elliptic curve can be redesigned as requiring E : WeierstrassCurve with [E.NonDegenerate].

Kevin Buzzard (Apr 09 2024 at 22:55):

My initial reaction to this is "how do we explain to mathematicians this weird way of saying ""let E be an elliptic curve""?" . But I don't want to dismiss your idea, especially if we can somehow do it with notation or other trickery.

David Ang (Apr 09 2024 at 22:57):

This might be a good idea. I noticed that it's somewhat awkward to work with elliptic curves E, since most of the API are developed around Weierstrass curves, so we keep having to do E.toWeierstrassCurve e.g. for points and base changes. Even worse, now that I added affine coordinates, etc, so we might have to do E.toWeierstrassCurve.toAffine etc...

Jz Pan (Apr 09 2024 at 23:08):

Kevin Buzzard said:

how do we explain to mathematicians this weird way of saying ""let E be an elliptic curve""?

Informally we would say "let y ^ 2 = x ^ 3 + a * x + b be an elliptic curve", which just means a Weierstrass curve, implicitly with invertible discriminant assumption.

Jz Pan (Apr 16 2024 at 20:45):

A disadvantage is that it makes the definition of the inverse of discriminant and the j-invariant non-computable, if WeierstrassCurve.NonDegenerate is a Prop. Perhaps we can make it a Type which contains the inverse of the discriminant, but in this way it is not a subsingleton.

Johan Commelin (Apr 17 2024 at 03:00):

It will still be a subsingleton, right?

Jz Pan (Apr 17 2024 at 10:31):

Johan Commelin said:

It will still be a subsingleton, right?

Yes, but it needs a proof I think. I'm not sure the impact if two WeierstrassCurve.NonDegenerate are provided.

Junyan Xu (Apr 17 2024 at 17:30):

The Type proposal sounds good to me, and it just amounts to extracting the two fields Δ' : Rˣ and coe_Δ' : ↑self.Δ' into a structure Nondegenerate out of EllipticCurve, right?

Jz Pan (Apr 17 2024 at 19:32):

Junyan Xu said:

The Type proposal sounds good to me, and it just amounts to extracting the two fields Δ' : Rˣ and coe_Δ' : ↑self.Δ' into a structure Nondegenerate out of EllipticCurve, right?

I think yes. And we can define WeierstrassCurve.Δ' to be the Δ' in corresponding Nondegenerate, etc., and the current API shouldn't need to be changed (besides the input arguments).

Jz Pan (Apr 26 2024 at 19:31):

I think there is a problem when using typeclasses. For example, if we have h : E = E' and E.NonDegenerate and E'.NonDegenerate, how can we get E.j = E'.j? I think in this case rw [h] does not work, since it cannot transfer instances.

Joël Riou (Apr 26 2024 at 20:17):

subst h should work.

Jz Pan (Apr 26 2024 at 20:57):

Thank you. The following code works:

import Mathlib.Tactic

variable (R : Type*) [CommRing R]

structure Test where
  a : R
  b : R

variable {R}

class Test.NonDegenerate (E : Test R) where
  b' : Rˣ
  coe_b' : (b' : R) = E.b

instance Test.NonDegenerate.subsingleton (E : Test R) : Subsingleton E.NonDegenerate where
  allEq x y := by
    obtain ⟨x, hx⟩ := x
    obtain ⟨y, hy⟩ := y
    have : x = y := by ext; rw [hx, ← hy]
    simp_rw [this]

def Test.b' (E : Test R) [E.NonDegenerate] : Rˣ :=
  Test.NonDegenerate.b' E

theorem Test.coe_b' (E : Test R) [E.NonDegenerate] : (E.b' : R) = E.b :=
  Test.NonDegenerate.coe_b'

def Test.j (E : Test R) [E.NonDegenerate] : R :=
  E.a * E.b'⁻¹

example (E E' : Test R) [E.NonDegenerate] [E'.NonDegenerate] (h : E = E') :
    E.j = E'.j := by
  subst h
  congr!

Jz Pan (Apr 26 2024 at 20:58):

The simp_rw [this] could be replaced by subst this; rfl.

Jz Pan (Apr 26 2024 at 21:07):

The Prop version seems simpler, but at the expense that the j becomes noncomputable:

import Mathlib.Tactic

variable (R : Type*) [CommRing R]

structure Test where
  a : R
  b : R

variable {R}

class Test.NonDegenerate (E : Test R) : Prop where
  isUnit : IsUnit E.b

noncomputable def Test.b' (E : Test R) [E.NonDegenerate] : Rˣ :=
  (Test.NonDegenerate.isUnit (E := E)).unit

theorem Test.coe_b' (E : Test R) [E.NonDegenerate] : (E.b' : R) = E.b := rfl

noncomputable def Test.j (E : Test R) [E.NonDegenerate] : R :=
  E.a * E.b'⁻¹

example (E E' : Test R) [E.NonDegenerate] [E'.NonDegenerate] (h : E = E') :
    E.j = E'.j := by
  simp_rw [h]

David Ang (May 03 2024 at 12:23):

Junyan Xu said:

Just noticed Bezout domains are a common generalization.

Can also prove W(K) -> W(Frac(K)) is bijective for GCDMonoids since all we need is division by the GCD of the coordinates? However the argument to prove reduction is a group homomorphism seems to still need the restriction to valuation rings...

David Michael Roberts (May 03 2024 at 12:34):

Maybe https://ncatlab.org/nlab/show/GCD+ring ? (perhaps a stupid suggestion, I haven't actually thought about this for one moment)

David Michael Roberts (May 03 2024 at 12:37):

an integral gcd ring with [Krull] dimension ≤ 1 is a Bézout ring

From http://hlombardi.free.fr/CACM.pdf

Jz Pan (Sep 16 2024 at 18:48):

I created a draft PR #16864 which defines some normal forms of elliptic curves under various characteristic. Currently only char not equal to 2 normal form is defined. Do you have suggestions on the API style? Thanks.

Johan Commelin (Sep 17 2024 at 08:39):

I left some minor comments

Jz Pan (Sep 17 2024 at 10:29):

Johan Commelin said:

I left some minor comments

Thank you!

Now there is a problem that the name IsNormalFormOfCharNeTwo is too long. Since we have normal forms of characteristic not equal to 2 case and characteristic equal to 2 case, it cannot simply be IsNormalForm₂.

What about IsNFCharNeTwo or IsNFCharNe2?

David Ang (Sep 17 2024 at 10:34):

Thanks for this! I did something similar in Lean 3, and also proved that the change of variables induces an isomorphism on the group of points. Maybe there's something that you can get out of the old file.

Jz Pan (Sep 17 2024 at 10:38):

David Ang said:

Thanks for this! I did something similar in Lean 3, and also proved that the change of variables induces an isomorphism on the group of points. Maybe there's something that you can get out of the old file.

Thank you! It happened that I also did it in Lean 3 :joy:

Jz Pan (Sep 17 2024 at 10:40):

proved that the change of variables induces an isomorphism on the group of points

By the way, this should be a general result: any linear change of variable (docs#WeierstrassCurve.VariableChange) should induce an isomorphism on the group of points. Looks like that it is not in mathlib4 yet.

David Ang (Sep 17 2024 at 11:00):

Yep, we should add that first...

Johan Commelin (Sep 17 2024 at 11:26):

Jz Pan said:

Now there is a problem that the name IsNormalFormOfCharNeTwo is too long. Since we have normal forms of characteristic not equal to 2 case and characteristic equal to 2 case, it cannot simply be IsNormalForm₂.

What about IsNFCharNeTwo or IsNFCharNe2?

Yeah, that could work. I thought the normal forms depended on what is invertible: 2, or 3, or 6. But maybe I misremember...

David Ang (Sep 17 2024 at 12:14):

I thought the normal forms depended on what is invertible

Mostly yeah, but Silverman also defined normal forms in characteristic 2

Jz Pan (Sep 29 2024 at 07:41):

@David Ang

BTW I do think your idea posted on Zulip of changing EllipticCurve into a WeierstrassCurve with an extra predicate (IsElliptic/IsSmooth/IsNeZeroDisc? Rather than have it be an extended structure) is fantastic, and would prevent issues like this in the future, but I haven't got around it. Have you made any attempts at this?

I finished a small experiment, so far so good, except for Inhabited (EllipticCurve F) cannot be stated anymore...

Jz Pan (Sep 29 2024 at 07:41):

WeierstrassTest.lean

Jz Pan (Oct 04 2024 at 16:57):

Are we able to define the invariant differential for elliptic curves defined by Weierstrass equation now?

David Ang (Oct 05 2024 at 14:12):

How do you write dx?

Jz Pan (Oct 05 2024 at 14:21):

David Ang said:

How do you write dx?

I'm thinking of an ad-hoc way to define it: first define the space of rational differential forms to be free K(E)-module of rank one with generator dx which is a formal symbol, here K(E) is the field of rational functions of E over K. Then we can define dy, or more generally, if f is a regular function (or rational function), we can define df. For a rational differential form we can define the order of zeroes/poles at a point on E. Then we can define regular differential forms. We define translation of differential forms with respect to the addition law. Finally we can define invariant differential.

Or can we use Kahler differential of K[E] over K? I think it is already defined in mathlib.

Jz Pan (Nov 01 2024 at 16:53):

I created a draft PR #18531 to add WeierstrassCurve.Elliptic and remove EllipticCurve. Of course, currently there are errors everywhere. @David Ang do you have any suggestions?

David Ang (Nov 01 2024 at 17:51):

Jz Pan said:

I created a draft PR #18531 to add WeierstrassCurve.Elliptic and remove EllipticCurve. Of course, currently there are errors everywhere. David Ang do you have any suggestions?

This looks promising, but I'll only have time to look at it later next week. Should it be IsElliptic instead?

Jz Pan (Nov 01 2024 at 18:21):

David Ang said:

I'll only have time to look at it later next week.

That's fine, meanwhile I'll try to fix the other errors.

Should it be IsElliptic instead?

No, because it is Type but not Prop, it contains the (unique) choice of the inverse of discriminant.

Kevin Buzzard (Nov 02 2024 at 00:55):

Maybe you can use docs#IsUnit instead?

Jz Pan (Nov 02 2024 at 05:31):

Kevin Buzzard said:

Maybe you can use docs#IsUnit instead?

See https://leanprover.zulipchat.com/#narrow/channel/116395-maths/topic/thoughts.20on.20elliptic.20curves/near/435659226. The original design of EllipticCurve includes a choice of the inverse of discriminant, so my Elliptic also includes it. It is possible to use IsUnit, but then Δ' and j, etc., becomes noncomputable, as IsUnit.unit is noncomputable (which uses Classical.choose).

Ruben Van de Velde (Nov 02 2024 at 09:12):

And?

Jz Pan (Nov 02 2024 at 09:20):

I think using Invertible makes minimal changes to existing code. :cold_sweat:

Jz Pan (Nov 02 2024 at 09:31):

I think all compile errors in #18531 is fixed, but there are some weird errors:

/- The `simpNF` linter reports:
SOME SIMP LEMMAS ARE NOT IN SIMP-NORMAL FORM.
see note [simp-normal form] for tips how to debug this.
https://leanprover-community.github.io/mathlib_docs/notes.html#simp-normal%20form -/
-- Mathlib.AlgebraicGeometry.EllipticCurve.VariableChange
Error: ././././Mathlib/AlgebraicGeometry/EllipticCurve/VariableChange.lean:301:1: error: @WeierstrassCurve.variableChange_j LINTER FAILED:
simplify fails on left-hand side:
(deterministic) timeout at `isDefEq`, maximum number of heartbeats (200000) has been reached
Use `set_option maxHeartbeats <num>` to set the limit.
Additional diagnostic information may be available using the `set_option diagnostics true` command.

-- Mathlib.AlgebraicGeometry.EllipticCurve.Weierstrass
Error: ././././Mathlib/AlgebraicGeometry/EllipticCurve/Weierstrass.lean:514:1: error: @WeierstrassCurve.map_j LINTER FAILED:
simplify fails on left-hand side:
(deterministic) timeout at `whnf`, maximum number of heartbeats (200000) has been reached
Use `set_option maxHeartbeats <num>` to set the limit.
Additional diagnostic information may be available using the `set_option diagnostics true` command. ::error::The process '/usr/bin/env' failed with exit code 1

Jz Pan (Nov 02 2024 at 11:38):

I don't know what's that, and I can't reproduce this at my computer.

Dean Young (Nov 03 2024 at 06:57):

The Weil uniformization theorem is very interesting and related to the double quotients in Langlands'. Has anyone thought ab out how far we are from something like that?

Jz Pan (Nov 03 2024 at 09:41):

I've tested that Elliptic : Type -> IsElliptic : Prop and which eliminates the above mysterious timeout error. At the expense that Δ' and j being noncomputable...

Kevin Buzzard (Nov 03 2024 at 14:51):

I wouldn't worry about noncomputability. If someone wants to compute j or Delta in the sense of "work out the coefficient of q^37" then they're certainly not going to do it by using what we have, they're going to write completely different algorithms and then prove that they agree with our (soon to be noncomputable?) definitions. The moment we leave the absolute basics, there's a big difference between the largely theoretical concept of being computable as far as lean goes, and being useful.

Jz Pan (Nov 04 2024 at 06:16):

PR #18531 is ready for review. It adds WeierstrassCurve.IsElliptic and removes EllipticCurve.

Jz Pan (Nov 04 2024 at 06:17):

E. Dean Young said:

The Weil uniformization theorem is very interesting and related to the double quotients in Langland's. Has anyone thought ab out how far we are from something like that?

I don't know if it is directly related to elliptic curves. Maybe you should open a new topic?

Jz Pan (Dec 01 2024 at 09:58):

I'm thinking of refactoring WeierstrassCurve.VariableChange.

My plan is to promote the use of standard notation 1, * and ⁻¹ instead of id, comp and inv. Remove id_comp, etc. and use standard one_mul, etc. instead. Meanwhile comp_def (or mul_def) and comp_(u|r|s|t) (or mul_(u|r|s|t)) is preserved as projection lemmas. How can I accomplish that?

Johan Commelin (Dec 02 2024 at 17:36):

I'm not sure I understand what the question is.

Jz Pan (Dec 02 2024 at 17:52):

Let me see how to say.

Take PerfectClosure as an example. It has PerfectClosure.zero_def which states 0 = PerfectClosure.mk K p (0, 0) and PerfectClosure.mk_zero which is a simp lemma and stats PerfectClosure.mk K p 0 = 0. There is no def PerfectClosure.zero := PerfectClosure.mk K p 0. There is no PerfectClosure.mul_assoc, one should use the general mul_assoc instead.

I want to do the same thing for WeierstrassCurve.VariableChange, i.e. don't def WeierstrassCurve.VariableChange.id := ⟨1, 0, 0, 0⟩, but WeierstrassCurve.VariableChange.one_def : 1 = ⟨1, 0, 0, 0⟩ instead. Don't WeierstrassCurve.VariableChange.comp_assoc but use the general mul_assoc instead, etc. etc.

Do you think it's a good idea?

Junyan Xu (Dec 02 2024 at 18:08):

You just want to put a Group structure on WeierstrassCurve.VariableChange, right? And then you can define a MulAction on Weierstrass/EllipticCurve. Sounds like a good idea!

Jz Pan (Dec 02 2024 at 18:35):

Junyan Xu said:

You just want to put a Group structure on WeierstrassCurve.VariableChange, right? And then you can define a MulAction on Weierstrass/EllipticCurve. Sounds like a good idea!

We already have these: WeierstrassCurve.VariableChange.instGroup and WeierstrassCurve.instMulActionVariableChange.

What I mean is currently we have two ways to say the same thing, e.g. WeierstrassCurve.VariableChange.id and 1, WeierstrassCurve.VariableChange.comp and *, WeierstrassCurve.VariableChange.inv and ⁻¹. This makes rw sometimes not working; you need change or erw.

For PerfectClosure there is only one way to say zero: 0. (Of course you can always write PerfectClosure.mk K p 0, just like VariableChange you can always write ⟨1, 0, 0, 0⟩ instead of 1.) I think this makes the code unified.

Johan Commelin (Dec 03 2024 at 07:31):

I see! Sounds like a good change

David Ang (Feb 26 2025 at 16:41):

I'm writing refactor PRs for splitting the coordinate files, and the topic of having many erw's came up - pretty much all of them are just a result of P : Fin 3 → R and ![P 0, P 1, P 2] not being definitionally equal:

lemma fin3_def {R : Type} [CommRing R] (P : Fin 3 → R) : ![P 0, P 1, P 2] = P := by ext n; fin_cases n <;> rfl
lemma smul_fin3 {R : Type} [CommRing R] (P : Fin 3 → R) (u : R) : u • P = ![u * P x, u * P y, u * P z] := by simp [← List.ofFn_inj]

These two lemmas are used everywhere, and sometimes it's just inconvenient to have them, so I just used erw to make proofs shorter. A PR by @Kim Morrison a week ago added an implementation note saying "This file makes extensive use of erw to get around this problem. While erw is often an indication of a problem, in this case it is self-contained and should not cause any issues. It would alternatively be possible to add some automation to assist here.".

I think we can do slightly better. I propose to use R × R × R instead of Fin 3 → R for Jacobian/projective coordinates:

example {R : Type} [CommRing R] (P : R × R × R) : (P.1, P.2.1, P.2.2) = P := rfl
example {R : Type} [CommRing R] (P : R × R × R) (u : R) : u • P = (u * P.1, u * P.2.1, u * P.2.2) := rfl

Note that my original PRs for these coordinates use custom types of triples, but @Junyan Xu pointed out that Fin 3 → R already has a natural scalar action by units so we switched to that, but R × R × R also has a natural scalar action by units so we're also good here. This way (I think) all the 30-ish erw's in those files can be removed, and IMO (x, y, z) looks slightly better than ![x, y, z]. We can also do something similar for affine coordinates instead of having two separate arguments x and y. There are two catches: (1) P.2.1 and P.2.2 look stupid (can we abuse notation so that .3 does exactly what .2.2 does?) and (2) the comp_* lemmas have to be written slightly differently.

Kevin Buzzard (Feb 26 2025 at 16:44):

def Prod.thd {A B C : Type*} (x : A × B × C) := x.2.2

David Ang (Feb 26 2025 at 16:45):

Something like that, but we also need to reprogram Prod.snd to do Prod.snd.fst :upside_down:

Andrew Yang (Feb 26 2025 at 16:45):

Um... this surprisingly works? How desirable this is is another question though...

notation a ".3" => Prod.snd (Prod.snd a)
example {α β γ : Type*} (e : α × β × γ) : γ := e.3

David Ang (Feb 26 2025 at 16:46):

We can do scoped notation just for these files

Aaron Liu (Feb 26 2025 at 16:50):

.2 works too

import Mathlib.Tactic.TypeStar

notation a ".2" => Prod.fst (Prod.snd a)
notation a ".3" => Prod.snd (Prod.snd a)

variable {α β γ : Type*} (e : α × β × γ)

example : α := e.1
example : β := e.2
example : γ := e.3

Johan Commelin (Feb 26 2025 at 17:21):

Alternatively, define .x, .y and .z.

David Ang (Feb 26 2025 at 17:23):

Thanks everyone - I think I don't doubt that this is possible, just whether refactoring into this is a good idea

Jz Pan (Feb 26 2025 at 17:57):

Aaron Liu said:

.2 works too

Would this break h.2 if h : α × β or h : P /\ Q or h : P <-> Q?

Aaron Liu (Feb 26 2025 at 18:02):

Jz Pan said:

Aaron Liu said:

.2 works too

Would this break h.2 if h : α × β or h : P /\ Q or h : P <-> Q?

I don't know! Let's try it out.

import Mathlib.Tactic.TypeStar

notation a ".2" => Prod.fst (Prod.snd a)
notation a ".3" => Prod.snd (Prod.snd a)

variable {α β γ : Type*} (e : α × β × γ)

example : α := e.1
example : β := e.2
example : γ := e.3

/-
application type mismatch
  Prod.snd h
argument
  h
has type
  True ∧ False : Prop
but is expected to have type
  ?m.3441 × ?m.3437 × ?m.3438 : Type (max ?u.3440 ?u.3435 ?u.3436)
-/
example (h : True ∧ False) : False := h.2

It seems like it breaks.

Kevin Buzzard (Feb 26 2025 at 20:13):

I think .x, .y and .z (or perhaps .X, .Y and .Z) is probably a better idea: these are the standard variable names for what is going on here.

Damiano Testa (Feb 26 2025 at 20:52):

This suggestion comes up periodically (#Is there code for X? > ✔ product projection dot notation beyond 2? ) and it is not really too good of an idea.

Aaron Liu (Feb 26 2025 at 21:31):

Here's a fun toy implementation I came up with, what do you all think?

import Lean

section
open Lean Meta Elab

syntax (name := extendedDot) term ".." num : term

@[term_elab extendedDot]
def elabExtendedDot : Term.TermElab := fun stx type? => do
  match stx with
  | `($t:term..$n:num) => do
    let et ← Term.elabTerm t none
    let n := n.getNat
    let tt ← whnfD (← inferType et)
    tt.withApp fun e args =>
      matchConstStructureLike e
        (fun _ ↦ if n = 1 then Term.elabTerm t e else throwUnsupportedSyntax)
        fun i _ c => do
          let a := c.numFields
          let fallback : Unit → TermElabM Expr := fun _ ↦ do
            let ti ← do
              let info ← MonadRef.mkInfoFromRefPos
              pure
                {raw :=
                  Syntax.node3 info `Lean.Parser.Term.proj t.raw (Syntax.atom info ".")
                  (Syntax.node1 info `fieldIdx (Syntax.atom info (toString n))) : TSyntax `term}
            let ti ← Term.elabTerm ti type?
            return ti
          if n < a then return ← fallback ()
          let nc := n - a + 1
          let some ll := args[args.size - 1]? | throwUnsupportedSyntax
          let ti ← do
            let info ← MonadRef.mkInfoFromRefPos
            pure
              {raw :=
                Syntax.node3 info `extendedDot
                  (Syntax.node3 info `Lean.Parser.Term.proj t.raw (Syntax.atom info ".")
                  (Syntax.node1 info `fieldIdx (Syntax.atom info (toString a))))
                  (Syntax.atom info "..") (Syntax.mkNatLit nc) : TSyntax `term}
          Term.elabTerm ti type?
  | _ => throwUnsupportedSyntax

end

universe u v w
variable {α : Type u} {β : Type v} {γ : Type w} (e : α × (Sigma fun (_ : β) => γ))

example : α := e..1
example : β := e..2
example : γ := e..3

example (h : True ∧ False ∧ False) : False := h..3

Jz Pan (Feb 26 2025 at 23:42):

I think .. is used in somewhere else in mathlib. What about .' ?

Another idea is write a x[0] x[1] x[2] ... processor for Prod. This should not conflict with existing things as Prod doesn't have [ ] yet, it's only for Array, List, etc.

Ruben Van de Velde (Feb 26 2025 at 23:45):

I think the fact that we're using fixed length vectors is at least a pink flag, and a simple structure with three fields might be better

Aaron Liu (Feb 27 2025 at 00:52):

Jz Pan said:

I think .. is used in somewhere else in mathlib. What about .' ?

Indeed .. is used in interval integrals. The reason I chose .. was because . didn't work and it was the first thing that came to mind. I have no problem with any other notation, .' should be fine.

Jz Pan (Feb 27 2025 at 01:07):

I propose a "recursive projection" notation .' whose implementation should be in #maths > thoughts on elliptic curves @ 💬 . This notation should not be conflict with the existing dot notation and other notations, since an identifier cannot start with '. This can be viewed as an inverse to recursive anonymous constructor (i.e. ⟨x, y, z⟩ is ⟨x, ⟨y, z⟩⟩ but not ⟨⟨x, y⟩, z⟩, and (x, y, z) is (x, (y, z)) but not ((x, y), z)). It only recursive at last member. Example: if x : A × B × C then x.'1, x.'2, x.'3 are x.1, x.2.1, x.2.2 respectively. If x: (A × B) × C then there are only x.'1 and x.'2.

Yakov Pechersky (Feb 27 2025 at 01:42):

Wouldn't a docs#GetElem instance work, like your previous message? And it can work recursively if the right side of the prod has its own GetElem instance.

Jz Pan (Feb 27 2025 at 02:45):

Good idea! But seems that GetElem does not support heterogeneous output types, right? So we can only add it to α × α, α × (α × α), etc.

Let me try.

Jz Pan (Feb 27 2025 at 03:11):

That's working!

universe u v

namespace Prod

variable (α : Type u) (β : Type v) (valid : outParam (β → Nat → Prop)) [GetElem β Nat α valid]

instance instGetElem : GetElem (α × α) Nat α fun _ i => i < 2 where
  getElem xs i _ := if i = 0 then xs.1 else xs.2

protected abbrev instGetElemOfGetElem.valid : α × β → Nat → Prop
| _, 0 => True
| (_, y), n + 1 => valid y n

instance instGetElemOfGetElem : GetElem (α × β) Nat α (instGetElemOfGetElem.valid α β valid) where
  getElem xs i h := match xs, i with
    | (x, _), 0 => x
    | (_, y), n + 1 => y[n]'h

end Prod

#eval (1, 2)[0]
#eval (1, 2)[1]
#eval (1, 2, 3, 4)[3]
#eval (1, 0, [2, 3, 4])[3]

The instGetElemOfGetElem.valid must be abbrev but not def, otherwise the last two #eval is not working.

Jz Pan (Feb 27 2025 at 03:15):

Seems that GetElem? instance is not synthesized automatically...

Aaron Liu (Feb 27 2025 at 03:22):

You probably need a Decidable instance.

Jz Pan (Feb 27 2025 at 03:34):

Thank you! This works:

universe u v

namespace Prod

variable (α : Type u) (β : Type v) (valid : outParam (β → Nat → Prop)) [GetElem β Nat α valid]

instance instGetElem : GetElem (α × α) Nat α fun _ i => i < 2 where
  getElem xs i _ := if i = 0 then xs.1 else xs.2

protected abbrev instGetElemOfGetElem.valid : α × β → Nat → Prop
  | _, 0 => True
  | (_, y), n + 1 => valid y n

instance instGetElemOfGetElem.instDecidableValid [(xs : β) → (i : Nat) → Decidable (valid xs i)] :
    (xs : α × β) → (i : Nat) → Decidable (instGetElemOfGetElem.valid α β valid xs i) :=
  fun xs i => match xs, i with
    | _, 0 => .isTrue trivial
    | (_, y), n + 1 => inferInstanceAs (Decidable (valid y n))

instance instGetElemOfGetElem : GetElem (α × β) Nat α (instGetElemOfGetElem.valid α β valid) where
  getElem xs i h := match xs, i with
    | (x, _), 0 => x
    | (_, y), n + 1 => y[n]'h

end Prod

#eval (1, 2)[0]
#eval (1, 2)[1]
#eval (1, 2, 3, 4)[3]
#eval (1, 0, [2, 3, 4])[5]?

Kim Morrison (Feb 27 2025 at 03:45):

#eval (1, 0, [2, 3, 4])[5]? this is surprising, and would prefer an implementation where this didn't work.

Jz Pan (Feb 27 2025 at 03:50):

Kim Morrison said:

#eval (1, 0, [2, 3, 4])[5]? this is surprising, and would prefer an implementation where this didn't work.

it's [ ]? and the return type is Option Nat, and of course the value is none.

Yakov Pechersky (Feb 27 2025 at 03:58):

Maybe an @[inline] def is enough, thinking about why you need abbrev

Kim Morrison (Feb 27 2025 at 04:06):

Jz Pan said:

Kim Morrison said:

#eval (1, 0, [2, 3, 4])[5]? this is surprising, and would prefer an implementation where this didn't work.

it's [ ]? and the return type is Option Nat, and of course the value is none.

Yes, I understand --- and I'm saying I don't want such an instance to exist!

Kim Morrison (Feb 27 2025 at 04:07):

That is, if we do this it should only be for homogeneous products.

Johan Commelin (Feb 27 2025 at 04:38):

I'm not so concerned about the fact that

#eval (1, 0, [2, 3, 4])[5]? -- returns `some none`

but I don't like that

#eval (1, 0, [2, 3, 4])[2]? -- returns `some 2`

Can GetElem be heterogeneous? Could the latter return some [2,3,4]?
If not, then I would also prefer to stick with homogeneous products.

Jz Pan (Feb 27 2025 at 05:47):

Yakov Pechersky said:

Maybe an @[inline] def is enough, thinking about why you need abbrev

Well, I'm not aware of @[inline] def. Does this still work with Decidable instance?

Jz Pan (Feb 27 2025 at 05:54):

Johan Commelin said:

... but I don't like that

#eval (1, 0, [2, 3, 4])[2]? -- returns `some 2`

Why not? It's cool :sunglasses:

Jz Pan (Feb 27 2025 at 05:58):

Johan Commelin said:

Can GetElem be heterogeneous? Could the latter return some [2,3,4]?

Unfortunately, I think GetElem can only be homogeneous. For heterogeneous one, I think we should stick to the proposed .' syntax as above. Another advantage to .' is that it works on meta level, so it won't produce additional intermediate codes for proof term.

If not, then I would also prefer to stick with homogeneous products.

That's fine. Anyways, the following code should restrict ourselves to homogeneous products (unless a GetElem (α × β) instance is provided manually, e.g. change def instGetElemOfGetElem to an instance).

universe u v

namespace Prod

variable (α : Type u) (β : Type v) (valid : outParam (β → Nat → Prop)) [GetElem β Nat α valid]

instance instGetElem : GetElem (α × α) Nat α fun _ i => i < 2 where
  getElem xs i _ := if i = 0 then xs.1 else xs.2

protected abbrev instGetElemOfGetElem.valid : α × β → Nat → Prop
  | _, 0 => True
  | (_, y), n + 1 => valid y n

instance instGetElemOfGetElem.instDecidableValid [(xs : β) → (i : Nat) → Decidable (valid xs i)] :
    (xs : α × β) → (i : Nat) → Decidable (instGetElemOfGetElem.valid α β valid xs i) :=
  fun xs i => match xs, i with
    | _, 0 => .isTrue trivial
    | (_, y), n + 1 => inferInstanceAs (Decidable (valid y n))

def instGetElemOfGetElem : GetElem (α × β) Nat α (instGetElemOfGetElem.valid α β valid) where
  getElem xs i h := match xs, i with
    | (x, _), 0 => x
    | (_, y), n + 1 => y[n]'h

instance instGetElemOfGetElem'
    (valid : outParam (α × β → Nat → Prop)) [GetElem (α × β) Nat α valid] :
    GetElem (α × α × β) Nat α (instGetElemOfGetElem.valid α _ valid) :=
  instGetElemOfGetElem α _ valid

end Prod

#eval (1, 2)[0]
#eval (1, 2)[1]
#eval (1, 2, 3, 4)[3]
#eval (1, 0, [2, 3, 4])[2] -- now this doesn't work

Yakov Pechersky (Feb 27 2025 at 06:01):

Jz Pan said:

Well, I'm not aware of @[inline] def. Does this still work with Decidable instance?

Yes, your example above still works when it's an @[inline] def. AFAIK, abbrev is @[inline, reducible].

Yakov Pechersky (Feb 27 2025 at 06:02):

I implemented it to work only on homogeneous products this way:

universe u v

namespace Prod

variable (α : Type u) (β : Type u) (valid : outParam (α × β → Nat → Prop)) [GetElem (α × β) Nat α valid]

instance instGetElem : GetElem (α × α) Nat α fun _ i => i < 2 where
  getElem xs i _ := if i = 0 then xs.1 else xs.2

@[inline]
protected def instGetElemOfGetElem.valid : α × α × β → Nat → Prop
  | _, 0 => True
  | (_, y), n + 1 => valid y n

instance instGetElemOfGetElem.instDecidableValid [(xs : α × β) → (i : Nat) → Decidable (valid xs i)] :
    (xs : α × α × β) → (i : Nat) → Decidable (instGetElemOfGetElem.valid α β valid xs i) :=
  fun xs i => match xs, i with
    | _, 0 => .isTrue trivial
    | (_, y), n + 1 => inferInstanceAs (Decidable (valid y n))

instance instGetElemOfGetElem : GetElem (α × α × β) Nat α (instGetElemOfGetElem.valid α β valid) where
  getElem xs i h := match xs, i with
    | (x, _), 0 => x
    | (_, y), n + 1 => y[n]'h

end Prod

#eval (1, 2)[0]
#eval (1, 2)[1]
#eval (1, 2, 3, 4)[3]
#eval (1, 0, [2, 3, 4])[5]? -- no longer works
#eval (1, 0, [2, 3, 4])[2]? -- no longer works

Jz Pan (Feb 27 2025 at 06:02):

Yakov Pechersky said:

Maybe an @[inline] def is enough, thinking about why you need abbrev

Interestingly, a mere def also works if the Decidable instance is provided. Otherwise it does not work.

Jz Pan (Feb 27 2025 at 06:07):

Yakov Pechersky said:

I implemented it to work only on homogeneous products this way

Thank you! This is just the inlined version of my instGetElemOfGetElem'.

Yakov Pechersky (Feb 27 2025 at 06:07):

As a smoke test for what David was asking for initially, which was rfl defeq of ![P 0, P 1, P2] = P, I tried the following, in either your or my homogeneous product implementation, and had failures:

theorem foo (a b : α) : (a, b)[0] = a := rfl
theorem bar (a b : α) : (a, b)[1] = b := rfl
theorem baz (a b c : α) : (a, b, c)[0] = a := rfl
theorem baz' (a b c : α) : (a, b, c)[1] = b := rfl -- can't find proof for valid
theorem baz'' (a b c : α) : (a, b, c)[2] = c := rfl -- can't find proof for valid

Yakov Pechersky (Feb 27 2025 at 06:09):

But it does work in the "uh oh this also supports any GetItem"

Jz Pan (Feb 27 2025 at 06:11):

Seems that @[inline] def does not work, but abbrev works.

Yakov Pechersky (Feb 27 2025 at 06:12):

Ah, yes. abbrev or @[inline, reducible] works.

Jz Pan (Feb 27 2025 at 06:13):

I've tested that the homogeneous only version also works.

Yakov Pechersky (Feb 27 2025 at 06:14):

So, for clarity, an implementation that I think achieves the desiratum that David spelled out:

universe u v

namespace Prod

variable (α : Type u) (β : Type u) (valid : outParam (α × β → Nat → Prop)) [GetElem (α × β) Nat α valid]

instance instGetElem : GetElem (α × α) Nat α fun _ i => i < 2 where
  getElem xs i _ := if i = 0 then xs.1 else xs.2

protected abbrev instGetElemOfGetElem.valid : α × α × β → Nat → Prop
  | _, 0 => True
  | (_, y), n + 1 => valid y n

instance instGetElemOfGetElem.instDecidableValid [(xs : α × β) → (i : Nat) → Decidable (valid xs i)] :
    (xs : α × α × β) → (i : Nat) → Decidable (instGetElemOfGetElem.valid α β valid xs i) :=
  fun xs i => match xs, i with
    | _, 0 => .isTrue trivial
    | (_, y), n + 1 => inferInstanceAs (Decidable (valid y n))

instance instGetElemOfGetElem : GetElem (α × α × β) Nat α (instGetElemOfGetElem.valid α β valid) where
  getElem xs i h := match xs, i with
    | (x, _), 0 => x
    | (_, y), n + 1 => y[n]'h

end Prod

theorem foo (a b : α) : (a, b)[0] = a := rfl
theorem bar (a b : α) : (a, b)[1] = b := rfl
theorem baz (a b c : α) : (a, b, c)[0] = a := rfl
theorem baz' (a b c d: α) : (a, b, c, d)[1] = b := rfl
theorem baz'' (a b c d: α) : (a, b, c, d )[2] = c := rfl

theorem defeq (p : α × α × α) : (p[0], p[1], p[2]) = p := rfl

Jz Pan (Mar 03 2025 at 14:55):

Should we allow Ш for identifier names? For now it's still not possible.

David Ang (Mar 03 2025 at 14:57):

It would be very nice, but I think we should still call it TateShafarevichGroup (and set Ш as an abbreviation?)

Johan Commelin (Mar 03 2025 at 14:59):

Or just call it Sha if TateShafarevichGroup is too long (-;

Kevin Buzzard (Mar 03 2025 at 17:17):

Historical fact: before Cassels' push to rename it Ш it was always called TS.

David Ang (Mar 06 2025 at 16:53):

Regarding the dumb issue with R × R × R: I just remembered that one of the reasons we used Fin 3 → R was to match the API for MvPolynomial. We could evaluate f : MvPolynomial (Fin 3) R at some element P : Fin 3 → R by doing eval P f rather than having to do eval ![P.1, P.2.1, P.2.2] f for some P : R × R × R.

This is not a big deal, so I will make a PR to change Fin 3 → R to R × R × R since there seems to be no objections. For now I will have the local notations x, y, z for .1, .2.1, .2.2 because I think it's quite clean, but we should be able to switch to the getElem instance version pretty easily.

The other annoyance has to do with the comp_* lemmas for a map f : R → S: previously we could do f ∘ P for the point ![f (P x), f (P y), f (P z)] since P : Fin 3 → R happens to be a map, but now we have to do something like Prod.map f (Prod.map f f) P. I don't think there is a Prod.map₂ (where Prod.map₂ f = Prod.map f f)? I'd love a Prod.map₃ but that's even more of a stretch, and I can't really make it a local notation since this should be exposed to the user.

Jz Pan (Mar 06 2025 at 17:08):

David Ang said:

previously we could do f ∘ P for the point ![f (P x), f (P y), f (P z)] since P : Fin 3 → R happens to be a map

Again add an instance to view R × R × R as a map Fin 3 → R. Will it clash with existing things?

Jz Pan (Mar 06 2025 at 17:10):

Ahh then f ∘ P is Fin 3 → S but not S × S × S anymore.

Jz Pan (Mar 06 2025 at 17:12):

David Ang said:

I'd love a Prod.map₃ but that's even more of a stretch

Maybe just define a Prod.map\_n which only works for homogeneous product R × R × ... × R.

David Ang (Mar 06 2025 at 17:14):

Do we want that exposed?

David Ang (Mar 06 2025 at 17:16):

BTW, I could simulate the previous comp behaviour with this:

lemma map_eq_comp (f : R → S) :
    ![Prod.map f (Prod.map f f) P x, Prod.map f (Prod.map f f) P y, Prod.map f (Prod.map f f) P z] =
      f ∘ ![P x, P y, P z] := by
  ext a; fin_cases a <;> simp

So that's not a big deal either

Jz Pan (Mar 06 2025 at 17:18):

David Ang said:

Do we want that exposed?

I don't know. I think Prod.map\_n as well as the above getElem instance need an RFC discussion, since they add new things to fundamental data types.

David Ang (Mar 06 2025 at 17:28):

I guess this works too lol:

local notation3 f " ∘ " P:max => Prod.map f (Prod.map f f) P

But it might be highly misleading?

Aaron Liu (Mar 06 2025 at 17:31):

If you make it something like ∘ₚ it will be less confusing

Ruben Van de Velde (Mar 06 2025 at 18:11):

I would try a structure with three fields rather than notation for .x,.y,.z

Jz Pan (Mar 06 2025 at 19:07):

Ruben Van de Velde said:

I would try a structure with three fields rather than notation for .x,.y,.z

Then you need to rewrite SMul instances...

Aaron Liu (Mar 06 2025 at 19:08):

You need to rewrite all the instances...

Jz Pan (Mar 23 2025 at 05:41):

Jz Pan said:

Junyan Xu said:

You just want to put a Group structure on WeierstrassCurve.VariableChange, right? And then you can define a MulAction on Weierstrass/EllipticCurve. Sounds like a good idea!

We already have these: WeierstrassCurve.VariableChange.instGroup and WeierstrassCurve.instMulActionVariableChange.

What I mean is currently we have two ways to say the same thing, e.g. WeierstrassCurve.VariableChange.id and 1, WeierstrassCurve.VariableChange.comp and *, WeierstrassCurve.VariableChange.inv and ⁻¹. This makes rw sometimes not working; you need change or erw.

For PerfectClosure there is only one way to say zero: 0. (Of course you can always write PerfectClosure.mk K p 0, just like VariableChange you can always write ⟨1, 0, 0, 0⟩ instead of 1.) I think this makes the code unified.

OK this refactor is implemented as PR #23217. Comments welcome.

Last updated: May 02 2025 at 03:31 UTC