Zulip Chat Archive
Stream: maths
Topic: thoughts on elliptic curves
Kevin Buzzard (Dec 15 2021 at 13:53):
Some students at Imperial are thinking about associativity of the group law on an elliptic curve.
Let's fix a field $k$, and for simplicity let's assume char 0. An elliptic curve over $k$ can be written as $y^2 + a_1xy + a_3y = x^3 + a_2x^2 + a_4x + a_6$ where $a_1,a_2,a_3,a_4,a_6 \in k$, with some polynomial in the $a_i$ (the discriminant) assumed non-zero to guarantee the cubic is smooth. The formalisation literature only ever does one thing now: it defines $E(k)$ to be the union of a mysterious "point at infinity" plus the set of solutions $(x,y) \in k^2$ to the equation, and then defines an addition on $E(k)$ via some strange geometric shenanigans involving drawing lines, or just directly via "this is the equation, deal with it".
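For orientation, here is what the "drawing lines" recipe gives for the simplified model $y^2 = x^3 + Ax + B$ (char $\ne 2, 3$; my illustration, not the general five-coefficient formulas): for $P_1 = (x_1,y_1)$ and $P_2 = (x_2,y_2)$ with $x_1 \ne x_2$, the line through them has slope $\lambda = (y_2 - y_1)/(x_2 - x_1)$ and
$$P_1 + P_2 = (x_3, y_3), \qquad x_3 = \lambda^2 - x_1 - x_2, \qquad y_3 = \lambda(x_1 - x_3) - y_1.$$
The general Weierstrass form has messier but entirely analogous formulas, with extra cases for doubling and for the point at infinity.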
What we in mathlib are in a very good position to do now is to develop a basic theory of elliptic curves over number fields: things like a proof of the Mordell-Weil theorem (this group is finitely-generated if $k$ is a number field), the statement of the Taniyama--Shimura conjecture (the $L$-function of the curve has an analytic extension to the complexes), and the statement of the Birch and Swinnerton-Dyer conjecture (in both its weak and strong forms) are within scope over the next few years and would be a really nice project. Nothing like this is in any other theorem prover.
However, before we get to those shiny things, we need to prove associativity of the group law!
My initial plan for this was "just grind out the algebra". The question is literally "here is a field k, here are 11 variables a1,a2,a3,a4,a6,x1,y1,x2,y2,x3,y3 in that field, here is some really big rational function in these 11 variables, we claim that the numerator can be written as a simple combination of y1^2-f(x1), y2^2-f(x2), y3^2-f(x3) and we're only allowed to divide by the discriminant". Apparently Lean 3 times out (according to a student -- I've not seen the code but the student is about to get push access to mathlib non-master branches so I'm about to).
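In symbols (my paraphrase, using Kevin's shorthand $y_i^2 - f(x_i)$ for the curve relations): if $N$ is the numerator of the difference between the coordinates of $(P_1 + P_2) + P_3$ and of $P_1 + (P_2 + P_3)$, the claim is that there are polynomials $A_1, A_2, A_3$ in the 11 variables with
$$N \;=\; A_1\bigl(y_1^2 - f(x_1)\bigr) + A_2\bigl(y_2^2 - f(x_2)\bigr) + A_3\bigl(y_3^2 - f(x_3)\bigr),$$
i.e. an explicit ideal-membership certificate, with only the discriminant allowed in denominators along the way.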
There are two other proofs I know. One involves some projective geometry. It goes like this: say the three points are A,B,C and the point at infinity is O. Draw the line through A and B; it intersects the curve again at D. Reflect (change the sign of the y coordinate) to get E. Draw the line through E and C and get a new point F. Now go the other way around; the line through B,C meets the curve again at G; the reflection of G is H; the line through H and A meets the curve again at I. STP F=I ((A+B)+C is the reflection of F; A+(B+C) is the reflection of I). The proof goes like this. First show that for eight points P1,P2,...,P8 in projective 2-space in "general position" (e.g. no 4 on a line etc), any cubic through these 8 points also passes through a mysterious 9th point P9 (this is done by a dimension count) and now apply to the 8 points OABCDEGH. Using the elliptic curve and the degenerate (i.e. product of three linear factors) cubic ABD+ECF+OGH you deduce the 9th point is F, and using BCG+AHI+ODE
you deduce that the 9th point is I, so F=I. I don't really have a feeling as to how practical this method is in practice. One worry is that whatever "general position" actually does mean, there will be a ton of edge cases which will probably need doing; in the books you just say "oh blah blah blah, Zariski-closed set in a scheme containing the generic point so it's everything" but I'm not convinced that this would be any easier.
The other proof I know of associativity is a more high-powered one using the Picard group. Basically the idea is that you write down another object which is obviously an abelian group (it's a subquotient of the free abelian group generated by the points on the curve) and then you write down a bijection between this object and the points on the curve and prove that addition on the object and addition on the curve commutes with the bijection, and hence addition on the curve is associative. This proof looks at first sight like it's way out of reach because it looks like it might need a whole bunch of algebraic geometry (divisors, principal divisors etc) but actually when thinking about this this morning I realised that a lot of this material is closer than we might think. Let me say more about this.
$k$ is our ground field (fixed), the equation is $y^2 + a_1xy + a_3y = x^3 + a_2x^2 + a_4x + a_6$. Let $R$ be the ring $k[X,Y]/(Y^2 + a_1XY + a_3Y - X^3 - a_2X^2 - a_4X - a_6)$. This ring is an integral domain, and indeed it's a Dedekind domain. I claim that the elliptic curve $E(k)$ is "canonically" isomorphic to the class group of $R$! The proof of this involves a little bit of diagram chasing etc but let's just assume it; the point is that $\mathrm{Pic}^0(E)=\mathrm{Pic}(\text{affine model})$. The map from $E(k)$ to the class group sends the point at infinity to the zero class (the identity of the class group) and sends $(x,y)$ to the ideal $(X-x,Y-y)$. This is easily checked to be a maximal ideal. Now there are two things which need to be done to finish the proof of associativity:
1) Prove that this map is injective
2) Prove that addition on the curve corresponds to the group law on the class group
For (1) this boils down to the following assertion: prove that if $(x_1,y_1) \ne (x_2,y_2)$ are two distinct points on the curve, then there do not exist non-zero $f,g$ in $R$ such that $(f)\cdot(X-x_1,Y-y_1) = (g)\cdot(X-x_2,Y-y_2)$, this being equality of ideals in $R$.
For (2) this boils down to an identity of ideals -- if $(x_1,y_1)+(x_2,y_2)=(x_3,y_3)$ is addition on the curve, then we want to write down explicit $f$ and $g$ (linear) such that $(f)\cdot(X-x_1,Y-y_1)\cdot(X-x_2,Y-y_2) = (g)\cdot(X-x_3,Y-y_3)$. This looks to me to be much easier algebra.
I am a bit unclear about how to do (1). Again my head says "of course the divisor P-Q isn't principal because this would give an isomorphism of the curve with the projective line, which can't happen because genus is an isomorphism-invariant". If we could come up with a low-level translation then this would be a really cool application of the class group stuff. Do any of the more algebro-geometric people (@Damiano Testa , @Riccardo Brasca , @Antoine Chambert-Loir ) know if there is a proof of (1) which translates down into only ring theory somehow? I think that if we can do this then this would be a really nice way to prove associativity because we can claim that we're doing it the conceptual way which is of course the "correct" way.
Note that we need associativity (which works over a general field) before we can start on all of the arithmetic stuff (which works over a number field e.g. the rationals).
Riccardo Brasca (Dec 15 2021 at 13:58):
I am afraid I don't know a simpler proof, but I really like your idea of using the class group!
Filippo A. E. Nuccio (Dec 15 2021 at 14:01):
Since you would be interested in number fields, wouldn't the option through the Weierstrass $\wp$-function work? I have never tried the details myself, but what I mean is that you first prove that $\mathbb{C}/\Lambda$ is abelian (associative!); that's easy (coming up with the right lattice $\Lambda$ might be tricky). And then you prove "by hand" that $E(\mathbb{C}) \cong \mathbb{C}/\Lambda$ and produce a bijection, so you can transfer the group law, and it will be associative for elliptic curves over $\mathbb{C}$. Then $E(K)$ is a substuff of $E(\mathbb{C})$ and associativity is preserved. Again, I wrote this without too much thinking, so I apologise if it does not work - but it would be nice to see meromorphic functions in action!
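For the record, the classical picture being invoked here (standard facts, not my own computation): the Weierstrass function of a lattice $\Lambda \subset \mathbb{C}$ satisfies $\wp'^2 = 4\wp^3 - g_2\wp - g_3$, and
$$\mathbb{C}/\Lambda \longrightarrow E(\mathbb{C}), \qquad z \longmapsto (\wp(z), \wp'(z))$$
is a bijection under which the chord-tangent addition on the curve corresponds to addition of $z$ modulo $\Lambda$, which is trivially associative.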
Filippo A. E. Nuccio (Dec 15 2021 at 14:02):
That being said, I agree with Riccardo that the idea with the class group is nice! We actually mentioned this example in our final version of our Dedekind paper with @Anne Baanen , @Ashvni Narayanan and @Sander Dahmen , to give an example of a non-finite class group.
Antoine Chambert-Loir (Dec 15 2021 at 14:04):
It's unclear to me that there is a low-tech proof, precisely for the reason that you indicate: it would prove that the cubic is not rational.
On the other hand, some simple cubics can be proved to be nonrational by applying Stothers/Mason, when their equation reduces to a sum of three terms. I remember it works for the Fermat_3 curve, and possibly for $y^2=x^3+1$ (I'm not sure), but in general, you have 4 terms…
Antoine Chambert-Loir (Dec 15 2021 at 14:08):
I remember Mestre saying that with a computer algebra system, associativity of the group law is a mere triviality.
I believe that one should try to understand why the formal computation does not work in Lean, or how to make a computer algebra system produce compilable Lean code.
I think that it is along these lines that Hales and coauthors formalized the associativity for Edwards models of elliptic curves, see https://arxiv.org/abs/2004.12030.
Kevin Buzzard (Dec 15 2021 at 14:31):
Yes, associativity is a triviality with a computer algebra system. It is the statement that some element is in some ideal, and if the computer algebra system has Gröbner bases implemented for polynomial rings over fields it's easy. What we would have to do in Lean is to reduce the question to "element is in ideal", then solve it using a computer algebra system, copy the explicit presentation of the element as a linear combination of generators of the ideal back into Lean, and then hope that `ring` can handle it. Computer algebra systems can handle very large algebraic objects easily. Lean 3 is not so good at it. I did propose to a couple of people the project of doing this in Lean 4 as an interesting test for `ring`, but someone objected that the "bash it out" proof was not "the proper proof" and so shouldn't be formalised at all!
Adam Topaz (Dec 15 2021 at 14:38):
But I would say the "proper proof" is to prove Riemann Roch first, and that's going to take a little while...
Adam Topaz (Dec 15 2021 at 14:40):
Can `ring` in lean3 not handle the explicit formulas for the group law?
Kevin Buzzard (Dec 15 2021 at 14:41):
I don't know yet. David Angdinata just pushed where he was at; https://github.com/leanprover-community/mathlib/commit/f9685f40de179cba4fb8f68b1a51c2913c0c7463
Adam Topaz (Dec 15 2021 at 14:42):
The definition of `add` is ~130 lines long...
Adam Topaz (Dec 15 2021 at 14:43):
Where is my copy of Silverman
Kevin Buzzard (Dec 15 2021 at 14:44):
This is the first time I've seen the code. I also tried something with Cremona at LFTCM: https://github.com/ImperialCollegeLondon/Example-Lean-Projects/blob/master/src/elliptic_curves/basic.lean
Kevin Buzzard (Dec 15 2021 at 14:44):
Looks like I got the definition of `add` down to 50 lines
Kevin Buzzard (Dec 15 2021 at 14:46):
the definition involves a proof obligation, right? You have P=(x1,y1) and Q=(x2,y2) and the answer to P+Q in general is "the x coordinate is some messy function of x1,y1,x2,y2, as is the y coordinate, but now we have to prove they're a solution to the cubic"
Adam Topaz (Dec 15 2021 at 14:47):
Yeah, I see... I guess the ideal approach would have some custom auto param on that proof obligation that would (hopefully) discharge the goal automatically ;)
Johan Commelin (Dec 15 2021 at 14:48):
An elliptic curve is a smooth proper group scheme of dimension 1. Isn't that the way mathlib should approach this?
Kevin Buzzard (Dec 15 2021 at 14:48):
Jamie Bell's project does addition in 30 lines https://github.com/jamiebell2805/BSD-conjecture/blob/master/src/elliptic_curve_rationals.lean (one of which is very long) but he works with the simpler model y^2=cubic
Kevin Buzzard (Dec 15 2021 at 14:48):
@Johan no I don't think it is. Because then we're just saying "Mordell-Weil is 3 years away".
Adam Topaz (Dec 15 2021 at 14:49):
@Johan Commelin how would you prove that such a smooth proper group scheme of dimension 1 has a Weierstrass equation?
Filippo A. E. Nuccio (Dec 15 2021 at 14:49):
Well, but are we speaking of mathlib or Lean? Mordell--Weil can land in mathlib in 3 years, no? There might be a parallel project proving it in the meanwhile.
Kevin Buzzard (Dec 15 2021 at 14:49):
The proof of Mordell-Weil for elliptic curves over Q looks like this: first prove that a smooth proper group scheme of dimension 1 over a PID has a Weierstrass equation. Then do the theory of heights and a 2-descent.
Kevin Buzzard (Dec 15 2021 at 14:50):
For me the two parts are totally disjoint. The first step is some algebro-geometric argument which I sketch in the module doc for EllipticCurve. The second is arithmetic and is completely independent.
Kevin Buzzard (Dec 15 2021 at 14:51):
The algebro-geometric language buys you very little; this was why Mordell-Weil was proved way before schemes were even invented
Reid Barton (Dec 15 2021 at 14:52):
There is a formalization https://github.com/strub/elliptic-curves-ssr in Coq (paper) that takes this approach via the Picard group. The math-hard part is indeed (1) and unfortunately they don't discuss it at all in the paper. The formalization of that part is in https://github.com/strub/elliptic-curves-ssr/blob/main/src/ecrr.v and it uses elementary methods (Fermat descent). I had some difficulty following the formalization, but then I came across some notes by Eva Belmont which contain the same proof (pages ~6-8).
Kevin Buzzard (Dec 15 2021 at 14:52):
The reason I called the file EllipticCurve in mathlib is because you are supposed to think of it as the category of elliptic curves. Now your objection becomes irrelevant because definitional equality, and equality in general, is evil in a category, and my category is equivalent to yours and furthermore actually exists so we can start using it.
Johan Commelin (Dec 15 2021 at 14:54):
Adam Topaz said:
Johan Commelin how would you prove that such a smooth proper group scheme of dimension 1 has a Weierstrass equation?
That's a consequence of Riemann-Roch, isn't it?
Kevin Buzzard (Dec 15 2021 at 14:55):
Sure
Kevin Buzzard (Dec 15 2021 at 14:55):
but to do anything like that we need a working theory of sheaf cohomology and projective space
Adam Topaz (Dec 15 2021 at 14:55):
You could use the classical approach with repartitions in this case.
Adam Topaz (Dec 15 2021 at 14:56):
E.g. as in Serre's book on class fields
Johan Commelin (Dec 15 2021 at 14:56):
so we can start using it
I agree. But for a whole bunch of arguments, it will be quite useful to know that an elliptic curve is a projective variety.
Johan Commelin (Dec 15 2021 at 14:57):
So even if you say "definitions don't matter" you are in some sense just postponing the "problem"
Kevin Buzzard (Dec 15 2021 at 14:57):
My impression is that this is what Strub did (repartitions). They wrote down local uniformisers at all points on the curve, e.g. "it's Y-y unless y=0 in which case it's X-x"
Kevin Buzzard (Dec 15 2021 at 14:58):
Johan, we are postponing what you think is a problem, but we are also postponing the ability to say "we have stated Shimura-Taniyama and BSD in a sorry-free way in a theorem prover", and the less this is postponed the better.
Kevin Buzzard (Dec 15 2021 at 14:59):
Nobody is going to object that the statement of BSD refers to y^2=cubic -- this is the definition that Wiles gives in his write-up of BSD on the Clay website.
Kevin Buzzard (Dec 15 2021 at 15:00):
I view "generating the theory of the arithmetic of y^2=cubic" and "proving y^2=cubic can be re-interpreted up to equivalence using Grothendieck's machine as a smooth proper morphism" as two completely different problems.
Alex J. Best (Dec 15 2021 at 15:03):
Kevin Buzzard said:
For me the two parts are totally disjoint. The first step is some algebro-geometric argument which I sketch in the module doc for EllipticCurve. The second is arithmetic and is completely independent.
No idea if it still compiles but I did a very basic notion of heights once and proved a weak mw -> mw type result. One may want a more general notion of height in practice though.
import group_theory.index
import group_theory.subgroup.pointwise
import group_theory.quotient_group
import data.real.basic
import data.real.nnreal
lemma lt_of_mul_lt_mul_of_le {G : Type*} [linear_ordered_comm_group_with_zero G] {a b c d : G}
(h : a * b < c * d) (hc : 0 < c) (hh : c ≤ a) : b < d :=
begin
rw ←(inv_le_inv₀ (ne_of_gt hc) (ne_of_gt (lt_of_lt_of_le hc hh) : a ≠ 0)) at hh,
-- rw ← div_lt_div₀ _ _ at h,
sorry,
end
attribute [to_additive add_subgroup.smul_mem] subgroup.pow_mem
attribute [to_additive] subgroup.gpow_mem
open add_subgroup set
--TODO move
lemma not_mem_of_not_mem_closure {A : Type*} [add_comm_group A] {G : set A} {P : A}
(hP : P ∉ closure G) : P ∉ G :=
begin
intro h,
have := subset_closure h,
exact hP this,
end
open_locale nnreal
-- TODO generalize to valued in a linear_ordered_field
structure height_function (A : Type*) [add_comm_group A] :=
(h : A → ℝ≥0)
(height_smul' : ∀ (a : A) (m : ℕ), h (m • a) = m^2 * h a)
(parallelogram_law : ∀ (a b : A), h (a + b) + h (a - b) = 2 * h a + 2 * h b)
namespace height_function
variables {A : Type*} [add_comm_group A] (h : height_function A)
instance : has_coe_to_fun (height_function A) := ⟨λ _, A → ℝ≥0, λ h, h.h⟩
-- @[simp]
-- lemma height_gsmul (a : A) (m : ℤ) : h (m • a) = m^2 * h a := h.height_smul a m
@[simp]
lemma height_smul (a : A) (m : ℕ) : h (m • a) = m^2 * h a :=
h.height_smul' a m
-- begin
-- convert h.height_smul a m using 1,
-- simp,
-- end
-- should be a version for any nontrivial target
@[simp]
lemma height_zero : h 0 = 0 :=
begin
have := calc 2^2 * h 0 = h (2 • 0) : by simp only [int.cast_bit0, nat.cast_bit0,
int.cast_one, nat.cast_one, height_smul]
... = h 0 : by simp,
exact eq_zero_of_mul_eq_self_left (by norm_num) this,
end
lemma zero_of_tors (a : A) {m : ℕ} (hm : m ≠ 0) (h : height_function A) (hma : m • a = 0) :
h a = 0 :=
begin
have := eq.symm (calc 0 = h 0 : by simp
... = h (m • a) : by rw hma
... = m^2 * h a : h.height_smul _ _),
rw mul_eq_zero at this,
norm_num at this,
finish,
end
def northcott_property : Prop := ∀ x, {P | h P ≤ x}.finite
end height_function
-- TODO make mul and to_add
open_locale pointwise
open quotient_add_group
variables {A : Type*} [add_comm_group A] (h : height_function A) (hN : h.northcott_property)
variables {Ps : finset A} {m : ℕ} (hm : 2 ≤ m)
-- TODO maybe nicer to take any finite index subgroup
variables (hPs : ∀ x : quotient (m • (⊤ : add_subgroup A)), ∃ (p : A) (hp : p ∈ Ps), ↑ p = x)
include hPs hN hm
-- https://www.msri.org/attachments/workshops/301/HtSurveyMSRIJan06.pdf
-- https://ocw.mit.edu/courses/mathematics/18-782-introduction-to-arithmetic-geometry-fall-2013/lecture-notes/MIT18_782F13_lec25.pdf
lemma group_eq_span_bdd_height : closure {R | h R ≤ Ps.sup h} = ⊤ :=
begin
rw add_subgroup.eq_top_iff',
classical,
by_contra h,
push_neg at h,
choose x hx using h,
have C_finite : ((closure {R : A | h R ≤ Ps.sup h} : set A)ᶜ ∩ {P | h P ≤ h x}).finite,
exact finite.inter_of_right (hN (h x)) _,
have : ∃ P, P ∈ ((closure {R : A | h R ≤ Ps.sup h} : set A)ᶜ ∩ {P | h P ≤ h x})
∧ ∀ P' ∈ ((closure {R : A | h R ≤ Ps.sup h} : set A)ᶜ ∩ {P | h P ≤ h x}),
h P ≤ h P',
{ have := C_finite.to_finset.exists_min_image h _,
{ simp only [and_imp, exists_prop, mem_inter_eq, finite.mem_to_finset, mem_set_of_eq,
set_like.mem_coe, compl_eq_compl, mem_compl_eq] at this ⊢,
exact this, },
rw finite.to_finset.nonempty,
use x,
simp only [hx, le_refl, mem_inter_eq, mem_set_of_eq, set_like.mem_coe, not_false_iff,
and_self, mem_compl_eq], },
have : ∃ P, P ∈ (closure {R : A | h R ≤ Ps.sup h} : set A)ᶜ
∧ ∀ P' ∈ (closure {R : A | h R ≤ Ps.sup h} : set A)ᶜ,
h P ≤ h P',
{ obtain ⟨P, hP, hPP⟩ := this,
use P,
simp_rw mem_inter_iff at hP hPP ⊢,
split,
exact hP.1,
intros P' hP',
by_cases hPx : h P' ≤ h x,
{ apply hPP,
split,
exact hP',
rw mem_def,
exact hPx, },
{ push_neg at hPx,
exact calc _ ≤ h x : hP.2
... ≤ _ : hPx.le }, },
choose P hP hPP' using this,
have : ∃ (Pj) (hPj : Pj ∈ Ps) (Q), Pj + m • Q = P,
{ obtain ⟨Pj, hPj, hPPj⟩ := hPs P,
use [Pj, hPj],
erw [quotient_add_group.eq, mem_smul_set] at hPPj,
simpa [eq_neg_add_iff_add_eq] using hPPj, },
obtain ⟨Pj, hPj, Q, hPj'⟩ := this,
simp only [set_like.mem_coe, compl_eq_compl, mem_compl_eq] at hP,
apply hP,
have : Q ∈ closure {R : A | h R ≤ Ps.sup h},
{ have : 2 * h Pj < 2 * h P,
{ rw mul_lt_mul_left (show (0 : ℝ≥0) < 2, by norm_num),
have := not_mem_of_not_mem_closure hP,
simp only [not_le, mem_set_of_eq] at this,
calc _ ≤ Ps.sup h : finset.le_sup hPj
... < _ : this, },
rw ← eq_sub_iff_add_eq' at hPj',
have h_main := calc
(m^2 : ℝ≥0) * h Q = h (m • Q) : (h.height_smul _ _).symm
... = h (P - Pj) : by rw hPj'
... ≤ h (P + Pj) + h (P - Pj) : le_add_self
... = 2 * h P + 2 * h Pj : h.parallelogram_law P Pj
... < 2 * h P + 2 * h P : add_lt_add_left this _
... = 4 * h P : by {rw ← add_mul, norm_num},
have := hPP' Q,
rw ← not_imp_not at this,
simp only [not_le, not_not, set_like.mem_coe, compl_eq_compl, mem_compl_eq] at this,
apply this,
have two_le_m : (2 : ℝ≥0) ≤ m := by exact_mod_cast hm,
have four_le_m : (2^2 : ℝ≥0) ≤ m^2 := pow_le_pow_of_le_left (zero_le 2) two_le_m 2,
norm_num at four_le_m,
exact lt_of_mul_lt_mul_of_le h_main (by norm_num) four_le_m, },
rw ← hPj',
apply' add_mem,
{ apply subset_closure,
rw mem_def,
exact finset.le_sup hPj, },
{ exact smul_mem _ this m, },
end
Kevin Buzzard (Dec 15 2021 at 15:21):
Just trying to disentangle the notes that Reid sent (Tom Fisher's elliptic curves course): on page 8 when the notes refer to "Corollary 1.5(in green)" they mean "Corollary 1.8(with 1.5 written in green afterwards)" as opposed to the completely irrelevant Corollary 1.5 which had me utterly confused.
Alex J. Best (Dec 15 2021 at 15:25):
I just saw on page 2 it says
A note on the numbering
There are two concurrent numbering schemes in this document: numbers and headings in green refer to Prof. Fisher's numbering system; everything else refers to LaTeX's numbering, induced by setting section numbers = lecture numbers.
Kevin Buzzard (Dec 15 2021 at 16:12):
unfortunately I started reading on pages 6 to 8 ;-)
Johan Commelin (Dec 16 2021 at 07:09):
Kevin Buzzard said:
The reason I called the file EllipticCurve in mathlib is because you are supposed to think of it as the category of elliptic curves. Now your objection becomes irrelevant because definitional equality, and equality in general, is evil in a category, and my category is equivalent to yours and furthermore actually exists so we can start using it.
I understand your point. At the same time, right now you have a type in mathlib that doesn't have a category instance yet.
Kevin Buzzard (Dec 16 2021 at 08:45):
That's because I don't know what to do for the morphisms. Are they supposed to be isogenies, or are you allowed the zero map, or are they supposed to be isomorphisms, or random morphisms of schemes? I know how to define all of these.
Kevin Buzzard (Dec 16 2021 at 08:46):
You don't need associativity of the group law to do any of them because of this theorem which says that preserving zero implies preserving +
Johan Commelin (Dec 16 2021 at 08:56):
Probably we need all those categories.
Kevin Buzzard (Dec 16 2021 at 09:02):
You know Jujian is working on the category where the objects are pairs (R,M) with R a ring and M an R-module? There are two kinds of morphisms you can choose for that category. If you want presheaves of modules to be a functor you let the rings and the modules both change. But if you want to do the moduli stack of elliptic curves then you allow the base ring to change but you demand that the map on modules induces an isomorphism after you tensor up. I don't fully understand what's going on here.
Marc Masdeu (Dec 16 2021 at 09:05):
Proving associativity using the explicit equations is not so much a triviality, at least in the computational sense. For the general Weierstrass form, it involves checking identities between polynomials having more than 130000 nonzero coefficients. I don't know Lean's IO system, but it would be the kind of proof where one would load a separate input file containing the polynomials to be checked. And we would need a tactic for checking equalities of polynomials (`ring` didn't work last time I tried).
Kevin Buzzard (Dec 16 2021 at 09:10):
Oh interesting! I remember checking the generic case for y^2=x^3+Ax+B in pari-gp when I was a graduate student and it being huge but easily doable. But I wanted to use the 5 variable version in mathlib because I realised I could prove that it was valid for all PIDs and more generally all rings whose Picard group has trivial 12-torsion.
Kevin Buzzard (Dec 16 2021 at 09:15):
@Johan Commelin here's another way of thinking about the definition of elliptic curve in mathlib. There's something weird going on in the Taniyama-Shimura conjecture, because it relates modular forms to elliptic curves and it's supposed to be hard, but modular forms are just functions on elliptic curves so it all sounds like it should be easy. But those two uses of the word "elliptic curve" are completely different. The hard version specifically relates to elliptic curves over the rationals. The easier version is all about moduli spaces and hence is about elliptic curves defined over all schemes (and Wiles uses modular forms defined over the complexes and finite fields and the p-adic integers in his proof as well as arguments over general bases). We still don't have enough algebraic geometry to do the scheme side of the story (the geometric side) but we can start on the arithmetic side right now.
Johan Commelin (Dec 16 2021 at 09:19):
Fair enough
Johan Commelin (Dec 16 2021 at 09:19):
I just hope that the geometric side will catch up sooner rather than later.
Kevin Buzzard (Dec 16 2021 at 09:22):
These things can move independently I think. I don't think we need anything about schemes to prove Mordell-Weil for elliptic curves over number fields -- that's why it was done before schemes existed. There's still every chance that Ashvni will define the p-adic zeta function before the Riemann zeta function.
Johan Commelin (Dec 16 2021 at 09:25):
Mordell-Weil wasn't done before varieties existed, right?
Kevin Buzzard (Dec 16 2021 at 09:28):
@Marc Masdeu it's probably worth saying that even if it's true that when you expand it all out associativity involves terms with a hundred thousand monomials in, that doesn't mean that Lean ever needs to see one. There are identities which are true and which I can prove in Lean without ever multiplying everything out. In general Lean does not expand things out unless it's told to, so perhaps some judicious variable name choices can come to our rescue. Associativity is a really annoying sorry right now because any statement you make about a group attached to the curve won't compile without warnings.
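A toy illustration of that point (my own example, not one from this thread): the following Lean 3 proof goes through without either side of the equation ever being expanded.
example (x y : ℤ) : (x + y)^100 = (y + x)^100 := by rw add_comm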
Kevin Buzzard (Dec 16 2021 at 09:30):
To do the abelian varieties case you need varieties (to even state it), but I suspect that Mordell had it for elliptic curves over Q before Weil's book. All you need for that is stuff about number fields and remember we had all of class field theory 100 years ago, just not with the cohomological language.
Johan Commelin (Dec 16 2021 at 09:31):
My history of mathematics starts around 1956 :rofl:
Marc Masdeu (Dec 16 2021 at 09:41):
@Kevin Buzzard I'm not clever enough to get those polynomials without calling a Groebner basis algorithm, and then it spits out the kind of monsters that I mentioned. I'd like to see it improved, here's what I am using: https://gist.github.com/mmasdeu/20d5c0fec521f03e30e4f68ccee21a77
Kevin Buzzard (Dec 16 2021 at 09:45):
Thanks Marc -- this is really helpful. Do you have a "brute force" proof of associativity in Sage? i.e. a computer algebra package version of `algebraic_geometry.elliptic_curve.group_law`? It's on the `elliptic_curve_group_law` branch.
Marc Masdeu (Dec 16 2021 at 09:53):
I'll look at what you mean later, but when I had to teach some undergraduates about elliptic curves I figured out that I could write a complete proof if I had these three identities for "generic" points, and this I did check in Sage. In this way, I can tell my students that the associativity proof is really "elementary" if you trust Sage to multiply polynomials for you.
Antoine Chambert-Loir (Dec 16 2021 at 09:55):
Johan Commelin said:
An elliptic curve is a smooth proper group scheme of dimension 1. Isn't that the way mathlib should approach this?
— hopefully adding the hypothesis that it has connected fibers
Antoine Chambert-Loir (Dec 16 2021 at 10:00):
Adam Topaz said:
Johan Commelin how would you prove that such a smooth proper group scheme of dimension 1 has a Weierstrass equation?
The first chapter of the book by N. Katz and B. Mazur, Arithmetic moduli of elliptic curves has a clear, high-level, recap of elliptic curves from this point of view. Over a field, Riemann Roch tells you that there is a non constant function $x$ in $|2·O|$, a function $y$ in $|3·O|$ which is not in $|2·O|$, and that $x^3, x^2, x, 1, y^2, xy,y$ are linearly related… Over a base, $|2·O|$, etc. are vector bundles over the base (by the theorems on the dimensions of cohomology groups), locally they are free, etc.
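Spelling out the dimension count over a field (a standard argument, my wording): Riemann-Roch on a genus-one curve gives $\ell(nO) = n$ for every $n \ge 1$, so the seven functions $1, x, y, x^2, xy, x^3, y^2$ all live in the 6-dimensional space $L(6O)$ and satisfy a non-trivial linear relation. The coefficients of $y^2$ and $x^3$ in that relation must both be non-zero, since functions with pairwise distinct pole orders at $O$ are linearly independent; rescaling $x$ and $y$ then turns the relation into a Weierstrass equation.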
Antoine Chambert-Loir (Dec 16 2021 at 10:02):
Johan Commelin said:
Mordell-Weil wasn't done before varieties existed, right?
Mordell is 1922, it follows a 1901 paper of Poincaré, varieties here are complex varieties, and one can talk about their rational points. In particular, at that time, the group law was understood using the Weierstrass $\wp$-function.
Adam Topaz (Dec 16 2021 at 13:56):
Antoine Chambert-Loir said:
Adam Topaz said:
Johan Commelin how would you prove that such a smooth proper group scheme of dimension 1 has a Weierstrass equation?
The first chapter of the book by N. Katz and B. Mazur, Arithmetic moduli of elliptic curves has a clear, high-level, recap of elliptic curves from this point of view. Over a field, Riemann Roch tells you that there is a non constant function $x$ in $|2·O|$, a function $y$ in $|3·O|$ which is not in $|2·O|$, and that $x^3, x^2, x, 1, y^2, xy,y$ are linearly related… Over a base, $|2·O|$, etc. are vector bundles over the base (by the theorems on the dimensions of cohomology groups), locally they are free, etc.
Yes, this is the standard proof I know that a curve of genus 1 with a point (or a section, if you work over a more general base) has a Weierstrass equation. Well, Johan asked for a smooth proper group scheme of dimension 1. I don't think it's so easy to apply Riemann Roch in that case, since one first has to obtain the genus. In any case, my point was that we would need Riemann Roch regardless.
Adam Topaz (Dec 16 2021 at 14:21):
I guess one needs to show that if `C` is a proper curve endowed with a group structure, then the group structure forces the canonical class to be trivial, hence degree 0, so that the genus must be one. But again, these are at least two additional results that are nontrivial to formalize.
Damiano Testa (Dec 16 2021 at 14:50):
Dear All,
I have taken some distance from formalization, but I am still very intrigued by such questions!
I wanted to say that proving that cubics are irrational need not be hard. Every automorphism of $\mathbb{P}^1$ fixes at least one point, while translations by non-zero elements on an elliptic curve have no fixed points (you need not know that addition is associative for this, of course). If the ground field does not have characteristic two, you can also use the fact that negation on the elliptic curve is not the identity and fixes 4 points, while an automorphism of $\mathbb{P}^1$ that fixes 3 points is the identity.
Having said this, it is not clear to me how to go from the inclusion of function fields (notation as in Kevin's initial post in this thread) to an isomorphism without using the fact that rational maps between smooth projective curves extend and that the degree of such a map can be computed by looking at any scheme-theoretic fiber.
Of course, these statements are not "hard", but I do not see an easy way to obtain them without some further machinery.
Adam Topaz (Dec 16 2021 at 15:00):
Wouldn't this argument be circular? I was under the impression that Kevin wanted to use the group structure on the class group to obtain the group structure on the elliptic curve, and it seems to me that you would need the group structure to see that the curve is not rational.
Damiano Testa (Dec 16 2021 at 16:14):
My thought was simply to use the definition of the group structure, not the fact that you know it is associative. Once you have a map $E \times E \to E$, you can fix one coordinate and check whether the map that you obtain is injective or not. You do not need to know that this map arises from a group action.
Damiano Testa (Dec 16 2021 at 16:15):
Of course, I know that the map is injective because it is translation in a group, but I can check injectivity "by brute force".
Adam Topaz (Dec 16 2021 at 16:15):
I see. Of course, one should conclude by checking that the multiplication agrees with the multiplication induced by the inclusion in the class group :)
Damiano Testa (Dec 16 2021 at 16:17):
Yes, I raised this simply because I felt that proving irrationality of a cubic could be just as strong as proving associativity. The sketch above makes it more likely that it might be simpler.
Adam Topaz (Dec 16 2021 at 16:20):
Forget the IMO grand challenge. Let's teach lean how to pass a PhD qualifying exam in algebraic geometry ;)
Damiano Testa (Dec 16 2021 at 16:21):
Chances are, that what you describe is a shorter route to the IMO grand challenge...
David Michael Roberts (Dec 16 2021 at 21:57):
Kevin Buzzard said:
You know Jujian is working on the category where the objects are pairs (R,M) with R a ring and M an R-module? There are two kinds of morphisms you can choose for that category. If you want presheaves of modules to be a functor you let the rings and the modules both change. But if you want to do the moduli stack of elliptic curves then you allow the base ring to change but you demand that the map on modules induces an isomorphism after you tensor up. I don't fully understand what's going on here.
You don't need that last restriction, that's only if you want the stack of groupoids of elliptic curves. I've been saying for a long time that it's possible to do the whole algebraic stack story with the full stack including non-invertible maps, in many cases. For elliptic curves and isogenies it's about as nice as it can be.
I got pushback when I suggested this nine years ago, but now here and there people (eg Rezk, Gaitsgory) have started to slip the idea in, usually under weird neologisms.
David Michael Roberts (Dec 16 2021 at 22:22):
In particular, while the normal story builds a groupoid scheme, this extended version builds a "category scheme", where the subscheme of invertible arrows is the usual groupoid
Kevin Buzzard (Dec 17 2021 at 09:32):
I learnt about the isomorphism picture in Katz-Mazur. Maybe the idea is that historically we were very happy with this idea of "2-category but all the 2-morphisms are isomorphisms", it used to play an important role in the theory of stacks
David Michael Roberts (Dec 17 2021 at 10:03):
Yes, it's everywhere, but people still want to be able to descend isogenies (I.e. the hom sheaf, not just the isom sheaf), so this stuff is secretly there, but not talked about in a unified way...
Antoine Chambert-Loir (Dec 27 2021 at 12:26):
Adam Topaz said:
Yes, this is the standard proof I know that a curve of genus 1 with a point (or a section, if you work over a more general base) has a Weierstrass equation. Well, Johan asked for a smooth proper group scheme of dimension 1. I don't think it's so easy to apply Riemann Roch in that case, since one first has to obtain the genus. In any case, my point was that we would need Riemann Roch regardless.
Then you probably want to formalize Mumford's Abelian varieties book!
Elif Sacikara (Alumni) (Dec 28 2021 at 22:50):
Since I have been struggling to learn both Lean and algebraic geometry I was quite hesitant to say something, but let me ask the following question anyway. Have you ever considered formalizing the approach for elliptic function fields given in Stichtenoth's Algebraic Function Fields and Codes? My motivation for studying this book is that the notions can be derived from some basics of linear algebra and algebra. For instance elliptic curve addition is defined with respect to equivalence in the class group, as mentioned in (2). More explicitly we define $P\oplus Q=R$ (addition on the elliptic curve) if $P+Q-R-O=(f)$ (addition and subtraction in the divisor group), where $O$ is the identity and $f$ can be chosen as a fraction of two lines relating P, Q and R. However, I am not sure if this would be a good approach for Lean.
Kevin Buzzard (Dec 28 2021 at 23:20):
The difficulty with the function field approach is that one has to prove that the map from the points on the curve to the divisor class group of the function field is injective. This is possible but might be messy. It might also be the nicest approach.
Kevin Buzzard (Dec 28 2021 at 23:29):
The pdf linked above (Tom Fisher's Cambridge elliptic curves course notes) has a proof in corollary 1.8 (green 1.5) in the case of characteristic != 2.
Elif Sacikara (Alumni) (Dec 29 2021 at 12:45):
I see. Thanks for the response. For injectivity, instead of considering the class group of K[x,y]/(f(x,y)), I was thinking first to define a function field F, and then work with Cl(F) as the quotient group Div^0(F)/Princ(F), with degree zero divisors Div^0(F) and principal divisors Princ(F). Then, treating E(K) as a set of rational places of F, map each rational place P to the class of P-O. Injectivity then comes from the fact that P=Q+(x) implies P=Q, because otherwise [F:K(x)]=1 (the degree of the zero divisor of x in F) would contradict the genus of F. However, as said, I should check whether all the definitions and theorems behind this are doable in Lean.
Kevin Buzzard (Dec 29 2021 at 14:34):
Right, so you have to prove F != K(x) so it boils down to the same calculation. I don't know a low-level way of doing this in characteristic 2
Elif Sacikara (Alumni) (Dec 29 2021 at 15:48):
When I look at the textbook, in theory it is claimed that this approach works for elliptic function fields F/K with an arbitrary field K. This is why I would kindly ask how characteristic 2 creates extra difficulty for Lean (under the assumptions that in Lean we can define an elliptic function field as a genus 1 function field F/K containing a degree 1 divisor, and that this approach is doable in any other characteristic).
Kevin Buzzard (Dec 29 2021 at 15:50):
It doesn't create extra difficulty for lean, it creates extra difficulty in the sense that there's a cheap proof which works in char > 2 but if you want to deal with char 2 you'll have to develop more of the theory
Kevin Buzzard (Dec 29 2021 at 15:52):
Basically I'm asking you "give me a complete proof that the field of fractions of y^2+axy+by=x^3+cx^2+dx+e over a field k is never isomorphic to k(t)" (provided a discriminant doesn't vanish, yes thanks Oliver :-) )
Kevin Buzzard (Dec 29 2021 at 15:52):
I can prove this using a bunch of algebraic geometry. I can prove this in a low-level way if char isn't 2
Oliver Nash (Dec 29 2021 at 15:52):
(provided a certain discriminant doesn't vanish)
Kevin Buzzard (Dec 29 2021 at 15:56):
The low-level way would be easier to formalise, the general way would take longer but of course needs to be done in the end. The issue is that there's currently no theory of the genus of a curve in algebraic geometry in Lean yet. It could be made, but these things take time. All of the theory needed to do the low-level proof is there already
Adam Topaz (Dec 29 2021 at 16:03):
To illustrate the state of things, for example, I don't think we currently have a good way to define a "regular function field of transcendence degree one" using what's currently in mathlib
Marc Masdeu (Mar 14 2022 at 16:57):
Sorry to bump this old thread, but I've given some more thought to proving associativity the brute force way. As @Kevin Buzzard pointed out above, it is indeed possible to check associativity without carrying out desperately large polynomial calculations. I've played around with Sage, and all that is needed is a tactic that generalizes the `ring` tactic and that allows one to work in the ring modulo the ideal generated by the Weierstrass equation, which is satisfied by the points $(x_i, y_i)$. That is, after each operation one would run the tactic and the result would be an expression that contains each $y_i$ at most to the power 1. A question then for the tactic writers: how hard would it be to cook up a quick tactic that does this?
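A rough Sage model of the reduction being asked for (my own toy setup, not Marc's actual script): putting the $y_i$ first in a lex order makes $y_i^2$ the leading term of each Weierstrass relation, so normal-form reduction modulo the ideal leaves every $y_i$ with degree at most 1.
R.<y1,y2,a1,a2,a3,a4,a6,x1,x2> = PolynomialRing(QQ, 9, order='lex')
w1 = y1^2 + a1*x1*y1 + a3*y1 - (x1^3 + a2*x1^2 + a4*x1 + a6)
w2 = y2^2 + a1*x2*y2 + a3*y2 - (x2^3 + a2*x2^2 + a4*x2 + a6)
I = R.ideal([w1, w2])
expr = y1^10 * y2^3        # a monster that might show up mid-computation
reduced = I.reduce(expr)   # normal form: y1 and y2 now appear to degree <= 1
assert reduced.degree(y1) <= 1 and reduced.degree(y2) <= 1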
Heather Macbeth (Mar 14 2022 at 17:04):
@Marc Masdeu The new `linear_combination` tactic would get part of the way there. Do you have a sample problem written up in Lean that I could use to write up an example of use?
Marc Masdeu (Mar 14 2022 at 17:15):
Hi @Heather Macbeth, thanks!
I'd like to close goals like this one, for example. Ideally, I'd use something like `field_simp [h1, h2]` for this.
variables {K : Type*} [field K]
example (a1 a2 a3 a4 a6 x1 x2 y1 y2 : K)
(h1 : y1^2 + a1*x1*y1 + a3*y1 = x1^3 + a2*x1^2 + a4*x1 + a6)
(h2 : y2^2 + a1*x2*y2 + a3*y2 = x2^3 + a2*x2^2 + a4*x2 + a6)
(h: x1 ≠ x2) :
(a2*x1^2 + x1^3 - 2*a2*x1*x2 - x1^2*x2 + a2*x2^2 - x1*x2^2
+ x2^3 - a1*x1*y1 + a1*x2*y1 + a1*x1*y2 - a1*x2*y2 - y1^2 + 2*y1*y2 - y2^2)/(-x1^2 + 2*x1*x2 - x2^2)
= (-2*a2*x1*x2 - x1^2*x2 - x1*x2^2 + a1*x2*y1 +
a1*x1*y2 - a4*x1 - a4*x2 + a3*y1 + a3*y2 + 2*y1*y2 - 2*a6)/(-x1^2 + 2*x1*x2 - x2^2) :=
begin
sorry
end
Heather Macbeth (Mar 14 2022 at 17:35):
@Marc Masdeu You need to figure out the coefficients yourself (which I guess you can do in Sage), but it works pretty well!
import tactic.linear_combination
import tactic.field_simp
variables {K : Type*} [field K]
example (a1 a2 a3 a4 a6 x1 x2 y1 y2 : K)
(h1 : y1^2 + a1*x1*y1 + a3*y1 = x1^3 + a2*x1^2 + a4*x1 + a6)
(h2 : y2^2 + a1*x2*y2 + a3*y2 = x2^3 + a2*x2^2 + a4*x2 + a6):
(a2*x1^2 + x1^3 - 2*a2*x1*x2 - x1^2*x2 + a2*x2^2 - x1*x2^2
+ x2^3 - a1*x1*y1 + a1*x2*y1 + a1*x1*y2 - a1*x2*y2 - y1^2 + 2*y1*y2 - y2^2)/(-x1^2 + 2*x1*x2 - x2^2)
= (-2*a2*x1*x2 - x1^2*x2 - x1*x2^2 + a1*x2*y1
+ a1*x1*y2 - a4*x1 - a4*x2 + a3*y1 + a3*y2 + 2*y1*y2 - 2*a6)/(-x1^2 + 2*x1*x2 - x2^2) :=
begin
have : -x1^2 + 2*x1*x2 - x2^2 ≠ 0 := sorry,
field_simp,
linear_combination (h1, -1) (h2, -1),
end
Marc Masdeu (Mar 14 2022 at 17:39):
It will need to be more automated, because there will be several dozens of applications of reductions like that. For example, in the expression there may be a term of the form `y1^10` (times stuff) and this should be recursively simplified. This is how Hales (yes, that Hales) did it in https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7324045/
Heather Macbeth (Mar 14 2022 at 17:44):
I think it can still be a single application of `linear_combination` (because the coefficients in `linear_combination` can themselves be polynomial expressions). But it might look very long -- dozens of lines. You could get Sage to compute the polynomial coefficients and then copy-paste them.
Heather Macbeth (Mar 14 2022 at 17:45):
@Rob Lewis is the person to talk to, in any case, whether the process is this Sage-to-`linear_combination` method, or a direct implementation of this very special case of a Gröbner basis tactic.
Yaël Dillies (Mar 14 2022 at 17:45):
I am planning on doing something similar, caching an absurd amount of oracle in the source code, in #10645
Heather Macbeth (Mar 14 2022 at 17:45):
@Yaël Dillies Wrong link?
Yaël Dillies (Mar 14 2022 at 17:46):
slipped!
Heather Macbeth (Mar 14 2022 at 17:52):
@Marc Masdeu So to be more explicit: If you have the polynomial goal $p = 0$ and the polynomial hypotheses `h1` that $p_1 = 0$, ..., `hn` that $p_n = 0$:
- get Sage to exhibit $p$ as an element of the ideal generated by $p_1$, ..., $p_n$, say as $p = a_1 p_1 + \cdots + a_n p_n$ for polynomials $a_1$, ..., $a_n$;
- give Lean the tactic instruction `linear_combination (h1, a1) ... (hn, an)`.
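A minimal Sage sketch of the first step (toy polynomials of my own, not the elliptic curve identity): `lift` produces exactly the cofactors $a_1, \ldots, a_n$ that the `linear_combination` call needs.
R.<x,y> = PolynomialRing(QQ, 2)
p1 = y^2 - x^3          # pretend these are the hypotheses p1 = 0, p2 = 0
p2 = y - x
I = R.ideal([p1, p2])
p = x^2*y - x^3         # the goal polynomial; here p == x^2 * p2
assert p in I           # Groebner-basis membership test
a = p.lift(I)           # cofactors: p == sum of a[i] * I.gens()[i]
assert p == sum(c*g for c, g in zip(a, I.gens()))
print(a)                # copy these polynomials into the linear_combination call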
Rob Lewis (Mar 14 2022 at 18:17):
Heather Macbeth said:
Rob Lewis is the person to talk to, in any case, whether the process is this Sage-to-`linear_combination` method, or a direct implementation of this very special case of a Gröbner basis tactic.
There may be news about this in the not too distant future! Maybe @Dhruv Bhatia reads Zulip?
Kevin Buzzard (Mar 15 2022 at 08:12):
@Marc Masdeu there's a PhD student at Imperial @David Ang who is trying to prove Mordell-Weil but right now they're assuming associativity -- if you could make this happen this would be wonderful!
Marc Masdeu (Mar 15 2022 at 08:18):
Kevin Buzzard said:
Marc Masdeu there's a PhD student at Imperial David Ang who is trying to prove Mordell-Weil but right now they're assuming associativity -- if you could make this happen this would be wonderful!
I noticed that branch. I saw that they got the group law definition, which I had worked out in the `ell_add_assoc` branch. I prefer the definition in that branch (I think you made it), and my approach to well-definedness (which is not super long, IMHO). Maybe you can "persuade" them to use the definition in that branch?
David Ang (Mar 15 2022 at 08:28):
The `mordell_weil` branch already has quite a few other definitions and lemmas in it, so it’s going to be a bit tough to switch over completely, but I’m happy to port over the definition in the other branch once it’s all done.
Dhruv Bhatia (Mar 16 2022 at 15:07):
Heather Macbeth said:
Marc Masdeu So to be more explicit: If you have the polynomial goal $p = 0$ and the polynomial hypotheses `h1` that $p_1 = 0$, ..., `hn` that $p_n = 0$:
- get Sage to exhibit $p$ as an element of the ideal generated by $p_1$, ..., $p_n$, say as $p = a_1 p_1 + \cdots + a_n p_n$ for polynomials $a_1$, ..., $a_n$;
- give Lean the tactic instruction `linear_combination (h1, a1) ... (hn, an)`.
Thanks @Rob Lewis for the mention! This is actually verbatim what I am working on right now. If all goes well, I should have something ready to use in a few weeks!
Heather Macbeth (Jun 27 2022 at 15:59):
@Marc Masdeu Dhruv's tactic is now being PR'd: #14878. Do you have any more really nasty examples from your elliptic curves project that you could share as tests?
Marc Masdeu (Jul 10 2022 at 11:42):
Sorry for not replying before... Probably there are, but they are buried inside the proofs as I have them right now. Maybe it's time for an update. The branch `ell_add_assoc` contains the proof of associativity and it just has 3 sorrys left, corresponding to 3 computations in generic situations (that is, where it is always clear which formula to use when adding). I have done them in Sage, checking that certain rational functions are the same (modulo replacing y^n for n > 1 using the Weierstrass equation). If you want to see that code I'd be happy to post it here, too. If anyone has ideas on how to do these, please do share them!
(PS: the below is not meant to be a mwe, just to point at what's left in the `EllipticCurveGroupLaw.lean` file.)
lemma assoc_aux { P Q R : points E} (hPz : P ≠ 0) (hQz : Q ≠ 0) (hRz : R ≠ 0)
(hPQ1 : P ≠ Q) (hPQ2 : P ≠ -Q) (hQR1 : Q ≠ R) (hQR2 : Q ≠ -R) (hPQR1 : P + Q ≠ R) (hPQR2 : P + Q ≠ -R)
(hQRP1 : Q + R ≠ P) (hQRP2 : Q + R ≠ -P) : P + Q + R = P + (Q + R) := sorry
lemma assoc_aux' {P Q : points E} (hPz : P ≠ 0) (hQz : Q ≠ 0) (hneg : P + P ≠ 0)
(hPQ1 : P ≠ Q) (hPQ2 : P ≠ -Q) (h2PQ1 : (P + P) ≠ Q) (h2PQ2 : (P + P) ≠ -Q)
(hPQP1 : P + Q ≠ P) (hPQP2 : P + Q ≠ -P) : P + P + Q = P + (P + Q) := sorry
lemma assoc_aux'' {P : points E} (hz : P ≠ 0) (hneg : P + P ≠ 0)
(htwo : (P + P) + (P + P) ≠ 0) (hPP1 : P + P ≠ P) (hPP2 : P + P ≠ -P)
(hPPP1 : P + P + P ≠ P) (hPPP2 : P + P + P ≠ -P) : (P + P) + (P + P) = P + (P + (P + P)) := sorry
Kevin Buzzard (Sep 03 2022 at 19:54):
Aargh! @Heather Macbeth @Marc Masdeu @Mario Carneiro I had some time when away on holiday to think about associativity and how to do it this fancy class group way, and I discovered to my surprise that the method I'd been proposing does not work in characteristic 2!
Kevin Buzzard (Sep 03 2022 at 19:55):
Which is really annoying because for me that had been one of the selling points -- that it was a truly general argument.
Mario Carneiro (Sep 03 2022 at 19:55):
what goes wrong?
Kevin Buzzard (Sep 03 2022 at 19:55):
Just some Diophantine arithmetic doesn't work out because some extension of fields can be inseparable
Kevin Buzzard (Sep 03 2022 at 19:59):
You assume the field k is alg closed and then try to solve $Y^2=X(X-1)(X-\lambda)$ in the field k(T) using arithmetic. The only solutions are constant (the three factors on the right are pairwise coprime, so all are squares up to units, and then you do a descent), which shows there's no rational function of X and Y with a simple zero and simple pole, giving you the injectivity of the map to the class group; and then you win because the map from the curve to the class group plays well with the group laws.
Kevin Buzzard (Sep 03 2022 at 20:00):
But in char 2 you can't complete the square on the left, and I can't push through the descent because it's a 2-descent on the curve and in char 2 there are inseparable degree 2 isogenies which would have to be dealt with separately.
Kevin Buzzard (Sep 03 2022 at 20:01):
tl;dr: the argument in Fisher's notes needs to complete the square to get rid of the Y term in the cubic equation, and this can't be done in char 2
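Concretely (spelling out the obstruction): completing the square means writing
$$Y^2 + a_1XY + a_3Y \;=\; \Bigl(Y + \tfrac{a_1X + a_3}{2}\Bigr)^2 - \tfrac{(a_1X + a_3)^2}{4},$$
so the substitution $Y \mapsto Y - \tfrac{a_1X + a_3}{2}$ kills the $XY$ and $Y$ terms; the division by 2 is exactly what is unavailable in characteristic 2.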
Kevin Buzzard (Sep 03 2022 at 20:04):
I think that probably one could work hard and construct an analogous proof in char 2; one would perhaps at some stage split into two cases depending on whether a certain discriminant was a square or not, the inseparability issue would manifest itself in some way but I'm hazy on the details, because I then realised it didn't matter.
Kevin Buzzard (Sep 03 2022 at 20:06):
If the first line of the "fancy" proof I'm proposing is definitely going to be "OK let's split into the cases of whether char(k)=2 or not" then after that split you now have to ask "now what's the best method from this point" and I think that in both cases the answer is "not my method"
Kevin Buzzard (Sep 03 2022 at 20:07):
In fact if char(k)>2 then we already know the answer thanks to @Heather Macbeth: we say "WLOG $k$ is algebraically closed" (the step I had missed before when talking to her) and then say "so we can put it in Edwards form and now the machinery Heather and Rob and their team have developed can apparently do it", a result I was not excited about at the time but have now seen the importance of.
Mario Carneiro (Sep 03 2022 at 20:12):
Well, that's still of the form "reduce the problem a bit until it is accessible to a giant computer calculation" which is not what I was hoping from the "abstract" proof
Kevin Buzzard (Sep 03 2022 at 20:12):
What would be left would be translating Lean's "official" definition of an elliptic curve into Edwards form, but my instinct is that the algebra involved here is much easier: one can work out the maps using a computer algebra package and they're just going to be quadratic over cubic or whatever, it's not going to be too taxing I don't think.
And then there's char 2 and again there are apparently char 2 specific methods which can be used to prove associativity much more cheaply than the super-complicated thing I was waffling on about above.
Kevin Buzzard (Sep 03 2022 at 20:12):
It's "reduce the problem until it is amenable to two reasonable computer calculations"
Kevin Buzzard (Sep 03 2022 at 20:13):
These are calculations which one can do by hand if it were the year 1900
Kevin Buzzard (Sep 03 2022 at 20:41):
Dan Bernstein sent me a link to his treatment in the odd characteristic case: https://cr.yp.to/papers.html#completed . He said it was the most painless method he knew. In the char 2 case I thought someone mentioned a method -- I'll come back to this. If you want to do the general case then @Marc Masdeu didn't you conclude that the calculation was infeasible? Someone at ANTS told me that Bosma--Lenstra was the way to do associativity if you wanted a computation which worked in general.
Kevin Buzzard (Sep 03 2022 at 20:44):
Here is Bernstein's sage notebook for Edwards:
R.<a,d,X1,Z1,Y1,T1,X2,Z2,Y2,T2,X4,Z4,Y4,T4> = PolynomialRing(ZZ,14)
curve1 = a*X1^2*T1^2+Z1^2*Y1^2-Z1^2*T1^2-d*X1^2*Y1^2
curve2 = a*X2^2*T2^2+Z2^2*Y2^2-Z2^2*T2^2-d*X2^2*Y2^2
curve4 = a*X4^2*T4^2+Z4^2*Y4^2-Z4^2*T4^2-d*X4^2*Y4^2
I = R.ideal([curve1,curve2,curve4])

def origadd(X1,Z1,Y1,T1,X2,Z2,Y2,T2):
    X3 = X1*Y2*Z2*T1+X2*Y1*Z1*T2
    Z3 = Z1*Z2*T1*T2+d*X1*X2*Y1*Y2
    Y3 = Y1*Y2*Z1*Z2-a*X1*X2*T1*T2
    T3 = Z1*Z2*T1*T2-d*X1*X2*Y1*Y2
    return X3,Z3,Y3,T3

def dualadd(X1,Z1,Y1,T1,X2,Z2,Y2,T2):
    X3 = X1*Y1*Z2*T2+X2*Y2*Z1*T1
    Z3 = a*X1*X2*T1*T2+Y1*Y2*Z1*Z2
    Y3 = X1*Y1*Z2*T2-X2*Y2*Z1*T1
    T3 = X1*Y2*Z2*T1-X2*Y1*Z1*T2
    return X3,Z3,Y3,T3

for a3 in origadd,dualadd:
    for a7 in origadd,dualadd:
        for a5 in origadd,dualadd:
            for a0 in origadd,dualadd:
                X3,Z3,Y3,T3 = a3(X1,Z1,Y1,T1,X2,Z2,Y2,T2)
                X7,Z7,Y7,T7 = a7(X3,Z3,Y3,T3,X4,Z4,Y4,T4)
                X5,Z5,Y5,T5 = a5(X1,Z1,Y1,T1,X4,Z4,Y4,T4)
                X0,Z0,Y0,T0 = a0(X2,Z2,Y2,T2,X5,Z5,Y5,T5)
                assert X0*Z7-X7*Z0 in I
                assert Y0*T7-Y7*T0 in I
                print([x.__name__ for x in (a3,a7,a5,a0)])
Michael Stoll (Sep 04 2022 at 10:08):
I have a fairly elementary proof for cubic curves that should be independent of the characteristic, which basically only uses Bézout's theorem for a line and a curve. See Section 9 in these notes (which are in German, but I'd be happy to help with translating :smile:).
Kevin Buzzard (Sep 04 2022 at 10:16):
You don't somewhere use Bezout for two cubics? That was what was scaring me with this approach.
Michael Stoll (Sep 04 2022 at 10:52):
No.
David Michael Roberts (Sep 08 2022 at 08:10):
I haven't followed this discussion, so it may have been mentioned, but if one is going to reduce to Edwards curves, then would https://arxiv.org/abs/2004.12030 be useful?
Kevin Buzzard (Sep 08 2022 at 08:21):
I think that @Heather Macbeth already did the Edwards case and claimed she could give a complete proof of associativity in that case. I feel bad because at the time I'd pointed out what I thought were two problems with the approach -- firstly she was assuming the curve had a point of order 4 and secondly that she'd avoided characteristic 2. But the first issue can be avoided by extending the base field and the second issue turned out to be a problem with my approach as well, unless I can prove my way around it (which I can't, right now)
David Michael Roberts (Sep 08 2022 at 08:41):
OK, thanks!
Kevin Buzzard (Sep 08 2022 at 12:51):
Dan Bernstein managed to prove associativity in char 2 in sage-math, using some formulae due to David Kohel:
R.<c,X0,X1,X2,X3,Y0,Y1,Y2,Y3,Z0,Z1,Z2,Z3> = PolynomialRing(GF(2),13)
curveX1 = X0^2-X2^2-c^2*X1*X3
curveX2 = X1^2-X3^2-c^2*X0*X2
curveY1 = Y0^2-Y2^2-c^2*Y1*Y3
curveY2 = Y1^2-Y3^2-c^2*Y0*Y2
curveZ1 = Z0^2-Z2^2-c^2*Z1*Z3
curveZ2 = Z1^2-Z3^2-c^2*Z0*Z2
I = R.ideal([curveX1,curveX2,curveY1,curveY2,curveZ1,curveZ2])

def add0(X0,X1,X2,X3,Y0,Y1,Y2,Y3):
    return (X0*Y0+X2*Y2)^2,c*(X0*X1*Y0*Y1+X2*X3*Y2*Y3),(X1*Y1+X3*Y3)^2,c*(X0*X3*Y0*Y3+X1*X2*Y1*Y2)

def add1(X0,X1,X2,X3,Y0,Y1,Y2,Y3):
    return c*(X0*X1*Y0*Y3+X2*X3*Y1*Y2),(X1*Y0+X3*Y2)^2,c*(X0*X3*Y2*Y3+X1*X2*Y0*Y1),(X0*Y3+X2*Y1)^2

def add2(X0,X1,X2,X3,Y0,Y1,Y2,Y3):
    return (X3*Y1+X1*Y3)^2,c*(X0*X3*Y1*Y2+X1*X2*Y0*Y3),(X0*Y2+X2*Y0)^2,c*(X0*X1*Y2*Y3+X2*X3*Y0*Y1)

def add3(X0,X1,X2,X3,Y0,Y1,Y2,Y3):
    return c*(X0*X3*Y0*Y1+X1*X2*Y2*Y3),(X0*Y1+X2*Y3)^2,c*(X0*X1*Y1*Y2+X2*X3*Y0*Y3),(X1*Y2+X3*Y0)^2

additionlaws = add0,add1,add2,add3

for A in additionlaws:
    for B in additionlaws:
        for C in additionlaws:
            for D in additionlaws:
                A0,A1,A2,A3 = A(X0,X1,X2,X3,Y0,Y1,Y2,Y3)
                B0,B1,B2,B3 = B(A0,A1,A2,A3,Z0,Z1,Z2,Z3)
                C0,C1,C2,C3 = C(Y0,Y1,Y2,Y3,Z0,Z1,Z2,Z3)
                D0,D1,D2,D3 = D(X0,X1,X2,X3,C0,C1,C2,C3)
                assert B0*D1-B1*D0 in I
                assert B0*D2-B2*D0 in I
                assert B0*D3-B3*D0 in I
                assert B1*D2-B2*D1 in I
                assert B1*D3-B3*D1 in I
                assert B2*D3-B3*D2 in I
                print([x.__name__ for x in (A,B,C,D)])
Kevin Buzzard (Sep 08 2022 at 12:55):
But right now I guess I am confused about the general case. @Marc Masdeu you seem convinced that the polynomials involved would be astronomical but https://cr.yp.to/papers.html#completed seems to suggest that it will be less painful than we'd imagined.
Marc Masdeu (Sep 08 2022 at 14:58):
All the experiments I've done have been with the usual polynomials valid for the general Weierstrass equation. I would like a proof that worked in all characteristics at once, but I guess that it'd be okay if there was a separate case for characteristic 2. @Kevin Buzzard I've had this project slowly dying for long enough that maybe it'd be good to have a chat at some point and see how feasible it is to knock it off?
By the way, I can do the general case in Sage no problem, the denominators involved get large, and then they explode when you want to clear them out. I had prepared some scripts to share, but maybe it's better to see it live...
Riccardo Brasca (Sep 23 2022 at 13:31):
Maybe this has already been discussed, but I am at a conference where someone said elliptic curves have been formalized in HOL, but they're not sure how nor in which generality. Does anyone have more info about this?
Kevin Buzzard (Sep 23 2022 at 14:18):
They've been formalised in Coq (Weierstrass form, so not the most general form in char 2 or 3) with a conceptual proof of associativity valid in char != 2. There's an Isabelle/HOL formalisation in the AFP: https://www.isa-afp.org/entries/Elliptic_Curves_Group_Law.html again just of the group law (incl proof of associativity) for the Weierstrass curve (from last year), so again the proof is only valid in char != 2 and doesn't cover all elliptic curves in char 3.
Riccardo Brasca (Sep 23 2022 at 14:19):
I see
Adam Topaz (Sep 23 2022 at 14:41):
Maybe we should bite the bullet and start a big push for Riemann Roch so we can do all this stuff properly?
Riccardo Brasca (Sep 23 2022 at 14:47):
This is clearly the right thing to do if we can wait, but we are quite far, aren't we?
Adam Topaz (Sep 23 2022 at 14:50):
I still think the proof from Serre's book on class fields is doable
Adam Topaz (Sep 23 2022 at 14:50):
The "repartitions" proof
Riccardo Brasca (Sep 23 2022 at 14:51):
I don't know this proof, but is it for which "curves"? Schemes blah blah, varieties over algebraically closed field (defined somehow)?
Adam Topaz (Sep 23 2022 at 14:53):
One could say it's for function fields of trdeg 1
Adam Topaz (Sep 23 2022 at 14:54):
Maybe with a perfect base field? I don't remember
Riccardo Brasca (Sep 23 2022 at 14:55):
Oh, skipping algebraic geometry in some sense? Interesting!
Antoine Chambert-Loir (Sep 23 2022 at 15:29):
Kevin Buzzard said:
They've been formalised in Coq (Weierstrass form, so not the most general form in char 2 or 3) with a conceptual proof of associativity valid in char != 2. There's an Isabelle/HOL formalisation in the AFP: https://www.isa-afp.org/entries/Elliptic_Curves_Group_Law.html again just of the group law (incl proof of associativity) for the Weierstrass curve (from last year), so again the proof is only valid in char != 2 and doesn't cover all elliptic curves in char 3.
It is my understanding that the polyrith tactic implemented by @Heather Macbeth should give a completely automatic proof for all elliptic curves represented as a plane cubic curve.
Heather Macbeth (Sep 23 2022 at 15:54):
@Antoine Chambert-Loir In fact this tactic was written by @Dhruv Bhatia, a student of @Rob Lewis!
Heather Macbeth (Sep 23 2022 at 15:56):
And regarding your question: I tried this out at some point (on Edwards curves). There are many cases and I skipped all of them except for the one which I guessed would be the hardest. Polyrith did send me some output here (rather than failing), but the output was so long that it got garbled when being passed back to Lean. So my guess is that it could be made to work, but it requires some effort and probably a more robust version of the Sage interaction.
Heather Macbeth (Sep 23 2022 at 15:57):
It was just a quick experiment for me but I'd be glad to see someone (else) push it further.
Antoine Chambert-Loir (Sep 23 2022 at 16:19):
Oh, I see!
Kevin Buzzard (Sep 23 2022 at 17:46):
The problem with the brute force approach is that there are problems if you use the long form, because the polynomials involved apparently have hundreds of thousands of terms. I thought I could do a cheap class group proof but the argument I know doesn't work in char 2. Over an alg closed field (which you can assume wlog) in char not two you can prove that the only solutions x,y in k[t] to y^2=x(x-1)(x-a) (with a in k not 0 or 1) are necessarily in k, by a descent argument using the fact that coprime stuff whose product is a square must all be squares etc. This is just a concrete proof that the function field of the curve isn't rational. But in char 2 I don't know of a low-level proof of this, and of course I'm worried about the 2-descent because the isogeny could be inseparable...
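For what it's worth, the coprime-factor step can be made completely explicit for polynomial solutions (the full k(t) statement is where the 2-descent comes in); a minimal sketch, assuming k algebraically closed and char k ≠ 2: the factors $x$, $x-1$, $x-a$ of $y^2$ are pairwise coprime in $k[t]$ (their pairwise differences are the nonzero constants $1$, $a$, $a-1$), and since $k[t]$ is a UFD whose units $k^\times$ are all squares, each factor is a square, say $x = u^2$ and $x-1 = v^2$. Then $(u-v)(u+v) = 1$, so $u \pm v \in k^\times$, hence $u = \tfrac12((u+v)+(u-v)) \in k$, so $x \in k$ and then $y \in k$; the last step is exactly where char 2 breaks.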
Junyan Xu (Sep 23 2022 at 19:05):
Adam Topaz said:
One could say it's for function fields of trdeg 1
reminds me that there's also a proof in Michael Rosen, Number Theory in Function Fields https://link.springer.com/book/10.1007/978-1-4757-6046-0
Anne Baanen (Sep 24 2022 at 04:55):
Riccardo Brasca said:
Maybe this has already been discussed, but I am at a conference where someone said elliptic curves have been formalized in HOL, but they're not sure how nor in which generality. Does anyone have more info about this?
Just to be clear, this was HOL Light, not HOL 4.
Kevin Buzzard (Sep 24 2022 at 08:18):
I was thinking about Riemann-Roch recently. Formally it's two completely different theorems -- one for line bundles on compact Riemann surfaces and one for invertible sheaves on smooth projective algebraic curves (in arbitrary characteristic). Neither implies the other (unless you want to develop the entire theory of GAGA first, which we probably don't). Similarly generalisations of the statement never make it clear about what they're generalising -- places like Wikipedia claim that the Atiyah-Singer index theorem is "a generalisation of Riemann-Roch" and that Grothendieck-Riemann-Roch is "a generalisation of Riemann-Roch" but they're two generalisations of two different theorems.
Kevin Buzzard (Sep 24 2022 at 08:20):
PS Jujian Zhang is working on GAGA but it's slow going, not least because as far as I know we still haven't decided how to implement sheaves of modules on a scheme...
Antoine Chambert-Loir (Sep 25 2022 at 09:04):
I haven't thought deeply about this, but I wonder whether one could adopt a high-level point of view and have topoi, ringed topoi… One reason is that many results on rings, modules… will be given for free once the traditional proofs are transferred to this generality.
Joël Riou (Sep 27 2022 at 22:57):
I would agree with @Antoine Chambert-Loir. Sheaves of rings could be identified with ring objects A in the category of sheaves of sets, and sheaves of modules M with "A-module objects". This may imply redefining internal versions of [has_mul], [add_comm_group], [module], etc. (for a symmetric monoidal structure, e.g. the one given by finite products). Then there could be suitable notations and instances expressing that M(U) is a module over A(U), etc.
Kevin Buzzard (Sep 28 2022 at 16:58):
Some previous discussion on sheaves of modules is here and here.
Wrenna Robson (Sep 29 2022 at 10:44):
Kevin Buzzard said:
Dan Bernstein managed to prove associativity in char 2 in sage-math using some formulae due to David Kohel: (Sage code as quoted above)
oh my god this is the most Dan code ever
Wrenna Robson (Sep 29 2022 at 10:45):
I think morally I wouldn't call this a proof per se in the same way we use that word in Lean, but it seems to work on a cursory scan.
David Ang (Oct 26 2022 at 15:53):
API for addition of K-rational points
Junyan Xu (Nov 10 2022 at 06:49):
I just came across this proof and it seems to me that the computation involved is feasible for Lean 3, and the prerequisites are also available. Maybe someone could give it a try.
Andrew Yang (Nov 10 2022 at 06:56):
I think this is the "function field" approach mentioned by Kevin and others above. It seems like the "explicit calculation" doesn't really work when the characteristic is 2?
Riccardo Brasca (Nov 10 2022 at 07:00):
It uses the short Weierstrass equation, it seems. In characteristic 2 and 3 it's not always possible to write the curve in that way.
Kevin Buzzard (Nov 10 2022 at 07:29):
Yes the problem is precisely that this proof doesn't work in char 2 as it stands and neither does the Edwards curve approach. In fact it seems to me that there might be no formalisation of associativity in characteristic 2 in any system
Eric Rodriguez (Nov 10 2022 at 11:27):
is a char2-only proof just as tricky as a general proof?
David Ang (Nov 10 2022 at 14:39):
If we’re going by the brute-force approach, there are char2-specific simplifications e.g. removing two of the five coefficients, replacing (a-b)^2 with a^2+b^2, etc, but it still doesn’t reduce the computation to a level that’s feasible for Lean3’s ring tactic, especially because there’s no good way to work with expressions in char2
Yaël Dillies (Nov 10 2022 at 14:45):
Is the distinction really char 2 vs char not 2 or rather 2 is invertible vs 2 not invertible? I thought I understood it was the latter
Johan Commelin (Nov 10 2022 at 14:46):
I think we're working over a field for now
Adam Topaz (Nov 10 2022 at 14:54):
Trying to do elliptic curves over general rings (or schemes) would complicate things even further.
Riccardo Brasca (Nov 10 2022 at 14:59):
If we define an elliptic curve as a group scheme such that blah blah, at least associativity will be easy... (just joking)
Kevin Buzzard (Nov 10 2022 at 16:30):
Yes, we need all the definitions, but the point is that we have a lot of the tools necessary to start proving nontrivial theorems about this model. The concrete approach is important for applications, e.g. proving that certain explicit Diophantine equations have some explicit set of rational solutions. As Lean 4 becomes a reality then it will be interesting to see what we can do here.
Junyan Xu (Nov 10 2022 at 19:04):
I think the following argument works: to show injectivity, assume that in the class group (see [MO answer] for the definition of ), then multiplying by we have . If then so assume it's the point (x,y), then says (X-x, Y-y) is a principal ideal in R, which means that (X-x, Y-y) = (X-x, Y-y, f) = (g, f) for some . But the cubic f=0 cannot intersect g=0 at a single point, contradiction. (I think the only place where Δ ≠ 0 will be needed is in showing .)
Kevin Buzzard (Nov 10 2022 at 20:52):
The cubic f=0 can certainly intersect g=0 at a single point: for example, if g is linear it can meet f at a point of order 2. I don't see how to finish easily from here. The only low level proof I know assumes char not 2.
Kevin Buzzard (Nov 10 2022 at 20:55):
Even if we assume some conceptual stuff then this argument boils down to the assertion that there's no function on E with a simple zero at a given finite point and a simple pole at infinity, so it's the statement that the cubic doesn't have genus 0 (for some definition of genus), but is there an easy way of seeing that? I don't know one
Kevin Buzzard (Nov 10 2022 at 20:56):
This is my understanding of the stumbling block in this approach but I might be missing something
Junyan Xu (Nov 10 2022 at 21:04):
Yes I was wrong, it's necessary to assume some regularity of f in the last part of the argument: for example, if g is linear and f=0 is one line intersecting g=0 union two lines parallel to g=0, then (g,f) could be a maximal ideal. But if f=0 is smooth it certainly follows from Bezout's theorem; the intersection multiplicity at infinity can be at most deg g, so the total intersection multiplicity in the affine chart is at least 2 * deg g, which cannot be 1. I imagine Bezout is still easier to prove than Riemann-Roch.
Kevin Buzzard (Nov 10 2022 at 22:20):
Even with Bezout you will have this annoying issue about things meeting at infinity, e.g. you'll have to rule out g meeting f once away from infinity and 3d-1 times at infinity.
Junyan Xu (Nov 11 2022 at 01:19):
Let me note that since f has the form y^2 + p(x) y + q(x) and is in the ideal (f,g), we may assume that g is of the form p(x) y + q(x) and therefore solve y = -q(x) / p(x). I feel that polynomials in one variable should be easy to deal with, but I haven't worked out a proof.
Kevin Buzzard (Nov 11 2022 at 08:10):
The reason I'm nervous about believing there's an easy proof is that in the char not 2 case another way of saying that the ideal isn't principal is that the only solutions to y^2=x(x-1)(x-lambda) in k(t) are all in k, which is a delicate diophantine argument (product of three coprime polys is a square so they're all squares and now you're doing a 2-descent) -- it's arithmetic rather than algebraic.
Junyan Xu (Nov 12 2022 at 04:05):
Okay I hope the following argument now contains no gap, but it depends on ideal norms, so I think we first need to show that R is a Dedekind domain; we could go the regular + dimension 1 route, which uses #16860. Ideal norms have been PR'd in #17203 but only for finite extensions of ℤ; @Anne Baanen is it easy to generalize your developments to all Dedekind domains?
So our goal is to show the ideal (X - x, Y - y) is not principal (when Δ ≠ 0), and we consider the degree 2 extension R of k[X] and the associated norm. Every element of R can be written as p(X) Y + q(X), whose conjugate is -p(X) (Y + a1 X + a3) + q(X), so its norm is -p^2 (X^3 + a2 X^2 + a4 X + a6) - p q (a1 X + a3) + q^2. Let a and b be the degrees of p and q respectively; then the three terms have degrees 2a+3, a+b+(0 or 1), and 2b respectively. When 2a+3 > 2b the first term has the highest degree (without tie), and when 2b > 2a+3 the last term has the highest degree (without tie), so the degree of the norm of an element is either 2a+3 or 2b, and never 1. On the other hand, the norm of the ideal (X - x, Y - y) is (X - x) if x isn't 2-torsion and (X - x)^2 if x is. If (X - x, Y - y) were generated by a single element then its norm would be generated by the norm of that element, so that norm is a constant multiple of X - x (degree 1, impossible) or of (X - x)^2. If the norm of p(X) Y + q(X) is a constant multiple of (X - x)^2, since the degree is even it can't be 2a+3 and must be 2b = 2, so we are in the second case, and we have b = 1 and a + 1 < b, which means that p(X) = 0; so the norm is q^2 and q is a constant multiple of X - x. But it's easy to show (X - x, Y - y) ≠ (X - x) in this case.
Indeed I'm solving the norm equation in polynomials here but fortunately it's easy in this case :) This argument is inspired by Exercise I.6.2 in Hartshorne's GTM52, and I checked out the exercise after seeing the comment here.
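Here is a quick Sage sanity check of the norm computation behind that degree count (p and q are taken as extra indeterminates standing in for p(X) and q(X), which is enough to verify the identity formally; the identity has integer coefficients, so checking it over QQ covers every base ring):
R.<X,a1,a2,a3,a4,a6,p,q> = PolynomialRing(QQ)
S.<Y> = PolynomialRing(R)
# the long Weierstrass equation, viewed as a monic quadratic in Y
f = Y^2 + (a1*X + a3)*Y - (X^3 + a2*X^2 + a4*X + a6)
alpha = p*Y + q                  # a general element of k[X][Y]/(f)
conj  = p*(-Y - a1*X - a3) + q   # its conjugate: Y is sent to -Y - a1*X - a3
norm  = -p^2*(X^3 + a2*X^2 + a4*X + a6) - p*q*(a1*X + a3) + q^2
assert alpha*conj + p^2*f == norm   # i.e. alpha * conj ≡ norm modulo f
With deg p = a and deg q = b, the three terms of the norm indeed have degrees 2a+3, a+b+(0 or 1) and 2b.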
Junyan Xu (Nov 12 2022 at 08:43):
I think we probably want to develop this general version of ideal norms (notes by Sutherland) in mathlib. To show regularity of k[X,Y]/(f), we may develop stacks#07PF, or maybe we can show that the localizations at all points are PIDs more directly and use the 7th characterization of Dedekind domains here.
Riccardo Brasca (Nov 12 2022 at 09:38):
I think @Anne Baanen and @Alex J. Best are working on the ideal norm.
Anne Baanen (Nov 14 2022 at 15:13):
At the moment we only have working code for finite extensions of ℤ indeed, but I did do some work on the norm definition using this general suggestion, [M_p : N_p]A_p := det φ̂_p. For example, nat_abs_det_equiv in #17299 links this determinant to the cardinality of the quotient. I don't know how developed the theory of localizations of Dedekind domains is now, to say how quickly we can get the full definition working nicely.
Anne Baanen (Nov 14 2022 at 15:14):
My work in progress can be found on branch#ideal_norm. It is somewhat duplicate with the PRs mentioned above, but should have some more results toward this determinant-based definition of the norm.
Riccardo Brasca (Nov 14 2022 at 15:26):
Let me review some of the PR involved!
David Ang (Nov 14 2022 at 17:04):
David Ang said:
I made a PR last month giving a basic API for the addition of K-rational points, the type of which is defined as an inductive over the point at infinity and the rest of the affine points. There was some discussion with @Michael Stoll on the PR page that perhaps we don't want this to be the definition, because it might be unlikely that associativity would be proven easily with this definition. So far the current approaches discussed are (if I didn't miss any):
- brute-force (Lean3's ring is ~4 times too slow for this, and even if it were fast enough I need a file with thousands of lines of polynomials),
- injecting it into the class/Picard group (need to define the associated ring),
- using some form of Cayley-Bacharach (need some form of Bezout), and
- translating it into Edwards form (need to define Edwards curves, and this doesn't work for char 2).
The last three methods would involve translating the current definition of affine addition into their respective contexts, then showing that this addition is the same as the translated addition. The ideal definition is obviously to define the Weierstrass model via the Proj construction (or even as a group scheme etc), but I doubt we are close to doing that. I'm thinking that we should fix the affine definition, partly because it's the definition that's easiest to work with and prevalent in other theorem provers, and partly because I already proved a bunch of other things with this without needing associativity. For whatever proof we have for associativity that comes later, we can simply introduce the necessary translation and define an equiv with the affine definition. Thus I'm asking for opinions via a poll here.
David Ang (Nov 14 2022 at 17:08):
/poll What should I do for #17194?
The current affine definition should be approved, and should be the definition (at least until we can give coordinates via Riemann-Roch)
The current affine definition should be approved, but should be refactored as a theorem/equiv once we have a proof of associativity
The current affine definition should be rejected, and we should wait for a better definition with a proof of associativity
Junyan Xu (Nov 15 2022 at 01:31):
I just found that the Picard group proof @Michael Stoll proposed two weeks ago on Github is almost the same as the proof I worked out, except that it's stated in terms of the (smooth) projective curve while mine is totally restricted to the affine chart, working with a Dedekind domain; and in the last part of the argument, I use the norm to show the maximal ideals (with residue field k, i.e. inertial degree 1) can't be principal, while Michael uses that the orders of zeros/poles of a rational function sum up to 0, which is essentially equivalent to Bezout for the cubic and arbitrary plane curves defined by the numerator and denominator of the rational function. (It's also possible to work with abstract curves (formed by valuations of the tr.deg. 1 function field) but that would be more theory development, and some glue is necessary to connect to the concrete plane curve model.) So I think my approach is more elementary (purely commutative algebra). Although the approaches to this last part are distinct, they also share some similarity: showing a rational function cannot have a simple pole at infinity and no pole elsewhere in his approach is the same as showing an element in the Dedekind domain cannot have norm of degree 1. I should have seen his proof in email notification but had forgotten about it when I began to pursue my approach :)
Junyan Xu (Nov 15 2022 at 01:31):
Although a full theory of ideal norms is desirable (it unifies the number field and function field cases), it's also possible to take a shortcut if we just want to prove the maximal ideals (with residue field k) in the Dedekind domain are non-principal: for algebraic number rings we have a natural notion of size of an ideal, which is the cardinality of the quotient. In our case where all ideals are vector subspaces over the base field k, we may use the k-dimension of the quotient (which is additive rather than multiplicative, but we don't even need this fact): the quotient at a rational point has dimension 1, but we can show that the quotient by any nonzero principal ideal (pY+q) has dimension equal to the degree of N(pY+q), which is at least 2 from previous argument. (And when I write this I realize I made an error in that argument: there's no need to split out the 2-torsion case; the norm of the ideal is still (X-x) rather than its square.) My argument uses this result just merged into mathlib:
def ideal.quotient_equiv_pi_span (I : ideal S) (b : basis ι R S) (hI : I ≠ ⊥) :
  (S ⧸ I) ≃ₗ[R] Π i, (R ⧸ ideal.span ({I.smith_coeffs b hI i} : set R))
The determinant of the multiplication by pY+q on S=k[X,Y]/(f), as a k[X]-linear map (R=k[X]), is easily calculated to be equal to N(pY+q) using the basis {1,Y}. On the other hand, it's also associated to the product of the smith_coeffs: we may consider S → (pY+q) → S → S, where the first map sends each element in the Smith basis to the corresponding smith_coeff times that element (which sends a basis to a basis so is an R-linear_equiv), the second map sends every element x ∈ (pY+q) to (pY+q)⁻¹ * x (which is also an R-linear_equiv when p,q are not both zero, since S is a domain), and the third map is multiplication by pY+q. Then the last two maps "cancel", so the net effect of the composition of the three maps is multiplying by the smith_coeffs, and so it has determinant equal to the product of the smith_coeffs. On the other hand, the first two maps compose to an R-automorphism of S, whose determinant is a unit (docs#linear_equiv.det). Therefore N(pY+q), which is the determinant of the third map, is associated to, and hence has the same degree as, the product of the smith_coeffs. According to ideal.quotient_equiv_pi_span, the dimension of S/(pY+q) over k is the same as the sum of the degrees of the smith_coeffs, which is the same as the degree of N(pY+q), and we are done.
With this shortcut, I don't think we even need to show S is Dedekind, only that S is a domain, i.e. the polynomial f in k[X,Y] is irreducible; if f splits into a product (Y+p(X))(Y+q(X)) it should be easy to show Δ = 0.
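In other words (just summarising the count above, with $d_1,\dots,d_n$ the Smith coefficients of the ideal $(pY+q)$ with respect to the $k[X]$-basis $\{1,Y\}$ of $S$):
$$S/(pY+q) \cong \bigoplus_i k[X]/(d_i), \qquad \prod_i d_i \sim \det\big(\cdot(pY+q)\big) = N(pY+q),$$
$$\dim_k S/(pY+q) = \sum_i \deg d_i = \deg N(pY+q) \ge 2 > 1 = \dim_k S/(X-x, Y-y),$$
so $(X-x, Y-y)$ cannot be principal.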
Junyan Xu (Nov 15 2022 at 01:32):
Regarding #17194, I think the definition of addition and the case splits are fine, and are quite amenable for showing . However, I think def weierstrass should be made a polynomial f : k[X][X], since we're gonna define the Dedekind domain as k[X] adjoining a root of f; we can easily evaluate the polynomial at any point in the affine plane.
Kevin Buzzard (Nov 15 2022 at 05:23):
Yeah I believe that this argument should work even in char 2 and looks like the most promising approach currently
Kevin Buzzard (Nov 15 2022 at 05:23):
@David Ang
David Ang (Nov 15 2022 at 11:58):
Junyan Xu said:
Regarding #17194, I think the definition of addition and the case splits are fine, and are quite amenable for showing . However, I think def weierstrass should be made a polynomial f : k[X][X], since we're gonna define the Dedekind domain as k[X] adjoining a root of f; we can easily evaluate the polynomial at any point in the affine plane.
I've defined weierstrass_polynomial for this - I'm guessing the ring is now just K[X][X] ⧸ ideal.span ({E.weierstrass_polynomial} : set K[X][X])?
David Ang (Nov 15 2022 at 15:53):
Probably not the right place to ask but I will ask here anyway since it's relevant: I was trying to define the map $P \mapsto [I_P]$ and I encountered a deterministic timeout, so I tried extracting an MWE:
import ring_theory.class_group
open polynomial
open_locale polynomial
variables (K : Type) [field K]
noncomputable def cusp : K[X][X] := X ^ 2 - C (X ^ 3) -- supposed to be weierstrass_polynomial
@[reducible] def cusp_ring := K[X][X] ⧸ ideal.span ({cusp K} : set K[X][X])
instance : is_domain $ cusp_ring K := sorry
instance : is_dedekind_domain $ cusp_ring K := sorry
noncomputable def to_class : K → class_group (cusp_ring K) := -- supposed to be E(K) -> class_group
λ x, class_group.mk0 ⟨ideal.span {quotient.mk' $ C (X - C x)}, sorry⟩
The last line takes ~10 seconds to compile for me, and making it more complicated (e.g. changing K to E(K) and making the definitions actually correct) makes it time out - any idea what's going on? I can change ... : K → ... := λ x, ... to ... (x : K) : ... := ... and it's fast, but for E(K) I'd rather pattern match the inductive...
Junyan Xu (Nov 15 2022 at 17:34):
It takes <4s on my machine and I'm not yet sure how to speed it up. By the way I think you want to use the default constructor which doesn't require is_dedekind_domain, and you can directly show is the inverse of because . Showing is_dedekind_domain R would be more work, though certainly doable.
Junyan Xu (Nov 15 2022 at 17:36):
and K[X,Y]/(Y^2-X^3) is definitely not a Dedekind domain; (Y/X)^2=X so it's not integrally closed.
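(A concrete way to see this: the ring is isomorphic to k[t^2, t^3] ⊂ k[t] via X ↦ t^2, Y ↦ t^3, and t = Y/X lies in the fraction field and satisfies the monic equation T^2 - X = 0, yet t ∉ k[t^2, t^3].)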
David Ang (Nov 15 2022 at 17:47):
Yeah the Y^2 = X^3 was just a stupid example to show that it’s very slow - once someone approves #17194 I can write down the actual definition that times out (and I don’t know how to fix)
Junyan Xu (Nov 15 2022 at 19:32):
Oooh I just realized the weierstrass polynomial can never be factored as (Y+p(X))(Y+q(X)), because pq needs to have degree 3 so p+q has degree at least 2 but needs to have degree 1. So k[X,Y]/(f) is always a domain no matter whether Δ = 0 and my argument actually shows that the nonsingular rational points in Spec k[X,Y]/(f) always form a group. Maybe we should do things in this generality, @Kevin Buzzard? It would be useful for bad (additive/multiplicative) reduction of elliptic curves, right?
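Spelling out that degree count, writing the Weierstrass polynomial as Y^2 + (a1 X + a3) Y - (X^3 + a2 X^2 + a4 X + a6) as elsewhere in this thread: a factorization (Y+p(X))(Y+q(X)) would force p + q = a1 X + a3 (degree at most 1) and pq = -(X^3 + a2 X^2 + a4 X + a6) (degree exactly 3); then deg p + deg q = 3, so the two degrees are distinct and deg(p+q) = max(deg p, deg q) ≥ 2, a contradiction.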
Michael Stoll (Nov 15 2022 at 19:37):
Doing it in this generality would definitely be good!
Andrew Yang (Nov 15 2022 at 21:28):
David Ang said:
the ring should now be just
K[X][X] ⧸ ideal.span ({E.weierstrass_polynomial} : set K[X][X])
docs#adjoin_root might be useful here
David Ang (Nov 15 2022 at 22:13):
Andrew Yang said:
David Ang said:
the ring should now be just
K[X][X] ⧸ ideal.span ({E.weierstrass_polynomial} : set K[X][X])
docs#adjoin_root might be useful here
Thanks, this magically fixes literally all my current timeout issues as well :saluting_face:
Eric Rodriguez (Nov 15 2022 at 22:27):
adjoin_root has some diamond somewhere iirc; I have a fix on the splitting field diamond fix but I haven't kept that up to date sadly... I'll try to finish that soon so that doesn't cause any issues here.
Junyan Xu (Nov 16 2022 at 03:49):
Doing it in this generality would definitely be good!
So we need to introduce a new structure without the Δ : R× and make docs#EllipticCurve extend it, and define addition for this new structure more generally. What should the new structure be called? The constructor point.some (x y : K) (h : E.weierstrass_equation x y) would take a nonsingularity condition (or do we want to consider all rational points including the singular point?), and we may define another constructor for points on an EllipticCurve without that condition.
BTW @David Ang In #17194 I think it's better to use [comm_ring] instead of [field] whenever possible; I think weierstrass_equation/polynomial and neg don't need [field], only addition needs it.
P.S. I just learned that 37 is the discriminant of src#EllipticCurve.inhabited :) Looks like it's not so random, but maybe just the law of small numbers.
Andrew Yang (Nov 16 2022 at 04:00):
I think that is one of the reasons why 37 is 37 and not 42 or something else.
Junyan Xu (Nov 16 2022 at 04:06):
I think it's always advertised as a random choice, but I've now found another mention of its appearance in elliptic curves ...
Mario Carneiro (Nov 16 2022 at 04:28):
it's definitely not a random choice. It's meme status on this zulip originates from @Kevin Buzzard using it as his favorite number
Mario Carneiro (Nov 16 2022 at 04:28):
and I'm pretty sure the elliptic curve connection is part of it
Thomas Browning (Nov 16 2022 at 05:21):
I didn't know about the elliptic curve connection. I thought it just had to do with it being the first irregular prime.
Junyan Xu (Nov 16 2022 at 05:32):
and it's also the smallest prime not dividing the order of the Monster ... and 59 is the only irregular prime dividing the order of the Monster.
Mario Carneiro said:
it's definitely not a random choice.
I always thought it was a randomly chosen prime number.
Johan Commelin (Nov 16 2022 at 06:54):
Never attribute to randomness that which is adequately explained by expertise in number theory.
Antoine Chambert-Loir (Nov 16 2022 at 08:58):
Just a remark on this amazing discussion: when you look at the first papers of Hasse on elliptic curves, 1936, the language was essentially field-theoretic. What made the theory work was that for curves, closed points correspond to places/valuations on the function field. This explains, @Junyan Xu, why it is not really necessary to develop the Dedekind-ring theory for these results to be proved. On the other hand, the Riemann-Roch theorem is a crucial component of Hasse's papers; it had been proved a few years earlier by F.K. Schmidt. Only with Weil would the theory start to be framed in a geometric language. (For Hasse and Deuring, the product of a curve $C/k$ by another one $E/k$ was constructed as the curve $C$ with scalars extended to the function field of $E$, and its places could be separated into constant ones and non-constant ones, leading to an understanding of the divisor class group of the surface $C\times E$ modulo rational equivalence and its action on divisors of $C$, mapping them to divisors on $E$.)
David Ang (Nov 16 2022 at 10:53):
Junyan Xu said:
Doing it in this generality would definitely be good!
So we need to introduce a new structure without the Δ : R× and make docs#EllipticCurve extend it, and define addition for this new structure more generally. What should the new structure be called? The constructor point.some (x y : K) (h : E.weierstrass_equation x y) would take a nonsingularity condition (or do we want to consider all rational points including the singular point?), and we may define another constructor for points on an EllipticCurve without that condition. BTW @David Ang In #17194 I think it's better to use [comm_ring] instead of [field] whenever possible; I think weierstrass_equation/polynomial and neg don't need [field], only addition needs it. P.S. I just learned that 37 is the discriminant of src#EllipticCurve.inhabited :) Looks like it's not so random, but maybe just the law of small numbers.
WeierstrassCurve? I will do the comm_ring change. I think Kevin was intentionally trying to find a curve with discriminant 37 to put in there.
Kevin Buzzard (Nov 16 2022 at 11:28):
It's the first rank 1 elliptic curve.
David Ang (Nov 16 2022 at 12:18):
I can refactor both EllipticCurve.lean and EllipticCurve/point.lean to include singular Weierstrass equations, but I think it's better to have this approved first (since it's valid) and do that immediately in a separate PR.
Michael Stoll (Nov 16 2022 at 16:14):
We could take a curve of conductor 5077 instead...
Oisin McGuinness (Nov 16 2022 at 19:57):
Michael Stoll said:
We could take a curve of conductor 5077 instead...
Of course 5077 is the smallest conductor of an elliptic curve of rank 3 (this has been proved), and also stands out for its role in the resolution of the class number problem for imaginary quadratic fields (Goldfeld and Gross-Zagier).
My personal favorite would be 19047851, the smallest known conductor of a rank 5 elliptic curve (minimality not yet proved; apparently searches haven't quite settled this yet); entry in LMFDB here: Elliptic curve with LMFDB label 19047851.a1, only relatively recently added to LMFDB. See the 5th page of this paper for this curve's publication.
Junyan Xu (Nov 16 2022 at 23:28):
LMFDB doesn't seem to have Elkies' curves; at least the only known curve (over Q) of rank ≥ 28, y^2 + xy + y = x^3 − x^2 − 20067762415575526585033208209338542750930230312178956502x + 34481611795030556467032985690390720374855944359319180361266008296291939448732243429, isn't among the results returned from searching bad prime 48463. (I found the bad prime from this page.) These curves are really special as it's conjectured that only finitely many elliptic curves over Q have rank > 21.
David Michael Roberts (Nov 17 2022 at 04:42):
Another fact: it's a factor of the conductor of what seems to be the oldest recorded curve on which someone (Diophantus, who else) found a rational point, 8732.b1.
David Michael Roberts (Nov 17 2022 at 04:59):
@Junyan Xu do we know the torsion subgroup of the Mordell–Weil group of Elkies' curve?
David Michael Roberts (Nov 17 2022 at 05:43):
Oh, I see that it has trivial torsion. Took a little search to track down that email...
Junyan Xu (Nov 17 2022 at 07:06):
I think the 123-digit number in Elkies's email must have been factored as it's not hard nowadays (at most several days according to this 2018 post). The paper I linked also mentioned "passing a list of bad primes" of the curve but I'm not sure if it means all of them; it doesn't mention how the global root numbers were obtained.
Apparently it gets harder to find elliptic curves of high ranks when there is more torsion: the records in this paper are {20,15,13,8,9,6} for Z/{2,3,4,5,6,7}Z, and heuristics suggest that there are only finitely many of ranks above {13,9,7,5,5,3}.
Elliptic curves in nature (mentioned in the MO thread) is also a nice read :) with 5077-a1 the last example.
Junyan Xu (Nov 17 2022 at 07:16):
Oh, someone added the factorization to factordb.com in 2019: 315574902691581877528345013999136728634663121 x 376018840263193489397987439236873583997122096511452343225772113000611087671413
Kevin Buzzard (Nov 17 2022 at 23:24):
So @David Ang and I went through @Junyan Xu 's approach today at work and convinced ourselves that it should work fine, even in char 2. I realise that I am still a bit confused about some of the story.
If R is a commutative ring and A is an R-algebra which is finitely-generated and projective (or locally free, which might be slightly different in the non-Noetherian case) as an R-module, then there's a norm map A → R, a monoid homomorphism, defined by considering multiplication by an element a of A as an R-module endomorphism of A and then taking its determinant (this and the trace are sufficiently "canonical" that they can be computed locally, meaning that A doesn't have to be free, just locally free).
Now if I is an ideal of A we can define its norm to be the ideal of R generated by the norms of all the elements. What I am slightly confused by is that we don't need to assume A is a Dedekind domain in this definition, and going through the argument it seemed to me that all one needs is that the norm of the ideal (X - x, Y - y) is the ideal (X - x). In particular we don't need to prove that A is a Dedekind domain. Did I miss something?
Is the idea that one can define the norm of an ideal of A as an ideal of R but one can't prove that this construction is multiplicative unless A is a Dedekind domain, and perhaps more too?
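In symbols, the construction I mean is just
$$N_{A/R}(a) := \det(m_a : A \to A), \qquad m_a(x) = ax,$$
and, for an ideal $I \subseteq A$,
$$N_{A/R}(I) := \big(N_{A/R}(a) : a \in I\big) \subseteq R,$$
the ideal of $R$ generated by the norms of elements of $I$.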
Kevin Buzzard (Nov 17 2022 at 23:24):
@Anne Baanen are you working on this?
Kevin Buzzard (Nov 17 2022 at 23:25):
I think David might want to "start at the top", deducing associativity from injectivity and then starting on injectivity and perhaps meeting others coming up from the bottom.
Anne Baanen (Nov 18 2022 at 10:14):
Yes, the idea is just that bundling the map as a homomorphism requires stricter assumptions
Anne Baanen (Nov 18 2022 at 10:16):
In particular we needed for our project that it returns a number, not ideal, and that the map is multiplicative, so we restrict to rings of integers of number fields
Anne Baanen (Nov 18 2022 at 10:17):
I'm pretty sure I have some code on the ideal generated by the norms, let me look it up
Kevin Buzzard (Nov 18 2022 at 12:12):
Aha -- the norm of an ideal is an ideal in my comments above, but then I suspect that you're in the special case where R = ℤ and so instead of an ideal you can just take its non-negative generator. Alternatively you can just take the size of A/I if it's finite.
Anne Baanen (Nov 18 2022 at 12:28):
In fact, I recall our development went span of norms → generator of span → cardinality of the quotient as we figured out exactly what we required :)
Alex J. Best (Nov 18 2022 at 12:55):
Yeah some things were way easier to prove with one version than the other in the end
Alex J. Best (Nov 18 2022 at 12:55):
And indeed we considered a version of relative norm, but taking the positive generator of a Z ideal was very convenient for now
Anne Baanen (Nov 18 2022 at 13:09):
Anne Baanen said:
I'm pretty sure I have some code on the ideal generated by the norms, let me look it up
WIP on branch#ideal-span-norm
Junyan Xu (Nov 18 2022 at 22:32):
Kevin Buzzard said:
In particular we don't need to prove that A is a Dedekind domain.
Yes, this is what I concluded in the latest version of my argument, and why I said the argument could be extended to Weierstrass curves with possibly zero discriminant / domains that are not Dedekind. In fact I think this part of the argument is now even easier than the other part (showing ).
Ideal norm is usually defined either as a module index as in Sutherland's notes, or as the cardinality of the quotient ring (for rings of algebraic integers A relative to R=Z as in #17203 and in Conrad's notes). The former definition requires the base ring R to be Dedekind and the latter requires R=Z, but both definitions make sense without A being Dedekind. However, indeed multiplicativity may fail when A isn't Dedekind: see Example 8.2 in [Conrad] for an example where |A/P|=p but |A/P^2|=p^{n+1}. (Since A is of Krull dimension 1, it should be clear that for P a nonzero prime the dimension of P/P^2 over A/P is 1 iff A is regular at P.) We also have
Remark 7.6. In a domain where all nonzero ideals have finite index, the multiplicativity of the ideal norm implies unique factorization of ideals.
[3] H. S. Butts and L. Wade, Two Criteria for Dedekind Domains, Amer. Math. Monthly 73 (1966), 14-21.
If we instead define the ideal norm as the ideal generated by norms of elements, as you do, it's also not obviously multiplicative (because the norm of a sum has no obvious relation with the norms of the summands), but N((g)) = N(g), N(gI) = N(g)N(I) and N(I)N(J) ≤ N(IJ) are obvious; curiously this inequality seems to be opposite to what holds with the usual definition. We also know that the two definitions agree when A is a Dedekind domain, see Theorem 7.10 in [Conrad] or Corollary 6.9 in [Sutherland], in which case both are multiplicative. When A isn't Dedekind they need not agree: if A=Z+2Z[√-1] and p=2Z[√-1] (like in Example 8.2), then |A/p|=2 but N(p)=4, and |A/p^2|=8 but N(p^2)=16, so multiplicativity fails for the usual ideal norm but not for your ideal norm, and I struggle to come up with a counterexample to multiplicativity of your ideal norm.
Notice that [Sutherland] develops the theory of module index for modules that are lattices over a Dedekind domain; lattices are submodules of vector spaces, so they are torsion-free and hence (being over a Dedekind domain) projective, which is exactly the setting under which you defined the norm of an element.
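(Working out the numbers in that example, for the record: A = Z + 2Z[i] is free over Z with basis {1, 2i}, and multiplication by 2c + 2di has matrix [[2c, -4d], [d, 2c]] in that basis, with determinant 4(c^2 + d^2), so the ideal generated by norms of elements of p = 2Z[i] is (4), while A/p ≅ Z/2 via a + 2bi ↦ a mod 2; similarly p^2 = 4Z[i] gives N(p^2) = (16) while |A/p^2| = 8, since a + 2bi is determined mod p^2 by a mod 4 and b mod 2.)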
David Ang (Nov 19 2022 at 21:23):
This PR, which depends on #17194 and two small PRs, proves case-by-case that the Weierstrass polynomial is irreducible and hence that the quotient ring is a domain, and this branch defines the add_monoid_hom and a proof of the group law on E(K), assuming injectivity and that it respects addition - feel free to make commits into it.
Antoine Chambert-Loir (Nov 21 2022 at 21:01):
Kevin Buzzard said:
If R is a commutative ring and A is an R-algebra which is finitely-generated and projective (or locally free, which might be slightly different in the non-Noetherian case) as an R-module,
it depends on what you call "locally free". It's equivalent if you define it as the existence of an open covering of Spec R by, say, basic open sets D(f), such that each module A_f is free and finitely generated. It is not equivalent if, by that, you mean that the localizations A_p at all prime ideals are free modules over the local rings R_p.
… then there's a norm map A → R, a monoid homomorphism, defined by considering multiplication by an element a of A as an R-module endomorphism of A and then taking its determinant (this and the trace are sufficiently "canonical" that they can be computed locally, meaning that A doesn't have to be free, just locally free).
Indeed.
Now if I is an ideal of A we can define its norm to be the ideal of R generated by the norms of all the elements. What I am slightly confused by is that we don't need to assume A is a Dedekind domain in this definition, and going through the argument it seemed to me that all one needs is that the norm of the ideal (X - x, Y - y) is the ideal (X - x). In particular we don't need to prove that A is a Dedekind domain. Did I miss something?
What is not clear to me is what you expect of this norm, especially when the ideals are not locally principal.
Is the idea that one can define the norm of an ideal of A as an ideal of R but one can't prove that this construction is multiplicative unless A is a Dedekind domain, and perhaps more too?
There is one case where good things happen, namely when your ideals are locally principal.
In this case, the norm will be multiplicative: this reduces to the multiplicativity of the determinant in local charts.
Kevin Buzzard (Nov 21 2022 at 22:27):
Re what we expect of the norm: the only thing needed is the observation that the norm of (X - x, Y - y) down to k[X] is principal, generated by a polynomial of degree 1, but that if (X - x, Y - y) is principal then its norm is generated by the norm of a single element, which is never a polynomial of degree 1.
Junyan Xu (Nov 22 2022 at 07:42):
I've previously outlined on Github a unified way of proving that the point given by the addition and doubling formulas lies on the Weierstrass curve, and I think it's most convenient to use a multivariate chain rule, so I asked about it on Zulip, though it may be more straightforward to compute the derivative of directly (where g is linear). I have now also worked out a unified way to show whenever and both are not :
Let L be the slope of the line connecting the two points and on the Weierstrass curve , or of the tangent line when the two points are equal. This is called add_L or dbl_L in #17194. Define . As shown in the outline, with the x-coordinate of . It suffices to show that .
Since P,Q both lie on the line , it is easy to show that and ; on the other hand, P+Q lies on so , where is the conjugate of Y. (It would be most convenient to define these ideals via docs#ideal.map from K[X,Y] to the quotient ring.) Notice that we have in the quotient ring , and it follows that in the quotient ring. Therefore and , so it suffices to show . We can subtract from to get , so it suffices to show is the unit ideal in K[X,Y]. But the vanishing locus of the ideal is nonempty only when and has a unique double root (since the derivative of w.r.t. Y is exactly ), in which case we must have . Thus when the vanishing locus is empty and therefore the ideal is the unit ideal by Nullstellensatz docs#mv_polynomial.vanishing_ideal_zero_locus_eq_radical (and docs#ideal.radical_eq_top). (It'd be slightly annoying but manageable to switch between K[X][X] and the mv_polynomial ring K[X,Y].)
Junyan Xu (Nov 22 2022 at 08:45):
For the case , we want to show . We have and because in the quotient ring. Notice that always has the root and it's a multiple root iff the X-derivative of f is zero at . If , it's easy to show is the unit ideal, and we're done. If then the Y-derivative of f at vanishes, so P is a singular point iff the X-derivative also vanish, iff is a root of . Moreover in this case and the three-generator ideal is the unit ideal (and we are done) iff isn't a root of iff P is nonsingular, and is the maximal ideal iff it is a root iff P is singular.
Since we know ideals I with R/I one dimensional are never principal (for all Weierstrass curves, no matter singular or not), we see that indeed fails in the last case, and it's the only case where it fails. This is natural since the maximal ideal at a singular point should not be invertible, so for all ideals , in particular for .
To show that the nonsingular points form a group, we still need to show that the sum of two nonsingular points is nonsingular, but that should be easy.
To show that nonzero discriminant implies no singular points exist, Silverman uses a coordinate transform to get a simpler Weierstrass equation, and characteristic 2 requires special treatment.
Alex J. Best (Nov 22 2022 at 22:21):
We have some Lean 4 proofs relating singularity and the discriminant in all characteristics at GitHub.com/kisarablue/ec-tate-lean; the project is still WIP, but that part is sorry-free. It sounds like the one you want is the easier one though. It suffices to express the discriminant as a member of the ideal generated by the Weierstrass equation and the partial derivatives, which is just one huge ring call at the end of the day.
Kevin Buzzard (Nov 22 2022 at 22:35):
It suffices, but it's not necessary and sufficient because it could have been e.g. that the 37th power of the discriminant is in the ideal, so in some sense we're lucky here.
Alex J. Best (Nov 22 2022 at 22:51):
Indeed, that's why this is the easier direction! Somehow geometry alone doesn't capture the fact that the discriminant is the resultant of those polynomials.
Junyan Xu (Nov 23 2022 at 08:12):
In fact we could also use a computation-light proof for Δ ≠ 0 ⇒ nonsingular. Mathlib already has docs#EllipticCurve.two_torsion_polynomial.disc_ne_zero, which shows that over a field of characteristic ≠ 2, the discriminant of the "two-torsion polynomial" of an elliptic curve (Δ ≠ 0) is nonzero, which means that the two-torsion polynomial has no multiple root, i.e. its derivative at any root is nonzero. But the two-torsion polynomial is exactly a multiple of (obtained by solving the Y-derivative=0 equation for Y and plugging into f), and its derivative at X=x is the X-derivative of f at minus the Y-derivative of f at the same point times , according to the chain rule. For any point on the curve where the Y-derivative is zero, we have , and is a root of the two-torsion polynomial because , so the derivative at () is nonzero; since is zero there we conclude , so is nonsingular. The characteristic 2 case does need special treatment but should be easy.
Kevin Buzzard (Nov 23 2022 at 08:19):
I am secretly hoping that we never have to treat char 2 as a special case but I know this is not always possible in general
David Ang (Nov 23 2022 at 14:33):
I plan to refactor the stuff on two-torsion polynomials (any name suggestions are highly appreciated) once we define n-division polynomials, but it’s there to demonstrate the usage of the cubic discriminant file
Marc Masdeu (Dec 01 2022 at 06:57):
I have a computational proof of discriminant nonzero implies nonsingular, for an appropriate definition of nonsingular (the one needed in the computational proof of associativity)...
David Ang (Dec 06 2022 at 16:47):
Marc Masdeu said:
I have a computational proof of discriminant nonzero implies nonsingular, for an appropriate definition of nonsingular (the one needed in the computational proof of associativity)...
Thank you for this - I originally copied the huge expressions over in #17631, but you can actually avoid them (and the community probably won't like it anyway), by first applying a variable change to (0, 0), then proving that the discriminant is unchanged and the same result holds. I called it a smooth point just because the terminology is shorter (is there actually a semantic difference between being smooth and being nonsingular?)
Kevin Buzzard (Dec 06 2022 at 18:52):
nonsingular is usually a property of objects and smooth a property of morphisms.
Jz Pan (Dec 07 2022 at 00:24):
I have a proof that if a Weierstrass equation over a field has non-zero discriminant, then it does not have a rational singular point.
Conversely, if it does not have a rational singular point, under the assumption that the field is not of char 2 or 3, or is perfect, then it has non-zero discriminant. (If the assumption on the field does not hold, then there can be a singular point defined over a finite inseparable extension.)
Jz Pan (Dec 07 2022 at 00:26):
The computation does not involve lengthy explicit calculations if I remembered correctly.
Jz Pan (Dec 07 2022 at 00:42):
Kevin Buzzard (Dec 07 2022 at 00:43):
Very nice!
Junyan Xu (Jan 23 2023 at 05:02):
I just pushed a commit to make #18000 sorry-free. Lean now knows rational points on an elliptic curve over a field form an abelian group! (Actually we show the same for the nonsingular rational points on a possibly singular curve defined by a long Weierstrass equation. As far as we know, associativity has not been proved for long Weierstrass equations in any theorem prover before.)
Thanks to @David Ang's hard work spread out in numerous PRs which reduced the whole argument to the final sorry. He intended to attack the final sorry with another approach, but it depends on branch#ideal-span-norm which is not yet merged into master, so after discussions he generously left the final sorry to me while he works on our joint paper, and I filled it using the original approach suggested by myself. It's a bit painful to fight timeouts and pi instance issues (in the auxiliary file) and to construct a certain isomorphism, but finally Lean accepts the proof!
Thanks to everyone who shared thoughts in this thread; this elementary new proof of associativity is really the fruit of online interchange of ideas. And thanks to all mathlib contributors, whose work our formalization builds upon.
Yaël Dillies (Jan 23 2023 at 07:36):
This is one of those things I thought I would never see solved. Congratulations!
Johan Commelin (Jan 23 2023 at 07:38):
@Junyan Xu @David Ang Huge congratulations!
Riccardo Brasca (Jan 23 2023 at 08:21):
Congratulations! I will review today one of the blocking PR... we really want this in mathlib!
Chris Birkbeck (Jan 23 2023 at 09:03):
Oh amazing work! I'm happy this is finally done :)
Kevin Buzzard (Jan 23 2023 at 09:53):
Just to remark that in particular (as far as I know) this is the first formalisation of associativity which works for all elliptic curves over all fields and in particular in characteristic 2. Furthermore it does not use the coordinate bash, which in this generality involves manipulating polynomials with tens of thousands of coefficients, and does not involve a case split on the characteristic either. The argument is much more conceptual and uses the work of @Anne Baanen et al on class groups of Dedekind domains. The work is a testament to what we are doing here in this community. The effort put into formalising harder and more conceptual commutative algebra means that we can use more powerful tools to take things further than before. Furthermore as far as I can make out, the formalised proof is actually mathematically novel. My understanding is that it has its roots in an exercise in Hartshorne's book on algebraic geometry which specifically excludes characteristic 2. Of course the proof of Fermat's last theorem needs a robust theory of elliptic curves which works in all characteristics.
Anne Baanen (Jan 23 2023 at 10:06):
Amazing! I didn't realize you needed the ideal norm work for this, so I'll definitely hurry up more and finish PR'ing soon :O
Riccardo Brasca (Jan 23 2023 at 10:13):
Concerning the ideal norm I have this comment. I am not sure it is relevant here, and everything is perfectly fine for ℤ, but I am sure it is going to cause trouble later on.
Riccardo Brasca (Jan 23 2023 at 10:14):
I've opened this issue, which is essentially what we need to replace the norm from 𝓞 K with the restriction of the norm from K.
Junyan Xu (Jan 23 2023 at 17:19):
@Anne Baanen To clarify, @David Ang told me he has an approach using your branch#ideal-span-norm (which he has not explained to me), but the way I filled the final sorry relies only on what's already in mathlib. However, our formalization definitely relies crucially on your work everywhere, including but not limited to:
docs#ideal.quotient_equiv_pi_span (Smith normal form stuff, used in the final sorry)
docs#linear_map.associated_det_of_eq_comp (used in the final sorry)
docs#algebra.norm (proof of non-rationality/injectivity from points to class group centers around this)
and of course docs#class_group. It's fortunate that the class group is defined in the generality of all domains, not just Dedekind domains, so that we can generalize from elliptic curves to possibly singular Weierstrass curves. (We use the constructor that takes an invertible (fractional) ideal in an arbitrary domain rather than the one that takes a nonzero ideal in a Dedekind domain.)
Anne Baanen (Jan 23 2023 at 17:26):
Oh I see. It makes me very happy to know my work is useful outside of my immediate area of concern! :D
David Ang (Jan 24 2023 at 00:45):
Junyan Xu said:
Anne Baanen To clarify, David Ang told me he has an approach using your branch#ideal-span-norm (which he has not explained to me), but the way I filled the final sorry relies only on what's already in mathlib. However, our formalization definitely relies crucially on your work everywhere, including but not limited to:
docs#ideal.quotient_equiv_pi_span (Smith normal form stuff, used in the final sorry)
docs#linear_map.associated_det_of_eq_comp (used in the final sorry)
docs#algebra.norm (proof of non-rationality/injectivity from points to class group centers around this)
and of course docs#class_group. It's fortunate that the class group is defined in the generality of all domains, not just Dedekind domains, so that we can generalize from elliptic curves to possibly singular Weierstrass curves. (We use the constructor that takes an invertible (fractional) ideal in an arbitrary domain rather than the one that takes a nonzero ideal in a Dedekind domain.)
The approach I had in mind was just to directly prove that the span_norm of (X-x, Y-y) is (c(X-x)) for some c, so it can't be the span_norm of (f) for any f, by degree_norm_ne_one, using span_norm (f) = (algebra.norm f) which is already in Anne's branch. For the proof itself I was only left with showing that you can extract c(X-x) from linear combinations of products of X-x and Y-y with their conjugates, but that calculation was already embedded somewhere in XY_ideal_neg_mul, and I was just dreading to refactor that part.
Anne Baanen (Jan 25 2023 at 17:28):
Junyan Xu said:
it depends on branch#ideal-span-norm which is not yet merged into master
I'm happy to announce that I finished the last sorry in span_norm_mul :octopus: This is what I was waiting for before opening the PR, hopefully tomorrow I can get through all of this.
Anne Baanen (Jan 26 2023 at 13:21):
#18303 :tada:
Kevin Buzzard (Jan 26 2023 at 13:27):
Thanks so much!
David Michael Roberts (Jan 27 2023 at 06:23):
It would be nice if this proof was summarised for instance in a blog post, or I guess an article on the arXiv would be even more stable.
Stepan Nesterov (Feb 02 2023 at 21:24):
I know I'm late to the party, but why is characteristic 2 a problem?
I think it suffices to check associativity for the universal curve over Z[A_1,...,A_6], which boils down to the fact that two elements of Z[X,Y,A_1,...,A_6] are equal. But then it suffices to check that they are equal in C[X,Y,A_1,...,A_6], and we can use a short form (and the Weierstrass ℘-function even).
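(To spell out why checking over C suffices: Z[X,Y,A_1,...,A_6] injects into C[X,Y,A_1,...,A_6], so two integral polynomials that become equal over C were already equal over Z, and specialising the A_i then gives the identity over every commutative ring; the same goes after inverting the discriminant, since localisation preserves injectivity.)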
Kevin Buzzard (Feb 02 2023 at 21:45):
The approach using explicit polynomials was just too unwieldy
Kevin Buzzard (Feb 02 2023 at 21:47):
You're right that you just have to check that some very large polynomials are equal after applying the ring axioms many thousands of times. But lean 3 couldn't even prove that two equal polynomials of that size were equal. These things were gigantic when you used the full A1..A6 model. I had a PhD student try and fail to pull this off.
Kevin Buzzard (Feb 02 2023 at 21:49):
Furthermore it was decided that this was not the proof we wanted in mathlib anyway, because it would just be a diabolical file
Patrick Massot (Feb 02 2023 at 21:49):
And this way sounds so ugly compared to a conceptual proof. Would you get any understanding of why the theorem is true from such a proof?
Reid Barton (Feb 02 2023 at 21:53):
This argument is supposed to show that you can reduce to the char 0 case without doing any computation.
Mario Carneiro (Feb 02 2023 at 21:55):
I am definitely interested to know how to do this without doing any computation, since that sounds like the more efficient approach from a proof theoretic and compile time standpoint. Even after reducing to char 0 there is still a pretty hard problem remaining, no?
Reid Barton (Feb 02 2023 at 21:59):
Hmm, is there a problem with having to invert the discriminant?
Reid Barton (Feb 02 2023 at 22:06):
I guess not. Here is a geometric reformulation:
Every elliptic curve is pulled back from the universal elliptic curve over Spec Z[A_1,...,A_6][1/Δ]. So it is enough to check that the multiplication on that universal curve is associative. Since it is separated this equation has to hold on a closed subset, so it's enough to check it over the generic point Spec Q(A_1,...,A_6).
Kevin Buzzard (Feb 02 2023 at 22:06):
I don't really follow the argument. The claim is that the difference between two polynomials in Z[A1,..A6] is in some principal ideal. Marc Masdeu worked out the coefficient which one needs to multiply the generator of the ideal by to make the difference, and the polynomial had hundreds of thousands of coefficients. We now need to prove this identity. Lean 3 can't even state the identity without falling over (so that's why this method doesn't work). The observation being made, as I understand it, is that if we now invert 2 and 3 we can introduce new variables A and B which are messy polynomials in A1..A6, and it turns out that some of the polynomials in question can be rewritten as polynomials in A and B, but the problem is already in the form "prove these two polynomials are equal" and Lean has already become unresponsive: it can't even prove that the coefficient is equal to itself by refl, it times out. Furthermore the polynomial with hundreds of thousands of coefficients needs to be explicitly entered into the proof.
Stepan Nesterov (Feb 02 2023 at 22:10):
But then how does the conceptual proof work if Lean times out just because you typed in "A * B * C"?
Mario Carneiro (Feb 02 2023 at 22:11):
because in the conceptual proof you never have to type that much
Mario Carneiro (Feb 02 2023 at 22:12):
just like how you can prove 1234 * 5678 = 5678 * 1234 by an explicit computation, or you can do it by an induction proof and never touch numbers as big as that
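In Lean terms, the trivial version of that contrast is:
import tactic.norm_num

example : 1234 * 5678 = 5678 * 1234 := by norm_num        -- brute-force evaluation
example : 1234 * 5678 = 5678 * 1234 := mul_comm 1234 5678 -- a general lemma, never evaluates the product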
Kevin Buzzard (Feb 02 2023 at 22:12):
In the conceptual proof there is no "explicit" addition of three points on an elliptic curve anywhere
Kevin Buzzard (Feb 02 2023 at 22:13):
The largest explicit computation is showing that the class of P+Q is linearly equivalent to the product of the classes of P and Q
Stepan Nesterov (Feb 02 2023 at 22:13):
Reid Barton said:
I guess not. Here is a geometric reformulation:
Every elliptic curve is pulled back from the universal elliptic curve E over S = Spec Z[A_1,...,A_6][1/Δ]. So it is enough to check that the multiplication on E is associative. Since E is separated this equation has to hold on a closed subset, so it's enough to check it over the generic point of S.
Yes, this is exactly what I had in mind. Even better: because it suffices to check the equality for the three generic points, you don't have 30 cases; you automatically have that the three points are not on a line, and no other degenerate configuration occurs.
Stepan Nesterov (Feb 02 2023 at 22:13):
Although I guess Lean doesn't know about Zariski density yet
Mario Carneiro (Feb 02 2023 at 22:14):
but it sounds like "it suffices to check the equality" still boils down to equality of some huge polynomials in this case
Stepan Nesterov (Feb 02 2023 at 22:15):
Patrick Massot said:
And this way sounds so ugly compared to a conceptual proof. Would you get any understanding of why the theorem is true from such a proof?
I understand why the theorem is true; it's just a pedagogical problem of teaching it to an undergrad named Lean, who doesn't know what a scheme is :)
Kevin Buzzard (Feb 02 2023 at 22:15):
Lean is well aware of what a scheme is ;-) docs#algebraic_geometry.Scheme
Patrick Massot (Feb 02 2023 at 22:16):
I didn't mean you don't know why it's true, I meant your understanding does not come from that computational proof.
Stepan Nesterov (Feb 02 2023 at 22:16):
Mario Carneiro said:
but it sounds like "it suffices to check the equality" still boils down to equality of some huge polynomials in this case
There were some slick ideas which were proposed for the proof in this topic but then discarded because they would not work in characteristic 2. My observation is just that the char 2 case follows formally from the char 0 case.
Kevin Buzzard (Feb 02 2023 at 22:17):
In this chat we use the word "formally" to mean something a little different :-)
Mario Carneiro (Feb 02 2023 at 22:18):
I would use "by abstract nonsense" there
Stepan Nesterov (Feb 02 2023 at 22:18):
By concrete nonsense which is too large for Lean supposedly
Mario Carneiro (Feb 02 2023 at 22:19):
wait, I'm hoping it's not concrete nonsense since that's the method we want to avoid
Kevin Buzzard (Feb 02 2023 at 22:19):
I think that the proof we ended up with is absolutely the best proof. Checking the char 0 case is still a horrible computation if you want to go the computational route.
Mario Carneiro (Feb 02 2023 at 22:19):
I'll definitely be reading the paper when it comes out :)
Stepan Nesterov (Feb 02 2023 at 22:19):
Kevin Buzzard said:
In the conceptual proof there is no "explicit" addition of three points on an elliptic curve anywhere
So does Lean have the theorem add_on_elliptic_curve_assoc : (E : elliptic_curve K) (A B C : E(K)) : (A * B) * C = A * (B * C), or does it time out?
Kevin Buzzard (Feb 02 2023 at 22:19):
The alternative in the char 0 case is the Picard group proof but we have shown that the Picard group proof works in general
Kevin Buzzard (Feb 02 2023 at 22:20):
Yes we have that proof now, for K any field
Stepan Nesterov (Feb 02 2023 at 22:21):
Can you generalise to any ring by the method I proposed?
Kevin Buzzard (Feb 02 2023 at 22:21):
By "we" I mean David Angdinata and Junyan Xu; I was just cheering from the sidelines
Stepan Nesterov (Feb 02 2023 at 22:21):
I don't get what exactly times out
Kevin Buzzard (Feb 02 2023 at 22:22):
I don't think your question about rings makes sense
Mario Carneiro (Feb 02 2023 at 22:23):
As far as what was concretely shown, I see https://github.com/leanprover-community/mathlib/blob/bd7a6fecf38b4b606508eea4a81d7c1f708d7250/src/algebraic_geometry/elliptic_curve/point.lean#L684 , can someone confirm that this is the general group law for an elliptic curve in any characteristic?
Kevin Buzzard (Feb 02 2023 at 22:23):
What times out is that lean cannot cope with manipulating polynomials with hundreds of thousands of coefficients, and we have no guarantee that it will not time out when manipulating the simpler polynomials which come up in the char 0 case
Mario Carneiro (Feb 02 2023 at 22:23):
It doesn't look exactly like Stepan Nesterov's sketch
Kevin Buzzard (Feb 02 2023 at 22:24):
Yes it's the Picard group proof.
Mario Carneiro (Feb 02 2023 at 22:25):
I thought that there was already a definition of elliptic curves, such that we could write (and sorry) the associativity statement. This W.point type is new with the PR.
Kevin Buzzard (Feb 02 2023 at 22:26):
Yes, they prove something more general.
Kevin Buzzard (Feb 02 2023 at 22:26):
They allow curves with discriminant zero and prove that the nonsingular points on such curves form a group
Mario Carneiro (Feb 02 2023 at 22:27):
I see, so is this a candidate to replace the existing definition?
Kevin Buzzard (Feb 02 2023 at 22:27):
If the discriminant is nonzero then all points are nonsingular
Reid Barton (Feb 02 2023 at 22:28):
But there is a different proof also based on the Picard group, that only works in char ≠ 2 (formalized in Coq).
Kevin Buzzard (Feb 02 2023 at 22:30):
I think they used the short form so it doesn't cover all cases in char 3 either
Stepan Nesterov (Feb 02 2023 at 22:35):
Anyway let me tell you the application that I have in mind for the ring generalization. I am looking for a beginner-friendly short project with elliptic curves now that mathlib knows about them and I am thinking maybe formalizing good reduction will be possible. Rough plan is:
- Generalize the associativity to rings
- Define base change for elliptic curves.
- Define good reduction (we have isomorphisms right?)
- Prove that an elliptic curve over Q (do we have enough number fields and Dedekind domains story?) has good reduction at all but finitely many primes.
Does this look realistic and/or useful for something else?
Kevin Buzzard (Feb 02 2023 at 22:46):
1 doesn't make sense really. Let R be a commutative ring. What is your definition of an elliptic curve over R and what is your definition of the R-points on the curve?
Kevin Buzzard (Feb 02 2023 at 22:46):
3 and 4 look good though.
Stepan Nesterov (Feb 02 2023 at 22:46):
I only really care about R local to be honest
Kevin Buzzard (Feb 02 2023 at 22:47):
Yes we have number fields and Dedekind domains
Kevin Buzzard (Feb 02 2023 at 22:47):
The issue is that we only really have a definition of an elliptic curve over a field
Stepan Nesterov (Feb 02 2023 at 22:47):
So the definition is just the Weierstrass equation
And bad reduction will be potentially worked out using minimal Weierstrass equation, Silverman style
Kevin Buzzard (Feb 02 2023 at 22:48):
Are you happy to restrict to DVRs?
Kevin Buzzard (Feb 02 2023 at 22:48):
Alex Best has formalised Tate's algorithm
Stepan Nesterov (Feb 02 2023 at 22:50):
Yeah, sure, DVR’s are fine.
Kevin Buzzard (Feb 02 2023 at 22:50):
One issue is how much of the theory one really wants to develop for Weierstrass models. I had Diophantine applications in mind (David Ang is working on Mordell-Weil) but beyond some point we want to switch to the scheme definition
Stepan Nesterov (Feb 02 2023 at 22:52):
Where can I find dedekind domains and DVR’s in mathlib?
Kevin Buzzard (Feb 02 2023 at 22:53):
Right now we have the category of schemes, so we have isomorphisms there. We also have this type of elliptic curves over a ring with trivial 12-torsion in the Picard group, given as a Weierstrass model, but I think there's no concept of a morphism, although it would certainly be possible to add one if it isn't there already.
Kevin Buzzard (Feb 02 2023 at 22:53):
Do you know about the API docs? Just search them for Dedekind domain and discrete valuation ring
Kevin Buzzard (Feb 02 2023 at 22:53):
https://leanprover-community.github.io/mathlib_docs/
Kevin Buzzard (Feb 02 2023 at 22:55):
We have class groups of Dedekind domains, factorisation of ideals in extensions etc
Stepan Nesterov (Feb 02 2023 at 22:55):
Ok found them:)
Kevin Buzzard (Feb 02 2023 at 22:55):
I think we have less about DVRs
Stepan Nesterov (Feb 02 2023 at 22:58):
Is it a bad idea to try and use elliptic_curve.variable_change as a working definition for isomorphism of elliptic curves (over fields)?
Kevin Buzzard (Feb 02 2023 at 23:00):
docs#elliptic_curve.variable_change
Stepan Nesterov (Feb 02 2023 at 23:00):
The definition of good reduction I have in mind is:
“Let K be a field, R a subring of K such that R is a discrete valuation ring. An elliptic curve E over K has good reduction if there is an elliptic curve script E over R such that the base change of (script E) to K has a variable change to E and the discriminant of (script E) is invertible in R”
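In Lean this might look roughly as follows (just a sketch with hypothetical names, replacing the subring by an algebra instance, and using docs#weierstrass_curve.base_change and docs#weierstrass_curve.variable_change; import paths approximate):
import algebraic_geometry.elliptic_curve.weierstrass
import ring_theory.discrete_valuation_ring.basic

-- Sketch only, not mathlib code: "E/K has good reduction with respect to the DVR R inside K".
def elliptic_curve.good_reduction (R K : Type*) [comm_ring R] [is_domain R]
  [discrete_valuation_ring R] [field K] [algebra R K] (E : elliptic_curve K) : Prop :=
∃ (ℰ : elliptic_curve R) (u : Kˣ) (r s t : K),
  (ℰ.to_weierstrass_curve.base_change K).variable_change u r s t = E.to_weierstrass_curve
    ∧ is_unit ℰ.to_weierstrass_curve.Δ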
Kevin Buzzard (Feb 02 2023 at 23:02):
The "discriminant of \E is invertible in R" part is unnecessary because that's part of the definition of being an elliptic curve over R. Otherwise that looks like a good definition to me
Kevin Buzzard (Feb 02 2023 at 23:03):
It's a definition of "good reduction at the place corresponding to R", not "good reduction" (there could be many such Rs)
Stepan Nesterov (Feb 02 2023 at 23:04):
Ok cool!
I just found out that we have a Lean club here at Stanford which meets on Tuesdays, so I can try and talk to people who can actually code in Lean and can help me with compiling :)
I’ll start writing something then.
Kevin Buzzard (Feb 02 2023 at 23:04):
We have the concept of a valuation subring of K, you might want to use that instead, in situations where you know the valuation is guaranteed to be discrete.
Kevin Buzzard (Feb 02 2023 at 23:05):
Oh that's good to hear about the lean club!
Stepan Nesterov (Feb 02 2023 at 23:06):
Yes, the next definition is “E over a number field has good reduction at a place p if it has good reduction corresponding to O_K,p”
David Ang (Feb 03 2023 at 00:20):
I think reduction types will be very very useful for a conceptual proof of Mordell-Weil, at least based on Silverman’s argument; back then this was something I skipped because it’s possible to take a giant shortcut to go straight to descent
David Ang (Feb 03 2023 at 00:48):
Stepan Nesterov said:
Anyway let me tell you the application that I have in mind for the ring generalization. I am looking for a beginner-friendly short project with elliptic curves now that mathlib knows about them and I am thinking maybe formalizing good reduction will be possible. Rough plan is:
- Generalize the associativity to rings
- Define base change for elliptic curves.
- Define good reduction (we have isomorphisms right?)
- Prove that an elliptic curve over Q (do we have enough number fields and Dedekind domains story?) has good reduction at all but finitely many primes.
Does this look realistic and/or useful for something else?
2 has docs#weierstrass_curve.base_change
Antoine Chambert-Loir (Feb 03 2023 at 09:38):
Kevin Buzzard said:
We have the concept of a valuation subring of K, you might want to use that instead, in situations where you know the valuation is guaranteed to be discrete.
Why would you want to stick to discrete valuations?
Kevin Buzzard (Feb 03 2023 at 11:09):
I only put in that clause because I was scared that the definition of elliptic curve which we have ("long Weierstrass equation") was not mathematically correct for some random valuation ring (because I don't know anything about Picard groups of arbitrary valuation subrings of fields). I convinced myself that as long as Pic(R)[12]={1} the Weierstrass definition is OK and this condition holds for DVRs and fields.
Junyan Xu (Feb 03 2023 at 19:14):
Another issue is the definition of R-points for a general commutative ring R. Although people do talk about integral points on curves, if we want the R-points of E to form a group then we have to consider the scheme-theoretic points E(R), which are morphisms from Spec R to E. If E is defined by a Weierstrass equation (as is the case in mathlib), E(R) is a subset of P^2(R). If R=F is a field then Spec R is one point, so its image in P^n must lie in some affine chart, so our current definition weierstrass_curve.point (as the affine points in the (x,y) plane union a point at infinity) works, but in general P^n(R) classifies R-linear epimorphisms from R^{n+1} to an invertible R-module up to isomorphism, which doesn't admit a simple description using homogeneous coordinates (note that Eisenbud-Harris made an error here), and that's why docs#projectivization was only defined for division rings. When Pic(R)=0, all invertible modules are isomorphic to R, and by considering images of the free generators of R^{n+1} we get (n+1) elements of R that together generate R (by surjectivity) up to simultaneous multiplication by a unit, and these give the familiar homogeneous coordinates. If R is local or a PID, all projective modules are free (the finitely generated case is easier), in particular the rank 1 (invertible) ones, so we do get coordinates. (When R is local, surjectivity says at least one of the coordinates is a unit (which we may as well choose to be 1), and when R is a PID it says the GCD of the coordinates is 1.) I guess we should first develop these in mathlib before talking about E(R) over your favorite ring R. (Or maybe you can just work over the field of fractions?)
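To make the Pic(R) = 0 description concrete, here is a rough Lean sketch of "homogeneous coordinates for P^2(R)" in that case (made-up names, mimicking docs#projectivization but over a ring; import path approximate):
import ring_theory.ideal.basic

variables (R : Type*) [comm_ring R]

/-- A representative: a triple of elements of R generating the unit ideal (surjectivity). -/
def P2_rep := {v : fin 3 → R // ideal.span (set.range v) = ⊤}

/-- Two representatives give the same point iff they differ by simultaneous
multiplication by a unit of R. -/
def P2_rel (v w : P2_rep R) : Prop := ∃ u : Rˣ, (u : R) • (v : fin 3 → R) = w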
Junyan Xu (Feb 04 2023 at 07:15):
Some updates: Pic(R)=0 also holds for all UFDs (stacks#0BCH) and the argument here seems to show it holds more generally for all GCD domains. See this paper for a study of the group of points valued in finite rings, which has cryptographic applications. These rings are not necessarily domains, but since they're direct products of local rings, we still have Pic=0 and P^n(R) still admits the simple description via homogeneous coordinates.
In general, if R is a domain, K its field of fractions and X a separated scheme over R (in particular P^n_R), then X(R) → X(K) is injective, according to the argument here, because the generic point of Spec R is dense in Spec R. So any R-point can be represented as a K-point, though in the case of elliptic curves in Weierstrass form, it's not immediately clear X(R) is closed under addition in X(K). For surjectivity of P^n(R) → P^n(K), I think all nonzero ideals of R generated by (n+1) elements need to be invertible, which is true for valuation rings (where all f.g. ideals are principal, and surjectivity also follows from the valuative criterion since P^n is proper) and for Dedekind domains. My conclusion is that for rings we care about it's probably good enough to work with P^n(K). (Maybe that has always been the plan?)
Kevin Buzzard (Feb 04 2023 at 07:42):
I guess the long term plan is to move to a scheme-theoretic definition of elliptic curve and use algebraic Riemann Roch to relate it to what we have. But for Diophantine applications there's no point going via this route
Antoine Chambert-Loir (Feb 04 2023 at 11:41):
Projective modules over a local ring are free (Kaplansky).
Stepan Nesterov (Feb 07 2023 at 02:55):
Is docs#weierstrass_curve.equation supposed to be the set of points of an elliptic curve? Because a function R → R → Prop is definitionally the same as a subset of R × R, or something?
But that can't be right, though: I need the set of all these plus a point at infinity. Is this not in docs#algebraic_geometry.elliptic_curve.weierstrass?
Junyan Xu (Feb 07 2023 at 03:21):
The type of points is defined here, it's not merged yet. (It's not quite the type of points when R isn't a field, in which case you can probably use W⟮fraction_ring R⟯.)
Junyan Xu (Feb 07 2023 at 03:28):
#18000 still needs some polishing, but we should be able to merge #17194 relatively quickly (which also contains the definition point), once we merge some latest changes into the branch. (Maybe @David Ang could do that soon; there are a few things I want to refactor, but that could be done in a later PR.)
Stepan Nesterov (Feb 07 2023 at 03:43):
I would like to define the reduction homomorphism in the following way:
Let E be an elliptic curve over K, let v be a discrete valuation on K, with ring of integers O and residue field k.
- Define E(O).
- Define a map E(O) → E(K) sending [x:y:z] to (x/z, y/z) unless z = 0, in which case it goes to the point at infinity.
- Prove that this map is surjective (by clearing denominators) and injective (this is the most involved step and uses invertibility of the discriminant).
- Define the reduction map E(K) → E(k) as the inverse of this map followed by coordinatewise reduction O → k.
Stepan Nesterov (Feb 07 2023 at 03:44):
Should I just make an auxiliary type for my E(O) then? It's not mathematically correct, of course, but it works fine for DVRs.
Stepan Nesterov (Feb 07 2023 at 03:45):
Also ideas which would work less painfully in Lean are appreciated, of course :)
Johan Commelin (Feb 07 2023 at 03:53):
I think an aux type for E(O) would indeed make sense
Junyan Xu (Feb 07 2023 at 05:12):
Sorry, this is a better link to the definition of point; the link to the PR automatically hides the point.lean file because it's too long.
1 & 2. We don't have homogenization API yet in mathlib (it's in the flt-regular project I think), so you probably have to define the polynomial by hand as a term of type O[X][X][X]. You can see here for a way to introduce notation in order to write O[X][Y][Z]. I'd suggest you define E(O) as a subset of P^2(O), for which you can mimic docs#projectivization and docs#projectivization_setoid; you'd then use docs#quotient.lift to descend the equation to a predicate on the quotient and to define the map from E(O) to E(K).
3. I don't see how injectivity uses the invertibility of the discriminant ... It's also pretty trivial, I think.
I suspect instead of working with the quotient type, it might be easier to work with a canonical representative for each equivalence class: if Z is a unit then set Z=1, otherwise if Y is a unit then set Y=1, and otherwise set X=1. This would also make the map from E(O) to E(k) easier to define; if Z=1 then it's some X Y and otherwise it's zero. (That doesn't seem pleasant to work with, unfortunately.)
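As a very rough Lean sketch of that canonical-representative idea (hypothetical names, nothing from mathlib, ignoring the curve equation and the unit side-conditions; import path approximate):
import algebra.hom.ring

-- A point of E(O) stored via a normalised coordinate; O is the valuation ring, k the
-- residue field, and res : O →+* k the quotient map.
inductive EO_rep (O : Type*) [comm_ring O]
| at_Z (x y : O) : EO_rep   -- Z is a unit, normalised to Z = 1: the point [x : y : 1]
| at_Y (x z : O) : EO_rep   -- Z not a unit but Y is: the point [x : 1 : z]
| at_X (y z : O) : EO_rep   -- only X is a unit: the point [1 : y : z]

/-- Reduction: the Z = 1 points go to affine points of W(k), everything else goes to
the point at infinity (the zero of weierstrass_curve.point). -/
def EO_rep.reduce {O k : Type*} [comm_ring O] [comm_ring k] (res : O →+* k) :
  EO_rep O → option (k × k)
| (EO_rep.at_Z x y) := some (res x, res y)
| _                 := none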
Stepan Nesterov (Feb 07 2023 at 05:33):
Oh yes, never mind, Weierstrass curves are always proper as we defined them, so of course E(O) → E(K) is bijective.
The place where we actually need the hypothesis on the discriminant is in checking that the reduction map is a homomorphism, because otherwise something reduces to a singular point of E(k) and we can't even add it to anything. This might be tricky to check depending on how exactly the definition of addition works. Is it just by explicit addition formulas?
Junyan Xu (Feb 07 2023 at 05:49):
Addition is defined in affine coordinates in multiple steps in point.lean, with add_X, add_Y and slope as the key definitions: x₃ := W.add_X x₁ x₂ $ W.slope x₁ x₂ y₁ y₂ and y₃ := W.add_Y x₁ x₂ y₁ $ W.slope x₁ x₂ y₁ y₂, as you can see here. Not sure how easy it is to show E(K)->E(k) is a homomorphism ...
David Ang (Feb 07 2023 at 15:58):
Junyan Xu said:
#18000 still needs some polishing, but we should be able to merge #17194 relatively quickly (which also contains the definition point), once we merge some latest changes into the branch. (Maybe David Ang could do that soon; there are a few things I want to refactor, but that could be done in a later PR.)
#17194 should be more-or-less ready now.
Junyan Xu (Feb 10 2023 at 08:53):
The proof that the reduction map is a homomorphism is apparently quite complicated (see also Silverman, Proposition VII.2.1, where the discriminant need not be a unit and we restrict to points that don't reduce to the singular point), so I tried and worked out a proof using ideal class groups, since we already have an injective homomorphism from the rational points into the ideal class group (isomorphic to the Picard group for domains). However, I have trouble making the argument work in residue characteristic two.
The Picard group is a contravariant functor from schemes to abelian groups, or a covariant functor from commutative rings, so we get homomorphisms between Picard groups of coordinate rings Pic(O[E]) → Pic(K[E]) and Pic(O[E]) → Pic(k[E]) given by base change of modules, and we can show the former is injective because O is a PID (moreover, it agrees with docs#ideal.map because O → K is a localization map). Therefore, if for each P = (x,y) in E(K), the ideal class of (X-x, Y-y) lies in the image of Pic(O[E]), then we can map the element of Pic(O[E]) to Pic(k[E]), and we just need to show the final result is the ideal corresponding to the reduction of P.
Let's write P=(x,y)=[x:y:1]=[u:v:w] with u,v,w in O and at least one of them a unit. If w is a unit, then x,y are in O and we just take the ideal (X-x, Y-y) of O[E]: its product with (X-x, Y+a1*x+a3+y) is equal to the ideal generated by the two partial derivatives of the Weierstrass polynomial, which contains the discriminant, so it's the unit ideal if the discriminant is a unit. And since (X-x, Y-y) is a free O-module, it's easy to see the canonical map (X-x,Y-y) ⊗O k → (X-x~,Y-y~) is an isomorphism, where (x~,y~) is the reduction of (x,y), i.e. the image of (X-x, Y-y) in Pic(k[E]) indeed corresponds to the point (x~,y~).
If w is not a unit, then u must also not be a unit, since in the homogeneous equation, every term except u^3 contains w as a factor. Therefore v must be a unit, so P = [u:v:w] reduces to [0:1:0], the point at infinity. If w=0, then u=0 as well, so P itself is the point at infinity, corresponding to the principal ideal class, so we can just lift it to the unit ideal of O[E], which reduces to the unit ideal of k[E].
Otherwise, we have val(u) > 0, val(v) = 0, val(w) > 0, and val(y) = val(v/w) < 0. Consider the inhomogeneous equation y^2 + a1 * x * y + a3 * y = x^3 + a2 * x^2 + a4 * x + a6. If val(x) ≤ val(y) < 0, then val(RHS) = 3val(x) < 2val(x) ≤ min{2val(y), val(x)+val(y), val(y)} ≤ val(LHS), contradiction. Therefore val(x) > val(y), which implies that val(LHS) = 2val(y) and val(RHS) = 3val(x) are both < 0 and equal.
We have the freedom to choose a nonzero factor t when lifting (X-x, Y-y) to an (fractional) ideal (t(X-x), Y-y) of O[E]. We can check its product with (t(X-x), Y+a1 * x+a3+y) is the principal ideal t(X-x) multiplied by (t(X-x), Y-y, Y+a1 * x+a3+y, (X^2 + (x + a2)X + (x^2 + a2 * x + a4) − a1 * Y)/t). This ideal clearly contains 2y+a1 * x+a3, whose valuation is equal to val(y) if 2 is a unit. So this ideal is generated by the single element y if it divides all coefficients: it suffices to have val(y) ≤ val(tx) and val(y) ≤ val(x^2/t). We can take t=y/x to make both equalities hold, because 2val(y)=3val(x) (if 2 is not a unit, 2val(2y+a1 * x+a3)>3val(x), so this doesn't work).
Now we've lifted the invertible ideal (X-x, Y-y) to the invertible ideal ((y/x)(X-x), Y-y) = y(X/x-1, Y/y-1), which is in the same class as I = (bX-1, cY-1) = (bX-1, bX-cY), where b=1/x, c=1/y. Tensoring with k=O/m we get I/mI. Since b and c are in the maximal ideal m, b=b * bX and c=c * bX modulo mI. Therefore bX-cY = bX * (bX-cY) lies in mI, so I/mI is generated by (bX-1) only, hence principal, corresponding to the point at infinity, which P reduces to.
If someone could figure out how to treat the case char k = 2, we would have a complete proof.
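For concreteness, a numerical sanity check of the w-nonunit case (not part of the argument, and using the same point as the Macaulay2 session further down): take P = (x, y) = (1/4, 1/8) on y^2 = x^3 with R = Z localized at 2. Then val(x) = -2 and val(y) = -3, so indeed val(x) > val(y) and 2val(y) = -6 = 3val(x). The factor t = y/x = 1/2 gives val(t*x) = val(1/8) = val(y) and val(x^2/t) = val(1/8) = val(y), so both inequalities val(y) ≤ val(tx) and val(y) ≤ val(x^2/t) hold, with equality.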
Kevin Buzzard (Feb 10 2023 at 09:46):
I haven't thought about this too hard but can you explain "We have the freedom to choose a nonzero factor t when lifting"? y isn't in O, I'm a bit confused about what (t(X-x),Y-y) means.
Junyan Xu (Feb 10 2023 at 14:39):
I wrote "(fractional) ideal", which is a kind of O[E]-submodule of Frac(O[E])=Frac(K[E]), so it makes sense to write generators in K[E]; of course we could replace (t(X-x), Y-y) by an integral ideal in the same class: just multiply it by some common denominator of tx and y, and that's what I did when I wrote ((y/x)(X-x), Y-y) = y(X/x-1, Y/y-1) and replaced it with the integral ideal (X/x-1, Y/y-1) in the same class.
Kevin Buzzard (Feb 10 2023 at 20:05):
I'm just missing something really stupid. Why isn't it (X - x, Y - y)? Where is this t coming from?
Junyan Xu (Feb 10 2023 at 22:51):
You can take (X-x, Y-y) (and it's what I took when w is a unit), but when w isn't a unit, I think the resulting fractional ideal isn't invertible. We want to extend the invertible sheaf / projective module (X-x, Y-y) on Spec K[E] to a projective module on Spec O[E], but not every extension works; actually we have the freedom to choose a,b,c,d in GL(2,K) and use (a(X-x)+b(Y-y), c(X-x)+d(Y-y)) which all boils down to (X-x, Y-y) when you tensor with K, but I think most choices won't give you projective modules.
I think the most natural choice is Hom_{O[E]}((X-x, Y+a1*x+a3+y), O[E]) (sort of double dual) and it should work if anything works at all, but I need to work out the details. (I think it amounts to computing a colon ideal, or the intersection of two principal ideals.) Another approach is to decompose K[E]^2 into a direct sum of (X-x, Y-y) with another summand, and see how we can lift it to O[E], or maybe try to lift the idempotent matrix representing the projection to a summand.
Another thought is that maybe it suffices to come up with a lift that becomes projective after tensoring with k (then it should be free, isomorphic to k[E] since it's supposed to be principal); since tensoring with K already produces an invertible module (X-x, Y-y), this probably implies the lift is itself projective, or else we might come up with a substitute for Pic(O[E]) that includes more than invertible modules.
Junyan Xu (Feb 11 2023 at 05:07):
Let me add that if the discriminant is a unit in O then Spec O[W] should be regular, and so Hom_{O[E]}((X-x, Y+a1*x+a3+y), O[E]) should indeed be projective, according to stacks#0AV3 (dual modules over a Noetherian domain are reflexive), this MO answer (reflexive implies projective if global dimension <=2) and stacks#00OE (regular of Krull dimension 2 implies global dimension 2).
We don't really need to formalize these advanced results for our purpose, as it suffices to come up with the correct candidate for the lift and verify it's indeed invertible. But it would be nicer if we could prove the statement in Silverman which allows singularities, saying that the points of E(K) that reduce to nonsingular points of E(k) form a subgroup, and the reduction map is still a homomorphism.
Kevin Buzzard (Feb 11 2023 at 10:41):
It's important to allow singularities if you want to use these techniques to go on and formalise the formal group; in that approach you keep scaling the a_i by pi^Ni and then reduce, in order to understand the solutions to the equation near the origin
Junyan Xu (Feb 12 2023 at 20:29):
I've now computed an example (in mixed characteristic 2) of the dual module in Macaulay2, and verified it's indeed invertible. Hopefully it's not too hard to work out the general form of the lift from here. It might be necessary to compute an example in equal characteristic 2 to figure out the general form there.
Macaulay2, version 1.21
with packages: ConwayPolynomials, Elimination, IntegralClosure, InverseSystems, Isomorphism, LLLBases, MinimalPrimes, OnlineLookup, PrimaryDecomposition, ReesAlgebra, Saturation, TangentCone
i1 : R = ZZ[x,y]/(y^2-x^3)
o1 = R
o1 : QuotientRing
// This is supposed to be O[W], where O is a DVR with 2 non-invertible
// but for convenience we use Z rather than its localization at (2)
i4 : intersect(ideal(4*x-1), ideal(8*y-1))
// (4x-1,8y-1) corresponds to the point (1/4,1/8) on y^2=x^3
// we want to compute the dual of (4x-1,8y-1), which is isomorphic to the intersection of
// the fractional ideals generated by the inverse of (4x-1) and by the inverse of (8y-1),
// which is isomorphic to the intersection of the integral ideals (8y-1) and (4x-1).
o4 = ideal(64y^2-1, 8xy+16y^2-x-2y)
// factorizes as (8y-1)*(8y+1,x+2y)
// so (8y+1,x+2y) is the candidate for an invertible lift of (x-1/4,y+1/8)
// it's clearly a lift: x-1/4 = -1/4*(8y+1)+(x+2y), y+1/8 = 1/8*(8y+1)
o4 : Ideal of R
i5 : ideal(8*y+1,x+2*y)*ideal(8*y-1,x-2*y)
// Take the product of candidate lift of (x-1/4,y+1/8) with lift of (x-1/4,y-1/8)
o5 = ideal(64y^2-1, 8xy-16y^2+x-2y, 8xy+16y^2-x-2y, x^2-4y^2)
o5 : Ideal of R
i7 : mingens I
// simplify the ideal
o7 = (4x-1, 32y^2+2x-1, x^2-4y^2)
// This ideal is actually principal, generated by (4x-1): x^2-4y^2 = x^2-4x^3 = -x(4x-1),
// and 32y^2+2x-1 = 8x^2+2x-1 - 8(x^2-4y^2), 8x^2+2x-1 = (4x-1)(2x+1),
// but Macaulay2 isn't able to simplify it to a single generator, or I haven't found the way.
// Therefore, both the lifts (8*y+1,x+2*y) and (8*y-1,x-2*y) are invertible.
Junyan Xu (Feb 12 2023 at 21:00):
Actually I first tried Singular (on CoCalc) but didn't have as much luck:
SINGULAR / Development
A Computer Algebra System for Polynomial Computations / version 4.1.1
0<
by: W. Decker, G.-M. Greuel, G. Pfister, H. Schoenemann \ Feb 2018
FB Mathematik der Universitaet, D-67653 Kaiserslautern \ Debian 1:4.1.1-p2+ds-4build2
> ring R = integer,(x,y), dp;
> ideal w = y2 - x3;
> qring r = std(w);
> ideal i = 4x-1;
> ideal j = 8y-1;
> intersect(i,j);
_[1]=32xy-4x-8y+1
_[2]=8x2y+16xy2-x2-2xy
// factorizes as (8y-1)*(4x-1,x(x+2y))
// multiply the second factor with its conjugate and simplify:
> ideal a = 4x-1, x*(x+2y);
> ideal b = 4x-1, x*(x-2y);
> simplify(a*b,63);
_[1]=16x2-8x+1 = (4x-1)^2
_[2]=4x3-8x2y-x2+2xy = x(4x-1)(x-2y)
_[3]=x4-4x2y2 = x^2(x+2y)(x-2y)
// doesn't include the generator (4x-1)
// but Macaulay2 is able to find (4x-1) also in this ideal:
i9 : J = ideal(4*x-1, x*(x+2*y)) * ideal(4*x-1, x*(x-2*y))
o9 = ideal(16x^2-8x+1, -8x^2y-x^2+2xy+4y^2, 8x^2y-x^2-2xy+4y^2, -4x^2y^2+xy^2)
o9 : Ideal of R
i10 : mingens J
o10 = (4x-1, 16y^2-x, x^2-4y^2)
Junyan Xu (Feb 12 2023 at 21:13):
I'm guessing (X/x-1, X-xY/y) may be invertible in any characteristic (in the w nonunit case)
Junyan Xu (Feb 21 2023 at 08:26):
Junyan Xu said:
We have the freedom to choose a nonzero factor t when lifting (X-x, Y-y) to an (fractional) ideal (t(X-x), Y-y) of O[E]. We can check its product with (t(X-x), Y+a1 * x+a3+y) is the principal ideal (X-x) multiplied by (t^2(X-x), t(Y-y), t(Y+a1 * x+a3+y), X^2 + (x + a2)X + (x^2 + a2 * x + a4) − a1 * Y).
I think I worked this out. It turns out t=y/x always works, even in residue characteristic 2. Since val(t^2)=val(x), we actually have xX=x^2 and X^2=x^2 modulo t^2(X-x), so the last generator can be replaced by 3x^2+2a2 * x+a4-a1 * y, whose valuation is 2val(x) if 3 is invertible, and it's easy to check this is the lowest valuation among all coefficients of all generators. The case where 2 is invertible was handled before (where we can get val(t(2 * y + a1 * x + a3)) = 2val(x)). Since one of 2 and 3 must be invertible (in any local ring, in particular DVR), we conclude this ideal is always generated by x^2.
Junyan Xu (Feb 21 2023 at 08:29):
Fun exercise: for a long time I thought (4x-1,8y-1) and (4x-1,x-2y) are different ideals, but in fact they're the same ...
Junyan Xu (Feb 21 2023 at 08:35):
I'll try to work out a skeleton of a formalization in the following days; the first step would be to define Picard group of a comm_ring, which I plan to call Pic_Spec, and will be defined to be projective modules of rank 1 (free of rank 1 when localized at any prime ideal) (update: let's use modules with a tensor-inverse) up to isomorphism (which lives in one universe higher), and then show that for an integral domain this is isomorphic to docs#class_group, and the functoriality of Pic_Spec.
David Ang (Feb 22 2023 at 01:35):
David Michael Roberts said:
It would be nice if this proof was summarised for instance in a blog post, or I guess an article on the arXiv would be even more stable.
https://arxiv.org/abs/2302.10640
Junyan Xu (Feb 22 2023 at 03:46):
Thanks to David for writing the paper (I mostly only provided suggestions based on his draft) and taking care of submission (to ITP 2023 and arXiv)!
It's not a purely mathematical exposition as some people might hope, but also includes formalization details. I might consider writing a mathematical exposition of the associativity proof together with the reduction homomorphism proof (or any other applications we find in the meantime), to be titled (tentatively) "Ideal treatment of elliptic curves".
Oliver Nash (Feb 22 2023 at 10:16):
It's great to see this paper. Congratulations @David Ang and @Junyan Xu ! I for one would be very interested to read the hypothetical companion "Ideal treatment of elliptic curves" paper.
Junyan Xu (Mar 07 2023 at 02:52):
Sorry for the belated update. I discovered two gaps when I started to formalize the arguments, but fortunately they can be fixed while keeping the argument elementary. The same constructions are used to fill both gaps, so the proof doesn't get much longer, and still seems simpler than Silverman's or Knapp's proof (on paper or to formalize).
Recall our goal is to show that if W is a Weierstrass curve over a DVR R with field of fractions K and residue field k=R/p, then the points in W(K) that reduce to nonsingular points in W(k) form a subgroup of the group of nonsingular points in W(K), and the reduction map is a homomorphism when restricted to this subgroup. (In particular, if we take R to be the localization of Z at a prime p, then we get the reduction homomorphisms from W(Q) to W(Z/pZ). More generally, for any integral domain R that is regular at a height 1 prime p, we get a reduction homomorphism W(Frac(R)) → W(Frac(R/p)).)
The first gap is the injectivity of Pic(R[W]) → Pic(K[W]) induced by the inclusion of coordinate rings. The argument I originally had in mind is too simple to work, because it doesn't even use invertibility of the R[W]-module. It's true according to a theorem of Raynaud in EGA IV addenda because Pic(R)=0. Fortunately, it is possible to extract an elementary argument in terms of rings and fractional ideals from the scheme-theoretic proof involving Cartier divisors.
The key theorem EGA IV 21.1.8 translates to the following: to check two fractional ideals are equal, it suffices to consider their localizations at the primes whose local rings have depth 1. In our situation, these are either the generic point of the special fiber of Spec R[W], or contained in the generic fiber Spec K[W], so in particular we have that R[W] is the intersection of K[W] with the localization R[W]_p at the prime ideal p generated by the uniformizer of R (in which all polynomials with some coefficient not divisible by p are inverted). (p is prime because the quotient (R/p)[W] is an integral domain, corresponding to the condition that the fibers are integral in Raynaud's theorem.) This can be checked directly, and it's easy to see that R[W]_p is also a DVR with the same uniformizer as R, using the fact that R[W] is a free module over R, which implies the faithfully flat condition in Raynaud's theorem.
Now, for an invertible fractional ideal I of R[W] such that KI is a principal K[W]-module (i.e. its image in Pic(K[W]) is trivial), we may divide I by the generator and assume KI = K[W]. On the other hand, R[W]_p I must be generated by a (integral) power of the uniformizer, and we may divide I by the power and assume R[W]_p I is the unit ideal. The power is a nonzero constant and hence a unit in K[W], so we still have KI = K[W]; K[W] and R[W]_p are "orthogonal" in some sense. Therefore, I is contained in both KI = K[W] and R[W]_p I = R[W]_p and therefore contained in R[W]. On the other hand, I⁻¹ = R[W]/I ⊆ (K[W]/I) ∩ (R[W]_p/I) = (K[W]/KI) ∩ (R[W]_p/R[W]_p I) = KI⁻¹ ∩ (R[W]_p I)⁻¹ = K[W] ∩ R[W]_p = R[W], where M/N here means docs#submodule.has_div with R := R[W] and A := Frac(K[W]), so I = I⁻¹⁻¹ ⊇ R[W] as well. This closes the first gap.
As for the second gap, recall my earlier argument showed that if two points of W(K) both reduce to nonsingular points in W(k), AND if their sum also reduces to a nonsingular point, then the sum of the reductions equals the reduction of the sum. But in order for the reduction-nonsingular points to form a group, we need to show the second condition is unnecessary, i.e. the sum of two reduction-nonsingular points is always reduction-nonsingular. (If W/k is an elliptic curve or more generally if W(k) contains no singular point, then this is vacuously true.) This is a more general version of docs#weierstrass_curve.nonsingular_add.
The idea is to show that if (x,y) is nonsingular but reduction-singular, then the line bundle (X-x,Y-y) on Spec K[W] does not extend to a line bundle on Spec R[W], i.e. it's not in the image of Pic(R[W]) → Pic(K[W]). Once we have this, since we already constructed such an extension when (x,y) is reduction-nonsingular, if (x,y) is the sum of two reduction-nonsingular points, then (X-x,Y-y) extends to the tensor product of the extensions for those two points, so (x,y) can't be reduction-singular.
My proof of non-extendability proceeds by first realizing the hypothetical extension as an invertible fractional ideal I of R[W] using class_group R[W] ≃* Pic_Spec R[W], and then modifying I like before so that KI is the K[W]-ideal (X-x,Y-y) and R[W]_p I = R[W]_p, while keeping I invertible. I is contained in the intersection of (X-x,Y-y) and R[W]_p, which is contained in K[W] ∩ R[W]_p = R[W]. But an element of K[W] lies in the K[W]-ideal (X-x,Y-y) iff it evaluates to 0 at (x,y), and the same is true for an element of R[W] to lie in the R[W]-ideal (X-x,Y-y). Therefore the intersection is the R[W]-ideal (X-x,Y-y) and it contains I. On the other hand, I⁻¹ ⊆ KI⁻¹ ∩ (R[W]_p I)⁻¹ = (X-x)⁻¹(X-x,Y-y') ∩ R[W]_p = (X-x)⁻¹((X-x,Y-y') ∩ R[W]_p), which equals the R[W]-submodule (X-x)⁻¹(X-x,Y-y'), where (x,y') is the conjugate of (x,y). Therefore, II⁻¹ is contained in (X-x)⁻¹(X-x,Y-y)(X-x,Y-y') = (X-x,Y-y,W_X(x,y),W_Y(x,y)) [*], but this can't be the unit ideal in R[W], because neither of the derivatives W_X(x,y) and W_Y(x,y) can be a unit in R, because the reduction of (x,y) to k=R/p is singular. This contradicts the invertibility of I.
[*] This computation is used in XY_ideal_neg_mul, and will be used to show (X-x,Y-y) is an invertible R[W]-module if (x,y) is reduction-nonsingular. A more general version with X-x replaced by t(X-x) was used to show ((y/x)(X-x),Y-y) is invertible when (x,y) reduces to O.
I am posting these in the hope that people might come up with some simplifications.
Junyan Xu (Mar 07 2023 at 03:05):
I came up with these proofs last Monday but didn't have much time to continue the formalization. I defined Pic_Spec last week, and plan to add facts about base change of modules and projective modules next (to un-sorry the Pic_Spec functor and the isomorphism with class_group), and then a robust API around the rings R[W], K[W] and R[W]_p and submodules inside Frac(K[W]).
Those needed facts about tensor product, base change of modules and projective modules are:
- tensoring with an arbitrary module is right exact (two statements: surjectivity, and kernel = image), to show (R/I) ⊗[R] M ≃ₗ[R/I] M/IM.
- tensoring with a projective module preserves injectivity (i.e. projective modules are flat), used to embed invertible R-modules into K, and to show the multiplication in Pic_Spec agrees with that in class_group.
- invertible modules and invertible fractional ideals are projective (working on this).
- A ⊗[R] R ≃ₗ[A] A and A ⊗[R] (M ⊗[R] N) ≃ₗ[A] (A ⊗[R] M) ⊗[A] (A ⊗[R] N) for [algebra R A], used to show Pic_Spec is a functor and K ⊗[R] M is 1-dimensional for an invertible R-module M. I'm using docs#category_theory.Module.extend_scalars by @Jujian Zhang so I'm wondering if you already have these somewhere.
Kevin Buzzard (Mar 07 2023 at 07:14):
In Cassels' book the fact that the sum of two things-with-nonsingular-reduction has nonsingular reduction is: the line through them reduces to the line through the reductions, and a line through two nonsingular points hits the cubic at a third nonsingular point.
Kevin Buzzard (Mar 07 2023 at 07:17):
PS seeing posts like this and other potential nontrivial mathematics which we're dealing with now in algebraic geometry was basically beyond my (and I think everyone in the community's) wildest dreams five years ago.
Junyan Xu (Mar 07 2023 at 08:10):
Kevin Buzzard said:
In Cassels' book the fact that the sum of two things-with-nonsingular-reduction has nonsingular reduction is: the line through them reduces to the line through the reductions, and a line through two nonsingular points hits the cubic at a third nonsingular point.
I think this is more or less what Knapp/Langlands and Silverman do as well. The problem is that you need to count multiplicities when the two points agree upstairs or downstairs. Silverman does a case bash and relegates many cases to exercises. Knapp/Langlands apparently avoids that but the proof also doesn't look trivial to formalize.
Bolton Bailey (Apr 09 2023 at 12:53):
Hello elliptic curve people. I have been going over my project thinking of ways I could extend it, and it occurred to me I've seen a lot of discussion about elliptic curves here. While the cryptography that is formalized there is based on elliptic curve pairings, I avoided formalizing the actual elliptic curve operations that would give you the ability to run the protocols. I have a few questions about what it would take to make that happen.
It seems mathlib doesn't have implementations of the Weil or Tate pairings. How much code would be needed to implement those and prove the bilinearity property? Would it be a big undertaking on the scale of all your efforts to prove the group law, or would it be easier somehow? On computability for elliptic curves: is it actually possible to execute code that computes the elliptic curve operations, or are there issues with that like there are for polynomials? Has any of this code been ported to Lean 4 yet?
Bolton Bailey (Apr 09 2023 at 12:54):
Sorry if this is the wrong place to ask this, I know elliptic curve pairings obviously are based on elliptic curves, but I don't know how familiar or interesting they are to people outside cryptography.
Kevin Buzzard (Apr 09 2023 at 12:58):
I would imagine that what we have for elliptic curves is optimised for proving rather than running. For example you probably don't want to use the full Weierstrass form if you actually want to do computations. On the other hand you can prove that every elliptic curve over every field can be put into this form, which is something we want.
Reid Barton (Apr 09 2023 at 12:59):
Computability is never a problem, you write the code you actually want to run and then prove it is equal to the mathlib definition. (You probably want to do this even if the mathlib definition is already computable.)
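A toy version of that pattern, just to show the shape of it:
import tactic.ring

def double_spec (n : ℕ) : ℕ := n + n   -- the definition the library proves lemmas about
def double_fast (n : ℕ) : ℕ := 2 * n   -- the definition you actually run

lemma double_fast_eq_spec (n : ℕ) : double_fast n = double_spec n :=
by { unfold double_fast double_spec, ring }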
Kevin Buzzard (Apr 09 2023 at 13:03):
The problem is that the mathlib definition of an elliptic curve is already suboptimal unless you actually care about characteristic <= 3
Yaël Dillies (Apr 09 2023 at 13:06):
Surely it's just a matter of writing a norm_num extension for the cases you care about?
Kevin Buzzard (Apr 09 2023 at 14:09):
The issue is that the very model is wrong for computation.
Reid Barton (Apr 09 2023 at 14:25):
Are you talking about the specification of the curve itself, a1 ... a6?
Kevin Buzzard (Apr 09 2023 at 14:32):
Right. If you want to do fast computation then you use the Edwards curve model.
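For the record (not mathlib, just to illustrate why such models are nicer to run): on an Edwards curve x^2 + y^2 = 1 + d*x^2*y^2 with d a non-square, addition is a single uniform formula with no case split for the point at infinity. A sketch (import path approximate):
import algebra.field.basic

-- Illustrative only: the Edwards addition law.
def edwards_add {K : Type*} [field K] (d : K) (P Q : K × K) : K × K :=
((P.1 * Q.2 + Q.1 * P.2) / (1 + d * P.1 * Q.1 * P.2 * Q.2),
 (P.2 * Q.2 - P.1 * Q.1) / (1 - d * P.1 * Q.1 * P.2 * Q.2))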
David Ang (Apr 09 2023 at 14:56):
There are implementations of pairing-based cryptography for elliptic curves over fields of characteristic two, and some of them do use the full Weierstrass form minus a few coefficients
Reid Barton (Apr 09 2023 at 15:09):
In this case, it's even more irrelevant whether the stuff in mathlib is computable.
David Loeffler (Apr 09 2023 at 15:28):
Just to pick up on a part of the question which didn't get (explicitly) answered:
Bolton Bailey: It seems mathlib doesn't have implementations of the Weil or Tate pairings. [...] Sorry if this is the wrong place the ask this, I know elliptic curve pairings obviously are based on elliptic curves, but I don't know how familiar or interesting they are to people outside cryptography.
The Weil pairing is definitely totally familiar and essential all over the theory of elliptic curves, it's not at all specific to cryptography. The Tate pairing (as understood in crypto) is perhaps a bit more specialised.
Bolton Bailey (Apr 09 2023 at 15:30):
Ok cool, well then maybe implementing it might have somewhat broader appeal. I'm not too concerned about making a super performant cryptographic implementation; I more just think it would be nice to round out my project and see if it would be useful to any of the people I know of who are interested in formal methods for cryptographic proof systems.
Jz Pan (Jul 11 2023 at 19:54):
Sorry to bump this message again. I see that the elliptic curve codes are already ported to mathlib4.
If possible I'd like to contribute something starting from it.
Jz Pan (Jul 11 2023 at 19:55):
More specifically, I'd like to find out if there are useful things in my previous attempts https://github.com/acmepjz/my-lean-test/blob/master/gtm106/ which are worth adding to mathlib.
Jz Pan (Jul 12 2023 at 00:03):
Currently my work has the following which are not present in mathlib4 elliptic curves:
- Weierstrass curve (over a field) which have a node or cusp, and the criteria of them (namely disc=0 and c4!=0, resp. c4=0)
- change of variable defined in a structure (which allows to define the type of isomorphisms between two Weierstrass curves, of type of isomorphisms of a Weierstrass curve)
- normal forms of Weierstrass curves of char not 2 or 3 (resp. char 3, and char 2); normal form of singular Weierstrass curve
- for each j there is an elliptic curve with j-invariant j; elliptic curves with same j are isomorphic over algebraic closure
Unfortunately, some of them require lengthy calculations in char 2 or 3 or in the non-perfect case.
Scott Morrison (Jul 12 2023 at 00:18):
It sounds like you have a lot of material, and that it covers material Mathlib certainly wants to have. It also sounds like it would be many separate PRs (i.e. at least one PR per phrase in your bullet points above), so I would advise just starting with the simplest and easiest chunk that would work as a standalone PR.
Do be aware that when you start the review process, reviewers will very likely ask for some design changes, and adapting all your later material to these may be a fair bit of work. Obviously do explain if proposed changes would cause problems downstream, but also try to take the attitude that big refactors are good, and that the review process, while sometimes a lot of work, does work!
Scott Morrison (Jul 12 2023 at 00:19):
@Jz Pan, I've just sent you a github invite, which should give you the ability to create a new branch on the mathlib4
repository.
David Ang (Jul 12 2023 at 00:26):
@Jz Pan I have had a quick glance at your old repository.
- Weierstrass curve (over a field) which have a node or cusp, and the criteria of them (namely disc=0 and c4!=0, resp. c4=0)
I think these definitions would be nice in Weierstrass.lean. What eventual theorem do you have in mind?
- change of variable defined in a structure (which allows to define the type of isomorphisms between two Weierstrass curves, of type of isomorphisms of a Weierstrass curve)
We already have this as WeierstrassCurve.variableChange, etc. unless I'm missing something.
- normal forms of Weierstrass curves of char not 2 or 3 (resp. char 3, and char 2); normal form of singular Weierstrass curve
These are great! I have some normal forms for char 2 in my old mathlib3 branch but yours is a lot more complete.
- for each j there is an elliptic curve with j-invariant j; elliptic curves with same j are isomorphic over algebraic closure
You should PR the former! I think benjeffers (from the Xena Discord) is working on the latter and I'll let them know.
Something I noticed is that your repo seems to have a good projective API, whereas mine only defines affine points. Maybe you could also add these (perhaps using the new Projectivization API), and a good way to transfer between the two? In what I'm working on right now I need "Jacobian coordinates" (as Sutherland calls it) where the curve lives in weighted projective space (x, y, z) -> (l^2x, l^3y, lz), and I don't know a clean way to do this just yet, so I'm hoping your usual projective version would give me a better idea.
AFAIK other ongoing projects include: @Junyan Xu is defining the reduction map over a DVR, and I'm defining division polynomials which requires Jacobian coordinates to prove anything useful.
Jz Pan (Jul 12 2023 at 01:01):
Scott Morrison said:
It sounds like you have a lot of material, and that it covers material Mathlib certainly wants to have. It also sounds like it would be many separate PRs (i.e. at least one PR per phrase in your bullet points above), so I would advise just starting with the simplest and easier chunk that would work as a standalone PR.
Sure, I'll reorganize and upgrade the code to Lean 4, and divide it into small chunks.
Do be aware that in starting the review process of PRs, very likely reviewers will ask for some design changes, and this may constitute a fair bit of work to adapt all your later material for this. Obviously do explain if proposed changes would cause problems downstream, but also try to take the attitude that big refactors are good, and that the review process, while sometimes a lot of work, does work!
That's OK. In fact my code just works without much design consideration; I'm open to discussions and suggestions.
I've just sent you a github invite
Thank you very much, I'll look at them when I have spare time.
Jz Pan (Jul 12 2023 at 01:21):
David Ang said:
- Weierstrass curve (over a field) which have a node or cusp, and the criteria of them (namely disc=0 and c4!=0, resp. c4=0)
I think these definitions would be nice in Weierstrass.lean. What eventual theorem do you have in mind?
Currently I have the following theorems (only for Weierstrass curves over fields, not sure if they are meaningful over a ring):
- if disc is not zero then all points on the curve are non-singular (which already in mathlib4)
- conversely, if all points are non-singular, under some field characteristic and perfectness assumptions, disc is not zero
- there is at most one singular point on a Weierstrass curve
- if disc=0 and c4 not zero (resp. c4=0 plus some field characteristic and perfectness assumptions), it has a rational node (resp. cusp)
- if it has a rational node (resp. cusp), then disc=0 and c4 not zero (resp. c4=0)
- change of variable defined in a structure (which allows to define the type of isomorphisms between two Weierstrass curves, of type of isomorphisms of a Weierstrass curve)
We already have this as WeierstrassCurve.variableChange, etc. unless I'm missing something.
I see. In my opinion, wouldn't it be better if we defined a structure which consists of u, r, s and t? Variable changes can be composed, inverted, etc. (which is useful for defining the automorphism group of an elliptic curve), and this is cumbersome if you don't define a structure but instead mention u, r, s, t separately everywhere. (See the sketch at the end of this message.)
- normal forms of Weierstrass curves of char not 2 or 3 (resp. char 3, and char 2); normal form of singular Weierstrass curve
These are great! I have some normal forms for char 2 in my old mathlib3 branch but yours is a lot more complete.
- for each j there is an elliptic curve with j-invariant j; elliptic curves with same j are isomorphic over algebraic closure
You should PR the former! I think benjeffers (from the Xena Discord) is working on the latter and I'll let them know.
These two are the parts which use polynomial calculus heavily (especially the latter).
Something I noticed is that your repo seems to have a good projective API, whereas mine only defines affine points. Maybe you could also add these (perhaps using the new Projectivization API), and a good way to transfer between the two? In what I'm working on right now I need "Jacobian coordinates" (as Sutherland calls it) where the curve lives in weighted projective space (x, y, z) -> (l^2x, l^3y, lz), and I don't know a clean way to do this just yet, so I'm hoping your usual projective version would give me a better idea.
I have defined projective points and the conversions between them and affine points. The current mathlib4 elliptic curve only has affine points. Nevertheless, it looks like the affine points API is enough (I don't remember if I actually use projective points to prove anything). Maybe I'll experiment with them later.
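Here is the sketch I mentioned above for the change-of-variable structure (field names are just my guess; the real design can be discussed in a PR):
import Mathlib

-- Sketch only: bundle the data (u, r, s, t) of a change of variables.
structure VariableChange (R : Type*) [CommRing R] where
  u : Rˣ
  r : R
  s : R
  t : R

-- Composition and inverse would then make `VariableChange R` a group acting on Weierstrass
-- curves, which is exactly what the automorphism group of an elliptic curve needs.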
David Ang (Jul 12 2023 at 11:08):
I don’t mind changing variableChange
into a structure. You can make a PR and we’ll see how it goes.
Jz Pan (Jul 12 2023 at 22:37):
I've just submitted the PR which changes variableChange into a structure: https://github.com/leanprover-community/mathlib4/pull/5841 . By the way, I created the branch acmepjz_ec_variable_change which turns out to be wrong. Could someone help me to delete that branch? Thanks.
Junyan Xu (Jul 12 2023 at 22:57):
You should be able to find your branch at https://github.com/leanprover-community/mathlib4/branches and delete it.
Jz Pan (Jul 16 2023 at 10:30):
David Ang said:
- for each j there is an elliptic curve with j-invariant j; elliptic curves with same j are isomorphic over algebraic closure
You should PR the former! I think benjeffers (from the Xena Discord) is working on the latter and I'll let them know.
I have submitted a draft PR https://github.com/leanprover-community/mathlib4/pull/5935 for the first part.
Jz Pan (Jul 16 2023 at 10:31):
This is mostly done, but I've hit a problem in the final theorem and need some help. I'm trying to make an MWE, but due to the ifs in the code I failed to make it shorter:
import Mathlib.AlgebraicGeometry.EllipticCurve.Weierstrass

open scoped Classical

universe u

variable (R : Type u) [CommRing R] (F : Type u) [Field F]

namespace WeierstrassCurve

/-- The Weierstrass curve $Y^2 + Y = X^3$. -/
def ofJInvariant0 : WeierstrassCurve R :=
  ⟨0, 0, 1, 0, 0⟩

lemma ofJInvariant0_c₄ : (ofJInvariant0 R).c₄ = 0 := by
  simp only [ofJInvariant0, c₄, b₂, b₄]
  ring1

lemma ofJInvariant0_Δ : (ofJInvariant0 R).Δ = -27 := by
  simp only [ofJInvariant0, Δ, b₂, b₄, b₆, b₈]
  ring1

/-- The Weierstrass curve $Y^2 = X^3 + X$. -/
def ofJInvariant1728 : WeierstrassCurve R :=
  ⟨0, 0, 0, 1, 0⟩

lemma ofJInvariant1728_c₄ : (ofJInvariant1728 R).c₄ = -48 := by
  simp only [ofJInvariant1728, c₄, b₂, b₄]
  ring1

lemma ofJInvariant1728_Δ : (ofJInvariant1728 R).Δ = -64 := by
  simp only [ofJInvariant1728, Δ, b₂, b₄, b₆, b₈]
  ring1

end WeierstrassCurve

namespace EllipticCurve

/-- The elliptic curve $Y^2 + Y = X^3$ when $3$ is invertible. -/
def ofJInvariant0 [Invertible (3 : R)] : EllipticCurve R :=
  ⟨
    WeierstrassCurve.ofJInvariant0 R,
    ⟨-27, -(⅟3) ^ 3,
      by
        rw [calc (27 : R) = (3 : R) ^ 3 := by norm_num]
        simp only [mul_neg, neg_mul, neg_neg]
        exact pow_mul_pow_eq_one 3 (mul_invOf_self (3 : R))
      , by
        rw [calc (27 : R) = (3 : R) ^ 3 := by norm_num]
        simp only [mul_neg, neg_mul, neg_neg]
        exact pow_mul_pow_eq_one 3 (invOf_mul_self (3 : R))
    ⟩,
    by rw [WeierstrassCurve.ofJInvariant0_Δ R]
  ⟩

lemma ofJInvariant0_j [Invertible (3 : R)] : (ofJInvariant0 R).j = 0 := by
  simp only [j, ofJInvariant0, WeierstrassCurve.ofJInvariant0_c₄]
  ring1

/-- The elliptic curve $Y^2 = X^3 + X$ when $2$ is invertible. -/
def ofJInvariant1728 [Invertible (2 : R)] : EllipticCurve R :=
  ⟨
    WeierstrassCurve.ofJInvariant1728 R,
    ⟨-64, -(⅟2) ^ 6,
      by
        rw [calc (64 : R) = (2 : R) ^ 6 := by norm_num]
        simp only [mul_neg, neg_mul, neg_neg]
        exact pow_mul_pow_eq_one 6 (mul_invOf_self (2 : R))
      , by
        rw [calc (64 : R) = (2 : R) ^ 6 := by norm_num]
        simp only [mul_neg, neg_mul, neg_neg]
        exact pow_mul_pow_eq_one 6 (invOf_mul_self (2 : R))
    ⟩,
    by rw [WeierstrassCurve.ofJInvariant1728_Δ R]
  ⟩

lemma ofJInvariant1728_j [Invertible (2 : R)] : (ofJInvariant1728 R).j = 1728 := by
  simp only [j, ofJInvariant1728, WeierstrassCurve.ofJInvariant1728_c₄]
  field_simp
  ring1

noncomputable def ofJInvariant0' : EllipticCurve F := by
  by_cases hchar3 : (3 : F) = 0
  · have : Invertible (2 : F) := by
      apply invertibleOfNonzero
      rw [←hchar3]
      apply_fun (· - 2)
      norm_num
      exact zero_ne_one
    exact ofJInvariant1728 F
  · have : Invertible (3 : F) := invertibleOfNonzero hchar3
    exact ofJInvariant0 F

lemma ofJInvariant0'_j : (ofJInvariant0' F).j = 0 := by
  by_cases hchar3 : (3 : F) = 0
  · have : Invertible (2 : F) := by
      apply invertibleOfNonzero
      rw [←hchar3]
      apply_fun (· - 2)
      norm_num
      exact zero_ne_one
    simp only [ofJInvariant0', hchar3, dite_true]
    have := ofJInvariant1728_j F
    rw [calc (1728 : F) = (3 : F) * (576 : F) := by norm_num, hchar3, zero_mul] at this
    -- FIXME: does not work
    exact this
  · have : Invertible (3 : F) := invertibleOfNonzero hchar3
    simp only [ofJInvariant0', hchar3, dite_false]
    -- FIXME: does not work
    exact ofJInvariant0_j F

end EllipticCurve
Jz Pan (Jul 16 2023 at 10:33):
It produces weird error messages:
type mismatch
this
has type
j (ofJInvariant1728 F) = 0 : Prop
but is expected to have type
j (ofJInvariant1728 F) = 0 : Prop
and
type mismatch
ofJInvariant0_j F
has type
j (ofJInvariant0 F) = 0 : Prop
but is expected to have type
j (ofJInvariant0 F) = 0 : Prop
I don't understand where it goes wrong.
Jz Pan (Jul 16 2023 at 10:35):
Also, any suggestions on the proofs that -27 and -64 are units of R? Currently my code works, but it looks stupid.
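For context, an alternative along these lines might work (an untested sketch, in the context of the MWE above; unitOfInvertible and the Neg instance on Rˣ are mathlib names, and identifying the value with the literal -27 would still need a small norm_num step):
-- Untested sketch: build the unit -27 = -(3 ^ 3) directly from the `Invertible (3 : R)`
-- instance, so both inverse proofs come for free.
example (R : Type*) [CommRing R] [Invertible (3 : R)] : Rˣ :=
  -(unitOfInvertible (3 : R)) ^ 3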
Scott Morrison (Jul 16 2023 at 10:35):
You might try turning on more pretty printer options (e.g. set_option pp.all true for the most extreme version).
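For example, scoped to a single declaration (a generic illustration, not tied to the code above):
-- With pp.all enabled, terms and goals are printed with universes, instances and
-- implicit arguments spelled out, which often exposes mismatches that look identical
-- under the default printing.
set_option pp.all true in
#check (rfl : (1 : ℕ) + 1 = 2)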
Scott Morrison (Jul 16 2023 at 10:35):
Also convert this rather than exact this can be helpful for diagnosing why Lean doesn't think the type matches.
Jz Pan (Jul 16 2023 at 10:39):
Ah, convert this works! Thank you very much.
Jz Pan (Aug 07 2023 at 10:12):
Is it possible to overwrite the ext lemma generated by the system? Currently it is
theorem EllipticCurve.ext {R : Type u} :
∀ {inst : CommRing R} (x y : EllipticCurve R),
x.a₁ = y.a₁ → x.a₂ = y.a₂ → x.a₃ = y.a₃ → x.a₄ = y.a₄ → x.a₆ = y.a₆ → x.Δ' = y.Δ' → x = y
But clearly x.Δ' = y.Δ' is not necessary.
Scott Morrison (Aug 07 2023 at 10:13):
You should just write your own ext lemma rather than using the attribute.
Ruben Van de Velde (Aug 07 2023 at 10:14):
And in particular, you can stick @[ext] on that manually written theorem rather than on the structure.
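For instance, a hand-written version might look roughly like the following untested sketch (the name ext', the route through coe_Δ' and Units.ext, and the final simp only unfolding are guesses at the easiest proof, not tested code):
import Mathlib.AlgebraicGeometry.EllipticCurve.Weierstrass

-- Untested sketch of a manual ext lemma that drops the Δ' hypothesis: Δ' is recovered
-- from the a-invariants via `coe_Δ'` and injectivity of the coercion `Units.val`.
-- (Tag it with @[ext] once the attribute is removed from the structure itself.)
theorem EllipticCurve.ext' {R : Type*} [CommRing R] {E E' : EllipticCurve R}
    (h₁ : E.a₁ = E'.a₁) (h₂ : E.a₂ = E'.a₂) (h₃ : E.a₃ = E'.a₃)
    (h₄ : E.a₄ = E'.a₄) (h₆ : E.a₆ = E'.a₆) : E = E' := by
  have hΔ' : E.Δ' = E'.Δ' := Units.ext <| by
    rw [E.coe_Δ', E'.coe_Δ']
    simp only [WeierstrassCurve.Δ, WeierstrassCurve.b₂, WeierstrassCurve.b₄,
      WeierstrassCurve.b₆, WeierstrassCurve.b₈, h₁, h₂, h₃, h₄, h₆]
  -- `EllipticCurve.ext` below is the generated lemma quoted above.
  exact EllipticCurve.ext E E' h₁ h₂ h₃ h₄ h₆ hΔ'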
Jz Pan (Aug 07 2023 at 10:20):
Thanks. If I provide ext, can I still ask the system to generate ext_iff?
Scott Morrison (Aug 07 2023 at 12:06):
Unfortunately no, you need to do that yourself.
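For what it's worth, the hand-written ext_iff is also short; building on the sketch above (same import), something like this untested sketch could work:
-- Untested sketch, using the hypothetical `EllipticCurve.ext'` from the earlier message.
theorem EllipticCurve.ext_iff' {R : Type*} [CommRing R] {E E' : EllipticCurve R} :
    E = E' ↔ E.a₁ = E'.a₁ ∧ E.a₂ = E'.a₂ ∧ E.a₃ = E'.a₃ ∧ E.a₄ = E'.a₄ ∧ E.a₆ = E'.a₆ :=
  ⟨fun h => by subst h; exact ⟨rfl, rfl, rfl, rfl, rfl⟩,
    fun ⟨h₁, h₂, h₃, h₄, h₆⟩ => EllipticCurve.ext' h₁ h₂ h₃ h₄ h₆⟩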
David Ang (Aug 07 2023 at 12:20):
I definitely wanted to do this originally, but I didn't want to write the ext_iff manually.
Junyan Xu (Aug 08 2023 at 04:51):
Recently I had the chance to think deeply about the treatment of reduction in Cassels' book and work out many details, and I have been able to convince myself that the approach does work; in fact, both map_add and nonsingularity of the reduction of the sum can be proven relatively easily with this approach (probably more easily than with my former Picard group approach), and the sophisticated multiplicity-based treatments by Silverman/Knapp-Langlands are unnecessary, contrary to what I previously thought. This approach involves homogenization of 2-variable polynomials, and defining reduction definitely requires projective coordinates, which people seem to be developing recently. Thus I'd like to write down a detailed exposition of the proofs here, to pin down exactly what is needed, so as to better coordinate the efforts.
Junyan Xu (Aug 08 2023 at 04:52):
The basic setting consists of a valuation ring R, its field of fractions K, its maximal ideal m, and the residue field k. Let W be a Weierstrass curve with equation f(X,Y) = 0 with coefficients in R, and let F(X,Y,Z) be the homogenization of f(X,Y). Explicitly, F(X,Y,Z) = Y^2Z + a1 XYZ + a3 YZ^2 - (X^3 + a2 X^2Z + a4 XZ^2 + a6 Z^3). It's easy to verify that the projective solutions of F(X,Y,Z) = 0 consist exactly of the affine points [x:y:1] (where (x,y) ranges over the solutions of f(X,Y) = 0) and the point at infinity O = [0:1:0] (Lemma 1). Every point (singular or not) in W(K) has some projective coordinates [x:y:z]; one of x, y, z has minimum valuation, and it is a GCD of x, y, z. Dividing by this GCD, we may assume one of x, y, z is a unit (or actually 1, if necessary) and that x, y, z all lie in R. Any two such choices of x, y, z differ by a unit of R, so the reduction modulo m is a well-defined point of P^2(k) that is a solution of the reduced homogeneous polynomial (which is the homogenization of the reduction of f), and by Lemma 1 again it is either an affine point or O.
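In symbols, the reduction map just described is (this is only a restatement of the previous paragraph, with v the valuation and bars denoting reduction modulo m):
[x : y : z] \in \mathbb{P}^2(K) \;\longmapsto\; \left[\,\overline{x/d} : \overline{y/d} : \overline{z/d}\,\right] \in \mathbb{P}^2(k), \qquad d \in \{x, y, z\} \text{ with } v(d) = \min\{v(x), v(y), v(z)\}.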
Now let P1 and P2 be two arbitrary points in W(K). If both points are affine and P1 ≠ -P2, we know from docs#WeierstrassCurve.addPolynomial_slope that there exists a linear function l(X) = kX + b such that f(X, l(X)) = -(X-x1)(X-x2)(X-x3), where P1 = (x1,y1), P2 = (x2,y2) and P3 = -(P1 + P2) = (x3,y3). Homogenization of this factorization yields L(X,Z) = kX + bZ and F(X, L(X,Z), Z) = -(X - x1 Z)(X - x2 Z)(X - x3 Z). To be able to reduce the LHS, we need L to have coefficients (k and b) in R; when this is not the case, we can transform the equation Y = kX + bZ to X = (1/k)Y - (b/k)Z if k has minimal valuation, and to Z = (1/b)Y - (k/b)X if b (but not k) has minimal valuation. In the former case, letting L'(Y,Z) = (1/k)Y - (b/k)Z, we obtain a corresponding factorization of F(L'(Y,Z), Y, Z) into three linear forms in Y and Z, and the latter case is similar. We need to check that the key fact is preserved: if we write the substituted polynomial as a product of three linear factors, then the projective coordinates of the three points can still be read off from those factors.
If both points are affine but P1 = -P2, or if exactly one of the two points (wlog P1) is affine, then we take the substitution X = x1 Z and check F(x1 Z, Y, Z) = Z(Y - y1 Z)(Y - yi Z) (where i = 2 in the former case and i = 3 in the latter case); if x1 does not lie in R, we use Z = (1/x1)X instead. Finally, if both points are O, we take Z = 0 and check F(X, Y, 0) = -X^3. In all cases we substitute one of the variables in the homogeneous polynomial by an R-combination of the other two variables, and factorize the resulting homogeneous polynomial in two variables as a product of three linear homogeneous polynomials, from which the projective coordinates of the three points can be read off. Without loss of generality, we'll only discuss the L(Y,Z) case (substituting X) below.
Consider the factorization of F(L(Y,Z), Y, Z) as a scalar times a product of three linear forms in Y and Z; the scalar factor appears because we want to ensure that, in each linear factor, one of the two coefficients is 1 and the other also lies in R, in order to perform reduction. This scalar actually lies in R, because the product of the three linear polynomials is primitive and therefore divides the LHS as polynomials over R (not just over K) by docs#Polynomial.IsPrimitive.dvd_of_fraction_map_dvd_fraction_map. (Note: the redundant Polynomial.IsPrimitive q condition should be removed, and NormalizedGCDMonoid should be replaced by GCDMonoid. Since mathlib doesn't have primitivity of multivariate polynomials, we may evaluate at Z=1 before applying this argument.) Now we can reduce both sides of the factorization modulo m, and it's crucial that the scalar doesn't reduce to 0 (i.e. it is a unit of R), by the following argument: if the scalar reduces to 0 then the reduction of F(L(Y,Z), Y, Z) is the zero polynomial, which is equivalent to saying that the reduction of X - L(Y,Z) divides the reduction of F(X,Y,Z), so the latter is not irreducible. However, the reduction of f(X,Y) is irreducible by docs#WeierstrassCurve.irreducible_polynomial (it is again a Weierstrass polynomial, now over the residue field k), and the homogenization of an irreducible polynomial is again irreducible (Lemma 2), contradiction.
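Written out, with λ and the α_i, β_i as placeholder names for this discussion, the factorization in question has the shape
F(L(Y,Z), Y, Z) \;=\; \lambda\,(\alpha_1 Y - \beta_1 Z)(\alpha_2 Y - \beta_2 Z)(\alpha_3 Y - \beta_3 Z), \qquad \lambda, \alpha_i, \beta_i \in R, \quad \alpha_i = 1 \text{ or } \beta_i = 1 \text{ for each } i.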
We conclude that the scalar is a unit in R and can therefore be absorbed into one of the linear factors while keeping the property that, in each factor, one coefficient is 1 and the other lies in R. Reducing both sides of the factorization then writes the reduction of the substituted polynomial as a product of three linear forms over k, and the point read off from each reduced factor is exactly the reduction of the corresponding point. Because of the way we chose to apply the transformations X = (1/k)Y - (b/k)Z, Z = (1/b)Y - (k/b)X, and Z = (1/x)X, we've made sure that the reduced linear substitution polynomial is again of one of these forms, expressing one of X, Y, Z as a k-combination of the other two variables. Given the factorization of the substituted polynomial, it should be straightforward to establish that the reduction of P1 + P2 equals the sum of the reductions of P1 and P2 (i.e., map_add), and that this reduction is nonsingular if the reductions of P1 and P2 both are (e.g. dehomogenize and apply docs#WeierstrassCurve.nonsingular_add' for the affine case).
Proof of Lemma 2: Assume that f(X,Y) is an irreducible polynomial with coefficients in a domain. It's easy to see that the newly introduced variable Z doesn't divide the homogenization F(X,Y,Z). For any factorization F = GH of the homogenization, both G and H must be homogeneous (already formalized here) and not divisible by Z, and f = gh where g(X,Y) = G(X,Y,1) and h(X,Y) = H(X,Y,1), so wlog we may assume g is a unit by irreducibility of f. We can write G = g_0(X,Y) + g_1(X,Y)Z + ... + g_n(X,Y)Z^n with each g_i homogeneous of degree n-i, where n is the degree of G; then g = g_0 + g_1 + ... + g_n also has degree n, because its degree-n component g_0 is nonzero (Z does not divide G). Since g is a unit, n = 0 and therefore G is also a unit.
Johan Commelin (Aug 08 2023 at 06:38):
Thanks so much for writing down these observations. Feel free to post them as a GH issue; that way it might be easier to preserve them. Here on Zulip they might get lost in the flood of messages that gets posted each week.
David Ang (Aug 08 2023 at 18:36):
I have a basic working API of projective coordinates of Weierstrass curves (defined using MvPolynomial over an arbitrary ring, but some things are only correct over a field), but I only have the group law over a field. The PR should be ready in a couple of weeks.
David Ang (Aug 08 2023 at 18:49):
Although I'm not sure how best to link it back to affine coordinates, since our current affine coordinates are defined as a Polynomial Polynomial. It's not realistic to add another layer of Polynomial since even the definition will time out, so I had to resort to MvPolynomial. We could prove that Polynomial Polynomial is equivalent to MvPolynomial over bool or fin 2, then define Projective.polynomial as the homogenisation of this, but I think that's going to make proving anything rather annoying. If I remember correctly, I think homogenisation exists in FLT-regular but not in mathlib yet?
Currently I just repeated the definition:
local notation "𝐗" => X 0
local notation "𝐘" => X 1
local notation "𝐙" => X 2
noncomputable def polynomial : MvPolynomial (Fin 3) R :=
  𝐘 ^ 2 * 𝐙 + C W.a₁ * 𝐗 * 𝐘 * 𝐙 + C W.a₃ * 𝐘 * 𝐙 ^ 2
    - (𝐗 ^ 3 + C W.a₂ * 𝐗 ^ 2 * 𝐙 + C W.a₄ * 𝐗 * 𝐙 ^ 2 + C W.a₆ * 𝐙 ^ 3)
Then have things like lemma nonsingular_iff_affine (x y : R) : W.nonsingular ![x, y, 1] ↔ W.toAffine.nonsingular x y, etc., without going through homogenisation at all. Is this sufficient?
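For instance, the predicates on raw triples could be phrased directly against this polynomial, along the lines of the following untested sketch (same context as above, i.e. a variable (W : WeierstrassCurve R); the names Equation/Nonsingular and the use of MvPolynomial.pderiv are guesses, not necessarily what the PR does):
-- Untested sketch: the equation and smoothness conditions for a projective triple,
-- via evaluation and partial derivatives of the polynomial above.
def Equation (P : Fin 3 → R) : Prop :=
  MvPolynomial.eval P (polynomial W) = 0

def Nonsingular (P : Fin 3 → R) : Prop :=
  Equation W P ∧
    (MvPolynomial.eval P (MvPolynomial.pderiv 0 (polynomial W)) ≠ 0 ∨
      MvPolynomial.eval P (MvPolynomial.pderiv 1 (polynomial W)) ≠ 0 ∨
      MvPolynomial.eval P (MvPolynomial.pderiv 2 (polynomial W)) ≠ 0)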
Eric Rodriguez (Aug 08 2023 at 20:28):
cc @Alex J. Best - I think the homogenisation code would be nice to have in Mathlib - if you don't have time, should we port it?
Damiano Testa (Aug 08 2023 at 21:38):
Junyan, I wonder if the proofs would become simpler by defining the reduction for points on the vanishing set of an ideal of multivariate polynomials over a valuation ring.
The ideals part may take some of the pain away from dealing with the equation of a line. Moreover, you can then apply the reduction map both to points of E(K) and to collinear points, since those are also points of a closed subset.
Alex J. Best (Aug 08 2023 at 22:24):
Eric Rodriguez said:
cc Alex J. Best - I think the homogenisation code would be nice to have in Mathlib - if you don't have time, should we port it?
I think @Riccardo Brasca already ported it at https://github.com/riccardobrasca/flt-regular/blob/master/FltRegular/ReadyForMathlib/Homogenization.lean so it just needs a little bit of cleanup and then it should be ready to PR. help is always appreciated :smile:
Eric Rodriguez (Aug 08 2023 at 22:44):
I wonder if it'd be nice to have homogenization be defined over two index types, a map from the old to the new, and a place to homogenise at. It'd be weird, but for example in this case I guess it'd be nice to have mv_poly fin2 R homogenized to mv_poly fin3 R.
Alex J. Best (Aug 08 2023 at 22:50):
Yeah, my original thought process is that the way to do that should be to simply move your polynomial to the bigger ring (as it's functorial or monadic or whatever in the variable set) and then homogenise using the version I wrote. This seemed to be the more flexible basic set-up rather than baking in two index types with a possibly-identity map into the core definition. We probably do want some API for that eventually.
Yaël Dillies (Aug 09 2023 at 19:49):
Let me @Mantas Baksys because we wanted homogenisation for Zsigmondy's theorem.
Junyan Xu (Aug 12 2023 at 04:00):
David Ang said:
I have a basic working API of projective coordinates of Weierstrass curves (defined using MvPolynomial over an arbitrary ring, but some things are only correct over a field), but I only have the group law over a field. The PR should be ready in a couple of weeks.
Thanks! I won't need to use the group law written in projective coordinates; I just need to know that the solutions (over a field) of F(X,Y,Z)=0 (up to simultaneous multiplication) are in 1-1 correspondence with the solutions of f(X,Y)=0 plus O.
Then have things like lemma nonsingular_iff_affine (x y : R) : W.nonsingular ![x, y, 1] ↔ W.toAffine.nonsingular x y, etc. without going through homogenisation at all. Is this sufficient?
Yeah, I think a general API connecting R[X][X] and MvPolynomial (Fin 3) R is probably not of very broad interest, so anything we develop will probably be tailored to our specific needs and not be very complete. The flt-regular homogenization API is nice, and we could map f(X,Y) into MvPolynomial (Fin 3) R and homogenize it, and I think we could perform the substitutions F(X,Y,1) and F(X,Y,L(X,Y)) all within MvPolynomial (Fin 3) R without the involvement of a bivariate polynomial ring.
Some API that I do want to develop here is for the "concrete projectivization", i.e. P^n(R) for a general ring, as suggested here. Rather than working with a tuple ![x,y,1], we'd work with the quotient of tuples by simultaneous (left) multiplication by a unit of R, which I tentatively call TupleQuotUnits n R. I think the equation and nonsingular predicates would be more convenient when descended to TupleQuotUnits n R. The map TupleQuotUnits n R → TupleQuotUnits n S can be defined for any MonoidHom via Quotient.lift, so the reduction map can be defined on this level. When applied to a RingHom, this map respects the predicate (again defined via Quotient.lift) on TupleQuotUnits n R that says the (left) ideal generated by the elements is the unit ideal, which cuts out the projective space P^n(R) from TupleQuotUnits (n+1) R. (Note for general commutative ring R this only gives a subset of the points in P^n(R), but if Pic(R)=0 we get all points, and this is good enough for our purpose.) We can then show P^n(R) → P^n(Frac(R)) is bijective when R is a valuation ring or a PID, or some more general class of rings. This would allow us to define the reduction map W(K) → W(k).
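To make this concrete, the definition could be roughly the following untested sketch (the name TupleQuotUnits and the use of MulAction.orbitRel are placeholders; import Mathlib is just for brevity):
import Mathlib

-- Untested sketch: tuples modulo simultaneous multiplication by a unit.  Rˣ acts on
-- Fin n → R componentwise through the scalar action of units on R, and we take the
-- quotient by the orbit relation of that action.
def TupleQuotUnits (n : ℕ) (R : Type*) [CommRing R] : Type _ :=
  Quotient (MulAction.orbitRel Rˣ (Fin n → R))

-- The map induced by a ring hom (e.g. the reduction R → k) and the predicates
-- (equation, nonsingularity, "the coordinates generate the unit ideal") would then
-- descend to this quotient via Quotient.lift, as described above.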
Some more words about the homogenization API: I originally thought that homogenization would take an arbitrary element in a Nat-graded algebra A and produce an element in A[X] homogeneous w.r.t. the induced Nat-grading (sum of the grading on A and the polynomial degree). I don't know any application of this generalized version, even though docs#AlgebraicGeometry.Proj.toLocallyRingedSpace does take a Nat-graded algebra. However, since there are the following three advantages of using MvPolynomial (Fin 3) R, I'm now in favor of the flt-regular API:
- David has encountered timeouts when trying to work with R[X][X][X], but MvPolynomial is OK
- Homogeneity is already defined for MvPolynomial but not yet available for R[X][X][X] (also, the Nat-grading on MvPolynomial has been PR'd in mathlib#10119 but not for R[X][X])
- The three variables have equal status, which makes it easier to make general statements that apply to any of them
Junyan Xu (Aug 12 2023 at 04:04):
Damiano Testa said:
Junyan, I wonder if the proofs will become simpler by defining the reduction for points on vanishing set of ideals in multivariate polynomials over a valuation ring.
The ideals part may take some of the pain away from dealing with the equation of a line. Moreover, you can then apply the reduction map both to points on E(K) and to collinear points, since those are also points on a closed subset.
I'm afraid I have trouble making sense of your idea @Damiano Testa. You are saying there's another definition of reduction when restricted to points on the vanishing set of ideals / closed subsets? What can we say about reduction defined this way, and wouldn't it still agree with the map P^n(K) → P^n(k)?
Damiano Testa (Aug 12 2023 at 06:03):
I expressed myself badly. What I meant is basically that once you prove that projective space satisfies the valuative criterion for properness, then you get essentially for free that closed subsets of projective space also satisfy it.
Thus, the reduction map on an elliptic curve, not only maps points of the elliptic curve over K to points over k, but it also maps triples of points summing to zero to triples of points summing to zero.
That was all the content of my message!
Kevin Buzzard (Aug 12 2023 at 08:43):
So we need that projective morphisms are proper, which is elimination theory.
Re F[X][X] being slow: I saw an example yesterday where this was due to the ring1 tactic trying and failing to find a proof that F[X][X] had characteristic zero. Removing the instances it was trying made things faster.
David Ang (Aug 12 2023 at 10:49):
Something to note while writing my projective API over a field is that it was a lot easier to work with [(triple, nonsingular)], where [.] denotes the class under the usual equivalence relation, rather than ([(triple, nonzero)], nonsingular), which was why I didn’t reuse the existing projective space API
David Ang (Aug 12 2023 at 10:53):
In any case, what I really need is weighted homogenisation, for the equivalence (x,y,z) ~ (u^2 x, u^3 y, u z), so I'm only writing this to get a feel for how to write my weighted projective API (which cryptographers usually call "Jacobian coordinates"), and I don't know if flt-regular or anywhere else has this.
Antoine Chambert-Loir (Aug 12 2023 at 14:10):
Mathlib knows about homogeneous polynomials with respect to any given weight on variables, docs#MvPolynomial.weightedDegree'
David Ang (Aug 12 2023 at 15:31):
Antoine Chambert-Loir said:
Mathlib knows about homogeneous polynomials with respect to any given weight on variables, docs#MvPolynomial.weightedDegree'
Would it be easy to generalise the homogenisation in flt-regular to this weighted context?
Junyan Xu (Aug 13 2023 at 03:26):
David Ang said:
Antoine Chambert-Loir said:
Mathlib knows about homogeneous polynomials with respect to any given weight on variables, docs#MvPolynomial.weightedDegree'
Would it be easy to generalise the homogenisation in flt-regular to this weighted context?
For the definition, we just need to change totalDegree here to weightedDegree' (w i needs to be 1 though) ...
Something to note while writing my projective API over a field is that it was a lot easier to work with [(triple, nonsingular)], where [.] denotes the class under the usual equivalence relation, rather than ([(triple, nonzero)], nonsingular), which was why I didn’t reuse the existing projective space API
That's good to know, though what I'm proposing is something like ([triple], [nonsingular]).
I expressed myself badly. What I meant is basically that once you prove that projective space satisfies the valuative criterion for properness, then you get essentially for free that closed subsets of projective space also satisfy it.
If I understand correctly, this is saying that if a point in P^n(K) lies on some closed subset cut out by some equations like F(X,Y,Z)=0 or Z=L(X,Y) with coefficients in R, then its reduction also satisfies the equation. But this isn't hard to prove, right? (Just reduce the equations.) And we don't just want to know that the reduction of the third point lies on the line: if the reductions of the three points are distinct then that's enough, but if not we need to know the third reduction is the negative of the sum of the other two reductions, for which it's crucial to use a factorization so we can express the coordinates of the third reduction linearly in terms of the other two reductions via Vieta's rule.
We can then show P^n(R) → P^n(Frac(R)) is bijective when R is a valuation ring or a PID, or some more general class of rings.
Just noticed Bezout domains are a common generalization.
Eric Rodriguez (Aug 14 2023 at 23:16):
David Ang said:
Antoine Chambert-Loir said:
Mathlib knows about homogeneous polynomials with respect to any given weight on variables, docs#MvPolynomial.weightedDegree'
Would it be easy to generalise the homogenisation in flt-regular to this weighted context?
this should be easy, the definition is finsupp.map_domain (λ j, j + finsupp.single i (p.total_degree - j.sum (λ _ m, m))) p; the most annoying thing would probably just be to make API linking this to the non-weighted case
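In Lean 4 terms, a first attempt at the weighted variant might be the following untested sketch, adapted from that definition (the name weightedHomogenization and the exact bookkeeping are guesses, and as noted above one wants w i = 1; import Mathlib is just for brevity):
import Mathlib

-- Untested sketch: pad each monomial with a power of the variable `i` so that its
-- `w`-weighted degree reaches the weighted total degree of `p`.
noncomputable def MvPolynomial.weightedHomogenization {ι R : Type*} [CommRing R]
    (w : ι → ℕ) (i : ι) (p : MvPolynomial ι R) : MvPolynomial ι R :=
  Finsupp.mapDomain
    (fun d => d + Finsupp.single i (p.weightedTotalDegree w - MvPolynomial.weightedDegree' w d)) p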
Damiano Testa (Aug 16 2023 at 07:44):
Junyan Xu said:
I expressed myself badly. What I meant is basically that once you prove that projective space satisfies the valuative criterion for properness, then you get essentially for free that closed subsets of projective space also satisfy it.
If I understand correctly, this is saying if a point in P^n(K) lies on some closed subset cut out by some equations like F(X,Y,Z)=0 or Z=L(X,Y) with coefficients in R, then its reduction also satisfies the equation. But this isn't hard to prove, right? (Just reduce the equations.) And we don't just want to know the reduction of the third point lies on the line: if the reduction of the three points are distinct then it's enough, but if not we need to know the third reduction is the negative of the sum of the other two reductions, for which it's crucial to use a factorization so we can express the coordinates of the third reduction linearly in terms of the other two reductions via Vieta's rule.
I need to think more about how to formalize this, but in the "informal" case, you would get the "correct distribution of multiplicities" by functoriality. I suspect that this kind of functoriality would come from the "functor of points", which I think is actually not in mathlib. So maybe your approach with explicit factorisations might be quicker for your goal.
David Ang (Nov 17 2023 at 22:31):
I've been busy with other things, but here is finally some basic API for projective coordinates. @Junyan Xu only wanted the definition and some predicates, but I needed to write out the full group law for other purposes. I'm requesting comments on the code before going further and adding documentation. In particular I have not defined the map induced by a MonoidHom, but this should be easy once this PR is OK.