Zulip Chat Archive
Stream: general
Topic: Lean and the LMFDB
David Roe (Dec 30 2024 at 23:06):
@Andrew Sutherland and I are starting to plan for a project building connections between Lean and the L-functions and modular forms database. More specifically, we're currently thinking about aiming for the following:
- Have a valid Lean/Mathlib definition for as many objects in the database, and their attributes, as possible. The goal is to be able to translate claims made in the LMFDB into propositions in Lean (without proof), and the corresponding Lean code would be available as a download on LMFDB pages.
- Create Lean tactics that query the LMFDB to find counterexamples or find examples for existence statements. For example, "there exists a modular form of level and weight ", or "there exists a finite group with nonabelian automorphism group where every automorphism is central."
There are plenty of possible directions that we think are either too hard or not ideas we currently want to pursue:
- We don't plan to work on formal proofs for algorithms used to compute data in the LMFDB.
- We don't plan to work on generating proof certificates for LMFDB data (though we are happy to chat with others interested in this).
- We don't currently plan to connect with AI or LLM tools.
Thanks to @Jason Rute for recent discussions helping us steer toward concrete plans, and for pointing out the following prior work:
- Incorporating a Database of Graphis into a Proof Assistant describes the Lean-HoG library for interfacing with the House of Graphs. There's a lot of overlap with what we're aiming to do, though they have a bit more focus on formally verifying results in the database. @Katja Berčič @Andrej Bauer @Jure Taslak
- Lean Forward was a Dutch grant from 2019-2023 on "Usable Computer-Checked Proofs and Computations for Number Theorists"; Objective 2 included "use verified algorithms to check entries in online databases (e.g. LMFDB)." While this specific goal is beyond our proposed scope, hopefully there is a commonality of interest. @Jasmin Blanchette @Rob Lewis @Gabriel Ebner @Anne Baanen @Jannis Limperg @Kevin Kappelmann @Sander Dahmen @Assia Mahboubi @Aurélien Saue @Freek Wiedijk
We're happy to get suggestions on how to proceed or expressions of interest in collaborating on building connections between Lean and the LMFDB!
Ruben Van de Velde (Dec 30 2024 at 23:16):
Sounds cool! One thing you should probably think about is ongoing maintenance, particularly keeping up to date with mathlib and lean changes. Do you expect any of this work to lead to changes in mathlib?
David Roe (Dec 30 2024 at 23:21):
There are two kinds of changes I could imagine to mathlib:
- If there are missing definitions needed to connect to properties stored on the LMFDB, that could be something that we would contribute to mathlib rather than leaving as separate Lean code to be downloaded from the LMFDB.
- I don't know where tactics are maintained within the Lean ecosystem, but that's another possible contribution.
Jason Rute (Dec 30 2024 at 23:23):
One thing I had trouble helping @David Roe (who is an IRL friend) with is who the right Lean people are to talk to about this project. So if there is someone he should reach out to, it would be good to let David and/or that person know.
Kevin Buzzard (Dec 31 2024 at 00:01):
@David Ang has been working on defining invariants of elliptic curves over the rationals, with an eye on stating BSD in Lean, so perhaps elliptic curves are one place where we could start. Here it would be very easy to add valid Lean code to LMFDB corresponding to each elliptic curve over in the database, for example (our definition of an elliptic curve over is a 5-tuple of rationals plus ). Mathlib also has modular forms although I don't think we have any good constructors for them right now, just the definition, so for example I am right now not at all clear how we could define the two level 37 eigenforms in Lean (I choose 37 because for things like there are tricks involving -products). I guess one way to proceed here would be to prove the Eichler-Shimura isomorphism in Lean and then write down explicit 1-cocycles or use modular symbols or however you want to think about them (we can't just write down the q-expansion using the elliptic curves because we'd then have to prove it was a modular form, which is out of scope right now as far as I can see, there are no converse theorems yet for example, although I'll need them for FLT so they'll be there one day; this wouldn't work for more general eigenforms anyway). We have some Eisenstein series so for some low levels there will be other dirty hacks involving linear combinations of products but I don't really think it would be a productive long-term strategy. Does LMFDB store modular forms internally using modular symbols somehow? Perhaps it would be worth discussing exactly what's going on there, as this might be the best way to link modular forms between the two systems.
David Ang (Dec 31 2024 at 01:26):
All the BSD invariants I intend to formalise should work over number fields as well (for the tab "elliptic curves over Q(\alpha)"), and I intend to look into defining the Sato-Tate group (which I think is a new thing in LMFDB?)
David Roe (Dec 31 2024 at 01:40):
We store modular forms internally basically by storing their q-expansions up to the Sturm bound (usually further, actually). For newforms where we store the coefficients as algebraic integers, while for larger coefficient fields we only store complex approximations for various embeddings.
We do store polynomials so that the modular form can be reconstructed as the kernel of certain linear operators on a space of modular forms (see cmf.hecke_kernels), so that might be a way to proceed.
David Roe (Dec 31 2024 at 01:46):
@David Ang, Sato-Tate groups have been around in the LMFDB for a while (8 years or so), though they have been expanded since then. I'm glad that you're working on formalizing BSD invariants! There are some subtleties over number fields; see the knowl on global period for example.
Chris Birkbeck (Dec 31 2024 at 10:22):
I think this is a great idea and I'd be happy to contribute any way I can. I expect the number field bits of the lmfdb is probably the easiest thing to do. As for modular forms, Kevin is right we don't currently have good constructors for general levels, but I am currently working on finite dimensionality and Hecke operators which would help towards the Hecke kernels approach.
Chris Birkbeck (Dec 31 2024 at 10:23):
Further down the line when Kevin proves Jacquet--Langlands we'll have some great constructors for modular forms as well
Kevin Buzzard (Dec 31 2024 at 11:12):
Jacquet-Langlands is many years away!
Anne Baanen (Dec 31 2024 at 13:12):
This is a very exciting project and I expect that such a bridge will be very useful!
The Lean Forward project has sadly run its course but now the VU Amsterdam has Formalizing Diophantine Algorithms run by Sander Dahmen with @Alain Chavarri Villarello. The paper on certifying rings of integers in number fields that Alain, Sander and I wrote recently also concerns connecting LMFDB entries with Mathlib results.
I'll also tag @Alex J. Best and @Nirvana Coppola.
Andrej Bauer (Jan 01 2025 at 18:07):
Hello. Our experience was that JSON is a good way to interface with external databases. This works quite well with the FromJson
type class. Since you're not aiming to generate verified certificates, this shouldn't be too complicated. Let me know if you'd like to pick my brain (and that of @Katja Berčič, @Jure Taslak and @Gauvain Devillez), we can show you which parts of our code you could scavenge.
David Roe (Jan 01 2025 at 18:08):
Thanks! We're aiming to actively start working in late spring or summer, so will reach out at that point.
Andrej Bauer (Jan 01 2025 at 18:09):
Ok, if I am not here, find us by email.
Last updated: May 02 2025 at 03:31 UTC