Zulip Chat Archive

Stream: general

Topic: How to Lean for large math formalisation projects (often CI)

Emily de Oliveira Santos (Jan 09 2026 at 15:44):

Longer Title: How can I learn to code in Lean with the goal of eventually working on large mathematics formalisation projects? — CI, scalability, best practices, cleanliness of code, maintainability, etc.

Recently I posted a question over the Proof Assistants Stack Exchange on how to approach Lean as not only something you learn to code with, but as something you learn to "software engineer" with. (I'm aware this sentence sounds a bit horrid to the ear, sorry!)

My main worry (and request for help there and now here) is that while I see a number of introductory sources like Mathematics in Lean, The Mechanics of Proof, etc., there doesn't really seem to be that many resources on how to bridge the gap between knowing how to fill a sorry and knowing how to work (and maybe even lead!) large-scale projects.

Some of the related skills I want to learn but don't know how:

Identifying bad Lean code: overly complicated, brittle, hard to maintain, unclear, not scalable, etc.
Knowing how to properly package new definitions, packages, reusable lemmas, etc. in a way that won't generate technical debt a couple months down the line.
Knowing how to optimise Lean code, as well as having a better idea of what breaks what and which things are obvious resource sinks for the compiler.
Knowing when to prefer certain strategies over others (i.e. writing efficient code).

While I'm aware a large amount of this comes with practice, I suppose there's a good chance there might be some good resources on it, too...

...right?

Vlad Tsyrklevich (Jan 09 2026 at 16:18):

The library style guidelines Jason linked to are a good resource for stylistic issues

Vlad Tsyrklevich (Jan 09 2026 at 16:23):

I don't think much has been written about optimization. I'm not really sure what else you're looking for. I don't think resources that bridge being able to write a function and lead a large-scale project in the software engineering world exist either, this always comes with experience.

Yaël Dillies (Jan 09 2026 at 17:05):

The info you are asking for is something I would like to write a paper about. Unfortunately, I have had other priorities as of late

Michael Rothgang (Jan 09 2026 at 17:06):

In my Lean course (co-taught with Floris van Doorn), I talked about code style, linting, golfing and gave some advice for formalisation projects etc. I didn't mention profiling much; that should have its own little guide. Here's the lecture file I used: https://github.com/fpvandoorn/LeanCourse25/blob/master/LeanCourse25/Lectures/Lecture21After.lean

Michael Rothgang (Jan 09 2026 at 17:06):

This won't answer all your questions, but it might help with some of them.

Michael Rothgang (Jan 09 2026 at 17:08):

Re: writing efficient code --- you're talking about writing proofs, right? (You can also write programs in Lean... optimising them is a different skill; this would have different answers I know less about.)

Michael Rothgang (Jan 09 2026 at 17:09):

This paper describes some tools we use to keep scaling mathlib. (Full disclosure: I'm one of the authors.) Likewise, I only expect it to answer some of your questions.

Michael Rothgang (Jan 09 2026 at 17:10):

In general, "how do I formalise area X best" depends on the area --- so ideally there'd be a paper (or several!) for each area, explaining the trade-offs, pitfalls etc. in setting things up. Ideally https://leanprover-community.github.io/papers.html has them all; if you spot anything missing, please speak up so they can be added.

Michael Rothgang (Jan 09 2026 at 17:10):

(Some areas don't have such a paper yet. This includes differential geometry; my medium-term goals include changing this.)

Artie Khovanov (Jan 09 2026 at 18:21):

I have learned largely by PRing to Mathlib, getting feedback, discussing in #mathlib4 , etc
(I was also fortunate that more experienced users have kindly given a lot of time)

Snir Broshi (Jan 09 2026 at 19:03):

Michael Rothgang said:

This paper describes some tools we use to keep scaling mathlib.

Does https://www.youtube.com/watch?v=cabiomTEWe0 cover most of it?

Patrick Massot (Jan 09 2026 at 20:54):

Emily, I’m afraid this is rather close to the classic question “why are there so many books about mathematics and no clear instruction about how to do mathematical research?”. Some things simply seem very hard to communicate, you mostly have to experience them.

Michael Rothgang (Jan 10 2026 at 08:19):

Snir Broshi said:

Michael Rothgang said:

This paper describes some tools we use to keep scaling mathlib.

Does https://www.youtube.com/watch?v=cabiomTEWe0 cover most of it?

Kind of, yes: My talk at the EuroProofNet workshop (September 2025) is a bit longer and more up to date. Let me check if there is a recording.

Michael Rothgang (Jan 10 2026 at 08:31):

Yes, there is: https://youtube.com/watch?v=9qRZS4b3J6E&list=PLaT9F1eDUuN1NFuCZmlbiHJOHNPrqh0xf&index=1

Emily de Oliveira Santos (Jan 10 2026 at 18:05):

Hey everyone, thank you so much for your replies in this thread, they are extremely helpful!

Michael Rothgang said:

Re: writing efficient code --- you're talking about writing proofs, right? (You can also write programs in Lean... optimising them is a different skill; this would have different answers I know less about.)

Yep!

Patrick Massot said:

Emily, I’m afraid this is rather close to the classic question “why are there so many books about mathematics and no clear instruction about how to do mathematical research?”. Some things simply seem very hard to communicate, you mostly have to experience them.

That’s a really great way to put it. I think reading everyone’s replies here made me realise that perhaps the “correct” question to be asking might be more akin to something like “What are some tasks I can do in order to improve my understanding of Lean [with the ultimate goal of being able to work on large-scale projects]?”

I think some of the pointers here form good answers to this: Rothgang’s lecture file, the Growing Mathlib paper, and several of the papers in https://leanprover-community.github.io/papers.html. Artie's reply also made me realise that one other such task would be studying Mathlib PR’s and the feedback given there and in #mathlib4.

I’ll try putting some work into these pointers and keep hanging around here in the Zulip. Thank you very much once again, everyone! :)

Vlad Tsyrklevich (Jan 10 2026 at 20:04):

My recommendation, assuming you are interested in theorem proving for math and not CS, would be to read and do all of the exercises in the first ~9 chapters of #MIL, and then formalize some results you know--starting from simple lemmas and building up. Once you are proving things on your own, you will already be making use of mathlib related to your formalizations and likely already starting to find missing lemmas you think should be there.

Michael Rothgang (Jan 11 2026 at 08:49):

Adding to Vlad's answer: there is something to do in almost every area. This might be easier or harder to approach --- but asking "I'm interested in X, are there interesting projects to work on" will usually yield helpful ideas.

Emily de Oliveira Santos (Jan 14 2026 at 15:27):

Hey Vlad and Michael!

I'm currently going through The Hitchhiker's Guide to Logical Verification and am planning to do MIL next (or maybe functional programming with Lean and then MIL), and after that (and possibly more books; I'm enjoying them quite a bit) find projects to work on. (As you can see I'm happy with learning CS too :)

Thank you both very much for the tips!

Lua Viana Reis (Jan 15 2026 at 13:07):

(deleted)

Lua Viana Reis (Jan 15 2026 at 13:49):

(deleted)

Last updated: Feb 28 2026 at 14:05 UTC