Zulip Chat Archive

Stream: mathlib4

Topic: linter to generalize theorems

N. V. Lang (Aug 25 2025 at 16:33):

Hi there! I'm a CS undergrad looking to make a small contribution to Lean4 / mathlib4 for my bachelor's thesis. I'm particularly interested in working on a linter which suggests ways in which a theorem can be generalized (e.g., type class generalization).

I've seen work done on something like this by @Vlad Tsyrklevich (#PR reviews > #22337 type class generalization ) and @Jovan Gerbscheid (#mathlib4 > Linter for generalizing type class hypotheses ), so I think it should hopefully be doable. I've also noticed that something similar was implemented for Lean3 by @Alex J. Best (paper: https://ceur-ws.org/Vol-3377/fmm12.pdf).

I don't have experience with Lean, but I'm eager to change that. I've got some experience with VS Code extension development. I "officially" have ~300hours to contribute. And I love math.

My question for the community is as follows: Given that I'd like to build something useful to the people working on mathlib4 — ideally in the form of a linter which suggests ways to generalize theorems — could you give me your thoughts regarding...

...whether there's any other mechanisms (besides typeclass generalization) through which theorems might be automatically generalized;
...how much room for (feasible) improvement there is in prior work (i.e., do you think this is a dead end?);
...whether you think there's some other specific aspect(s) of mathlib4's developer experience I could do more good with.

Thank you so much for your time!

Vlad Tsyrklevich (Aug 25 2025 at 17:13):

I think ~300 hours is a very short amount of time to get up to speed in Lean and Lean metaprogramming. It took me a significant investment to begin to feel confident using Lean as a theorem prover, and my brief foray in metaprogramming barely scratched the surface of whats possible there. Picking a small well-scoped project would probably be best if you still wanted to pursue this, but I'm not sure what would be appropriate.

Vlad Tsyrklevich (Aug 25 2025 at 17:16):

The new #mathlib4 > Tactic analysis framework seems like it may offer some low-hanging fruit for useful contributions, but I haven't looked at it enough to suggest anything specific.

Michael Rothgang (Aug 25 2025 at 17:33):

Another idea for a useful project: a tool to suggest names for theorems --- given a theorem, suggest a name according to mathlib's naming convention (optional: validate that an existing name is fine; bonus: allow some kind of hard-coded allowlist to deal with existing violations). @Robin Arnez wrote a prototype; see the discussion here: #lean4 > automatic spelling generation & comparison. (I don't know how much more work they're thinking of putting in, you should certainly coordinate before charging ahead.)

Making this truly useful is hard, because the naming conventions are long, mathlib is huge (so you will find almost every corner case you can think of, and then a few more), and judging if a theorem is badly named works best if you're somewhat familiar with mathlib.
At the same time, this tool could be very helpful, both to newcomers (learn how to name theorems) and for checking compliance at scale.

Anand Rao Tadipatri (Aug 25 2025 at 17:53):

Another approach is to generalize a constant in a theorem by determining which of its properties were used in the proof term, and @AG and I have implemented a Lean tactic of this flavour here: https://github.com/Human-Oriented-ATP/automatic-proof-generalization.
A Mathlib linter for generalization would be very welcome! One crucial detail regarding the above algorithm is that it needs to be provided a constant in the theorem to generalize, so any linter following this approach must be paired with a sensible heuristic for picking a term from the theorem statement to generalize.

Jovan Gerbscheid (Aug 25 2025 at 18:47):

For type class hypothesis generalization, there are 2 different approaches:

you could work on the syntax level, which is what Vlad Tsyrklevich did. Here the underlying idea is very simple; simply try to replace A with B for some specifically chosen type classes A and B and see if the theorem still type checks.
you could work on the expression level. This is what I did and what Alex J. Best did. I think this approach has more potential for making a generally useful tool. My implementation was quick and dirty and could be improved to generalize some more cases, and to suggest the most general generalization, instead of the least general generalization. However, the issue I ran into was that there were already over 2000 theorems flagged by the program, most of which were intentionally not generalized, for various reasons (although we did find a few theorems that genuinely should be generalized). I tried to filter out some, but it was still too many false positives. Looking at the paper it seems that Alex J. Best had a similar amount in his lean3 implementation. So if you want to make this into a linter useful for mathlib, I think you'll have to find a way to get less false positives. If we can't work around the many false positives, there could still be value in a linter that can be turned on locally, so that people can use it if they think that a theorem could be generalized.

Jovan Gerbscheid (Aug 25 2025 at 18:52):

Also feel free to ask me about the code I wrote. I don't think I explained in too much detail. Or just follow the paper by Alex J. Best :smile:

N. V. Lang (Aug 25 2025 at 21:21):

Thank you all so much for your comments! I deeply appreciate them, and will look into each of your suggestions, and report back as I go.

@Vlad Tsyrklevich Thank you for your suggestion! I suspect it'll take me a whole bunch of reading to really understand what the tactic analysis framework is about, but it does sound very exciting! And indeed, 300 hours is not that much, but I'm very eager to learn Lean even outside of the context of the project, and the hours are merely a guideline, so I'm hopeful it'll be fine. I will certainly keep your word of caution in mind though.
@Michael Rothgang I really like the theorem naming suggestion, thank you! It sounds very much like the kind of contribution I was hoping to make: something relatively small which might just be too tedious or boring for experts to tackle, but which could still improve DX and save people some time better spent elsewhere.
@Anand Rao Tadipatri This sounds great, thank you! I have to admit, though, that I thought that this was precisely what typeclass generalization was about, too.
@Jovan Gerbscheid Thank you for your comments, they're very helpful! You mention something which I also thought about a bit, namely intentional specialization. I have to say, though, that I couldn't come up with any reasons (beyond just personal preference) for why a theorem shouldn't be generalized. May I ask, if a generalization would've technically worked, how did you determine what to deem a false positive?

Jovan Gerbscheid (Aug 25 2025 at 23:44):

There are a couple of reasons why generalizations might not be wanted

In some situations, generalizing a type class assumption doesn't actually generalize anything mathematically. This was discussed here: #mathlib4 > Elab to generalize type classes for theorems @ 💬 .
When a lemma is proved with the sole pupose of using it in some theorem, then it doesn't make sense to generalize that lemma if we can't also generalize the corresponding theorem.
Synthesizing a more general type class assumption takes more time. This is a bit unfortunate, but by making a theorem more general, we also make it slower, because type class search now has to do a greater search.

There are also some cases where something technically is a generalization, but it isn't actually a nicer hypothesis:

The generalized assumption is more complicated, for example turning [NoMaxOrder α] into [NoMinOrder αᵒᵈ] is not a nice change.
When there exist cyclic instances (e.g. Zero α and OfNat α 0), then the linter might suggest one as a generalization of the other, for example it may suggest replacing Nonempty α with Inhabited α.

Artie Khovanov (Aug 25 2025 at 23:44):

This sounds great, thank you! I have to admit, though, that I thought that this was precisely what typeclass generalization was about, too.

@N. V. Lang not all properties are given via typeclasses. In fact, properties of terms are almost never given via typeclasses.

Michael Rothgang (Aug 28 2025 at 20:57):

I just thought about this again today, and wanted to emphasize that there is lots of helpful tooling for generalisations. Let me take a recent example of mine, generalising typeclasses from e.g. NormedAddCommGroup to ENormedAddCommMonoid (or its weaker variants). Having a tool to help with such generalisations would be really useful. Some aspects that make it useful are the following:

Take dependencies into account: often a lemma can be generalised once all the lemmas it uses are.
In practice, this means it's smarter to traverse files in e.g. topologically sorted order (meaning you parse dependents before dependencies).
(A corollary: don't just analyse lemmas at random/in random order, but try to take dependencies into account.)
Definitions matter: a lemma may be generalisable, but only once its definitions are. This point is subtle: often just weakening typeclasses (when possible) is the mathematically wrong thing to do. If it's fine, this will enable automatically finding lemmas that can be generalised.
Perhaps it's useful to start with an allowlist, of the form "such generalisations are always good". To give an example (biased from my perspective), I'd be happy to learn about any lemma which can be generalised from NormedAddCommGroup to ENormedAddCommMonoid (or a weaker variant).

This is the "basic" case of automatic refactoring --- when you only need to change typeclass assumptions. When making a PR, I often realise a lemma needs to be ported in a natural way, but that involves edits. Doing that automatically seems much harder, but even having a simple tool would already help --- by indicating where to look for low-hanging fruit.

N. V. Lang (Aug 29 2025 at 17:08):

@Michael Rothgang Thank you so much for your comment! I'm very glad to hear that a linter like this could be helpful, and it'll help me make the case to my advisor.

Your point on dependencies is a very good one, and I suspect that it'll also be very relevant to performance.

I also agree that blindly weakening typeclasses without taking into account definitions would be counterproductive. I like the idea that adding weaker definitions should potentially trigger generalization hints on theorems referencing stronger definitions, which would essentially make the linter orient itself based on existing definitions (which is, I think, what you're mentioning), as opposed to suggesting ad hoc "minimal definitions".

In any event, I'm still reading into Lean 4 metaprogramming, after which I hope that I'll be able to better take advantage of your (and the community's in general) advice and also make more informed questions, should they arise.

Last updated: Feb 28 2026 at 14:05 UTC