Zulip Chat Archive
Stream: mathlib4
Topic: Spinning off `mathlib-linters` and `shake` repo
Anne Baanen (Apr 07 2025 at 10:41):
I've been working on CI for non-Mathlib projects, and as part of that I would like to spin off the linters and the shake
command into their own repositories (let's say leanprover-community/mathlib-linters
and leanprover-community/shake
), so they can be used by other projects too. What do you think, anything I should keep in mind (e.g. linter PRs that still need reviewing)?
Edward van de Meent (Apr 07 2025 at 10:43):
i imagine it's good to make sure it does not lint files that are from another project. I.e. it won't shake mathlib when the project depends on mathlib, for example
Anne Baanen (Apr 07 2025 at 10:51):
True! Thankfully, this should already be the case. The linters are already disabled by default but can be opted into (see also #23670). And shake
only runs on files in the same package:
-- Only submodules of `pkg` will be edited or have info reported on them
let pkg := mods[0]!.components.head!
(Although shake
still defaults to running on Mathlib
, we should probably use the default_target
from the Lakefile instead.)
Michael Rothgang (Apr 07 2025 at 12:03):
Does your proposal mean "I'll manually copy the mathlib files" or "I propose moving those linters into a shared repository"?
Michael Rothgang (Apr 07 2025 at 12:03):
I see no real downside to the former --- if e.g. the mathlib version is the reference copy and you manually sync over things. (Perhaps stick a comment to make changes in mathlib first, then it's good enough.)
Michael Rothgang (Apr 07 2025 at 12:10):
For the latter, I worry to what extent this would make development of linters harder or slower. At the same time, re-using mathlib infrastructure more widely is certainly nice, so solving these problems could be nice. My worries would be fully assuaged by:
- ensuring changes to the linters repository hit mathlib quickly: perhaps the batteries infrastructure could be re-used here? (Developing linters which require mathlib changes will be annoying; I know core/batteries changes have adaptation branches and so on. This is definite overhead, and would e.g. slow me down slightly.)
- figuring out how to deal with in-flight linter PRs. (One possible policy: files with no open PRs are fine to move up the mathlib-linters repository; for the others, open PRs are merged first. That's also an incentive to review those PRs quickly :-))
- ensuring some visibility into mathlib-linter PRs: such as, hooking up the maintainer merge infrastructure or beefing up the queueboard to include this.
To be honest, I'm the least sure how to address the last point (and it should certainly not be a blocker to experimenting). But I care quite a bit about this; mathlib reviewers for linters are already missing/stretched thin --- changing to a future where such PRs languish because it's a separate repository is something I'd rather avoid.
Michael Rothgang (Apr 07 2025 at 12:10):
To emphasize: these are good questions to ask, thanks for starting this discussion! But I am wondering about some bigger-picture questions, that could be good to discuss first.
Michael Rothgang (Apr 07 2025 at 12:11):
Taking a step back: it's not an option that your projects depend on mathlib, right?
Anne Baanen (Apr 07 2025 at 12:35):
Yes, it would be the latter option of moving the files, not copying them.
- Updating the linters should use the same
update
machinery that we use for other dependencies. One goal of mine in the coming weeks is to make automatic updates like this available everywhere. - I would be happy to be assigned to review all the linter PRs for the coming time. I think the time between PRs is low enough that we can declare a one-day moratorium to move everything over, and otherwise we can follow the rule you proposed about only moving untouched files. (I see quite a few linter PRs but only two of them are on the review queue.)
- Making sure everything stays on the radar would indeed be a key step. Do you/@Johan Commelin have an indication how hard it is to hook up extra repositories to the queueboard? The
maintainer merge
workflow seems easy enough to copy from the mathlib repo.
Eric Wieser (Apr 07 2025 at 12:43):
Maybe it makes sense to put the "shake" discussion in a separate thread?
Michael Rothgang (Apr 07 2025 at 12:43):
My impression is that shake is mostly "finished", hence doesn't see much churn. I don't see real downsides to moving that to a separate repository.
Michael Rothgang (Apr 07 2025 at 12:44):
@Anne Baanen Thanks for clarifying. What you're saying sounds very reasonable. You raise a good question about the queueboard.
Michael Rothgang (Apr 07 2025 at 12:47):
If the aim is to "have a separate dashboard for separate repositories", that is definitely on the roadmap (but will require some refactoring - which so far was not-to-high priority). However, we probably want to merge the data from the different repositories (i.e., show one review queue for all repositories together).
That will require some refactoring, but not too painful. I would estimate that one or two days of work should suffice. (Perhaps less, perhaps more: with CI, it's hard to be sure - and the queueboard is complex enough that unforeseen events do and will happen.)
Damiano Testa (Apr 07 2025 at 13:24):
A side comment: when writing a new linter, it is very useful to run it on all of mathlib, fixing bugs and modifying it. Thus, I am in favour of moving "established" linters to a separate repository, but I would rather keep the development of "new" linters into mathlib first and then upstream.
Damiano Testa (Apr 07 2025 at 13:24):
This also helps making the compliance of mathlib with the linter a non-issue.
Anne Baanen (Apr 07 2025 at 13:26):
Good point, we might want to keep a few linters in Mathlib until they are more stable. (The opposite argument for shake
, essentially!)
Ruben Van de Velde (Apr 07 2025 at 16:03):
Maybe shake should go into the import graph repo?
Anne Baanen (Apr 07 2025 at 16:06):
Shake is in principle independent of import-graph, but I agree both could also be usefully combined into one repo.
Mario Carneiro (Apr 07 2025 at 19:04):
aren't the linters already in another repo, namely batteries?
Michael Rothgang (Apr 07 2025 at 21:41):
Many, but not all: mathlib's environment linters and all of its syntax linters are in mathlib
Jon Eugster (Apr 07 2025 at 22:52):
Speaking of import-graph
, from my experience extracting that script into it's own repository there is the question about maintenance and code review standards.
For import-graph I'd say the current state is:
@Kim Morrison needs to bump the repo frequently as part of the entire update-routine, but otherwise the repo has very little activity besides me and Kim occasionally pushing a feature without much of a reviewing process.
For import-graph that's fine because the script isn't really used by any parts of the mathlib workflow. For linters that might not be desirable and you'd probably want more reviewing infrastructure (& people actually looking at the open PRs). I'd probably give Mario's suggestion of upstreaming linters to batteries
a thought.
Kim Morrison (Apr 08 2025 at 05:12):
I don't think we should combine shake
and import-graph
unless there is some positive reason to do so (e.g. an modification to one or the other would require a dependency, and there is some pressure to flatten the dependency structure --- seems very unlikely to me).
Kim Morrison (Apr 08 2025 at 05:13):
I think shake
is pretty safe, it has barely needed touching since Mario first wrote it. I don't recall if/when we've previously discussed moving shake
to a separate repo, so it would be good to get a +1 from Mario on the general principle!
Kim Morrison (Apr 08 2025 at 05:14):
I agree that some linters could go in Batteries (and indeed to Lean), but only linters that all Lean users would agree they likely want turned on or available. Anything that is mathematics specific, or part of the opinionated style of Mathlib, should go into mathlib-linters
.
Yury G. Kudryashov (Apr 08 2025 at 05:32):
What's wrong with having disabled opinionated linters in Batteries?
Michael Rothgang (Apr 08 2025 at 05:36):
How would mathlib enable these? IIUC, there is no good solution for these.
Michael Rothgang (Apr 08 2025 at 05:38):
(The other reason: batteries doesn't have any syntax linters so far. Though I don't know if that is deliberate.)
Mario Carneiro (Apr 08 2025 at 07:18):
Michael Rothgang said:
How would mathlib enable these? IIUC, there is no good solution for these.
If we can't do this then the linter framework isn't ready for use. That's literally how they are supposed to be used
Mario Carneiro (Apr 08 2025 at 07:19):
Michael Rothgang said:
(The other reason: batteries doesn't have any syntax linters so far. Though I don't know if that is deliberate.)
Batteries has two syntax linters rn IIRC, unreachableTactic
and unnecessarySeqFocus
Anne Baanen (Apr 08 2025 at 10:16):
Mario Carneiro said:
Michael Rothgang said:
How would mathlib enable these? IIUC, there is no good solution for these.
If we can't do this then the linter framework isn't ready for use. That's literally how they are supposed to be used
I don't see why the current approach is broken, having the linter's option disabled by default but enabled in Mathlib's Lakefile. (Ideally we'd add the possibility to declare one option for turning on/off multiple linters at once, but that would require some changes to core Lean afaict; see also #23670.)
Michael Rothgang (Apr 08 2025 at 11:52):
Can the lakefile also be used to control the default setting of environment linters? (So batteries could gain some stable mathlib environment linters, off by default, with mathlib activating them: but it'd be really nice if writing #lint in mathlib also ran the linters mathlib wants.)
Anne Baanen (Apr 08 2025 at 13:02):
I had a discussion with @Kim Morrison on enabling/disabling all linters, and I think we've come up with something useful: we would replace the current set_option linter.all
mechanism. There would be a new option, let's call it linter_set
that can take multiple values including all
, none
and default
. Depending on which linter_set
option we choose, different linters will be enabled by default.
So in the lakefile we'd write leanOptions := [⟨
linter_set, mathlib⟩]
and somewhere early on in Mathlib we'd write something like
declare_linter_set mathlib := [
⟨`linter.hashCommand, true⟩,
⟨`linter.oldObtain, true⟩,
⟨`linter.style.cases, true⟩,
⟨`linter.style.cdot, true⟩,
⟨`linter.style.docString, true⟩,
⟨`linter.style.longFile, .ofNat 1500⟩,
...
]
(syntax is of course bikesheddable!).
We can also have a recommended
linter set for downstream projects, which might be somewhat stronger than default
but not as pedantic as mathlib
.
Anne Baanen (Apr 08 2025 at 13:03):
For environment linters, as long as they use docs#Lean.Linter.getLinterValue to check if they should be enabled, the same would work.
Mario Carneiro (Apr 08 2025 at 13:25):
Anne Baanen said:
For environment linters, as long as they use docs#Lean.Linter.getLinterValue to check if they should be enabled, the same would work.
they don't; environment linters use a completely independent framework from regular linters.
Anne Baanen (Apr 08 2025 at 13:35):
Oh, is this why sometimes we have to write @[nolint foo]
and sometimes set_option linter.foo false in ...
?
Anne Baanen (Apr 08 2025 at 14:09):
Kim Morrison said:
I think
shake
is pretty safe, it has barely needed touching since Mario first wrote it. I don't recall if/when we've previously discussed movingshake
to a separate repo, so it would be good to get a +1 from Mario on the general principle!
@Mario Carneiro, can I ask you directly if you think a separate shake
repo is a good idea?
Mario Carneiro (Apr 08 2025 at 14:10):
I'm generally dubious of making new repos, I think they add more maintenance burden and possibility for version mismatches compared to the benefits one gets from them
Mario Carneiro (Apr 08 2025 at 14:11):
I think the considerations for shake
are similar for any other tool that is not very large in terms of metaprogramming code
Anne Baanen (Apr 08 2025 at 14:13):
But depending on the whole of mathlib just to be able to run shake would also cause maintenance burden downstream, no?
Mario Carneiro (Apr 08 2025 at 14:14):
but moving it to batteries would largely mitigate that concern I think?
Anne Baanen (Apr 08 2025 at 14:15):
Yes, that should reduce the update burden by quite a lot. Still, shake
right now only imports Lean
itself, no batteries. So it would still bring in unnecessary dependencies.
Mario Carneiro (Apr 08 2025 at 14:19):
? Shake would still not import batteries
Mario Carneiro (Apr 08 2025 at 14:19):
batteries would include shake
Mario Carneiro (Apr 08 2025 at 14:19):
and this seems to be an argument against any grouping of code
Mario Carneiro (Apr 08 2025 at 14:20):
I think lake just needs to get gud, I don't like that lake limitations are being used to justify fracturing the ecosystem
Anne Baanen (Apr 08 2025 at 14:30):
Okay, let's say we have project A that uses shake and project B that uses A, shake and batteries. Now we add a new feature to shake, so A wants to update. If shake is an independent repo, A and B can do an easy update (which can be entirely automatic using e.g. the lean-update
action).
If shake is part of Batteries, then A can do an easy update (since they don't actually import anything from that library), but B now needs to make fixes to update their Batteries dependency or have a mismatching dependency compared to A.
Version mismatch is much more of an issue with a language like Lean where implementation details absolutely matter (since e.g. definitional equality is very easy to break). I don't think it's a matter of telling Lake to "git gud" to solve that.
Anne Baanen (Apr 08 2025 at 14:33):
Regardless, looks like we agree that moving shake to batteries would be an improvement over the status quo.
Mario Carneiro (Apr 08 2025 at 14:34):
Also shake is what I would call a "dev dependency", whether it is used in A should not affect its use in B or A's versioning
Anne Baanen (Apr 10 2025 at 12:10):
So I made PRs upstreaming shake to batteries:
- batteries#1204
- Mathlib adaptation: #23916
Michael Rothgang (Apr 10 2025 at 13:14):
The batteries PR basically looks good to me. (In mathlib, I'd maintainer delegate it, which I cannot do in batteries.)
Mario Carneiro (Apr 10 2025 at 15:09):
(You can still review batteries PRs using the github interface)
Kim Morrison (Apr 10 2025 at 23:41):
I disagree here, and think shake
should just be its own repo. There's no positive reason to put in the Batteries, and it's such stable code that there's very little cost to having it in a new repo.
The positive reason for having it in its own repo is of course that projects which don't otherwise depend on Batteries can more easily use it (e.g. quote4
, lean4checker
, doc-gen4
, verso
, lean4-cli
, plausible
, just naming those in the upstream-of-Mathlib ecosystem).
(The secondary positive reason is just as an examplar that small modular tools can be independent repositories, just like in the rest of the software universe. :-)
Last updated: May 02 2025 at 03:31 UTC