Zulip Chat Archive
Stream: mathlib4
Topic: Deprecation policy
Michael Rothgang (May 02 2024 at 13:18):
In the past weeks, the question of a deprecation policy for mathlib has come up a few times. I think it would be helpful to make a formal decision, to simplify guidance for future PR authors. Let me suggest a first draft, based on what I took away from discussions so far.
Michael Rothgang (May 02 2024 at 13:19):
Mathlib deprecation policy
This policy applies to all items in mathlib tagged with the @deprecated
attribute. This can be aliasses, but also definitions, theorems and lemmas.
Deprecating items. When deprecating a new item, please add the current date (to allow removing it in the future).
Since Lean 4.8, the @deprecated attribute allows including a date in the since
field (as a string of the form 2024-04-01
).
Some existing items have no deprecation date; PRs adding them are welcome.
Please deprecate when renaming. When renaming a definition or theorem, please create an alias
for the old name and mark it with deprecated
. This step can be skipped when the declaration is very new (say, less than one week).
There is no need to create aliasses for renaming instances.
For automatically generated declarations (e.g. by to_additive
or simps
), please ensure the generated declaration is also tagged with this attribute.
Removing deprecated items.
Deprecated items can be removed from mathlib after <time> after their initial deprecation date. PRs doing so are welcome.
Michael Rothgang (May 02 2024 at 13:19):
I would appreciate comments on e.g the following questions:
- Any general comments about this? Missing aspects?
- Time-frame for skipping a deprecation.
- Are there other cases of generated declarations which could be deprecated? I've only seen this with
to_additive
(added: andsimps
) so far. - Time-frame when items can be removed is up for discussion. (So far, I gather that "a year" is considered plenty and "six months" considered sufficient by some, but at least "three months" seems to be desired. But that is probably not a representative sample/will have overheard some!)
If/once there's broad consensus, we could e.g. have a poll on the last question (which is perhaps the most controversial).
Michael Rothgang (May 02 2024 at 13:20):
Currently in mathlib,
- some declarations generated by
to_additive
are missing this attribute (update: it seems there is only one offender, fixed in #12597), - not all deprecated declarations are tagged with a date (my WIP branch branch#MR-deprecation-dates-all makes some progress on this; some PRs for this are open for review)
- deprecation dates are using doc comments; I presume PRs converting this to the
since
field are welcome
Michael Rothgang (May 02 2024 at 13:23):
My personal opinion (but I'm not a maintainer, so no need to weigh it highly): I think at least four months could be useful. In practice, deprecation dates are usually the date when the PR is filed, and merging it can take a while (especially with big refactoring PRs). A four months policy ensures the deprecation will be on mathlib for at least three months.
Johan Commelin (May 02 2024 at 13:44):
Quick note, I think simps
also generates candidates
Ruben Van de Velde (May 02 2024 at 14:24):
Did we end up recommending that people update mathlib on the lean release cadence? If so, we could use x of those as a cutoff
Mario Carneiro (May 02 2024 at 16:06):
I would say 1 year max and 1 stable release minimum, and for things in the middle we try to keep them if it's not an undue amount of work
Julian Berman (May 02 2024 at 16:10):
(I have nothing to add, but this is great to see discussed/given attention to!)
Mario Carneiro (May 02 2024 at 16:15):
Michael Rothgang said:
My personal opinion (but I'm not a maintainer, so no need to weigh it highly): I think at least four months could be useful. In practice, deprecation dates are usually the date when the PR is filed, and merging it can take a while (especially with big refactoring PRs). A four months policy ensures the deprecation will be on mathlib for at least three months.
I think deprecation dates should reflect when they actually landed in mathlib, not when the deprecation was proposed. We may be able to do this more effectively by having PRs not put a date and then having a bot automatically add dates to unadorned deprecations directly on master
Yaël Dillies (May 02 2024 at 16:16):
Stupid consideration: What about the 100 chars line limit?
Mario Carneiro (May 02 2024 at 16:16):
try to leave enough space?
Mario Carneiro (May 02 2024 at 16:17):
the bot would probably fail and ping someone to help
Floris van Doorn (May 02 2024 at 17:39):
I would be fine not mandating deprecations for lemmas generated by simps
:
- They are generally not typed manually (though they could appear in
simp only
calls - It's very hard to have a lemma generated by
simps
in your code without the original definition, and thesimps
lemma should change very predictably. (assuming the only change is renaming thedef
.)
Floris van Doorn (May 02 2024 at 17:44):
Our policy should probably say something about renaming X
to Y
and in the same PR renaming W
to X
. E.g.:
We allow but discourage contributors to rename declaration
X
toY
and at the same time renamingW
toX
. In this case, no deprecation attribute is required forX
(but it is forW
).
But maybe we should have a better solution for this case?
E.g. a line in the docstring of (the new) X
of the form "On <date> X
was renamed to Y
."
Johan Commelin (May 02 2024 at 19:48):
Random thought: instead of recording the "date" when a decl was deprecated, we record the PR number... Tooling can figure out the date on which a PR landed. Much less hassle for users.
Yaël Dillies (May 02 2024 at 19:49):
That means tying the deprecation notice to a git repo, which we might not want to do sine the deprecation system is supposed to be used across the Lean ecosystem
Mario Carneiro (May 02 2024 at 19:56):
Lean is intrinsically connected to both git and github in several places. I don't think this is an issue
Mario Carneiro (May 02 2024 at 19:57):
In any case this is a mathlib-local decision. You can put (since := "#12345")
if you want and lean won't care
Patrick Massot (May 02 2024 at 20:18):
About deprecating stuff: I’m bumping MIL and I just met the effect of a line of Mathlib:
@[deprecated] alias op_norm_le_of_shell := opNorm_le_of_shell
Patrick Massot (May 02 2024 at 20:18):
with the associated code action. It’s really great!
Patrick Massot (May 02 2024 at 20:19):
Of course we can still dream of a command line tool doing all those renames for us. But this is already infinitely nicer than getting an error and searching the changelog website (and I don’t even mention the situation before the changelog).
Violeta Hernández (Aug 19 2024 at 02:07):
Is there a possibility to skip the deletion time period for aliases to lemmas that already see little to no use? Or at least make it shorter, e.g. 1 month.
Violeta Hernández (Aug 19 2024 at 02:14):
I find it's generally the untested stuff that's inconvenient or suboptimal
Kim Morrison (Aug 19 2024 at 02:43):
At an absolute minimum, every @[deprecated]
attribute should survive at least to the next time we move Mathlib to a new stable version of Lean and update the stable
tag.
Yury G. Kudryashov (Aug 19 2024 at 05:16):
Why is it hard to keep an alias for a longer time? Are you making 2 refactors one after another?
Kim Morrison (Aug 19 2024 at 07:13):
If it's just "cluttering your file", then please be patient and leave things in place for several months. You don't need to remember to do the cleanup yourself --- eventually we'll get this semiautomated.
Kim Morrison (Aug 19 2024 at 07:13):
If you want to reduce imports, it's fine to move deprecations to another file.
Kim Morrison (Aug 19 2024 at 07:14):
So generally I think the only reason to do really fast deprecations is when you need to reuse an existing deprecated name.
Damiano Testa (Aug 19 2024 at 07:20):
A few people use Mathlib to teach courses on Lean and these are, very often, run once a year.
For this kind of schedule, a "deprecation period" of at least 9 months, and preferably at least one year seems very useful.
Damiano Testa (Aug 19 2024 at 07:20):
For instance, the repository that I used for the module that I taught last October is on a Mathlib version from October (or at the very latest November). In a couple of weeks, I will start looking at updating it for the new academic year and if I have deprecations in place for the last year to help with the upgrade, I will be very happy!
Violeta Hernández (Aug 19 2024 at 07:25):
Alright then. I guess there isn't really any hurry.
Violeta Hernández (Aug 19 2024 at 07:26):
I guess I hadn't considered people outside of Mathlib might be using my code
Yaël Dillies (Aug 19 2024 at 07:35):
I would however strongly encourage doc-gen to let us filter out deprecated lemmas. It's currently very hard to read the docs#MeasureTheory.eLpNorm API because (precisely) half the lemmas are deprecated
Ruben Van de Velde (Aug 19 2024 at 07:57):
Deprecation warnings are primarily for code outside mathlib - you're already fixing the code inside mathlib anyway
Shreyas Srinivas (Aug 19 2024 at 08:50):
Yaël Dillies said:
I would however strongly encourage doc-gen to let us filter out deprecated lemmas. It's currently very hard to read the docs#MeasureTheory.eLpNorm API because (precisely) half the lemmas are deprecated
Disagree with this. Non mathlib savvy users need to know a definition exists but has been deprecated. A notice on the definition is better than the definition disappearing altogether. There are several ways a user could end up at a definition including neural net search tools that are not retrained on every stable release
Ruben Van de Velde (Aug 19 2024 at 08:53):
They should be flagged more clearly, though
Shreyas Srinivas (Aug 19 2024 at 08:54):
Ruben Van de Velde said:
They should be flagged more clearly, though
they could be sorted and colour coded, both on the main content div and the side panel listing. Maybe even in the search results
Ruben Van de Velde (Aug 19 2024 at 08:54):
And maybe we could add a checkbox in the UI to hide them, for people who prefer that
Shreyas Srinivas (Aug 19 2024 at 08:55):
sure. My main suggestion is : As long as a definition exists and can be found by search engines, it must show up in the docs and redirect users to the replacement.
Yaël Dillies (Aug 19 2024 at 09:05):
Shreyas Srinivas said:
Disagree with this. Non mathlib savvy users need to know a definition exists but has been deprecated.
I'm talking about "letting us filter" not "filtering for us"
Daniel Weber (Aug 19 2024 at 09:41):
Yaël Dillies said:
I would however strongly encourage doc-gen to let us filter out deprecated lemmas. It's currently very hard to read the docs#MeasureTheory.eLpNorm API because (precisely) half the lemmas are deprecated
As a quick solution you can use this userscript:
// ==UserScript==
// @name Lean Docs Remove Deprecated
// @namespace http://tampermonkey.net/
// @version 2024-08-19
// @description try to take over the world!
// @author You
// @match https://leanprover-community.github.io/mathlib4_docs/**
// @icon https://www.google.com/s2/favicons?sz=64&domain=github.io
// @grant none
// ==/UserScript==
(function() {
'use strict';
document.querySelectorAll('.decl').forEach(x => x.querySelector('.attributes')?.innerText?.includes('deprecated') && x.remove());
})();
Kim Morrison (Aug 19 2024 at 11:16):
I think it's also reasonable for files that are badly cluttered with deprecations to simply move all the deprecations into a separate file (even under Deprecated/
). They will still show up in the docs, but quarantined away from all the "live" stuff.
Yaël Dillies (Aug 19 2024 at 11:20):
That makes a lot of sense actually. Even just the bottom of the file would be better than "inline"
Shreyas Srinivas (Aug 19 2024 at 11:44):
Kim Morrison said:
I think it's also reasonable for files that are badly cluttered with deprecations to simply move all the deprecations into a separate file (even under
Deprecated/
). They will still show up in the docs, but quarantined away from all the "live" stuff.
how will they be linked to by moogle?
Eric Wieser (Aug 19 2024 at 11:45):
Kim Morrison said:
I think it's also reasonable for files that are badly cluttered with deprecations to simply move all the deprecations into a separate file (even under
Deprecated/
). They will still show up in the docs, but quarantined away from all the "live" stuff.
This has the effect of causing downstream code (or rebased code in mathlib branches) that doesn't import Mathlib
to report "name foo
not found" instead of "foo
is deprecated".
Artie Khovanov (Aug 27 2024 at 16:50):
Is there automatic removal of deprecated names?
Also, what is the policy on wholesale refactoring that goes beyond renaming / editing the proof? Must the old version be kept?
Yaël Dillies (Aug 27 2024 at 16:53):
No. ?_. No.
Yaël Dillies (Jan 07 2025 at 13:59):
In #20541 I am moving a bunch of lemmas that were renamed in July. To avoid cluttering the diff, I deleted the deprecated aliases at the same time. Are people happy with this?
Kevin Buzzard (Jan 07 2025 at 14:03):
Won't that have the opposite effect, cluttering the output of the github action declaration diff and furthermore making the blocks of deleted and included code even more different?
Eric Wieser (Jan 07 2025 at 14:24):
I think it would be better to delete the deprecated aliases in their own pr
Eric Wieser (Jan 07 2025 at 14:24):
I think the policy says that's fine for you to do as long as the PR deprecating them was actually merged and not just opened in July
Johan Commelin (Jan 07 2025 at 16:55):
Which is of course very annoying to check for every decl. So we need better tooling there, but I'm not exactly sure how it should work.
Johan Commelin (Jan 07 2025 at 16:55):
Maybe it can use git blame?
Eric Wieser (Jan 07 2025 at 17:11):
I think it should work by looking at the environment in a previous release
Eric Wieser (Jan 07 2025 at 17:11):
Arguably it's also fine to delete lemmas without a deprecation if they never made it into a release
Ruben Van de Velde (Jan 07 2025 at 17:25):
I disagree on that - that just makes life harder for people who update more frequently if they end up using anything new. We should be deprecating much more consistently, not less
Jireh Loreaux (Jan 07 2025 at 17:33):
I think for deleting deprecated aliases we should go solely off the since
date. It's up to the maintainer doing the merging in the PR introducing them to say: "you added these deprecations 2 months ago, that's not recent enough, please update the date: bors d+"
Eric Wieser (Jan 07 2025 at 17:41):
Maybe we should have deprecated (pending_since := "2025-01-07")
, and a nightly job that replaces all such markers with since := "current date"
?
Eric Wieser (Jan 07 2025 at 17:44):
Or even just deprecated (since := pending-date)
in the PR, which avoids the need for any human thought about dates at all
Eric Wieser (Jan 07 2025 at 17:44):
(pending-date
is the same length as "YYYY-MM-DD"
so avoids any line wrapping issues")
Eric Wieser (Jan 07 2025 at 17:45):
The other complexity that this introduces is that the nightly job has to run before a release is cut
Jireh Loreaux (Jan 07 2025 at 17:46):
(if we can avoid PRs rotting for months, then it doesn't matter anyway :laughing: I realize this isn't quite a fix as sometimes PRs can exist for a long time naturally.)
Michael Rothgang (Jan 07 2025 at 17:47):
Is there a special label for renaming PRs? Should there be one? (Should such PRs be prioritised?)
Jireh Loreaux (Jan 07 2025 at 17:49):
What about a GitHub bot which scans a PR before merging (just with grep, nothing fancy) and flags if the deprecations are older than, say, 2 weeks? I imagine this would fix 90+% of the issues, even if it weren't perfect. It would avoid the need for a maintainer to think, and it would prevent pending-date
cruft from appearing on master
.
Jireh Loreaux (Jan 07 2025 at 17:50):
Michael Rothgang said:
Is there a special label for renaming PRs? Should there be one? (Should such PRs be prioritised?)
The problem is that renaming happens outside of "renaming PRs" (i.e., outside of PRs which solely perform renames).
Jireh Loreaux (Jan 07 2025 at 17:52):
And I don't think we should be in a rush to merge renaming PRs unless there is already consensus, as this can lead to hasty renames which cause further headaches. In cases where there is consensus and the PR touches many files, it often gets posted in the speedy review thread anyway.
Edward van de Meent (Jan 07 2025 at 17:58):
i think using pending-date
is a feasable pattern... manually inserting a date pre-merging only makes it harder to automatically update the deprecations, since you'd need to go through the diff to decide if it was or wasn't there previously.
Edward van de Meent (Jan 07 2025 at 18:03):
as for what triggers replacement, i feel like it should be possible to have bors do this replacement while pushing to staging
, but i'm not familiar enough with bors to be the judge of that. if it's not happening with bors, we could have a CI trigger from a comment of lock deprecations
on the PR...
Edward van de Meent (Jan 07 2025 at 18:04):
we might then even implement unlock deprecations
which looks for a commit by the bot with appropriate header and reverts it!
Michael Rothgang (Apr 14 2025 at 09:22):
For the record: this thread seems to have converged, and there's a de facto deprecation policy now. https://github.com/leanprover-community/leanprover-community.github.io/pull/604 is documenting that (awaiting-author to incorporate some details; I'll ping this thread when I'd maintainer-merge this).
Michael Rothgang (Apr 14 2025 at 09:22):
I'd like to propose an addition to this deprecation policy: when a Lean module is removed, a module deprecation should be added. This will notify downstream projects (or mathlib branches) of files moving. Specifics:
- separate PRs: one PR should rename or remove the file (to keep git's diff clean), a follow-up PR should add the module deprecation. These PRs should be merged at the same time, and within the same release.
- specifics: a new deprecated module contains a copyright header and a
deprecated_module
command, no more deprecated_module
s keep the same removal schedule as for normal deprecations, i.e. six months after the PR is merged.
The last point is up for discussion.
See the other thread #mathlib4 > `Deprecated` folder for the prior discussion.
If there is agreement on this, I can PR this change to the webpage.
Edward van de Meent (Apr 14 2025 at 10:01):
Michael Rothgang said:
- specifics: a new deprecated module contains a copyright header and a
deprecated_module
command, no more
does this mean we do not allow imports which make sure we don't fully pull the rug from under users?
Kim Morrison (Apr 14 2025 at 10:15):
I thought the deprecated_module
file must have imports, because they are used to generate the replacement suggestion.
Kim Morrison (Apr 14 2025 at 10:16):
Michael Rothgang said:
deprecated_module
s keep the same removal schedule as for normal deprecations, i.e. six months after the PR is merged.
And with the same exception as for deprecated declarations: "and earlier is fine if there is a good reason, but please try to make sure it survives at least one month".
Eric Wieser (Apr 14 2025 at 10:18):
Where "one month" really means "is alive at one tagged release"?
Damiano Testa (Apr 14 2025 at 10:19):
I confirm that deprecated_module
uses the imports of the current file to suggest which modules to deprecate to. So, the structure of a "deprecated module" is
copyright
imports
deprecated_module (since "yyy-mm-dd")
except that the header linter will not enforce it.
Yaël Dillies (Apr 14 2025 at 10:19):
Why would there be a copyright in a deprecated module?
Damiano Testa (Apr 14 2025 at 10:20):
I'm not sure about this, I was just going for "minimal changes wrt the above"!
Damiano Testa (Apr 14 2025 at 10:21):
I would be happy with
imports
deprecated_module (since "yyy-mm-dd")
and I also would be ok with allowing something more (e.g. a module doc explaining the reason for the file existing, e.g. for doc-gen).
Damiano Testa (Apr 14 2025 at 10:23):
For instance, if removing a module is a long term refactor for some reason, the deprecated_module
command may get there early, but the file might still contain declarations that are planned to be removed? I am not sure.
Yaël Dillies (Apr 14 2025 at 10:24):
Is the syntax really deprecated_module (since "yyy-mm-dd")
and not deprecated_module (since := "yyyy-mm-dd")
?
Damiano Testa (Apr 14 2025 at 10:25):
The syntax is what is written above, without :=
. Did I mess up copying the @[deprecated]
syntax?
Damiano Testa (Apr 14 2025 at 10:26):
Yes, it looks like it. Let me amend this.
Yaël Dillies (Apr 14 2025 at 10:26):
I'm afraid so :sweat_smile: The syntax is @[deprecated decl "message" (since := "yyyy-mm-dd")]
Damiano Testa (Apr 14 2025 at 10:43):
Ok, #24032 makes the syntax line up, but it may be that Michael will add my PR to his #24028.
Damiano Testa (Apr 14 2025 at 10:46):
(Note that my PR is a superset of Michael's, but his came before Yaël noticed the missing :=
syntax from deprecated_module
.)
Michael Rothgang (Apr 14 2025 at 11:09):
Thanks for the comments so far. I will include those in a PR with the policy (assuming no further arguments against doing so).
Eric Wieser (Apr 14 2025 at 12:05):
Michael Rothgang said:
- separate PRs: one PR should rename or remove the file (to keep git's diff clean), a follow-up PR should add the module deprecation. These PRs should be merged at the same time, and within the same release.
I agree this is a good idea for git, but this sounds very annoying so I'd hope we can find a better solution. Could we teach lean / lake to either:
- Try
Foo.lean.deprecated
if it can't findFoo.lean
- Try
deprecated/Mathlib/Foo.lean
if it can't findMathlib/Foo.lean
Then we could create the files at the same time and wouldn't leave people in a weird middle ground where the change has landed but not the deprecation.
Mario Carneiro (Apr 14 2025 at 12:50):
the latter sounds easier than the former, we just need it to exist as an additional LEAN_SRC_DIR
Eric Wieser (Apr 14 2025 at 12:51):
I don't think source directories are allowed to overlap any more
Eric Wieser (Apr 14 2025 at 12:52):
That is, the algorithm for finding import Foo.Bar
is:
- Search the search path for
Foo
- Now look inside the first match for
Bar.lean
, fail if it can't be found
As opposed to
- Search the search path for
Foo/Bar.lean
Mario Carneiro (Apr 14 2025 at 12:56):
I don't think it searches for directories
Mario Carneiro (Apr 14 2025 at 12:56):
I think what you say is mostly true but I don't think that algorithm is correct
Mario Carneiro (Apr 14 2025 at 12:58):
def findWithExt (sp : SearchPath) (ext : String) (mod : Name) : IO (Option FilePath) := do
let pkg := mod.getRoot.toString (escape := false)
let root? ← sp.findM? fun p =>
(p / pkg).isDir <||> ((p / pkg).addExtension ext).pathExists
return root?.map (modToFilePath · mod ext)
Mario Carneiro (Apr 14 2025 at 12:59):
So you are mostly right but it's looking for Foo.lean
not Foo/
Eric Wieser (Apr 14 2025 at 12:59):
It looks for both, right?
Mario Carneiro (Apr 14 2025 at 13:00):
oh right I guess that's what it would do :man_facepalming:
Mario Carneiro (Apr 14 2025 at 13:01):
if a change is needed to this function then I think both options are equally hard
Mario Carneiro (Apr 14 2025 at 13:01):
(I would suggest Foo.deprecated.lean
instead of Foo.lean.deprecated
though so that editors and general purpose language identification tools don't freak out)
Maja Kądziołka (Apr 14 2025 at 13:33):
Eric Wieser said:
Then we could create the files at the same time and wouldn't leave people in a weird middle ground where the change has landed but not the deprecation.
The usual way of keeping diffs clean while not having any weird intermediate states would be to do it in a single PR, but not squash the PR.
Eric Wieser (Apr 14 2025 at 13:34):
I think the way we use bors
makes that difficult for mathlib
Kim Morrison (Apr 14 2025 at 13:35):
That has it's own difficulties, because of the pace of commits to master
and the relatively long CI time. If you decide to manually merge, you lose the guarantees bors
gives you.
Eric Wieser (Apr 14 2025 at 13:35):
Bors can probably be configured to use merge commits, but currently it's the norm to have PR's with ~20 "fix" commits, and we'd need a shift to requiring users to (partially) squash their history before merging
Kim Morrison (Apr 14 2025 at 13:36):
My preference is to just accept the suboptimal git history. It's pretty common that this happens anyway today during renames, and it would be unfortunate to stall here.
Eric Wieser (Apr 14 2025 at 13:36):
Do you mean "don't split the commit" or "deal with it being in two unsynchronized parts"?
Kim Morrison (Apr 14 2025 at 13:37):
Don't split the commit.
Damiano Testa (Apr 14 2025 at 13:41):
If we add a removed-file
label and make sure that bors
does not merge a PR with such a label, the PR could do the rename without the deprecation, have the label applied until the end and, only as a last step, create the deprecated file and remove the label.
Eric Wieser (Apr 14 2025 at 13:43):
That only helps with seeing the diff in review, not with seeing it in git blame
Damiano Testa (Apr 14 2025 at 13:43):
Btw, is anyone interested in merging #24032, which fixes my oversight of how the since
syntax works?
Eric Wieser (Apr 14 2025 at 13:44):
Another option is to create the .deprecated
files despite lean not supporting them and:
- Have CI rename them before doing a build, to check that they are valid
- Have a nightly bot that creates a PR removing the
.deprecated
suffix
Damiano Testa (Apr 14 2025 at 13:46):
If we do that, could the nightly bot also sync Mathlib.lean
and we stop maintaining it?
Michael Rothgang (Apr 14 2025 at 21:34):
Slightly different topic: https://github.com/leanprover-community/leanprover-community.github.io/pull/604 documents the mathlib deprecation policy. The PR looks good to me and matches my impression of the current policy --- but more eyes on this are welcome. Please comment on the PR (if you have style suggestions) or say here if you think something significant should change!
Last updated: May 02 2025 at 03:31 UTC