Zulip Chat Archive

Stream: mathlib4

Topic: Deprecation policy


Michael Rothgang (May 02 2024 at 13:18):

In the past weeks, the question of a deprecation policy for mathlib has come up a few times. I think it would be helpful to make a formal decision, to simplify guidance for future PR authors. Let me suggest a first draft, based on what I took away from discussions so far.

Michael Rothgang (May 02 2024 at 13:19):

Mathlib deprecation policy

This policy applies to all items in mathlib tagged with the @deprecated attribute. This can be aliasses, but also definitions, theorems and lemmas.

Deprecating items. When deprecating a new item, please add the current date (to allow removing it in the future).
Since Lean 4.8, the @deprecated attribute allows including a date in the since field (as a string of the form 2024-04-01).
Some existing items have no deprecation date; PRs adding them are welcome.

Please deprecate when renaming. When renaming a definition or theorem, please create an alias for the old name and mark it with deprecated. This step can be skipped when the declaration is very new (say, less than one week).
There is no need to create aliasses for renaming instances.
For automatically generated declarations (e.g. by to_additive or simps), please ensure the generated declaration is also tagged with this attribute.

Removing deprecated items.
Deprecated items can be removed from mathlib after <time> after their initial deprecation date. PRs doing so are welcome.

Michael Rothgang (May 02 2024 at 13:19):

I would appreciate comments on e.g the following questions:

  • Any general comments about this? Missing aspects?
  • Time-frame for skipping a deprecation.
  • Are there other cases of generated declarations which could be deprecated? I've only seen this with to_additive (added: and simps) so far.
  • Time-frame when items can be removed is up for discussion. (So far, I gather that "a year" is considered plenty and "six months" considered sufficient by some, but at least "three months" seems to be desired. But that is probably not a representative sample/will have overheard some!)

If/once there's broad consensus, we could e.g. have a poll on the last question (which is perhaps the most controversial).

Michael Rothgang (May 02 2024 at 13:20):

Currently in mathlib,

  • some declarations generated by to_additive are missing this attribute (update: it seems there is only one offender, fixed in #12597),
  • not all deprecated declarations are tagged with a date (my WIP branch branch#MR-deprecation-dates-all makes some progress on this; some PRs for this are open for review)
  • deprecation dates are using doc comments; I presume PRs converting this to the since field are welcome

Michael Rothgang (May 02 2024 at 13:23):

My personal opinion (but I'm not a maintainer, so no need to weigh it highly): I think at least four months could be useful. In practice, deprecation dates are usually the date when the PR is filed, and merging it can take a while (especially with big refactoring PRs). A four months policy ensures the deprecation will be on mathlib for at least three months.

Johan Commelin (May 02 2024 at 13:44):

Quick note, I think simps also generates candidates

Ruben Van de Velde (May 02 2024 at 14:24):

Did we end up recommending that people update mathlib on the lean release cadence? If so, we could use x of those as a cutoff

Mario Carneiro (May 02 2024 at 16:06):

I would say 1 year max and 1 stable release minimum, and for things in the middle we try to keep them if it's not an undue amount of work

Julian Berman (May 02 2024 at 16:10):

(I have nothing to add, but this is great to see discussed/given attention to!)

Mario Carneiro (May 02 2024 at 16:15):

Michael Rothgang said:

My personal opinion (but I'm not a maintainer, so no need to weigh it highly): I think at least four months could be useful. In practice, deprecation dates are usually the date when the PR is filed, and merging it can take a while (especially with big refactoring PRs). A four months policy ensures the deprecation will be on mathlib for at least three months.

I think deprecation dates should reflect when they actually landed in mathlib, not when the deprecation was proposed. We may be able to do this more effectively by having PRs not put a date and then having a bot automatically add dates to unadorned deprecations directly on master

Yaël Dillies (May 02 2024 at 16:16):

Stupid consideration: What about the 100 chars line limit?

Mario Carneiro (May 02 2024 at 16:16):

try to leave enough space?

Mario Carneiro (May 02 2024 at 16:17):

the bot would probably fail and ping someone to help

Floris van Doorn (May 02 2024 at 17:39):

I would be fine not mandating deprecations for lemmas generated by simps:

  • They are generally not typed manually (though they could appear in simp only calls
  • It's very hard to have a lemma generated by simps in your code without the original definition, and the simps lemma should change very predictably. (assuming the only change is renaming the def.)

Floris van Doorn (May 02 2024 at 17:44):

Our policy should probably say something about renaming X to Y and in the same PR renaming W to X. E.g.:

We allow but discourage contributors to rename declaration X to Y and at the same time renaming W to X. In this case, no deprecation attribute is required for X (but it is for W).

But maybe we should have a better solution for this case?
E.g. a line in the docstring of (the new) X of the form "On <date> X was renamed to Y."

Johan Commelin (May 02 2024 at 19:48):

Random thought: instead of recording the "date" when a decl was deprecated, we record the PR number... Tooling can figure out the date on which a PR landed. Much less hassle for users.

Yaël Dillies (May 02 2024 at 19:49):

That means tying the deprecation notice to a git repo, which we might not want to do sine the deprecation system is supposed to be used across the Lean ecosystem

Mario Carneiro (May 02 2024 at 19:56):

Lean is intrinsically connected to both git and github in several places. I don't think this is an issue

Mario Carneiro (May 02 2024 at 19:57):

In any case this is a mathlib-local decision. You can put (since := "#12345") if you want and lean won't care

Patrick Massot (May 02 2024 at 20:18):

About deprecating stuff: I’m bumping MIL and I just met the effect of a line of Mathlib:

@[deprecated] alias op_norm_le_of_shell := opNorm_le_of_shell

Patrick Massot (May 02 2024 at 20:18):

with the associated code action. It’s really great!

Patrick Massot (May 02 2024 at 20:19):

Of course we can still dream of a command line tool doing all those renames for us. But this is already infinitely nicer than getting an error and searching the changelog website (and I don’t even mention the situation before the changelog).

Violeta Hernández (Aug 19 2024 at 02:07):

Is there a possibility to skip the deletion time period for aliases to lemmas that already see little to no use? Or at least make it shorter, e.g. 1 month.

Violeta Hernández (Aug 19 2024 at 02:14):

I find it's generally the untested stuff that's inconvenient or suboptimal

Kim Morrison (Aug 19 2024 at 02:43):

At an absolute minimum, every @[deprecated] attribute should survive at least to the next time we move Mathlib to a new stable version of Lean and update the stable tag.

Yury G. Kudryashov (Aug 19 2024 at 05:16):

Why is it hard to keep an alias for a longer time? Are you making 2 refactors one after another?

Kim Morrison (Aug 19 2024 at 07:13):

If it's just "cluttering your file", then please be patient and leave things in place for several months. You don't need to remember to do the cleanup yourself --- eventually we'll get this semiautomated.

Kim Morrison (Aug 19 2024 at 07:13):

If you want to reduce imports, it's fine to move deprecations to another file.

Kim Morrison (Aug 19 2024 at 07:14):

So generally I think the only reason to do really fast deprecations is when you need to reuse an existing deprecated name.

Damiano Testa (Aug 19 2024 at 07:20):

A few people use Mathlib to teach courses on Lean and these are, very often, run once a year.

For this kind of schedule, a "deprecation period" of at least 9 months, and preferably at least one year seems very useful.

Damiano Testa (Aug 19 2024 at 07:20):

For instance, the repository that I used for the module that I taught last October is on a Mathlib version from October (or at the very latest November). In a couple of weeks, I will start looking at updating it for the new academic year and if I have deprecations in place for the last year to help with the upgrade, I will be very happy!

Violeta Hernández (Aug 19 2024 at 07:25):

Alright then. I guess there isn't really any hurry.

Violeta Hernández (Aug 19 2024 at 07:26):

I guess I hadn't considered people outside of Mathlib might be using my code

Yaël Dillies (Aug 19 2024 at 07:35):

I would however strongly encourage doc-gen to let us filter out deprecated lemmas. It's currently very hard to read the docs#MeasureTheory.eLpNorm API because (precisely) half the lemmas are deprecated

Ruben Van de Velde (Aug 19 2024 at 07:57):

Deprecation warnings are primarily for code outside mathlib - you're already fixing the code inside mathlib anyway

Shreyas Srinivas (Aug 19 2024 at 08:50):

Yaël Dillies said:

I would however strongly encourage doc-gen to let us filter out deprecated lemmas. It's currently very hard to read the docs#MeasureTheory.eLpNorm API because (precisely) half the lemmas are deprecated

Disagree with this. Non mathlib savvy users need to know a definition exists but has been deprecated. A notice on the definition is better than the definition disappearing altogether. There are several ways a user could end up at a definition including neural net search tools that are not retrained on every stable release

Ruben Van de Velde (Aug 19 2024 at 08:53):

They should be flagged more clearly, though

Shreyas Srinivas (Aug 19 2024 at 08:54):

Ruben Van de Velde said:

They should be flagged more clearly, though

they could be sorted and colour coded, both on the main content div and the side panel listing. Maybe even in the search results

Ruben Van de Velde (Aug 19 2024 at 08:54):

And maybe we could add a checkbox in the UI to hide them, for people who prefer that

Shreyas Srinivas (Aug 19 2024 at 08:55):

sure. My main suggestion is : As long as a definition exists and can be found by search engines, it must show up in the docs and redirect users to the replacement.

Yaël Dillies (Aug 19 2024 at 09:05):

Shreyas Srinivas said:

Disagree with this. Non mathlib savvy users need to know a definition exists but has been deprecated.

I'm talking about "letting us filter" not "filtering for us"

Daniel Weber (Aug 19 2024 at 09:41):

Yaël Dillies said:

I would however strongly encourage doc-gen to let us filter out deprecated lemmas. It's currently very hard to read the docs#MeasureTheory.eLpNorm API because (precisely) half the lemmas are deprecated

As a quick solution you can use this userscript:

// ==UserScript==
// @name         Lean Docs Remove Deprecated
// @namespace    http://tampermonkey.net/
// @version      2024-08-19
// @description  try to take over the world!
// @author       You
// @match        https://leanprover-community.github.io/mathlib4_docs/**
// @icon         https://www.google.com/s2/favicons?sz=64&domain=github.io
// @grant        none
// ==/UserScript==

(function() {
    'use strict';
    document.querySelectorAll('.decl').forEach(x => x.querySelector('.attributes')?.innerText?.includes('deprecated') && x.remove());
})();

Kim Morrison (Aug 19 2024 at 11:16):

I think it's also reasonable for files that are badly cluttered with deprecations to simply move all the deprecations into a separate file (even under Deprecated/). They will still show up in the docs, but quarantined away from all the "live" stuff.

Yaël Dillies (Aug 19 2024 at 11:20):

That makes a lot of sense actually. Even just the bottom of the file would be better than "inline"

Shreyas Srinivas (Aug 19 2024 at 11:44):

Kim Morrison said:

I think it's also reasonable for files that are badly cluttered with deprecations to simply move all the deprecations into a separate file (even under Deprecated/). They will still show up in the docs, but quarantined away from all the "live" stuff.

how will they be linked to by moogle?

Eric Wieser (Aug 19 2024 at 11:45):

Kim Morrison said:

I think it's also reasonable for files that are badly cluttered with deprecations to simply move all the deprecations into a separate file (even under Deprecated/). They will still show up in the docs, but quarantined away from all the "live" stuff.

This has the effect of causing downstream code (or rebased code in mathlib branches) that doesn't import Mathlib to report "name foo not found" instead of "foo is deprecated".

Artie Khovanov (Aug 27 2024 at 16:50):

Is there automatic removal of deprecated names?
Also, what is the policy on wholesale refactoring that goes beyond renaming / editing the proof? Must the old version be kept?

Yaël Dillies (Aug 27 2024 at 16:53):

No. ?_. No.

Yaël Dillies (Jan 07 2025 at 13:59):

In #20541 I am moving a bunch of lemmas that were renamed in July. To avoid cluttering the diff, I deleted the deprecated aliases at the same time. Are people happy with this?

Kevin Buzzard (Jan 07 2025 at 14:03):

Won't that have the opposite effect, cluttering the output of the github action declaration diff and furthermore making the blocks of deleted and included code even more different?

Eric Wieser (Jan 07 2025 at 14:24):

I think it would be better to delete the deprecated aliases in their own pr

Eric Wieser (Jan 07 2025 at 14:24):

I think the policy says that's fine for you to do as long as the PR deprecating them was actually merged and not just opened in July

Johan Commelin (Jan 07 2025 at 16:55):

Which is of course very annoying to check for every decl. So we need better tooling there, but I'm not exactly sure how it should work.

Johan Commelin (Jan 07 2025 at 16:55):

Maybe it can use git blame?

Eric Wieser (Jan 07 2025 at 17:11):

I think it should work by looking at the environment in a previous release

Eric Wieser (Jan 07 2025 at 17:11):

Arguably it's also fine to delete lemmas without a deprecation if they never made it into a release

Ruben Van de Velde (Jan 07 2025 at 17:25):

I disagree on that - that just makes life harder for people who update more frequently if they end up using anything new. We should be deprecating much more consistently, not less

Jireh Loreaux (Jan 07 2025 at 17:33):

I think for deleting deprecated aliases we should go solely off the since date. It's up to the maintainer doing the merging in the PR introducing them to say: "you added these deprecations 2 months ago, that's not recent enough, please update the date: bors d+"

Eric Wieser (Jan 07 2025 at 17:41):

Maybe we should have deprecated (pending_since := "2025-01-07"), and a nightly job that replaces all such markers with since := "current date"?

Eric Wieser (Jan 07 2025 at 17:44):

Or even just deprecated (since := pending-date) in the PR, which avoids the need for any human thought about dates at all

Eric Wieser (Jan 07 2025 at 17:44):

(pending-date is the same length as "YYYY-MM-DD" so avoids any line wrapping issues")

Eric Wieser (Jan 07 2025 at 17:45):

The other complexity that this introduces is that the nightly job has to run before a release is cut

Jireh Loreaux (Jan 07 2025 at 17:46):

(if we can avoid PRs rotting for months, then it doesn't matter anyway :laughing: I realize this isn't quite a fix as sometimes PRs can exist for a long time naturally.)

Michael Rothgang (Jan 07 2025 at 17:47):

Is there a special label for renaming PRs? Should there be one? (Should such PRs be prioritised?)

Jireh Loreaux (Jan 07 2025 at 17:49):

What about a GitHub bot which scans a PR before merging (just with grep, nothing fancy) and flags if the deprecations are older than, say, 2 weeks? I imagine this would fix 90+% of the issues, even if it weren't perfect. It would avoid the need for a maintainer to think, and it would prevent pending-date cruft from appearing on master.

Jireh Loreaux (Jan 07 2025 at 17:50):

Michael Rothgang said:

Is there a special label for renaming PRs? Should there be one? (Should such PRs be prioritised?)

The problem is that renaming happens outside of "renaming PRs" (i.e., outside of PRs which solely perform renames).

Jireh Loreaux (Jan 07 2025 at 17:52):

And I don't think we should be in a rush to merge renaming PRs unless there is already consensus, as this can lead to hasty renames which cause further headaches. In cases where there is consensus and the PR touches many files, it often gets posted in the speedy review thread anyway.

Edward van de Meent (Jan 07 2025 at 17:58):

i think using pending-date is a feasable pattern... manually inserting a date pre-merging only makes it harder to automatically update the deprecations, since you'd need to go through the diff to decide if it was or wasn't there previously.

Edward van de Meent (Jan 07 2025 at 18:03):

as for what triggers replacement, i feel like it should be possible to have bors do this replacement while pushing to staging, but i'm not familiar enough with bors to be the judge of that. if it's not happening with bors, we could have a CI trigger from a comment of lock deprecations on the PR...

Edward van de Meent (Jan 07 2025 at 18:04):

we might then even implement unlock deprecations which looks for a commit by the bot with appropriate header and reverts it!

Michael Rothgang (Apr 14 2025 at 09:22):

For the record: this thread seems to have converged, and there's a de facto deprecation policy now. https://github.com/leanprover-community/leanprover-community.github.io/pull/604 is documenting that (awaiting-author to incorporate some details; I'll ping this thread when I'd maintainer-merge this).

Michael Rothgang (Apr 14 2025 at 09:22):

I'd like to propose an addition to this deprecation policy: when a Lean module is removed, a module deprecation should be added. This will notify downstream projects (or mathlib branches) of files moving. Specifics:

  • separate PRs: one PR should rename or remove the file (to keep git's diff clean), a follow-up PR should add the module deprecation. These PRs should be merged at the same time, and within the same release.
  • specifics: a new deprecated module contains a copyright header and a deprecated_module command, no more
  • deprecated_modules keep the same removal schedule as for normal deprecations, i.e. six months after the PR is merged.

The last point is up for discussion.

See the other thread #mathlib4 > `Deprecated` folder for the prior discussion.
If there is agreement on this, I can PR this change to the webpage.

Edward van de Meent (Apr 14 2025 at 10:01):

Michael Rothgang said:

  • specifics: a new deprecated module contains a copyright header and a deprecated_module command, no more

does this mean we do not allow imports which make sure we don't fully pull the rug from under users?

Kim Morrison (Apr 14 2025 at 10:15):

I thought the deprecated_module file must have imports, because they are used to generate the replacement suggestion.

Kim Morrison (Apr 14 2025 at 10:16):

Michael Rothgang said:

  • deprecated_modules keep the same removal schedule as for normal deprecations, i.e. six months after the PR is merged.

And with the same exception as for deprecated declarations: "and earlier is fine if there is a good reason, but please try to make sure it survives at least one month".

Eric Wieser (Apr 14 2025 at 10:18):

Where "one month" really means "is alive at one tagged release"?

Damiano Testa (Apr 14 2025 at 10:19):

I confirm that deprecated_module uses the imports of the current file to suggest which modules to deprecate to. So, the structure of a "deprecated module" is

copyright
imports
deprecated_module (since "yyy-mm-dd")

except that the header linter will not enforce it.

Yaël Dillies (Apr 14 2025 at 10:19):

Why would there be a copyright in a deprecated module?

Damiano Testa (Apr 14 2025 at 10:20):

I'm not sure about this, I was just going for "minimal changes wrt the above"!

Damiano Testa (Apr 14 2025 at 10:21):

I would be happy with

imports

deprecated_module (since "yyy-mm-dd")

and I also would be ok with allowing something more (e.g. a module doc explaining the reason for the file existing, e.g. for doc-gen).

Damiano Testa (Apr 14 2025 at 10:23):

For instance, if removing a module is a long term refactor for some reason, the deprecated_module command may get there early, but the file might still contain declarations that are planned to be removed? I am not sure.

Yaël Dillies (Apr 14 2025 at 10:24):

Is the syntax really deprecated_module (since "yyy-mm-dd") and not deprecated_module (since := "yyyy-mm-dd")?

Damiano Testa (Apr 14 2025 at 10:25):

The syntax is what is written above, without :=. Did I mess up copying the @[deprecated] syntax?

Damiano Testa (Apr 14 2025 at 10:26):

Yes, it looks like it. Let me amend this.

Yaël Dillies (Apr 14 2025 at 10:26):

I'm afraid so :sweat_smile: The syntax is @[deprecated decl "message" (since := "yyyy-mm-dd")]

Damiano Testa (Apr 14 2025 at 10:43):

Ok, #24032 makes the syntax line up, but it may be that Michael will add my PR to his #24028.

Damiano Testa (Apr 14 2025 at 10:46):

(Note that my PR is a superset of Michael's, but his came before Yaël noticed the missing := syntax from deprecated_module.)

Michael Rothgang (Apr 14 2025 at 11:09):

Thanks for the comments so far. I will include those in a PR with the policy (assuming no further arguments against doing so).

Eric Wieser (Apr 14 2025 at 12:05):

Michael Rothgang said:

  • separate PRs: one PR should rename or remove the file (to keep git's diff clean), a follow-up PR should add the module deprecation. These PRs should be merged at the same time, and within the same release.

I agree this is a good idea for git, but this sounds very annoying so I'd hope we can find a better solution. Could we teach lean / lake to either:

  • Try Foo.lean.deprecated if it can't find Foo.lean
  • Try deprecated/Mathlib/Foo.lean if it can't find Mathlib/Foo.lean

Then we could create the files at the same time and wouldn't leave people in a weird middle ground where the change has landed but not the deprecation.

Mario Carneiro (Apr 14 2025 at 12:50):

the latter sounds easier than the former, we just need it to exist as an additional LEAN_SRC_DIR

Eric Wieser (Apr 14 2025 at 12:51):

I don't think source directories are allowed to overlap any more

Eric Wieser (Apr 14 2025 at 12:52):

That is, the algorithm for finding import Foo.Bar is:

  • Search the search path for Foo
  • Now look inside the first match for Bar.lean, fail if it can't be found

As opposed to

  • Search the search path for Foo/Bar.lean

Mario Carneiro (Apr 14 2025 at 12:56):

I don't think it searches for directories

Mario Carneiro (Apr 14 2025 at 12:56):

I think what you say is mostly true but I don't think that algorithm is correct

Mario Carneiro (Apr 14 2025 at 12:58):

def findWithExt (sp : SearchPath) (ext : String) (mod : Name) : IO (Option FilePath) := do
  let pkg := mod.getRoot.toString (escape := false)
  let root?  sp.findM? fun p =>
    (p / pkg).isDir <||> ((p / pkg).addExtension ext).pathExists
  return root?.map (modToFilePath · mod ext)

Mario Carneiro (Apr 14 2025 at 12:59):

So you are mostly right but it's looking for Foo.lean not Foo/

Eric Wieser (Apr 14 2025 at 12:59):

It looks for both, right?

Mario Carneiro (Apr 14 2025 at 13:00):

oh right I guess that's what it would do :man_facepalming:

Mario Carneiro (Apr 14 2025 at 13:01):

if a change is needed to this function then I think both options are equally hard

Mario Carneiro (Apr 14 2025 at 13:01):

(I would suggest Foo.deprecated.lean instead of Foo.lean.deprecated though so that editors and general purpose language identification tools don't freak out)

Maja Kądziołka (Apr 14 2025 at 13:33):

Eric Wieser said:

Then we could create the files at the same time and wouldn't leave people in a weird middle ground where the change has landed but not the deprecation.

The usual way of keeping diffs clean while not having any weird intermediate states would be to do it in a single PR, but not squash the PR.

Eric Wieser (Apr 14 2025 at 13:34):

I think the way we use bors makes that difficult for mathlib

Kim Morrison (Apr 14 2025 at 13:35):

That has it's own difficulties, because of the pace of commits to master and the relatively long CI time. If you decide to manually merge, you lose the guarantees bors gives you.

Eric Wieser (Apr 14 2025 at 13:35):

Bors can probably be configured to use merge commits, but currently it's the norm to have PR's with ~20 "fix" commits, and we'd need a shift to requiring users to (partially) squash their history before merging

Kim Morrison (Apr 14 2025 at 13:36):

My preference is to just accept the suboptimal git history. It's pretty common that this happens anyway today during renames, and it would be unfortunate to stall here.

Eric Wieser (Apr 14 2025 at 13:36):

Do you mean "don't split the commit" or "deal with it being in two unsynchronized parts"?

Kim Morrison (Apr 14 2025 at 13:37):

Don't split the commit.

Damiano Testa (Apr 14 2025 at 13:41):

If we add a removed-file label and make sure that bors does not merge a PR with such a label, the PR could do the rename without the deprecation, have the label applied until the end and, only as a last step, create the deprecated file and remove the label.

Eric Wieser (Apr 14 2025 at 13:43):

That only helps with seeing the diff in review, not with seeing it in git blame

Damiano Testa (Apr 14 2025 at 13:43):

Btw, is anyone interested in merging #24032, which fixes my oversight of how the since syntax works?

Eric Wieser (Apr 14 2025 at 13:44):

Another option is to create the .deprecated files despite lean not supporting them and:

  • Have CI rename them before doing a build, to check that they are valid
  • Have a nightly bot that creates a PR removing the .deprecated suffix

Damiano Testa (Apr 14 2025 at 13:46):

If we do that, could the nightly bot also sync Mathlib.lean and we stop maintaining it?

Michael Rothgang (Apr 14 2025 at 21:34):

Slightly different topic: https://github.com/leanprover-community/leanprover-community.github.io/pull/604 documents the mathlib deprecation policy. The PR looks good to me and matches my impression of the current policy --- but more eyes on this are welcome. Please comment on the PR (if you have style suggestions) or say here if you think something significant should change!


Last updated: May 02 2025 at 03:31 UTC