Zulip Chat Archive

Stream: mathlib4

Topic: Call for help: technical/ organisational debt


Michael Rothgang (Apr 15 2025 at 15:07):

@Yaël Dillies and I were brainstorming some ideas for "technical debt" in a broad sense: let's collect tasks where helping hands are necessary and useful. The focus is not on "mathematical topic X" (surely, many people have many ideas of their own), but on refactoring/clean-up tasks that require more time than very deep expertise. Below, I'll put some ideas to get the ball rolling.

Maintainers/stats nerds If you know how to quantify one of these, adding that to the tech debt report could be welcome!
Everybody: please feel free to propose new tasks here. (If a task is controversial, let's move discussion elsewhere - so this is really a handy list of things.) If you think an item is complete, feel free to react with :check: --- or send a private message to the person proposing the item, so they can check.
To volunteers: if you come by a task task that is e.g. six weeks old, it's a good idea to ask if that task is still up to date. (Sometimes, context changes.) Otherwise, go ahead - mathlib is built by many helping hands! If you're working on a PR addressing an item in this list, feel free to claim it by adding the :working_on_it: emoji.

Michael Rothgang (Apr 15 2025 at 15:08):

  1. remove erw: this is tracked in the technical debt metrics already (some guidelines)

Michael Rothgang (Apr 15 2025 at 15:10):

  1. triage old issues: updating the state of a tracking issue is useful information. If you think a PR resolved an issue, pinging the author with a friendly message can be useful.

Michael Rothgang (Apr 15 2025 at 15:11):

  1. helping out with PRs: the review dashboard has a section with PRs needing help. For PRs with just a merge conflict, please be tactful and keep in mind that most contributors are volunteers working in their free time - e.g., don't expect faster reactions than a week

Michael Rothgang (Apr 15 2025 at 15:13):

  1. organisational debt: documenting existing mathlib policies, that are not formally written down. The deprecation policy comes to mind (and has an open PR). If you see something else, feel free to propose a PR (and leave a comment here, as zulip is read more often than the webpage repository). This thread aims to collect some examples.

If you're not sure what the current policy on X is, starting a thread is welcome. Sometimes, there's no explicit consensus yet :-)

Yaël Dillies (Apr 15 2025 at 15:15):

  1. Document top folder directories and their main subdirectories with a README.md file

Yaël Dillies (Apr 15 2025 at 15:15):

  1. Sort/split basic algebra files into the folders Algebra.Group, Algebra.GroupWithZero, Algebra.Ring, Algebra.Field, and their Algebra.Order counterparts.

Michael Rothgang (Apr 15 2025 at 15:17):

  1. tooling consolidation: rewrite the lint-style.py linters in Lean or as syntax linters. Requires some knowledge of syntax linting, but can be fun. @Michael Rothgang can answer questions.

Michael Rothgang (Apr 15 2025 at 15:18):

  1. fixing TODOs in the source code: often, a number of these are doable. Sometimes, TODOS can be very hard. Usually a PR is welcome. (If a TODO is hard, talking to the author of the TODO helps.)

Michael Rothgang (Apr 15 2025 at 15:21):

Honorable mention: reviewing pull requests. Having all the best PRs sit unreviewed does not help either. Everybody is welcome to review PRs! You can review a PR any way you want --- these guidelines can give you some ideas how to.

Michael Rothgang (Apr 15 2025 at 21:09):

  1. An item under point 4 above: document the current policy about using "open (scoped) classical". Much of it is enforced by a linter; some aspects are still being discussed

Kim Morrison (Apr 16 2025 at 00:04):

  1. Improve David Renshaw's excellent tryAtEachStep tool, so it can be used to drive automatic tactic replacements. In particular see my issue https://github.com/dwrensha/tryAtEachStep/issues/12 allowing filtering by the change in heartbeats.

Kim Morrison (Apr 16 2025 at 00:05):

  1. Replace terminal simp only with simp or simp [X], when there is minimal build time cost.

Michael Rothgang (Apr 16 2025 at 06:58):

  1. An item under point 4 above: document the current spelling policy (British/American/other variants of English). See e.g. here; might need further waiting (or work to build consensus)

Michael Rothgang (Apr 16 2025 at 06:59):

  1. Programming/linter-related; doesn't require deep knowledge: Rewrite the directoryDependency linter to take an allowlist of directories, instead of blocklisting them. (Or perhaps use a hybrid approach.) I can give pointers on how.

Michael Rothgang (Apr 16 2025 at 07:00):

  1. Help with reorganising top-level folders --- some of these are not logical. See here for more details. Typically, this may need some discussion on zulip.

Ruben Van de Velde (Apr 16 2025 at 07:29):

Michael Rothgang said:

  1. Help with reorganising top-level folders --- some of these are not logical. See here for more details. Typically, this may need some discussion on zulip.

#23498 is an example

Michael Rothgang (Apr 16 2025 at 10:14):

  1. Make this list discoverable for new or existing contributors, as time goes by. More broadly, how would a contributor find such ideas of tasks? See here for a discussion.

Michael Rothgang (Apr 16 2025 at 12:52):

  1. (Due to Eric Wieser) #24099 is also looking for crowd help: it does not require expert knowledge, but some familiary with the respective part of the library.

Michael Rothgang (Apr 16 2025 at 20:31):

  1. (Also due to Bolton Bailey) Rename inconsistently named lemmas: this issue collects some known cases.

Michael Rothgang (Apr 18 2025 at 19:49):

  1. Another informal policy incoming (still under discussion): "file names should be UpperCamelCase" (except for specific exceptional cases). If the discussion converges, a PR documenting this is very welcome.

Michael Rothgang (Apr 18 2025 at 19:50):

  1. Adding to 19: a linter implementing this is also very welcome; I'm happy to mentor that.

Michael Rothgang (Apr 19 2025 at 15:58):

  1. Help categorising the list of all tactics in the mathlib manual: this comment has more details.

Michael Rothgang (Apr 19 2025 at 15:59):

  1. Create PRs to mathlib (and dependencies) marking alternative forms of tactics with @[tactic_alt original_tactic], cleaning-up docstrings as you do so, i.e. describing all tactic alternatives in the main tactic's docstring. This comment links to an example. This would also help the mathlib manual.

Yaël Dillies (Apr 20 2025 at 09:35):

  1. Replace _coe with coe_ in lemma names that talk about the coercion Subfoo α → Set α, if possible by reconfiguring simps (see initialize_simps_projections)

Yaël Dillies (Apr 20 2025 at 11:44):

  1. lowerCamelCase foreign names: Apply the #naming convention that a definition whose name is ambiguous without its namespace should have its namespace prepended back in lowerCamelCase when used outside its namespace. For example, docs#continuous_subtype_val should be continuous_subtypeVal or Subtype.continuous_val. Discussion: #mathlib4 > Task 24: lowerCamelCase foreign names

Yury G. Kudryashov (Apr 21 2025 at 22:15):

  1. A very specific tech debt: docs#differentiableAt_comp_sub_const and other differentiability lemmas in that file unnecessarily assume that the domain is the base field. They should be generalized to any domain/codomain and moved to FDeriv/Add.

Yaël Dillies (Apr 22 2025 at 08:45):

  1. Replace Type u by Type* wherever possible. Discussion: #mathlib4 > Task 26: Replace Type u by Type* wherever possible

Michael Rothgang (Apr 22 2025 at 08:55):

  1. Replace Type _ by Type* wherever possible. Discussion: #mathlib4 > Task 26: Replace Type u by Type* wherever possible

Yury G. Kudryashov (Apr 22 2025 at 15:12):

  1. Fix UpperCamelCase vs lowerCamelCase. E.g., docs#DihedralGroup.OddCommuteEquiv should be oddCommuteEquiv. Add a linter for this.

Michael Rothgang (Apr 22 2025 at 16:00):

  1. (Due to Eric Wieser in the mathlib-reviewers stream) Write a bot which detects "unintentionally empty" commit messages. It would detect commit messages like
---

A nice explanation of the change

(i.e., the part before the --- being only whitespace, the part below the line containing something "substantial" --- i.e. not just the commit message template or "depends on" information) and could say something like

Warning: your commit message is entirely below the --- and so will be discarded by bors. Anything that could be useful to future readers must go above the ---

Eric Wieser (Apr 22 2025 at 16:02):

(with an exception for "depends on", which currently we ask to be below that line)

Yaël Dillies (Apr 22 2025 at 18:38):

  1. Make the nolints update bot (or another bot) automatically remove > 6 months old deprecations every week or month. Context: https://github.com/leanprover-community/mathlib4/pull/24271#pullrequestreview-2785016957

Yury G. Kudryashov (Apr 22 2025 at 22:10):

  1. Make the shake step post github suggestions so that the PR author can accept them instead of running something like git checkout mybranch; killall lean; lake exe cache get; lake exe shake --fix; git commit -am 'Shake'; git push ...

Yury G. Kudryashov (Apr 22 2025 at 22:29):

  1. Better support for by_cases (and may be some cases - not sure) in the "unused *" linters. If I run by_cases h : _, but one of the branches doesn't use h, then h is unused. For rcases, see src#iteratedDerivWithin_const_add (going to be fixed in #24227): the 2nd branch doesn't use hxs, but this hxs` isn't marked as unused.

Kim Morrison (Apr 23 2025 at 22:56):

Yury G. Kudryashov said:

  1. Make the shake step post github suggestions so that the PR author can accept them instead of running something like git checkout mybranch; killall lean; lake exe cache get; lake exe shake --fix; git commit -am 'Shake'; git push ...

I might prefer a bot that just pushes the commit... (after checking it still builds).

Kevin Buzzard (Apr 24 2025 at 20:10):

How about
33. unhelpful docstrings. There are many docstrings in mathlib of the form

/-- A foo -/
def foo : ...

For example I was just reviewing a PR and a theorem had the assumption [HasFilteredColimitsOfSize.{w, w'}] and I thought "oh I wonder what the precise definition of that is and exactly what roles the universes play" so I hovered over HasFilteredColimitsOfSize and got the docstring "Class for having all filtered colimits of a given size.". An issue here is that docstrings are often written by experts, who have to write them because of CI but know perfectly well what the definition means so might write something unhelpful like this. In my mind, at the very least the docstring for this declaration should explain the roles of the two universe variables and in my opinion there's no reason why the docstring shouldn't also say what a filtered colimit is, given that it's not hard to imagine that someone looking at the docstring might actually be looking because they don't know.

This is the sort of thing where I just organically find examples. There was a recent thread on the Xena Discord about this, examples posted there were DenseRange, MeasurableSet and IsOpen (although in these cases it's not immediately clear how to make them better), and then there were a bunch of algebraic constructions e.g. FreeRing,FreeCommRing which used to have this property but they were all fixed by Aaron Lui in #23609 as a consequence of the conversaion.

Edward van de Meent (Apr 24 2025 at 20:12):

another fun one i stumbled across today is that the docstrings for docs#T0Space and docs#T1Space seem to suggest these notions are equivalent, unless you read very very carefully

Yury G. Kudryashov (Apr 25 2025 at 00:08):

Another example: docs#FractionRing

Bolton Bailey (Apr 25 2025 at 00:08):

Kim Morrison said:

Yury G. Kudryashov said:

  1. Make the shake step post github suggestions so that the PR author can accept them instead of running something like git checkout mybranch; killall lean; lake exe cache get; lake exe shake --fix; git commit -am 'Shake'; git push ...

I might prefer a bot that just pushes the commit... (after checking it still builds).

I would also prefer this, since indeed, shake fix frequently breaks the build. For me, this has often greatly expanded the amount of time it takes to split files.

Michael Rothgang (Apr 28 2025 at 13:27):

  1. (from the core maintainers)
    Fix some list names in core: rename Sublist to IsSublist, rename lemmas to use isPrefix and isSublist and scope some of the list notation. This comment has the details.

Michael Rothgang (Apr 28 2025 at 13:28):

  1. Informal policy under discussion: how to name lemmas involving \notin: discussion at #mathlib4 > Naming: nmem vs not_mem @ 💬

Michael Rothgang (May 01 2025 at 08:57):

  1. Make the apply tactic with a config argument (docs#Mathlib.Tactic.applyWith) support modern config syntax, e.g. apply +allowSynthFailures instead of apply (config := { allowSynthFailures := true }). (The Lean 4.14.0 release notes describe this a bit.)

Michael Rothgang (May 01 2025 at 08:58):

  1. Switch mathlib tactic calls to use modern configuration syntax. See #24469 for an example.
    Caution: certain tactics don't support that syntax yet (including apply with a config, and perhaps others). Changing those, akin to task 36, is also welcome!

Last updated: May 02 2025 at 03:31 UTC