Zulip Chat Archive

Stream: general

Topic: mathlib organization


Johan Commelin (Sep 09 2022 at 07:15):

Dear mathlib community (and more generally the leanprover community),

For some time we've had some difficulties handling the mathlib pull-requests #queue, and a while ago we posted a brief message to let you know that we were working on a long-term solution. After a lot of back-and-forths, considerations, and reconsiderations, we have come up with some changes to the review process and a proposal for reorganising some of the structures in our community. The first effect has already been noticed: we've started using topical labels to get a better grip on the PR #queue. We are generally happy with the reorganisation design, but we do not want to rush the implementation. So for now we are sharing it to let you know what is coming, and solicit help for the points that require technological work.

The points that will be visible to everybody but are not yet implemented are:

  • It will become clearer who is handling each pull-request, and hopefully no pull-request will remain handled by nobody for extended periods of time.
  • Many people will be trusted to take a quasi-final decision to merge a PR by making recommendations to the maintainers team. After such recommendations, maintainer will simply check that there is no large scale cohesivity issues (either technological or mathematical) without another detailed small scale review.
  • Many discussions that used to be private among maintainers will move to public streams.

The long explanations below contain a lot of details about how we expect to achieve this.

Johan Commelin (Sep 09 2022 at 07:16):

Community organization

Currently, mathlib development is overseen by 25 maintainers from the community, who each individually have a wide range of responsibilities and expectations: account administration, continuous integration maintenance, Zulip moderation, the website and blog, tool development, and, of course, mathlib PR review and maintenance of the repository. De facto, they also serve as representatives of the community and discuss political issues and funding opportunities. We do not have any fine-grained way to delegate subsets of these responsibilities to individuals, which makes adding maintainers an unnecessarily weighty decision. Another consequence of this monolithic team organization is that we have a single private maintainers stream on Zulip where all this is discussed. This implies that many things that could be discussed publicly end up private.

To this end, we devised a reorganization of the community structure. The design of the reorganization, to a large extent, is to make implicit structure more explicit, to more clearly define how responsibilities and expectations are assigned to which people. This will allow us to add many more members of the community to the team that oversees mathlib development. We believe that this will allow us to scale while simultaneously keeping up our high standards for the design and implementation of mathlib. We leave the option open to have a more sophisticated organizational structure in the future as the need arises.

All groups involved will have discussions in public channels as much as possible. We will also create private streams as needed for the different groups, to be used only when necessary.

Admin team

The admin team oversees the activities of the Lean community organization and in particular the mathlib project. It is responsible for ensuring the mathlib project continues to reflect the general views and interests of the Lean/mathlib community.

The admin team is a central hub for coordination among other teams. They are responsible for procedural duties: when a decision needs to be made, what should the process be? How should votes be taken, if it's a voting matter?
In particular it is responsible for resolving disagreements among teams by finding a way to reach consensus. The admin team is tasked with finding procedures to resolve conflicts when consensus cannot be reached.

The admin team is the "first point of contact" for the community as a whole. This does not mean that they are in charge of all communication, but rather, can forward messages as needed, put the right people in touch, etc.

Admin maintainers have the admin privileges for the GitHub organization, and they are the "default" keyholders and actors for things that are not the responsibility of other teams.

The admin team will not necessarily be drawn purely from the current maintainers. In particular, we might want people from Microsoft Research, or Lean 4 programming-oriented community members, on the team.
The admin team will have between five and eight members. Five of the initial members will be current maintainers, determined by a vote among the current maintainers; the newly elected members will appoint up to three additional members of the admin team.

The members of all teams will choose an admin team annually by an election, in a manner organized by the current admin team using a system that gives all members a vote of equal weight. Members of the admin team may be re-elected any number of times with no limit.

mathlib team

The mathlib team is responsible for the mathlib GitHub repository. This team is divided in two subteams: mathlib maintainers and mathlib reviewers.

We expect that most current active maintainers will remain mathlib maintainers, and we will appoint a first group of reviewers. At the beginning there will be more mathlib maintainers than reviewers. We expect the reviewer team to grow faster over time.

mathlib maintainer team

The mathlib maintainers are given the right to merge pull requests (via the bors GitHub bot as we already do). Mathlib maintainers will usually have research-level knowledge in some field of mathematics or computer science, and they are given the responsibility to ensure mathlib has a cohesive design with a high-quality implementation. One should think of mathlib maintainers as being akin to journal editors.

Community members become maintainers at the invitation of the current maintainers, and anyone may nominate a community member by sending a message to a current maintainer. Maintainers serve for a renewable two year term, with the expectation that the term will be renewed unless the maintainer has become inactive or wishes to step down.

mathlib reviewer team

The mathlib maintainers should appoint members of the wider community to be reviewers. Reviewers, like mathlib maintainers, should have demonstrated that they can submit PRs that do not require significant revision and that they can leave constructive reviews on others' PRs. We will have a system where a reviewer can submit a PR to the mathlib maintainers that a PR is ready for mathlib (like maintainer merge rather than bors merge), and then a mathlib maintainer will do a quick check and put it onto the bors queue if it's the case. A maintainer merge should be as final as a bors merge since mathlib maintainers appoint reviewers they trust, but mathlib maintainers are responsible for what enters mathlib. A maintainer may choose to maintainer merge instead of bors merge if they want another maintainer to take a last look.

Other teams

There are many tasks that go into supporting the Lean user community outside of mathlib development. The admin team has freedom to decide how these tasks will be supported, but these are some possibilities:

  • Code of conduct team. This team already exists with a well-defined purpose: handle incident reports and maintain community standards. See the community web page.

  • Continuous integration team. In charge of Azure, bors administration, continuous integration virtual machines root access, GitHub actions on various projects of the Lean user community.

  • Zulip team. The Zulip maintainers have the admin privileges for the Zulip and are responsible for whatever random tasks require an administrator. The Zulip maintainers are expected to appoint additional Zulip moderators, who can move/edit posts, create streams, and watch for unacceptable behavior.

  • Website/blog team. In charge of the public internet face of the Lean user community.

  • Accounts team. Holds the passwords to the community Gmail, YouTube, Azure, etc. This information can be distributed to other teams as needed.

  • Supporting tools teams. There are a bunch of projects in the leanprover-community organization that support mathlib: doc-gen, mathlibtools, Lean 3.xc, VSCode extension. None of these are big enough for a complicated org structure, but they should have explicit maintainers with merge access/PR approval rights.

Johan Commelin (Sep 09 2022 at 07:16):

Proposal for the PR triage, review, merge pipeline

The question of how to handle the lifecycle of a PR is not entirely orthogonal to the reorganization above, but can be considered separately.

Area labels and groups

We will create a broad categorization of topics that PRs can fall under (for instance, analysis, topology, tactics, etc.): call these "areas." Each area will have an associated GitHub label, GitHub user group, and Zulip channel. Mathlib maintainers and reviewers will self-sort into these groups, which creates a loose association of sub-teams by area. They may "subscribe" or "unsubscribe" to areas as often as they wish. We expect many maintainers and reviewers will subscribe to multiple areas.

We will find a way, using GitHub CODEOWNERS, a custom action, manually applying labels, or something else, of making sure that every PR gets associated to at least one area. Anyone having experience in doing such things on GitHub is welcome to join the effort.

Reviewing

As is now, anyone is allowed to review any PR. We had a long discussion about the failing points of our current process and ways to improve it. Here are some thoughts:

  • Right now, there's a feeling that if you comment a few times on a PR, you've "claimed" it. Claiming a PR should be an explicit, opt-in process. We propose that reviewers and maintainers use self-assignments more systematically. If a maintainer wants to be the person with final say, they should assign the PR to themselves. A maintainer does not need to be a member of the area-subteam that the PR is associated with to self-assign a PR. But, the associated area-subteam is responsible for making sure that someone takes responsibility; if no maintainer has self-assigned after a certain period of time, that group should discuss to see who will take it on.

  • The triage bot can be improved in many ways, to better understand the queue, tag assignees as well as PR authors, etc. Anyone with technical expertise to make that happen is welcome to help.

  • There are different "kinds" of reviews: checking for mathematical correctness, generality, style, type class properties, ... We will create a list of items that a PR should be checked for before being merged. If someone checks a PR for style but not content, this should be explicitly indicated. This could be a checkbox in a bot post, or some other mechanism TBD. Again, help is welcome.

Merging

Mathlib reviewers and maintainers have the autonomy to maintainer merge or bors merge as they see fit, but they should discuss anything controversal with the rest of the relevant area sub-team, the rest of the mathlib maintainers, or the wider community. A responsibility of the maintainers in general is to, as needed, help the mathlib team come to consensus on difficult decisions.

Anatole Dedecker (Sep 09 2022 at 08:00):

(maybe this deserves to be in #announce ?)

Filippo A. E. Nuccio (Sep 09 2022 at 16:45):

@maintainers thank you all for the efforts put in the discussions and in imagining this new structure. Personally, I find it very well conceived and I think it will help mathlib a great deal.

Johan Commelin (Sep 30 2022 at 06:06):

I'm proud and happy to announce that the following community members have accepted the invitation to become mathlib reviewers:

  • Aaron Anderson
  • Alex J. Best
  • Thomas Browning
  • Anatole Dedecker
  • Yaël Dillies
  • Moritz Doll
  • Jireh Loreaux
  • Joseph Myers
  • Andrew Yang

Filippo A. E. Nuccio (Oct 05 2022 at 20:26):

I have one question about the #queue command. I see that it ranks PR's according to the date of updating, via sort:update-asc. Now, this has the effect that a PR on which people have been working for, say, two months and are constantly updating suggestions is far in the queue, because the "last update" can date back just some days. Wouldn't it an option to rank them via sort:opened-asc? A clever use of the awaiting-author tag ensures at any rate that it will not be in the list as long as there is still work to do.

Floris van Doorn (Oct 06 2022 at 08:11):

When reviewing, I personally sort the queue using sort:created-asc for that reason.

Johan Commelin (Oct 11 2022 at 06:59):

Two more reviewers have joined the team: @Jason KY. and @Junyan Xu !


Last updated: Dec 20 2023 at 11:08 UTC