Zulip Chat Archive

Stream: general

Topic: New policy regarding mathlib3 PRs that touch ported files


Scott Morrison (Apr 09 2023 at 21:21):

We are still struggling with having too many files in the state "already ported to mathlib4, but changes have since been made in mathlib3, which have not yet been forward ported". The number of these has been trending downwards, but only slowly. Too often files in this state are preventing us from moving forward with porting new files (including files on the critical path).

After some discussion amongst the mathlib maintainers, we have consensus for a throttling mechanism to control this issue.

https://leanprover-community.github.io/mathlib-port-status/out-of-sync shows the current list of files in this state. (That is, it tracks our "forward-porting debt".

PRs to mathlib3 that modify already ported files are still welcome, but with the following two provisos:

  1. They must be labelled as modified-synchronized-file. (Anyone may add this if they see it is missing, please!)
  2. Maintainers will not be merging such PRs unless one the following holds:

    • they are directly useful for the mathlib4 port (in which case they will be expedited!)
    • the author has no previously merged PRs which still appear on #outofsync
    • #outofsync has at most 20 items on it

For context, #outofsync currently has 55 items, although this could be reduced quite a bit by reviewing and merging existing forward porting PRs. We may crank the limit down from 20 to 10 later.

This change is not intended to blame the authors of PRs to mathlib3 --- they are making useful contributions, and very often the PRs that end up causing problems are ones that have been around for a long time, predating the period where we knew about this issue. However too often forward porting debt is blocking forward progress on the port, and reviewing and managing the #outofsync queue is absorbing too much time, and this new approach will hopefully remediate this. We simply need a better process to avoid these blockages.

Apologies to authors for whom this is going to cause delays getting their mathlib3 PRs in. We hope that you'll adapt okay! :-) It would be lovely if authors who are affected by this would redirect some of their efforts into either dealing with the #outofsync queue (whether their own PRs or others), or just doing direct porting work.

@maintainers, if you see mathlib3 PRs that are in this state, please add the label blocked-by-out-of-sync-queue, and for now at least it is probably helpful to post a link to this comment for explanation. Once the #outofsync queue drops below the thresholds again, we can go back to looking at these PRs.

Scott Morrison (Apr 09 2023 at 21:21):

Feel free to discuss here, of course --- while the maintainers are mostly happy with the formulation I've given above, we can certainly change it and adapt.

Eric Wieser (Apr 09 2023 at 21:43):

(regarding 1, as a reviewer remember that it may not be the author's fault that the label is missing; the file might have only recently been ported, long after the PR was made!)

Notification Bot (Apr 09 2023 at 22:03):

5 messages were moved from this topic to #general > CI for modifies-synchronized-file label by Eric Wieser.

Yaël Dillies (Apr 10 2023 at 07:48):

Well, I guess I'm out...

Scott Morrison (Apr 10 2023 at 09:11):

Can we just get down to a smaller queue and then proceed? There's no reason we need have lesser throughout on mathlib3. It's the latency that is killing us, and resulting in forwarding porting getting in the way of the port.

Eric Wieser (Apr 10 2023 at 09:30):

I'm broadly in favor of this protocol, though I don't think it would help much with the latency issue with linear_algebra.dimension; we ran into trouble porting that file because:

  • We panicked when we saw a single lemma it needed was missing. A manual forward-port of a two-line lemma seems fine to do while porting a new file, similar changes to other files are not uncommon to need when porting.
  • The lemma needed had only made it into mathlib3 a day or two before. Mathport + bors can easily add a 27 hour delay between a change making it to mathlib and getting to mathport
  • We deliberately decided to refactor the file while it was above the tide, and the tide caught up while a change was still going though mathport.
  • Many people (at least in the UK) are away on long weekends right now due to Easter Bank holidays

Yaël Dillies (Apr 10 2023 at 10:04):

Personally I open matching mathlib4 PRs the morning after the mathlib PR was merged. I cannot do any quicker, mathport runs overnight.

Eric Wieser (Apr 10 2023 at 10:20):

While that's exemplary, its not sufficient to make #outofsync empty; the bottleneck is reviewer time for forward-porting PRs, and the purpose of Scott's proposal is to prevent an increasing blockage behind that bottleneck

Patrick Massot (Apr 10 2023 at 10:20):

Yaël, you can do even better: stop modifying mathlib3.

Scott Morrison (Apr 10 2023 at 11:31):

At present it discouraging to see this queue of out of sync stuff. It's so unrewarding to review (because it feels like treading water rather than making progress), that when I see the out of sync queue is growing my gut feeling is to avoid it... Of course this doesn't help! :-) I really feel that if the queue size was much lower, we could avoid much of the pain. Everything that sits on the out of sync queue is a hazard waiting to trip the porting process.

Ruben Van de Velde (Apr 10 2023 at 11:35):

Maybe not quite everything :innocent: - !4#3369 and !4#3370 port comment-only changes

Moritz Doll (Apr 10 2023 at 11:56):

another trivial sync PR: #3363

Ruben Van de Velde (Apr 10 2023 at 11:57):

(That's !4#3363)

Scott Morrison (Apr 10 2023 at 11:57):

Presumably !4#3363. (At some point we should switch out the linkifiers, make people type !3#...)

Reid Barton (Apr 10 2023 at 12:00):

can it just match 4 digits versus 5?

Reid Barton (Apr 10 2023 at 12:00):

not sure how historical messages are treated when linkifiers change

Johan Commelin (Apr 10 2023 at 12:00):

they don't change what they link to, unless the message is edited.

Moritz Doll (Apr 10 2023 at 12:05):

Reid Barton said:

can it just match 4 digits versus 5?

We've had mathlib3 PRs with 4 digits recently.

Eric Wieser (Apr 10 2023 at 12:07):

We could have #NNNN go to a disambiguation page going forward, in the unlikely event anyone can be bothered to build something for that

Pedro Sánchez Terraf (Apr 10 2023 at 15:31):

Scott Morrison said:

We are still struggling with having too many files in the state "already ported to mathlib4, but changes have Apologies to authors for whom this is going to cause delays getting their mathlib3 PRs in. We hope that you'll adapt okay! :-) It would be lovely if authors who are affected by this would redirect some of their efforts into either dealing with the #outofsync queue (whether their own PRs or others), or just doing direct porting work.

My (only) PR #17972 fell victim of the rising tide. When I noticed that one file was submerged, I tried to move the few added lines to one of the fresh files, but then two more files were hit by the wave.

What should I do? A new file containing those lines? Or could I just "buy reviewer time" by porting some files? I'm not acquainted with Lean4 yet, but starting with some porting work seems feasible (from what I saw in the---outdated?---video.).

Jireh Loreaux (Apr 10 2023 at 15:39):

If you the video you are referring to is the one Scott made in November or December, then yes, it is very out of date and we have some nice workflows now. The best place to get started is probably the #port-guide (this doesn't linkify to quite the right place: you want https://github.com/leanprover-community/mathlib4/wiki/Mathlib3-Synchronisation for forward porting)

Johan Commelin (Apr 10 2023 at 16:24):

I fixed the linkifier. #port-guide should now point in the right direction.

Matthew Ballard (Apr 10 2023 at 16:26):

Is it easy to run mathport on a mathlib branch?

Eric Wieser (Apr 10 2023 at 16:29):

Pedro Sánchez Terraf said:

What should I do? A new file containing those lines? Or could I just "buy reviewer time" by porting some files? I'm not acquainted with Lean4 yet, but starting with some porting work seems feasible (from what I saw in the---outdated?---video.).

I think what you're currently doing in the PR looks fine; but porting a file or two first to get familiar with Lean4 before attempting a forward-port sound like a good plan

Eric Wieser (Apr 10 2023 at 16:30):

Matthew Ballard said:

Is it easy to run mathport on a mathlib branch?

No, because it is very slow. What's the motivation?

Matthew Ballard (Apr 10 2023 at 16:31):

Move existing PRs to Lean 4

Matthew Ballard (Apr 10 2023 at 16:33):

Is it easy for an individual to run mathport? I haven't really paid attention but the sense I got from eg #FLT regular was not yet

Reid Barton (Apr 10 2023 at 16:44):

I wouldn't describe anything related to mathport as "easy". It is possible. I assume it should also be possible to configure it for projects with mathlib as a dependency, but I never tried.

Eric Wieser (Apr 10 2023 at 16:44):

Running mathport on a downstream project is probably straightforward but not a well-trodden path

Eric Wieser (Apr 10 2023 at 16:44):

I would suggest we don't attempt to run it on PRs until we're close to being done

Matthew Ballard (Apr 10 2023 at 16:54):

I guess my main point is that if a migration path is provided for ML3 PRs it would decrease pressure on that side. Even if the author needed to run mathport locally, as long as the process was not "too complicated".

Matthew Ballard (Apr 10 2023 at 16:56):

(Please, treat these as comments from the peanut gallery.)

Reid Barton (Apr 10 2023 at 17:01):

If the PRs are not of the form "add a new file" then it's also just an intrinsically much harder problem.

Eric Wieser (Apr 10 2023 at 17:04):

I have some ideas on how to handle that

Eric Wieser (Apr 10 2023 at 17:05):

The first step is certainly "merge the most recent mathlib3 into the PR"

Yaël Dillies (Apr 10 2023 at 17:05):

I personally got a bunch of PRs to be merged to reduce the number of PRs that will need porting.

Yaël Dillies (Apr 10 2023 at 17:06):

I also told Johan a while back that I was ready to merge master into the ~480 open PRs and bump them.

Matthew Ballard (Apr 10 2023 at 17:08):

Isn't mathport purely a syntax translator or does it do other things?

Johan Commelin (Apr 10 2023 at 17:09):

That's the short summary, yes. But it's a pretty complex beast.

Johan Commelin (Apr 10 2023 at 17:09):

"You need to be Lean 3 to parse Lean 3" -- Mario Carneiro

Matthew Ballard (Apr 10 2023 at 17:10):

Scott Morrison said:

Lean 3 is dead, long live Lean 4. :-)

Uh oh :lol:

Henrik Böving (Apr 10 2023 at 17:32):

Johan Commelin said:

"You need to be Lean 3 to parse Lean 3" -- Mario Carneiro

That's based on "only perl(1) can parse perl" we are not the first ones

Patrick Massot (Apr 10 2023 at 19:26):

I'm pretty sure perl is younger than TeX, and definitely TeX is the only software that can parse TeX.

Henrik Böving (Apr 10 2023 at 19:29):

They are surprisingly close, 1978 vs 1987

Pedro Sánchez Terraf (Apr 10 2023 at 19:34):

Matthew Ballard said:

Move existing PRs to Lean 4

Was this directed to me? I believe I can do it (when I learn Lean4!)

Matthew Ballard (Apr 10 2023 at 19:38):

No it wasn't, sorry. It was asking about a general protocol for doing so.

Ruben Van de Velde (Apr 10 2023 at 19:49):

Henrik Böving said:

They are surprisingly close, 1978 vs 1987

Just one transposition needed

David Loeffler (Apr 10 2023 at 20:56):

I think this out-of-sync list is making some unfair accusations, because it seems to pick up any mathlib3 PR modifying a file for which a porting PR exists, whether or not the mathlib4 port PR has actually been merged.

So the list currently includes algebra.order.to_interval_mod, naming and shaming my PR #18427 and @Michael Stoll 's PR #17483 for editing this file. But this file hasn't actually been ported; there is only a long-dead abandoned attempt (!4#2148), which has been marked as "merge conflict – work in progress – help wanted" since 2 months ago. Is it really the intent of the new policy that Michael and I will be banned from making any new mathlib3 contributions until we (or someone else) take over that long-abandoned porting job?

I think that if the out-of-sync list is to be used as a basis for retributive action, then the script that generates it needs to be modified to ignore in-progress / stale port PR's, and only consider port PRs that have already been merged into master.

Eric Wieser (Apr 10 2023 at 21:03):

In progress PRs are already marked yellow on #outofsync

Eric Wieser (Apr 10 2023 at 21:04):

But I agree that we probably should be counting them separately

Eric Wieser (Apr 10 2023 at 21:05):

I think the script modifications needed are minor; what's important is modifying the policy at the top of this thread to be explicit about whether they should be counted.

Notification Bot (Apr 11 2023 at 07:40):

11 messages were moved from this topic to #mathlib4 > !4#2148 by Eric Wieser.

Reid Barton (Apr 11 2023 at 07:56):

Really mathlib4 porting PRs should not be open for two months in the first place--or even one month.

Eric Wieser (Apr 11 2023 at 08:00):

If people port low-priority files and get stuck, then long-lived porting PRs feels inevitable

Reid Barton (Apr 11 2023 at 08:00):

It would be better to close the PR and start again in say two months' time with the new mathport output

Eric Wieser (Apr 11 2023 at 08:03):

I'm not sure I agree; it might be easier just to rebase the old commits on the new mathport output

Shreyas Srinivas (Apr 11 2023 at 08:10):

Reid Barton said:

Really mathlib4 porting PRs should not be open for two months in the first place--or even one month.

I have a PR of a non-important file stuck in the review queue for two months : https://github.com/leanprover-community/mathlib4/pull/2179
It is complete. All it needs is a review.

Shreyas Srinivas (Apr 11 2023 at 08:10):

Maybe there are others like this. Starting them from scratch might not be optimal

Reid Barton (Apr 11 2023 at 08:13):

Ports waiting for a long time for review is also not something that should happen but I agree that closing them is not the correct action in that case.

Reid Barton (Apr 11 2023 at 08:34):

Eric Wieser said:

I'm not sure I agree; it might be easier just to rebase the old commits on the new mathport output

That's a fine way to make the new PR (and doesn't require the old one to be open). Please don't merge the mathport changes in though since it's a headache for reviewers.

Eric Wieser (Apr 11 2023 at 09:01):

I worry that if the old one is not open then people will port from scratch without looking for it and repeat the work

Ruben Van de Velde (Apr 11 2023 at 21:22):

Seems like we dropped from 55 out-of-sync files to 17 (including !4#3389) in 48 hours. Maybe the limit could be even lower :)

Eric Wieser (Apr 11 2023 at 21:48):

@Yaël Dillies had a single PR touching 21 files that hadn't been reviewed because lots of people were away for Easter

Jeremy Tan (Apr 12 2023 at 03:48):

In that respect, !4#3248 and !4#3278 need review; they occupy most of the remaining slots on that list

Scott Morrison (Apr 12 2023 at 12:16):

Down to 11. :happy:

Jeremy Tan (Apr 12 2023 at 13:58):

Down to 10

image.png

Let's keep it at 10 or less

Eric Wieser (Apr 12 2023 at 19:11):

I worry we've been rushing things here; we should not be trying to forward-port things before the mathport output is ready, because:

  • You will likely miss something (both @Jeremy Tan and I missed #aligns in forward-port PRs we made)
  • The reviewers will likely not be able to tell that you missed something

Eric Wieser (Apr 12 2023 at 19:11):

I worry we've been rushing things here; unless very urgent we should not be trying to forward-port things before the mathport output is ready, because:

  • You will likely miss something (both @Jeremy Tan and I missed #aligns in forward-port PRs we made)
  • The reviewers will likely not be able to tell that you missed something

Jeremy Tan (Apr 13 2023 at 02:34):

Instead we need to increase the frequency of mathport runs to 6 hours or less

Jeremy Tan (Apr 13 2023 at 02:36):

It currently takes about 3 hours, so 6 hours should be a reasonable interval


Last updated: Dec 20 2023 at 11:08 UTC