Zulip Chat Archive

Stream: mathlib4

Topic: some useful automation


Scott Morrison (Oct 25 2022 at 01:04):

There are a few pieces of CI automation that would help smooth the process for people porting files from mathlib3 to mathlib4.

  1. when someone creates the PR mathlib4#NNN which ports file XXX, automatically update the XXX line in #port-status to say No: PR'd as mathlib4#NNN
  2. when the PR mathlib4#NNN which ports file XXX is merged, update the XXX line in #port-status to say Yes mathlib4#NNN (checking that it doesn't already say Yes mathlib4#MMM AAAAAA).
  3. when the mathlib4 PR which ports file XXX is merged, automatically create the mathlib3 PR which adds the SYNCHRONIZED label to the module-doc

and something which I think can't / shouldn't be automated

  1. after all those steps, a human needs to open mathlib3 master (say at commit AAAAAA, and mathlib4 master, and verify that everything actually made it across, then update #port-status to say Yes mathlib4#MMM AAAAAA, to record that someone has checked that the file was fully ported as of that commit.

and finally, for automation

  1. a bot, that at some interval sends a reminder to zulip about any files in mathlib3 which have been modified after their recorded verification commit (so a human can either verify that nothing really changed, and update the verification hash, or make the corresponding changes in mathlib4)

Scott Morrison (Oct 25 2022 at 01:04):

Some discussion points:

  1. Perhaps #port-status should be something more structured than a wiki page!
  2. Do we need labels for mathlib4 PRs that say they are a porting PR?
  3. How does CI look at a mathlib4 PR and work out which (multiple?) files it is claiming to completely port.
  4. Do we need labels for mathlib3 PRs that are just adding synchronization comments? (We have mathlib4-synchronization.)

Scott Morrison (Oct 25 2022 at 01:06):

The only automation we currently have is the very ad-hoc scripts/port-status.py in mathlib3, which as well as telling you what you can currently work on, emits warnings like:

# The following files are marked as ported, but do not have a SYNCHRONIZED WITH MATHLIB4 label.
logic.basic
logic.is_empty
data.prod.pprod
data.fin.fin2

# The following files are marked as ported, but have not been verified against a commit hash from mathlib.
logic.basic
logic.is_empty
data.prod.pprod
data.fin.fin2
algebra.abs
algebra.covariant_and_contravariant
algebra.group.basic
algebra.group.defs

# The following files have been modified since the commit at which they were verified.
logic.relator

Scott Morrison (Oct 25 2022 at 01:06):

@Sarah Smith, notifying you as you wanted to be part of this discussion.

Yakov Pechersky (Oct 25 2022 at 01:12):

How about a mathlib3 branch that is just one file, on which we track commits?

Scott Morrison (Oct 25 2022 at 01:19):

I mean, I'm not sure a text file in a git branch is really that much different from a wiki page. In particular you can get the wiki via

git clone git@github.com:leanprover-community/mathlib.wiki.git

Yakov Pechersky (Oct 25 2022 at 01:21):

Ah, great

Mario Carneiro (Oct 25 2022 at 05:49):

Step 4 can be automated if the bot creates a link to the github changes page from the initial mathlib commit to master as of the PR

Mario Carneiro (Oct 26 2022 at 04:08):

I have another feature request for port-status: files that have been started / have incomplete work in mathlib4 should also be colored in some way. We could either track that manually or make mathport produce a file listing as part of CI (since it knows the naming convention and can tell whether a Mathbin file it translates has a matching file in Mathlib).

Yakov Pechersky (Oct 26 2022 at 04:09):

Does this status indicate presence of a Mathlib4 file? Mathbin file?

Mario Carneiro (Oct 26 2022 at 04:52):

A mathlib4 file. A mathbin file will exist for everything in mathlib

Yakov Pechersky (Oct 26 2022 at 04:59):

What's the function we use that does the snake_case to CamelCase? The usual one?

Mario Carneiro (Oct 26 2022 at 05:03):

https://github.com/leanprover-community/mathport/blob/master/Mathport/Util/String.lean#L8-L21

Yakov Pechersky (Oct 26 2022 at 05:05):

The fact you linked to a lean4 file reinforces the thought I had that this parser/tracker should be written in lean4 =)

Yakov Pechersky (Oct 26 2022 at 05:06):

Files that exist in my mathlib4 dir:
image.png

Yakov Pechersky (Oct 26 2022 at 05:06):

Like so?

Yakov Pechersky (Oct 26 2022 at 05:06):

Existing files have a red outline

Yakov Pechersky (Oct 26 2022 at 05:17):

Available from https://github.com/pechersky/mathlib-tools/commit/6e4e07b75e19afd01664ac342200c4e7f625dc66

Johan Commelin (Oct 27 2022 at 18:22):

Do we have a bot that labels mathlib3 PRs with touches-frozen-files if they touch frozen files?

Johan Commelin (Oct 27 2022 at 18:23):

That would immensely help reviewing PRs, I think.

Heather Macbeth (Oct 30 2022 at 22:44):

I wanted to check that I am using correctly the import graph tool for porting. I had the impression that there was supposed to be an orange colour on the graph for files whose status is "no" but with a further comment:
https://github.com/leanprover-community/mathlib-tools/blob/3886f16c04492c768ed821a2d2fc9f583493d5d9/mathlibtools/file_status.py#L50
So the following three files ought to be orange given the current #port-status :

When I tried the import graph porting tool just now I got this, which has those files in turquoise. Did I do something wrong?

Scott Morrison (Oct 30 2022 at 22:54):

My reading of the intention is that

  • anything with No and also WIP or PR should be lightskyblue
  • anything with No and some nonempty comment that does not contain WIP or PR should be orange.

However according to that logic.function.basic should currently be orange. (Because we forgot to write No: PR mathlib4#511 there, rather than just No: mathlib4#511...) @Yakov Pechersky.

Obviously this behaviour is too fragile, and we need something better eventually.

Heather Macbeth (Oct 30 2022 at 22:59):

I'm actually quite confused. I'd have thought that these two should be the same colour:

but on my graph they are different colours. (Both are blue-ish but I'm not sure which one is lightskyblue.)

Yakov Pechersky (Oct 30 2022 at 23:04):

Can we pick some colors that are better differentiated? Also, if a file is ready for porting (all reqs ported), does that supersede, or does the PR supersede in coloring?

Yakov Pechersky (Oct 30 2022 at 23:05):

I can also add that the PR nodes link directly to the PR

Heather Macbeth (Oct 30 2022 at 23:05):

Yes, I'd just come to the same hypothesis. I'm guessing that currently a file being tagged as "ready" supersedes its designation as "WIP", "PR" or "No-for-complicated-reason". And it should (probably?) be the other way around.

Yakov Pechersky (Oct 30 2022 at 23:07):

I also was in a middle of refactor, then yak-shaved myself in trying to write this in lean4, which yak-shaved into writing a yaml parser, which yak-shaved into trying to understand WF recursion termination in lean4...

Yakov Pechersky (Oct 30 2022 at 23:10):

Anyway, to make ready defer to PR, one just has to add two lines here: https://github.com/leanprover-community/mathlib-tools/blob/67f1bda7637d17a9f765ecade305d3f9ffc93e7c/mathlibtools/lib.py#L1047-L1050
Specifically:

if not target_node.get("status"):
  continue

Yakov Pechersky (Oct 30 2022 at 23:15):

For those that want to add links in PDF, the suggested data is

node["href"] = url_here
node["target"] = "_top"  # or "_blank", depending on the new-tab semantics you want
node["tooltip"] = ...

But right now, the data structure for FileStatus doesn't really support it.
My refactor in https://github.com/pechersky/mathlib-tools/tree/pechersky/schematized-port likely makes it easier

Mario Carneiro (Oct 31 2022 at 04:11):

Yakov Pechersky said:

I also was in a middle of refactor, then yak-shaved myself in trying to write this in lean4, which yak-shaved into writing a yaml parser, which yak-shaved into trying to understand WF recursion termination in lean4...

Just some advice: skip the termination proof, just use partial and get on with life. Removing partial should usually be treated as a v2 thing

Patrick Massot (Oct 31 2022 at 10:23):

Don't hesitate to open PRs to leanproject for anything that helps porting. Releasing new versions of leanproject is very cheap.

Patrick Massot (Oct 31 2022 at 10:25):

On the Lean side, would it be easy to have more checks that we are doing things correctly? Can we have a Lean 4 program that checks that a mathlib4 PR indeed ports everything from the Lean3 side in the correct order (modulo autogenerated declarartions) and indeed uses #align whenever mathport had another name? We could have a command flagging declarations that have been omitted and providing an explanation.

Scott Morrison (Nov 12 2022 at 06:21):

Oops, I accidentally did mathlib-tools#133, which made the "ready" colour defer to human-specified colours (e.g. for "PR"), although this was separately done in the larger mathlib-tools#132.

Scott Morrison (Nov 12 2022 at 06:23):

Alternatives:

Scott Morrison (Nov 12 2022 at 06:24):

I've also just made mathlib-tools#134, which corrects a previous problem with the --exclude-tactics flag, that files that were only imported by tactics files were also being dropped. Instead I just add all the necessary edges "bridging over" the tactics files before deleting them. To make the graphs slightly saner, I've decided to leave in tactic.core and tactic.basic, even if you ask for --exclude-tactics.

Scott Morrison (Nov 12 2022 at 07:22):

I've also just made mathlib-tools#135, which enables:

% leanproject port-progress
Total files in mathlib:           2723
Longest import path in mathlib:   148
Unported files in mathlib:        2693
Longest unported path in mathlib: 138
['logic.equiv.defs', 'order.synonym', 'order.compare', 'order.monotone', 'order.lattice', 'order.bounded_order', 'algebra.order.monoid.defs', 'algebra.order.sub.defs', 'algebra.order.group.defs', 'algebra.order.ring.defs', 'algebra.order.ring.canonical', 'data.nat.order.basic', 'data.nat.order.lemmas', 'algebra.group_power.ring', 'algebra.group_power.order', 'data.nat.pow', 'algebra.group_power.lemmas', 'algebra.group_power.default', 'data.list.big_operators', 'data.list.count', 'data.list.lattice', 'data.list.nodup', 'data.list.dedup', 'data.list.perm', 'data.multiset.basic', 'algebra.big_operators.multiset', 'data.multiset.bind', 'data.multiset.powerset', 'data.multiset.nodup', 'data.multiset.dedup', 'data.multiset.finset_ops', 'data.finset.basic', 'data.finset.card', 'data.finset.option', 'data.finset.lattice', 'data.finset.powerset', 'data.fintype.basic', 'data.finset.sort', 'data.set.finite', 'order.conditionally_complete_lattice', 'data.nat.lattice', 'order.order_iso_nat', 'order.well_founded_set', 'group_theory.submonoid.pointwise', 'group_theory.subgroup.basic', 'ring_theory.subring.basic', 'algebra.group_ring_action', 'algebra.ring.aut', 'algebra.star.basic', 'algebra.module.linear_map', 'algebra.module.equiv', 'algebra.module.submodule.basic', 'algebra.module.submodule.lattice', 'linear_algebra.basic', 'linear_algebra.span', 'algebra.algebra.basic', 'linear_algebra.prod', 'linear_algebra.linear_independent', 'linear_algebra.basis', 'linear_algebra.bilinear_map', 'algebra.module.submodule.bilinear', 'linear_algebra.tensor_product', 'algebra.algebra.bilinear', 'algebra.algebra.operations', 'ring_theory.ideal.operations', 'algebra.algebra.subalgebra.basic', 'algebra.algebra.tower', 'ring_theory.adjoin.basic', 'data.mv_polynomial.basic', 'data.mv_polynomial.rename', 'data.mv_polynomial.monad', 'data.mv_polynomial.variables', 'data.mv_polynomial.comm_ring', 'ring_theory.polynomial.basic', 'ring_theory.adjoin.fg', 'ring_theory.algebra_tower', 'ring_theory.finiteness', 'field_theory.finiteness', 'linear_algebra.finite_dimensional', 'linear_algebra.matrix.finite_dimensional', 'linear_algebra.matrix.to_lin', 'linear_algebra.matrix.basis', 'linear_algebra.determinant', 'topology.algebra.module.basic', 'analysis.normed.field.basic', 'analysis.normed_space.basic', 'analysis.normed.order.basic', 'analysis.locally_convex.basic', 'analysis.locally_convex.balanced_core_hull', 'topology.algebra.module.finite_dimension', 'analysis.convex.topology', 'topology.algebra.module.locally_convex', 'analysis.seminorm', 'analysis.locally_convex.bounded', 'topology.algebra.uniform_convergence', 'topology.algebra.module.strong_topology', 'analysis.normed_space.operator_norm', 'analysis.normed_space.multilinear', 'analysis.normed_space.bounded_linear_maps', 'analysis.calculus.fderiv', 'analysis.calculus.deriv', 'analysis.calculus.local_extr', 'analysis.calculus.mean_value', 'analysis.calculus.cont_diff', 'analysis.calculus.inverse', 'analysis.special_functions.exp_deriv', 'analysis.special_functions.complex.log_deriv', 'analysis.special_functions.pow_deriv', 'analysis.convex.specific_functions', 'analysis.special_functions.trigonometric.complex', 'analysis.special_functions.trigonometric.arctan', 'measure_theory.function.special_functions', 'measure_theory.integral.mean_inequalities', 'measure_theory.function.simple_func_dense', 'measure_theory.function.strongly_measurable', 'measure_theory.function.ae_eq_fun', 'measure_theory.function.lp_space', 'measure_theory.function.lp_order', 'measure_theory.function.l1_space', 'measure_theory.function.simple_func_dense_lp', 'measure_theory.integral.set_to_l1', 'measure_theory.integral.bochner', 'measure_theory.integral.set_integral', 'measure_theory.function.ae_eq_of_integral', 'measure_theory.measure.with_density_vector_measure', 'measure_theory.decomposition.lebesgue', 'measure_theory.decomposition.radon_nikodym', 'measure_theory.function.conditional_expectation.basic', 'measure_theory.function.conditional_expectation.indicator', 'measure_theory.function.conditional_expectation.real', 'probability.process.filtration', 'probability.process.adapted', 'probability.process.stopping', 'probability.process.hitting_time', 'probability.martingale.basic', 'probability.martingale.upcrossing', 'probability.martingale.convergence', 'probability.martingale.borel_cantelli', 'probability.borel_cantelli']

Scott Morrison (Nov 12 2022 at 07:23):

Perhaps without any information about what the longest unported path is, I'll propose a bot that posts regularly in the CI stream with this information. (In fact, there's already a test post.)

Scott Morrison (Nov 12 2022 at 07:24):

What other information would be useful or interesting?

Scott Morrison (Nov 12 2022 at 07:24):

Perhaps it's even nice to present these numbers also as percentages (i.e. X% of mathlib has been ported and The longest unported chain is Y% as long as the long chain in mathlib.)

Scott Morrison (Nov 12 2022 at 07:25):

It could even calculate estimated completion dates counting from some arbitrary starting point a few weeks ago. :-) Everyone loves a linearly extrapolated progress meter.

Jon Eugster (Nov 12 2022 at 11:24):

Scott Morrison said:

Oops, I accidentally did mathlib-tools#133, which made the "ready" colour defer to human-specified colours (e.g. for "PR"), although this was separately done in the larger mathlib-tools#132.

Yes, merge mathlib-tools#133 first. I messed up and included more commits in mathlib-tools#132 than I intended, it was meant to be equivalent to mathlib-tools#133 :upside_down:

Patrick Massot (Nov 12 2022 at 11:37):

There are MacOS issues in the tests that are unrelated to this work. Can some MacOS user have a look at https://github.com/leanprover-community/mathlib-tools/actions/runs/3449749724/jobs/5757954364#step:3:83?

Patrick Massot (Nov 12 2022 at 11:45):

I merged mathlib-tools#133 and mathlib-tools#134 anyway. The MacOS error is probably purely a conflict between CI config and changes on MacOS.

Patrick Massot (Nov 12 2022 at 12:47):

@Scott Morrison I rebased mathlib-tools#135 and added a commit counting lines in the relevant files. The current output (eliding the long import chain) is:

Total files in mathlib:            2677
Longest import chain in mathlib:   156
Ported files in mathlib:           30 (1.1% of total)
Ported lines in mathlib:           9091 (1.0% of total)
Longest unported chain in mathlib: 147
['logic.equiv.defs', ...]

Patrick Massot (Nov 12 2022 at 12:49):

If this is good enough for you then I can merge a release on pypi.

Patrick Massot (Nov 12 2022 at 12:50):

The line count is about the same files that the other statistics, and there are raw line numbers (including comments and blank lines). This is a rough measure anyway. Note that it seems we reached the 1% milestone!

Scott Morrison (Nov 12 2022 at 21:38):

Looks good to me.

Patrick Massot (Nov 12 2022 at 23:06):

Merged and released.

Patrick Massot (Nov 14 2022 at 23:54):

Patrick Massot said:

There are MacOS issues in the tests that are unrelated to this work. Can some MacOS user have a look at https://github.com/leanprover-community/mathlib-tools/actions/runs/3449749724/jobs/5757954364#step:3:83?

Let me mention again that it would be really useful to have some MacOS user investigating this. It's really annoying to be unable to quickly check whether tests are still passing in this period where lots of leanproject PRs are made to support the port. Alternatively I could simply disable MacOS tests in CI but that could seem a bit hostile towards Mac users.

Eric Wieser (Nov 15 2022 at 00:44):

https://github.com/actions/runner-images/issues/6459#issuecomment-1310535069 seems to be the same thing

Arien Malec (Nov 15 2022 at 00:51):

I’m on a Mac - haven’t seen a failure. What’s the test needed?

Scott Morrison (Nov 15 2022 at 01:01):

@Arien Malec, this is the build failure Patrick is concerned about: https://github.com/leanprover-community/mathlib-tools/actions/runs/3466186474/jobs/5789737532

Jireh Loreaux (Nov 15 2022 at 01:09):

I have an old (2011) Mac running MacOS 10.13, but I don't do any Lean on it. Point being, Patrick, if you need me to fire it up and build and install, but I would prefer if someone else (e.g., Arlen) do it, especially because my OS isn't the latest anyway so the results may not match current user experience.

Scott Morrison (Nov 15 2022 at 01:12):

I think leanproject is working fine for everyone on a mac. The problem is that the CI task that checks things work on a mac is failing (and has been for a while!) so we don't have an active guarantee of this.

Scott Morrison (Nov 15 2022 at 01:12):

That is, the task is to work out why it fails in CI, rather than on an end-users computer.

Jireh Loreaux (Nov 15 2022 at 01:25):

Ah I see.

Scott Morrison (Nov 15 2022 at 01:45):

I tried a few things (editing the scripts/install_macos.sh script and opening a PR), but couldn't work it out.

Scott Morrison (Nov 15 2022 at 01:45):

homebrew wants to install python3, even though it is already present on the github macos runners.

Scott Morrison (Nov 15 2022 at 01:46):

while installing it runs into existing files:

==> Pouring python@3.11--3.11.0.monterey.bottle.tar.gz
Error: The `brew link` step did not complete successfully
The formula built, but is not symlinked into /usr/local
Could not symlink bin/2to3-3.11
Target /usr/local/bin/2to3-3.11
already exists. You may want to remove it:
  rm '/usr/local/bin/2to3-3.11'

To force the link and overwrite all conflicting files:
  brew link --overwrite python@3.11

Julian Berman (Nov 15 2022 at 01:50):

I don't have time to look tonight (and possibly not tomorrow either), but a hacky thing to try if no one's said it already is to run brew install python@3.11 && brew link --overwrite python@3.11 explicitly before running the install step

Julian Berman (Nov 15 2022 at 01:51):

If no one figures it out by then I'm happy to look on Wednesdayish

Julian Berman (Nov 15 2022 at 01:51):

(I think it's unlikely the install_macos.sh needs changing -- this is probably some transient thing, but that's also a guess)

Julian Berman (Nov 15 2022 at 01:53):

Or even safer given that this is meant to check the install script works would be to instead run brew uninstall on whichever version of Python is the conflicting one -- I've had to do that this week for unrelated formulae that are now present on macOS runners (Firefox I think?).

Arien Malec (Nov 15 2022 at 02:25):

https://github.com/actions/runner-images/issues/4033

Arien Malec (Nov 15 2022 at 02:28):

Per https://github.com/actions/runner-images/issues/4020 fixed by updating the image version.

Arien Malec (Nov 15 2022 at 02:34):

From issue 6459, seems to be a recurrence of the issues noted above.

Arien Malec (Nov 15 2022 at 02:37):

As noted, hacking it is probably the best short term action: brew install --overwrite python@3.10 python@3.11 forces an overwrite.

Scott Morrison (Nov 15 2022 at 03:39):

It's been broken for quite a while, so I'm dubious that just waiting for an image update will help.

Scott Morrison (Nov 15 2022 at 03:40):

@Arien Malec, could you make a PR with that extra line in install_macos.sh and see if CI likes it?

Arien Malec (Nov 15 2022 at 03:45):

Can do via fork & PR

Arien Malec (Nov 15 2022 at 03:51):

PR #142

Arien Malec (Nov 15 2022 at 03:56):

might not work if the issue is with the brew update step above it, in which case might need to swap

Scott Morrison (Nov 15 2022 at 04:01):

mathlib-tools#142


Last updated: Dec 20 2023 at 11:08 UTC