Topic: mathlib4 porting status
Scott Morrison (Nov 11 2022 at 03:46):
Status of the mathlib4 port
The porting process has started!
There's been a lot of work going on, and various public and private conversations --- time for an update.
The plan is roughly as follows:
- We're already porting files from mathlib3 to mathlib4.
- We want lots more people to help with this, and to learn how to review porting PRs.
- Porting tactics is moving along; many of the major tactics are now ported, or in PR form, but there is a long tail of less frequently used tactics that remains.
- The future is Lean 4. That is, from the maintainers point of view, on any question where the needs of mathlib3 and mathlib4 are in conflict, mathlib4 wins.
- When files have been ported, we are adding a "This file in synchronized with mathlib4" label.
- If you want to make a PR to mathlib3 that touches a file with this label, you're expected to making a matching PR to mathlib4, to keep the files in synch.
- We expect and understand that as more files are synchronized, and as more peoples' efforts turn from mathlib3 to mathlib4, it is going to become harder to make successful PRs to mathlib3, and slower to receive a review.
- As yet, we don't really now the timescale over which this will happen: this is not an announcement that this is imminent, but a warning that it is coming!
- Eventually we may have "flag day", after which we stop accepting PRs to mathlib3 entirely.
- If you are not subscribed to the mathlib4 stream on zulip, I'd encourage you to do so!
- There is a wiki page giving some advice on porting. Edits and contributions to this are highly encouraged!
- There is a video tutorial showing a simple example of porting a file.
- The mathlib4 pull-request list has lots of PRs which you can help out on. Feel free to jump in and contribute changes to PRs, or to make comments, or to review.
mathlib, you can run
scripts/port_status.pyto obtain a list of files which it is currently possible to port.
mathlib, you can
cd src && leanproject import-graph --to data.rat.order --reduce --exclude-tactics --port-status out.pdf, and get a nice colour coded PDF showing the import hierarchy of your favourite file, to help plan porting and ascertain progress.
- The port status wiki page tracks the current state of porting.
- There is a one-shot mode for mathport which lets you quickly try out a change, and see what mathport will do with it.
Right now it's hard to estimate. We're constrained by being at the very beginning of the import hierarchy, so the frontier isn't very wide yet, so it's hard to evaluate what the workload-to-contributors ratio is!
As an example, here's today's output from
# The following files have all dependencies ported already, and should be ready to port: # Earlier items in the list are required in more places in mathlib. logic.equiv.defs -- PRd as mathlib4#550 order.basic -- PRd as mathlib4#556 logic.function.iterate -- WIP hrmacbeth algebra.ne_zero -- PRd as #557 algebra.group.units -- PRd as mathlib4#549 data.pi.algebra -- WIP PR mathlib4#564 logic.relation -- PRd as mathlib4#565 control.functor lean_core.data.vector data.char data.rat.init category_theory.elementwise data.stream.defs data.rbtree.init data.num.basic data.lazy_list data.nat.bits data.typevec data.rbtree.default_lt lean_core.data.dlist control.ulift lean_core.data.buffer algebra.pempty_instances lean_core.data.buffer.parser group_theory.eckmann_hilton data.two_pointing data.vector3 data.json data.mllist data.option.n_ary data.sigma.default control.basic algebra.hierarchy_design algebra.group.commutator
This list has been sorted according to the number of downstream dependencies; so it's important to work on the top of the list first. In fact, people have been doing this, as evidenced by the comments pointing to active PRs.
To look a bit further ahead, we should use
leanproject import-graph. Here's the output for
- Green indicates a file that's been done.
- Darker blue is a file that has an open PR (or at least someone has marked it as WIP in the port-status wiki page).
- Orange would indicate that someone has written some other comment in the port-status wiki file, possibly an explanation of why porting this file is currently blocked.
- Light blue is a file that someone should start porting!
I think it's very early days for extrapolating how long the whole porting process will take. Here's a heuristic that we can argue about, and see how true it ends up. Say each file takes a separate PR (pessimistic: multiple files can be done at once, and sequential PRs can be prepared and reviewed concurrently), and that each PR takes three days to process (wildly optimistic?). If we have maximum parallelism, then the time to reach any particular target is just the depth in the import graph --- the distance from the earliest not-green node to the target node. This heuristic would say:
- algebra.order.field.defs.pdf : depth 12, ~1 month
- data.rat.order.pdf : depth 19, ~2 months (up to 9 files on a level)
- category_theory.limits.yoneda.pdf : depth 29, ~3 months (this development is quite "linear"; there's never more than 4 files at a level)
- data.fintype.basic.pdf : depth 38, ~4 months
- topology.instances.real.pdf : depth 75, ~8 months
I've posted this in the "general" stream, but I'd encourage everyone to come across to the mathlib4 stream for further discussion, just to get everyone used to hanging out there. :-)
Johan Commelin (Nov 11 2022 at 08:10):
Continuing the timeline example:
>>> nx.dag_longest_path(G) ['data.rbtree.init', 'data.rbmap.basic', 'data.list.defs', 'data.string.defs', 'meta.expr', 'tactic.binder_matching', 'tactic.core', 'tactic.algebra', 'tactic.to_additive', 'tactic.lint.default', 'tactic.interactive', 'tactic.hint', 'tactic.norm_cast', 'tactic.tidy', 'tactic.obviously', 'tactic.basic', 'logic.unique', 'logic.equiv.defs', 'order.synonym', 'order.compare', 'order.monotone', 'order.lattice', 'order.bounded_order', 'order.heyting.basic', 'order.boolean_algebra', 'order.symm_diff', 'data.set.basic', 'data.set.prod', 'data.set.function', 'logic.equiv.set', 'order.rel_iso', 'order.hom.basic', 'algebra.order.group.defs', 'algebra.order.group.type_tags', 'algebra.order.with_zero', 'data.nat.order', 'algebra.group_power.ring', 'algebra.group_power.order', 'data.nat.pow', 'algebra.group_power.lemmas', 'algebra.group_power.default', 'data.list.big_operators', 'data.list.count', 'data.list.pairwise', 'data.list.chain', 'data.list.range', 'data.list.perm', 'data.multiset.basic', 'algebra.big_operators.multiset', 'data.multiset.bind', 'data.multiset.powerset', 'data.multiset.nodup', 'data.multiset.dedup', 'data.multiset.finset_ops', 'data.finset.basic', 'data.finset.card', 'data.finset.option', 'data.finset.lattice', 'data.finset.powerset', 'data.fintype.basic', 'algebra.char_zero', 'data.int.char_zero', 'data.rat.cast', 'tactic.norm_num', 'tactic.abel', 'algebra.module.basic', 'data.set.pointwise.basic', 'group_theory.submonoid.pointwise', 'group_theory.subgroup.basic', 'ring_theory.subring.basic', 'algebra.group_ring_action', 'algebra.ring.aut', 'algebra.star.basic', 'algebra.module.linear_map', 'algebra.module.equiv', 'algebra.module.submodule.basic', 'algebra.module.submodule.lattice', 'linear_algebra.basic', 'linear_algebra.span', 'algebra.algebra.basic', 'linear_algebra.prod', 'linear_algebra.linear_independent', 'linear_algebra.basis', 'linear_algebra.bilinear_map', 'algebra.module.submodule.bilinear', 'linear_algebra.tensor_product', 'algebra.algebra.bilinear', 'algebra.algebra.operations', 'ring_theory.ideal.operations', 'algebra.algebra.subalgebra.basic', 'algebra.algebra.tower', 'ring_theory.adjoin.basic', 'data.mv_polynomial.basic', 'data.mv_polynomial.rename', 'data.mv_polynomial.monad', 'data.mv_polynomial.variables', 'data.mv_polynomial.equiv', 'ring_theory.polynomial.basic', 'ring_theory.adjoin.fg', 'ring_theory.algebra_tower', 'ring_theory.finiteness', 'field_theory.finiteness', 'linear_algebra.finite_dimensional', 'linear_algebra.matrix.finite_dimensional', 'linear_algebra.matrix.to_lin', 'linear_algebra.matrix.basis', 'linear_algebra.determinant', 'topology.algebra.module.basic', 'analysis.normed.field.basic', 'analysis.normed_space.basic', 'analysis.normed.order.basic', 'analysis.locally_convex.basic', 'analysis.locally_convex.balanced_core_hull', 'topology.algebra.module.finite_dimension', 'analysis.convex.topology', 'topology.algebra.module.locally_convex', 'analysis.seminorm', 'analysis.locally_convex.bounded', 'topology.algebra.uniform_convergence', 'topology.algebra.module.strong_topology', 'analysis.normed_space.operator_norm', 'analysis.normed_space.multilinear', 'analysis.normed_space.bounded_linear_maps', 'analysis.calculus.fderiv', 'analysis.calculus.deriv', 'analysis.calculus.local_extr', 'analysis.calculus.mean_value', 'analysis.calculus.cont_diff', 'analysis.complex.real_deriv', 'analysis.special_functions.exp_deriv', 'analysis.special_functions.log.deriv', 'analysis.special_functions.pow_deriv', 'analysis.convex.specific_functions', 'analysis.special_functions.trigonometric.complex', 'analysis.special_functions.trigonometric.arctan', 'measure_theory.function.special_functions', 'measure_theory.integral.mean_inequalities', 'measure_theory.function.simple_func_dense', 'measure_theory.function.strongly_measurable', 'measure_theory.function.ae_eq_fun', 'measure_theory.function.lp_space', 'measure_theory.function.lp_order', 'measure_theory.function.l1_space', 'measure_theory.function.simple_func_dense_lp', 'measure_theory.integral.set_to_l1', 'measure_theory.integral.bochner', 'measure_theory.integral.set_integral', 'measure_theory.function.ae_eq_of_integral', 'measure_theory.measure.with_density_vector_measure', 'measure_theory.decomposition.lebesgue', 'measure_theory.decomposition.radon_nikodym', 'measure_theory.function.conditional_expectation.basic', 'measure_theory.function.conditional_expectation.indicator', 'measure_theory.function.conditional_expectation.real', 'probability.process.filtration', 'probability.process.adapted', 'probability.process.stopping', 'probability.process.hitting_time', 'probability.martingale.basic', 'probability.martingale.upcrossing', 'probability.martingale.convergence', 'probability.martingale.borel_cantelli', 'probability.borel_cantelli']
This path has length 162. So that might also give a bit of a pointer in which direction to look for optimizing the porting PR process.
Scott Morrison (Nov 11 2022 at 10:17):
There are so many dubious imports in that chain. :-)
Scott Morrison (Nov 11 2022 at 10:21):
Actually, I bet it has already dropped some after #17435 merged.
Anne Baanen (Nov 11 2022 at 10:28):
I can take a look at reducing the dependencies of and on
linear_algebra.finite_dimensional, that's a pretty fundamental file that is deep in the tree
Riccardo Brasca (Nov 11 2022 at 12:10):
@Anne Baanen concerning
finite_dimensional you can have a look at the discussion in #17401. It would be nice to be able to import
linear_algebra.finite_dimensional, but this is currently impossible.
Anne Baanen (Nov 11 2022 at 13:05):
It's a bit of a tangled mess at the moment. My approach is going to be:
- split off the definition of
- move the last part of
finite_dimensional(doesn't need any
linear_mapto a new file and/or rework
Mario Carneiro (Nov 11 2022 at 13:12):
Here's another fun game you can do with that list: mark all the modules containing theorems that appear in the theorem dependency graph of the borel-cantelli lemma. The unmarked modules are ones which are imported but for which no theorems in the file contribute to the proof, which makes them highly likely to be a spurious import which deserves a file split somewhere
Yaël Dillies (Nov 11 2022 at 14:32):
Moritz Doll (Nov 11 2022 at 14:37):
Anne Baanen (Nov 11 2022 at 14:59):
Shall I move that to
Anne Baanen (Nov 11 2022 at 14:59):
(I have very little idea how much counts as "basic" for this part of the library)
Sebastien Gouezel (Nov 11 2022 at 15:16):
I think I added the
linear_algebra.determinant import (to be able to say that the determinant is a continuous function, or something like that), and at the time I decided that determinant was more basic than continuous linear maps, as most of linear algebra is already imported in this file (as well as big operators, and so on).
Anatole Dedecker (Nov 11 2022 at 15:28):
Martin Dvořák (Nov 11 2022 at 15:33):
Where is the "This file in synchronized with mathlib4" label and what does it look like?
Moritz Doll (Nov 11 2022 at 15:35):
below the title. see https://leanprover-community.github.io/mathlib_docs/logic/lemmas.html
Notification Bot (Nov 11 2022 at 16:07):
11 messages were moved here from #general > mathlib4 porting status by Anne Baanen.
Martin Dvořák (Nov 11 2022 at 17:31):
Is it always a docstring right under imports?
Last updated: Aug 03 2023 at 10:10 UTC