Zulip Chat Archive

Stream: general

Topic: mathlib4 porting status


Scott Morrison (Nov 11 2022 at 03:46):

Status of the mathlib4 port

The porting process has started!

There's been a lot of work going on, and various public and private conversations --- time for an update.

The plan is roughly as follows:

  • We're already porting files from mathlib3 to mathlib4.
  • We want lots more people to help with this, and to learn how to review porting PRs.
  • Porting tactics is moving along; many of the major tactics are now ported, or in PR form, but there is a long tail of less frequently used tactics that remains.
  • The future is Lean 4. That is, from the maintainers point of view, on any question where the needs of mathlib3 and mathlib4 are in conflict, mathlib4 wins.
  • When files have been ported, we are adding a "This file in synchronized with mathlib4" label.
  • If you want to make a PR to mathlib3 that touches a file with this label, you're expected to making a matching PR to mathlib4, to keep the files in synch.
  • We expect and understand that as more files are synchronized, and as more peoples' efforts turn from mathlib3 to mathlib4, it is going to become harder to make successful PRs to mathlib3, and slower to receive a review.
  • As yet, we don't really now the timescale over which this will happen: this is not an announcement that this is imminent, but a warning that it is coming!
  • Eventually we may have "flag day", after which we stop accepting PRs to mathlib3 entirely.

Resources

  • If you are not subscribed to the mathlib4 stream on zulip, I'd encourage you to do so!
  • There is a wiki page giving some advice on porting. Edits and contributions to this are highly encouraged!
  • There is a video tutorial showing a simple example of porting a file.
  • The mathlib4 pull-request list has lots of PRs which you can help out on. Feel free to jump in and contribute changes to PRs, or to make comments, or to review.
  • In mathlib, you can run scripts/port_status.py to obtain a list of files which it is currently possible to port.
  • In mathlib, you can cd src && leanproject import-graph --to data.rat.order --reduce --exclude-tactics --port-status out.pdf, and get a nice colour coded PDF showing the import hierarchy of your favourite file, to help plan porting and ascertain progress.
  • The port status wiki page tracks the current state of porting.
  • There is a one-shot mode for mathport which lets you quickly try out a change, and see what mathport will do with it.

Timeline

Right now it's hard to estimate. We're constrained by being at the very beginning of the import hierarchy, so the frontier isn't very wide yet, so it's hard to evaluate what the workload-to-contributors ratio is!

As an example, here's today's output from scripts/port_status.py:

# The following files have all dependencies ported already, and should be ready to port:
# Earlier items in the list are required in more places in mathlib.
logic.equiv.defs    -- PRd as mathlib4#550
order.basic    -- PRd as mathlib4#556
logic.function.iterate    -- WIP hrmacbeth
algebra.ne_zero    -- PRd as #557
algebra.group.units    -- PRd as mathlib4#549
data.pi.algebra    -- WIP PR mathlib4#564
logic.relation    -- PRd as mathlib4#565
control.functor
lean_core.data.vector
data.char
data.rat.init
category_theory.elementwise
data.stream.defs
data.rbtree.init
data.num.basic
data.lazy_list
data.nat.bits
data.typevec
data.rbtree.default_lt
lean_core.data.dlist
control.ulift
lean_core.data.buffer
algebra.pempty_instances
lean_core.data.buffer.parser
group_theory.eckmann_hilton
data.two_pointing
data.vector3
data.json
data.mllist
data.option.n_ary
data.sigma.default
control.basic
algebra.hierarchy_design
algebra.group.commutator

This list has been sorted according to the number of downstream dependencies; so it's important to work on the top of the list first. In fact, people have been doing this, as evidenced by the comments pointing to active PRs.

To look a bit further ahead, we should use leanproject import-graph. Here's the output for algebra.order.field.defs.

algebra.order.field.defs.png

  • Green indicates a file that's been done.
  • Darker blue is a file that has an open PR (or at least someone has marked it as WIP in the port-status wiki page).
  • Orange would indicate that someone has written some other comment in the port-status wiki file, possibly an explanation of why porting this file is currently blocked.
  • Light blue is a file that someone should start porting!

I think it's very early days for extrapolating how long the whole porting process will take. Here's a heuristic that we can argue about, and see how true it ends up. Say each file takes a separate PR (pessimistic: multiple files can be done at once, and sequential PRs can be prepared and reviewed concurrently), and that each PR takes three days to process (wildly optimistic?). If we have maximum parallelism, then the time to reach any particular target is just the depth in the import graph --- the distance from the earliest not-green node to the target node. This heuristic would say:

I've posted this in the "general" stream, but I'd encourage everyone to come across to the mathlib4 stream for further discussion, just to get everyone used to hanging out there. :-)

Johan Commelin (Nov 11 2022 at 08:10):

Continuing the timeline example:

>>> nx.dag_longest_path(G)
['data.rbtree.init', 'data.rbmap.basic', 'data.list.defs', 'data.string.defs', 'meta.expr', 'tactic.binder_matching', 'tactic.core', 'tactic.algebra', 'tactic.to_additive', 'tactic.lint.default', 'tactic.interactive', 'tactic.hint', 'tactic.norm_cast', 'tactic.tidy', 'tactic.obviously', 'tactic.basic', 'logic.unique', 'logic.equiv.defs', 'order.synonym', 'order.compare', 'order.monotone', 'order.lattice', 'order.bounded_order', 'order.heyting.basic', 'order.boolean_algebra', 'order.symm_diff', 'data.set.basic', 'data.set.prod', 'data.set.function', 'logic.equiv.set', 'order.rel_iso', 'order.hom.basic', 'algebra.order.group.defs', 'algebra.order.group.type_tags', 'algebra.order.with_zero', 'data.nat.order', 'algebra.group_power.ring', 'algebra.group_power.order', 'data.nat.pow', 'algebra.group_power.lemmas',
'algebra.group_power.default', 'data.list.big_operators', 'data.list.count', 'data.list.pairwise', 'data.list.chain', 'data.list.range', 'data.list.perm', 'data.multiset.basic', 'algebra.big_operators.multiset', 'data.multiset.bind', 'data.multiset.powerset', 'data.multiset.nodup', 'data.multiset.dedup', 'data.multiset.finset_ops', 'data.finset.basic', 'data.finset.card', 'data.finset.option', 'data.finset.lattice', 'data.finset.powerset', 'data.fintype.basic', 'algebra.char_zero', 'data.int.char_zero', 'data.rat.cast', 'tactic.norm_num', 'tactic.abel', 'algebra.module.basic', 'data.set.pointwise.basic', 'group_theory.submonoid.pointwise', 'group_theory.subgroup.basic', 'ring_theory.subring.basic', 'algebra.group_ring_action', 'algebra.ring.aut', 'algebra.star.basic', 'algebra.module.linear_map', 'algebra.module.equiv', 'algebra.module.submodule.basic', 'algebra.module.submodule.lattice', 'linear_algebra.basic', 'linear_algebra.span', 'algebra.algebra.basic', 'linear_algebra.prod', 'linear_algebra.linear_independent', 'linear_algebra.basis', 'linear_algebra.bilinear_map', 'algebra.module.submodule.bilinear', 'linear_algebra.tensor_product', 'algebra.algebra.bilinear', 'algebra.algebra.operations', 'ring_theory.ideal.operations', 'algebra.algebra.subalgebra.basic', 'algebra.algebra.tower', 'ring_theory.adjoin.basic', 'data.mv_polynomial.basic', 'data.mv_polynomial.rename', 'data.mv_polynomial.monad', 'data.mv_polynomial.variables', 'data.mv_polynomial.equiv', 'ring_theory.polynomial.basic', 'ring_theory.adjoin.fg', 'ring_theory.algebra_tower', 'ring_theory.finiteness', 'field_theory.finiteness', 'linear_algebra.finite_dimensional', 'linear_algebra.matrix.finite_dimensional', 'linear_algebra.matrix.to_lin', 'linear_algebra.matrix.basis', 'linear_algebra.determinant', 'topology.algebra.module.basic', 'analysis.normed.field.basic', 'analysis.normed_space.basic', 'analysis.normed.order.basic', 'analysis.locally_convex.basic', 'analysis.locally_convex.balanced_core_hull', 'topology.algebra.module.finite_dimension', 'analysis.convex.topology', 'topology.algebra.module.locally_convex', 'analysis.seminorm', 'analysis.locally_convex.bounded', 'topology.algebra.uniform_convergence', 'topology.algebra.module.strong_topology', 'analysis.normed_space.operator_norm', 'analysis.normed_space.multilinear', 'analysis.normed_space.bounded_linear_maps', 'analysis.calculus.fderiv', 'analysis.calculus.deriv', 'analysis.calculus.local_extr', 'analysis.calculus.mean_value', 'analysis.calculus.cont_diff', 'analysis.complex.real_deriv', 'analysis.special_functions.exp_deriv', 'analysis.special_functions.log.deriv', 'analysis.special_functions.pow_deriv', 'analysis.convex.specific_functions', 'analysis.special_functions.trigonometric.complex', 'analysis.special_functions.trigonometric.arctan', 'measure_theory.function.special_functions', 'measure_theory.integral.mean_inequalities', 'measure_theory.function.simple_func_dense', 'measure_theory.function.strongly_measurable', 'measure_theory.function.ae_eq_fun', 'measure_theory.function.lp_space', 'measure_theory.function.lp_order', 'measure_theory.function.l1_space', 'measure_theory.function.simple_func_dense_lp', 'measure_theory.integral.set_to_l1', 'measure_theory.integral.bochner', 'measure_theory.integral.set_integral', 'measure_theory.function.ae_eq_of_integral', 'measure_theory.measure.with_density_vector_measure', 'measure_theory.decomposition.lebesgue', 'measure_theory.decomposition.radon_nikodym', 'measure_theory.function.conditional_expectation.basic', 'measure_theory.function.conditional_expectation.indicator', 'measure_theory.function.conditional_expectation.real', 'probability.process.filtration', 'probability.process.adapted', 'probability.process.stopping', 'probability.process.hitting_time', 'probability.martingale.basic', 'probability.martingale.upcrossing', 'probability.martingale.convergence', 'probability.martingale.borel_cantelli', 'probability.borel_cantelli']

This path has length 162. So that might also give a bit of a pointer in which direction to look for optimizing the porting PR process.

Scott Morrison (Nov 11 2022 at 10:17):

There are so many dubious imports in that chain. :-)

Scott Morrison (Nov 11 2022 at 10:21):

Actually, I bet it has already dropped some after #17435 merged.

Anne Baanen (Nov 11 2022 at 10:28):

I can take a look at reducing the dependencies of and on linear_algebra.finite_dimensional, that's a pretty fundamental file that is deep in the tree

Riccardo Brasca (Nov 11 2022 at 12:10):

@Anne Baanen concerning finite_dimensional you can have a look at the discussion in #17401. It would be nice to be able to import linear_algebra.free_module.finite.basicin linear_algebra.finite_dimensional, but this is currently impossible.

Anne Baanen (Nov 11 2022 at 13:05):

It's a bit of a tangled mess at the moment. My approach is going to be:

  • split off the definition of finrank from finite_dimensional (needed for linear_algebra.free_module.finite.rank)
  • move the last part of finsupp_vector_space to finite_dimensional (doesn't need any finsupp specific results)
  • move free and finite instances for linear_map to a new file and/or rework matrix.to_lin to use module.finite instead of finite_dimensional

Mario Carneiro (Nov 11 2022 at 13:12):

Here's another fun game you can do with that list: mark all the modules containing theorems that appear in the theorem dependency graph of the borel-cantelli lemma. The unmarked modules are ones which are imported but for which no theorems in the file contribute to the proof, which makes them highly likely to be a spurious import which deserves a file split somewhere

Yaël Dillies (Nov 11 2022 at 14:32):

Why does topology.algebra.module.basic import linear_algebra.determinant?

Moritz Doll (Nov 11 2022 at 14:37):

https://github.com/leanprover-community/mathlib/commit/5b36c868a1947a1ad17245be6fd280c9b91b1f6c#diff-4268c2431deac11bcaa91df861afa3b1dc02646ea27b354938035233a272efaa

Anne Baanen (Nov 11 2022 at 14:59):

Shall I move that to topology.algebra.module.determinant then?

Anne Baanen (Nov 11 2022 at 14:59):

(I have very little idea how much counts as "basic" for this part of the library)

Sebastien Gouezel (Nov 11 2022 at 15:16):

I think I added the linear_algebra.determinant import (to be able to say that the determinant is a continuous function, or something like that), and at the time I decided that determinant was more basic than continuous linear maps, as most of linear algebra is already imported in this file (as well as big operators, and so on).

Anatole Dedecker (Nov 11 2022 at 15:28):

(deleted)

Martin Dvořák (Nov 11 2022 at 15:33):

Where is the "This file in synchronized with mathlib4" label and what does it look like?

Moritz Doll (Nov 11 2022 at 15:35):

below the title. see https://leanprover-community.github.io/mathlib_docs/logic/lemmas.html

Notification Bot (Nov 11 2022 at 16:07):

11 messages were moved here from #general > mathlib4 porting status by Anne Baanen.

Martin Dvořák (Nov 11 2022 at 17:31):

Is it always a docstring right under imports?


Last updated: Dec 20 2023 at 11:08 UTC