Zulip Chat Archive

Stream: nightly-testing

Topic: moving to the module system


Kim Morrison (Oct 21 2025 at 00:35):

Modulizing checklist:

  • [x] confirm nightly-testing works against the modulize-nightly branch of quote4
  • [x] merge quote4's modulize-nightly branch into nightly-testing
  • [x] update mathlib's nightly-testing to point back to quote4's nightly-testing
  • [x] merge https://github.com/leanprover/lean4-cli/pull/56 to bring lean4-cli's nightly-testing branch to the module system
  • [x] lake update on import-graph and then mathlib4's nightly-testing branches to verify that worked (ci)
  • [ ] the batteries modulizing branch is called modsys
    • it's currently on a custom toolchain using lean#10819, which has not been merged
    • [ ] update this to a nightly toolchain when available
    • [ ] catch up to nightly-testing
  • [ ] the aesop modulizing branch is called modulize
    • currently on a custom toolchain using lean#10624, which has already been merged
    • lake build succeeds but lake test fails in batteries
    • [ ] once lean#10819 has been merged and/or rebased, run lake update, and change the toolchain to match batteries' modsys branch
  • import-graph has a modulize branch (on the leanprover-community fork) on the Kha fork, but it is breaking
    • [x] update to nightly-2025-10-25 and lake update lean4-cli fixes the lake build breakages there, but I can't push to @Sebastian Ullrich's fork, instead I've pushed 6de2dd4b to the modulize branch on the main repository
    • [x] fixed lake test, pushing fa1cae3383
    • [x] merge to main/nightly-testing
    • [x] lake update on mathlib4's nightly-testing branch
  • plausible has a modulize branch on the Kha fork (using the already-merged lean#10655 toolchain)
    • [x] merged new material from main (via nightly-testing) and completed modulization
    • [x] updated toolchain to nightly-2025-10-25
    • [x] updated toolchain to v4.25.0-rc1
    • [x] merge to main/nightly-testing
    • [x] lake update on mathlib4's nightly-testing branch
  • ProofWidgets4 has a modulize branch on the Kha fork
    • [x] merge to main / nightly-testing
    • [x] make a new release: v0.0.77
    • [x] lake update on mathlib4's nightly-testing branch
  • LeanSearchClient has a modulize branch on the Kha fork
    • [x] merged changes to main, fixed tests
    • [x] merge to nightly-testing
    • [x] lake update on mathlib4's nightly-testing branch
  • the mathlib4 modulizing branch is called modulize (on the mathlib4-nightly-testing fork), and uses the lean#10819 toolchain

Kim Morrison (Oct 21 2025 at 00:35):

(to clarify, this is all stuff Sebastian has been working nonstop on: this checklist is just me trying to make sure I understand what order to merge it all in!)

Kim Morrison (Oct 21 2025 at 04:06):

Phew... okay!

I think that everything except lean#10819, batteries, and aesop is done!

When #30740 lands, Mathlib will be on v4.25.0-rc1, and all of its dependencies except Batteries and Aesop will be on the module system.

Johan Commelin (Oct 21 2025 at 04:19):

That's a lot of modules! Vielen Dank @Sebastian Ullrich!

Sebastian Ullrich (Oct 21 2025 at 07:52):

Thanks @Kim Morrison for all the orchestration :)

Sebastian Ullrich (Nov 05 2025 at 17:49):

As some of you might already know, https://github.com/leanprover-community/mathlib4-nightly-testing/pull/80 contains a now-building port of Mahtlib to the module system. I've split the PR into more meaningful commits today and updated the PR description with some TODOs. I will be on vacation for the rest of the week and then look at the heartbeat regressions next week but if in the meantime people want to start exploring the PR, either looking at the diffs or playing around with parts in the editor, e.g. trying out changes that should short-circuit the build, or even rebasing the PR onto a new nightly, that would be very helpful!

/cc @Kim Morrison @Johan Commelin

Kevin Buzzard (Nov 05 2025 at 19:00):

Would !bench be a sensible thing to do at this point?

Sebastian Ullrich (Nov 05 2025 at 19:20):

I think I'll look at the known regressions first but I'll add it to the list!

Kim Morrison (Nov 05 2025 at 21:13):

And just to set expectations: the !bench results will likely have some regressions. We may have to accept a step backwards in places in order to be able to move forward. But we're quite confident that in the long term everything will be faster to download, build, and import. This port of Mathlib to the module system does not yet attempt to tighten up imports, or redesign interfaces to hide implementations.

Kim Morrison (Nov 05 2025 at 22:08):

I fixed the failing test in DeriveEncodable.

Kim Morrison (Nov 05 2025 at 22:09):

Do you know what is going on with the linter complaining about missing doc-strings on structure parent fields?

Sebastian Ullrich (Nov 05 2025 at 22:09):

No, I just saw it and was surprised as well. Something subtle I'm sure...

Sebastian Ullrich (Nov 05 2025 at 22:10):

Does cache work for you? Everything is rebuilt for me

Kim Morrison (Nov 05 2025 at 22:13):

Cache is working great for me!

Sebastian Ullrich (Nov 05 2025 at 22:14):

Huh...

Sebastian Ullrich (Nov 05 2025 at 22:19):

Played myself, unset LEAN_GITHASH fixes it

Kim Morrison (Nov 05 2025 at 22:24):

I'm finding examples where I can replace a proof step by sorry and get a fast rebuild of everything downstream, but replacing it with an alternative correct proof results in everything rebuilding. Is that expected?

Sebastian Ullrich (Nov 05 2025 at 22:25):

Only if a tactic in there is accidentally leaking information into the public scope. So any such cases would be very interesting to look at.

Sebastian Ullrich (Nov 05 2025 at 22:27):

Oh! I'm not sure I thought through the interaction of tactics generating private decls and backward.privateInPublic exporting private decls. I think we'll want to hard-disable the option whenever we descend into a theorem body.

Kim Morrison (Nov 05 2025 at 22:28):

Great: change the last line of lemma geom_sum_inv to grind is an easy example.

Sebastian Ullrich (Nov 05 2025 at 22:30):

Just to log my own more trivial rebuild avoidance tests now that I have a working build again:

diff --git a/Mathlib/Data/Real/Basic.lean b/Mathlib/Data/Real/Basic.lean
index 85216ff764e..70e43dc1110 100644
--- a/Mathlib/Data/Real/Basic.lean
+++ b/Mathlib/Data/Real/Basic.lean
@@ -562,8 +562,13 @@ end Real
 def IsPowMul {R : Type*} [Pow R ℕ] (f : R → ℝ) :=
   ∀ (a : R) {n : ℕ}, 1 ≤ n → f (a ^ n) = f a ^ n

+
+-- hi!
+
+/-- Hi!! -/
 lemma IsPowMul.map_one_le_one {R : Type*} [Monoid R] {f : R → ℝ} (hf : IsPowMul f) :
     f 1 ≤ 1 := by
+  apply id
   have hf1 : (f 1)^2 = f 1 := by conv_rhs => rw [← one_pow 2, hf _ one_le_two]
   rcases eq_zero_or_one_of_sq_eq_self hf1 with h | h <;> rw [h]
   exact zero_le_one
$ time lake build |  cat
✔ [5818/7479] Built Mathlib.Data.Real.Basic (1.8s)
Build completed successfully (7479 jobs).
lake build  12.00s user 1.20s system 352% cpu 3.745 total
cat  0.00s user 0.00s system 0% cpu 3.743 total

Sebastian Ullrich (Nov 05 2025 at 22:49):

@Kim Morrison Confirmed fixed by lean#11097. And with that I'm off.

Kim Morrison (Nov 06 2025 at 11:17):

Hmm, just tried rebasing onto nightly-testing-2025-11-06. I replaced the second commit with the result of running the commit message (after replacing the {Tactic,Util} piece with two separate commands...?), but then I get lots of conflicts as rebase tried to apply the later commits, so I aborted. :-(

Sebastian Ullrich (Nov 06 2025 at 11:19):

Oh, hm. Yesterday I only got a single conflicted file that way. Do you have an example conflict?

Sebastian Ullrich (Nov 06 2025 at 11:20):

The cmdline syntax may have been zsh specific, my bad

Kim Morrison (Nov 06 2025 at 11:22):

There were about twenty files conflicted. It looked like Modulize had simply not updated the header at all, and then the later commit was trying to modify a (should-have-been) modulized header.

First conflict message after the modulize step: gist.

Sebastian Ullrich (Nov 06 2025 at 11:23):

On my phone right now, maybe I was using other special shell syntax like double-asterisk that didn't match the necessary files in your shell?

Kim Morrison (Nov 06 2025 at 11:25):

Ah, I didn't check how my shell copes with **.

Sebastian Ullrich (Nov 06 2025 at 11:26):

It's shopt -s globstar in bash

Kim Morrison (Nov 06 2025 at 11:35):

Success. (I just switched to using find for robustness.)

Kim Morrison (Nov 06 2025 at 11:35):

I've pushed the rebased branch.

Sebastian Ullrich (Nov 06 2025 at 11:36):

Perfect, thanks!

Kim Morrison (Nov 06 2025 at 11:57):

Fixed two problems, so there are two additional commits. One is backported as #31323.

Kim Morrison (Nov 06 2025 at 12:00):

modulize.gif

:fire:

Sebastian Ullrich (Nov 06 2025 at 14:36):

Sebastian Ullrich said:

then look at the heartbeat regressions

Well, with some time to kill on the train, this turned out to be rather easy: there are no more heartbeat regressions, I was able to simply delete the commit. What's more, no other files were rebuilt from removing the set_option lines.

Sebastian Ullrich (Nov 06 2025 at 15:35):

! bench will not work until the linter is happy

Kim Morrison (Nov 07 2025 at 01:27):

Hmm, a clue as to the linter problem:

Adding

open Lean in
run_meta do
  let env  getEnv
  let some r  Lean.findDocString? env `GroupWithZero.zpow | failure
  logInfo r

in Mathlib.Algebra.GroupWithZero.Defs right after class GroupWithZero, when we run this in VSCode, it finds the doc-string, but when we run it with lake build it fails!

Kim Morrison (Nov 07 2025 at 01:49):

Okay, here is a minimizer, unfortunately requiring two files:

A.lean:

module

public section

class Monoid (M : Type) extends Mul M where

class DivInvMonoid (G : Type) extends Mul G where
  /-- The power operation: `a ^ n = a * ··· * a`; `a ^ (-n) = a⁻¹ * ··· a⁻¹` (`n` times) -/
  protected zpow : Int  G  G

B.lean:

module

import Lean.DocString
import Lean.Meta.Basic
public import A

public section

class GroupWithZero (G : Type) extends Monoid G, DivInvMonoid G where

/-- info: The power operation: `a ^ n = a * ··· * a`; `a ^ (-n) = a⁻¹ * ··· a⁻¹` (`n` times) -/
#guard_msgs in
open Lean in
run_meta do
  let env  getEnv
  let some r  Lean.findDocString? env `GroupWithZero.zpow | failure
  logInfo r

This succeeds in VSCode, but fails under lake build (and I think faithfully reflects the problem that the linter is hitting).

Kim Morrison (Nov 07 2025 at 02:00):

Oooh, further clues... Trying to install this as a test in tests/pkg/module/, it mysteriously works, because there we set the option Elab.inServer. Without this option, it fails.

Kim Morrison (Nov 07 2025 at 02:16):

I guess the problem is just that at https://github.com/leanprover/lean4/blob/master/src/Lean/Elab/Structure.lean#L665, the doc-string we want is in the .server.olean, but we don't have access to it.

I think this will have to wait for Sebastian to get back to make further progress.

Kim Morrison (Nov 07 2025 at 02:16):

In the meantime I may disable the linters so we can run a benchmark.

Kim Morrison (Nov 07 2025 at 02:27):

I guess we could stash the doc-strings in StructureFieldInfo?

Sebastian Ullrich (Nov 07 2025 at 08:22):

Oh! inherit_doc now has API for linking docstrings instead of copying them, we should use that here as well

Kim Morrison (Nov 09 2025 at 22:45):

lean#11122

Kim Morrison (Nov 10 2025 at 00:15):

I've rebased modulize onto lean-pr-testing-11133.

Kim Morrison (Nov 10 2025 at 00:17):

That seems to have fixed the linter issue at least in Mathlib.Algebra.GroupWithZero.Defs, and we'll wait for CI for the rest.

Kim Morrison (Nov 10 2025 at 01:53):

Nice, :check: on nightly#80.

Kim Morrison (Nov 10 2025 at 03:26):

And benchmark results at https://speed.lean-lang.org/mathlib4/compare/c02b2a1e-4afe-4559-92ab-29d75c1348c7/to/e7b27246-a3e6-496a-b552-ff4b45c7236e?hash2=69208ce408754bde2afd979fd39ede96dd8e1c42

Kim Morrison (Nov 10 2025 at 03:27):

2% increase in instructions, -0.5% decrease in wall clock, -60% decrease in (primary) .olean size.

Ruben Van de Velde (Nov 10 2025 at 07:38):

What does "primary" mean here?

Sebastian Ullrich (Nov 10 2025 at 08:37):

It's the only .olean kind you need for regular imports. Some additional smaller files are necessary for server mode, so that would likely be what a future lake cache would download by default.

Eric Wieser (Nov 10 2025 at 09:28):

Am I right in thinking that the private files are also needed for server mode, or is that not the case?

Sebastian Ullrich (Nov 10 2025 at 09:29):

That is not the case

Eric Wieser (Nov 10 2025 at 09:30):

Would lake cache not need to download any private files which are privately imported (import all?), to enable working in those files? Or is the thought that lake cache is only for downloading oleans for dependencies the user will not be editing, not the current project?

Sebastian Ullrich (Nov 10 2025 at 09:31):

Yes but import all should be the exception rather than the norm

Eric Wieser (Nov 10 2025 at 09:32):

I'd expect it to be fairly common to have 2-3 files that import all each other before drawing an abstraction boundary around the collection

Sebastian Ullrich (Nov 10 2025 at 09:33):

We'll have to see

Sebastian Ullrich (Nov 10 2025 at 09:34):

With caching integrated into Lake, fetching additional files when opening such modules would also be an option

Eric Wieser (Nov 10 2025 at 09:34):

Kim Morrison said:

-60% decrease in (primary) .olean size.

What's the total size change for downstream projects that don't opt into the module system? These need both primary and private oleans, right?

Sebastian Ullrich (Nov 10 2025 at 09:43):

The total .lake size in bytes appears to be 5.5% higher

Eric Wieser (Nov 10 2025 at 09:44):

Which I guess includes the server files

Sebastian Ullrich (Nov 10 2025 at 09:44):

Yes, all of them

Eric Wieser (Nov 10 2025 at 09:45):

Are those still needed for executing module-less packages? Or is main + private enough?

Sebastian Ullrich (Nov 10 2025 at 09:45):

If you want to see Mathlib docstrings, then yes :)

Kim Morrison (Nov 10 2025 at 11:56):

I just rebased onto nightly-testing-2025-11-10, and pushed.

Kim Morrison (Nov 10 2025 at 11:59):

@Sebastian Ullrich, what are your intentions re: shake? If you're actively working on a module-aware replacement, I propose that we simply stop using the old shake in Mathlib (at least on nightly-testing, and perhaps after v4.26.0-rc1 too?) until that is available. As long as we re-run it once it is available again, there's no particular loss from letting in a few un-shaken PRs in the meantime.

(Context: this is probably the last remaining issue before we can merge this; it seems likely and desirable that Mathlib v4.26.0-rc1 can be on the module system.)

Sebastian Ullrich (Nov 10 2025 at 12:25):

Yes, I was starting to look into that and the New Shake does want to make use of the new import modifiers to the fullest:

/home/sebastian/mathlib4/Mathlib/Topology/Category/TopCat/Sphere.lean:
  remove #[public import Mathlib.Topology.Category.TopCat.EpiMono]
  add #[import Mathlib.Tactic.Convert, import Mathlib.Tactic.Tauto, public import Mathlib.Topology.Category.TopCat.Basic, import Mathlib.Topology.Category.TopCat.EpiMono, import Mathlib.Analysis.Normed.Module.FiniteDimension]

From the system's PoV, this does look basically ideal: We have a few public imports for presumably the signatures in the module and then a lot of private imports used only in proofs. And because the same thing was done in upstream files, the proof imports are more explicit instead of relying on prior transitive imports.

But that may not yet be what is ideally desirable for human maintenance and so will likely need some iterations of human reviews and tweaks on top. So what I'm considering instead as a first step is to add back support for the ignore file into the new shake - or rather those parts of the ignore file that are not purely for working around limitations of the old shake.

Sebastian Ullrich (Nov 10 2025 at 13:56):

Actually that would only help with the two Mathlib.Tactic imports. Perhaps the better approach for now would be a flag for disabling demotion of any public imports to private ones, that should minimize the diff

Kim Morrison (Nov 10 2025 at 14:18):

re: actually merging this, I propose that we don't merge this into nightly-testing until basically the day that we're ready for v4.26.0-rc1 in Mathlib (i.e. merging modulize -> nightly-testing -> bump/v4.26.0 -> master on the same day).

I think if we merge it prematurely, we'll get nasty problems with the "merge master into nightly-testing, resolving conflicts automatically", doing the wrong thing on every commit...

Instead we'll just keep rebasing onto each new nightly-testing-YYYY-MM-DD tag for the next 3-4 days, to make sure everything stays "live".

Kim Morrison (Nov 10 2025 at 14:18):

That day could be this Friday, or next Monday, perhaps?

Sebastian Ullrich (Nov 10 2025 at 14:28):

SGTM!

Sebastian Ullrich (Nov 10 2025 at 14:39):

I'm attending a conference remotely on Monday so my availability may be limited that day

Ruben Van de Velde (Nov 10 2025 at 14:51):

On the other hand, it sounds like a fairly risky change, so maybe not just before the weekend? :)

Kim Morrison (Nov 10 2025 at 14:52):

Ideally (optimistically), I would like to just do this as part of moving Mathlib to v4.26.0-rc1.

Eric Wieser (Nov 10 2025 at 15:10):

Perhaps hijacking this thread, will https://lean-lang.org/doc/reference/latest/Source-Files-and-Modules/#module-headers be updated to explain the new syntax before the next release lands?

Sebastian Ullrich (Nov 10 2025 at 15:18):

I think it would be best to keep the module system experimental for one more cycle because inevitably some issues are likely to come up and we don't want to signal to Mathlib dependencies yet that they should update immediately. Perhaps a forward reference to the module system chapter could be placed into those docs until stabilization.

Eric Wieser (Nov 10 2025 at 15:24):

I guess what I'm struggling to find is any indication of the exhaustive syntax of the import command

Sebastian Ullrich (Nov 10 2025 at 15:51):

Let me rope in @David Thrane Christiansen here as well for visibility. If anyone wants to take a stab at that, I'm happy to review it, otherwise I can get to that some time this week.

Sebastian Ullrich (Nov 10 2025 at 16:05):

re shake: well, one issue appears to be that there are duplicate, explicitly named instances
image.png

The only reason this is accepted must be because they are Prop classes and so amenable to import-merging (which was only ever meant for auxiliary decls and may be removed at some point) but they do confuse the heck out of shake. Is this intended?

(edit: I'll continue via renaming for now and see whether that was a one-off case)

Ruben Van de Velde (Nov 10 2025 at 16:34):

99% sure not intentional

Ruben Van de Velde (Nov 10 2025 at 16:34):

The import merging is really confusing

Eric Wieser (Nov 10 2025 at 16:35):

Perhaps import merging should emit a warning, at least if a certain linter is enabled?

Ruben Van de Velde (Nov 10 2025 at 16:52):

#31474 fixes the duplication

Robin Arnez (Nov 10 2025 at 18:24):

Sebastian Ullrich schrieb:

So what I'm considering instead as a first step is to add back support for the ignore file into the new shake

Might it not be better to just introduce something like a keep public [meta] import <module> command that registers that module as an extraModUse? That way the ignore rule is less decoupled from the file its about and it's still easy enough to grep for (keep.*import)

Sebastian Ullrich (Nov 10 2025 at 18:30):

The example above is just one out of hundreds or thousands

Sebastian Ullrich (Nov 10 2025 at 20:00):

The only other duplicate I've seen (that creates actual errors when reorganizing imports) is List.mem_finRange, which exists in both Batteries and Mathlib but is @[simp] only in the latter

Kim Morrison (Nov 10 2025 at 22:17):

Sebastian Ullrich said:

mem_finRange

These will be deleted shortly once lean#9515 lands.

Kim Morrison (Nov 11 2025 at 11:22):

Rebased onto nightly-testing-2025-11-11.

Sebastian Ullrich (Nov 11 2025 at 19:27):

Kim Morrison said:

2% increase in instructions, -0.5% decrease in wall clock, -60% decrease in (primary) .olean size.

Perhaps most significantly when it comes to immediate impact, -2.8GB (-48%) maximum memory both when building Mathlib and when opening Mathlib.lean in the editor! /cc @Damiano Testa

Kim Morrison (Nov 14 2025 at 05:54):

Rebased onto nightly-testing-2025-11-13.

Michael Rothgang (Nov 14 2025 at 08:18):

Benchmarking in #31589 seems to indicate mathlib gets 3% faster in instructions now? I'll very happily take this, but don't understand where the 5% difference to previous benchmarks comes from. Anybody have an idea?

Sebastian Ullrich (Nov 14 2025 at 12:10):

If I understand it correctly, this is -3% instrs comparing nightly-testing to master, and the previous benchmark was +2% instrs comparing modulize to nightly-testing, right? So we should expect -1% from merging the module system branch into master

Kim Morrison (Nov 16 2025 at 23:39):

Rebased onto nightly-testing-2025-11-16 (there were a few merge conflicts, which I resolved speculatively... we'll have to see what the build does: nightly#80

Kim Morrison (Nov 17 2025 at 01:49):

Looks like I didn't get it right. :-(

Sebastian Ullrich (Nov 17 2025 at 08:56):

That specific issue is not from a merge conflict but a new metaprogram, no? I fixed it with an import all Init.Core for now though that's certainly not the nice way to do it

Kim Morrison (Nov 17 2025 at 12:22):

Rebased onto nightly-testing-2025-11-17.


Last updated: Dec 20 2025 at 21:32 UTC