Zulip Chat Archive

Stream: general

Topic: Tree shaking


Johan Commelin (Oct 11 2023 at 16:59):

Inspired by @Anne Baanen's Minimizer script thread, I was wondering again, how hard is it now to write a tree shaker for Lean 4 files?

I'm not yet sure what interacting with the treeshaker should look like. But to get a conversation going: I would like to autogenerate a new project ZeroToRealInSixThousandLines starting from Std and Mathlib, that contains everything needed for the construction of the real numbers, and nothing else. And the resulting project shouldn't depend on anything (except core, of course).

Alex J. Best (Oct 11 2023 at 18:18):

It sounds like @Anne Baanen has essentially already written it? From what I read in the linked thread if you made a file with only #check Real Anne's minimizer would eventually make such a file? Though perhaps it could get stuck at present, @Anne Baanen when your tool replaces an import by the file contents does it put everyhing in a section at least?

Anne Baanen (Oct 11 2023 at 18:19):

Alex J. Best said:

Anne Baanen when your tool replaces an import by the file contents does it put everyhing in a section at least?

It does!

Anne Baanen (Oct 11 2023 at 18:20):

Alex J. Best said:

It sounds like Anne Baanen has essentially already written it? From what I read in the linked thread if you made a file with only #check Real Anne's minimizer would eventually make such a file?

That's the ideal case, and if you're lucky it can get quite close. But you probably want to disable some of the more invasive passes, such as sorrying everything it can, though :)

Johan Commelin (Oct 11 2023 at 18:49):

Huh, @Anne Baanen Will your tool expand (aka inline) imports?

Johan Commelin (Oct 11 2023 at 18:50):

Aah, I see now that is indeed one of the features

Anne Baanen (Oct 11 2023 at 18:50):

Yes! It probably goes wrong with files outside of the current project but that should be fixable :)

Johan Commelin (Oct 11 2023 at 18:50):

But I guess tree-shaking can be done deterministically, and much faster than using the minimizer script.

Johan Commelin (Oct 11 2023 at 18:51):

So I think both tools are valuable, and probably deserve separate implementations

Andrew Yang (Dec 29 2023 at 01:02):

I just saw #9322 that reduce imports in one file. Is there a reason against doing this to every file in mathlib via some automated script?

Ruben Van de Velde (Dec 29 2023 at 08:17):

That would require that the code to minimise imports is fully reliable, which I don't think it is yet

Damiano Testa (Dec 29 2023 at 08:19):

In particular, I think that the core code around minimise imports is the same as the one for find_home and certainly find_home is unaware of where notation is defined.

Eric Wieser (Dec 29 2023 at 08:46):

The standard caveat here is that removing transitively-implied imports can be harmful; it's only import removals that genuinely reduce the size of the closure that are uncontroversially a good idea

Damiano Testa (Dec 29 2023 at 09:58):

This reminds me also of an experiment (#6763) where I sorted the imports (no removal, simply sorting what was present):

  • build times were affected;
  • one file stopped compiling.

Andrew Yang (Dec 29 2023 at 16:25):

What is the standard argument against removing redundant imports?

Yaël Dillies (Dec 29 2023 at 16:29):

I disagree with it, but the usual argument is that they carry semantic information about the relation between the importing file and the imported file

Ruben Van de Velde (Dec 29 2023 at 16:30):

If B imports C and A imports only B explicitly, but uses code from both B and C, then removing the import from B to C is a bit harder. Not sure that's "standard", though

Mario Carneiro (Dec 29 2023 at 16:55):

One positive outcome of removing transitive imports is that it makes you realize just how bonkers our dependency structure is

Yaël Dillies (Dec 29 2023 at 16:58):

Yep. Until last week, you could talk about finite sums in vector spaces just by importing the definition of the indicator function of a set.

Eric Wieser (Dec 29 2023 at 17:41):

It would be nice to have some tools to help keep things in check; the graphs we had when porting were useful in that regard. Perhaps we need to produce something similar as part of the doc build

Yury G. Kudryashov (Dec 29 2023 at 17:54):

But the graphs will be huge

Yury G. Kudryashov (Dec 29 2023 at 17:55):

At the last stages of the port, the graphs were manageable (then small) because most of the library was already ported.

Yury G. Kudryashov (Dec 29 2023 at 17:55):

Probably, we should add more assert_not_existss

Eric Wieser (Dec 29 2023 at 18:09):

I think the trick is to come up with useful heuristics for truncating graphs

Yaël Dillies (Dec 29 2023 at 19:30):

Yury G. Kudryashov said:

Probably, we should add more assert_not_existss

I have plans regarding the algebraic order hierarchy (some imports are still abberrations) but I probably won't have time to implement them before this summer.

Mario Carneiro (Dec 29 2023 at 21:35):

I wrote a tree shaker for olean files (so no recompilation is necessary). It's pretty fast (8 seconds to shake Mathlib and all dependencies), but it is subject to the usual caveats about ignoring compile-time dependencies that don't make it to the olean files. Here are the results:

Init.WFTactics:
  remove #[Init.SizeOf, Init.MetaTypes, Init.WF]
  add #[Init.Prelude]
  fix Init.SizeOf: #[Init.Data.Nat.Div]
  fix Init.MetaTypes: #[Init.Meta]
  fix Init.WF: #[Init.Data.Nat.Div]
Init.Data.Format.Macro:
  remove #[Init.Data.ToString.Macro]
  add #[Init.Meta]
Init.Hints:
  remove #[Init.NotationExtra]
  add #[Init.Prelude]
  fix Init.NotationExtra: #[Lean.Data.Lsp.Ipc, Lean.Meta.Instances, Lean.Parser.Module]
Lean.Compiler.Options:
  remove #[Lean.Util.Trace]
  add #[Lean.Util.Profile]
  fix Lean.Util.Trace: #[Lean.Compiler.LCNF.Main]
Lean.Compiler.Main:
  remove #[Lean.Compiler.LCNF]
  add #[Lean.Compiler.LCNF.Main]
Std.Lean.Format:
  remove #[Std.Tactic.OpenPrivate]
  fix Std.Tactic.OpenPrivate: #[Mathlib.Util.Imports]
Std.Control.Lemmas:
  remove #[Std.Tactic.Ext]
  fix Std.Tactic.Ext: #[Mathlib.Data.Prod.Basic, Mathlib.Data.Sigma.Basic, Mathlib.Combinatorics.Quiver.SingleObj]
Mathlib.Init.Data.Int.Basic:
  remove #[Mathlib.Mathport.Rename, Mathlib.Init.Data.Nat.Notation, Mathlib.Init.ZeroOne]
  fix Mathlib.Init.ZeroOne: #[Mathlib.Algebra.Group.Defs, Mathlib.Data.Set.Basic, Mathlib.Init.Data.Int.Bitwise, Mathlib.Data.Num.Basic, Mathlib.Init.Data.Int.CompLemmas]
Mathlib.Tactic.Attr.Register:
  remove #[Lean.Meta.Tactic.Simp]
  add #[Lean.Meta.Tactic.Simp.SimpTheorems]
Mathlib.Data.Rat.Init:
  remove #[Mathlib.Mathport.Rename, Std.Data.Rat.Basic]
  fix Std.Data.Rat.Basic: #[Mathlib.Algebra.Field.Defs, Mathlib.Data.Rat.Defs, Mathlib.Data.Rat.Encodable]
Mathlib.Tactic.Attr.Core:
  remove #[Mathlib.Tactic.Attr.Register]
Mathlib.Lean.Elab.Term:
  remove #[Lean.Elab]
  add #[Lean.Elab.SyntheticMVars]
Mathlib.Tactic.Contrapose:
  remove #[Mathlib.Tactic.PushNeg]
  add #[Mathlib.Logic.Basic]
  fix Mathlib.Tactic.PushNeg: #[Mathlib.Logic.Equiv.Basic, Mathlib.Algebra.GroupWithZero.Units.Basic, Mathlib.Algebra.Order.Monoid.Lemmas]
Mathlib.Tactic.Existsi:
  remove #[Mathlib.Tactic.Basic]
Mathlib.Tactic.HaveI:
  remove #[Std.Tactic.HaveI]
Mathlib.Tactic.SetLike:
  remove #[Aesop]
  add #[Aesop.Frontend.Extension]
Mathlib.Control.Monad.Basic:
  remove #[Mathlib.Init.Control.Lawful, Mathlib.Tactic.Common]
  fix Mathlib.Init.Control.Lawful: #[Mathlib.Control.Monad.Cont]
Mathlib.Tactic.Linarith.Lemmas:
  remove #[Mathlib.Algebra.GroupPower.Order]
  fix Mathlib.Algebra.GroupPower.Order: #[Mathlib.Data.Nat.Factorization.Basic, Mathlib.Data.Real.ENNReal, Mathlib.Algebra.ContinuedFractions.Computation.Approximations, Mathlib.Combinatorics.Additive.PluenneckeRuzsa, Mathlib.Computability.Ackermann, Mathlib.Data.Polynomial.Degree.CardPowDegree, Mathlib.RingTheory.Valuation.Basic, Mathlib.NumberTheory.PellMatiyasevic, Mathlib.NumberTheory.PythagoreanTriples, Mathlib.NumberTheory.LucasLehmer, Mathlib.NumberTheory.Multiplicity]
Mathlib.Tactic.NoncommRing:
  remove #[Mathlib.Tactic.Abel]
  add #[Mathlib.Algebra.GroupPower.Lemmas]
  fix Mathlib.Tactic.Abel: #[Mathlib.Algebra.Algebra.Spectrum, Mathlib.Algebra.Lie.SkewAdjoint, Mathlib.Analysis.NormedSpace.MStructure]
Mathlib.Init.Meta.WellFoundedTactics:
  remove #[Mathlib.Init.Data.Nat.Lemmas]
  add #[Std.Data.Nat.Lemmas]
  fix Mathlib.Init.Data.Nat.Lemmas: #[Mathlib.Algebra.Order.Floor, Mathlib.NumberTheory.Multiplicity]
Mathlib.Init.IteSimp:
  remove #[Mathlib.Init.Data.Bool.Basic, Mathlib.Init.Data.Bool.Lemmas]
  fix Mathlib.Init.Data.Bool.Lemmas: #[Mathlib.Analysis.BoxIntegral.Basic, Mathlib.Topology.Category.Profinite.Nobeling]
Mathlib.Tactic.Continuity:
  remove #[Mathlib.Tactic.Continuity.Init, Mathlib.Algebra.Group.Defs]
  fix Mathlib.Algebra.Group.Defs: #[Mathlib.Topology.UniformSpace.Basic, Mathlib.Topology.Support, Mathlib.Topology.LocalExtr, Mathlib.Topology.Algebra.Semigroup]
Mathlib.Algebra.GroupPower.Identities:
  remove #[Mathlib.Tactic.Ring]
  add #[Mathlib.Tactic.Ring.Basic]
Mathlib.RingTheory.QuotientNoetherian:
  remove #[Mathlib.RingTheory.QuotientNilpotent]
  add #[Mathlib.RingTheory.Ideal.QuotientOperations]
Mathlib.Tactic.Measurability:
  remove #[Mathlib.Tactic.Measurability.Init]
Mathlib.LinearAlgebra.AffineSpace.Basic:
  remove #[Mathlib.Algebra.AddTorsor]
  fix Mathlib.Algebra.AddTorsor: #[Mathlib.LinearAlgebra.AffineSpace.AffineMap]
Mathlib.Algebra.Order.Field.Pi:
  remove #[Mathlib.Data.Fintype.Lattice]
  add #[Mathlib.Data.Fintype.Card, Mathlib.Data.Finset.Lattice]
Mathlib.Algebra.Order.Sub.Prod:
  remove #[Mathlib.Algebra.Order.Sub.Basic]
  add #[Mathlib.Algebra.Order.Sub.Defs]
Mathlib.AlgebraicTopology.DoldKan.Notations:
  remove #[Mathlib.AlgebraicTopology.AlternatingFaceMapComplex]
  fix Mathlib.AlgebraicTopology.AlternatingFaceMapComplex: #[Mathlib.AlgebraicTopology.DoldKan.Homotopies]
Mathlib.Analysis.SpecificLimits.IsROrC:
  remove #[Mathlib.Analysis.Complex.ReImTopology]
  add #[Mathlib.Data.IsROrC.Basic]
Mathlib.CategoryTheory.Preadditive.Yoneda.Projective:
  remove #[Mathlib.Algebra.Category.ModuleCat.EpiMono]
  fix Mathlib.Algebra.Category.ModuleCat.EpiMono: #[Mathlib.CategoryTheory.Abelian.Projective]
Mathlib.Data.Array.Basic:
  remove #[Std.Tactic.HaveI, Mathlib.Data.List.Basic]
Mathlib.Data.Finset.Pointwise.Interval:
  remove #[Mathlib.Data.Finset.Interval]
Mathlib.Data.Fintype.Small:
  remove #[Mathlib.Logic.Small.Basic]
  add #[Mathlib.Logic.Small.Defs]
Mathlib.Init.Data.Nat.GCD:
  remove #[Mathlib.Init.Data.Nat.Lemmas, Mathlib.Init.Meta.WellFoundedTactics]
  add #[Std.Data.Nat.Lemmas]
Mathlib.Probability.Notation:
  remove #[Mathlib.Probability.ProbabilityMassFunction.Basic, Mathlib.MeasureTheory.Function.ConditionalExpectation.Basic, Mathlib.MeasureTheory.Decomposition.Lebesgue]
  fix Mathlib.MeasureTheory.Function.ConditionalExpectation.Basic: #[Mathlib.Probability.Martingale.Basic, Mathlib.Probability.ConditionalExpectation, Mathlib.Probability.Kernel.CondDistrib]
  fix Mathlib.MeasureTheory.Decomposition.Lebesgue: #[Mathlib.Probability.Distributions.Gaussian]
Mathlib.Probability.ConditionalExpectation:
  remove #[Mathlib.Probability.Notation]
Mathlib.Tactic.ReduceModChar.Ext:
  remove #[Lean.Meta.Tactic.Simp]
  add #[Lean.Meta.Tactic.Simp.SimpTheorems]

Ruben Van de Velde (Dec 29 2023 at 21:45):

That doesn't seem like a lot

Mario Carneiro (Dec 29 2023 at 21:46):

It is supposed to be an empty list, so I suppose this isn't so bad

Mario Carneiro (Dec 29 2023 at 21:47):

This is the most basic kind of unnecessary import, we can do much better with file splitting

Mario Carneiro (Dec 29 2023 at 21:47):

I'm thinking now about how we can have a nolints.json file and add this as a linter script to mathlib CI

Ruben Van de Velde (Dec 29 2023 at 21:52):

Also, this reminds me that we have a bunch of files in Mathlib/Init/ that I don't think are actually special, so perhaps should be moved elsewhere

Mario Carneiro (Dec 29 2023 at 21:55):

By the way, the way to read this data:

  • remove X means that the import is unnecessary and should be removed
  • add X means that removing something else caused a transitive import to go missing so it has to be added back in
  • fix X: Y means that Y transitively depends on X through this file, so you have to add it back in in Y

Mario Carneiro (Dec 30 2023 at 04:31):

I've added the tree shaking tool to mathlib in #9346, and updated mathlib in #9347

Mario Carneiro (Dec 30 2023 at 04:37):

I ended up with the following noshake.json file:

{"ignoreImport":
 ["Init",
  "Lean",
  "Mathlib.Mathport.Rename",
  "Mathlib.Init.Data.Nat.Notation",
  "Mathlib.Data.Rat.Init",
  "Mathlib.Probability.Notation",
  "Mathlib.AlgebraicTopology.DoldKan.Notations",
  "Mathlib.Tactic.Ring"],
 "ignore":
 {"Mathlib.Tactic.SetLike": ["Aesop"],
  "Mathlib.Tactic.NoncommRing": ["Mathlib.Tactic.Abel"],
  "Mathlib.Tactic.Measurability": ["Mathlib.Tactic.Measurability.Init"],
  "Mathlib.Tactic.HaveI": ["Std.Tactic.HaveI"],
  "Mathlib.Tactic.Contrapose": ["Mathlib.Tactic.PushNeg"],
  "Mathlib.Tactic.Continuity": ["Mathlib.Tactic.Continuity.Init"],
  "Mathlib.Tactic.Attr.Register": ["Lean.Meta.Tactic.Simp.SimpTheorems"],
  "Mathlib.Tactic.Attr.Core": ["Mathlib.Tactic.Attr.Register"],
  "Mathlib.Probability.Notation": ["Mathlib.MeasureTheory.Function.ConditionalExpectation.Basic", "Mathlib.MeasureTheory.Decomposition.Lebesgue"],
  "Mathlib.LinearAlgebra.AffineSpace.Basic": ["Mathlib.Algebra.AddTorsor"],
  "Mathlib.Init.Data.Nat.GCD": ["Std.Data.Nat.Gcd"],
  "Mathlib.Data.Rat.Init": ["Std.Data.Rat.Basic"],
  "Mathlib.Data.Array.Basic": ["Std.Tactic.Alias"],
  "Mathlib.Control.Monad.Basic": ["Mathlib.Init.Control.Lawful"],
  "Mathlib.AlgebraicTopology.DoldKan.Notations": ["Mathlib.AlgebraicTopology.AlternatingFaceMapComplex"]}}

These are all the actual false positives, either because of a tactic or notation dependency or because we just want to allow an unnecessarily broad import like import Lean.

Johan Commelin (Dec 30 2023 at 14:51):

That's a surprisingly short list!!

Eric Rodriguez (Dec 31 2023 at 01:38):

Is this tree-shake a "one-time" or does it calculate the smallest possible tree based on what we use?

Mario Carneiro (Dec 31 2023 at 01:38):

The tool I just wrote has aspirations of being a CI step

Mario Carneiro (Dec 31 2023 at 01:39):

and it's really only about catching unused imports, not really rejiggering declarations and splitting files

Matthew Ballard (Feb 04 2024 at 22:38):

A fun consequence of this can be found here

Mario Carneiro (Feb 04 2024 at 23:50):

what are you pointing to specifically?

Matthew Ballard (Feb 04 2024 at 23:52):

One fairly-large untouched file lost 32% of its instructions

Eric Rodriguez (Feb 05 2024 at 00:24):

somehow the build got more expensive though, is that just within the noise threshold?

Matthew Ballard (Feb 05 2024 at 00:38):

Expensive by what measure?

Eric Rodriguez (Feb 05 2024 at 00:40):

wall-clock up 0.67%

Matthew Ballard (Feb 05 2024 at 00:44):

That’s well within tolerance

Matthew Ballard (Feb 05 2024 at 00:45):

Overall instructions are more stable

Yury G. Kudryashov (Feb 05 2024 at 11:27):

In https://github.com/leanprover-community/mathlib4/actions/runs/7781663201/job/21216508799?pr=10232 lake exe shake says that I need to import Data.Set.Lattice in Order.Cover but it compiles fine without Data.Set.Lattice.

Yury G. Kudryashov (Feb 05 2024 at 11:28):

It's a4af199b9b in branch#YK-set-notation

Yury G. Kudryashov (Feb 05 2024 at 11:31):

Matthew Ballard said:

One fairly-large untouched file lost 32% of its instructions

That's surprising because most probably import closure of that file didn't change.

Yury G. Kudryashov (Feb 05 2024 at 11:33):

UPD: no, I removed Uryshohn's metrization theorem (and its dependencies) from the import closure.

Mario Carneiro (Feb 06 2024 at 05:33):

@Yury G. Kudryashov I don't think shake has any false positives when it comes to dependencies, which is to say that if it says Data.Set.Lattice is needed in Order.Cover then it really is, at least in the original version of the file. However, it may be the case that Data.Set.Lattice is already transitively imported by another path, or that removing the import somehow changed the behavior of the tactics in such a way that the transitive import is no longer necessary (i.e. some simp lemma was being applied even though it doesn't really help anything and now that it is not imported it is being skipped)

Mario Carneiro (Feb 06 2024 at 05:44):

more specifically, on a4af199 docs#Set.covBy_insert and several other functions make use of docs#Set.Set.completeAtomicBooleanAlgebra, so if you remove the import then typeclass inference takes an alternative path which does not need this instance

Yury G. Kudryashov (Feb 06 2024 at 06:10):

Thank you for the explanation!

Jireh Loreaux (Feb 09 2024 at 14:56):

@Mario Carneiro I think I have a false negative for shake. That is, it told me to remove a dependency, and I did, but then the file didn't have access to a declaration it needed. See #10369, after the first commit shake complained, and after the second (implementing shake's suggestion), it didn't build.

Mario Carneiro (Feb 09 2024 at 16:12):

Shake has lots of false negatives, that's why noshake.json exists

Mario Carneiro (Feb 09 2024 at 16:13):

In this case it's because the constant is used only inside attribute [foo]. In that case you want to add it to the import list of noshake.json

Yury G. Kudryashov (Feb 12 2024 at 21:17):

In https://github.com/leanprover-community/mathlib4/actions/runs/7875771521/job/21488387831?pr=10466 shake told me to move the PartENat import to the new file. However, it led to quite a few compile failures because other files relied on dependencies of PartENat being available via transitive import.

Yury G. Kudryashov (Feb 12 2024 at 21:18):

It's this revision.

Ruben Van de Velde (Feb 12 2024 at 21:23):

Huh, it's supposed to list all downstream files that will now need to import it themselves

Mario Carneiro (Feb 12 2024 at 21:23):

what compile failures? Do you have a failing run?

Mario Carneiro (Feb 12 2024 at 21:25):

Shake is not supposed to be completely foolproof, it is limited by incomplete information. I always check the results of --fix first, and the consequence might be that there is an unseen dependency due to attributes or tactics

Yury G. Kudryashov (Feb 12 2024 at 21:25):

No, it's just ENat not available in a dependency.

Mario Carneiro (Feb 12 2024 at 21:26):

constants not being available should not happen though

Yury G. Kudryashov (Feb 12 2024 at 21:26):

this is the next commit

Yury G. Kudryashov (Feb 12 2024 at 21:26):

I manually moved an import as suggested.

Mario Carneiro (Feb 12 2024 at 21:26):

Oh but the other thing to keep in mind is that if you run shake on outdated oleans you will get the wrong answer

Yury G. Kudryashov (Feb 12 2024 at 21:26):

It was shake in the CI

Mario Carneiro (Feb 12 2024 at 21:26):

you should only run it after a clean lake build

Mario Carneiro (Feb 12 2024 at 21:27):

I see in your link shake making a suggestion, what file is missing what declaration in what other file afterward?

Mario Carneiro (Feb 12 2024 at 21:28):

On that commit the first error is expected token, on a line containing @[coe]. So probably that attribute is not imported

Mario Carneiro (Feb 12 2024 at 21:28):

the missing ENat is due to the previous command failing

Mario Carneiro (Feb 12 2024 at 21:30):

actually it's more likely the notation ℕ∞ which causes expected token

Mario Carneiro (Feb 12 2024 at 21:42):

actually you are right, it is a shake bug

Mario Carneiro (Feb 12 2024 at 21:43):

    -- In `downstream` mode, we should also check all the other modules to find out if
    -- we have a situation like `A -/> B -> C`, where we are removing the `A -> B` import
    -- but `C` depends on `A` and only directly imports `B`.
    -- This situation occurs when `A ∈ needs[C]` and `B ∈ transDeps[C]`.

This is not correct, because it could be that module A is not needed directly but it imports module A' which is needed by C

Mario Carneiro (Feb 12 2024 at 21:45):

This happened in this case, we have

Mathlib.Data.ENat.Basic ->
Mathlib.Data.Nat.PartENat -/>
Mathlib.SetTheory.Cardinal.Basic ->
Mathlib.SetTheory.Cardinal.ENat

where the second link was severed and the first file is used in the last

Mario Carneiro (Feb 12 2024 at 23:02):

Here's the new output on that commit with the fixed shake:

./././Mathlib/SetTheory/Cardinal/Basic.lean:
  remove #[Mathlib.Data.Nat.PartENat]
  instead:
    import [Mathlib.Data.Nat.PartENat] in Mathlib.SetTheory.Cardinal.PartENat
    import [Mathlib.Data.Nat.SuccPred] in Mathlib.SetTheory.Ordinal.Arithmetic
    import [Mathlib.Data.Nat.SuccPred] in Mathlib.Data.Nat.Nth
    import [Mathlib.Data.Finset.Preimage] in Mathlib.ModelTheory.Definability
    import [Mathlib.Data.ENat.Lattice] in Mathlib.SetTheory.Cardinal.ENat

The last line is the one that is relevant to this error

Mario Carneiro (Feb 12 2024 at 23:32):

#10469

Yury G. Kudryashov (Feb 13 2024 at 00:54):

Thank you! The output looks good (in the last case, ENat.Basic is enough but if you import ENat.Lattice, then probably Lean uses some instance from there).

Anne Baanen (Feb 13 2024 at 09:34):

I encountered two mild usability issues in #9697:

  • shake is not idempotent: in commit 2150373, shake replaced one import of Polynomial.FieldDivision with two lower-level imports, one of which, EuclideanDomain.Instances, in turn got removed by shake in 8900cb3.
  • shake requires fresh oleans but does not detect when they are out of date. So initially I couldn't reproduce the shake output in CI.

I assume the non-idempotency is caused by something like diamond inheritance of instances first choosing a path through EuclideanDomain.Instances and then another, so not something directly in shake's control, but might be worth investigating anway.

Making shake detect out of date oleans is going to help a lot with usability IMO.

Mario Carneiro (Feb 13 2024 at 09:45):

The second issue is fixed in #10484

Mario Carneiro (Feb 13 2024 at 09:47):

The thing with instances isn't likely to be fixed. I am considering implementing lake exe shake --slow which does essentially the same thing as your minimizer script, deleting import lines and seeing if it still compiles, but that will be way too slow to run most of the time

Kim Morrison (Feb 13 2024 at 11:12):

@Mario Carneiro , does shake take into account #align statements? I just had a bad suggestion:

./././Mathlib/Init/Data/Int/Basic.lean:
  remove #[Std.Data.Int.Order]
  add #[Std.Data.Int.Init.Order]

which caused #aligns to fail.

Mario Carneiro (Feb 13 2024 at 12:08):

No, aligns are similar to attributes and tactics, shake doesn't track them. Then again, generally aligns shouldn't be referring to definitions from other files under normal circumstances anyway

Mario Carneiro (Feb 13 2024 at 12:09):

If the issue is that #align itself is not found, the relevant import for that is already in noshake.json but you have to add it to the file

Mario Carneiro (Feb 13 2024 at 12:16):

This is expected behavior and you fixed it the way I would have recommended

Eric Wieser (Feb 14 2024 at 08:22):

Can we have shake --fix insert the new imports alphabetically?

Mario Carneiro (Feb 14 2024 at 08:23):

what if the original imports are not sorted?

Eric Wieser (Feb 14 2024 at 08:43):

You could still do a sorted insertion (insert D ACBF = ACBDF)

Eric Wieser (Feb 14 2024 at 08:43):

Appending in that case is probably also fine

Yaël Dillies (Mar 29 2024 at 07:35):

I just encountered a (minor) shake bug. On 17736f5d6f8411937ad801767f96e8ce60d857ef, the output is

./././Mathlib/Algebra/Order/BigOperators/Ring/Multiset.lean:
  remove #[Mathlib.Algebra.Order.BigOperators.Group.Multiset]
  add #[Mathlib.Algebra.BigOperators.Multiset.Basic]
./././Mathlib/Data/Nat/Choose/Sum.lean:
  remove #[Mathlib.Algebra.Order.BigOperators.Ring.Finset]
  add #[Mathlib.Algebra.Order.BigOperators.Group.Finset]
./././Mathlib/Algebra/GeomSum.lean:
  remove #[Mathlib.Algebra.Order.BigOperators.Ring.Finset]
  add #[Mathlib.Algebra.Order.BigOperators.Group.Finset]
  instead
    import [Mathlib.Algebra.Order.BigOperators.Ring.Finset] in Mathlib.Algebra.Order.CauSeq.BigOperators
    import [Mathlib.Algebra.Order.BigOperators.Group.Finset] in Mathlib.Combinatorics.Colex
./././Mathlib/NumberTheory/Divisors.lean:
  remove #[Mathlib.Algebra.Order.BigOperators.Ring.Finset]
  add #[Mathlib.Algebra.Order.BigOperators.Group.Finset]
./././Mathlib/Combinatorics/Enumerative/Composition.lean:
  remove #[Mathlib.Algebra.Order.BigOperators.Ring.Finset]
  add #[Mathlib.Algebra.Order.BigOperators.Group.Finset]
  instead
    import [Mathlib.Algebra.Order.BigOperators.Group.Multiset] in Mathlib.Combinatorics.Enumerative.Partition
./././Mathlib/LinearAlgebra/Multilinear/Basic.lean:
  remove #[Mathlib.Algebra.Order.BigOperators.Ring.Finset]
  add #[Mathlib.Algebra.Order.BigOperators.Group.Finset]
./././Mathlib/Topology/Algebra/Order/LiminfLimsup.lean:
  remove #[Mathlib.Algebra.Order.BigOperators.Ring.Finset]
  add #[Mathlib.Algebra.BigOperators.Ring]
./././Mathlib/Data/Sign.lean:
  remove #[Mathlib.Algebra.Order.BigOperators.Ring.Finset]
  add #[Mathlib.Algebra.GroupWithZero.Units.Lemmas, Mathlib.Algebra.Order.BigOperators.Group.Finset]
./././Mathlib/Algebra/Order/Chebyshev.lean:
  remove #[Mathlib.Algebra.Order.BigOperators.Ring.Finset]
  add #[Mathlib.Algebra.Order.BigOperators.Group.Finset]
./././Mathlib/Algebra/Order/Interval.lean:
  remove #[Mathlib.Algebra.Order.BigOperators.Ring.Finset]
  add #[Mathlib.Algebra.Order.BigOperators.Group.Finset]
./././Mathlib/Combinatorics/Pigeonhole.lean:
  remove #[Mathlib.Algebra.Order.BigOperators.Ring.Finset]
  add #[Mathlib.Algebra.Order.BigOperators.Group.Finset]
./././Mathlib/Combinatorics/Enumerative/DoubleCounting.lean:
  remove #[Mathlib.Algebra.Order.BigOperators.Ring.Finset]
  add #[Mathlib.Algebra.Order.BigOperators.Group.Finset]
  instead
    import [Mathlib.Data.Rat.Field, Mathlib.Algebra.Ring.Regular, Mathlib.Algebra.Order.BigOperators.Group.Finset] in Mathlib.Combinatorics.SetFamily.LYM
./././Mathlib/Combinatorics/Configuration.lean:
  remove #[Mathlib.Algebra.Order.BigOperators.Ring.Finset]
  add #[Mathlib.Algebra.Order.BigOperators.Group.Finset]
./././Mathlib/Combinatorics/SetFamily/AhlswedeZhang.lean:
  remove #[Mathlib.Algebra.Order.BigOperators.Ring.Finset]
  add #[Mathlib.Tactic.Positivity.Basic, Mathlib.Algebra.BigOperators.Ring, Mathlib.Algebra.Order.BigOperators.Group.Finset]
./././Mathlib/Combinatorics/SetFamily/FourFunctions.lean:
  remove #[Mathlib.Algebra.Order.BigOperators.Ring.Finset]
  add #[Mathlib.Algebra.Order.BigOperators.Group.Finset]
./././Mathlib/Data/Set/Equitable.lean:
  remove #[Mathlib.Algebra.Order.BigOperators.Ring.Finset]
  add #[Mathlib.Algebra.Order.BigOperators.Group.Finset]
  instead
    import [Mathlib.Algebra.Ring.Regular] in Mathlib.Combinatorics.SimpleGraph.Regularity.Equitabilise
./././Mathlib/Combinatorics/SimpleGraph/Regularity/Energy.lean:
  remove #[Mathlib.Algebra.Order.BigOperators.Ring.Finset]
  add #[Mathlib.Algebra.Order.BigOperators.Group.Finset]
./././Mathlib/NumberTheory/Harmonic/Defs.lean:
  remove #[Mathlib.Algebra.Order.BigOperators.Ring.Finset]
  add #[Mathlib.Algebra.Order.BigOperators.Group.Finset]

Mario Carneiro (Mar 29 2024 at 07:36):

what's the bug?

Yaël Dillies (Mar 29 2024 at 07:37):

... with suggestions like

./././Mathlib/Combinatorics/Enumerative/DoubleCounting.lean:
  remove #[Mathlib.Algebra.Order.BigOperators.Ring.Finset]
  add #[Mathlib.Algebra.Order.BigOperators.Group.Finset]
  instead
    import [Mathlib.Data.Rat.Field, Mathlib.Algebra.Ring.Regular, Mathlib.Algebra.Order.BigOperators.Group.Finset] in Mathlib.Combinatorics.SetFamily.LYM

making no sense. How can you need to import Mathlib.Algebra.Order.BigOperators.Group.Finset in Mathlib.Combinatorics.SetFamily.LYM if you already add it to Mathlib.Combinatorics.Enumerative.DoubleCounting, which is imported by Mathlib.Combinatorics.SetFamily.LYM?

Mario Carneiro (Mar 29 2024 at 07:40):

The individual per-file suggestions are considered independently; making some of the earlier changes can make later ones no longer applicable. This is one of the reasons shake is better suited for small scale changes rather than bulk updates

Mario Carneiro (Mar 29 2024 at 07:41):

the block

  remove #[Mathlib.Algebra.Order.BigOperators.Ring.Finset]
  add #[Mathlib.Algebra.Order.BigOperators.Group.Finset]
  instead
    import [Mathlib.Data.Rat.Field, Mathlib.Algebra.Ring.Regular, Mathlib.Algebra.Order.BigOperators.Group.Finset] in Mathlib.Combinatorics.SetFamily.LYM

is correct, for editing a single file and fixing the downstream changes of it

Yaël Dillies (Mar 29 2024 at 07:42):

My claim is that it is overdoing, even on its own

Mario Carneiro (Mar 29 2024 at 07:42):

but once you do this, downstream files can change in unpredictable ways and you should re-run shake for best results

Yaël Dillies (Mar 29 2024 at 07:42):

The block

  remove #[Mathlib.Algebra.Order.BigOperators.Ring.Finset]
  add #[Mathlib.Algebra.Order.BigOperators.Group.Finset]
  instead
    import [Mathlib.Data.Rat.Field, Mathlib.Algebra.Ring.Regular] in Mathlib.Combinatorics.SetFamily.LYM

would be enough

Mario Carneiro (Mar 29 2024 at 07:43):

No, because apparently all of those files have unnecessary imports

Mario Carneiro (Mar 29 2024 at 07:43):

but the downstream effects of all of the changes of all of the files is complex and order-dependent

Mario Carneiro (Mar 29 2024 at 07:43):

so the "instead import" section only shows the effect of the remove/add pair in the same block

Yaël Dillies (Mar 29 2024 at 07:45):

Sorry, I don't get it. If it only showed the effect of the remove/add pair in the same block, I shouldn't see the instead add [Mathlib.Algebra.Order.BigOperators.Group.Finset]. So it must take the other remove/add pairs into account.

Yaël Dillies (Mar 29 2024 at 07:47):

A related feature-request: Would it be possible to accept/refuse shake suggestions on a per-block basis?

Mario Carneiro (Mar 29 2024 at 07:48):

For example, with your first example, it's saying: Let's fix Mathlib/Combinatorics/Enumerative/DoubleCounting.lean. To do so, first remove Mathlib.Algebra.Order.BigOperators.Ring.Finset, then add Mathlib.Algebra.Order.BigOperators.Group.Finset. Doing this has broken some transitive imports in Mathlib.Combinatorics.SetFamily.LYM so you should add Mathlib.Data.Rat.Field, Mathlib.Algebra.Ring.Regular, Mathlib.Algebra.Order.BigOperators.Group.Finset to the imports for that file. Okay now everything should be working again and that one file is fixed.

Also, Mathlib/Data/Set/Equitable.lean has unnecessary imports. (Not assuming you've made the previous fix,) to fix it remove Ring.Finset, ...

Yaël Dillies (Mar 29 2024 at 07:49):

No sorry I still don't get it. How can it need to add Mathlib.Algebra.Order.BigOperators.Group.Finset to Mathlib.Combinatorics.SetFamily.LYM if we already added it to Mathlib.Combinatorics.Enumerative.DoubleCounting?

Mario Carneiro (Mar 29 2024 at 07:50):

Oh, I see what you mean now. (These file names make me go cross-eyed...)

Yaël Dillies (Mar 29 2024 at 07:51):

The error to me is very simple: You calculate the instead add suggestion from the remove suggestion, without taking the add suggestion into account.

Yaël Dillies (Mar 29 2024 at 07:51):

Yeah sorry I should have foobarred it, but I was pretty certain I would be making a mistake in doing so

Mario Carneiro (Mar 29 2024 at 07:57):

Okay yes you were right, that is exactly the issue and I tested the fix

Mario Carneiro (Mar 29 2024 at 07:58):

#11763


Last updated: May 02 2025 at 03:31 UTC