Zulip Chat Archive
Stream: mathlib4
Topic: import changes bot on PRs
Kim Morrison (Jun 27 2024 at 07:23):
Who wrote this bot?
Does it report changes to unmodified files downstream of modified files?
Kim Morrison (Jun 27 2024 at 07:24):
Obviously if it reported every downstream file with reduced or increased imports separately, this would be disastrously to big.
But if it says nothing about the downstream files, then it is much less useful than it should be.
Kim Morrison (Jun 27 2024 at 07:26):
The actually useful number is something like "change in total transitive imports". (i.e. count of pairs X, Y
, where X
is transitively imported by Y
)
Ruben Van de Velde (Jun 27 2024 at 07:38):
@Damiano Testa , I think
Damiano Testa (Jun 27 2024 at 07:40):
That part of the PR summary
was written by @Johan Commelin, though I suspect with large chatGPT input.
Damiano Testa (Jun 27 2024 at 07:44):
(I had tried reading the python code, but got lost pretty quickly.)
Johan Commelin (Jun 27 2024 at 08:48):
But every change in downstream files must be caused by a change in a modified file, no?
Maybe modified files in the listing should come with a "weight" that is a function of how many downstream files depend on them?
Ruben Van de Velde (Jun 27 2024 at 09:48):
I'm not sure if that would capture the correct insight either - say I add an import A to a file B, then that affects all files downstream of B. But if all of those files already depend on A, the impact is only that I've linearized the dependency graph a bit, rather than adding overall weight
Kim Morrison (Jun 28 2024 at 07:43):
e.g. on #14220 I would love to have some statistics showing how far down the savings propagate, before the removed imports come in again from other sources.
Damiano Testa (Jun 28 2024 at 09:45):
I do not have a clear understanding of what is a good statistic/statistics that the import
section of the PR summary should report. I am pretty sure that we can write code for something that we would like to have, though! :smile:
Kim Morrison (Jun 28 2024 at 09:52):
The number that I've proposed is the change in the count of pairs X, Y, such that X transitively imports Y.
Kim Morrison (Jun 28 2024 at 09:53):
Alternatively, a histogram showing the change in the number of transitive imports per file would be perfect.
(but rendering a histogram sounds like too much work)
Damiano Testa (Jun 28 2024 at 09:55):
The histogram may indeed be difficult to render (or at least is something that can be done with other tools), but the change per file of transitive imports should definitely be doable, the sum of these numbers also, and maybe we can think about how to limit the size of the printing of the per-file import change.
Damiano Testa (Jun 28 2024 at 09:57):
I'm not sure if I can get to this quickly, but I agree that I rarely use the "import" section of the PR summary
. Since this seems to be a common state, I can try to improve that
Johan Commelin (Jun 28 2024 at 11:34):
I do use the "import" section. But that shouldn't stop you from improving it (-;
Johan Commelin (Jun 28 2024 at 11:34):
And I agree that Ruben and Kim point out good reasons why the current stats are insufficient.
Floris van Doorn (Jun 28 2024 at 14:10):
Kim Morrison said:
The number that I've proposed is the change in the count of pairs X, Y, such that X transitively imports Y.
Note that this number likely increases whenever we add files, but also possibly when we split up files and reduce imports. The average number of transitive imports per file is maybe better, though probably also has issues.
Kim Morrison (Jun 28 2024 at 23:52):
The histogram really would be great!
Damiano Testa (Jun 29 2024 at 08:19):
For the PR mentioned above, #14220, I think that what is below is the difference in number of transitive imports by file:
File | Import difference |
---|---|
Mathlib.NumberTheory.ClassNumber.AdmissibleAbsoluteValue | -13 |
Mathlib.Data.Real.Sign | -11 |
Mathlib.Algebra.Order.CauSeq.Completion | -14 |
Mathlib.Tactic.Rify | -13 |
Mathlib.Data.Real.Basic | -14 |
Mathlib.Algebra.Order.CauSeq.Basic | -25 |
Would this be useful to add to the import summary?
Kim Morrison (Jun 30 2024 at 07:00):
LGTM! I worry that some PRs will produce very long lists here, however.
Damiano Testa (Jun 30 2024 at 07:02):
Yes, I wonder whether we can have several layers of unfold: first, the maximal imports that have changed, as it is now, and then all the other within a second unfoldable tab.
Damiano Testa (Jul 02 2024 at 16:39):
The histogram is not there (at least not yet), but I have added content to the "Import" section of the PR summary: #14363.
Michael Rothgang (Jul 02 2024 at 20:25):
Just to clarify the distinction between the two kinds of import changes reported:
the changes on current master show the change in the number transitive dependencies of all modified files, whereas the new collapsible section from #14363 shows this change for all files in mathlib (regardless whether they were modified).
Is this correct? Mentioning something like this in the sticky comment could be helpful - to answer my future me who may wonder about the same again :-)
Damiano Testa (Jul 02 2024 at 20:32):
Yes, that is correct: the "new" section lists all mathlib files for which the number of transitively imported files changed and the amount of the change. The information that is currently displayed is the same but restricted to the files that have had some modifications to their import statements.
I'll write a summary in the PR probably tomorrow (but feel free to suggest a description that seems clear and concise!).
Damiano Testa (Jul 03 2024 at 04:02):
I went with renaming the pair (Import changes, All import changes)
to (Import changes for modified files, Import changes for all files)
.
Hopefully the parallel description explains better what the difference is.
Bryan Gin-ge Chen (Oct 21 2024 at 15:25):
Not a big priority, but the bot seems to be failing on dependabot PRs, see e.g. https://github.com/leanprover-community/mathlib4/actions/runs/11442821600/job/31834037554?pr=18017
Junyan Xu (Nov 09 2024 at 12:24):
Two questions:
-
Is there a command I can insert in Lean code or a
lake
command I can run to get the number of transitive imports of a given file? -
There's apparently a bug in
shake
, uncovered by the import change bot. In #18791 I'm able to remove 4 imports out of 7 straight fromRingTheory.AlgebraicIndependent
. This was discovered because the bot reported thatRingTheory.Algebraic
(transitively) imports 1314 files andRingTheory.AlgebraicIndependent
imports 1317 files. However, the latter directly imports 6 files other than the former. None of the involved files appear in noshake.json.
Ruben Van de Velde (Nov 09 2024 at 12:39):
Are they transitive imports?
Junyan Xu (Nov 09 2024 at 13:04):
You can see here that I'm able to remove 4 direct imports of RingTheory.AlgebraicIndependent
.
Damiano Testa (Nov 09 2024 at 14:54):
This can show you the number of transitive imports of a file:
import Lean.Elab.Command
open Lean in
/-- `#trans_imports` reports how many transitive imports the current module has.
`#trans_imports str` also shows the transitively imported modules whose name begins with `str`.
-/
elab tk:"#trans_imports" stx:(str)? : command => do
let imports := (← getEnv).allImportedModuleNames
let currMod := if let mod@(.str _ _) := ← getMainModule then m!"'{mod}' has " else ""
let rest := match stx with
| none => m!""
| some str =>
let imps := imports.filterMap fun i =>
if i.toString.startsWith str.getString then some i else none
m!"\n\n{imps.size} starting with {str}:\n{imps.qsort (·.toString < ·.toString)}"
logInfoAt tk m!"{currMod}{imports.size} transitive imports{rest}"
/--
info: 426 transitive imports
3 starting with "Lean.Elab.I":
[Lean.Elab.InfoTree, Lean.Elab.InfoTree.Main, Lean.Elab.InfoTree.Types]
-/
#guard_msgs in
#trans_imports "Lean.Elab.I"
Damiano Testa (Nov 09 2024 at 14:55):
The string input is optional: no string, just prints the number, while any other string will also print the imports whose name starts with the input string.
Damiano Testa (Nov 09 2024 at 14:56):
For mathlib imports, it may be more meaningful to use #trans_imports "Mathlib."
.
Junyan Xu (Nov 09 2024 at 15:12):
Thank you!
For my question 2, I think I might have a misconception about how shake
works. According to original PR #9346, it checks that every import is used in some constant, which probably mean that every directly imported file contains some declaration (within the file) that is used. It probably doesn't try to minimize the imports, so it might decide to keep some file in the import list because some declaration in the file is used, even though the file is transitively imported by another file in the import list.
Junyan Xu (Nov 16 2024 at 13:05):
@Damiano Testa Can you please PR the #trans_imports script to make it available everywhere in Mathlib? I'm finding myself using it a lot lately to figure out which imports are okay to add to an existing file.
Damiano Testa (Nov 16 2024 at 16:10):
Damiano Testa (Nov 16 2024 at 16:11):
Junyan, I am glad that you find it useful!
I find that the command is pretty basic: if you find that it would benefit from further functionality, feel free to suggest additions!
Last updated: May 02 2025 at 03:31 UTC