Zulip Chat Archive

Stream: mathlib4

Topic: import changes bot on PRs


Kim Morrison (Jun 27 2024 at 07:23):

Who wrote this bot?

Does it report changes to unmodified files downstream of modified files?

Kim Morrison (Jun 27 2024 at 07:24):

Obviously if it reported every downstream file with reduced or increased imports separately, this would be disastrously to big.

But if it says nothing about the downstream files, then it is much less useful than it should be.

Kim Morrison (Jun 27 2024 at 07:26):

The actually useful number is something like "change in total transitive imports". (i.e. count of pairs X, Y, where X is transitively imported by Y)

Ruben Van de Velde (Jun 27 2024 at 07:38):

@Damiano Testa , I think

Damiano Testa (Jun 27 2024 at 07:40):

That part of the PR summary was written by @Johan Commelin, though I suspect with large chatGPT input.

Damiano Testa (Jun 27 2024 at 07:44):

(I had tried reading the python code, but got lost pretty quickly.)

Johan Commelin (Jun 27 2024 at 08:48):

But every change in downstream files must be caused by a change in a modified file, no?
Maybe modified files in the listing should come with a "weight" that is a function of how many downstream files depend on them?

Ruben Van de Velde (Jun 27 2024 at 09:48):

I'm not sure if that would capture the correct insight either - say I add an import A to a file B, then that affects all files downstream of B. But if all of those files already depend on A, the impact is only that I've linearized the dependency graph a bit, rather than adding overall weight

Kim Morrison (Jun 28 2024 at 07:43):

e.g. on #14220 I would love to have some statistics showing how far down the savings propagate, before the removed imports come in again from other sources.

Damiano Testa (Jun 28 2024 at 09:45):

I do not have a clear understanding of what is a good statistic/statistics that the import section of the PR summary should report. I am pretty sure that we can write code for something that we would like to have, though! :smile:

Kim Morrison (Jun 28 2024 at 09:52):

The number that I've proposed is the change in the count of pairs X, Y, such that X transitively imports Y.

Kim Morrison (Jun 28 2024 at 09:53):

Alternatively, a histogram showing the change in the number of transitive imports per file would be perfect.

(but rendering a histogram sounds like too much work)

Damiano Testa (Jun 28 2024 at 09:55):

The histogram may indeed be difficult to render (or at least is something that can be done with other tools), but the change per file of transitive imports should definitely be doable, the sum of these numbers also, and maybe we can think about how to limit the size of the printing of the per-file import change.

Damiano Testa (Jun 28 2024 at 09:57):

I'm not sure if I can get to this quickly, but I agree that I rarely use the "import" section of the PR summary. Since this seems to be a common state, I can try to improve that

Johan Commelin (Jun 28 2024 at 11:34):

I do use the "import" section. But that shouldn't stop you from improving it (-;

Johan Commelin (Jun 28 2024 at 11:34):

And I agree that Ruben and Kim point out good reasons why the current stats are insufficient.

Floris van Doorn (Jun 28 2024 at 14:10):

Kim Morrison said:

The number that I've proposed is the change in the count of pairs X, Y, such that X transitively imports Y.

Note that this number likely increases whenever we add files, but also possibly when we split up files and reduce imports. The average number of transitive imports per file is maybe better, though probably also has issues.

Kim Morrison (Jun 28 2024 at 23:52):

The histogram really would be great!

Damiano Testa (Jun 29 2024 at 08:19):

For the PR mentioned above, #14220, I think that what is below is the difference in number of transitive imports by file:


File Import difference
Mathlib.NumberTheory.ClassNumber.AdmissibleAbsoluteValue -13
Mathlib.Data.Real.Sign -11
Mathlib.Algebra.Order.CauSeq.Completion -14
Mathlib.Tactic.Rify -13
Mathlib.Data.Real.Basic -14
Mathlib.Algebra.Order.CauSeq.Basic -25

Would this be useful to add to the import summary?

Kim Morrison (Jun 30 2024 at 07:00):

LGTM! I worry that some PRs will produce very long lists here, however.

Damiano Testa (Jun 30 2024 at 07:02):

Yes, I wonder whether we can have several layers of unfold: first, the maximal imports that have changed, as it is now, and then all the other within a second unfoldable tab.

Damiano Testa (Jul 02 2024 at 16:39):

The histogram is not there (at least not yet), but I have added content to the "Import" section of the PR summary: #14363.

Michael Rothgang (Jul 02 2024 at 20:25):

Just to clarify the distinction between the two kinds of import changes reported:
the changes on current master show the change in the number transitive dependencies of all modified files, whereas the new collapsible section from #14363 shows this change for all files in mathlib (regardless whether they were modified).

Is this correct? Mentioning something like this in the sticky comment could be helpful - to answer my future me who may wonder about the same again :-)

Damiano Testa (Jul 02 2024 at 20:32):

Yes, that is correct: the "new" section lists all mathlib files for which the number of transitively imported files changed and the amount of the change. The information that is currently displayed is the same but restricted to the files that have had some modifications to their import statements.

I'll write a summary in the PR probably tomorrow (but feel free to suggest a description that seems clear and concise!).

Damiano Testa (Jul 03 2024 at 04:02):

I went with renaming the pair (Import changes, All import changes) to (Import changes for modified files, Import changes for all files).

Hopefully the parallel description explains better what the difference is.

Bryan Gin-ge Chen (Oct 21 2024 at 15:25):

Not a big priority, but the bot seems to be failing on dependabot PRs, see e.g. https://github.com/leanprover-community/mathlib4/actions/runs/11442821600/job/31834037554?pr=18017

Junyan Xu (Nov 09 2024 at 12:24):

Two questions:

  1. Is there a command I can insert in Lean code or a lake command I can run to get the number of transitive imports of a given file?

  2. There's apparently a bug in shake, uncovered by the import change bot. In #18791 I'm able to remove 4 imports out of 7 straight from RingTheory.AlgebraicIndependent. This was discovered because the bot reported that RingTheory.Algebraic (transitively) imports 1314 files and RingTheory.AlgebraicIndependent imports 1317 files. However, the latter directly imports 6 files other than the former. None of the involved files appear in noshake.json.

Ruben Van de Velde (Nov 09 2024 at 12:39):

Are they transitive imports?

Junyan Xu (Nov 09 2024 at 13:04):

You can see here that I'm able to remove 4 direct imports of RingTheory.AlgebraicIndependent.

Damiano Testa (Nov 09 2024 at 14:54):

This can show you the number of transitive imports of a file:

import Lean.Elab.Command

open Lean in
/-- `#trans_imports` reports how many transitive imports the current module has.
`#trans_imports str` also shows the transitively imported modules whose name begins with `str`.
-/
elab tk:"#trans_imports" stx:(str)? : command => do
  let imports := ( getEnv).allImportedModuleNames
  let currMod := if let mod@(.str _ _) :=  getMainModule then m!"'{mod}' has " else ""
  let rest := match stx with
      | none => m!""
      | some str =>
        let imps := imports.filterMap fun i =>
          if i.toString.startsWith str.getString then some i else none
        m!"\n\n{imps.size} starting with {str}:\n{imps.qsort (·.toString < ·.toString)}"
  logInfoAt tk m!"{currMod}{imports.size} transitive imports{rest}"

/--
info: 426 transitive imports

3 starting with "Lean.Elab.I":
[Lean.Elab.InfoTree, Lean.Elab.InfoTree.Main, Lean.Elab.InfoTree.Types]
-/
#guard_msgs in
#trans_imports "Lean.Elab.I"

Damiano Testa (Nov 09 2024 at 14:55):

The string input is optional: no string, just prints the number, while any other string will also print the imports whose name starts with the input string.

Damiano Testa (Nov 09 2024 at 14:56):

For mathlib imports, it may be more meaningful to use #trans_imports "Mathlib.".

Junyan Xu (Nov 09 2024 at 15:12):

Thank you!
For my question 2, I think I might have a misconception about how shake works. According to original PR #9346, it checks that every import is used in some constant, which probably mean that every directly imported file contains some declaration (within the file) that is used. It probably doesn't try to minimize the imports, so it might decide to keep some file in the import list because some declaration in the file is used, even though the file is transitively imported by another file in the import list.

Junyan Xu (Nov 16 2024 at 13:05):

@Damiano Testa Can you please PR the #trans_imports script to make it available everywhere in Mathlib? I'm finding myself using it a lot lately to figure out which imports are okay to add to an existing file.

Damiano Testa (Nov 16 2024 at 16:10):

#19129

Damiano Testa (Nov 16 2024 at 16:11):

Junyan, I am glad that you find it useful!

I find that the command is pretty basic: if you find that it would benefit from further functionality, feel free to suggest additions!

Yaël Dillies (May 10 2025 at 07:22):

It looks like the import changes bot is broken now. It reports no import changes on #24701 even though there clearly should be some import change.

Kim Morrison (May 10 2025 at 09:47):

If someone would like to diagnose this, start at https://github.com/leanprover-community/mathlib4/actions and try to find the most recent run of the relevant action.

Eric Wieser (May 10 2025 at 11:29):

https://github.com/leanprover-community/mathlib4/actions/runs/14933607582/job/41955735549?pr=24701 is the job for @Yaël Dillies' PR above

Eric Wieser (May 10 2025 at 11:29):

No errors, it just computes nothing

Eric Wieser (May 10 2025 at 17:38):

I would guess there is a regression in https://github.com/leanprover-community/mathlib4/commits/master/.github/workflows/PR_summary.yml

Eric Wieser (May 10 2025 at 17:41):

Maybe @Damiano Testa has an idea of what's going on here?

Damiano Testa (May 10 2025 at 17:52):

Could it be that the new step that checks for removed files, leaves git on master, while the import diff step assumes that it is on the PR branch?

Damiano Testa (May 10 2025 at 17:53):

The Compute transitive imports step linked to above starts with a suspicious Already on 'master'.

Damiano Testa (May 10 2025 at 17:54):

That step follows the new Compute (re)moved files without deprecation.

Eric Wieser (May 10 2025 at 17:55):

I'm a little confused why we call scripts/count-trans-deps.py four times, twice from the CI yaml and twice more from a script it calls

Damiano Testa (May 10 2025 at 17:56):

PR summary calls it twice to figure out the imports before and after the PR.

Damiano Testa (May 10 2025 at 17:57):

I see it being called only once more, though.

Eric Wieser (May 10 2025 at 18:00):

By "PR summary" do you mean .github/workflows/PR_summary.yml, or the scripts/import_trans_difference.sh file which it calls?

Damiano Testa (May 10 2025 at 18:01):

Ah, I simply grepped scripts/count-trans-deps.py and found it twice in .github/workflows/PR_summary.yml and once in scripts/import_trans_difference.sh.

Damiano Testa (May 10 2025 at 18:05):

I won't have time to analyze the workflow surrounding trans_imports, but #24750 may be a (somewhat brittle) quick fix for the issue mentioned above.

Damiano Testa (May 10 2025 at 20:38):

I checked at #24755 that the bot appears to be working, at least in the sense that it reports Yaël's PR as large-import. Hence, I merged the fix.

Eric Wieser (May 10 2025 at 20:45):

Thanks for the fix!

Junyan Xu (Jul 21 2025 at 16:50):

Could we make the docs page of each file show the number of its transitive mathlib imports?

Eric Wieser (Jul 22 2025 at 09:12):

I would imagine a PR to do that would be accepted

Eric Wieser (Jul 22 2025 at 09:12):

(replacing "transitive mathlib imports" with "all transitive imports")


Last updated: Dec 20 2025 at 21:32 UTC