Zulip Chat Archive

Stream: general

Topic: mathlib import graph


Patrick Massot (Mar 29 2020 at 16:31):

Yesterday, @Scott Morrison asked for the current mathlib import graph, without having to install anything. This is available at https://www.yworks.com/yed-live/?file=https://gist.githubusercontent.com/PatrickMassot/a23fb34e1b26409f10013de07525ece8/raw/ae80aec753cbf37b36ca683c94db08b99f0cf36b/import_graph, but this will require your web browser to be patient.

Yury G. Kudryashov (Mar 29 2020 at 16:37):

Do you have it in some text format (e.g., graphviz)?

Patrick Massot (Mar 29 2020 at 16:37):

Sure.

Patrick Massot (Mar 29 2020 at 16:37):

import_graph.dot

Yury G. Kudryashov (Mar 29 2020 at 16:37):

Thank you!

Yury G. Kudryashov (Mar 29 2020 at 16:38):

Do we have any tool to determine unneeded imports?

Patrick Massot (Mar 29 2020 at 16:38):

I think we used to have something testing that in CI, but it was driving people crazy.

Gabriel Ebner (Mar 29 2020 at 16:39):

The category_theory.limits.shapes.constructions.sums module doesn't have any edges: https://github.com/leanprover-community/mathlib/blob/38544f14662d9d1fd97bd1fb132213c546292025/src/category_theory/limits/shapes/constructions/sums.lean

Patrick Massot (Mar 29 2020 at 16:40):

Yes, this is among the most sober files in mathlib.

Patrick Massot (Mar 29 2020 at 16:41):

I think this was already noticed when writing the mathlib paper, but somehow it was decided not to fix it.

Patrick Massot (Mar 29 2020 at 16:41):

Probably it's meant as a placeholder

Patrick Massot (Mar 29 2020 at 16:42):

There are 648 nodes in this graph (and 1764 edges), we can afford a couple of empty nodes.

Patrick Massot (Mar 29 2020 at 16:43):

But note that https://github.com/leanprover-community/mathlib/blob/38544f14662d9d1fd97bd1fb132213c546292025/src/category_theory/limits/shapes/constructions/finite_products.lean is conceptually closely related to that file, it almost counts as an edge.

Kevin Buzzard (Mar 29 2020 at 17:14):

I didn't know this was allowed! It reminds me of all the directories here (when I was recently trying to figure out what maths has been done in HoTT theories)

Patrick Massot (Mar 29 2020 at 17:15):

This not really allowed, this is more like overlooked.

Rob Lewis (Mar 29 2020 at 17:34):

I think this was already noticed when writing the mathlib paper, but somehow it was decided not to fix it.

Was it decided or just forgotten? The "placeholder" has been empty for 7 months, I think a PR deleting empty files would be welcome.

Rob Lewis (Mar 29 2020 at 17:34):

Yury G. Kudryashov said:

Do we have any tool to determine unneeded imports?

Simon and Mario's olean-rs tool does this, iirc

Patrick Massot (Mar 29 2020 at 17:34):

This is why I wrote it was more overlooked than allowed.

Kevin Buzzard (Mar 29 2020 at 17:43):

Yury G. Kudryashov said:

Do we have any tool to determine unneeded imports?

ooh found one import.png Might take a while to do it by hand though

Patrick Massot (Mar 29 2020 at 17:44):

I'm not sure this is what Yury wanted.

Patrick Massot (Mar 29 2020 at 17:44):

But maybe it is.

Kevin Buzzard (Mar 29 2020 at 17:44):

Oh -- imports which aren't used?

Patrick Massot (Mar 29 2020 at 17:45):

There are two kinds of useless imports. Those which are unused and those which would be imported by transitivity anyway.

Kevin Buzzard (Mar 29 2020 at 17:45):

I guess that is the more important issue. I should imagine that the loop above makes no difference to anything.

Yury G. Kudryashov (Mar 29 2020 at 17:55):

I was talking about something like this

Patrick Massot (Mar 29 2020 at 18:30):

I just opened https://github.com/leanprover-community/mathlib-tools/pull/38 for people interested in import graphs. It's purely about imports, not things being actually used.

Scott Morrison (Mar 30 2020 at 02:32):

I just wrote (I thought I should remember how to write python) lean-deps, which you use as follows.

Let's look at the dependency tree for src/category_theory/types.lean, restricting to import chains ending within src/category_theory:

$ ./lean-deps src/category_theory/types.lean src/category_theory/
src/category_theory/types.lean
├── src/category_theory/functor_category.lean
│   └── src/category_theory/natural_transformation.lean
│       └── src/category_theory/functor.lean
│           ├── src/category_theory/category/default.lean
│           └── src/tactic/reassoc_axiom.lean
│               └── ≪ src/category_theory/category/default.lean ≫
└── src/category_theory/fully_faithful.lean
    └── src/category_theory/isomorphism.lean
        └── ≪ src/category_theory/functor.lean ≫

Dependencies which occur multiple times are shown wrapped in ≪ ... ≫, and their further transitive dependencies are omitted. (You can omit repeats entirely using lean-deps -o, or show their dependencies too with lean-deps -a.

Without a second argument, lean-deps shows dependencies all the way down. It takes a few seconds on files deep in the hierarchy, but seems usable. You can also specify one or more specific files as second arguments, so we can analyze the somewhat surprising dependency of the Hahn-Banach theorem theorem on the category theory library as follows:

$ ./lean-deps -o src/analysis/normed_space/hahn_banach.lean src/category_theory/category/default.lean
src/analysis/normed_space/hahn_banach.lean
└── src/analysis/normed_space/operator_norm.lean
    └── src/topology/metric_space/lipschitz.lean
        └── src/category_theory/endomorphism.lean
            └── src/category_theory/category/default.lean

Scott Morrison (Mar 30 2020 at 02:35):

This seems pretty duplicative, of course, with what Patrick has done, but I find the text mode output very satisfying.

Yury G. Kudryashov (Mar 30 2020 at 02:35):

I prove that list.prod of Lipschitz Ends is Lipschitz.

Scott Morrison (Mar 30 2020 at 02:36):

Perhaps we could just combine the two, so leanproject import-graph -from XXX but with no filename prints an ASCII tree on stdout

Scott Morrison (Mar 30 2020 at 02:37):

Also, I think leanproject import-graph doesn't yet support using both -from and -to, which I think is very helpful for understanding a particular chain.

Scott Morrison (Mar 30 2020 at 02:38):

My code is at https://gist.github.com/semorrison/d27e35ed5eac96b0dec939f72fada56d.

Scott Morrison (Mar 30 2020 at 02:38):

I think I'll stop playing with it until hearing from Patrick.

Mario Carneiro (Mar 30 2020 at 03:25):

I think this also duplicates functionality in olean-rs

Mario Carneiro (Mar 30 2020 at 03:26):

you have to calculate this info in order to read any lean file

Mario Carneiro (Mar 30 2020 at 03:27):

it should be a lot faster than a few seconds too ('cause Rust and all that)

Mario Carneiro (Mar 30 2020 at 03:31):

oh I see, you are calling lean a bunch of times to process the files. Yeah, olean-rs will be a lot faster, since it just does the same thing lean does but without the text processing and elaboration

Mario Carneiro (Mar 30 2020 at 03:33):

The dependencies of an olean file are all literally written in the first line

Scott Morrison (Mar 30 2020 at 04:05):

I just want ASCII trees, in a tool whose installation procedure is already covered by pip install mathlib-tools. :-)

Mario Carneiro (Mar 30 2020 at 04:13):

how feasible is it to link an olean-rs library build to a python project? I would guess it's possible, not sure about how distribution works

Gabriel Ebner (Mar 30 2020 at 09:29):

lean --deps is much faster than running lean. It doesn't import anything, it just parses the first command in the file and outputs the imported files.

Patrick Massot (Mar 30 2020 at 16:00):

I worked a bit on this, see https://github.com/leanprover-community/mathlib-tools/pull/38/commits/6a94d737f54fd259d7ee228e2ef56b5b4f997a45. The motivation for adding ascii art output is very low for me, but feel free to add a commit to this PR. I also just realized I didn't interpret "support using both --from and --to" in the same way as Scott. I added this possibility, but it return the intersection of the subgraphs that each option would have generated, whereas Scott's example suggest generating only the shortest path from one file to the other. For comparison, here is the current output of

leanproject import-graph --to analysis.normed_space.hahn_banach --from category_theory.category.default hb.pdf

hb.pdf

Patrick Massot (Mar 30 2020 at 16:05):

Note that my version clearly points out the culprit: topology.metric_space.lipschitz randomly imports two category theory files.

Patrick Massot (Mar 30 2020 at 16:06):

which are used in three isolated lemmas in the middle of the file.

Yury G. Kudryashov (Mar 30 2020 at 17:20):

Is there something bad with importing category theory files?

Yury G. Kudryashov (Mar 30 2020 at 17:20):

How else should I argue about long compositions of self-maps?

Patrick Massot (Mar 30 2020 at 17:21):

This was described as "somewhat surprising", not bad (especially coming from Scott).

Patrick Massot (Mar 30 2020 at 17:22):

Then there is the question of optimizing the import graph, but this is a complicated question.

Scott Morrison (Mar 30 2020 at 23:16):

@Patrick Massot, how do I run leanproject from a branch of mathlib-tools?

Scott Morrison (Mar 30 2020 at 23:17):

I tried:

  • checkout the git repository on branch imports
  • locate leanproject.py in mathlibtools
  • try running it, get "permission denied"
  • chmod u+x ...
  • run it, get error message
import: delegate library support not built-in '' (X11) @ error/import.c/ImportImageCommand/1282.
./leanproject.py: line 13: syntax error near unexpected token `('
./leanproject.py: line 13: `from mathlibtools.lib import (LeanProject, log, LeanDirtyRepo,'

after the usual help text.

Am I meant to instead be using pip to install the branch version?

Scott Morrison (Mar 30 2020 at 23:18):

(Actually, not after the usual help text: instead it prints help text from ImageMagick...)

Patrick Massot (Mar 31 2020 at 09:36):

Using leanproject without installing the mathlibtools package is definitely not supported, even when developing it. The correct workflow is

git checkout imports
pip install .

If you are modifying it you can pip install -e . to avoid constantly running pip install.

Patrick Massot (Mar 31 2020 at 09:38):

The ImageMagick thing is hilarious, I wonder what your shell is trying to launch leanproject.py directly (this file has no shebang indicating this should be ran using python).

Patrick Massot (Mar 31 2020 at 09:41):

Since you don't seem to be familiar with python packaging, let me point out that the name similarity between the user-facing script leanproject and the file leanproject.py is not what create the user facing script, which comes from https://github.com/leanprover-community/mathlib-tools/blob/master/setup.py#L15-L18.

Patrick Massot (Apr 05 2020 at 20:22):

@Scott Morrison what is the status of this thread? Do you still want to contribute to https://github.com/leanprover-community/mathlib-tools/pull/38 or should we merge it?

Scott Morrison (Apr 06 2020 at 01:51):

I added some documentation to what you'd done. My attempts to add text mode output failed, so I added some comments to the PR in case anyone else wants to take this up again.

Scott Morrison (Apr 06 2020 at 01:51):

I'll use my lean-deps script myself for the "dependencies between X and Y" question.

Patrick Massot (Apr 06 2020 at 09:38):

I still don't understand the specification here. What do you mean by "dependencies between X and Y" ? Why isn't my version the answer?

Patrick Massot (Apr 06 2020 at 09:40):

Remember the answer of my version is https://leanprover.zulipchat.com/user_uploads/3121/oMG7VwMiOT61e9YWTIrcskk5/hb.pdf

Scott Morrison (Apr 06 2020 at 10:10):

Oh, I see, that's great. I just never managed to run it locally (because I only worked out today that I needed to install graphviz for anything to work), and missed the link you provided to the PDF.

Scott Morrison (Apr 06 2020 at 10:11):

I saw in the code the line

    if to and from_:
        G = graph.path(start=from_, end=to)

and made an (incorrect) assumption about what it was doing.

Kevin Buzzard (Apr 06 2020 at 10:16):

topology.metric_space.antilipschitz imports topology.metric_space.lipschitz ;-)

Patrick Massot (Apr 06 2020 at 10:18):

I'm sorry I thought the doc was clear. Please feel free to clarify it. It's still not quite true that graphviz is required. You can output graphml or gexf and look at them using yed or Gephi for instance.

Patrick Massot (Apr 06 2020 at 10:19):

But it's true you need some graph visualization software to see the graph.

Patrick Massot (Apr 06 2020 at 10:22):

@Scott Morrison what is this timing stuff you added in the import graph branch?

Patrick Massot (Apr 06 2020 at 10:22):

I guess this is a git accident.

Scott Morrison (Apr 06 2020 at 11:23):

git accident!

Kevin Buzzard (Apr 06 2020 at 11:24):

/me looks it up in the git manual

Patrick Massot (Apr 06 2020 at 12:14):

Ok, I cleanup up and forced push. I'm afraid you lost the authorship here, but I hope it's not too bad.

Scott Morrison (Apr 06 2020 at 13:00):

no worries :-)

Alena Gusakov (May 26 2020 at 18:05):

Hello! I'm interested in joining in on this project - I really like following theorem dependencies when I take notes so this project is pretty cool.

I'm particularly interested in making user-friendly visualizations - stuff like color-coding the ideas or bigger themes underlying the theorems, replacing Lean theorem names with more commonly known names, that sort of thing

Patrick Massot (May 26 2020 at 18:18):

You are looking for leancrawler

Patrick Massot (May 26 2020 at 18:18):

See eg the graph at https://leanprover-community.github.io/lean-perfectoid-spaces/

Patrick Massot (May 26 2020 at 18:19):

Here nodes are colored by auto-generated cluster, and sized by outgoing degree, but you can play endlessly with variations

Patrick Massot (May 26 2020 at 18:20):

I posted mathlib data less than one week ago I think.

Patrick Massot (May 26 2020 at 18:20):

Note that you can also get a much cruder graph by using leanproject to generate import graphs

Alena Gusakov (May 26 2020 at 18:26):

That's a very pretty graph

Alena Gusakov (May 26 2020 at 18:27):

Thank you though! Good to know

Kenny Lau (May 26 2020 at 20:39):

PS C:\mathlib> leanproject import-graph --from .\src\linear_algebra\basis.lean
The node .\src\linear_algebra\basis.lean is not in the graph.
PS C:\mathlib> leanproject import-graph --from linear_algebra.basis
[WinError 2] "dot" not found in path.
PS C:\mathlib>

Kenny Lau (May 26 2020 at 20:39):

what's the canonical way to use leanproject import-graph?

Bhavik Mehta (May 26 2020 at 20:40):

leanproject import-graph works for me

Bryan Gin-ge Chen (May 26 2020 at 20:40):

What happens if you pass it paths with forward slashes?

Bhavik Mehta (May 26 2020 at 20:41):

Oh hang on, it seems like import-graph wants to use dot, which it doesn't seem like you have?

Kenny Lau (May 26 2020 at 20:41):

what is dot?

Bryan Gin-ge Chen (May 26 2020 at 20:42):

It's a graph-drawing program. https://graphviz.gitlab.io/download/


Last updated: Dec 20 2023 at 11:08 UTC