Zulip Chat Archive

Stream: mathlib4

Topic: lake exe graph


Yury G. Kudryashov (Jul 14 2023 at 23:41):

I tried to run lake exe graph and got

uncaught exception: could not find native implementation of external declaration 'UInt64.ofNatCore' (symbols 'l_UInt64_ofNatCore___boxed' or 'l_UInt64_ofNatCore')

Yury G. Kudryashov (Jul 14 2023 at 23:41):

OS: NixOS Linux

Yury G. Kudryashov (Jul 15 2023 at 00:05):

@Scott Morrison :up:

Scott Morrison (Jul 15 2023 at 00:11):

I doubt this is due to anything specific to graph. Could you try minimizing ImportGraph/Main.lean, seeing if the error goes away as you remove dependencies?

Yury G. Kudryashov (Jul 15 2023 at 02:18):

Minimized to

/-
Copyright (c) 2023 Scott Morrison. All rights reserved.
Released under Apache 2.0 license as described in the file LICENSE.
Authors: Scott Morrison
-/
import Mathlib.Util.Imports
import Mathlib.Lean.IO.Process

/-!
# `lake exe graph`

This is a replacement for Lean 3's `leanproject import-graph` tool.
-/

open Lean Meta

/-- Write an import graph, represented as a `NameMap (Array Name)` to the ".dot" graph format. -/
def asDotGraph (_ : NameMap (Array Name)) (_ := "import_graph") : String := ""

open Lean Core System

-- Next two declarations borrowed from `runLinter.lean`.

instance : ToExpr FilePath where
  toTypeExpr := mkConst ``FilePath
  toExpr path := mkApp (mkConst ``FilePath.mk) (toExpr path.1)

elab "compileTimeSearchPath" : term =>
  return toExpr ( searchPathRef.get)

open IO.FS IO.Process in
/-- `lake exe graph` -/
def main : IO Unit := do
  searchPathRef.set compileTimeSearchPath
  let _  unsafe withImportModules [{module := `Mathlib}] {} (trustLevel := 1024) fun env =>
    return asDotGraph env.importGraph

Mauricio Collares (Jul 15 2023 at 07:23):

Try setting supportInterpreter := true in the lake_exe graph section of the lakefile

Yury G. Kudryashov (Jul 15 2023 at 10:53):

Does it switch from "compile&run" to "interpret"? If yes, then this sounds like a workaround, not a solution.

Yury G. Kudryashov (Jul 15 2023 at 10:56):

I mean, probably there is a bug in lake or lean behind this and while we can run small executables like graph in interpreted mode, the bug may appear later in a larger executable where running it in the interpreted mode will not be desirable.

Sebastian Ullrich (Jul 15 2023 at 11:56):

Yury G. Kudryashov said:

Does it switch from "compile&run" to "interpret"? If yes, then this sounds like a workaround, not a solution.

It does not: https://github.com/leanprover/lake#binary-executables

Yury G. Kudryashov (Jul 15 2023 at 11:57):

Thank you!

Yury G. Kudryashov (Jul 15 2023 at 16:01):

I added supportInterpreter := true. Now it works better but it needs oleans. Is this intentional? Lake should have all the information about dependencies somewhere. Can't we just ask it?

Mario Carneiro (Jul 15 2023 at 20:07):

What do you mean? Anything that uses withImportModules or similar has to read olean files on disk. These are found via the search path, but since you are using a compile time search path it will only work if the program is run from the project root and the files are in the same place as they were at compile time

Yury G. Kudryashov (Jul 15 2023 at 20:59):

I mean that lake somehow knows dependencies of all files before they're compiled. I wonder if we can ask it instead about this information instead of whatever API is used now.

Arthur Paulino (Jul 15 2023 at 21:03):

I think Lake doesn't have an API to build the dependency graph out of the box.

One possibility, however, is doing what Cache does to figure out (and traverse) the dependency tree. It basically parses text files. You can look at the source code for the hashing routine.

Yury G. Kudryashov (Jul 15 2023 at 21:07):

So, lake parses source files, builds the dependency graph but doesn't have an API to query this information?

Arthur Paulino (Jul 15 2023 at 21:26):

Yes

Yury G. Kudryashov (Jul 15 2023 at 21:26):

So, the proper fix is to add this API to lake?

Arthur Paulino (Jul 15 2023 at 21:28):

It's not needed by Lake. Lake does its job by traversing dependencies on demand, like Cache

Arthur Paulino (Jul 15 2023 at 21:29):

Unless there's a design decision to refactor Lake so it precomputes the whole DAG before doing anything. Then it would make sense to add such API (since it's needed)

Yury G. Kudryashov (Jul 15 2023 at 21:29):

But it has functions like "give me all dependencies of this file" or "give me all goals in this module".

Arthur Paulino (Jul 15 2023 at 21:29):

Yes

Mario Carneiro (Jul 15 2023 at 21:30):

lean already has the API, Lean.parseImports'

Yury G. Kudryashov (Jul 15 2023 at 21:30):

Why doesn't withImportGraph use this?

Mario Carneiro (Jul 15 2023 at 21:31):

:shrug:

Mario Carneiro (Jul 15 2023 at 21:32):

I don't see a withImportGraph function in mathlib4, lean4 or lake

Yury G. Kudryashov (Jul 15 2023 at 21:33):

I'm sorry, it doesn't exist.

Yury G. Kudryashov (Jul 15 2023 at 21:34):

I mean, why doesn't docs#Lean.Environment.importGraph parse files instead of using oleans.

Mario Carneiro (Jul 15 2023 at 21:38):

because it's way more expensive?

Mario Carneiro (Jul 15 2023 at 21:39):

if you already have oleans it's probably better to use them than to parse text files

Mario Carneiro (Jul 15 2023 at 21:39):

I think I timed this at one point

Eric Wieser (Jul 15 2023 at 21:44):

And if you don't have oleans then you don't have a guarantee that your imports exist

Eric Wieser (Jul 15 2023 at 21:44):

Or that the graph they form is acyclic

Yury G. Kudryashov (Jul 15 2023 at 22:00):

Thank you for the explanation.

Sorawee Porncharoenwase (Jul 18 2023 at 04:14):

I tried lake exe graph. While it generates a dot file for my project dependencies (Mathlib, etc) fine, it doesn't generate entries for my project itself. How do I make it recognize the module in my project?

Note that my .olean files are under build/lib. Do I need to specify this path anywhere?

Mario Carneiro (Jul 18 2023 at 04:17):

Sorawee Porncharoenwase said:

Note that my .olean files are under build/lib. Do I need to specify this path anywhere?

unlikely, most lean processing tools already know this and should be able to find the oleans without any additional hints as long as they are run using lake env, lake exe, or lake run

Mario Carneiro (Jul 18 2023 at 04:23):

looking at the code it appears that if you don't specify a main module it defaults to Mathlib

Mario Carneiro (Jul 18 2023 at 04:27):

lake exe graph --to MyProject seems to work

Sorawee Porncharoenwase (Jul 18 2023 at 05:47):

I read the the --help page and thought that : is a part of the command-line syntax, and so I wrote lake exe graph --to : MyProject. /facepalm

Anyhow, lake exe graph --to MyProject doesn't work either (empty digraph), but I now know why: I have

globs := #[`MyProject, .submodules `MyProject]

in lakefile.lean, and MyProject.lean is essentially an empty file. If I directly specify a path to a module that has actual Lean content, it works as expected!

So I guess the globs trick doesn't play well with lake exe graph, and it's better to generate a main file that imports everything instead.

Mario Carneiro (Jul 18 2023 at 05:48):

that sounds like it is working as expected

Mario Carneiro (Jul 18 2023 at 05:48):

if the MyProject module doesn't have any dependencies, then you should just get a single node

Sorawee Porncharoenwase (Jul 18 2023 at 05:52):

Right. And thank you! Really appreciate the help.

Patrick Massot (Oct 17 2023 at 16:42):

How is lake exe graph meant to work in a project depending on mathlib. I was expecting it to generate a graph of the project file, excluding mathlib, but it doesn't. @Scott Morrison

Patrick Massot (Oct 17 2023 at 16:43):

Nevermind, I noticed this topic already existed and contains the answer. So let's turn this into a documentation question. How is this meant to be documented?

Kevin Buzzard (Oct 17 2023 at 21:22):

(on the linux app when you start a topic which already exists, a right-arrow appears by the topic and if you click on it you can see previous messages sent to the topic. This is a relatively recent addition)

Scott Morrison (Oct 18 2023 at 01:05):

Can we access the name of the current project?

Scott Morrison (Oct 18 2023 at 01:08):

@Patrick Massot #7738

Patrick Massot (Oct 18 2023 at 01:09):

I don't understand the new doc, or the option name really.

Patrick Massot (Oct 18 2023 at 01:09):

I would rather expect that --from would achieve this.

Scott Morrison (Oct 18 2023 at 01:10):

I have always understood import graphs to have edges consistent with the "upstream" and "downstream" naming scheme: i.e. edges point to downstream files, from upstream files.

Scott Morrison (Oct 18 2023 at 01:11):

But maybe you have the opposite orientation in mind?

Scott Morrison (Oct 18 2023 at 01:11):

The root files of a project should appear at the top of the displayed graph, and the most complicated at the bottom, and the edges should point down the screen.

Scott Morrison (Oct 18 2023 at 01:12):

hence --to MyProject

Scott Morrison (Oct 18 2023 at 01:13):

Ideally someone would move graph out of Mathlib into its own project depending only on Std and Cli, so that it is usable elsewhere.

Scott Morrison (Oct 18 2023 at 01:18):

In the same PR, I just added a more helpful message if graphviz is missing.

Patrick Massot (Oct 18 2023 at 01:48):

I agree with all that, but I don't understand why --to MyProject removes Mathlib which is way upstream of MyProject.

Scott Morrison (Oct 18 2023 at 01:58):

ah, this happened in #6151. Initially it was proposed as a --noDeps flag, but I suggested, apparently incorrectly, that it would be better as a default.

Scott Morrison (Oct 18 2023 at 01:58):

Would you like me to switch that?

Scott Morrison (Oct 18 2023 at 01:58):

I had assumed that Mathlib was so enormous that no one would ever want to see all of it.

Patrick Massot (Oct 18 2023 at 02:07):

Oh I agree that --noDeps is a good default.

Patrick Massot (Oct 18 2023 at 02:08):

And now I understand your sentence!

Scott Morrison (Oct 18 2023 at 02:09):

Let me know if I should further change the docs!

Patrick Massot (Oct 18 2023 at 02:09):

I think part of my confusion is that I'm not used to the Mathlib module being all of Mathlib, and same with MyProject.

Patrick Massot (Oct 18 2023 at 02:09):

But I don't have a clearer wording.

Patrick Massot (Oct 18 2023 at 02:09):

Maybe I'm simply a bit tired.

Scott Morrison (Oct 18 2023 at 02:10):

Yes. I'm a bit worried that encouraging people to use globs rather than a file that imports everything is bifurcating our standard setups, and this is going to bite us down the track.

Scott Morrison (Oct 18 2023 at 02:10):

But I appreciate that having to maintain a file that imports everything is beyond tedious.

Patrick Massot (Oct 18 2023 at 02:10):

Another question: is there a way to discover the existence of this graph function?

Patrick Massot (Oct 18 2023 at 02:10):

It's not only tedious, it's also not clear at all for beginners.

Scott Morrison (Oct 18 2023 at 02:11):

I guess it is mentioned in the lakefile.lean as

lean_exe graph where
  root := `ImportGraph.Main

but otherwise: no, not at all. :-)

Scott Morrison (Oct 18 2023 at 02:11):

Part of the answer here is: it should be its own repo, separate from Mathlib, with a README

Scott Morrison (Oct 18 2023 at 02:12):

Otherwise, what should we add?

Patrick Massot (Oct 18 2023 at 02:12):

No lake exe --list or lake exe --help?

Scott Morrison (Oct 18 2023 at 02:12):

There should be a doc-string in the lakefile. (In fact, that should be a requirement!!)

Patrick Massot (Oct 18 2023 at 02:12):

I mean lake should have a command listing everything you can put after exe in the current project.

Patrick Massot (Oct 18 2023 at 02:12):

And maybe it has this command.

Scott Morrison (Oct 18 2023 at 02:13):

lean4#2708

Scott Morrison (Oct 18 2023 at 02:15):

@Mac Malone, is there any way we can run the linter on the lakefile.lean itself? e.g. here it would be nice if I had not been allowed to add this exe without a doc-string.

Scott Morrison (Oct 18 2023 at 02:18):

#7738 now includes doc-strings on all the lean_exe declarations in our lakefile

Patrick Massot (Oct 18 2023 at 02:22):

/-- `lake exe cache` retrieves precompiled `.olean` files from a central server. -/ is a bit too simple.

Patrick Massot (Oct 18 2023 at 02:23):

lake exe cache alone does nothing, right?

Scott Morrison (Oct 18 2023 at 02:28):

Good point. It prints a help message! I will change to doc-string to include get, on the basis that this is 99.9% of usage.

Mac Malone (Oct 18 2023 at 02:59):

Scott Morrison said:

Mac Malone, is there any way we can run the linter on the lakefile.lean itself? e.g. here it would be nice if I had not been allowed to add this exe without a doc-string.

I would think you could run it on lakefile.lean the same way you run it on any other Lean file. Has that not worked?

Scott Morrison (Oct 18 2023 at 03:09):

@Mac Malone, sorry, should have included that:

% lake exe runMathlibLinter lakefile
uncaught exception: unknown package 'lakefile'

Mac Malone (Oct 18 2023 at 04:02):

@Scott Morrison Oh, so the linter imports modules oleans rather than elaborating the lean files? In that case, it could probably read the module data of the top-level olean directly rather than importing it to support loading lakefile.olean (since we almost definitely do not want the lakefiles to generally be in the LEAN_PATH) .

Scott Morrison (Oct 18 2023 at 04:03):

Hmm, I see. Low priority, I guess. :-)

Utensil Song (Oct 18 2023 at 04:06):

Is the result of lake exe graph published somewhere for Mathlib? Is it expected to generate something like Dependency graph in blueprint, i.e. assigning background colors and border colors according to whether it's sorry-free or whether all its prerequisites are sorry-free?

image.png

Utensil Song (Oct 18 2023 at 04:14):

Oh, so far it seems to be only about modules, not declarations.

Scott Morrison (Oct 18 2023 at 04:35):

@Utensil Song, there are no sorries in Mathlib! :-)

Scott Morrison (Oct 18 2023 at 04:36):

lake exe graph for all of Mathlib is already extremely large and unreadable, at the module level. I can't imagine what the 150000 declaration graph would look like ...

Utensil Song (Oct 18 2023 at 05:27):

Scott Morrison said:

Utensil Song, there are no sorries in Mathlib! :-)

Yes, but I thought we are also talking about formalization projects that depend on Mathlib.

Scott Morrison said:

lake exe graph for all of Mathlib is already extremely large and unreadable, at the module level. I can't imagine what the 150000 declaration graph would look like ...

Also inspired by this topic, no one wants a full graph, but a graph with branch pruning (only the transitively used declarations, not the unused ones imported along with them, and you can select from or to which module level that you are interested) would be quite useful, even for Mathlib.

Kevin Buzzard (Oct 18 2023 at 06:26):

There will be plenty of sorries in FLT corresponding to hard theorems from pre-Wiles which we will be initially ignoring

Kevin Buzzard (Oct 18 2023 at 06:26):

I think a nice red "we are cheating here" would be fine

Kevin Buzzard (Oct 18 2023 at 06:27):

Oh except I wouldn't want that to pollute "we are currently working on this and will finish in a few months"


Last updated: Dec 20 2023 at 11:08 UTC