Zulip Chat Archive

Stream: lean4

Topic: Reintroducing default.lean


Thomas Murrills (Sep 14 2023 at 09:55):

I really like the notion of having a special file to collect imports (instead of Foo/ and Foo.lean). Could this extend to other folders as well? E.g. there have occasionally been discussions on being able to write import Mathlib.Tactic to import the whole Mathlib/Tactic/ folder. Would it make sense to also be able to have a lib.lean file in Mathlib/Tactic/ to specify what gets imported when you write import Mathlib.Tactic?

Thomas Murrills (Sep 14 2023 at 09:55):

Oops, a moment too late

Mario Carneiro (Sep 14 2023 at 09:56):

lean 3 has default.lean, rust has mod.rs, python has __init__.py

Mario Carneiro (Sep 14 2023 at 09:56):

putting the root file in the same folder is good for organization purposes

Eric Wieser (Sep 14 2023 at 09:57):

Did lean3 pick favorites between Foo/default.lean and Foo.lean, or just complain that it was ambiguous? I think Python picks favorites, I have no idea about rust.

Mario Carneiro (Sep 14 2023 at 09:57):

it picks favorites

Yaël Dillies (Sep 14 2023 at 09:57):

Why does it have to be a file even? Can't we make Lean import all context of folder X when we write import X?

Mario Carneiro (Sep 14 2023 at 09:57):

I think Foo.lean wins over Foo/default.lean

Mario Carneiro (Sep 14 2023 at 09:57):

lean 3 was totes okay with ambiguous imports

Thomas Murrills (Sep 14 2023 at 09:58):

Yaël Dillies said:

Why does it have to be a file even? Can't we make Lean import all context of folder X when we write import X?

I think we don’t want that sometimes; I believe it’s been discussed before but I’m not sure where

Eric Wieser (Sep 14 2023 at 09:58):

Mario Carneiro said:

lean 3 was totes okay with ambiguous imports

Looks like it complains.

Yaël Dillies (Sep 14 2023 at 09:59):

Thomas Murrills said:

Yaël Dillies said:

Why does it have to be a file even? Can't we make Lean import all context of folder X when we write import X?

I think we don’t want that sometimes; I believe it’s been discussed before but I’m not sure where

I mean, we don't really want that at all in mathlib because of import reduction efforts (except maybe Tactic), but we can always lint against it.

Notification Bot (Sep 14 2023 at 09:59):

14 messages were moved here from #lean4 > Pre-RFC: lean/lake module resolution by Eric Wieser.

Mario Carneiro (Sep 14 2023 at 10:01):

Another relevant point about lean module resolution: for the most part, lean only cares about resolving paths to oleans, and lake has a lot more flexibility to do whatever is convenient here inside the build/ directory

Mario Carneiro (Sep 14 2023 at 10:02):

e.g. even if there is A.lean and B/default.lean it could still normalize them to build/lib/A.olean and build/lib/B.olean

Mario Carneiro (Sep 14 2023 at 10:03):

The source path is only used in lean for the server to resolve go-to-def queries and similar things

James Gallicchio (Sep 14 2023 at 15:33):

my 2c, i find it a little annoying to have to look through a file list for the relevant lib.rs or mod.rs or whatever in rust packages, whereas Foo.lean one level up I find a bit easier.

plus the mental model of no distinction between resolving Foo.Bar and Foo.Bar.Bax is nice.

if VSCode had a setting to alphabetize files and folders together, I'd have ~no complaints with the status quo.

Eric Wieser (Sep 14 2023 at 15:40):

if VSCode had a setting to alphabetize files and folders together, I'd have ~no complaints with the status quo.

I agree this is the main obstacle; though the same complaint applies to GitHub, as that's the other main way that mathlib code is viewed

James Gallicchio (Sep 14 2023 at 15:46):

and if that's really the reason, then we should just use MyModule/aaa.lean instead. which, honestly, I think is a pretty great idea. it exists to be first in the list! :)

James Gallicchio (Sep 14 2023 at 15:48):

or 0.lean, which does appear to be first in VSCode

James Gallicchio (Sep 14 2023 at 15:48):

_.lean goes before 0, even

James Gallicchio (Sep 14 2023 at 15:54):

_ also works with lake already -- if you call a file Mathlib/_.lean, then import Mathlib._ just works :joy:

James Gallicchio (Sep 14 2023 at 15:54):

whereas it seems *.lean is special cased somehow

Mario Carneiro (Sep 14 2023 at 19:19):

or __init__.lean :see_no_evil:

James Gallicchio (Sep 14 2023 at 19:27):

import Mathlib.__init__ doesn't have quite the same ring to it :stuck_out_tongue_closed_eyes:

Mario Carneiro (Sep 14 2023 at 19:28):

you should never have to write that anyway, the whole meaning of introducing this mechanism is that you would just write import Mathlib

James Gallicchio (Sep 14 2023 at 19:43):

Sure. I'm just not convinced of the value of making the file resolution algorithm more complicated.

James Gallicchio (Sep 14 2023 at 19:45):

The mental model right now is quite clean, which I appreciate. No algorithm to learn, it just matches the file structure. If you want to be able to write import Mathlib, you make a file called Mathlib.

Mario Carneiro (Sep 14 2023 at 19:45):

but then you have one file which is not inside the source tree

Julian Berman (Sep 14 2023 at 19:46):

FWIW another small benefit to Python's model is to facilitate complexity changing while preserving backwards compatibility.

Julian Berman (Sep 14 2023 at 19:46):

Specifically if you have a Foo, a file, and later Foo grows and needs to become a package, you have a way of not breaking everyone's imports because import Foo can continue to work

Mario Carneiro (Sep 14 2023 at 19:47):

that's true even if it's spelled Foo.lean instead of Foo/default.lean though

James Gallicchio (Sep 14 2023 at 19:49):

Mario Carneiro said:

but then you have one file which is not inside the source tree

I mean, I'm okay with _.lean too! :joy: import Mathlib._ There's no real reason why we feel compelled to write one-component imports except the norms from other languages, which always feels like a questionable incentive.

Eric Wieser (Sep 14 2023 at 19:50):

I think the key advantages of the __init__ (or _ or some other alphabetically early) proposal, as opposed to the current Foo.lean design, are:

  • They work in tandem with Mario's other Pre-RFC
  • They mean that Foo/__init__.lean appears right next to Foo/Bar.lean in the VSCode sidebar/github sidebar/file explorer of choice, rather than being arbitrarily far away.

James Gallicchio (Sep 14 2023 at 19:52):

(In reality, I'm fine with any of these, it's not worth bikeshedding about :p)

Julian Berman (Sep 14 2023 at 19:54):

that's true even if it's spelled Foo.lean instead of Foo/default.lean though

(Ah, indeed, I forgot Lean lets you have Foo a directory and Foo.lean next to each other, Python does not really.)

James Gallicchio (Sep 14 2023 at 19:55):

Is there a reason why we want this to be controlled by packages instead of clients? Like, in Java-world, you write import Whatever.* for all importable names starting with Whatever.. I think Lake might already let you do that? and lake already lets you put all files in your tree into a package with a bit of lakefile magic

James Gallicchio (Sep 14 2023 at 19:56):

Only reason I have the import collection files right now is because it's the current norm. If we all switched to writing import Mathlib.* and not having these files, I'd be fine with that

Sebastian Ullrich (Sep 14 2023 at 19:57):

Mario Carneiro said:

lean 3 has default.lean, rust has mod.rs, python has __init__.py

Lean 4 specifically follows Rust's new model introduced in their module system cleanup https://doc.rust-lang.org/edition-guide/rust-2018/path-changes.html#no-more-modrs

Adam Topaz (Sep 14 2023 at 20:06):

This is similar to Haskell as well, right?

Mario Carneiro (Sep 14 2023 at 20:18):

Well, rust has the option to use either one, and both are used in practice

Mario Carneiro (Sep 14 2023 at 20:18):

the 2018 change was just to make the mod.rs thing not required

Mario Carneiro (Sep 14 2023 at 20:20):

regarding:

This eliminates the special name, and if you have a bunch of files open in your editor, you can clearly see their names, instead of having a bunch of tabs named mod.rs.

image.png

Mario Carneiro (Sep 14 2023 at 20:23):

The rust situation with a bunch of tabs named mod.rs is a bit worse than ours, because in rust modules named mod.rs often do a lot, while in lean default.lean files are almost always just imports (we have considered linting to ensure this), so you won't be spending a lot of time in them

Shreyas Srinivas (Sep 14 2023 at 20:35):

James Gallicchio said:

if VSCode had a setting to alphabetize files and folders together, I'd have ~no complaints with the status quo.

To me it sounds like the issue can be solved in the extension by offering a project view, like some IDEs do, rather than changing how modules work and asking everyone to update their code. Personally, since I never was a heavy lean 3 user, and have seen both conventions in different languages, the lean 4 convention feels much cleaner. Imagine opening a large project and then opening a number of files named default.lean and then having to carefully disambiguate which default file belongs to which module. (The tab headers shrink when you have many open files, and the folder names next to yhe filename get truncated).

James Gallicchio (Sep 14 2023 at 20:39):

is a project view like that possible in VSCode? (if anyone knows)

Shreyas Srinivas (Sep 14 2023 at 20:40):

It certainly would be. The side panel can be modified

Shreyas Srinivas (Sep 14 2023 at 20:41):

If latex-workshop can show me the doc structure and take me directly to the relevant file, then I don't see why a project view is impossible for lean.

Shreyas Srinivas (Sep 14 2023 at 20:51):

Mario Carneiro said:

regarding:

This eliminates the special name, and if you have a bunch of files open in your editor, you can clearly see their names, instead of having a bunch of tabs named mod.rs.

image.png

Isn't this a design choice mathlib made, and not something forced by the lean4 module system?

Mario Carneiro (Sep 14 2023 at 21:00):

of course, everything is design decisions

Mario Carneiro (Sep 14 2023 at 21:01):

anyway, I would be in favor of introducing the option to use either style, like rust does. It's a decision of the library whether they want to organize files one way or the other

Mario Carneiro (Sep 14 2023 at 21:03):

I suspect that, given the option, libraries will gravitate toward using default.lean, since in the current style these files are actually rather different from normal files, containing imports and nothing else. There are very few of these files in mathlib, they are generally discouraged, and this is independent of whether they are spelled Foo.lean or Foo/default.lean

Shreyas Srinivas (Sep 14 2023 at 21:04):

As long as it is optional, I guess I have no objections. Although, a project view with the option to completely abstract away the filesystem for vscode users might be very nice to have, while also resolving the issues raised in this thread.

Shreyas Srinivas (Sep 14 2023 at 21:04):

Mario Carneiro said:

I suspect that, given the option, libraries will gravitate toward using default.lean, since in the current style these files are actually rather different from normal files, containing imports and nothing else. There are very few of these files in mathlib, they are generally discouraged, and this is independent of whether they are spelled Foo.lean or Foo/default.lean

I wouldn't, for the reason mentioned earlier.

Mario Carneiro (Sep 14 2023 at 21:05):

would you use "subpackage files" at all (that is, Foo.lean or Foo/default.lean where the folder Foo/ exists)? What would you put in them?

Shreyas Srinivas (Sep 14 2023 at 21:05):

In addition to imports, some top level definitions and theorems.

Mario Carneiro (Sep 14 2023 at 21:06):

if you don't use subpackage files then the whole discussion is moot

Mario Carneiro (Sep 14 2023 at 21:07):

Top level definitions usually have to go at the bottom of the folder import structure, not the top. That's why mathlib has so many Basic.lean files

Shreyas Srinivas (Sep 14 2023 at 21:08):

By top level, I mean the kind of things that go into the introduction section of a paper: the main theorem statements. The goal would be to setup the rest of the subpackage to allow me to write the main proofs in a few easy to read lines with heavy commenting to explain the intuition

Mario Carneiro (Sep 14 2023 at 21:09):

I think a reasonable convention is to use Foo.lean for files with stuff in them and Foo/default.lean for pure aggregation files

Shreyas Srinivas (Sep 14 2023 at 21:11):

Mario Carneiro said:

I think a reasonable convention is to use Foo.lean for files with stuff in them and Foo/default.lean for pure aggregation files

I guess at an abstract level, this is subjective. At a UX level on vscode, the default.lean thing is just adding extra steps and making it hard to tell which tab corresponds to the default.lean I am looking for. Currently Foo.lean tells me that this contains imports from and stuff about Foo/..

Mario Carneiro (Sep 14 2023 at 21:12):

the point is, you never want to open a default.lean file so there is no need to look for one

Mario Carneiro (Sep 14 2023 at 21:12):

there is nothing in those files

Shreyas Srinivas (Sep 14 2023 at 21:12):

I might want to. That's the point. Also, if you are writing a program in lean, the programming language, the top level file of a subpackage includes the entry point of a framework or a main function

Mario Carneiro (Sep 14 2023 at 21:13):

again, that's not a default.lean file under the convention I described

Shreyas Srinivas (Sep 14 2023 at 21:13):

Why would you want two files, one for imports and one for toplevel stuff, when the toplevel file will have to get the imports again.

Mario Carneiro (Sep 14 2023 at 21:14):

You never have both

Mario Carneiro (Sep 14 2023 at 21:14):

lean wouldn't allow it anyway

Mario Carneiro (Sep 14 2023 at 21:14):

either you have Foo.lean with stuff in it (and imports), or Foo/default.lean with only imports, or neither

Shreyas Srinivas (Sep 14 2023 at 21:15):

Seems like a more complicated condition than the current one.

Then again, as long as it is optional, I am not worried. I would never use default files. I have had enough of __init__.py tab hell for a lifetime.

Mario Carneiro (Sep 14 2023 at 21:15):

In mathlib we found that it was a bad idea to have files that play double duty as aggregating their children and then adding more stuff

Shreyas Srinivas (Sep 14 2023 at 21:16):

That's probably because of the long list of imports you would have, given the number of modules right under each folder. Think of writing a simple web framework with a deeper file hierarchy that has fewer modules at each level.

Mario Carneiro (Sep 14 2023 at 21:17):

For a program this is less of an issue, but mathlib often has files that are intended for use by other things, not necessarily to come together into one main function

Shreyas Srinivas (Sep 14 2023 at 21:17):

If you had like a 100 lines of imports , I can understand not wanting to add anything else there.

Mario Carneiro (Sep 14 2023 at 21:17):

so if you forget to include one of your children then files can go missing

Mario Carneiro (Sep 14 2023 at 21:18):

Most folders don't have a fanout that large, 20 is more common

Shreyas Srinivas (Sep 14 2023 at 21:18):

Exactly. So adding some top level code under the imports is not a big deal.

Mario Carneiro (Sep 14 2023 at 21:19):

for mathlib this is also bad because it means you have to import all the children to get the top level theorem, even though not all the children are necessarily used in the top level theorem

Mario Carneiro (Sep 14 2023 at 21:20):

and when someone adds another child later that doesn't contribute to the top level theorem it's not clear whether the top file should import it or not

Shreyas Srinivas (Sep 14 2023 at 21:20):

My gut feeling is that the convention adopted will split cleanly along two divides:

  1. Lean 3 power users vs people who started off with lean 4
  2. Projects with flatter hierarchy and larger fanout vs projects with deeper hierarchy and smaller fanout

Mario Carneiro (Sep 14 2023 at 21:20):

I don't think this is about familiarity with lean 3

Mario Carneiro (Sep 14 2023 at 21:21):

there are actual technical and maintenance reasons to prefer one over the other

Shreyas Srinivas (Sep 14 2023 at 21:22):

One can find technical arguments for both sides if one is motivated. The motivation can be past experiences with module systems and/or projects worked on.

Shreyas Srinivas (Sep 14 2023 at 21:22):

I think it would an interesting experiment to have both options and see where things stand in 10 years.

Shreyas Srinivas (Sep 14 2023 at 21:24):

the best system I have used to date in this regard was dotnet core. Namespaces did everything and file-hierarchy never mattered.

Mario Carneiro (Sep 14 2023 at 21:27):

example?

Mario Carneiro (Sep 14 2023 at 21:28):

The advantage of file hierarchy is that you know where to find a package with a given name. Even if the compiler uses package declarations in every file to determine where things go, you still need to have a sane file structure or else people won't be able to find things

Shreyas Srinivas (Sep 14 2023 at 21:29):

The ones I gave above and then some. For example you say:

so if you forget to include one of your children then files can go missing

But the converse is also true. I may want to have fine grained control over imports. and add some toplevel code. In some situations, the files wouldn't be going missing.

Mario Carneiro (Sep 14 2023 at 21:29):

no I mean literally how would you lay out files and what would you write in .net

Shreyas Srinivas (Sep 14 2023 at 21:30):

Mario Carneiro said:

no I mean literally how would you lay out files and what would you write in .net

I wouldn't worry about it. I'd use the project/solution view

Mario Carneiro (Sep 14 2023 at 21:31):

I really don't like visual studio's "virtual folder hierarchy" which is almost but not quite the one on disk

Mario Carneiro (Sep 14 2023 at 21:31):

it means when you drag things around stuff gets out of sync

Shreyas Srinivas (Sep 14 2023 at 21:32):

That's true. But that's a visual studio thing and I am not sure if it was a bug. But it is possible to implement a sensible project view. IntelliJ has one.

Mario Carneiro (Sep 14 2023 at 21:33):

I don't think people would appreciate us trying to reinvent a custom folder hierarchy separate from the one vscode provides, even ignoring the work involved

Mario Carneiro (Sep 14 2023 at 21:34):

trying to replicate that in other editors sounds even harder, there will always be people using the plain old file hierarchy

Mario Carneiro (Sep 14 2023 at 21:34):

if nothing else, github clearly does

Shreyas Srinivas (Sep 14 2023 at 21:35):

There again, a cultural difference might exist, this time between mathematicians and computer scientists. If we can hide the folder structure and just present the module hierarchy, I doubt most people would dislike it. Power users are always a different story.

We don't have to supercede the folder view either. Latex Workshop goes even further. It extracts the file structure from one of the metadata files, constructs the document structure and presents it in the bottom half of the side panel.

Mario Carneiro (Sep 14 2023 at 21:36):

the document structure is a completely different matter, that's the outline view and we already have that (IIRC)

Shreyas Srinivas (Sep 14 2023 at 21:37):

Extracting the module hierarchy must be simpler. Anyway, the project overview idea is meant to solve the problem @James Gallicchio raised, namely the annoyances with vscode file explorer

Mario Carneiro (Sep 14 2023 at 21:37):

As a newcomer to visual studio I found the module hierarchy view endlessly confusing

James Gallicchio (Sep 14 2023 at 21:38):

big fan of not hiding filesystem

Shreyas Srinivas (Sep 14 2023 at 21:38):

James Gallicchio said:

big fan of not hiding filesystem

It doesn't have to be hidden.

Shreyas Srinivas (Sep 14 2023 at 21:41):

Also visual studio lets you switch to the file explorer view. Most IDEs do this. W.r.t to implementing a solution view, among the IDEs I used, I found XCode > IntelliJ > Visual Studio > Eclipse (personal opinion ofc)

Mario Carneiro (Sep 14 2023 at 21:43):

By the way, rust also lets you decouple the module hierarchy from the file hierarchy with #[path = "somewhere/else.rs"] mod foo;. But it's a sufficiently ugly syntax as to discourage people from abusing it to create on-disk structures not matching the module structure

Mario Carneiro (Sep 14 2023 at 21:45):

but this has a high cost for incremental compilation and IDEs, because it means you have to read the whole project to find out where a given file like somewhere/else.rs actually lives with respect to the module structure

Shreyas Srinivas (Sep 14 2023 at 21:47):

Anyway, dotnet aside, I am happily using lean's current module system. If default.lean is coming back, I hope it remains optional.

I am guessing a project module hierachy view should not be as complicated for lean since there is a one-one mapping of files and modules.

Shreyas Srinivas (Sep 14 2023 at 21:50):

As an aside on "hiding file systems", this is the world we are headed towards: https://www.theverge.com/22684730/students-file-folder-directory-structure-education-gen-z

James Gallicchio (Sep 14 2023 at 21:51):

^it's already routinely a problem for first year CS students here LOL

James Gallicchio (Sep 14 2023 at 21:52):

but we beat the unix into them by year 2 :-)

Shreyas Srinivas (Sep 14 2023 at 21:56):

I guess in German-speaking countries it is hard to survive your first year exams if you can't figure out how to compile your programs.

Patrick Massot (Sep 14 2023 at 21:56):

Shreyas Srinivas said:

As an side on "hiding file systems", this is the world we are headed towards: https://www.theverge.com/22684730/students-file-folder-directory-structure-education-gen-z

You make it sound like future, but it's already there.

Patrick Massot (Sep 14 2023 at 21:57):

Maybe it's not fully there. I still see students who understand the question "where did you put the file you downloaded?". But they have no clue about the answer.

Shreyas Srinivas (Sep 14 2023 at 22:01):

Patrick Massot said:

"where did you put the file you downloaded?". But they have no clue about the answer.

I think this particular problem has been around forever (i.e. since browsers)

Shreyas Srinivas (Sep 14 2023 at 22:05):

Getting back to the point, a module hierarchy view would be nice to have for this reason alone: making life simpler for newcomers of the younger generation.

Mario Carneiro (Sep 14 2023 at 22:05):

a command to be able to open a file by module name would be nice

Mario Carneiro (Sep 14 2023 at 22:06):

the closest approximation to this is to type import Foo and then ctrl-click on Foo

Shreyas Srinivas (Sep 14 2023 at 22:07):

A quick google search for this landed me on this repo : https://github.com/hashicorp/vscode-terraform/pull/746

Shreyas Srinivas (Sep 14 2023 at 22:08):

the second result was vscode's tree-view API: https://code.visualstudio.com/api/extension-guides/tree-view

Shreyas Srinivas (Sep 14 2023 at 22:10):

Mario Carneiro said:

the closest approximation to this is to type import Foo and then ctrl-click on Foo

This could be implemented as a command that can be entered in the new search bar in vscode

Mario Carneiro (Sep 14 2023 at 22:12):

just as soon as we get completion for imports

Thomas Murrills (Sep 15 2023 at 03:00):

I just want to establish something quickly: what exactly are the use cases of a default.lean file in the first place (comprehensively)? Specifically, what are all the situations where someone would need manual control over which files in a folder are imported, instead of just making import Foo.* figure out on the fly how to import everything in Foo/ automatically and relying on that?

I can think of a couple possible partial answers:

  1. sometimes (when?) import order matters, and we get different behavior for different orders, so we need to specify the right one.
  2. someone might want to exclude an import for some reason.
  3. (?) eliminating redundant imports is useful (is it?) but nontrivial to compute over and over (is it?), so we effectively want a permanent cache of the “minimized” list of imports. (But lake seems to do this every build, so my feeling is that this might not be a real concern?)

Are there other (possibly more technical) purposes?

Mario Carneiro (Sep 15 2023 at 03:07):

One of the reasons rust doesn't do import Foo.* style imports is that directory traversals might be slow (you don't know what else is in the directory) or impossible (e.g. network drives or linux directories without the traversal permission set)

Mario Carneiro (Sep 15 2023 at 03:10):

(But lake seems to do this every build, so my feeling is that this might not be a real concern?)

see my ongoing battle to make lake less slow

Mac Malone (Sep 15 2023 at 05:39):

Thomas Murrills said:

  1. (?) eliminating redundant imports is useful (is it?) but nontrivial to compute over and over (is it?), so we effectively want a permanent cache of the “minimized” list of imports. (But lake seems to do this every build, so my feeling is that this might not be a real concern?)

Yes, as @Mario Carneiro notes this somewhat of a performance bottleneck for Lake. Lake needs to maintain order and eliminate duplicate imports (and packages!) and this means that when computing a import graph there currently is little ability to reuse work between modules (i.e., the import arrays and sets have to be reconstructed for every module).

Mac Malone (Sep 15 2023 at 05:45):

There are potential avenues for improvement here, but the underlying issue of needing to order and deduplicate still remains and is less than ideal, but somewhat inherit to Lean's current module system.

Yaël Dillies (Sep 15 2023 at 06:29):

Shreyas Srinivas said:

As an aside on "hiding file systems", this is the world we are headed towards: https://www.theverge.com/22684730/students-file-folder-directory-structure-education-gen-z

Interesting... I'm right in that generation but I literally never heard of that phenomenon.

Kevin Buzzard (Sep 15 2023 at 06:30):

Yeah but you're a power user

Mario Carneiro (Sep 15 2023 at 06:30):

how long have you been using computers? (i.e. programming or other power user things, not youtube and snapchat)

Kevin Buzzard (Sep 15 2023 at 06:31):

It's the people who don't know what a command line is that don't know anything about directory structure

Kevin Buzzard (Sep 15 2023 at 06:31):

And they are legion in mathematics departments

Yaël Dillies (Sep 15 2023 at 06:31):

I've been doing computer stuff since I was 11, so 10 years?

Mario Carneiro (Sep 15 2023 at 06:34):

if you've been doing more-than-youtube/facebook since you were 11 then you are definitely a power user

Damiano Testa (Sep 15 2023 at 08:04):

Thomas Murrills said:

I just want to establish something quickly: what exactly are the use cases of a default.lean file in the first place (comprehensively)? Specifically, what are all the situations where someone would need manual control over which files in a folder are imported, instead of just making import Foo.* figure out on the fly how to import everything in Foo/ automatically and relying on that?

I can think of a couple possible partial answers:

  1. sometimes (when?) import order matters, and we get different behavior for different orders, so we need to specify the right one.
    [...]

@Thomas Murrills, #6763 might be of interest for 1. Most of the files build by reordering alphabetically the imports, the main difference being time to build the files. One file had a massive slowdown, one file I was not able to get to work, unless some import appeared before some others.

Shreyas Srinivas (Sep 15 2023 at 09:08):

Wait, why can't import order be computed the usual way? Construct the DAG, complain about cycles, and perform topological sort? What is an example of import order affecting behavior beyond this?

Eric Wieser (Sep 15 2023 at 09:11):

Import order affects the order of declarations in the environment

Mac Malone (Sep 15 2023 at 09:13):

Also, it effects the order initializers are run, which can easily be designed in a non-communicative manner (since they have arbitrary stateful effects).

Mario Carneiro (Sep 15 2023 at 09:15):

declarations are in a hashmap, they aren't ordered

Mario Carneiro (Sep 15 2023 at 09:15):

in fact even within a file lean isn't able to remember what order declarations came in

Mac Malone (Sep 15 2023 at 09:16):

@Mario Carneiro But there is const2ModIdx to figure that out?

Mario Carneiro (Sep 15 2023 at 09:17):

oh, modules are ordered, constants are not

Mac Malone (Sep 15 2023 at 09:17):

Yes, but you can go from constant to module and sort constants by their module index.

Mario Carneiro (Sep 15 2023 at 09:17):

constants are in modules, but the order of constants within a module is random

Mac Malone (Sep 15 2023 at 09:18):

Oh, yeah.

Mac Malone (Sep 15 2023 at 09:18):

But, we were talking about module order here, right?

Mario Carneiro (Sep 15 2023 at 09:18):

I was reacting to

Eric Wieser said:

Import order affects the order of declarations in the environment

Mac Malone (Sep 15 2023 at 09:20):

I assumed that also included the order of declarations which appear in environment extensions which usually are ordered by module.

Mario Carneiro (Sep 15 2023 at 09:20):

I think you are right that the main cause of order dependence is because of initializers; in practice this usually manifests as pushing things to a vector or something in the initializer, and then that becomes the default sort order for things like typeclass inference

Mario Carneiro (Sep 15 2023 at 09:20):

environment extensions are ordered

Mario Carneiro (Sep 15 2023 at 09:21):

if you declare tags on three declarations those tags will show up in the relevant environment extension in source order

Mario Carneiro (Sep 15 2023 at 09:21):

but the declarations themselves will be shuffled

Mac Malone (Sep 15 2023 at 09:22):

I was thinking of things like tag attributes where the order of all tagged declarations in the array of all entries will be ordered by import.

Mac Malone (Sep 15 2023 at 09:22):

You beat me to it. :laughing:

Thomas Murrills (Sep 15 2023 at 09:34):

Hmm, ok! Thanks for the responses; here are some thoughts!

My feeling is that there are really two classes of needs/concerns here:

  • having a sort of permanent cache (more or less) of imports in a folder, so that we can avoid directory traversals and minimize import redundancy
  • being able to fine-tune imports, so that we can customize the selection of imported files and avoid timing issues (or is the goal to just make this not happen?)

The interesting thing about the permanent cache is that it could in most cases be computed periodically without human intervention and maintenance.

(Re: dealing with impossible directory traversals when computing such a cache: I’d hope that either 1) they’re pulling from a place which already has this permanent cache, and don’t need to ever make it/adjust it themselves 2) they can occasionally surmount the issues, e.g. by running some command with appropriate permissions 3) they’re rare enough to make this a low burden.)

Thomas Murrills (Sep 15 2023 at 09:43):

So, here are the things I’d personally like to be true.

  • Your average uncomputery mathematician could write import Foo and by default, it imports Foo/. (Or import Foo.*, but I think even glob syntax would not be intuitive to most; but let’s table that bikeshed for now, the point is that by default, without setup, there’s a way to do folder imports)

  • Anyone who needs to can override the default behavior as suits their needs.

I think we could achieve that in a few ways; here’s a sketch of some partial possibilities.

  • Each folder comes with a file, say .imports.lean (starts with ., is hidden from people who don’t need it), which is by default constructed (and reconstructed) automatically by some tool at some point. I don’t know the optimal way to fill in these blanks, but I think we could probably find a reasonable (recurring) time to do this.

    • It could also have restrictions to cordon off the difficult edge cases—since this would be intended for the average user who likely has a typical-looking project, maybe we only do this automatically when the directory is small; if the traversal starts to take too long, we demand a manually triggered reconstruction/update of the file instead of constructing it at an automatic time. (I’m not sure exactly when directory traversal starts to get slow, but I’d hope it’s not slow at least up to ~100 files…)

    • We could maybe put extra info in the file that helps us not do extra work, like remembering imports of each file in the directory so that we only recompute on changes

    • In general, “by lake, during build” is the obvious choice, but maybe there’s a better, less-obvious one; maybe only certain builds, or maybe a linter should request a manual update. Maybe this should be configurable to be different between larger and smaller projects, enabling “average mathematicians” to work without fuss, while not tying larger projects down.

  • Then, when we want to override the default for some reason, we could either…

    • …simply edit .imports.lean, and remove some marker content at the top which indicates to the tool that it should autogenerate it/overwrite it

    • …or create a new file _imports.lean (or whatever) that’s not hidden, and we enforce never having both this file and .imports.lean at once

    • …or, if import order is eventually meant to be irrelevant, add another file e.g. _importOverrides.lean (or whatever) which just specifies changes to the imports (I.e. extra included/excluded files). .imports.lean would then be reconstructed by the tool with this file in mind—the file itself would be otherwise “inert”. EDIT: wrote this part before the above discussion; looks like it’s unavoidably relevant in certain specific situations!

Anyway, thought I’d put these thoughts out there, just in case it helps refine what would be a good idea one way or another! :)

Shreyas Srinivas (Sep 15 2023 at 10:03):

how is the existing module system preserved?

Shreyas Srinivas (Sep 15 2023 at 10:09):

Does any currently well-used language other than C or C++ force programmers to import modules/files in some order? How does it work in haskell, which would also have typeclass inference issues?

Mario Carneiro (Sep 15 2023 at 10:37):

directory traversal starts to become problematic around 5000 files, maybe a bit less on windows

Mac Malone (Sep 15 2023 at 10:46):

Shreyas Srinivas said:

Does any currently well-used language other than C or C++ force programmers to import modules/files in some order?

Order of imports matters in almost every interpreted language, including Python, Ruby, and JavaScript/TypeScript, just to name a few.

Damiano Testa (Sep 15 2023 at 13:45):

For fun, I tried to see how I would implement the import Foo.* and I tried with importing a fixed file. However, with the code below, I get an error:

import Lean.Elab.Command

open Lean in
elab "myImport" : command => do
  let imp : List Import := ( getEnv).imports.toList ++ [{ module := `Mathlib.Data.Nat.Basic }]
  let newEnv :=  Lean.importModules imp ( getOptions)
  setEnv newEnv

myImport
/-
Lean server printed an error: libc++abi: terminating due to uncaught exception of type
lean::exception: cannot evaluate `[init]` declaration 'Std.CodeAction.cmdCodeActionExt'
in the same module
-/

I imagine that what I am doing is very crude: what is a better approach?

Scott Morrison (Sep 18 2023 at 00:23):

You're on an old version of Lean. This works fine for me, after changing imp to imp.toArray.

Damiano Testa (Sep 18 2023 at 02:23):

I first ran

$ elan self update
info: checking for self-updates
info: downloading self-update
info: elan updated successfully to 3.0.0

$ lean --version
Lean (version 4.1.0-rc1, commit 339615042d90, Release)

Then I pulled master on mathlib and ran

$ lake exe cache get!
Attempting to download 3765 file(s)
Downloaded: 3765 file(s) [attempted 3765/3765 = 100%] (100% success)
Decompressing 3765 file(s)
unpacked in 7688 ms

However, I still get an error with the code above (I did have to change List to Array):

Lean (version 4.1.0-rc1, commit 339615042d90, Release)
libc++abi: terminating due to uncaught exception of type lean::exception: cannot evaluate `[init]` declaration 'Std.CodeAction.cmdCodeActionExt' in the same module
[Error - 4:21:40 AM] Request textDocument/codeAction failed.
  Message: Server process for file:///home/damiano/Matematica/Lean4/mathlib4/Mathlib/adomaniLeanUtils/importStar.lean crashed, likely due to a stack overflow or a bug.
  Code: -32902

Scott Morrison (Sep 18 2023 at 04:08):

Hmm.. trying sprinkling enableInitializersExecution in your code?

Damiano Testa (Sep 18 2023 at 04:38):

I tried

elab "myImport" : command => do
  enableInitializersExecution
  let imp := ( getEnv).imports ++ #[{ module := `Mathlib.Data.Nat.Basic }]
  let newEnv :=  Lean.importModules imp ( getOptions)
  setEnv newEnv

but Lean does not like that enableInitializersExecution is `unsafe:

(kernel) invalid declaration, it uses unsafe declaration 'Lean.enableInitializersExecution'

Anyway, thank you very much for your suggestions!

Scott Morrison (Sep 18 2023 at 04:42):

Use unsafe from Std.Util.TermUnsafe.

Damiano Testa (Sep 18 2023 at 04:52):

Success! Thanks Scott! This works:

import Std.Util.TermUnsafe

open Lean in
elab "myImport" : command => do
  unsafe enableInitializersExecution
  let imp := ( getEnv).imports ++ #[{ module := `Mathlib.Data.Nat.Basic }]
  let newEnv :=  Lean.importModules imp ( getOptions)
  setEnv newEnv

myImport

#check Nat.commSemiring

Lean is clearly not too happy with revisiting imports. Currently, this may panic, but that's ok! Thanks a lot!

Damiano Testa (Sep 18 2023 at 08:48):

This "almost" works:

import Std.Data.List.Basic

syntax (name := importStar) "import*" "?"? (colGt ident) : command

open Lean in
elab_rules : command | `(command| import* $[?%$info]? $pat:ident) => do
  let imps : System.FilePath := "Mathlib.lean"
  let stripImp := ( IO.FS.lines imps).map (String.trim <| String.drop · 7)
  let withDots := stripImp.filter (String.isPrefixOf pat.getId.toString)
  let paths := withDots.map fun s =>
    let spl := String.splitOn s "."
    System.mkFilePath <| spl.dropLast ++ [(spl.last'.getD "") ++ ".lean"]
  let names :=  paths.mapM (moduleNameOfFileName . none)
  let newImports : Array Import := names.map ({ module := · })
  let oldAndNewImports := ( getEnv).imports ++ newImports
  if info.isSome then logInfo m!"{"\n".intercalate (withDots.map ("import " ++ ·)).toList}"
  let newEnv :=  importModules oldAndNewImports ( getOptions)
  unsafe enableInitializersExecution
  setEnv newEnv

Damiano Testa (Sep 18 2023 at 08:50):

Writing import* Mathlib.Data.ENat is equivalent to

import Mathlib.Data.ENat.Basic
import Mathlib.Data.ENat.Lattice

while import* Mathlib.Data.Nat.L
is equivalent to

import Mathlib.Data.Nat.Lattice
import Mathlib.Data.Nat.Log

The only issue is that, if you import the command in the previous message from a separate file, then there needs to be "something" between the last actual import and import*. For instance, this works

import Mathlib.Tactic.importStar        -- we import the `import*` command
import Mathlib.Data.MvPolynomial.Basic  -- random imports
import Mathlib.Data.Polynomial.Basic

section end  -- "something", `/-!-/` would also work as "something"
import* Mathlib.Data.Nat.L  -- this works.  It would not, without the previous `section end`.

Last updated: Dec 20 2023 at 11:08 UTC