Zulip Chat Archive

Stream: lean4

Topic: include_str and lake dependency tracking


Joachim Breitner (Aug 26 2023 at 13:21):

I see that mathlib defines an include_str macro. Upon first glance it seems that lake will be unaware of that dependency, and may not recompile build products when (only) the included file change changes – is that right? Or is there some magic behind the scenes where lake will know about this dependency?

Is that something that we should improve at some point?

I like that usually, the dependency graph of lean is rather straight forward to determine, just by parsing the initial bits of the files. This is a good kind of simplicity.

If we want to stick to that, it would require that include_str isn’t just arbitrary IO happening at elab time, but that all imported files (lean or text) are included in the header where the build tool can find them. This might suggest a top-level import $ident : ByteArray := $filepath command that must be part of the imports, to allow you to import not just .lean files. Maybe even extensible to other formats – importing a .json file as data can be very useful as well (TypeScript allows that, for example.)

Mario Carneiro (Aug 26 2023 at 13:22):

lake has no idea about this dependency

Patrick Massot (Aug 26 2023 at 13:23):

I've seen that used only to include javascript code for widgets.

Mario Carneiro (Aug 26 2023 at 13:23):

Another case where lake should really be made aware of the dependency is Std.Util.Pickle and its use for the library_search discrimination tree

Mario Carneiro (Aug 26 2023 at 13:24):

One way to handle this kind of thing would be additional directives in the header, like export or import "file.foo"

Mario Carneiro (Aug 26 2023 at 13:24):

they have to be in the header and not just randomly in lean code because lake parses the header to determine the dependency structure before elaboration has started

Joachim Breitner (Aug 26 2023 at 13:25):

Right, that’s my suggestion as well

Joachim Breitner (Aug 26 2023 at 13:26):

I could imagine something like this:

import Some.Lean.Module
import widgetCode : String := "./widget.js"
import favIcon : ByteArray := "./icon.png"

lake woudn’t care about the name for the imported thing, nor its type.

Mario Carneiro (Aug 26 2023 at 13:27):

I don't think it should be specifically import foo : ByteArray := "bla.png" because that's not very extensible, that's just include_str literally

Mario Carneiro (Aug 26 2023 at 13:27):

I would prefer an option that decouples the design of this feature (to be consumed by lake) from the actual processing (which could be whatever user code)

Joachim Breitner (Aug 26 2023 at 13:28):

So you’d say

import Some.Lean.Module
imports "./widget.js"

to just declare the dependency, and then later, in the body, use import_str "./widget.js"?

Mario Carneiro (Aug 26 2023 at 13:28):

yes

Joachim Breitner (Aug 26 2023 at 13:30):

Would import_str fail if it wasn’t declared? and won’t people be annoyed by having to write the file name twice?
Ah, but I see what you mean with “not extensible”, maybe you want to do elab-type processing of the file content before turning it into a lean value.

Mario Carneiro (Aug 26 2023 at 13:30):

because it might not be import_str, it could also be initialize foo <- IO.readFile "./widget.js" or initialize foo <- unpickle "./widget.js"

Mario Carneiro (Aug 26 2023 at 13:30):

No it wouldn't fail, it would just not get dependency tracked as today if you didn't declare it

Mario Carneiro (Aug 26 2023 at 13:31):

we could have a linter to try to detect this but it's pretty hard to reliably determine

Joachim Breitner (Aug 26 2023 at 13:32):

I’d be inclined to forbid it; it can be really bad to have unreliable recompilation.

Mario Carneiro (Aug 26 2023 at 13:32):

That's really not an option with the design of lean

Mario Carneiro (Aug 26 2023 at 13:32):

elaboration is not sandboxed

Mario Carneiro (Aug 26 2023 at 13:33):

For sure, it's not recommended, but it sounds pretty sci-fi to actually prevent it

Mario Carneiro (Aug 26 2023 at 13:33):

besides, users might not even be using lake or the dependency management might not be relevant

Joachim Breitner (Aug 26 2023 at 13:33):

Fair enough. Maybe I should say “not support” then, i.e. if you do and it breaks, you get to keep the pieces.

Mario Carneiro (Aug 26 2023 at 13:33):

like if they are just playing around with #eval

Patrick Massot (Aug 26 2023 at 14:34):

We could still improve include_str so that it checks the file has been declared.

Mac Malone (Aug 27 2023 at 14:39):

If you are okay declaring the dependency manually, you can still achieve this in current Lake by adding the included file as an inputFile target to the library 's extraDepTargets.

Patrick Massot (Aug 27 2023 at 15:02):

It's good to know this is already possible, but this is happening even further away from the use site, so I think we should still aim for the above mechanism.

Joachim Breitner (Aug 27 2023 at 15:04):

If the library’s extraDepTargets, will it add it as a dependency to all files of that library?

Mac Malone (Aug 27 2023 at 16:57):

Yes. :(

Mario Carneiro (Aug 27 2023 at 21:26):

yeah, that's not really a solution, what we want is a way to declare that individual files in a lib have specific additional dependencies (or outputs)

Mario Carneiro (Aug 27 2023 at 21:28):

lake's architecture seems sufficiently flexible that this should be possible via some kind of declaration in the lakefile, but like Patrick says it would be much preferable for the declaration to go in the file itself

Chris Wong (Sep 26 2023 at 03:17):

Joachim Breitner said:

Would import_str fail if it wasn’t declared? and won’t people be annoyed by having to write the file name twice?
Ah, but I see what you mean with “not extensible”, maybe you want to do elab-type processing of the file content before turning it into a lean value.

We could have the imports statement introduce a FileToken into scope, and change include_str to accept that token instead of a plain filename


Last updated: Dec 20 2023 at 11:08 UTC