Zulip Chat Archive

Stream: lean4

Topic: Weird Lake errors


Henrik Böving (Jul 03 2022 at 13:43):

I just migrated doc-gen4 to the 06-30 nightly on this branch: https://github.com/leanprover/doc-gen4/tree/update but had to do this blindly since both vscode and emacs kept shooting:

`/home/nix/.elan/toolchains/leanprover--lean4---nightly-2022-06-30/bin/lake print-paths Init DocGen4.Process DocGen4.Load DocGen4.IncludeStr DocGen4.Output DocGen4.LeanInk` failed:

stderr:
warning: Syntax for package dependencies has changed. Use the new `require` syntax.
warning: Syntax for package dependencies has changed. Use the new `require` syntax.

at me and refused to analyze anything in the file. Funnily enough if I execute this command myself:

λ /home/nix/.elan/toolchains/leanprover--lean4---nightly-2022-06-30/bin/lake print-paths Init DocGen4.Process DocGen4.Load DocGen4.IncludeStr DocGen4.Output DocGen4.LeanInk
{"srcPath":["./lean_packages/Cli/./.","./lean_packages/leanInk/./.","./.","./lean_packages/Unicode/./.","./lean_packages/mathlib/./.","./lean_packages/lake/./.","./lean_packages/CMark/./."],"oleanPath":["./lean_packages/Cli/./build/lib","./lean_packages/leanInk/./build/lib","./build/lib","./lean_packages/Unicode/./build/lib","./lean_packages/mathlib/./build/lib","./lean_packages/lake/./build/lib","./lean_packages/CMark/./build/lib"],"loadDynlibPaths":[]}

I can see no error message at all, I already tried to delete the build and lean_package directories etc. but nothing changed so I fixed the things using lake build + vim in the hopes that it will disappear after the build works again but nope, I'm still stuck in this state. What am I doing wrong?

Henrik Böving (Jul 03 2022 at 13:59):

To add to this error I get from my emacs I'm now also observing segfaults when running doc-gen4, even just against a lake new test project (with fitting compiler version of course):

#0  0x0000000003898ba2 in l_Lake_WfName_ofName ()
#1  0x00000000038e2ee8 in l_Lake_Package_loadUnsafe___lambda__1 ()
#2  0x00000000038e3cf3 in l_Lake_Package_loadUnsafe___lambda__2 ()
#3  0x00000000038e5290 in l_Lake_Package_loadUnsafe ()
#4  0x000000000386615f in l_Lake_loadWorkspace ()
#5  0x0000000003709a72 in l_DocGen4_lakeSetup ()
#6  0x00000000036feb62 in l_runDocGenCmd ()
#7  0x0000000005e6ab18 in lean_apply_2 ()
#8  0x0000000005e6afe9 in lean_apply_2 ()
#9  0x0000000005e6ad20 in lean_apply_2 ()
#10 0x000000000370064e in main ()

coming from this line: https://github.com/leanprover/doc-gen4/blob/update/DocGen4/Load.lean#L26 which used to work perfectly fine until this update. Am I doing something wrong here as well? Most likely @Mac will know

Mac (Jul 03 2022 at 21:23):

Henrik Böving said:

I just migrated doc-gen4 to the 06-30 nightly on this branch: https://github.com/leanprover/doc-gen4/tree/update but had to do this blindly since both vscode and emacs kept shooting:

This is very weird because there were no changes to the bundled Lake between the Lean version on main (06-24) and 06-30.

Henrik Böving (Jul 03 2022 at 21:24):

Can you at least reproduce it on your machine?

Mac (Jul 04 2022 at 01:12):

@Henrik Böving Yep, I was able to reproduce the segfault. I think the problem is clash between the Lake version distributed with Lean and the one you are depending on. You need to depend on the Lake version of your Lean version (i.e., 401e738e4ca989ced8d9bb0cf7f66be9133fc435 / tsyntax).

Mac (Jul 04 2022 at 01:13):

I would also advise trying to update to the latest nightly as opposed to an old one.

Henrik Böving (Jul 04 2022 at 07:07):

I'm always trying to update to the nightly that is in sync with the one of mathlib to avoid ABI issues when building mathlib.... testing the other lake commit, I thought I already got the right one :/

Henrik Böving (Jul 04 2022 at 07:11):

Okay that did indeed fix both issues. Do you have any idea why my emacs started throwing this error message at me? And how we could at least improve it in the future? This seems near impossible to debug without the correct guess.

Mac (Jul 04 2022 at 08:24):

Sadly, no. I have no idea why the bugs appeared, I was just able to figure out how to fix them by guessing that the version mismatch may be causing problems.

Mac (Jul 04 2022 at 08:35):

If we could narrow down the source of the problem though, that might help me figure out what it is and fix it. A #mwe would be quite helpful -- doc-gen4 is a big project with a lot of dependencies and many moving parts which make it hard to debug what part of the build system is failing (and if that is even the source of the problem).

Mac (Jul 04 2022 at 08:43):

One of the problems I did have when inspected doc-gen4 was that I got a n error saying that the Lake.olean object file did not exists. This one I do know the cause of and it was already fixed (hopefully) in v3.2.0. The problem is the Lake would use the workspace's Lake module to evaluate the lakefile rather than the builtin one. For instance, it would use the Lake required by doc-gen4 rather than the Lake in the toolchain, which could cause all kinds of errors. That may also be the source of your editor woes, but I am not sure (since your error looks different).

Mac (Jul 04 2022 at 08:48):

Note: Doc-gen4 may also have this bug as it initializes the search path using initSearchPath which similarly prioritizes user modules over builtin ones.

Mac (Jul 04 2022 at 08:49):

It also uses findSysroot rather than just using the sysroot of the Lean found by Lake(i.e., lean.sysroot) which is a bit weird.

Mac (Jul 04 2022 at 08:51):

More generally, there is no reason to initialize the search path in doc-gen4 anyway since you call out to Lake.loadWorkspace which already initializes the search path itself.

Henrik Böving (Jul 04 2022 at 08:52):

The main reason I started using Lake as a library instead of calling to it as a foreign binary was that I wanted to add the links to the source of the libraries which required me to interact with the Lake project structure in order to figure out the remote urls (here: https://github.com/leanprover/doc-gen4/blob/main/DocGen4/Output/SourceLinker.lean). If I could somehow get this information without depending on Lake.loadWorkspace to give me the workspace I can fetch this information from I would gladly avoid pulling in Lake as yet another dependency.

I'll remove the initSearchPath then, thank you^^

Mac (Jul 04 2022 at 08:53):

Also, sorry for the stream of consciousness critique. I don't mean to be critical. I am just noting down oddities as I see them in a (hopefully not vain attempt) to stumble upon whatever may be the bug.

Henrik Böving (Jul 04 2022 at 08:57):

No worries, my interactions with lake have always been on a eh... "it works, commit it" basis so its good if someone cleans it up :D

Mac (Jul 04 2022 at 08:58):

Henrik Böving said:

to figure out the remote urls

One super low-tech, hacky way to do this without Lake would be to just iterate over the directories in lean_packages and capture the output of git remote get-url origin. However, using Lake for this is probably a more principled solution. On the other hand, it may be worth it to create some Lake CLI command (ala print-paths) that outputs a JSON blob containing the information you want from Lake.

Henrik Böving (Jul 04 2022 at 08:59):

Now, removing the initSearchPath call causes the CI to fail: https://github.com/leanprover/doc-gen4/runs/7177244361?check_suite_focus=true because it cannot find Mathlib as a library so that did not work out.

Sebastian Ullrich (Jul 04 2022 at 09:00):

Mac said:

Henrik Böving said:

to figure out the remote urls

One super low-tech, hacky way to do this without Lake would be to just iterate over the directories in lean_packages and capture the output of git remote get-url origin. However, using Lake for this is probably a more principled solution. On the other hand, it may be worth it to create some Lake CLI command (ala print-paths) that outputs a JSON blob containing the information you want from Lake.

What part of that is missing from the new manifest.json?

Henrik Böving (Jul 04 2022 at 09:02):

Henrik Böving said:

Now, removing the initSearchPath call causes the CI to fail: https://github.com/leanprover/doc-gen4/runs/7177244361?check_suite_focus=true because it cannot find Mathlib as a library so that did not work out.

It's also noting that it is loading its oleans from here:

Loading modules from: [/home/runner/work/doc-gen4/doc-gen4/build/lib, /home/runner/.elan/toolchains/leanprover--lean4---nightly-2022-06-30/lib/lean]

So it somehow remembered where it was built in even.

Henrik Böving (Jul 04 2022 at 09:05):

Sebastian Ullrich said:

Mac said:

Henrik Böving said:

to figure out the remote urls

One super low-tech, hacky way to do this without Lake would be to just iterate over the directories in lean_packages and capture the output of git remote get-url origin. However, using Lake for this is probably a more principled solution. On the other hand, it may be worth it to create some Lake CLI command (ala print-paths) that outputs a JSON blob containing the information you want from Lake.

What part of that is missing from the new manifest.json?

True, that does seem to contain almost the informations I need. There is one thing missing though, in my source linker I perform this check: https://github.com/leanprover/doc-gen4/blob/main/DocGen4/Output/SourceLinker.lean#L78 which attempts to find which lake package a certain module came from in order to use the correct remote URL, that's missing still.

Henrik Böving (Jul 04 2022 at 09:20):

Also if i end up parsing this JSON I have an other more general question: Both with LeanInk and Lake I would end up parsing some JSON files. This is just some data stuff but I'd hate to replicate the code from both other libraries to doc-gen which is also the reson it is pulling in LeanInk as a foreign library right now (the actual LeanInk work is done by a real LeanInk binary). Do we have some nice solution to this idea of just importing the data representation of something? Maybe splitting up libraries?

Sebastian Ullrich (Jul 04 2022 at 09:24):

You only really pay for the imports you actually use from a dependency, so moving those data structures into separate modules should be sufficient I think

Mac (Jul 04 2022 at 19:38):

Henrik Böving said:

There is one thing missing though, in my source linker I perform this check: https://github.com/leanprover/doc-gen4/blob/main/DocGen4/Output/SourceLinker.lean#L78 which attempts to find which lake package a certain module came from in order to use the correct remote URL, that's missing still.

How about a lake graph <mods> that prints out all the information about the dependency graph of mods (package, imports, source file, olean file, ilean file, etc.). Does that sound like a generally useful tool?

Henrik Böving (Jul 04 2022 at 19:50):

Definitely yeah, we could even have some tool that processes this (presumably JSON?) output to a dot file or w/e to visualize dependencies

Mac (Jul 04 2022 at 23:31):

Henrik Böving said:

I'll remove the initSearchPath then, thank you^^

I just realized gave you bad advice here. Lake does initialize the search path itself. However, it does not initialize it with the workspace's olean directories, which is what you want. So you do need to initialize the search path yourself. Using initSearchPath, though can lead to the problem I mentioned (prioritizing user modules over built-in ones) so you will need to rolll your own like so:

let mut sp : Lean.SearchPath := []
sp  addSearchPathFromEnv sp
sp := lake.oleanDir :: lean.oleanDir :: sp
sp := ws.oleanPath ++ sp
searchPathRef.set sp

Henrik Böving (Jul 05 2022 at 11:31):

Mac said:

Henrik Böving said:

There is one thing missing though, in my source linker I perform this check: https://github.com/leanprover/doc-gen4/blob/main/DocGen4/Output/SourceLinker.lean#L78 which attempts to find which lake package a certain module came from in order to use the correct remote URL, that's missing still.

Regarding this, I also have another idea that lake might be able to address. In doc-gen4 and by now also in other projects my include_str macro for statically reading a file at compile time into a static Lean string has spread. But the issue to this day remains that the Lean compiler has of course no clue that this file is part of its dependency chain and thus the lean files that include it should be rebuilt. Could we maybe have some mechanism to explicitly add nodes to the dependency graph? So in this case e.g. tell Lake that if style.css is more recently touched than StaticStrings.lean rebuild StaticStrings.lean?....although at this point we are basically a full blown make :D
How about a lake graph <mods> that prints out all the information about the dependency graph of mods (package, imports, source file, olean file, ilean file, etc.). Does that sound like a generally useful tool?

Jannis Limperg (Jul 05 2022 at 12:14):

(Template Haskell has addDependentFile etc. to address this issue.)

Mac (Jul 05 2022 at 18:51):

Henrik Böving said:

But the issue to this day remains that the Lean compiler has of course no clue that this file is part of its dependency chain and thus the lean files that include it should be rebuilt.

And that issue is logged here: https://github.com/leanprover/lake/issues/86 :big_smile:

Wojciech Nawrocki (Aug 07 2022 at 10:23):

Hi @Mac , it looks like in Lake 4.0 there is no target property on targets. How can I use another target as a dep now, as in here?

Mac (Aug 07 2022 at 22:50):

@Wojciech Nawrocki That is because that entire notion of targets no longer exists (including FileTarget).

Mac (Aug 07 2022 at 22:53):

See the updated FFI example for example as to how to use another target as a dep. (You build it with fetch <| pkg.target <target-name>).

Mac (Aug 07 2022 at 23:01):

From looking at your example, I realized that you really needed the feature mentioned in your #86 comment. Sadly, it is not there yet as I did not realize it was that pivotal. I will try to implement it 2 weeks from now when I have some free time (and will be trying to ensure everything is ship-shape for Lake).

Wojciech Nawrocki (Aug 08 2022 at 07:28):

Mac said:

See the updated FFI example for example as to how to use another target as a dep. (You build it with fetch <| pkg.target <target-name>).

I saw that example, but unfortunately it does not work. We get the following error

def tsxTarget (pkg : Package) (tsxName : String) : IndexBuildM (BuildJob FilePath) := do
  let data  fetch (pkg.target ``packageLock)
  let deps : Array (BuildJob FilePath) := #[
    /- application type mismatch
      List.cons data
    argument
      data
    has type
      CustomData (Package.name pkg, `packageLock) : Type
    but is expected to have type
      BuildJob FilePath : Type -/
    data
  ]

It does work if I put this code inside a target rubiksTarget (pkg : Package) : FilePath, but then I no longer have a single and reusable tsxTarget which is polymorphic over the tsxName : Name of the file to build. I think the reason why target rubiksTarget works is thanks to the custom elaborator which inserts the right value for pkg that makes the fetch open family give the right type. This brings me to, not to bash your work as overall Lake is great, but I'm sorry to say I think the open family API is a mistake. It uses really advanced language features in a user-facing part of a build system, and it's guaranteed to cause lots of confusion, as well as errors like the one we see above. The API in Lake 3 wherein we simply got a BuildTarget FilePath (or whatever wrapper around a concrete type FilePath) out of the plain def created by target packageLock was easier to use.

Mac (Aug 08 2022 at 21:21):

Wojciech Nawrocki said:

The API in Lake 3 wherein we simply got a BuildTarget FilePath (or whatever wrapper around a concrete type FilePath) out of the plain def created by target packageLock was easier to use.

The problem with this is that API was broken. It was not (and could not be) properly integrated with the build store so the same target could end up being rebuilt multiple times (and separate builds could race with each other). It only worked in its very limited application because the targets being used (e.g., extern_lib, extraDepTarget) were not meant to depend on each other.

However, in your case, you were already violating this assumption. So, if the library was building multiple TSX targets at once, it would attempt to build packageLock multiple times simultaneously, which could cause problems. Even if bugs did not emerge, such rebuilds also are performance hits.

Mac (Aug 08 2022 at 21:24):

Wojciech Nawrocki said:

It uses really advanced language features in a user-facing part of a build system, and it's guaranteed to cause lots of confusion, as well as errors like the one we see above.

It should be noted that custom targets are not really meant to be a novice user-facing part of the system. After all, writing them requires looking into Lake's source code and understanding the internal Build API (or, previously, the Target API). It is meant for advanced users who want as much power as they can get.

Mac (Aug 08 2022 at 21:29):

Wojciech Nawrocki said:

I think the reason why target rubiksTarget works is thanks to the custom elaborator which inserts the right value for pkg that makes the fetch open family give the right type.

You are mostly correct, the problem with your general solution is that it does not necessarily ensure that the pkg is the right package, which the target does ensure. You can however replicate this by mirroring the target's function signature. That is, add a [Fact (pkg.name = _package.name)] argument to tsxTarget.

Mac (Aug 08 2022 at 21:38):

Wojciech Nawrocki said:

The API in Lake 3 wherein we simply got a BuildTarget FilePath (or whatever wrapper around a concrete type FilePath) out of the plain def created by target packageLock was easier to use.

Note that target still generates just as plain of a definition as the previous Lake 3 target command did. It is still just a macro, no custom elaboration shenanigans.

Mac (Aug 08 2022 at 21:46):

Mac said:

That is, add a [Fact (pkg.name = _package.name)] argument to tsxTarget.

Fyi, in case you are curious as to why you need to do this. The problem is rather straightforward: the target in question (e.g., packageLock) is only available on package you are defining, not other packages (e.g., dependencies, dependents) -- thus you need to ensure the package being passed to tsxTarget is the current package, not some other package (and this is what the Fact does).

Mac (Aug 08 2022 at 22:18):

Wojciech Nawrocki said:

not to bash your work as overall Lake is great

No worries. Feel free to critique away! (Though, I will, of course, not always agree.)

Wojciech Nawrocki (Aug 08 2022 at 22:18):

It should be noted that custom targets are not really meant to be a novice user-facing part of the system.

Let me push back a bit on this. Being able to write down a few commands and state "these produce file foo", then another few commands and state "these produce bar but make sure foo is built first" is the most basic functionality of a build system. This is trivial to do in Make and it should be easy in Lake. All the nice features for auto-building Lean modules, shared libraries and so on are conceptually sugar on top of this. I totally understand that there are engineering constraints such as the target duplication you mention that can make (hah) a good balance difficult to achieve, but I think we should err on the side of simplicity.

You can however replicate this by mirroring the target's function signature. That is, add a [Fact (pkg.name = _package.name)] argument to tsxTarget.

So to ensure that the pkg argument I get is my package, I need to synthesize a propositional equation through typeclass inference so that more typeclass inference can use a value-indexed dependent type family to synthesize the right type for the path of my file? I'm sorry, I just think this is way too complicated. Setting up a build script for a project is often the first step one takes in trying out a new language, and it is easy to get discouraged - I know I have. I don't think doing this should require a PhD in type theory.

you need to ensure the package being passed to tsxTarget is the current package, not some other package (and this is what the Fact does)

I understand, thank you for explaining. My point is mostly about the mechanism by which this is ensured.

So far this was not very constructive. I don't understand the internals enough to really propose something, but perhaps:

  • could we have a global __package__, like the global __dir__? There is some disconnect in that the latter is global, whereas the former is given as an argument.
  • whatever the API, could we hide the name-indexing families as an implementation detail and keep only concrete Types (of course modulo monadic wrappers which are unavoidable) in the user-facing API? Hopefully I highlighted above why simplicity seems important to me here.

And let me say again -- thank you for all your work spent on Lake! It really does work very well, and was clearly a lot of effort. It is just this one corner I am complaining about :)

Mac (Aug 08 2022 at 22:25):

Wojciech Nawrocki said:

  • could we have a global __package__, like the global __dir__? There is some disconnect in that the latter is global, whereas the former is given as an argument.

No, this is impossible, because the Package object can only be constructed after the whole Lakefile is elaborate (and dependencies are resolved, etc.). The pattern here is that of dependency injection, which is a very common pattern for systems with dynamic dependencies like a build system.

Wojciech Nawrocki said:

  • whatever the API, could we hide the name-indexing families as an implementation detail and keep only concrete Types (of course modulo monadic wrappers which are unavoidable) in the user-facing API? Hopefully I highlighted above why simplicity seems important to me here.

This is specifically what target is intended to do, hide the ugliness.. You broke this by defining a separate def (such parameterization is non-trivial even in make). I am still not sure what you mean by concrete Types, everything here is concrete (i.e., target has a concrete signature)?

Mac (Aug 08 2022 at 22:33):

Wojciech Nawrocki said:

Being able to write down a few commands and state "these produce file foo", then another few commands and state "these produce bar but make sure foo is built first" is the most basic functionality of a build system. This is trivial to do in Make and it should be easy in Lake.

I agree. However, the arbitrary build system portion of Lake was not initially its primary purpose, and is essentially a new feature implemented this summer. It is thus very much in its infancy and thus missing many niceties. I plan on improving this part as more use cases are found and better idea of what is desired is established.

Thus, the following statement is not really true:

Wojciech Nawrocki said:

All the nice features for auto-building Lean modules, shared libraries and so on are conceptually sugar on top of this.

In fact, as originally designed, the arbitrary build support was largely a hack to support a few specific use cases -- the building Lean modules was the core system. It is only with the Lake v4 overhaul that building Lean modules are really a convivence layer on top of the arbitrary builds.

Wojciech Nawrocki (Aug 08 2022 at 22:42):

This is specifically what target is intended to do, hide the ugliness.. You broke this by defining a separate def (such parameterization is non-trivial even in make).

Here is a Makefile doing what I would like to:

.PHONY: all
all: dist/rubiks.js

package-lock.json: package.json
        npm install

dist/%.js: src/%.tsx package-lock.json rollup.config.js tsconfig.json
        npm run build -- --tsxName $(basename $(notdir $@))

I am still not sure what you mean by concrete Types, everything here is concrete (i.e., target has a concrete signature)

I wasn't specific enough, sorry. I meant a dependent type, basically. In particular opaque CustomData (target : Name × Name) : Type which appeared in the error message I originally got.

Mac (Aug 08 2022 at 22:43):

Wojciech Nawrocki said:

I'm sorry, I just think this is way too complicated. Setting up a build script for a project is often the first step one takes in trying out a new language, and it is easy to get discouraged - I know I have. I don't think doing this should require a PhD in type theory.

I am not sure I agree with this framing. Most arbitrary build systems are not what I call easy-to-use for beginners (Rake, CMake, Make, Shake, etc.). Generally, there is a simple default template beginners use and then very complicate build magic that experts use. Just look at Lean's CMake configuration -- not something I think a beginner would ever be able to comprehend.

Now, I don't think this is necessarily good, so I would very much like to make Lake as user-friendly as possible. But, I do not think arbitrary build scripts are generally meant for beginners, that is what the pretty declarative library and executable configurations are for. The build scripts are meant to be written by users with a strong grasp of Lean who need specific build magic on their unique project (like you do, with the complicated magic of widgets).

Wojciech Nawrocki (Aug 08 2022 at 22:44):

However, the arbitrary build system portion of Lake was not initially its primary purpose, and is essentially a new feature implemented this summer. It is thus very much in its infancy and thus missing many niceties.

Totally understandable! This is not easy. I think my frustration here is mostly about the fact that something which used to work now doesn't seem to, insofar as I don't think I would have been able to figure this out without your help.

Mac (Aug 08 2022 at 22:48):

Wojciech Nawrocki said:

Here is a Makefile doing what I would like to

Yes, your particular case can be done concisely in Make because of its % magic for file names. However, were the input and output file names not directly related in this manner, parameterization would not be so easy. Which is kind of my point, build systems generally introduce some magic to make common applications easier, but once you leave these bounds, things start getting very complex.

Mac (Aug 08 2022 at 22:52):

Wojciech Nawrocki said:

I think my frustration here is mostly about the fact that something which used to work now doesn't seem to, insofar as I don't think I would have been able to figure this out without your help.

I am still surprised you figured out how to get it working with the old API. The .target part of a target was not documented (and was not really meant to be used the way you did), so you kind of lucked into a solution. That is, you were doing things that weren't intended to be supported in the first place. :stuck_out_tongue_wink:

Mac (Aug 08 2022 at 23:02):

Wojciech Nawrocki said:

So to ensure that the pkg argument I get is my package, I need to synthesize a propositional equation through typeclass inference

One alternative would be to use a Package type parameterized by the package name (e.g., TPackage name), which might be a bit more clear? The package name does need to statically known to be correct, there is no way to avoid that.

Wojciech Nawrocki (Aug 08 2022 at 23:05):

Could we replace (at least in the user-facing API) fetch with something like buildTargetOn (t : TargetDecl a) (pkg : Package) : M a which works in a plain def as well as in a target? Note that this requires indexing TargetDecl by the output type, but it drops the name-indexed state injected by target.

Wojciech Nawrocki (Aug 08 2022 at 23:06):

If that fails at runtime given the wrong Package, that's fine with me.

Mac (Aug 08 2022 at 23:09):

Wojciech Nawrocki said:

If that fails at runtime given the wrong Package, that's fine with me.

I very much don't like this. If a Package configuration is successfully loaded its builds should be well-formed (unless the user did something evil). The point of a strongly typed language is to avoid such dynamic checking runtime errors.

Wojciech Nawrocki (Aug 08 2022 at 23:15):

You are probably right! I am too tired to generate any good ideas now so I will head off, but thank you for the discussion so far.

Mac (Aug 08 2022 at 23:19):

Wojciech Nawrocki said:

Could we replace (at least in the user-facing API) fetch with something like buildTargetOn (t : TargetDecl a) (pkg : Package) : M a which works in a plain def as well as in a target? Note that this requires indexing TargetDecl by the output type, but it drops the name-indexed state injected by target.

Note that fetch is not just used for fetching custom package targets,, so this would not be a replacement.

To the main question, sadly, no, for multiple reasons. First, TargetDecl is specifically designed not to be parameterized by type (because it needs to be loaded via evalConstCheck in the package loader). Second, the package still needs to be verified to be the right one (because that is how the result type in the build store is determined).

It is possible to do what you want via a runtime check and a per-target projection (which could be generated by the target macro). For example,

def packageLock.fetchFor (pkg : Package) : IndexBuildM (BuildJob FilePath) := do
  if h : pkg.name = _package.name then
    have : Fact (pkg.name = _package.name) := (h)
    fetch <| pkg.target `packageLock
  else
    error "wrong package"

Mac (Aug 08 2022 at 23:23):

Honestly, to me, what your example really wants is just some syntactic sugar for parameterized targets (ala % in Make).

Wojciech Nawrocki (Aug 09 2022 at 17:40):

It would be useful! I made a feature request for it, but since the workaround works, please consider it low priority.

Wojciech Nawrocki (Aug 11 2022 at 16:51):

Okay, I have been thinking a little about this. I would say parametric targets would already make everything much easier. The other thing I am still hung up on is the mysterious signature of fetch whose output type depends on a separate declaration and is only synthesized through typeclass inference. It seems like the root cause is this:

First, TargetDecl is specifically designed not to be parameterized by type (because it needs to be loaded via evalConstCheck in the package loader). Second, the package still needs to be verified to be the right one (because that is how the result type in the build store is determined).

Could you say a bit more about why the evalConst is necessary? Could we have a typed wrapper around TargetDecl which is not itself evalConst-able but which keeps the target's output type around? It seems to me that having target abc : Foo generate a TTargetDecl Foo, or TTargetDecl Foo `thisPkg would be exactly the thing we'd need to simplify fetch into something like def fetch (pkg : TPackage `thisPkg) (tgt : TargetDecl Foo `thisPkg) : IndexBuildM Foo. (Note: It seems TPackage might also be a good idea, so I was already assuming it here.)

Wojciech Nawrocki (Aug 11 2022 at 16:55):

I very much don't like this. If a Package configuration is successfully loaded its builds should be well-formed (unless the user did something evil). The point of a strongly typed language is to avoid such dynamic checking runtime errors.

Come to think of this, is it even possible to load another package's lakefile? Suppose that in any given lakefile, the (pkg : Package) part of a target could not possibly refer to anything other than the current package. Then making that a privileged special case would be quite natural, and a runtime failure would be impossible. (It would also avoid the TPackage/TTargetDecl from my post above.)

Wojciech Nawrocki (Aug 11 2022 at 16:57):

A short way to put what I am trying to get at: from the build system's perspective, it is important to keep track of target/package IDs. But as a user, I really don't want to care about this, and trying to come up with ways to hide it.

Mac (Aug 11 2022 at 16:58):

Wojciech Nawrocki said:

Come to think of this, is it even possible to load another package's lakefile?

Of course, this is what require does. You can also build targets from dependencies (and even dependents) in local packages (for example, one might wish to build the docs facet contributed by doc-gen4).

Mac (Aug 11 2022 at 17:00):

Wojciech Nawrocki said:

A short way to put what I am trying to get at: from the build system's perspective, it is important to keep track of target/package IDs. But as a user, I really don't want to care about this, and trying to come up with ways to hide it.

This stuff is very important for cross-package targets, which may not be of use to you, but does have its uses elsewhere.

Wojciech Nawrocki (Aug 11 2022 at 17:01):

Poor wording, sorry. I get that facets support being built on any module/library/etc, even in other packages. But do targets?

Wojciech Nawrocki (Aug 11 2022 at 17:02):

In particular, because of the typeclass constraint pkg.name = _package.name, I couldn't possibly build a dependency's target with/on my own Package, right?

Mac (Aug 11 2022 at 17:02):

Wojciech Nawrocki said:

Could you say a bit more about why the evalConst is necessary?

Because that is how Lake loads targets from configuration files. It loads TargetDecl defs marked @[target] from the configuration environment using evalConstCheck. I imagine widgets do something similar with their widget declarations.

Wojciech Nawrocki (Aug 11 2022 at 17:03):

I think I understand this now, thank you.

Mac (Aug 11 2022 at 17:04):

Wojciech Nawrocki said:

In particular, because of the typeclass constraint pkg.name = _package.name, I couldn't possibly build a dependency's target with/on my own Package, right?

Not on the passed package. no. But you can easily search for other packages and build targets on them via findPackage?. Or find targets directly via findTarget?.

Wojciech Nawrocki (Aug 11 2022 at 17:09):

If I use findPackage?, how would I satisfy the constraint pkg.name = something? The output type is just Option Package with no proof about the name.

Mac (Aug 11 2022 at 17:10):

You'd have to do a runtime check for proper typing. (Avoiding this would be another of the benefits having Packages parameterized by name.)

Mac (Aug 11 2022 at 17:11):

However, you could still build the target and get back an opaque job.

Mac (Aug 11 2022 at 17:13):

One reason why Packages have not already been refactored to include the name in the type, is that there are many places in the Lake codebase where this is not desirable, so this would introduce back and forth coercions in a number of places.

Wojciech Nawrocki (Aug 11 2022 at 17:18):

Got it. So to come back to my main question, do you see any path towards simplifying fetch to get its output type a from a TTargetDecl a?

Wojciech Nawrocki (Aug 11 2022 at 20:18):

I had a much simpler idea. I can see that transferring bundled data of arbitrary type between build jobs is tricky. What would entirely suffice here instead is if I could tell Lake "You know that other FilePath you know how to build? Please make sure it's built first before building this other one, i.e. please make it a dependency of the other". It would need to support substituting both "the current target when we're in IndexBuildM" as well as "some target that Lake defines implicitly, as in a specific .olean" for "this other one". This would also basically solve the part of lake#86 that doesn't depend on custom elaborators communicating data to Lake.


Last updated: Dec 20 2023 at 11:08 UTC