Zulip Chat Archive

Stream: lean4

Topic: using file paths relative to command caller


Jakob von Raumer (Dec 26 2023 at 15:57):

There seems to have been some change in what the directory is in which lake dependencies are executed: I have a package A which provides a command that involves loading and reading a file via

def elabThatCommand (s : String) : M Foo := do
  let filePath : System.FilePath := s
  let foo  IO.FS.readFile filePath
  ...

Now theres a package B that uses that command with a file that lies in the root dir of B and I give the path relativ to B's root dir. That works fine whenever I open B in VSCode or when I build B. But when I try to add B as a dependency to a package C, the build process of C cannot built B anymore because it can't find the file, apparently because it runs the IO.FS.readFile from C's root dir and not Bs. Has that changed? How can I make that more robust?

Jakob von Raumer (Dec 26 2023 at 16:07):

/cc @Mac Malone

Mario Carneiro (Dec 26 2023 at 16:24):

mathlib uses docs#Mathlib.getMathlibDir for this

Jakob von Raumer (Dec 26 2023 at 16:36):

Which in turn seems to be using initSrcSearchPath to go off of... Ideally I'd want the snippet above to work with both absolute and relative paths and I don't have that name of the package B at compile time of package A and neither do I want the user to need to specify which package it calls elabThatCommand from :confused:

Mario Carneiro (Dec 26 2023 at 16:47):

Shouldn't B be able to provide its path or otherwise identify itself to elabThatCommand? There are various ways you can make this less onerous by making elabThatCommand a macro or elab and gather some compile time info about the elaborating environment that way

Mario Carneiro (Dec 26 2023 at 16:48):

basically, don't assume you know what the CWD is

Mario Carneiro (Dec 26 2023 at 16:49):

use other methods to locate files you want to load at runtime

Jakob von Raumer (Dec 26 2023 at 17:51):

It's just a bit annoying that this worked fine up to Lean 4.1 and now it doesn't...

Mario Carneiro (Dec 26 2023 at 17:53):

I don't know of any reason this would be different on older versions of lake, this has always been the case with lake AFAIK

Jakob von Raumer (Dec 26 2023 at 17:54):

Mario Carneiro said:

Shouldn't B be able to provide its path or otherwise identify itself to elabThatCommand? There are various ways you can make this less onerous by making elabThatCommand a macro or elab and gather some compile time info about the elaborating environment that way

How so? I still don't really get how I get the package root dir of the caller, be it in IO or in ElabCommandM

Jakob von Raumer (Dec 26 2023 at 17:54):

Mario Carneiro said:

I don't know of any reason this would be different on older versions of lake, this has always been the case with lake AFAIK

Well the dir just was the package root, maybe it changed when the location was was moved from lake-packages to .lake/package?

Mario Carneiro (Dec 26 2023 at 18:02):

You can look in the LEAN_SRC_PATH for your file

Jakob von Raumer (Dec 26 2023 at 18:08):

That seems like it could introduce unnecessary ambiguities?

Jakob von Raumer (Dec 26 2023 at 18:32):

I can't imagine this is a rare case? Don't more lake packages store and read non-lean files in their own repo?

Jakob von Raumer (Dec 26 2023 at 18:36):

And it's a bit weird that lake build in C's root dir fails while cd .lake/packages/B && lake build works

Mario Carneiro (Dec 26 2023 at 18:42):

It's not particularly ambiguous if you look for the foo.txt file adjacent to the B.lean file because there can only be one B.lean file in the lean path

Mario Carneiro (Dec 26 2023 at 18:43):

But more generally, the fact that lake doesn't provide enough elaboration-time information to the lean files it builds is a long standing open problem

Mario Carneiro (Dec 26 2023 at 18:45):

Jakob von Raumer said:

And it's a bit weird that lake build in C's root dir fails while cd .lake/packages/B && lake build works

Lean files can do arbitrary things and depend on arbitrary state so it's not that weird that this can happen if you don't use approved mechanisms to do file resolution. The problem is that there aren't really any approved mechanisms for this right now so you have to make do with the information that is available

Mario Carneiro (Dec 26 2023 at 18:48):

Jakob von Raumer said:

I can't imagine this is a rare case? Don't more lake packages store and read non-lean files in their own repo?

If you want to do file resolution the same way as lean files, that means you should use the LEAN_SRC_PATH

Jakob von Raumer (Dec 26 2023 at 18:51):

Okay, maybe I'm misunderstanding the LEAN_SRC_PATH solution. How do I make sure I get the foo.txt adjacent to B.lean?

Mario Carneiro (Dec 26 2023 at 18:53):

Loop through the directories in LEAN_SRC_PATH looking for $DIR/Path/To/B.lean where Path.To.B is the current module name, and when you find it you use either $DIR/foo.txt or $DIR/Path/To/foo.txt depending on how you want the file to be located

Jakob von Raumer (Dec 26 2023 at 18:55):

current module name

Current in the sense of the location of the command when in CommandElabM? How do I get that?

Mario Carneiro (Dec 26 2023 at 18:55):

it should be in the CommandElabM state

Mario Carneiro (Dec 26 2023 at 18:56):

It's in the Environment.Header I think

Mario Carneiro (Dec 26 2023 at 18:56):

docs#Lean.EnvironmentHeader.mainModule

Jakob von Raumer (Dec 26 2023 at 18:58):

Are you sure that doesn't resolve to, in my case, C? It being the "module being compiled"... I'll try

Mario Carneiro (Dec 26 2023 at 18:58):

The "module being compiled" is the file in which we are currently running commands and such

Jakob von Raumer (Dec 26 2023 at 18:58):

Ah nvm, yea, the module should be the right one

Jakob von Raumer (Dec 27 2023 at 18:35):

How do I normalize a file path like foo/../bar to bar?

Mario Carneiro (Dec 27 2023 at 18:39):

You can use IO.FS.realPath but this resolves to an absolute path

Jakob von Raumer (Jan 02 2024 at 09:46):

For future reference: This solved it for me

/-- Convert a file path to one that's relative to the location of the module containing the
command that's elaborated. -/
def toCallerRelativeFilePath (p : System.FilePath) : CommandElabM System.FilePath := do
  let x := modToFilePath "." (( getEnv).header.mainModule) "lean"
  let .some x := x.parent
    | throwError "Current lean file does not have a parent directory!"
  let y  realPathNormalized <| x / p
  return y.toString

Mario Carneiro (Jan 02 2024 at 09:49):

I think you want to wrap the first two lines in an elab so that you can capture the location of the file that wrote the syntax

Mario Carneiro (Jan 02 2024 at 09:50):

if it's just a regular function then it will not capture the caller context

Mario Carneiro (Jan 02 2024 at 09:50):

well, maybe you want the rest of it to be part of the elab as well, if the file path in question is compile-time known

Mario Carneiro (Jan 02 2024 at 09:51):

That will make it similar to the include_str macro

Mario Carneiro (Jan 02 2024 at 09:53):

the implementation of which is:

@[builtin_term_elab includeStr] def elabIncludeStr : TermElab
  | `(include_str $path:term), _ => do
    let path  evalFilePath path
    let ctx  readThe Lean.Core.Context
    let srcPath := System.FilePath.mk ctx.fileName
    let some srcDir := srcPath.parent
      | throwError "cannot compute parent directory of '{srcPath}'"
    let path := srcDir / path
    mkStrLit <$> IO.FS.readFile path

Mario Carneiro (Jan 02 2024 at 09:56):

apparently docs#Lean.Core.Context.fileName exists, and is better suited to your use case than Environment.header.mainModule

Sebastian Ullrich (Jan 02 2024 at 10:02):

If you want to keep the .oleans machine-independent, it's probably better to capture only the module name at compile time

Mario Carneiro (Jan 02 2024 at 10:18):

from what I can tell of the original use case the purpose is not to capture the path itself but to read the file at that path and do something with the result, in which case OS dependence of the path syntax shouldn't matter

Jakob von Raumer (Jan 02 2024 at 11:03):

Uh yea, this is not about reading (o)lean files

Mario Carneiro (Jan 02 2024 at 11:04):

No, Sebastian's point is that if you capture the current file name at compile time then you will end up baking that string into the compilation results for that file, which makes the build result non-portable

Mario Carneiro (Jan 02 2024 at 11:10):

e.g. you should not put this in a lean file:

elab "path%" : term => do
  return Lean.mkStrLit ( readThe Lean.Core.Context).fileName
def foo := path%

because the olean file you get out is effectively the same as if you had written

def foo : String := "/home/mario/lean/test/Test.lean"

which contains an absolute path in it. If something like this was in mathlib then we would end up shipping a declaration referencing some path like /home/github-runner/Test.lean which doesn't exist on user machines

Jakob von Raumer (Jan 02 2024 at 11:11):

I have a hard time wrapping my head around what exactly you mean by compile time. I think I don't need an elab around because I use it in the elaboration of a command in the same package anyway, so the context is handed on. Or do you mean at the compile time of the package that calls it?

Mario Carneiro (Jan 02 2024 at 11:12):

You described a setup with three packages A (which provides the command), B (which uses the command and whose path we want to capture), and C (which calls B as a library and needs it to be able to reference its own files)

Jakob von Raumer (Jan 02 2024 at 11:13):

And you mean B's compilation?

Mario Carneiro (Jan 02 2024 at 11:13):

While compiling the end user package C, we will first compile B, and B contains a usage of an elab whose definition is in A

Mario Carneiro (Jan 02 2024 at 11:14):

So in my example A would be the one containing the elab "path%" declaration and B is the one with def foo := path%

Mario Carneiro (Jan 02 2024 at 11:14):

and C is using foo in some way

Mario Carneiro (Jan 02 2024 at 11:16):

The code defined in A is run while compiling B, because it is an elab. If it was a regular function, used in B, then it would only be referenced, not run, while compiling B, and depending on how it is used in C it would either be run there at compile time, or at run time if C is an executable

Jakob von Raumer (Jan 02 2024 at 11:17):

Ah, hm, the current way has the advantage that I don't need to change B at all and I don't require the maintainer of B to use some new syntax to refer to files...

Mario Carneiro (Jan 02 2024 at 11:17):

run time meaning that you run lake exe C

Mario Carneiro (Jan 02 2024 at 11:17):

I think you should show what the code of B and C look like here

Mario Carneiro (Jan 02 2024 at 11:18):

because there are a few ways you could be making use of this definition. The function toCallerRelativeFilePath is "A code", it doesn't actually do anything on its own

Jakob von Raumer (Jan 02 2024 at 11:19):

B just contains a line like

import A

my_load_file "foo.txt"

Mario Carneiro (Jan 02 2024 at 11:19):

okay so my_load_file is the elab I'm talking about

Mario Carneiro (Jan 02 2024 at 11:19):

I presume that this is also defined in A

Jakob von Raumer (Jan 02 2024 at 11:20):

Yes, that's why I said above I don't need to wrapt in my specific case

Mario Carneiro (Jan 02 2024 at 11:21):

I would drop the "Caller" part of the function name, it's not capturing any caller context, it is just using the current environment

Mario Carneiro (Jan 02 2024 at 11:21):

Q: have you just reinvented include_str?

Mario Carneiro (Jan 02 2024 at 11:24):

I'm still think that include_str is doing exactly what you want, there are not any concerns of olean path dependence if you are reading the file (instead of storing its path) and processing the result in some way

Mario Carneiro (Jan 02 2024 at 11:30):

def processStr (s : String) : Elab.Command.CommandElabM Unit := sorry

elab "my_load_file" path:term : command => do
  let str  Elab.Command.liftTermElabM do
    let path  unsafe Elab.Term.evalTerm FilePath (.const ``FilePath []) path
    let srcPath := System.FilePath.mk ( readThe Core.Context).fileName
    let some srcDir := srcPath.parent
      | throwError "cannot compute parent directory of '{srcPath}'"
    IO.FS.readFile (srcDir / path)
  processStr str

-- B.lean
my_load_file "foo.txt"

Jakob von Raumer (Jan 02 2024 at 11:30):

Indeed, I didn't know include_str exsisted, I never came across it somehow :face_palm:

Jakob von Raumer (Jan 02 2024 at 11:32):

Note that at the start of this thread I wanted the path to be relative to the package root, but then found the solution to be too unwieldy to justify not just having it relative to the file where the command is called.

Jakob von Raumer (Jan 02 2024 at 11:33):

Thanks a lot!


Last updated: May 02 2025 at 03:31 UTC