Zulip Chat Archive
Stream: lean4
Topic: difficulties with `Frontend`
Scott Morrison (Aug 09 2023 at 04:02):
I am trying to use Lean.Elab.Frontend
, and running into difficulties with persistent extensions. I'm not sure if I'm "doing it wrong", or what I want isn't feasible.
To begin, I define a
compileModule (module : Name) (initializers := true) : IO (Environment × List Message × List InfoTree)
function which is intended to load the source for a specified module
, and compile it, return the resulting Environment
and any Message
s and InfoTree
s produced.
I also pass initializers
, which can control whether enableInitializersExecution
will be run.
Scott Morrison (Aug 09 2023 at 04:02):
Here's the implementation:
-- `Frontend/Compile.lean`
import Lean.Elab.Frontend
open Lean Elab System
/--
Wrapper for `IO.processCommands` that enables info states, and returns
* the new environment
* messages
* info trees
-/
def processCommandsWithInfoTrees
(inputCtx : Parser.InputContext) (parserState : Parser.ModuleParserState)
(commandState : Command.State) : IO (Environment × List Message × List InfoTree) := do
let commandState := { commandState with infoState.enabled := true }
let s ← IO.processCommands inputCtx parserState commandState <&> Frontend.State.commandState
pure (s.env, s.messages.msgs.toList, s.infoState.trees.toList)
/--
Process some text input, with or without an existing environment.
If there is no existing environment, we parse the input for headers (e.g. import statements),
and create a new environment.
Otherwise, we add to the existing environment.
Returns the resulting environment, along with a list of messages and info trees.
-/
unsafe def processInput (input : String) (initializers := true) (env? : Option Environment)
(opts : Options) (fileName : Option String := none) :
IO (Environment × List Message × List InfoTree) := do
let fileName := fileName.getD "<input>"
let inputCtx := Parser.mkInputContext input fileName
let (parserState, commandState) ← match env? with
| none => do
if initializers then enableInitializersExecution
let (header, parserState, messages) ← Parser.parseHeader inputCtx
let (env, messages) ← processHeader header opts messages inputCtx
pure (parserState, (Command.mkState env messages opts))
| some env => do
pure ({ : Parser.ModuleParserState }, Command.mkState env {} opts)
processCommandsWithInfoTrees inputCtx parserState commandState
def findLean (mod : Name) : IO FilePath := do
return FilePath.mk ((← findOLean mod).toString.replace "build/lib/" "") |>.withExtension "lean"
/-- Read the source code of the named module. -/
def moduleSource (mod : Name) : IO String := do
IO.FS.readFile (← findLean mod)
unsafe def compileModule (mod : Name) (initializers := true) : IO (Environment × List Message × List InfoTree) := do
processInput (← moduleSource mod) initializers none {}
Scott Morrison (Aug 09 2023 at 04:04):
Now, suppose I build a dummy PersistentEnvExtension
, e.g. as:
-- `Frontend/DummyExt.lean`
import Lean.Environment
open Lean
initialize dummyExt :
PersistentEnvExtension Unit Unit Unit ←
registerPersistentEnvExtension {
mkInitial := pure ()
addImportedFn := fun _ => pure ()
addEntryFn := fun _ _ => ()
exportEntriesFn := fun _ => #[]
}
Scott Morrison (Aug 09 2023 at 04:05):
and create a file that imports it:
-- `Frontend/Main.lean`
import Frontend.DummyExt
def f := 37
Scott Morrison (Aug 09 2023 at 04:06):
Now I try to compile it:
-- `Frontend/Test1.lean`
import Frontend.Compile
open Lean
#eval show MetaM _ from do
let (_, msgs, _) ← compileModule `Frontend.Main
for m in msgs do dbg_trace ← m.data.format
return msgs.length
and all is well: no messages are printed.
Scott Morrison (Aug 09 2023 at 04:08):
However I now try to compile it from a file that also depends on Frontend.DummyExt
:
-- `Frontend/Test2/lean`
import Frontend.Compile
import Frontend.DummyExt
open Lean
#eval show MetaM _ from do
let (_, msgs, _) ← compileModule `Frontend.Main
for m in msgs do dbg_trace ← m.data.format
return msgs.length
and I get the error message from compilation:
invalid environment extension, 'dummyExt' has already been used`
Scott Morrison (Aug 09 2023 at 04:11):
Is this feasible? I want to be able to compile other files, without having to worry about what I've already imported myself.
Mac Malone (Aug 09 2023 at 04:11):
@Scott Morrison First thing, I would advise defining the following to avoid marking meaningful functions as unsafe
:
@[implemented_by enableInitializersExecution]
opaque enableInitializersExecutionWrapper : IO Unit
Scott Morrison (Aug 09 2023 at 04:12):
(In my real use case I just use unsafe
from Std.Util.TermUnsafe
.)
Scott Morrison (Aug 09 2023 at 04:13):
For completeness, I'll note that this test does work fine if I don't run enableInitializersExecution
:
--- `Frontend/Test3.lean`
import Frontend.Compile
import Frontend.DummyExt
open Lean
#eval show MetaM _ from do
let (_, msgs, _) ← compileModule `Frontend.Main (initializers := false)
for m in msgs do dbg_trace ← m.data.format
return msgs.length
Scott Morrison (Aug 09 2023 at 04:15):
However I'm pretty sure that's not the right approach? Certainly it prevents me from compiling interesting files. One example is:
--- `Frontend/Test4.lean`
-- Requires Mathlib on the path:
import Frontend.Compile
open Lean
#eval show MetaM _ from do
let (_, msgs, _) ← compileModule `Mathlib.Init.ZeroOne (initializers := false)
for m in msgs do dbg_trace ← m.data.format
return msgs.length
which will crash the Lean server with:
libc++abi: terminating with uncaught exception of type lean::exception: cannot evaluate `[init]` declaration 'Mathlib.Prelude.Rename.linter.uppercaseLean3' in the same module
Mac Malone (Aug 09 2023 at 04:15):
@Scott Morrison Do you need to run initializers? Generally, frontends that are not producing oleans to import (or oleans with initializers) do not need it.
Scott Morrison (Aug 09 2023 at 04:16):
(I am a bit confused there, in any case: compiling either of the dependencies of Mathlib.Init.ZeroOne
works fine.)
Scott Morrison (Aug 09 2023 at 04:17):
Certainly I want to be able to run this on any single file from Mathlib (retrieving the resulting Environment
and InfoTree
s).
Mac Malone (Aug 09 2023 at 04:19):
@Scott Morrison does running your frontend from the command line (via lake env lean
) work?
Mac Malone (Aug 09 2023 at 04:20):
Regardless, I am bit concerned where that linter @[init]
is coming from. It seems to me that this may be some oddity (perhaps a bug) with how the linter works (or perhaps to to_additive
?)
Scott Morrison (Aug 09 2023 at 04:22):
Running lake env lean Frontend/Test4.lean
gives the same error:
cannot evaluate `[init]` declaration 'Mathlib.Prelude.Rename.linter.uppercaseLean3' in the same module
Mac Malone (Aug 09 2023 at 04:23):
I was expecting it to, just wanted to make sure. :thumbs_up:
Mac Malone (Aug 09 2023 at 04:24):
Assuming you do, in fact, need to initialize the extension twice (i.e., use initializers := true
). You can do what I do in Alloy's extensions and create a findOrRegisterPersistentExtension
helper.
Scott Morrison (Aug 09 2023 at 04:26):
Interesting. It seems though that we would have to use it everywhere, which feels like it might be a workaround for a deeper issue.
Mac Malone (Aug 09 2023 at 04:27):
@Scott Morrison no, this is expected when using a persistent extension that is both part of imported compiled module and used by the frontend (and is not builtin).
Mac Malone (Aug 09 2023 at 04:29):
Both the imported module and the frontend wish to initialize the extension, so you have to tell it to pick one.
Scott Morrison (Aug 09 2023 at 04:30):
But then shouldn't every persistent extension be registered this way? i.e. this findOrRegisterPersistentExtension
should be moved to core, possibly replacingregisterPersistentExtension
?
Mac Malone (Aug 09 2023 at 04:30):
The source of this is that persistentEnvExtensionsRef
is a static ref for all of lean and not part of the individual environment.
Mac Malone (Aug 09 2023 at 04:31):
@Scott Morrison In core, builtin_initialize
avoids this.
Scott Morrison (Aug 09 2023 at 04:31):
Okay, but what is everyone else meant to be doing?
Mac Malone (Aug 09 2023 at 04:33):
@Scott Morrison I am not sure custom frontends are a well-specified feature of Lean at the moment (so I doubt anything is currently meant). Recall that custom user-defined extensions did not even exist when Lake was first developed.
Scott Morrison (Aug 09 2023 at 04:35):
In any case, is your suggestion:
- I should not be using
enableInitializersExecution
for my use case, and instead - I should try to minimize the crash when compiling
Mathlib.Init.ZeroOne
, and see if that reveals something?
Mac Malone (Aug 09 2023 at 04:35):
Originally, the idea was that end users would also use builtin_initialize
and add extensions via ---plugin
. However, that has yet to come to pass due to the fact that custom frontends themselves cannot support plugins (though they can utilize them). That is, a custom frontend can be built with a plugin, but it cannot build things with plugins.
Mac Malone (Aug 09 2023 at 04:39):
@Scott Morrison Yes. I think your use case should not need enableInitializersExecution
. However, I would like you to try compile and running your custom frontend as executable. I may be mixing up the rules for executables with interpreted contexts.
Scott Morrison (Aug 09 2023 at 04:57):
Ahha! When compiled, I do not get the invalid environment extension, 'dummyExt' has already been used
error message, even when running with enableInitializersExecution
, and import Frontend.DummyExt
.
Scott Morrison (Aug 09 2023 at 05:07):
I can also compile things from all over Mathlib successfully, extracting InfoTrees!
It would certainly be nice to be able to do this interpreted, because I want to write code that manipulates these InfoTrees in complicated ways and it's easiest to do that in a live session in the editor.
Scott Morrison (Aug 09 2023 at 05:20):
@Mac Malone, I'm surprised that your findOrRegisterPersistentExtension
is not implemented as:
unsafe def findOrRegisterPersistentEnvExtensionUnsafe {α β σ : Type} [Inhabited σ] (descr : PersistentEnvExtensionDescr α β σ) : IO (PersistentEnvExtension α β σ) := do
let pExts ← persistentEnvExtensionsRef.get
match pExts.find? (fun ext => ext.name == descr.name) with
| some ext => return unsafeCast ext
| none =>
let ext ← registerEnvExtension do
let initial ← descr.mkInitial
let s : PersistentEnvExtensionState α σ := {
importedEntries := #[],
state := initial
}
pure s
let pExt : PersistentEnvExtension α β σ := {
toEnvExtension := ext,
name := descr.name,
addImportedFn := descr.addImportedFn,
addEntryFn := descr.addEntryFn,
exportEntriesFn := descr.exportEntriesFn,
statsFn := descr.statsFn
}
persistentEnvExtensionsRef.modify fun pExts => pExts.push (unsafeCast pExt)
return pExt
@[implemented_by findOrRegisterPersistentEnvExtensionUnsafe]
opaque findOrRegisterPersistentEnvExtension {α β σ : Type} [Inhabited σ] (descr : PersistentEnvExtensionDescr α β σ) : IO (PersistentEnvExtension α β σ)
i.e. having the same signature as registerPersistentEnvExtension
.
Scott Morrison (Aug 09 2023 at 05:21):
In any case, using this one, if I just replace my
initialize dummyExt :
PersistentEnvExtension Unit Unit Unit ←
registerPersistentEnvExtension { ... }
with
initialize dummyExt :
PersistentEnvExtension Unit Unit Unit ←
findOrRegisterPersistentEnvExtension { ... }
then the interpreted version of my examples above all start working.
Scott Morrison (Aug 09 2023 at 05:22):
I'm not really sure what to do with this observation, however. Should I try to persuade Std
and Mathlib
that they should be using findOrRegister...
rather than register...
?
Scott Morrison (Aug 09 2023 at 05:22):
I might try pinging @Mario Carneiro at this point (sorry it is mid-thread!) to see what they think.
Mario Carneiro (Aug 09 2023 at 05:27):
Loading lean files from within lean is basically unsupported because of the way initializers work
Mario Carneiro (Aug 09 2023 at 05:27):
honestly it seems like a design mistake to me
Scott Morrison (Aug 09 2023 at 05:28):
(you mean, the current way initializers work?)
Scott Morrison (Aug 09 2023 at 05:28):
(or: trying to load lean files from within lean?)
Mario Carneiro (Aug 09 2023 at 05:28):
yes, you basically have to load every lean file in its own process
Mario Carneiro (Aug 09 2023 at 05:29):
the reason is because initializer state is a global
Mario Carneiro (Aug 09 2023 at 05:29):
and the set of environment extensions is part of that
Mario Carneiro (Aug 09 2023 at 05:30):
this is the same reason why it's impossible to interact with an environment extension defined in the current file without crashing lean
Mario Carneiro (Aug 09 2023 at 05:34):
That's not to say that you can't hack around it though: the affected statics are basically append-only arrays so it's possible that something like Mac's findOrRegisterPersistentExtension
will work
Mario Carneiro (Aug 09 2023 at 05:35):
I think enableInitializersExecution
should already be on for an interactive lean session
Mario Carneiro (Aug 09 2023 at 05:36):
you usually only need to call it if you are calling lean code from main
Scott Morrison (Aug 09 2023 at 05:36):
It definitely has an effect here: see the difference between Test2
and Test3
above.
Mario Carneiro (Aug 09 2023 at 05:40):
oh, looks like the flag is reset to false after processing the imports (of Test2.lean itself)
Mario Carneiro (Aug 09 2023 at 05:40):
so you indeed have to call it again to turn it back on for the imports of the target module
Mario Carneiro (Aug 09 2023 at 05:48):
on further review I think findOrRegisterPersistentExtension
is a solution at the wrong level: it is trying to handle the case of multiple initialization of a module but modules already track whether they have been initialized and skip repeat initialization
Scott Morrison (Aug 09 2023 at 05:52):
Okay, so my summary of the situation is as follows:
- initializer state is global, so
- if you either run
compileModule
from an interpreted session (where initializers have already been initialized), or if you runcompileModule
twice on different files from any session, global state will clash in interesting and unfun ways. - but in a compiled session, you can turn on
enableInitializerExecution
and then get a successful compile of one other file.
Scott Morrison (Aug 09 2023 at 05:52):
That's not what I was hoping for, but if that is correct the situation is probably workable, and at least no longer confusing.
Mario Carneiro (Aug 09 2023 at 05:59):
there is some internal state to prevent multiple import
from causing multiple initialization of the same files; I think it should be possible to hook into that to avoid this issue (lean modifications will be required though)
Mario Carneiro (Aug 09 2023 at 07:36):
woohoo, I have a working patch for this issue: lean4#2398
Mario Carneiro (Aug 09 2023 at 07:37):
this now compiles error-free:
#eval show MetaM _ from do
enableReinitialization (← getEnv)
let (_, msgs, _) ← compileModule `Frontend.Main
for m in msgs do dbg_trace ← m.data.format
return msgs.length
Scott Morrison (Aug 09 2023 at 10:16):
This is working well! I can run compileModule
all over Mathlib, in the same session, without initialization errors.
Scott Morrison (Aug 09 2023 at 10:16):
If you'd like I can make a self-contained test to add to your PR?
Mario Carneiro (Aug 09 2023 at 10:21):
Please do!
Mac Malone (Aug 09 2023 at 17:09):
Scott Morrison said:
Mac Malone, I'm surprised that your
findOrRegisterPersistentExtension
is not implemented as:
There was no deep though beyond it. :laughing: I just needed to get Alloy to work and went with the first thing that worked and seemed reasonably principled (even if the not the most optimal).
Adam Topaz (Aug 11 2023 at 02:50):
I've also been fighting similar issues recently. Here's something curious...
I have a copy of mathlib4 with an additional entry in the lakefile:
lean_exe compile where
root := `Compile.Main
supportInterpreter := true
and, obviously, a file called Compile/Main.lean
.
First, suppose that the following is the contents of this file:
import Lean
open Lean Elab Parser System
def compile (module : Name) : IO Unit := do
let contents ← IO.FS.readFile (modToFilePath "." module "lean")
let inputCtx : InputContext := mkInputContext contents "<input>"
let ⟨header, parserState, messages⟩ ← parseHeader inputCtx
if messages.toList.length != 0 then
for m in messages.toList do
IO.println <| ← m.toString
let ⟨env, messages⟩ ← processHeader header {} messages inputCtx
if messages.toList.length != 0 then
for m in messages.toList do
IO.println <| ← m.toString
let cmdState : Command.State :=
{ Command.mkState env messages {} with infoState.enabled := true }
let _ ← IO.processCommands inputCtx parserState cmdState
IO.println s!"Success! {module}"
return
def modules : IO (Array Name) := do
let out ← IO.FS.lines "Mathlib.lean"
let out := out.map fun i => i.splitOn[1]!
let out := out.map fun i => i.toName
return out
unsafe def main (arg : List String) : IO Unit := do
let arg := arg[0]!.toName
enableInitializersExecution
initSearchPath (← findSysroot)
compile arg
In this case, lake exe compile Mathlib.Algebra.Abs
succeeds with Success! Mathlib.Algebra.Abs
.
A similar result happens with Mathlib.Algebra.AddTorsor
, which is the second import appearing in our current Mathlib.lean
file.
Now, consider the following variant of the main
function above:
unsafe def main : IO Unit := do
enableInitializersExecution
initSearchPath (← findSysroot)
let modules ← modules
for module in modules do
compile module
This fails, with the following
Success! Mathlib.Algebra.Abs
libc++abi: terminating due to uncaught exception of type lean::exception: cannot evaluate `[init]` declaration 'Mathlib.Tactic.reflExt' in the same module
suggesting it failed on Mathlib.Algebra.AddTorsor
.
So, basically I'm very confused. I tried a few other files from mathlib4 with the successful variant, and they all worked fine. What am I doing wrong?
Scott Morrison (Aug 11 2023 at 04:29):
Haha, looks we have been doing almost identical things. :-) I am trying to write something up that describes every possible variation here.
Scott Morrison (Aug 11 2023 at 04:35):
For me, these experiments give the same results whether or not I specify supportInterpreter := true
in the lakefile
. @Adam Topaz, could you check if that is the same for you?
Scott Morrison (Aug 11 2023 at 08:03):
@Adam Topaz, my summary is at lean4#2408, with a (failing) test case at lean4#2407, and @Mario Carneiro's relevant PR, updated to including the (succeeding) test case, is at lean4#2398.
Adam Topaz (Aug 11 2023 at 11:46):
In the single-file example without supportInterpreter, it seems to compile, but I get an error:
<input>:6:0: error: could not find native implementation of external declaration 'IO.getRandomBytes' (symbols 'l_IO_getRandomBytes___boxed' or 'l_IO_getRandomBytes')
Adam Topaz (Aug 11 2023 at 11:48):
Same with the for loop, and in that case the error pops up for each file.
Adam Topaz (Aug 11 2023 at 11:48):
But the original error doesn't appear in this case.
Adam Topaz (Aug 11 2023 at 11:49):
Strange. I still don't really understand what's going on internally...
Scott Morrison (Aug 11 2023 at 12:04):
I don't understand at all when supportInterpreter := true
is needed. I don't seem to need it in any of my tests locally, but it was needed in the test case lean4#2407?
Sebastian Ullrich (Aug 11 2023 at 12:08):
I believe this option does not have any effect on non-Linux (even on macOS where it is at least accepted by the linker)
Sebastian Ullrich (Aug 11 2023 at 12:11):
@Mac Malone Perhaps true
would be a better default for it then, the Linux binary size overhead is probably insignificant compared to the footgun size
Adam Topaz (Aug 11 2023 at 12:38):
FWIW, my main use case for such things right now is in the following file:
https://github.com/adamtopaz/lean_grader/blob/master/Main.lean
I'm setting up something for automatic grading in github classroom. Right now I'm only checking things in a single file whose name is hardcoded, but it would be nice to generalize and allow multiple files.
Last updated: Dec 20 2023 at 11:08 UTC