Zulip Chat Archive

Stream: lean4

Topic: Simple parsing+elaboration with metaprogramming


Relyth (Jun 12 2024 at 23:01):

Hello. I am using Lean 4 and I am trying to use its metaprogramming capabilities to evaluate a string that happens to be a valid expression (when some defs are in the context) into the corresponding expression that it represents. To make my scenario more concrete, let's say that I have the following recursive type:

inductive MyType
| a : MyType -> MyType
| b : MyType -> MyType
| c : MyType
deriving Inhabited, BEq, Repr

With this I can write the following expression:

#eval MyType.a (.b (.a (.a .c)))

The twist now is that I have this as a string "MyType.a (.b (.a (.a .c)))" and I want a function stringToMyType : String -> Option MyType such that the following two are equivalent and both give me the same value:

#eval MyType.a (.b (.a (.a .c)))
#eval stringToMyType "MyType.a (.b (.a (.a .c)))" |>.get!

I have been reading the Lean 4 Metaprogramming book and I am fairly sure that this should be achievable with a function call to Lean.Elab.Term.evalTerm, but the arguments are undocumented and the manual does not contain a single example call to this function.
So the most I have managed to do is this, which is not much:

open Lean Elab

def stringToMyType (string : String) : Option MyType := do
  let asSyntax := sorry -- from `string : String` to `Syntax`
  let α : Type := sorry -- what is this? required by `evalTerm`
  let type : Expr := sorry -- what is this? required by `evalTerm`
  let x : TermElabM α := Term.evalTerm α type asSyntax
  let y : MetaM (α × Term.State) := x.run  -- what is even this monad. `run` here has type `TermElabM α → Term.Context → Term.State`, but the last two are optional
  sorry -- no idea how to continue to obtain `Option MyType` (or similar)

Does anyone here have experience with Lean 4 metaprogramming and would be willing to help me a bit?

Online playground link:
https://live.lean-lang.org/#code=import%20Lean%0D%0A%0D%0Ainductive%20MyType%0D%0A%7C%20a%20%3A%20MyType%20-%3E%20MyType%0D%0A%7C%20b%20%3A%20MyType%20-%3E%20MyType%0D%0A%7C%20c%20%3A%20MyType%0D%0Aderiving%20Inhabited%2C%20BEq%2C%20Repr%0D%0A%0D%0Aopen%20Lean%20Elab%0D%0A%0D%0Aunsafe%20def%20stringToMyType%20(string%20%3A%20String)%20%3A%20Option%20MyType%20%3A%3D%20do%0D%0A%20%20let%20asSyntax%20%3A%3D%20sorry%20--%20from%20%60string%60%20to%20%60Syntax%60%0D%0A%20%20let%20%CE%B1%20%3A%20Type%20%3A%3D%20sorry%20--%20what%20is%20this%3F%20required%20by%20%60evalTerm%60%0D%0A%20%20let%20type%20%3A%20Expr%20%3A%3D%20sorry%20--%20what%20is%20this%3F%20required%20by%20%60evalTerm%60%0D%0A%20%20let%20x%20%3A%20TermElabM%20%CE%B1%20%3A%3D%20Term.evalTerm%20%CE%B1%20type%20asSyntax%0D%0A%20%20let%20y%20%3A%20MetaM%20(%CE%B1%20%C3%97%20Term.State)%20%3A%3D%20x.run%20%20--%20what%20is%20even%20this%20monad.%20%60run%60%20here%20has%20type%20%60TermElabM%20%CE%B1%20%E2%86%92%20Term.Context%20%E2%86%92%20Term.State%60%2C%20but%20the%20last%20two%20are%20optional%0D%0A%20%20sorry%0D%0A%0D%0A%23eval%20MyType.a%20(.b%20(.a%20(.a%20.c)))%0D%0A%23eval%20stringToMyType%20%22MyType.a%20(.b%20(.a%20(.a%20.c)))%22%20%7C%3E.get!%0D%0A

Alex J. Best (Jun 12 2024 at 23:29):

Youll generally need to work in some monad that lets you do these things like TermElabM. In your simple case the function you want could probably be defined without all this but its really much easier if you use the right monads!

import Lean

inductive MyType
| a : MyType -> MyType
| b : MyType -> MyType
| c : MyType
deriving Inhabited, BEq, Repr

open Lean Elab

unsafe def stringToMyType (string : String) : TermElabM (Option MyType) := do
  let e  getEnv
  let .ok syn := Parser.runParserCategory e `term string | return none
  logInfo syn
  -- we could elab here, but its not needed
  -- let expr ← Term.elabTerm syn (some (.const `MyType []))
  -- logInfo expr

  let out  Term.evalTerm MyType (.const `MyType []) syn
  logInfo (repr out)
  return out

#eval MyType.a (.b (.a (.a .c)))
#eval stringToMyType "MyType.a (.b (.a (.a .c)))"

Relyth (Jun 13 2024 at 00:01):

Sorry if this is a stupid question but can I "pull" the Option MyType out of TermElabM so that I can use it in other functions that don't have any references to Lean metaprogramming stuff?

llllvvuu (Jun 13 2024 at 00:04):

You can, but you have to put it back into TermElabM afterwards:

https://leanprover.github.io/functional_programming_in_lean/monads.html
https://lean-lang.org/lean4/doc/monads/intro.html

Relyth (Jun 13 2024 at 00:05):

That's the issue, I can't really do that. I need to actually unwrap it to make it fit with existing code easily. I can't just use a bind here.

llllvvuu (Jun 13 2024 at 00:09):

You'll need to pass context via .run methods then

llllvvuu (Jun 13 2024 at 00:10):

docs#Lean.Elab.Term.TermElabM.run
docs#Lean.Elab.Term.TermElabM.toIO

Although I'm not sure if you can get out of IO

Relyth (Jun 13 2024 at 00:11):

I was going to say, I managed to see in some doc page that this mysterious and undocumented TermElabM monad is actually ReaderT Lean.Elab.Term.Context (StateRefT' IO.RealWorld Lean.Elab.Term.State Lean.MetaM), so I was going to ask if I can call run on it or something. But I don't even understand at a first glance what the signature of run would even look like with such a long type full of names I've never seen before. Even if I knew the signature, no idea how to construct values of the requested types.

Eric Wieser (Jun 13 2024 at 00:12):

https://github.com/leanprover-community/mathlib4/wiki/Monad-map can be handy for questions like "How do I get from FooM to BarM"

Relyth (Jun 13 2024 at 00:13):

The main function of my app is List String -> IO UInt32, so no problem with the IO. The main issue is understanding the signatures around this entire thing and how to get a context (whatever that is), and a state.

Relyth (Jun 13 2024 at 00:15):

Ok I think I can solve my issue with Lean.Elab.Term.TermElabM.toIO. This is very useful. I will try to figure this out during the next hours and I will come back if it doesn't work I guess. Thank you.

Eric Wieser (Jun 13 2024 at 00:16):

I think probably you want docs#CoreM.withImportModules

Eric Wieser (Jun 13 2024 at 00:18):

when some defs are in the context

These defs are in the build-time context. If you're running IO from a binary, then you're in the runtime context, and so the environment doesn't exist at all! (and so you create one with the declaration above)

Relyth (Jun 13 2024 at 00:19):

Uhhh I am having very extreme difficulties right now to follow all this, even after having read the Metaprogramming in Lean book. Do you think just writing a parser manually that constructs a MyType out of a string is more feasible? I am kind of stuck with the metaprogramming way of doing it.

Eric Wieser (Jun 13 2024 at 00:20):

Why are you in IO?

Relyth (Jun 13 2024 at 00:20):

Because it's a command-line app that gathers inputs, does some process and prints outputs

Eric Wieser (Jun 13 2024 at 00:21):

Relyth said:

Do you think just writing a parser manually that constructs a MyType out of a string is more feasible?

This is probably a better idea anyway, since then the interface to your CLI doesn't have to match Lean

Eric Wieser (Jun 13 2024 at 00:22):

But you can use the Parsec stuff to do most of the work for you

Eric Wieser (Jun 13 2024 at 00:23):

The CoreM.withImportModules approach shouldn't be too bad if you do want to pursue that, you can see an example at https://github.com/leanprover-community/batteries/blob/main/scripts/runLinter.lean#L55

Relyth (Jun 13 2024 at 00:26):

Let me try to write something short without actually showing all the code, because it's a lot:

def main (args : List String) : IO UInt32 := do
  let parsedArgs := doSomeParsing args
  let someData1 := parsedArgs.someData1
  let someData2 := parsedArgs.someData2

  let someInput : String := IO.readFile someData2
  let someParsedInput : MyType := MyType.parseFromString someInput -- I wish to implement this step with metaprogramming

  match someComplexOperation someData1 someParsedInput with
  | .ok result =>
    IO.print result
    return 0
  | .error err =>
    IO.eprint err
    return 255

The real thing is unfortunately way more complex than this.

Relyth (Jun 13 2024 at 00:27):

This is probably a better idea anyway, since then the interface to your CLI doesn't have to match Lean

I designed the input language of the tool on purpose so that it does, planning ahead to try to avoid having to actually implement the parsing step with a bit of metaprogramming, but I was not expecting it to be so involved.

Eric Wieser (Jun 13 2024 at 00:38):

Do you want to allow your input to be MyType.a (by (run_tac IO.println "hello world"); exact .b)?

Eric Wieser (Jun 13 2024 at 00:38):

Because the problem with using the Lean parser is that you get all of lean, which can then make arbitrary IO calls

Relyth (Jun 13 2024 at 00:39):

No, only the constructors of the type nested recursively appear, like in my example above.

Eric Wieser (Jun 13 2024 at 00:40):

Then you almost certainly want to use docs#Lean.Parsec, which unfortunately has no documentation

Eric Wieser (Jun 13 2024 at 00:41):

But you could use https://github.com/leanprover/lean4/blob/be6c4894e0a6c542d56a6f4bb1238087267d21a0/src/Lean/Data/Json/Parser.lean for inspiration

Relyth (Jun 13 2024 at 00:41):

which unfortunately has no documentation

Well, like this entire metaprogramming code then. Back to square one, still stuck trying to go from a string into MyType after 3 days.

Eric Wieser (Jun 13 2024 at 00:42):

The Json parser above should hopefully give you something in square n/2

Relyth (Jun 13 2024 at 00:44):

I was under the impression that parsing and evaluating an expression is not only possible but one of the core common use cases for metaprogramming. I think it's a bit unfortunate that I have to take the parser route as the actual MyType in my code has many constructors taking many arguments, so parsing is not difficult but it is fairly tedious.

Relyth (Jun 13 2024 at 00:45):

Anyway, thank you for the help and the links.

Eric Wieser (Jun 13 2024 at 01:05):

You might want to use metaprogramming to generate the parser automatically (by iterating over the constructors)!

Jon Eugster (Jun 13 2024 at 05:12):

Eric Wieser said:

I think probably you want docs#CoreM.withImportModules

I was just quickly skimming this but thread but this suggestion sounds very reasonable. I think Damiano's UpdateDeprecations package is a recent example how to set this up to get an environment, and the gameserver exe in lean4game/server might contain another such example.

Jon Eugster (Jun 13 2024 at 07:13):

First, both references I mentioned do something else :sweat_smile:

But, I think this is roughly what I had in mind:

import Lean

open Lean Elab.Term Meta Core

def sample := "[True, False]"

def testRunnerCore (source : String)  (env : Environment) : IO Unit := do
  match Lean.Parser.runParserCategory env `term sample with
  | .ok t =>
    -- is this the correct context?
    let ctx : Core.Context := {
      fileName := ""
      fileMap := source.toFileMap
    }
    -- and what's the correct state?
    let state : Core.State := {
      env := env
    }
    -- I'm only interested in the resulting `Expr`, not in the updated context/state/…
    let (((m, _), _), _)  CoreM.toIO (MetaM.run <| TermElabM.run (elabTerm t none)) ctx state
    -- m : Expr
    dbg_trace f!"out: {m}"

  | .error err =>
    panic err

unsafe def testRunner : IO Unit := do
  withImportModules #[{module := `Lean}] Options.empty  0 (testRunnerCore sample)

#eval testRunner

I'm not super confident this is correct, especially I'm a bit confused that the output List.cons.{?_uniq.1} ?_uniq.2 True (List.cons.{?_uniq.3} ?_uniq.4 False (List.nil.{?_uniq.6} ?_uniq.7)) contains so many ?_uniq. But maybe that's because I didn't use evalTerm?

Maybe some more experienced metacoders can jump in, or at least it helps a bit.

Eric Wieser (Jun 13 2024 at 08:01):

Note that you are calling a different withImportModules to the one I recommended

Damiano Testa (Jun 13 2024 at 09:05):

Jon Eugster said:

Eric Wieser said:

I think probably you want docs#CoreM.withImportModules

I was just quickly skimming this but thread but this suggestion sounds very reasonable. I think Damiano's UpdateDeprecations package is a recent example how to set this up to get an environment, and the gameserver exe in lean4game/server might contain another such example.

Specifically, these lines set up an environment with an import statement and then process a string (buildOutput) in that environment and collect the messages that the parsing produces.

Damiano Testa (Jun 13 2024 at 09:08):

The functions that I use are in the Lean.Elab.Frontend file, I used docs#Lean.Elab.process, but docs#Lean.Elab.runFrontend is also relevant.

Damiano Testa (Jun 13 2024 at 09:11):

(My code uses yet another import command, docs#Lean.importModules.)

Jon Eugster (Jun 13 2024 at 12:57):

Eric Wieser said:

Note that you are calling a different withImportModules to the one I recommended

Oh you are completely right, yours avoids this manual lifting I was struggling with this morning on the train :sweat_smile:

So this is equivalent to my code above, I think

import Mathlib.Lean.CoreM

open Lean Elab.Term Meta Core

def sample := "[True, False]"

def testRunnerCore : CoreM Unit := do
  let env  getEnv
  match Lean.Parser.runParserCategory env `term sample with
  | .ok t =>
    let ((m, _), _)  MetaM.run <| TermElabM.run (elabTerm t none)
    dbg_trace f!"out: {m}"
  | .error err =>
    panic err

unsafe def testRunner : IO Unit := do
  CoreM.withImportModules #[`Lean] testRunnerCore none

#eval testRunner

(the same questions about correctness and evaluation still applies)

Mac Malone (Jun 13 2024 at 21:22):

Relyth said:

I was under the impression that parsing and evaluating an expression is not only possible but one of the core common use cases for metaprogramming.

This is mostly correct, but is missing a key qualifier: metaprogramming operates at a program's compile-time, not its run-time. Thus, if you want to parse and evaluate an expression to generate your program's code, then metaprogramming is the way to do it. However, if you want to parse and evaluate an expression within your program, metaprogramming is less useful.


Last updated: May 02 2025 at 03:31 UTC