Zulip Chat Archive
Stream: lean4
Topic: Producing the syntax of a file
Damiano Testa (Sep 21 2023 at 14:44):
Dear All,
I understand very little of how Syntax
works, so this question may be very confused -- I am trying to learn about it.
If I create a .lean
file and I want to extract from it the parsed syntax, how can I do that? I found some functions in Lean.Parser.Types
(e.g. parseHeader
and parseCommand
), but I imagine that there should be something that already takes a file path and directly returns the syntax object associated to it, right? Can anyone point me in the right direction?
Thanks!
Matthew Ballard (Sep 21 2023 at 14:53):
Is it hiding in the info tree?
Damiano Testa (Sep 21 2023 at 14:56):
Matt, I know so little that I am not really sure how to use your hint...
Matthew Ballard (Sep 21 2023 at 14:57):
I reached the edge of my knowledge with that comment. But I know where to look next for an example: std linters (or maybe core linters...)
Damiano Testa (Sep 21 2023 at 14:57):
Ok, I am looking at docs#Lean.Elab.InfoTree !
Damiano Testa (Sep 21 2023 at 15:04):
Ok, looking at InfoTree
, it seems to store Syntax
inside the Info
nodes -- this is good progress!
I'm moving on to the std linters now to get hints on how to produce InfoTree
s!
Matthew Ballard (Sep 21 2023 at 15:04):
I edited the message above because it might be core linters I am thinking about
Damiano Testa (Sep 21 2023 at 15:10):
docs#Lean.Elab.Command.State seems to store InfoTrees, so maybe in the end in the right monad it is simply let infotree ← get
... I find this so hard!
Damiano Testa (Sep 21 2023 at 15:14):
So, this is probably very roundabout, but does it seem ok?
Use docs#Lean.Elab.runFrontend to get the environment, then use docs#Lean.Elab.Command.mkState to get the Lean.Elab.Command.State
and from there tease out infostate
and trees
as needed.
Damiano Testa (Sep 21 2023 at 15:14):
I'm going to try and see if I get to the bottom of this before giving up for the day!
Damiano Testa (Sep 21 2023 at 15:44):
I have created a file called A.lean
containing only
def A : Nat := 0
Then, in a separate file, I run
import Std
open Lean Elab Command
def myTrees : IO Lean.Elab.Command.State := do
let (env, _) ← runFrontend "Mathlib/A.lean" .empty "Mathlib/A.lean" (.str (.str .anonymous "Mathlib") "A")
let state := mkState env
let trees := state.infoState.trees
dbg_trace f!"\nNumber of trees: {trees.size}"
return state
#eval do
let _ ← myTrees
/-
Mathlib/A.lean:1:0: error: unexpected identifier; expected command
Number of trees: 0
-/
What am I doing wrong? Why are there 3 inputs to runFrontend
that all feel like they should be the same file name?
Matthew Ballard (Sep 21 2023 at 15:45):
What is .empty
?
Damiano Testa (Sep 21 2023 at 15:45):
Option
s. I did not know what to write there...
Damiano Testa (Sep 21 2023 at 15:46):
This is what runFrontend
wants:
@[export lean_run_frontend]
def runFrontend
(input : String)
(opts : Options)
(fileName : String)
(mainModuleName : Name)
(trustLevel : UInt32 := 0)
(ileanFileName? : Option String := none)
: IO (Environment × Bool) := do
Damiano Testa (Sep 21 2023 at 15:46):
Ah, maybe input is what I read from the file!
Damiano Testa (Sep 21 2023 at 15:46):
I'll try IO.FS.readFile
in that field!
Damiano Testa (Sep 21 2023 at 15:47):
Ok, the error is gone, but there are still no trees...
Damiano Testa (Sep 21 2023 at 16:27):
I don't understand what is going on.
Lean seems to react to the content of the file and produces errors, if there are some. However, it produces no infoTrees.
Damiano Testa (Sep 21 2023 at 16:47):
Got it! I changed runFrontend
to return the actual State
, rather than the Environment
.
A posteriori, everything is obvious: the Environment
discards the Syntax
information, since it has completed its job. Returning the State
that it was in when it created the Environment
remembers it!
Damiano Testa (Sep 21 2023 at 17:08):
For future reference, this is what I have and I can leave happy for the rest of the day! :smile:
import Std
import Mathlib.Tactic.RunCmd
open Lean Elab Tactic Command
/-- Copied from `Lean.Elab.runFrontend`, the only difference is in the last line: the original has
`pure (s.commandState.env, !s.commandState.messages.hasErrors)`
instead of `pure s.commandState`. -/
@[export lean_run_frontend]
def my_runFrontend
(input : String)
(opts : Options)
(fileName : String)
(mainModuleName : Name)
(trustLevel : UInt32 := 0)
(ileanFileName? : Option String := none)
: IO Command.State := do
let inputCtx := Parser.mkInputContext input fileName
let (header, parserState, messages) ← Parser.parseHeader inputCtx
let (env, messages) ← processHeader header opts messages inputCtx trustLevel
let env := env.setMainModule mainModuleName
let mut commandState := Command.mkState env messages opts
if ileanFileName?.isSome then
-- Collect InfoTrees so we can later extract and export their info to the ilean file
commandState := { commandState with infoState.enabled := true }
let s ← IO.processCommands inputCtx parserState commandState
for msg in s.commandState.messages.toList do
IO.print (← msg.toString (includeEndPos := getPrintMessageEndPos opts))
if let some ileanFileName := ileanFileName? then
let trees := s.commandState.infoState.trees.toArray
let references := Lean.Server.findModuleRefs inputCtx.fileMap trees (localVars := false)
let ilean := { module := mainModuleName, references : Lean.Server.Ilean }
IO.FS.writeFile ileanFileName $ Json.compress $ toJson ilean
pure s.commandState
example : True := by
run_tac do
let state ← my_runFrontend ("#eval \"hello!\"") (← getOptions) "whatFileIsThis?" `whatNameIsThis?
dbg_trace f!"\nNumber of trees: {state.infoState.trees.size}"
trivial
/-
"hello!"
Number of trees: 2
-/
Scott Morrison (Sep 22 2023 at 01:39):
@Damiano Testa, please look at https://github.com/semorrison/lean-training-data/blob/master/TrainingData/Frontend.lean which gives a access to everything you might want from the frontend. :-) I would like to polish this and put it somewhere more findable, but hopefully you can extract something you like from it already.
Damiano Testa (Sep 22 2023 at 03:06):
Wow, thanks Scott!
I am not going to be able to look at it before next week, but scanning the file it looks like it is extracting just the information that I wanted!
Scott Morrison (Sep 22 2023 at 03:14):
There is an off-by-one error still in the "source for each declaration" field, which I haven't fixed. Patches welcome!
Damiano Testa (Sep 22 2023 at 03:17):
Is that related to the fact that the .ilean
s also count from zero the number of lines? Or are you referring to something else? (I only looked at the code from my mobile, so my understanding is very shallow.)
Scott Morrison (Sep 22 2023 at 04:31):
Sorry, I haven't looked at it for a while. It should be easy to track down, it just an off-by-one error in the line counting, but I forgot to track it down.
Sebastian Ullrich (Sep 22 2023 at 07:00):
(not even all the editors agree whether the first line is 0 or 1, if it's about that :) )
Damiano Testa (Sep 22 2023 at 07:33):
Ok, that's not a big deal: my internal convention is to start counting from 1, but I'm used to the unnatural convention of the natural numbers including 0.
Damiano Testa (Sep 22 2023 at 07:34):
Anyway, I'll look at it, but it may take me a while before I have some serious time to do it.
Last updated: Dec 20 2023 at 11:08 UTC