Zulip Chat Archive
Stream: general
Topic: piggy back off of the lean4 parser
Frederick Pu (Oct 22 2025 at 22:12):
I'm trying to piggy back off of the lean parser to parse the generated draft suggesstion of a language model. I created a custom syntax category but when I run the categoryParser it throws an error:
import Lean
open Lean Elab Tactic Parser
/-- Custom syntax: "have <name> : <term> := sorry\nclear <names>*" -/
syntax (name := haveClearTac)
"have" ident ":" term ":=" term ppLine "clear" ident* : tactic
/-- Example of parsing a string into syntax manually. -/
def parseHaveClear (src : String) : IO Syntax := do
let env ← Lean.importModules #[{ module := `Init }] {}
let opts := Options.empty
let inputCtx := Parser.mkInputContext src "<input>"
let stx := Parser.runParserCategory env `haveClearTac src "<input>"
match stx with
| .ok x => pure x
| .error e => throw (IO.userError "not working")
#eval do
let stx ← parseHaveClear "have h : Nat := sorry\nclear h x"
IO.println s!"Parsed syntax:\n{stx.prettyPrint}"
can someone give me some pointers as to what I am doing wrong?
Kenny Lau (Oct 22 2025 at 22:19):
@Frederick Pu instead of printing "not working", I replaced it with the error generated e, which made it clearer what went wrong:
<input>:1:0: unknown parser category 'haveClearTac'
Frederick Pu (Oct 22 2025 at 22:19):
when i change it to `tactic i get "expected tactic"
Frederick Pu (Oct 22 2025 at 22:23):
so i moved the parser definition into a seperate file Basic.lean and it seems the strong is parsing fine in the parser mode `(tactic | ...) but when i run parsercategory it's faililng:
import Lean
import UtmLeannotes.Basic
open Lean Elab Tactic Parser
def parseTacticSeq (src : String) : IO Syntax := do
let env ← Lean.importModules #[{ module := `Init }] {}
let inputCtx := Parser.mkInputContext src "<input>"
let stx := Parser.runParserCategory env `tactic src "<input>"
match stx with
| .ok x => pure x
| .error e => throw (IO.userError e)
#eval `(tactic | have h : Nat := sorry clear h x)
-- works fine
#eval do
let stx ← parseTacticSeq "have h : Nat := sorry clear h x"
IO.println s!"Parsed syntax:\n{stx.prettyPrint}"
-- <input>:1:0: expected tactic
Kenny Lau (Oct 22 2025 at 22:24):
@Frederick Pu presumably it's because your environment just imports Init so it doesn't know about a lot of things
Kenny Lau (Oct 22 2025 at 22:24):
because for example changing the category to term and the input to 0 works
Frederick Pu (Oct 22 2025 at 22:24):
wdym input to 0?
Kenny Lau (Oct 22 2025 at 22:25):
import Lean
open Lean Elab Tactic Parser
/-- Custom syntax: "have <name> : <term> := sorry\nclear <names>*" -/
syntax (name := haveClearTac)
"have" ident ":" term ":=" term ppLine "clear" ident* : tactic
/-- Example of parsing a string into syntax manually. -/
def parseHaveClear (src : String) : IO Syntax := do
let env ← Lean.importModules #[{ module := `Init }] {}
let opts := Options.empty
let inputCtx := Parser.mkInputContext src "<input>"
let stx := Parser.runParserCategory env `term src "<input>"
match stx with
| .ok x => pure x
| .error e => throw (IO.userError e)
#eval do
let stx ← parseHaveClear "0"
IO.println s!"Parsed syntax:\n{stx.prettyPrint}"
Kenny Lau (Oct 22 2025 at 22:25):
this feels like an #xy problem, why are you using the IO monad in the first place?
Frederick Pu (Oct 22 2025 at 22:26):
so i can elab a string directly
Frederick Pu (Oct 22 2025 at 22:26):
since im getting a string as input
Frederick Pu (Oct 22 2025 at 22:27):
oh i think with the importModules it's only importing Init anyway to get the entire lean enviorment?
Kenny Lau (Oct 22 2025 at 22:30):
Frederick Pu said:
so i can elab a string directly
I'm asking why IO (instead of other more powerful monads), not why monad
Frederick Pu (Oct 22 2025 at 22:30):
what's a more powerful monad than io?
Frederick Pu (Oct 22 2025 at 22:31):
MetaM?
Frederick Pu (Oct 22 2025 at 22:31):
how would i be able to test my code then?
Kenny Lau (Oct 22 2025 at 22:32):
import Lean
open Lean Elab Tactic Parser
/-- Custom syntax: "have <name> : <term> := sorry\nclear <names>*" -/
syntax (name := haveClearTac)
"have" ident ":" term ":=" term ppLine "clear" ident* : tactic
/-- Example of parsing a string into syntax manually. -/
def parseHaveClear (src : String) : MetaM Syntax := do
match Parser.runParserCategory (← getEnv) `tactic src with
| .ok x => pure x
| .error e => throwError e
#eval do
let stx ← parseHaveClear "have h : Nat := sorry\nclear h x"
IO.println s!"Parsed syntax:\n{stx.prettyPrint}"
Kenny Lau (Oct 22 2025 at 22:34):
Frederick Pu said:
how would i be able to test my code then?
i'm a bit confused, what are the limitations you're facing with MetaM?
Frederick Pu (Oct 22 2025 at 22:34):
i didnt know that you could just eval a MetaM
Frederick Pu (Oct 22 2025 at 22:34):
im guessing it just does runMetaM with an empty context under the hood
Kenny Lau (Oct 22 2025 at 22:38):
import Lean
open Lean Elab
elab "bar" : term => do
return mkNatLit 67
def foo : TermElabM Expr := do
Term.elabTerm (← `(bar)) (mkConst ``Nat)
#eval foo
Kenny Lau (Oct 22 2025 at 22:38):
you can even eval a TermElabM
Frederick Pu (Oct 22 2025 at 22:39):
any where i can check out where teh reprs are implemented
Frederick Pu (Oct 22 2025 at 22:40):
oh nvm it's in CommandElabM so ig all of the monads can lift into it anyways
Frederick Pu (Oct 22 2025 at 22:47):
do you happen to know how to unquote the syntax eg:
/-- Extract ident and type term using pattern-matching quasiquotes -/
def extractIdentAndTerm (stx : Syntax) : Option (Name × Syntax) :=
match stx with
| `(tactic| have $id:ident : $ty:term := $val ..; clear $[_]*) =>
some (id.getId, ty)
| _ => none
Kenny Lau (Oct 22 2025 at 22:50):
@Frederick Pu not entirely sure what your question means, but i've just tried to fix the error:
/-- Extract ident and type term using pattern-matching quasiquotes -/
def extractIdentAndTerm (stx : Syntax) : Option (Name × Syntax) :=
match stx with
| `(tactic| have $id:ident : $ty:term := $val clear $ids:ident*) =>
some (id.getId, ty)
| _ => none
Aaron Liu (Oct 22 2025 at 22:50):
not sure what exactly you want this unquoting to do
Frederick Pu (Oct 22 2025 at 22:51):
ah ig i just didnt know what the antiquote pattern was for ident*
Kenny Lau (Oct 22 2025 at 22:52):
yeah it's just the same as when you declared it, i.e. $ids:ident*, but
- you need to turn
+to*, - and you use
ids.getElemsto get the array
(i wish there were more metaprogramming documentation, i doubt these two facts are documented anywhere)
Frederick Pu (Oct 22 2025 at 22:53):
i cant seem to get ids.getElems to work
Kenny Lau (Oct 22 2025 at 22:55):
huh looks like it's an array already there
Kenny Lau (Oct 22 2025 at 22:55):
so you don't need the getElems
Kenny Lau (Oct 22 2025 at 22:55):
aha, you only need getElems when you have a separator
Frederick Pu (Oct 22 2025 at 22:55):
oh nice!
Frederick Pu (Oct 22 2025 at 22:56):
what's the difference between TSyntax and Syntax? does it just contain additional metadata about terms?
Kenny Lau (Oct 22 2025 at 22:56):
Lean.TSyntax.raw {ks : SyntaxNodeKinds} (self : TSyntax ks) : Syntax
Lean.TSyntax.mk {ks : SyntaxNodeKinds} (raw : Syntax) : TSyntax ks
Kenny Lau (Oct 22 2025 at 22:57):
it tags the syntax with the kind, i.e. the `(tactic| in your example
Frederick Pu (Oct 22 2025 at 22:57):
so it's like a very light type signature for the Syntax?
Kenny Lau (Oct 22 2025 at 22:57):
yeah
Frederick Pu (Oct 22 2025 at 22:57):
kinda like which category you're in?
Frederick Pu (Oct 22 2025 at 22:59):
also is it possible to parser but stop at the first valid occurance of
/-- Custom syntax: "have <name> : <term> := <term>\nclear <names>* <tactic>*" -/
syntax (name := haveClearTac)
"have" ident ":" term ":=" term ppLine
"clear" ident* : tactic
that is if you have have n : term := sorry\nclear none\njunkvalues it will parse it correctly?
Kenny Lau (Oct 22 2025 at 23:00):
i'm actually confused here already, why are you redefining an existing syntax
Frederick Pu (Oct 22 2025 at 23:01):
we have an llm that's drafting in that format and sometimes it doesnt stop after the first clear properly and keeps going and makes multiple junk value drafts
Kenny Lau (Oct 22 2025 at 23:02):
what if you tell the llm to stop after the first clear
Kenny Lau (Oct 22 2025 at 23:03):
i don't see what the point of redefining syntax is
Kenny Lau (Oct 22 2025 at 23:03):
if you want to parse string to syntax or whatever that's fine, you still don't have to redefine an existing syntax
Frederick Pu (Oct 22 2025 at 23:03):
Or first clear with no indentation
Frederick Pu (Oct 22 2025 at 23:04):
Yeee that would prob be better
Frederick Pu (Oct 22 2025 at 23:04):
I was just having a bit of a brain damage moment
Kenny Lau (Oct 22 2025 at 23:06):
import Lean
open Lean Elab Tactic Parser
#check `(tacticSeq|
rcases d
obtain e)
Frederick Pu (Oct 22 2025 at 23:07):
So I just use the tacticSeq category
Kenny Lau (Oct 22 2025 at 23:08):
still, i feel like the better thing is to tell the llm what you want instead of filtering its output by hand
Frederick Pu (Oct 22 2025 at 23:08):
We're fine-tuning bfs prover so tool call format is apparently a no no
Frederick Pu (Oct 22 2025 at 23:12):
I don't really mind doing this since we need to run a parser anyways to get abln expr For the draft type
Frederick Pu (Oct 22 2025 at 23:28):
now im getting the following error:
import Lean
open Lean Elab Parser Tactic
/-- Example of parsing a string into syntax manually. -/
def parseHaveClear (src : String) : MetaM Syntax := do
match Parser.runParserCategory (← getEnv) `tacticSeq src with
| .ok x => pure x
| .error e => throwError e
#check TSyntaxArray
/-- Extract ident and type term using pattern-matching quasiquotes -/
def extractIdentAndTerm (stx : Syntax) : Option (Name × Syntax) :=
match stx with
| `(tacticSeq| have $id:ident : $ty:term := $val clear $ids:ident*) =>
some (id.getId, ty)
| _ => none
/-- Check if a `Syntax` is a `haveClearTac` -/
def isHaveClearTac (stx : Syntax) : Bool :=
stx.getKind == `haveClearTac
#check `(tacticSeq|
rcases d
obtain e)
#eval do
let stx ← parseHaveClear "have h : Nat := sorry\nclear h x"
if isHaveClearTac stx then
IO.println "✅ This is a haveClearTac!"
else
IO.println "❌ Not a haveClearTac."
IO.println (extractIdentAndTerm stx)
-- <input>:1:0: unknown parser category 'tacticSeq'
Aaron Liu (Oct 22 2025 at 23:39):
Try running parser category tactic
Frederick Pu (Oct 22 2025 at 23:40):
but that wont work since the input string is now a tacticSeq
Kenny Lau (Oct 22 2025 at 23:41):
Aaron Liu said:
Try running parser category
tactic
it interprets it as one tactic instead of two
Kenny Lau (Oct 22 2025 at 23:41):
hacky solution here, what if you just split on newlines and parse each line separately
Kenny Lau (Oct 22 2025 at 23:41):
or just do low level searching of the string literals have and clear
Frederick Pu (Oct 22 2025 at 23:42):
you can have statements as prop valued letE's inside of the draft type
Frederick Pu (Oct 22 2025 at 23:42):
i think this is good enough tho thanks:
import Lean
open Lean Elab Parser Tactic
/-- Example of parsing a string into syntax manually. -/
def parseHaveClear (src : String) : TermElabM Syntax := do
match Parser.runParserCategory (← getEnv) `tactic src with
| .ok x => pure x
| .error e => throwError e
#check TSyntaxArray
/-- Extract ident and type term using pattern-matching quasiquotes -/
def extractIdentAndTerm (stx : Syntax) : Option (Name × Syntax) :=
match stx with
| `(tactic| have $id:ident : $ty:term := $val clear $ids:ident*) =>
some (id.getId, ty)
| _ => none
#eval do
let stx ← parseHaveClear "have h : Nat := sorry\nclear h x"
IO.println (extractIdentAndTerm stx)
Frederick Pu (Oct 22 2025 at 23:43):
we could probably assume that the first line starting with clear is the clear line
Frederick Pu (Oct 22 2025 at 23:43):
and that would also work
Kenny Lau (Oct 22 2025 at 23:48):
it seems like the parser tacticSeq was defined without the category which is why it complains that there is no such category
Frederick Pu (Oct 22 2025 at 23:48):
that's so weird lol
Frederick Pu (Oct 22 2025 at 23:48):
how does that even work
Frederick Pu (Oct 22 2025 at 23:48):
is it like a one that's only used internally for tactic?
Aaron Liu (Oct 22 2025 at 23:52):
If you out a tactic seq in parens it becomes a tactic
Kenny Lau (Oct 22 2025 at 23:55):
@Frederick Pu i figured out how to run the parser directly:
import Lean
import Mathlib.Util.ParseCommand
open Lean Elab Parser Tactic
#eval do
let str := "have h : Nat := sorry\nclear h x"
let stx := Mathlib.GuardExceptions.captureException (← getEnv)
tacticSeq.fn str
logInfo s!"{stx}"
Kenny Lau (Oct 22 2025 at 23:57):
@Bryan Gin-ge Chen maybe we should upstream Mathlib.GuardExceptions.captureException to core?
Bryan Gin-ge Chen (Oct 22 2025 at 23:58):
Maybe! I'm probably not the right person to ask though.
Kenny Lau (Oct 22 2025 at 23:59):
who should i ask?
Kenny Lau (Oct 22 2025 at 23:59):
@Sebastian Ullrich maybe?
Last updated: Dec 20 2025 at 21:32 UTC