Zulip Chat Archive

Stream: metaprogramming / tactics

Topic: Leveraging Lean's macro/elab for `String` parsing?


Youngju Song (Jan 22 2025 at 20:43):

I have a simple parsing task where I want to parse e.g., the following String:

"(
(define-fun x () Int 42)
(define-fun y () (_ BitVec 4) #x5)
)"

into

(: Std.HashMap String Val) |>.insert "x" (.int 4) |>.insert "y" (.bitvec 0x5)

where Val is defined as inductive Val: Type := | int (i: Int) | bitvec {n} (bv: BitVec n).
Maybe I can define a simple parser combinators, but since I have no experience with parser combinators, I am curious if I can piggyback on Lean's features for this task.

Specifically, I think I know how to write macro/elaborator in Lean4 so that

#eval (parseThis (
(
(define-fun x () Int 42)
(define-fun y () (_ BitVec 4) #x5)
)
))

evaluates to the term above. But I am not sure how to parse the String

"(
(define-fun x () Int 42)
(define-fun y () (_ BitVec 4) #x5)
)"

which is different from Lean Syntax. I guess there is a Lean's internal parser that converts String to Syntax, and maybe I can chain that with the macro to get a String parser?

Does this make sense? Would it be something that could theoretically possible but discouraged pattern?

Henrik Böving (Jan 22 2025 at 20:45):

We generally use the currently not yet stabilized parser library in Std.Internal.Parsec for that kind of task.

Henrik Böving (Jan 22 2025 at 20:48):

Also out of curiosity, what's your reason for parsing SMTLIB?

Arthur Paulino (Jan 22 2025 at 21:02):

I was able to reuse an elaborator I wrote for parsing here: https://github.com/arthurpaulino/FxyLang

But be warned that this is not a proper solution. To parse strings you should not rely on metaprogramming.

Youngju Song (Jan 23 2025 at 22:58):

@Henrik Böving @Arthur Paulino Thank you for the comments! I will give it a try to Parsec.

Also out of curiosity, what's your reason for parsing SMTLIB?

I am looking for something that corresponds to z3py binding in Lean4. I don't intend to use SMT solver to prove things in Lean, but I instead need to retrieve the model and play with it inside Lean4. I looked at few libraries (lean-smt, lean-auto, lean-cvc5) but nothing seems to fit my use case, so I am building one myself which includes a simple parser for the model written in SMTLIB format. While the whole SMTLIB format could be quite large, I just need a parser for the model which seems to be feasible.

Henrik Böving (Jan 23 2025 at 23:01):

Generally speaking the SMT-LIB format is just built from s expressions so writing a parser for the entire format can be trivial if you don't want to have all syntactic variants encoded in the syntax tree but are content having them in the s expressions.


Last updated: May 02 2025 at 03:31 UTC