Zulip Chat Archive
Stream: lean4
Topic: prove termination for simple parser
Jiatong Yang (Apr 25 2024 at 03:20):
I have just translated a simple add-mul parser in Haskell to Lean4.
abbrev string := List Char
abbrev Parser := StateT string Option
def parseOK (p : Char → Bool) : Parser Char :=
do let s ← get
match s with
| [] => failure
| c::cs => if p c then set cs *> pure c else failure
def parseChar (c : Char) : Parser Char :=
parseOK (c == ·)
def parseString : string → Parser string := List.mapM parseChar
def parseNat : Parser Nat :=
do let s ← get
let n := s.takeWhile Char.isDigit
set (s.drop n.length)
(String.mk n).toNat?
inductive Exp : Type
| num : Nat → Exp
| add : Exp → Exp → Exp
| mul : Exp → Exp → Exp
def Exp.toString : Exp → String
| Exp.num n => n.repr
| Exp.add e e' => "(" ++ e.toString ++ " + " ++ e'.toString ++ ")"
| Exp.mul e e' => "(" ++ e.toString ++ " * " ++ e'.toString ++ ")"
def eval : Exp → Nat
| Exp.num n => n
| Exp.add e e' => eval e + eval e'
| Exp.mul e e' => eval e * eval e'
mutual
unsafe def parseExp : Parser Exp := do
let a ← parseMul
let b ← parseExp'
pure (b a)
unsafe def parseExp' : Parser (Exp → Exp) := (do
let _ ← parseChar '+'
let a ← parseMul
let b ← parseExp'
pure (λ e ↦ b (Exp.add e a)))
<|> pure id
unsafe def parseMul : Parser Exp := do
let a ← parseNum
let b ← parseMul'
pure (b a)
unsafe def parseMul' : Parser (Exp → Exp) := (do
let _ ← parseChar '*'
let a ← parseNum
let b ← parseMul'
pure (λ e ↦ b (Exp.mul e a)))
<|> pure id
unsafe def parseNum : Parser Exp := (do
let _ ← parseChar '('
let a ← parseExp
let _ ← parseChar ')'
pure a) <|> parseNat.map Exp.num
end
unsafe def main : IO Unit :=
match parseExp "(1+2*3)*5+8Hello, World!".data with
| none => IO.println "parse error"
| some (e, s) => IO.println s!"{e.toString} = {eval e}, remaining: {String.mk s}"
#eval main
But I have difficulty proving the termination for the mutual defined functions. (I also want help to improve the conciseness and readability of this code. Any idea is welcome.)
Alissa Tung (Apr 25 2024 at 05:56):
Showing the termination of parser combinators are hard. For example, when writing left-recursive syntaxes, one would write some parser definition won't make any progress so it never terminates.
Some methods can be used for writing total parsers:
- https://gallais.github.io/pdf/agdarsec18.pdf , Total Parser Combinators in dependent typed language
- Idris 2 tracks the termination of parser by a boolean value in type, and
sorry
(believe_me
) on basic building blocks - give some fuels to the parser, restrict the maximum recursion times
Last updated: May 02 2025 at 03:31 UTC