Zulip Chat Archive

Stream: lean4

Topic: prove termination for simple parser


Jiatong Yang (Apr 25 2024 at 03:20):

I have just translated a simple add-mul parser in Haskell to Lean4.

abbrev string := List Char
abbrev Parser := StateT string Option

def parseOK (p : Char  Bool) : Parser Char :=
  do let s  get
     match s with
     | [] => failure
     | c::cs => if p c then set cs *> pure c else failure

def parseChar (c : Char) : Parser Char :=
  parseOK (c == ·)

def parseString : string  Parser string := List.mapM parseChar

def parseNat : Parser Nat :=
  do let s  get
     let n := s.takeWhile Char.isDigit
     set (s.drop n.length)
     (String.mk n).toNat?

inductive Exp : Type
| num : Nat  Exp
| add : Exp  Exp  Exp
| mul : Exp  Exp  Exp

def Exp.toString : Exp  String
| Exp.num n => n.repr
| Exp.add e e' => "(" ++ e.toString ++ " + " ++ e'.toString ++ ")"
| Exp.mul e e' => "(" ++ e.toString ++ " * " ++ e'.toString ++ ")"

def eval : Exp  Nat
| Exp.num n => n
| Exp.add e e' => eval e + eval e'
| Exp.mul e e' => eval e * eval e'

mutual
unsafe def parseExp : Parser Exp := do
  let a  parseMul
  let b  parseExp'
  pure (b a)

unsafe def parseExp' : Parser (Exp  Exp) := (do
  let _  parseChar '+'
  let a  parseMul
  let b  parseExp'
  pure (λ e  b (Exp.add e a)))
  <|> pure id

unsafe def parseMul : Parser Exp := do
  let a  parseNum
  let b  parseMul'
  pure (b a)

unsafe def parseMul' : Parser (Exp  Exp) := (do
  let _  parseChar '*'
  let a  parseNum
  let b  parseMul'
  pure (λ e  b (Exp.mul e a)))
  <|> pure id

unsafe def parseNum : Parser Exp := (do
  let _  parseChar '('
  let a  parseExp
  let _  parseChar ')'
  pure a) <|> parseNat.map Exp.num
end

unsafe def main : IO Unit :=
  match parseExp "(1+2*3)*5+8Hello, World!".data with
  | none => IO.println "parse error"
  | some (e, s) => IO.println s!"{e.toString} = {eval e}, remaining: {String.mk s}"

#eval main

But I have difficulty proving the termination for the mutual defined functions. (I also want help to improve the conciseness and readability of this code. Any idea is welcome.)

Alissa Tung (Apr 25 2024 at 05:56):

Showing the termination of parser combinators are hard. For example, when writing left-recursive syntaxes, one would write some parser definition won't make any progress so it never terminates.

Some methods can be used for writing total parsers:

  • https://gallais.github.io/pdf/agdarsec18.pdf , Total Parser Combinators in dependent typed language
  • Idris 2 tracks the termination of parser by a boolean value in type, and sorry (believe_me) on basic building blocks
  • give some fuels to the parser, restrict the maximum recursion times

Last updated: May 02 2025 at 03:31 UTC