Zulip Chat Archive
Stream: general
Topic: mutual recursive parsers
Kana (Feb 17 2021 at 20:49):
It is rather easy to write recursive parser like this (example from data.buffer.parser
)
private def fix_core (f : parser α → parser α) : ℕ → parser α
| 0 := failure
| (n + 1) := f (fix_core n)
def fix (f : parser α → parser α) : parser α :=
λ input, fix_core f (input.length + 1) input
But what if I want a lot of mutual recursive parsers?
My current solution is:
- hlist
inductive hprod (f : Type → Type) : list Type → Type 1
| nil : hprod []
| cons {α αs} (x : f α) (xs : hprod αs) : hprod (α :: αs)
notation `⟦` l:(foldr `, ` (h t, cons h t) nil `⟧`) := l
- fixN by hprod
private def fixN_core {αs} (f : hprod parser αs → hprod parser αs) : ℕ → hprod parser αs
| 0 := hprod.pure @parser.failure
| (n + 1) := f (fixN_core n)
def fixN : ∀{αs}, (hprod parser αs → hprod parser αs) → hprod parser αs
| [] f := hprod.nil
| (α::αs) f :=
hprod.cons
(λ input, (fixN_core f (input.length + 1)).head input)
(fixN (hprod.tail ∘ f ∘ hprod.cons failure))
- example
def parsers : hprod parser
[ value -- element
, list value -- elements
, value -- value
, ...
] :=
fixN $ λp, match p with
| ⟦element, elements, valueP, ...⟧ :=
⟦ -- element
ws *> valueP <* ws
, -- elements
sep_by1 (ch' ',') element
, --value
value.object <$> object <|>
, ...
⟧
end
Is there some better way to do this?
Kana (Feb 17 2021 at 20:56):
I think I should try higher kinded structures like
structure parsers (h : Type -> Type) :=
(element : h value)
(elements : h (list value))
...
Mario Carneiro (Feb 17 2021 at 20:57):
The usual workaround is to pass later parsers as arguments to earlier parsers
Mario Carneiro (Feb 17 2021 at 20:58):
I don't think generalizing the monad is necessary here in any case
Mario Carneiro (Feb 17 2021 at 20:59):
In your case, you can probably get away with just one main parser that you pass to all the others, namely valueP
Mario Carneiro (Feb 17 2021 at 21:00):
do you have a MWE?
Mario Carneiro (Feb 17 2021 at 21:07):
here I made one for you
import data.buffer.parser
meta inductive value
| string : string → value
| object : list (string × value) → value
| array : list value → value
| number : ℕ → value
| bool : bool → value
| null : value
open parser
def ws : parser unit := sorry
def ch' : char → parser unit := sorry
meta def object : parser (list (string × value)) := sorry
meta def valueP : parser value := value.object <$> object <|> value.array <$> arrayP
meta def element : parser value := ws *> valueP <* ws
meta def elements : parser (list value) := sep_by1 (ch' ',') element
meta def arrayP : parser (list value) := ch' '[' *> elements <* ch' '['
Note that there isn't any order we can write the declarations since they are in a cyclic dependency
Mario Carneiro (Feb 17 2021 at 21:09):
However they aren't too far from being acyclic; if we break the dependence on valueP
then everything becomes acyclic
meta def object (valueP : parser value) : parser (list (string × value)) := sorry
meta def element (valueP : parser value) : parser value := ws *> valueP <* ws
meta def elements (valueP : parser value) : parser (list value) := sep_by1 (ch' ',') (element valueP)
meta def arrayP (valueP : parser value) : parser (list value) := ch' '[' *> (elements valueP) <* ch' '['
meta def valueP : parser value := value.object <$> object valueP <|> value.array <$> arrayP valueP
The trick here is that valueP
is allowed to pass itself as an argument to all the other definitions
Mario Carneiro (Feb 17 2021 at 21:11):
Lean 4 has the notion of a mutual block that makes these kinds of recursions a lot easier. Lean 3 has mutual definitions too but they are not compiled very well
Mario Carneiro (Feb 17 2021 at 21:14):
Here's what it looks like with a lean 3 mutual def:
meta mutual def object, element, elements, arrayP, valueP
with object : parser (list (string × value)) | x := sorry
with element : parser value | x := (ws *> valueP <* ws) x
with elements : parser (list value) | x := sep_by1 (ch' ',') element x
with arrayP : parser (list value) | x := (ch' '[' *> elements <* ch' '[') x
with valueP : parser value | x := (value.object <$> object <|> value.array <$> arrayP) x
The | x :=
nuisance parameter is because we can only use this to define values of function type, which parser
happens to be
Kana (Feb 17 2021 at 21:16):
My not so minimal WE (in MWE namespace)
https://gist.github.com/kana-sama/a4e2e7ca66f0bcefba9b5c15d3fe85a6
Kana (Feb 17 2021 at 21:17):
Thanks again, I am looking at your example
Mario Carneiro (Feb 17 2021 at 21:19):
Here's a version of your MWE:
namespace MWE
open parser
meta def parser_a (b : parser (list char)) : parser (list char) :=
(::) <$> ch 'a' <*> (b <|> eof $> [])
meta def parser_b : parser (list char) :=
(::) <$> ch 'b' <*> (parser_a parser_b <|> eof $> [])
meta def abab := list.as_string <$> (parser_a parser_b <|> parser_b)
#eval abab.run "abab"
end MWE
Kana (Feb 17 2021 at 21:22):
Oh thanks. So does meta
have an effect like "disable totality checker and mark function as possible partial"? It is what I want.
Mario Carneiro (Feb 17 2021 at 21:26):
And here's your longer example:
meta def element (valueP : parser value) : parser value :=
ws *> valueP <* ws
meta def elements (valueP : parser value) : parser (list value) :=
sep_by1 (ch' ',') (element valueP)
meta def member (valueP : parser value) : parser (string × value) :=
do key ← ws *> stringP <* ws,
ch ':',
value ← element valueP,
pure ⟨key, value⟩
meta def members (valueP : parser value) : parser (list (string × value)) :=
sep_by1 (ch' ',') (member valueP)
meta def object (valueP : parser value) : parser (list (string × value)) :=
[] <$ (ch '{' *> ws <* ch '}') <|>
(ch '{' *> members valueP <* ch '}')
meta def array (valueP : parser value) : parser (list value) :=
[] <$ (ch '[' *> ws <* ch ']') <|>
(ch '[' *> elements valueP <* ch ']')
meta def json_parser : parser value | x :=
(value.object <$> object json_parser <|>
value.array <$> array json_parser <|>
value.string <$> stringP <|>
value.number <$> nat <|>
value.bool true <$ str "true" <|>
value.bool false <$ str "false" <|>
value.null <$ str "null") x
#eval json_parser.run "[{\"1\\n\\u0030\": 1}, 2]"
Mario Carneiro (Feb 17 2021 at 21:28):
It doesn't disable totality checking, i.e. matches still have to be exhaustive, but it does disable termination checking. In fact, in the first version of the code I just posted, I left off the | x :=
part in json_parser
, and doing this will cause the #eval
line to stack overflow because it tries to eagerly evaluate all the closures before it even starts giving it the input
Mario Carneiro (Feb 17 2021 at 21:29):
the | x :=
says that we should leave json_parser
unevaluated (for example when this term appears in the other parsers) until we have an input to give it
Mario Carneiro (Feb 17 2021 at 21:31):
In lean 4 there is a partial
keyword for enabling this particular functionality without also opting in to the unsafe parts of the language
Last updated: Dec 20 2023 at 11:08 UTC