Zulip Chat Archive
Stream: general
Topic: monadic forgetting
Patrick Massot (Jan 18 2021 at 12:25):
What is the magic monadic combinator that would allow me to write a parser which throws away all the parsed tokens?
Anne Baanen (Jan 18 2021 at 12:29):
*> : m a -> m b -> m b
?
Rob Lewis (Jan 18 2021 at 12:29):
What do you mean "throw away"? tk
has type parser unit
, it will consume a token and you don't have to do anything with it
Patrick Massot (Jan 18 2021 at 12:35):
I still want to return something, but it doesn't depend on the token.
Patrick Massot (Jan 18 2021 at 12:36):
The return type is an inductive with several constructors, and I want to choose one constructor depending on which token I see. But the constructor takes no argument.
Anne Baanen (Jan 18 2021 at 12:38):
*>
would work, or <$
: (mkX <$ token "x") <|> (mkY <$ token "y")
.
Patrick Massot (Jan 18 2021 at 12:39):
I think <$
is what I was missing.
Patrick Massot (Jan 18 2021 at 12:39):
I'm curious to know how you would do it with *>
.
Mario Carneiro (Jan 18 2021 at 12:39):
you can use *>
or <*
depending on whether the thing you want to throw away is the left or right parser
Anne Baanen (Jan 18 2021 at 12:40):
a <$ mb
is the same as pure a <* mb
.
Mario Carneiro (Jan 18 2021 at 12:40):
for example tk "(" *> texpr <* tk ")"
or thereabouts
Patrick Massot (Jan 18 2021 at 12:41):
Is there any way to end a tactic invocation with a token?
Mario Carneiro (Jan 18 2021 at 12:41):
what do you mean?
Mario Carneiro (Jan 18 2021 at 12:41):
are you writing a tactic or a parser?
Patrick Massot (Jan 18 2021 at 12:41):
both
Patrick Massot (Jan 18 2021 at 12:42):
Say I want to rewrite intro
so that the syntax is now intro h bang,
where bang
is a token.
Mario Carneiro (Jan 18 2021 at 12:42):
def intro (h : parse texpr) (_ : parse (tk "bang")) : tactic unit
Mario Carneiro (Jan 18 2021 at 12:43):
or def intro (h : parse (texpr <* tk "bang")) : tactic unit
Mario Carneiro (Jan 18 2021 at 12:44):
However, keep in mind that this doesn't always work depending on the nature of the first parser, here texpr
, because the parser monad does no backtracking
Patrick Massot (Jan 18 2021 at 14:26):
I'm sorry, I was interrupted. Thanks for you answer, I'll now try to process it.
Patrick Massot (Jan 18 2021 at 14:27):
I already noticed that no backtracking issue, it's quite annoying.
Yakov Pechersky (Jan 18 2021 at 15:33):
We can define a try
parser, I'm surprised it wasn't defined in the core.
Yakov Pechersky (Jan 18 2021 at 15:37):
open parser parse_result
section try
variables {α : Type} {p : parser α}
def try (p : parser α) : parser α :=
λ cb n, match p cb n with
| (fail _ err) := fail n err
| ok := ok
end
@[simp] lemma try_of_done_iff {cb n n' a} :
p.try cb n = done n' a ↔ p cb n = done n' a :=
by cases hp : p cb n; simp [hp, try]
@[simp] lemma try_of_fail_iff {cb n n' err} :
p.try cb n = fail n' err ↔ n = n' ∧ ∃ np, p cb n = fail np err :=
by cases hp : p cb n; simp [hp, try]
lemma valid.try (h : p.valid) : p.try.valid :=
λ cb n, by { cases hp : p cb n; simpa [hp, try, parse_result.pos] using h cb n <|> simp [hp, try] }
@[simp] lemma mem_try_iff {a : α} : a ∈ p.try ↔ a ∈ p :=
by simp [mem_def]
end try
Yakov Pechersky (Jan 18 2021 at 15:41):
And here's a way to use it:
def p1 : parser ℕ := ch 'a' *> nat <* ch 'b'
def p2 : parser ℕ := ch 'a' *> nat <* ch 'c'
#eval run_string (p1 <|> p2) "a1c" -- fails: sum.inl "a1c\n ^\n\nexpected: b\n"
#eval run_string (mfirst id [p1, p2]) "a1c" -- fails: equal to above
#eval run_string (mfirst try [p1, p2]) "a1c" -- succeeds: sum.inr 1
Yakov Pechersky (Jan 18 2021 at 15:51):
It fails in the first case because there is no backtracking after consuming that first a
, so when the c
is reached and p1
fails because it expects b
, the orelse
falls back to p2
. However, orelse
is defined for parsers to pass the position at the failure point, so p2
sees a b
and expects an a
, and fails too. Then, the whole orelse
fails, and gives the failure message of the earlier (in terms of position) parser. That's why we only see the p1
's error.
But if we wrap each one in try
, then when p1
fails, we start parsing with p2
at the beginning of the buffer.
Patrick Massot (Jan 18 2021 at 16:27):
Oh thanks, this will probably help me clean up a lot of code. Up to now I was able to parse everything I wanted but I sort of simulate backtracking with unpleasant boilerplate.
Last updated: Dec 20 2023 at 11:08 UTC