Zulip Chat Archive

Stream: general

Topic: monadic forgetting


view this post on Zulip Patrick Massot (Jan 18 2021 at 12:25):

What is the magic monadic combinator that would allow me to write a parser which throws away all the parsed tokens?

view this post on Zulip Anne Baanen (Jan 18 2021 at 12:29):

*> : m a -> m b -> m b?

view this post on Zulip Rob Lewis (Jan 18 2021 at 12:29):

What do you mean "throw away"? tk has type parser unit, it will consume a token and you don't have to do anything with it

view this post on Zulip Patrick Massot (Jan 18 2021 at 12:35):

I still want to return something, but it doesn't depend on the token.

view this post on Zulip Patrick Massot (Jan 18 2021 at 12:36):

The return type is an inductive with several constructors, and I want to choose one constructor depending on which token I see. But the constructor takes no argument.

view this post on Zulip Anne Baanen (Jan 18 2021 at 12:38):

*> would work, or <$: (mkX <$ token "x") <|> (mkY <$ token "y").

view this post on Zulip Patrick Massot (Jan 18 2021 at 12:39):

I think <$ is what I was missing.

view this post on Zulip Patrick Massot (Jan 18 2021 at 12:39):

I'm curious to know how you would do it with *>.

view this post on Zulip Mario Carneiro (Jan 18 2021 at 12:39):

you can use *> or <* depending on whether the thing you want to throw away is the left or right parser

view this post on Zulip Anne Baanen (Jan 18 2021 at 12:40):

a <$ mb is the same as pure a <* mb.

view this post on Zulip Mario Carneiro (Jan 18 2021 at 12:40):

for example tk "(" *> texpr <* tk ")" or thereabouts

view this post on Zulip Patrick Massot (Jan 18 2021 at 12:41):

Is there any way to end a tactic invocation with a token?

view this post on Zulip Mario Carneiro (Jan 18 2021 at 12:41):

what do you mean?

view this post on Zulip Mario Carneiro (Jan 18 2021 at 12:41):

are you writing a tactic or a parser?

view this post on Zulip Patrick Massot (Jan 18 2021 at 12:41):

both

view this post on Zulip Patrick Massot (Jan 18 2021 at 12:42):

Say I want to rewrite intro so that the syntax is now intro h bang, where bang is a token.

view this post on Zulip Mario Carneiro (Jan 18 2021 at 12:42):

def intro (h : parse texpr) (_ : parse (tk "bang")) : tactic unit

view this post on Zulip Mario Carneiro (Jan 18 2021 at 12:43):

or def intro (h : parse (texpr <* tk "bang")) : tactic unit

view this post on Zulip Mario Carneiro (Jan 18 2021 at 12:44):

However, keep in mind that this doesn't always work depending on the nature of the first parser, here texpr, because the parser monad does no backtracking

view this post on Zulip Patrick Massot (Jan 18 2021 at 14:26):

I'm sorry, I was interrupted. Thanks for you answer, I'll now try to process it.

view this post on Zulip Patrick Massot (Jan 18 2021 at 14:27):

I already noticed that no backtracking issue, it's quite annoying.

view this post on Zulip Yakov Pechersky (Jan 18 2021 at 15:33):

We can define a try parser, I'm surprised it wasn't defined in the core.

view this post on Zulip Yakov Pechersky (Jan 18 2021 at 15:37):

open parser parse_result

section try

variables {α : Type} {p : parser α}

def try (p : parser α) : parser α :=
λ cb n, match p cb n with
| (fail _ err) := fail n err
| ok := ok
end

@[simp] lemma try_of_done_iff {cb n n' a} :
  p.try cb n = done n' a  p cb n = done n' a :=
by cases hp : p cb n; simp [hp, try]

@[simp] lemma try_of_fail_iff {cb n n' err} :
  p.try cb n = fail n' err  n = n'   np, p cb n = fail np err :=
by cases hp : p cb n; simp [hp, try]

lemma valid.try (h : p.valid) : p.try.valid :=
λ cb n, by { cases hp : p cb n; simpa [hp, try, parse_result.pos] using h cb n <|> simp [hp, try] }

@[simp] lemma mem_try_iff {a : α} : a  p.try  a  p :=
by simp [mem_def]

end try

view this post on Zulip Yakov Pechersky (Jan 18 2021 at 15:41):

And here's a way to use it:

def p1 : parser  := ch 'a' *> nat <* ch 'b'

def p2 : parser  := ch 'a' *> nat <* ch 'c'

#eval run_string (p1 <|> p2) "a1c" -- fails: sum.inl "a1c\n  ^\n\nexpected: b\n"
#eval run_string (mfirst id [p1, p2]) "a1c" -- fails: equal to above
#eval run_string (mfirst try [p1, p2]) "a1c" -- succeeds: sum.inr 1

view this post on Zulip Yakov Pechersky (Jan 18 2021 at 15:51):

It fails in the first case because there is no backtracking after consuming that first a, so when the c is reached and p1 fails because it expects b, the orelse falls back to p2. However, orelse is defined for parsers to pass the position at the failure point, so p2 sees a b and expects an a, and fails too. Then, the whole orelse fails, and gives the failure message of the earlier (in terms of position) parser. That's why we only see the p1's error.

But if we wrap each one in try, then when p1 fails, we start parsing with p2 at the beginning of the buffer.

view this post on Zulip Patrick Massot (Jan 18 2021 at 16:27):

Oh thanks, this will probably help me clean up a lot of code. Up to now I was able to parse everything I wanted but I sort of simulate backtracking with unpleasant boilerplate.


Last updated: May 14 2021 at 13:24 UTC