Zulip Chat Archive

Stream: general

Topic: monadic forgetting

Patrick Massot (Jan 18 2021 at 12:25):

What is the magic monadic combinator that would allow me to write a parser which throws away all the parsed tokens?

Anne Baanen (Jan 18 2021 at 12:29):

*> : m a -> m b -> m b?

Rob Lewis (Jan 18 2021 at 12:29):

What do you mean "throw away"? tk has type parser unit, it will consume a token and you don't have to do anything with it

Patrick Massot (Jan 18 2021 at 12:35):

I still want to return something, but it doesn't depend on the token.

Patrick Massot (Jan 18 2021 at 12:36):

The return type is an inductive with several constructors, and I want to choose one constructor depending on which token I see. But the constructor takes no argument.

Anne Baanen (Jan 18 2021 at 12:38):

*> would work, or <$: (mkX <$ token "x") <|> (mkY <$ token "y").

Patrick Massot (Jan 18 2021 at 12:39):

I think <$ is what I was missing.

Patrick Massot (Jan 18 2021 at 12:39):

I'm curious to know how you would do it with *>.

Mario Carneiro (Jan 18 2021 at 12:39):

you can use *> or <* depending on whether the thing you want to throw away is the left or right parser

Anne Baanen (Jan 18 2021 at 12:40):

a <$ mb is the same as pure a <* mb.

Mario Carneiro (Jan 18 2021 at 12:40):

for example tk "(" *> texpr <* tk ")" or thereabouts

Patrick Massot (Jan 18 2021 at 12:41):

Is there any way to end a tactic invocation with a token?

Mario Carneiro (Jan 18 2021 at 12:41):

what do you mean?

Mario Carneiro (Jan 18 2021 at 12:41):

are you writing a tactic or a parser?

Patrick Massot (Jan 18 2021 at 12:41):

both

Patrick Massot (Jan 18 2021 at 12:42):

Say I want to rewrite intro so that the syntax is now intro h bang, where bang is a token.

Mario Carneiro (Jan 18 2021 at 12:42):

def intro (h : parse texpr) (_ : parse (tk "bang")) : tactic unit

Mario Carneiro (Jan 18 2021 at 12:43):

or def intro (h : parse (texpr <* tk "bang")) : tactic unit

Mario Carneiro (Jan 18 2021 at 12:44):

However, keep in mind that this doesn't always work depending on the nature of the first parser, here texpr, because the parser monad does no backtracking

Patrick Massot (Jan 18 2021 at 14:26):

I'm sorry, I was interrupted. Thanks for you answer, I'll now try to process it.

Patrick Massot (Jan 18 2021 at 14:27):

I already noticed that no backtracking issue, it's quite annoying.

Yakov Pechersky (Jan 18 2021 at 15:33):

We can define a try parser, I'm surprised it wasn't defined in the core.

Yakov Pechersky (Jan 18 2021 at 15:37):

open parser parse_result

section try

variables {α : Type} {p : parser α}

def try (p : parser α) : parser α :=
λ cb n, match p cb n with
| (fail _ err) := fail n err
| ok := ok
end

@[simp] lemma try_of_done_iff {cb n n' a} :
  p.try cb n = done n' a ↔ p cb n = done n' a :=
by cases hp : p cb n; simp [hp, try]

@[simp] lemma try_of_fail_iff {cb n n' err} :
  p.try cb n = fail n' err ↔ n = n' ∧ ∃ np, p cb n = fail np err :=
by cases hp : p cb n; simp [hp, try]

lemma valid.try (h : p.valid) : p.try.valid :=
λ cb n, by { cases hp : p cb n; simpa [hp, try, parse_result.pos] using h cb n <|> simp [hp, try] }

@[simp] lemma mem_try_iff {a : α} : a ∈ p.try ↔ a ∈ p :=
by simp [mem_def]

end try

Yakov Pechersky (Jan 18 2021 at 15:41):

And here's a way to use it:

def p1 : parser ℕ := ch 'a' *> nat <* ch 'b'

def p2 : parser ℕ := ch 'a' *> nat <* ch 'c'

#eval run_string (p1 <|> p2) "a1c" -- fails: sum.inl "a1c\n  ^\n\nexpected: b\n"
#eval run_string (mfirst id [p1, p2]) "a1c" -- fails: equal to above
#eval run_string (mfirst try [p1, p2]) "a1c" -- succeeds: sum.inr 1

Yakov Pechersky (Jan 18 2021 at 15:51):

It fails in the first case because there is no backtracking after consuming that first a, so when the c is reached and p1 fails because it expects b, the orelse falls back to p2. However, orelse is defined for parsers to pass the position at the failure point, so p2 sees a b and expects an a, and fails too. Then, the whole orelse fails, and gives the failure message of the earlier (in terms of position) parser. That's why we only see the p1's error.

But if we wrap each one in try, then when p1 fails, we start parsing with p2 at the beginning of the buffer.

Patrick Massot (Jan 18 2021 at 16:27):

Oh thanks, this will probably help me clean up a lot of code. Up to now I was able to parse everything I wanted but I sort of simulate backtracking with unpleasant boilerplate.

Last updated: May 02 2025 at 03:31 UTC