Zulip Chat Archive

Stream: metaprogramming / tactics

Topic: Backtracking parsers


Eric Wieser (Jun 10 2022 at 10:50):

Is this supposed to work?

open lean lean.parser

reserve notation `a`
reserve notation `b`
reserve notation `c`
meta def the_parser : parser  :=
((do tk "a", tk "c", pure 1) <|> (do tk "a", tk "b", pure 2))

#eval do
  x  parser.run_with_input the_parser "a c",
  tactic.trace x  -- 1

#eval do
  x  parser.run_with_input the_parser "a b", -- error: c expected
  tactic.trace x

It seems like docs#lean.parser.parser_orelse doesn't actually behave how I would expect an orelse to behave

Reid Barton (Jun 10 2022 at 11:05):

It's pretty common in a parser combinator library for <|> not to backtrack out of a branch that has made some progress, so that you have to write something like try (do tk "a", tk "c", pure 1) <|> (do tk "a", tk "b", pure 2) if you want to allow it to

Eric Wieser (Jun 10 2022 at 11:07):

But we don't have a parser try, right?

Eric Wieser (Jun 10 2022 at 11:08):

Also, in the tactic monad <|> does backtrack (src#interaction_monad_orelse)

Eric Wieser (Jun 10 2022 at 11:11):

Attempting to use interaction_monad_orelse for parser fails, I guess because parser is (needlessly?) impure

Damiano Testa (Jun 10 2022 at 11:15):

I am very far from my comfort zone, but this works:

meta def the_parser1 : parser  :=
do tk "a" >> do tk "c" >> pure 1 <|> do tk "b", pure 2

#eval do
  x  parser.run_with_input the_parser1 "a c",
  tactic.trace x  -- 1

#eval do
  x  parser.run_with_input the_parser1 "a b",
  tactic.trace x -- 2

Does this suggest that the parser backtracks "a little"?

Eric Wieser (Jun 10 2022 at 11:15):

There's no backtracking there as the parser makes no progress

Damiano Testa (Jun 10 2022 at 11:16):

Ah, I see! Thanks!

Reid Barton (Jun 10 2022 at 11:17):

You can probably just copy the implementation of https://hackage.haskell.org/package/parsec-3.1.15.1/docs/src/Text.Parsec.Prim.html#try

Eric Wieser (Jun 10 2022 at 11:18):

I'm pretty sure you can't because parser_state is lying and there is state stored elsewhere

Eric Wieser (Jun 10 2022 at 11:18):

Which means that backtracking the state doesn't actually work

Damiano Testa (Jun 10 2022 at 11:20):

Maybe another very superficial observation that suggests that this does not work is that the parentheses in the definition of the_parser are optional. There seems to be no record of which sequence of commands is executed first...

Damiano Testa (Jun 10 2022 at 11:20):

i.e.

meta def the_parser : parser  :=
do tk "a", tk "c", pure 1 <|> do tk "a", tk "b", pure 2

"works" just as well.

Reid Barton (Jun 10 2022 at 11:21):

Oh I was looking at the thing actually called parser.

Reid Barton (Jun 10 2022 at 11:23):

I'm also not sure what the real question is--is it "do we want <|> to work like this" or "how do I implement try" or "how do I implement some other parser Y"

Eric Wieser (Jun 10 2022 at 11:24):

It's "how do I implement try"

Eric Wieser (Jun 10 2022 at 11:24):

And I think the answer is "edit the C++ to expose parser::backtracking_scope" (https://github.com/leanprover-community/lean/blob/848cddb3efb06cdcc53dc46c3784c04a0d49fc8a/src/frontends/lean/parser.cpp#L2138)

Damiano Testa (Jun 10 2022 at 11:29):

Can I ask if this is related to the alternative instance on lean.parser (that seems to exist)? Is that what should take care of backtracking and isn't?

Eric Wieser (Jun 10 2022 at 11:29):

Yes, that's what this question is about

Eric Wieser (Jun 10 2022 at 11:30):

It's implemented in terms of docs#lean.parser.parser_or_else which I link in my first message

Damiano Testa (Jun 10 2022 at 11:30):

Ok, thanks for confirming! A few days ago, I would not even have been able to formulate a coherent thought about this, but I am happy to see that I understand more now!

Reid Barton (Jun 10 2022 at 11:39):

OK I understand the problem now. Normally one would attempt to avoid try by left-factoring the parser (or if necessary, by redefining the grammar).

Mario Carneiro (Jun 10 2022 at 11:39):

Yes, lean 3 is fundamentally incapable of backtracking due to its implementation. It has one token lookahead and that's it

Mario Carneiro (Jun 10 2022 at 11:40):

the monad tricks don't work because it's actually mutable state in the background

Eric Wieser (Jun 10 2022 at 11:41):

Do you thinking the backtracking_scope API could resolve that Mario?

Mario Carneiro (Jun 10 2022 at 11:41):

I don't know much about it, is it currently used to implement something?

Mario Carneiro (Jun 10 2022 at 11:42):

personally I don't think it is worth it at this stage. Lean 4 has a fully backtracking parser

Mario Carneiro (Jun 10 2022 at 11:43):

and there are usually hacks you can do to avoid needing lots of lookahead by postprocessing the parser results

Mario Carneiro (Jun 10 2022 at 11:43):

like your example can trivially be factored to do tk "a" first

Mario Carneiro (Jun 10 2022 at 11:55):

Looking at backtracking_scope, it seems not to be used much, and the issue that introduced it - leanprover/lean#1745 - does not seem to be handling any kind of general backtracking system but rather a way to do error recovery after parsing a missing expression. But I think the main reason it is not suitable here is that backtracking_scope stores a single backtrack position in the parser state, which means it does not work in nested situations, it can't support the full scope of a parsec style try combinator

Eric Wieser (Jun 10 2022 at 12:04):

Ouch, I'd hoped that something with RAII semantics would be reentrant. You're right then that there's no point trying to use that

Eric Wieser (Jun 10 2022 at 13:21):

Mario Carneiro said:

and there are usually hacks you can do to avoid needing lots of lookahead by postprocessing the parser results

This is what I ended up doing in the end


Last updated: Dec 20 2023 at 11:08 UTC