Zulip Chat Archive
Stream: metaprogramming / tactics
Topic: Backtracking parsers
Eric Wieser (Jun 10 2022 at 10:50):
Is this supposed to work?
open lean lean.parser
reserve notation `a`
reserve notation `b`
reserve notation `c`
meta def the_parser : parser ℕ :=
((do tk "a", tk "c", pure 1) <|> (do tk "a", tk "b", pure 2))
#eval do
x ← parser.run_with_input the_parser "a c",
tactic.trace x -- 1
#eval do
x ← parser.run_with_input the_parser "a b", -- error: c expected
tactic.trace x
It seems like docs#lean.parser.parser_orelse doesn't actually behave how I would expect an orelse
to behave
Reid Barton (Jun 10 2022 at 11:05):
It's pretty common in a parser combinator library for <|>
not to backtrack out of a branch that has made some progress, so that you have to write something like try (do tk "a", tk "c", pure 1) <|> (do tk "a", tk "b", pure 2)
if you want to allow it to
Eric Wieser (Jun 10 2022 at 11:07):
But we don't have a parser try, right?
Eric Wieser (Jun 10 2022 at 11:08):
Also, in the tactic monad <|>
does backtrack (src#interaction_monad_orelse)
Eric Wieser (Jun 10 2022 at 11:11):
Attempting to use interaction_monad_orelse
for parser
fails, I guess because parser
is (needlessly?) impure
Damiano Testa (Jun 10 2022 at 11:15):
I am very far from my comfort zone, but this works:
meta def the_parser1 : parser ℕ :=
do tk "a" >> do tk "c" >> pure 1 <|> do tk "b", pure 2
#eval do
x ← parser.run_with_input the_parser1 "a c",
tactic.trace x -- 1
#eval do
x ← parser.run_with_input the_parser1 "a b",
tactic.trace x -- 2
Does this suggest that the parser backtracks "a little"?
Eric Wieser (Jun 10 2022 at 11:15):
There's no backtracking there as the parser makes no progress
Damiano Testa (Jun 10 2022 at 11:16):
Ah, I see! Thanks!
Reid Barton (Jun 10 2022 at 11:17):
You can probably just copy the implementation of https://hackage.haskell.org/package/parsec-3.1.15.1/docs/src/Text.Parsec.Prim.html#try
Eric Wieser (Jun 10 2022 at 11:18):
I'm pretty sure you can't because parser_state
is lying and there is state stored elsewhere
Eric Wieser (Jun 10 2022 at 11:18):
Which means that backtracking the state doesn't actually work
Damiano Testa (Jun 10 2022 at 11:20):
Maybe another very superficial observation that suggests that this does not work is that the parentheses in the definition of the_parser
are optional. There seems to be no record of which sequence of commands is executed first...
Damiano Testa (Jun 10 2022 at 11:20):
i.e.
meta def the_parser : parser ℕ :=
do tk "a", tk "c", pure 1 <|> do tk "a", tk "b", pure 2
"works" just as well.
Reid Barton (Jun 10 2022 at 11:21):
Oh I was looking at the thing actually called parser
.
Reid Barton (Jun 10 2022 at 11:23):
I'm also not sure what the real question is--is it "do we want <|>
to work like this" or "how do I implement try
" or "how do I implement some other parser Y"
Eric Wieser (Jun 10 2022 at 11:24):
It's "how do I implement try
"
Eric Wieser (Jun 10 2022 at 11:24):
And I think the answer is "edit the C++ to expose parser::backtracking_scope
" (https://github.com/leanprover-community/lean/blob/848cddb3efb06cdcc53dc46c3784c04a0d49fc8a/src/frontends/lean/parser.cpp#L2138)
Damiano Testa (Jun 10 2022 at 11:29):
Can I ask if this is related to the alternative
instance on lean.parser
(that seems to exist)? Is that what should take care of backtracking and isn't?
Eric Wieser (Jun 10 2022 at 11:29):
Yes, that's what this question is about
Eric Wieser (Jun 10 2022 at 11:30):
It's implemented in terms of docs#lean.parser.parser_or_else which I link in my first message
Damiano Testa (Jun 10 2022 at 11:30):
Ok, thanks for confirming! A few days ago, I would not even have been able to formulate a coherent thought about this, but I am happy to see that I understand more now!
Reid Barton (Jun 10 2022 at 11:39):
OK I understand the problem now. Normally one would attempt to avoid try
by left-factoring the parser (or if necessary, by redefining the grammar).
Mario Carneiro (Jun 10 2022 at 11:39):
Yes, lean 3 is fundamentally incapable of backtracking due to its implementation. It has one token lookahead and that's it
Mario Carneiro (Jun 10 2022 at 11:40):
the monad tricks don't work because it's actually mutable state in the background
Eric Wieser (Jun 10 2022 at 11:41):
Do you thinking the backtracking_scope
API could resolve that Mario?
Mario Carneiro (Jun 10 2022 at 11:41):
I don't know much about it, is it currently used to implement something?
Mario Carneiro (Jun 10 2022 at 11:42):
personally I don't think it is worth it at this stage. Lean 4 has a fully backtracking parser
Mario Carneiro (Jun 10 2022 at 11:43):
and there are usually hacks you can do to avoid needing lots of lookahead by postprocessing the parser results
Mario Carneiro (Jun 10 2022 at 11:43):
like your example can trivially be factored to do tk "a"
first
Mario Carneiro (Jun 10 2022 at 11:55):
Looking at backtracking_scope
, it seems not to be used much, and the issue that introduced it - leanprover/lean#1745 - does not seem to be handling any kind of general backtracking system but rather a way to do error recovery after parsing a missing expression. But I think the main reason it is not suitable here is that backtracking_scope
stores a single backtrack position in the parser state, which means it does not work in nested situations, it can't support the full scope of a parsec style try
combinator
Eric Wieser (Jun 10 2022 at 12:04):
Ouch, I'd hoped that something with RAII semantics would be reentrant. You're right then that there's no point trying to use that
Eric Wieser (Jun 10 2022 at 13:21):
Mario Carneiro said:
and there are usually hacks you can do to avoid needing lots of lookahead by postprocessing the parser results
This is what I ended up doing in the end
Last updated: Dec 20 2023 at 11:08 UTC