Zulip Chat Archive

Stream: lean4

Topic: Syntax quotation for &"ident" parsers


Jannis Limperg (Jul 12 2021 at 16:28):

What is the syntax quotation incantation that'll make this example work?

import Lean

open Lean

declare_syntax_cat debug

syntax &"foo" : debug
syntax &"bar" : debug

def parseDebug : Syntax  CoreM Bool
  | `(debug|foo) => sorry
  | `(debug|bar) => sorry
  | _ => unreachable!

The issue is with the syntax &"foo". If you replace &"foo" with "foo" (and the same for bar), the example works.

Wojciech Nawrocki (Jul 12 2021 at 17:16):

Why do you want to use &? It's used to specify non-reserved tokens, but a non-reserved token must generally follow a reserved one. Otherwise it's ambiguous whether you're referring to foo-the-syntax or foo-the-symbol.

Jannis Limperg (Jul 12 2021 at 17:32):

I have a grammar with a lot of keywords and I don't want to reserve them all. I didn't know about this restriction on non-reserved tokens, but my grammar works fine with them when I write parsers without syntax quotations. I just wanted to refactor a bit so the code is less horrible.

Wojciech Nawrocki (Jul 12 2021 at 17:40):

(I don't know if this restriction is intended, I just found it empirically and it seems to make sense because of the ambiguity.) Can you write something like this, which does work?

syntax "far" &"foo" : debug
syntax "far" &"bar" : debug

def parseDebug : Syntax  CoreM Bool
  | `(debug|far foo) => sorry
  | `(debug|far bar) => sorry
  | _ => unreachable!

Sebastian Ullrich (Jul 12 2021 at 17:42):

See also https://github.com/leanprover/lean4/blob/22c2f146c96ecaeb8ed44825292b2a2d56407051/src/Lean/Parser/Basic.lean#L1565-L1597 - this is exposed as an optional parameter to registerParserCategory, but not declare_syntax_cat (yet).

Jannis Limperg (Jul 12 2021 at 18:05):

Ah so whether this works or not is a property of the syntax category (and indeed the example works with tactic instead of debug). Interesting. That means I can hack the system so the debug category does this too. Thanks for the demystification! Though I might still follow Wojciech's suggestion to limit the hacking.

Mac (Jul 12 2021 at 19:27):

Sebastian Ullrich said:

See also https://github.com/leanprover/lean4/blob/22c2f146c96ecaeb8ed44825292b2a2d56407051/src/Lean/Parser/Basic.lean#L1565-L1597 - this is exposed as an optional parameter to registerParserCategory, but not declare_syntax_cat (yet).

This reminds me of a question I had. I know that tactics use the non-default behavior and terms use default (because otherwise there would be ambiguity between identifiers and keywords). But I am curious as to why commands also use the default behavior. It seems like they could safely use the non-default (i.e., symbol) behavior (as afaik there is no ident parser for commands to disambiguate with). This would have the nice advantage of making this like section , constant ,private, protected, etc. no longer be keywords, which can be quite useful (especially, for example, section as it is quite a common word to use for other reasons).

Mario Carneiro (Jul 12 2021 at 20:10):

In lean 3 at least, command keywords being keywords is really important, because they are used by the parser to reset after exprs with ambiguous ending (less of a problem given whitespace-sensitivity, but it still shows up if you end a definition with def foo := x + for example), or other malformed syntax. Since there aren't any top level separators in lean files besides possibly newlines, the only reliable marker that the previous command is done is the command keyword for the next command.

Mario Carneiro (Jul 12 2021 at 20:13):

lean 3 had the . marker to end a definition with an ambiguous ending, but lean 4 doesn't. I use . in lean 3 a lot when writing unfinished definitions to ensure that garbage below the cursor doesn't interact with the definition I'm writing, and I've been using def for this purpose in lean 4, which is much noisier but works in a pinch

Mac (Jul 12 2021 at 21:02):

Mario Carneiro said:

In lean 3 at least, command keywords being keywords is really important, because they are used by the parser to reset after exprs with ambiguous ending (less of a problem given whitespace-sensitivity, but it still shows up if you end a definition with def foo := x + for example), or other malformed syntax.

Mario Carneiro said:

lean 3 had the . marker to end a definition with an ambiguous ending, but lean 4 doesn't.

Ah, well that at least explains it. However, I personally would prefer having one keyword to terminate ambiguous syntax (like the .) than having whole suite of keywords for commands (it would also make defining custom commands more convenient as they would no longer need to eat up identifiers for keywords). Especially considering, as you mentioned, whitespace sensitivity resolves the ambiguity in many a case and thus this issue appears much less often in Lean 4.


Last updated: Dec 20 2023 at 11:08 UTC