Zulip Chat Archive

Stream: lean4

Topic: Negative lookahead syntax


Andrés Goens (May 17 2022 at 14:11):

We're trying to define custom syntax with negative lookaheads, which are conveniently defined already in Lean. However, we get a strange behavior when combining it with comma-separated syntax. Here's a MWE for context:

declare_syntax_cat args
declare_syntax_cat kwargs

syntax (ident "=" ident),+ : kwargs
syntax (ident !"="),+ ("," kwargs )? : args
-- syntax sepBy(ident, ",", notFollowedBy("," kwargs)) ("," kwargs )? : args
syntax "[args|" args "]" : term

def foo := [args| foo, bar]
def bar := [args| foo, bar=baz]

This will parse the first list foo, bar but not the second, foo, bar=baz. If we change the syntax definition for the explicit sepBy commented above, the first definition stops parsing but it does parse the second. Does anyone know what's happening/what we're missing? Thanks!
CC @Siddharth Bhat

Sébastien Michelland (May 17 2022 at 14:16):

It seems to work for me on leanprover/lean4:nightly-2022-05-17, assuming the macro expansion for that piece is defined.

Arthur Paulino (May 17 2022 at 14:28):

The problem seems to lie on the usage of comma. This seems to work:

declare_syntax_cat args
declare_syntax_cat kwargs

syntax (ident "=" ident),+ : kwargs
syntax (ident !"="),+ ("~" kwargs )? : args
-- syntax sepBy(ident, ",", notFollowedBy("," kwargs)) ("," kwargs )? : args
syntax "[args|" args "]" : term

def foo := [args| foo, bar]
def bar := [args| foo ~ bar=baz]

Simplified:

syntax kwargs := (ident "=" ident),+
syntax "[args|" ident,+ ("~" kwargs)? "]" : term

def foo := [args| foo, bar]
def bar := [args| foo ~ q=rt, bar=baz]

Andrés Goens (May 17 2022 at 14:43):

@Sébastien Michelland it doesn't work for me on that nightly :/

➜  lean-mlir-frontend git:(master) ✗ lean --version
Lean (version 4.0.0-nightly-2022-05-17, commit 61c7b8b94c9e, Release)
➜  lean-mlir-frontend git:(master) ✗ lean DSL/Python/MWE.lean
DSL/Python/MWE.lean:11:11: error: elaboration function for '«term[args|_]»' has not been implemented
  [args|foo, bar]
DSL/Python/MWE.lean:12:26: error: unexpected element

Andrés Goens (May 17 2022 at 14:45):

as in, bar won't parse (line 12)

Arthur Paulino (May 17 2022 at 14:45):

Just to make sure, you're expecting the has not been implemented error, right?

Andrés Goens (May 17 2022 at 14:45):

yep, that one I'm expecting, the unexpected element one I'm not (and they switch with the other syntax definition commented out)

Arthur Paulino (May 17 2022 at 14:48):

While a better solution is not presented, you can keep going with this:

declare_syntax_cat       kwarg
syntax ident           : kwarg
syntax ident "=" ident : kwarg

syntax "[args|" kwarg,+ "]" : term

def foo := [args| foo, bar]
def bar := [args| foo, q=rt, bar=baz]

But then you will need to manually throw an error for def bar := [args| foo, q=rt, www, bar=baz]

Arthur Paulino (May 17 2022 at 14:49):

(I believe you don't want to allow an ident after an ident=ident)

Sébastien Michelland (May 17 2022 at 14:52):

@Andrés Goens Here's what works for me so we're clear:

declare_syntax_cat args
declare_syntax_cat kwargs

syntax (ident "=" ident),+ : kwargs
syntax (ident !"="),+ ("," kwargs )? : args
syntax sepBy(ident, ",", notFollowedBy("," kwargs)) ("," kwargs )? : args
syntax "[args|" args "]" : term

macro_rules
  | `([args| $args ]) => `(73)

#eval [args| foo, bar] -- 73
#eval [args| foo, bar=baz] -- 73

I un-commented your third line, which gives me two not-been-implemented errors, and then I can implement whatever macro expansion.

Sébastien Michelland (May 17 2022 at 17:22):

As it turns out I wasn't supposed to keep both definitions for args, which is why it worked for me. :rofl:

Andrés Goens (May 18 2022 at 10:57):

Yep, I mean, the second one (with sepBy) is an attempt to get the first one working. The way I understand the syntax-extension syntax, foo, bar=baz should be parsed by the rule syntax (ident !"="),+ ("," (ident "=" ident"),+ but that doesn't work. It appears as if the parser doesn't backtrack after trying to parse the first part and seeing the "=" to parse the second part, and we don't understand why/how to get around that

Sebastian Ullrich (May 18 2022 at 11:16):

The problem is that ,+ (i.e. sepBy) consumes the ,, after which it always expects another element (if allowTrailingSep is not set). This looks like the correct behavior in general, we would likely create bad error messages if it just stopped before the separator instead.


Last updated: Dec 20 2023 at 11:08 UTC