Zulip Chat Archive

Stream: lean4

Topic: when to declare_syntax_cat


Henrik Böving (Feb 28 2022 at 20:58):

So far I've seen two ways to do syntax stuff in Lean one is with: syntax name := parser and the other is by syntax parser : syntax_cat. When should I prefer to use the first or the second one? It seems to me that they are about equally powerful? But I see code for example here: https://github.com/tydeu/lean4-alloy/blob/master/Alloy/C/Syntax.lean that mixes both declaring several syntax categories and layering within each other as well as this direct name syntax.

Mario Carneiro (Feb 28 2022 at 23:13):

The first kind of syntax creates a closed class, you list exactly what is valid syntax in the class after the :=. The other form is an open class: you can extend it later in later files or even in downstream user code.

Mario Carneiro (Feb 28 2022 at 23:16):

The syntax class approach also works a little different in the parser: syntax can have priorities, and they are all tried. By contrast, the <|> alternation you get in regular parsers does not try later alternatives if at least one token matches (unless you use atomic), so if you have complicated overlap between patterns you might need to use the syntax class approach

Mario Carneiro (Feb 28 2022 at 23:18):

You need to use a syntax definition when multiple atoms are parsed in a parser combinator like (a b c)* or (a b c) <|> (d e) (i.e. it should be rewritten as syntax abc := a b c and then use abc*). This introduces a node named abc into the generated syntax tree which is necessary for the parser and delaborator to work

Mario Carneiro (Feb 28 2022 at 23:22):

I generally prefer to use syntax definitions if possible, that is, if there are one or a few alternatives and they don't overlap too much, and it isn't supposed to be extended later. If it is an extension point, or there is large fanout, then use a syntax class


Last updated: Dec 20 2023 at 11:08 UTC