The parenthesizer inserts parentheses into a Syntax
object where syntactically necessary, usually as an intermediary
step between the delaborator and the formatter. While the delaborator outputs structurally well-formed syntax trees that
can be re-elaborated without post-processing, this tree structure is lost in the formatter and thus needs to be
preserved by proper insertion of parentheses.
The abstract problem & solution #
The Lean 4 grammar is unstructured and extensible with arbitrary new parsers, so in general it is undecidable whether
parentheses are necessary or even allowed at any point in the syntax tree. Parentheses for different categories, e.g.
terms and levels, might not even have the same structure. In this module, we focus on the correct parenthesization of
parsers defined via Lean.Parser.prattParser
, which includes both aforementioned built-in categories. Custom
parenthesizers can be added for new node kinds, but the data collected in the implementation below might not be
appropriate for other parenthesization strategies.
Usages of a parser defined via prattParser
in general have the form p prec
, where prec
is the minimum precedence
or binding power. Recall that a Pratt parser greedily runs a leading parser with precedence at least prec
(otherwise
it fails) followed by zero or more trailing parsers with precedence at least prec
; the precedence of a parser is
encoded in the call to leadingNode/trailingNode
, respectively. Thus we should parenthesize a syntax node stx
supposedly produced by p prec
if
- the leading/any trailing parser involved in
stx
has precedence <prec
(because without parentheses,p prec
would not produce all ofstx
), or - the trailing parser parsing the input to the right of
stx
, if any, has precedence >=prec
(because without parentheses,p prec
would have parsed it as well and made it a part ofstx
). We also check that the two parsers are from the same syntax category.
Note that in case 2, it is also sufficient to parenthesize a parent node as long as the offending parser is still to
the right of that node. For example, imagine the tree structure of (f fun x => x) y
without parentheses. We need to
insert some parentheses between x
and y
since the lambda body is parsed with precedence 0, while the identifier
parser for y
has precedence maxPrec
. But we need to parenthesize the $
node anyway since the precedence of its
RHS (0) again is smaller than that of y
. So it's better to only parenthesize the outer node than ending up with
(f $ (fun x => x)) y
.
Implementation #
We transform the syntax tree and collect the necessary precedence information for that in a single traversal. The
traversal is right-to-left to cover case 2. More specifically, for every Pratt parser call, we store as monadic state
the precedence of the left-most trailing parser and the minimum precedence of any parser (contPrec
/minPrec
) in this
call, if any, and the precedence of the nested trailing Pratt parser call (trailPrec
), if any. If stP
is the state
resulting from the traversal of a Pratt parser call p prec
, and st
is the state of the surrounding call, we
parenthesize if prec > stP.minPrec
(case 1) or if stP.trailPrec <= st.contPrec
(case 2).
The traversal can be customized for each [*Parser]
parser declaration c
(more specifically, each SyntaxNodeKind
c
) using the [parenthesizer c]
attribute. Otherwise, a default parenthesizer will be synthesized from the used
parser combinators by recursively replacing them with declarations tagged as [combinator_parenthesizer]
for the
respective combinator. If a called function does not have a registered combinator parenthesizer and is not reducible,
the synthesizer fails. This happens mostly at the Parser.mk
decl, which is irreducible, when some parser primitive has
not been handled yet.
The traversal over the Syntax
object is complicated by the fact that a parser does not produce exactly one syntax
node, but an arbitrary (but constant, for each parser) amount that it pushes on top of the parser stack. This amount can
even be zero for parsers such as checkWsBefore
. Thus we cannot simply pass and return a Syntax
object to and from
visit
. Instead, we use a Syntax.Traverser
that allows arbitrary movement and modification inside the syntax tree.
Our traversal invariant is that a parser interpreter should stop at the syntax object to the left of all syntax
objects its parser produced, except when it is already at the left-most child. This special case is not an issue in
practice since if there is another parser to the left that produced zero nodes in this case, it should always do so, so
there is no danger of the left-most child being processed multiple times.
Ultimately, most parenthesizers are implemented via three primitives that do all the actual syntax traversal:
maybeParenthesize mkParen prec x
runs x
and afterwards transforms it with mkParen
if the above
condition for p prec
is fulfilled. visitToken
advances to the preceding sibling and is used on atoms. visitArgs x
executes x
on the last child of the current node and then advances to the preceding sibling (of the original current
node).
- cat : Lean.Name
We need to store this
categoryParser
argument to deal with the implicit Pratt parser call intrailingNode.parenthesizer
. - forceParens : Bool
Whether to add parentheses regardless of any other conditions. This is cached from the
pp.parens
option.
Instances For
Equations
- p₁.orElse p₂ = do let s ← get Lean.catchInternalId Lean.PrettyPrinter.backtrackExceptionId p₁ fun (x : Lean.Exception) => do set s p₂ ()
Instances For
Equations
- Lean.PrettyPrinter.instOrElseParenthesizerM = { orElse := Lean.PrettyPrinter.ParenthesizerM.orElse }
Instances For
Instances For
Instances For
Equations
- Lean.PrettyPrinter.Parenthesizer.throwBacktrack = throw (Lean.Exception.internal Lean.PrettyPrinter.backtrackExceptionId)
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Execute x
at the right-most child of the current node, if any, then advance to the left.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Run x
and parenthesize the result using mkParen
if necessary.
If canJuxtapose
is false, we assume the category does not have a token-less juxtaposition syntax a la function application and deactivate rule 2.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Adjust state and advance.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
Instances For
Instances For
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Wraps the term stx
in parentheses and then moves its SourceInfo
to the result.
The purpose of this is to move synthetic delaborator positions from the stx
node to the parentheses node,
which causes the info view to view the node with parentheses as referring to the parenthesized expression.
If we did not move info, the info view would consider the parentheses to belong to the outer term.
Note: we do not do withRef stx
because that causes the "(" and ")" tokens to have source info as well,
causing the info view to highlight each parenthesis as an independent expression.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- Lean.PrettyPrinter.Parenthesizer.rawStx.parenthesizer x = Lean.Syntax.MonadTraverser.goLeft
Instances For
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
Instances For
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
Instances For
Equations
- Lean.PrettyPrinter.Parenthesizer.pushNone.parenthesizer = Lean.Syntax.MonadTraverser.goLeft
Instances For
Equations
- Lean.PrettyPrinter.Parenthesizer.hygieneInfoNoAntiquot.parenthesizer = Lean.Syntax.MonadTraverser.goLeft
Instances For
Instances For
Equations
Instances For
Instances For
Equations
- Lean.PrettyPrinter.Parenthesizer.instCoeForallParenthesizerAliasValue = { coe := Lean.Parser.AliasValue.unary }
Equations
- Lean.PrettyPrinter.Parenthesizer.instCoeForallForallParenthesizerAliasValue = { coe := Lean.Parser.AliasValue.binary }
Adds necessary parentheses in stx
parsed by parser
.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Adds necessary parentheses to the syntax in the given category (for example, term
, tactic
, or command
).