Zulip Chat Archive

Stream: metaprogramming / tactics

Topic: Include leading whitespace


Max Nowak 🐉 (Jan 20 2026 at 12:56):

I want to parse typst and distinguish between syntax such as #box[] foo and #box[]foo. The following code is self-contained:

import Lean
open Lean Parser PrettyPrinter

/-- Plain text -/
def typText : Parser :=
  withAntiquot (mkAntiquot "typText" `typText) {
    fn := fun c s =>
      let startPos := s.pos
      let s := takeWhile1Fn (not  "{}[]#$".contains) "expected Typst text" c s
      mkNodeToken `typText startPos true c s }
@[combinator_formatter typText] def typText.formatter : Formatter := Formatter.visitAtom ``typText
@[combinator_parenthesizer typText] def typText.parenthesizer : Parenthesizer := Parenthesizer.visitToken

declare_syntax_cat typst
declare_syntax_cat typstPart
syntax typText : typstPart
syntax "#box[]" : typstPart
syntax typstPart* : typst
elab "#[" stx:typst "]" : command => logInfo s!"{stx}"

#[foo #box[] bar]

This prints

(typst_ [
  (typstPart_ (typText "foo "))
  («typstPart#box[]» "#box[]")
  (typstPart_ (typText "bar"))
])

which is missing the leading space. I want the third element to be " bar". Can I do this?

Kyle Miller (Jan 20 2026 at 13:03):

If you log repr stx, you can see the leading/trailing information in the SourceInfo, which is where whitespace is recorded.

The whitespace is after #box[]:

                 Lean.Syntax.node
                   (Lean.SourceInfo.none)
                   typstPart#box[]»
                   #[Lean.Syntax.atom
                       (Lean.SourceInfo.original
                         "".toRawSubstring
                         { byteIdx := 721 }
                         " ".toRawSubstring
                         { byteIdx := 727 })
                       "#box[]"],

Whitespace out of the parser is always trailing whitespace. It's possible to make a pass to move whitespace between nodes, to make it be leading, if that's what you want to do.

Max Nowak 🐉 (Jan 20 2026 at 13:06):

It's possible to make a pass to move whitespace between nodes, to make it be leading, if that's what you want to do.

Yeah that sounds like what I want to do! How do I do that?

Max Nowak 🐉 (Jan 20 2026 at 13:06):

So far I've completely ignored SourceInfo.

Robin Arnez (Jan 20 2026 at 13:28):

To show how you can work with trailing whitespace, here is an example:

import Lean
open Lean Parser PrettyPrinter

/-- Plain text -/
def typText : Parser :=
  withAntiquot (mkAntiquot "typText" `typText) {
    fn := fun c s =>
      let startPos := s.pos
      let s := takeWhile1Fn (fun a => !"{}[]#$".contains a) "expected Typst text" c s
      mkNodeToken `typText startPos true c s }
@[combinator_formatter typText] def typText.formatter : Formatter := Formatter.visitAtom ``typText
@[combinator_parenthesizer typText] def typText.parenthesizer : Parenthesizer := Parenthesizer.visitToken

declare_syntax_cat typst
declare_syntax_cat typstPart
syntax typText : typstPart
syntax "#box[]" : typstPart
syntax typstPart* : typst

elab "#[" stx:typst "]" : command => do
  let ⟨.node _ _ #[.node _ _ args] := stx | pure ()
  let parts : TSyntaxArray `typstPart := .mk args
  let mut out := ""
  for part in parts do
    let trailing := part.raw.getTrailing?.elim "" (·.toString)
    match part with
    | `(typstPart| $text:typText) =>
      let str := (text.raw.isLit? `typText).get!
      out := s!"{out}Text: {str.quote}\n"
    | `(typstPart| #box[]) => out := s!"{out}Box\n"
    | _ => Elab.throwUnsupportedSyntax
    if !trailing.isEmpty then
      out := s!"{out}Trailing whitespace: {trailing.quote}\n"
  Lean.logInfo out

/--
info: Text: "foo "
Box
Trailing whitespace: " "
Text: "bar"
-/
#guard_msgs in
#[foo #box[] bar]

Robin Arnez (Jan 20 2026 at 13:30):

I would've preferred let `(typst| $parts:typstPart*) := stx but that doesn't work so I had to do a manual match

Aaron Liu (Jan 20 2026 at 13:34):

Robin Arnez said:

I would've preferred let `(typst| $parts:typstPart*) := stx but that doesn't work so I had to do a manual match

what about (typst| $[$parts:typstPart]*)

Max Nowak 🐉 (Jan 20 2026 at 13:35):

Wonderful, that works! And it's a rather elegant solution too:

let mut acc := ""
for part in parts do
  /- match on part ... -/
  acc := acc ++ (part.raw.getTrailing?.elim "" (·.toString))

Thank you so much ^^

Robin Arnez (Jan 20 2026 at 13:59):

Aaron Liu schrieb:

Robin Arnez said:

I would've preferred let `(typst| $parts:typstPart*) := stx but that doesn't work so I had to do a manual match

what about (typst| $[$parts:typstPart]*)

Doesn't work either. The problem is that typText also matches through the ) of the quotation, so you end up with typText that extends until the next {.


Last updated: Feb 28 2026 at 14:05 UTC