Zulip Chat Archive
Stream: metaprogramming / tactics
Topic: Include leading whitespace
Max Nowak 🐉 (Jan 20 2026 at 12:56):
I want to parse typst and distinguish between syntax such as #box[] foo and #box[]foo. The following code is self-contained:
import Lean
open Lean Parser PrettyPrinter
/-- Plain text -/
def typText : Parser :=
withAntiquot (mkAntiquot "typText" `typText) {
fn := fun c s =>
let startPos := s.pos
let s := takeWhile1Fn (not ∘ "{}[]#$".contains) "expected Typst text" c s
mkNodeToken `typText startPos true c s }
@[combinator_formatter typText] def typText.formatter : Formatter := Formatter.visitAtom ``typText
@[combinator_parenthesizer typText] def typText.parenthesizer : Parenthesizer := Parenthesizer.visitToken
declare_syntax_cat typst
declare_syntax_cat typstPart
syntax typText : typstPart
syntax "#box[]" : typstPart
syntax typstPart* : typst
elab "#[" stx:typst "]" : command => logInfo s!"{stx}"
#[foo #box[] bar]
This prints
(typst_ [
(typstPart_ (typText "foo "))
(«typstPart#box[]» "#box[]")
(typstPart_ (typText "bar"))
])
which is missing the leading space. I want the third element to be " bar". Can I do this?
Kyle Miller (Jan 20 2026 at 13:03):
If you log repr stx, you can see the leading/trailing information in the SourceInfo, which is where whitespace is recorded.
The whitespace is after #box[]:
Lean.Syntax.node
(Lean.SourceInfo.none)
`«typstPart#box[]»
#[Lean.Syntax.atom
(Lean.SourceInfo.original
"".toRawSubstring
{ byteIdx := 721 }
" ".toRawSubstring
{ byteIdx := 727 })
"#box[]"],
Whitespace out of the parser is always trailing whitespace. It's possible to make a pass to move whitespace between nodes, to make it be leading, if that's what you want to do.
Max Nowak 🐉 (Jan 20 2026 at 13:06):
It's possible to make a pass to move whitespace between nodes, to make it be leading, if that's what you want to do.
Yeah that sounds like what I want to do! How do I do that?
Max Nowak 🐉 (Jan 20 2026 at 13:06):
So far I've completely ignored SourceInfo.
Robin Arnez (Jan 20 2026 at 13:28):
To show how you can work with trailing whitespace, here is an example:
import Lean
open Lean Parser PrettyPrinter
/-- Plain text -/
def typText : Parser :=
withAntiquot (mkAntiquot "typText" `typText) {
fn := fun c s =>
let startPos := s.pos
let s := takeWhile1Fn (fun a => !"{}[]#$".contains a) "expected Typst text" c s
mkNodeToken `typText startPos true c s }
@[combinator_formatter typText] def typText.formatter : Formatter := Formatter.visitAtom ``typText
@[combinator_parenthesizer typText] def typText.parenthesizer : Parenthesizer := Parenthesizer.visitToken
declare_syntax_cat typst
declare_syntax_cat typstPart
syntax typText : typstPart
syntax "#box[]" : typstPart
syntax typstPart* : typst
elab "#[" stx:typst "]" : command => do
let ⟨.node _ _ #[.node _ _ args]⟩ := stx | pure ()
let parts : TSyntaxArray `typstPart := .mk args
let mut out := ""
for part in parts do
let trailing := part.raw.getTrailing?.elim "" (·.toString)
match part with
| `(typstPart| $text:typText) =>
let str := (text.raw.isLit? `typText).get!
out := s!"{out}Text: {str.quote}\n"
| `(typstPart| #box[]) => out := s!"{out}Box\n"
| _ => Elab.throwUnsupportedSyntax
if !trailing.isEmpty then
out := s!"{out}Trailing whitespace: {trailing.quote}\n"
Lean.logInfo out
/--
info: Text: "foo "
Box
Trailing whitespace: " "
Text: "bar"
-/
#guard_msgs in
#[foo #box[] bar]
Robin Arnez (Jan 20 2026 at 13:30):
I would've preferred let `(typst| $parts:typstPart*) := stx but that doesn't work so I had to do a manual match
Aaron Liu (Jan 20 2026 at 13:34):
Robin Arnez said:
I would've preferred
let `(typst| $parts:typstPart*) := stxbut that doesn't work so I had to do a manual match
what about (typst| $[$parts:typstPart]*)
Max Nowak 🐉 (Jan 20 2026 at 13:35):
Wonderful, that works! And it's a rather elegant solution too:
let mut acc := ""
for part in parts do
/- match on part ... -/
acc := acc ++ (part.raw.getTrailing?.elim "" (·.toString))
Thank you so much ^^
Robin Arnez (Jan 20 2026 at 13:59):
Aaron Liu schrieb:
Robin Arnez said:
I would've preferred
let `(typst| $parts:typstPart*) := stxbut that doesn't work so I had to do a manual matchwhat about
(typst| $[$parts:typstPart]*)
Doesn't work either. The problem is that typText also matches through the ) of the quotation, so you end up with typText that extends until the next {.
Last updated: Feb 28 2026 at 14:05 UTC