Zulip Chat Archive

Stream: lean4

Topic: Take precedence over tokenizer

Siegmentation Fault (Dec 24 2023 at 15:46):

I want to have nice notation for the homotopy groups using unicode subscript characters, that is, to parse strings like “πₙ₊₂(A, a)”; but (unfortunately) tokenizer is too eager so that “πₙ” is parsed as an separate identifier. Issue can be easily resolved by replacing “π” with non-identifier (something like 𝛑 or π + Zero Width Space), however, this seems to be too hacky. Is there any way to take precedence over tokenizer?

Eric Wieser (Dec 24 2023 at 16:27):

Can you provide a #mwe of what you tried that didn't work?

Siegmentation Fault (Dec 24 2023 at 16:44):

Eric Wieser said:

Can you provide a #mwe of what you tried that didn't work?

Of course:

open Lean Lean.Parser

declare_syntax_cat subscript

syntax "ᵢ" : subscript
syntax "ⱼ" : subscript
syntax "ₙ" : subscript

macro:max "π" noWs n:subscript noWs "(" τ:term "," ε:term ")" : term => `(sorry)

#check πᵢ(A, a) -- unknown identifier 'πᵢ'

Eric Wieser (Dec 24 2023 at 21:13):

It doesn't answer your question, but you might find https://leanprover-community.github.io/mathlib4_docs/Mathlib/Util/Superscript.html#Mathlib.Tactic.subscript useful

Last updated: May 02 2025 at 03:31 UTC