Zulip Chat Archive
Stream: lean4
Topic: Two ranges for one syntax
Damiano Testa (Mar 13 2025 at 22:20):
I am trying to match the string input for a Lean command, to the pretty-printed version of the same command.
Damiano Testa (Mar 13 2025 at 22:21):
If I traverse the syntax tree of the command, each node will give me precise range information of where I can retrieve in the input string the "source" of each node.
Damiano Testa (Mar 13 2025 at 22:21):
Something similar should happen also from the pretty-printed version of the same syntax tree. Is there some easy way to access it?
Damiano Testa (Mar 13 2025 at 22:21):
Here is an example.
import Lean.Elab.Command
open Lean Elab Command in
elab "format_me " cmd:command : command => do
elabCommand cmd
let fmt ← liftCoreM <| PrettyPrinter.ppCategory `command cmd
let alternateString := fmt.pretty
logInfo m!"The command looks like this:\n---\n{alternateString}\n---"
/--
info: The command looks like this:
---
example : True :=
trivial
---
-/
#guard_msgs in
format_me
example : True := trivial -- `cmd` carries information about this line
Damiano Testa (Mar 13 2025 at 22:21):
Is there some way to assign ranges in alternateString
from nodes of cmd
?
Aaron Liu (Mar 13 2025 at 22:28):
I don't see anything here that would tie down the source.
Here's the raw format:
Std.Format.group
(Std.Format.group
(Std.Format.nest
2
(Std.Format.append
(Std.Format.text "example")
(Std.Format.append
(Std.Format.nest
2
(Std.Format.append
(Std.Format.text " :")
(Std.Format.append
(Std.Format.line)
(Std.Format.group
(Std.Format.nest 2 (Std.Format.tag 519 (Std.Format.text "True")))
(Std.Format.FlattenBehavior.fill)))))
(Std.Format.append
(Std.Format.text " :=")
(Std.Format.append
(Std.Format.text "\n")
(Std.Format.group
(Std.Format.nest 2 (Std.Format.tag 527 (Std.Format.text "trivial")))
(Std.Format.FlattenBehavior.fill)))))))
(Std.Format.FlattenBehavior.fill))
(Std.Format.FlattenBehavior.fill)
Damiano Testa (Mar 13 2025 at 22:29):
I agree that the Format
does not carry position information, but something has converted the syntax into a format and eventually into a string, so I am wondering if it is possible to establish a correspondence somehow.
Damiano Testa (Mar 13 2025 at 22:35):
Btw, I do not insist on going through a Format
, I am happy with getting any kind of reasonably formatted string out of the syntax, as long as there is a way to match parts of the user input string with parts of the "formatted" string.
Damiano Testa (Mar 13 2025 at 22:36):
However, it is essential that whitespace is (largely) accounted for, since this is precisely for the linter that tells you how to format code following the guidelines.
Aaron Liu (Mar 13 2025 at 22:47):
So, you want to index into the formatted string using the syntax object?
Damiano Testa (Mar 13 2025 at 22:58):
That would be the most convenient. Right now, any principled way of going from the original string to the formatted one is probably useful.
Damiano Testa (Mar 13 2025 at 22:59):
What the linter currently does is to replace line breaks and all the following spaces with a single space on the formatted text and tries its best at matching this string with the user input.
Damiano Testa (Mar 13 2025 at 23:00):
Where they differ, it (usually) flags an issue.
Damiano Testa (Mar 13 2025 at 23:00):
This is very flimsy though and it breaks often.
Damiano Testa (Mar 13 2025 at 23:01):
Although it works surprisingly well for the part of the declarations that contains the variables, which is where the scope of the linter is currently limited.
Aaron Liu (Mar 13 2025 at 23:12):
You could "inject" a special character into the formatted string, and then index that character.
Last updated: May 02 2025 at 03:31 UTC