Zulip Chat Archive

Stream: lean4

Topic: What can one actually do with docs#Syntax.getRange?

Anne Baanen (Aug 26 2025 at 10:10):

What exactly are the semantics of docs#String.Range as output by docs#Syntax.getRange?

Context: A lot of Mathlib linters use the output of docs#Syntax.getRange? to make decisions about what to lint. For example, the unused tactics linter uses the range of a piece of syntax as a unique reference:

      if let some r := stx.getRange? true then
        modify fun m => m.insert r stx
...
      if let some r := stx.getRange? true then
        if exceptions.contains kind
        -- if the tactic is allowed to not change the goals
        then modify (·.erase r)

But while developing the tactic analysis framework in #28802, it turns out that Syntax.getRange? isn't shrinking as we go deeper into infotrees: def foo where ... expands to def foo := { ... }, which ends a couple bytes further on.

So the questions are: is this intended behaviour? More generally, can we use the numbers reported by docs#Syntax.getRange? for anything, or are they all for internal use?

Marc Huisinga (Aug 26 2025 at 11:36):

docs#Lean.SourceInfo

is this intended behaviour?

Yes. For where, we use the end of the trailing whitespace for the SourceInfo of the expanded syntax so that we can provide auto-completions for structure fields in whitespace.

More generally, can we use the numbers reported by docs#Syntax.getRange? for anything

It depends on what you want to do with them, but they certainly are not intended to uniquely identify a piece of original syntax across the InfoTree. Non-canonical synthetic ranges are mostly (but not always) ignored by the language server, canonical synthetic ranges are intended to represent some meaningful range in the source for a macro expansion and original ranges represent syntax as it exists in the source.

Do you really want to lint the syntax produced by macro expansions, or only original syntax? In the latter case, you should probably limit linting to original SourceInfo, in which case the range should represent what you expect it to represent. In the former case, the ranges are definitely not intended to uniquely identify the original syntax across the InfoTree. I suppose you could use MacroExpansionInfos during traversal to build a mapping, though.

Damiano Testa (Aug 26 2025 at 17:20):

Anne, I don't know if this is useful for your application, but sometimes the getRange? dance is helpful to figure out whether some tactic that is executed by multiple branches of the proof is used at least in one branch or it is never used.

Damiano Testa (Aug 26 2025 at 17:20):

Also, I often use the pair (range, substring) as a stand-in for something that is HashHable and a decent replacement for Syntax.

Last updated: Feb 28 2026 at 14:05 UTC