Zulip Chat Archive
Stream: lean4 dev
Topic: go to definition jumps to incorrect file
Xiyu Zhai (Mar 07 2024 at 18:16):
I'm a newbie trying to help by starting from bug fixes.
I'm looking at a 2yr old issue https://github.com/leanprover/lean4/issues/1170.
The issue is
"If there are two different files that contain declarations with the same name, the go to definition feature will sometimes jump to the declaration in the wrong file. This seems to happen consistently if the offending files are all open in a particular editor group (in tabs)."
On my computer, I can easily replicate this bug.
I'm a newbie so I'm just trying to understand the situation.
It seems to me that it could be caused by Environment sharing.
The following code finds location links from decl given uri.
open Elab in
def locationLinksFromDecl (srcSearchPath : SearchPath) (uri : DocumentUri) (n : Name)
(originRange? : Option Range) : MetaM (Array LocationLink) := do
let mod? ← findModuleOf? n
let modUri? ← match mod? with
| some modName => documentUriFromModule srcSearchPath modName
| none => pure <| some uri
let ranges? ← findDeclarationRanges? n
if let (some ranges, some modUri) := (ranges?, modUri?) then
let ll : LocationLink := {
originSelectionRange? := originRange?
targetUri := modUri
targetRange := ranges.range.toLspRange
targetSelectionRange := ranges.selectionRange.toLspRange
}
return #[ll]
return #[]
function findModuleOf?
is invoked, defined as
def findModuleOf? [Monad m] [MonadEnv m] [MonadError m] (declName : Name) : m (Option Name) := do
discard <| getConstInfo declName -- ensure declaration exists
match (← getEnv).getModuleIdxFor? declName with
| none => return none
| some modIdx => return some ((← getEnv).allImportedModuleNames[modIdx.toNat]!)
, which calls getModuleIdxFor
defined as
def getModuleIdxFor? (env : Environment) (declName : Name) : Option ModuleIdx :=
env.const2ModIdx.find? declName
which uses env.const2ModIdx
.
env.const2ModIdx
is of type HashMap Name ModuleIdx
, which map each name to a module.
Now it seems to me that for different files, we are using the same environment, and thus unavoidably there could be name clashing.
My experiments show that if only one file is open, there is no such issue. Partially convinces me this is the issue.
I guess there should be a separate Environment
for each file?
I'm just trying to see if my understanding of the origin of the bug is correct. Fixing the bug could be an overhaul though.
Xiyu Zhai (Mar 07 2024 at 18:23):
Technically I only started looking at the source code seriously today, so I could be totally wrong
Xiyu Zhai (Mar 07 2024 at 18:24):
I guess this is not a problem occurring in compilation because in compilation there is no sharing of Environment
?
Kyle Miller (Mar 07 2024 at 18:26):
(This is my first time looking at this LSP server code.) There's not necessarily anything wrong with locationLinksFromDecl
itself, since part of MetaM
is the environment state. So long as the function is called with the Environment
set correctly, it's looks like it should work fine.
Xiyu Zhai (Mar 07 2024 at 18:29):
I'm not suggesting locationLinksFromDecl
is wrong though. I'm suggesting that Environment
's field const2ModIdx
is not detailed enough. Maybe HashMap (ModuleIdx, Name) ModuleIdx
?
Kyle Miller (Mar 07 2024 at 18:29):
Not only is there a separate Environment
per file, but there are multiple checkpoints of the environment, from between every command, all saved in various Lean.Elab.Command.State
objects.
Xiyu Zhai (Mar 07 2024 at 18:29):
I see.
Xiyu Zhai (Mar 07 2024 at 18:29):
Thanks a lot
Xiyu Zhai (Mar 07 2024 at 18:30):
Then the bug should be somewhere else
Kyle Miller (Mar 07 2024 at 18:33):
To help you with the search, in a function like locationLinksOfInfo
, you can see code like
match i with
| .ofTermInfo ti =>
let mut expr := ti.expr
...
match expr with
| Expr.const n .. => return ← ci.runMetaM i.lctx <| locationLinksFromDecl i n
...
The runMetaM
function is what is setting up the Environment, among other things, from the data in ci
.
Xiyu Zhai (Mar 07 2024 at 18:33):
Wow, thank you so much! I guess I need to figure out a way to print debug the lean server. I'm a newbie, lol.
Kyle Miller (Mar 07 2024 at 18:49):
Maybe it's worth logging information about infoWithCtx
in Lean.Server.FileWorker.handleDefinition
. For example, you could use the map₂
field of constants
in the context's Environment
to deduce which environment you're looking at. The map₂
field is for locally defined constants, rather than imported ones. There are probably better ways to debug this, but that's something that came to mind.
Xiyu Zhai (Mar 07 2024 at 18:49):
Thanks a lot! That’s extremely helpful!
Xiyu Zhai (Mar 08 2024 at 07:47):
Seems that I've found out the culprit. The server uses a global variable references to handle the request, thus causing the trouble.
handleDefinition
is not the function actually being called.
In Watchdog.lean
the function handleRequest
is called, then in the wierd if statement before the match statement
if method == "textDocument/definition" || method == "textDocument/declaration" then
let params ← parseParams TextDocumentPositionParams params
let definitions ← findDefinitions params
dbg_trace s!"handleRequest method = {method} params = {params} defns = {ToJson.toJson definitions}"
if !definitions.isEmpty then
(← read).hOut.writeLspResponse ⟨id, definitions⟩
return
match method with
Just look at the invokation of findDefinitions
, which goes to
def findDefinitions (p : TextDocumentPositionParams) : ServerM <| Array Location := do
let mut definitions := #[]
if let some path := fileUriToPath? p.textDocument.uri then
let srcSearchPath := (← read).srcSearchPath
if let some module ← searchModuleNameOfFileName path srcSearchPath then
let references ← (← read).references.get
for ident in references.findAt module p.position (includeStop := true) do
if let some ⟨definitionLocation, _⟩ ← references.definitionOf? ident srcSearchPath then
definitions := definitions.push definitionLocation
return definitions
Notice that
let references ← (← read).references.get
for ident in references.findAt module p.position (includeStop := true) do
if let some ⟨definitionLocation, _⟩ ← references.definitionOf? ident srcSearchPath then
definitions := definitions.push definitionLocation
, the references are got from server context, the type of which is defined by
structure ServerContext where
hIn : FS.Stream
hOut : FS.Stream
hLog : FS.Stream
/-- Command line arguments. -/
args : List String
fileWorkersRef : IO.Ref FileWorkerMap
/-- We store these to pass them to workers. -/
initParams : InitializeParams
workerPath : System.FilePath
srcSearchPath : System.SearchPath
references : IO.Ref References
So references is a global variable, shared by all files (I believe with high confidence), then definitions are collected from this references global variable. The type of this global variable is
structure References where
/-- References loaded from ilean files -/
ileans : HashMap Name (System.FilePath × Lsp.ModuleRefs)
/-- References from workers, overriding the corresponding ilean files -/
workers : HashMap Name (Nat × Lsp.ModuleRefs)
See the fields ileans
, a name is mapped to a file path and a collection of references.
Hence, if two different set of compilations with clashing names exist for the same time, it causes problems.
Marc Huisinga (Mar 08 2024 at 08:07):
Good diagnosis! Yes, that's one of the known issues with the References data structure. I was going to fix this until the next rc1 release :-)
Xiyu Zhai (Mar 08 2024 at 08:15):
Lol. You know all along. But I did learn a lot.
Notification Bot (Mar 08 2024 at 08:17):
Xiyu Zhai has marked this topic as resolved.
Marc Huisinga (Mar 08 2024 at 08:20):
@Sebastian Ullrich mentioned it to me recently. I hadn't diagnosed the issue myself yet!
By the way, the original reason for why we prefer to use .ileans for go-to-definition was that we want go-to-definition to jump to the current location if the definition is from a file that the user edited, but hasn't saved or compiled yet (i.e. the current data isn't available via .oleans). Since we need to synchronize up-to-date .ilean data from the file workers to the watchdog anyways for other "global" requests, we also re-use them for go-to-definition if we have them available.
Notification Bot (Mar 08 2024 at 08:21):
Xiyu Zhai has marked this topic as unresolved.
Xiyu Zhai (Mar 08 2024 at 08:22):
I understand the problem as user might use the same server to analyze two totally separate compilations, where there could be name clashes. There should be separate data for each "package".
Xiyu Zhai (Mar 08 2024 at 08:23):
In my own programming lauguage, basically everything is localized to a certain package, crate or module. I believe this is the same for rust-analyzer.
Xiyu Zhai (Mar 08 2024 at 08:25):
Btw, I'm curious about any plan of introducing fine-grained incremental computation system into the compilation system. In a fine-grained incremental computation compilation, change in each function affects others in a minimal sense. But this could be very complicated to implement.
Marc Huisinga (Mar 08 2024 at 08:29):
This is on our roadmap and @Sebastian Ullrich continues to work on this topic. The first PR of several is at lean4#3014 :-)
Last updated: May 02 2025 at 03:31 UTC