Zulip Chat Archive

Stream: mathlib4

Topic: Module references in comments


Snir Broshi (Nov 03 2025 at 19:08):

When a comment references a module, for it to be clickable in the docs it needs to look like Mathlib/Data/List/Basic.lean rather than any of these similar alternatives:

  • Mathlib/Data/List/Basic
  • Data/List/Basic.lean
  • Mathlib.Data.List.Basic
  • Data.List.Basic

I fixed 2 but I suspect there are a lot more, not to mention references to modules that don't exist anymore.
Is it possible to lint against it, or maybe use an LLM to find and fix all such references?

Michael Rothgang (Nov 03 2025 at 19:14):

You'd want to both lint against it, and fix all current cases. The linting could happen (not now, but in the medium term) by converting all doc-strings to verso, which also checks for other errors. Until then, PRs fixing these violations are welcome.

Michael Rothgang (Nov 03 2025 at 19:15):

If you make a LLM-generated PR, please disclose this very clearly, and verify all output manually (so you could vouch for it yourself).

Ruben Van de Velde (Nov 03 2025 at 19:42):

This seems like something that could be done much more reliably by what old people like myself called "a script"

Ruben Van de Velde (Nov 03 2025 at 19:43):

And note that (3) also works if you already import the module in question

Ruben Van de Velde (Nov 03 2025 at 19:43):

And once we move to verso, I understand we're getting a new syntax anyway

Ruben Van de Velde (Nov 03 2025 at 19:44):

One other thing to watch out for: sometimes the module name is just wrong

Eric Wieser (Nov 03 2025 at 21:25):

And sometimes something that looks like a module name really is a namespace name

Snir Broshi (Nov 03 2025 at 23:21):

Ruben Van de Velde said:

This seems like something that could be done much more reliably by what old people like myself called "a script"

That might work for easy cases like the ones I mentioned, e.g. a simple regex led me to Mathlib.Data.Int.Cast.Basic which references 3 modules of which 1 is broken since it isn't imported.

But in general I want to find things that were intended to be references, which don't necessarily fit any pattern, which is where an LLM can help. For example, "for more info see the Lemmas module". In any case, I don't subscribe to any LLMs so it was more of an open invitation for anyone who wants to do it.

Snir Broshi (Nov 03 2025 at 23:21):

A related but harder problem is references to declarations, and LLMs might not be able to help there. I recently found docs#Matrix.mulVec which says "see col_mulVec" but I haven't been able to find this mysterious col_mulVec (similarly docs#Matrix.vecMul mentions row_vecMul).
Although searching for valid identifiers surrounded by backticks might be a good approximation, maybe with a minimum length of 3 to exclude things which are meant to be variable names (e.g. n v i hIsCycle).

Michael Rothgang (Nov 04 2025 at 08:27):

Verso will really help with declaration references.

Michael Rothgang (Nov 04 2025 at 08:28):

I did have code to search for valid identifiers surrounded by backticks. There were a number of very short false positives, but for long strings, it would probably work well. I could try to dig up my code; going through that list would certainly be welcome!

Eric Wieser (Nov 04 2025 at 23:09):

(docs#Matrix.replicateCol_mulVec is the lemma that went missing)

Harald Husum (Nov 24 2025 at 10:05):

I have a related draft PR here: #32025

This PR moves from (mostly non-functioning) module refs to functioning file refs in cases where it was unequivocally clear to me that the author intended to be referencing the file as opposed to the module. Do people in this thread deem this useful?

The PR was split out of a larger automated replacement of references, #31972, that I was experimenting with before I realized, as @Ruben Van de Velde pointed out, that not all module refs are non-functioning.

I guess I could turn this larger PR into one that at least gives all module refs a _chance_ at working, by prefixing Mathlib. where that is missing. From there, one could look more closely at detecting links that fail due to missing imports.

Snir Broshi (Nov 24 2025 at 20:47):

Harald Husum said:

Do people in this thread deem this useful?

Yes! tbh I was hoping you would try this, as you have experience with LLM PRs, thanks :)

Harald Husum said:

I guess I could turn this larger PR into one that at least gives all module refs a _chance_ at working, by prefixing Mathlib. where that is missing. From there, one could look more closely at detecting links that fail due to missing imports.

I like this approach, but IIUC we have to wait for Verso to be integrated to be able to detect the failures, and idk what's the timeline on that


Last updated: Dec 20 2025 at 21:32 UTC