Zulip Chat Archive

Stream: general

Topic: reading the "visible" file, rather than the saved one

Damiano Testa (Jul 24 2024 at 07:13):

While talking to Johan about writing the long-line linter as a syntax linter, we came up with the code below.

It works well, but uses information about the saved file. How do I access the actual file as it is being edited?

E.g., if you copy the long line below and paste it somewhere else, you will not get a warning unless you save the file and make sure that lean re-parses the command.

import Lean.Elab.Command

open Lean in
run_cmd
  let fil ← IO.FS.lines (← getFileName)
  let longLines := (List.range fil.size).filterMap fun l =>
    if 70 < fil[l]!.length then some (l + 1) else none
  let fm ← getFileMap
  for l_idx in longLines do
    let longLineStart := fm.ofPosition ⟨l_idx, 0⟩
    let longLineEnd := fm.ofPosition ⟨l_idx, fil[l_idx-1]!.length⟩
    logWarningAt (.ofRange ⟨longLineStart, longLineEnd⟩) "too long"
/-
0         1         2         3         4         5         6         7
^ flagged as too long
-/

Damiano Testa (Jul 24 2024 at 07:14):

(Also note that the code above does not work in the playground, since it tries to access the current file being used, something that does not seem to exist in the playground.)

Sebastian Ullrich (Jul 24 2024 at 09:03):

If you're writing a syntax linter, why would you not take that information from the syntax tree given to you?

Sebastian Ullrich (Jul 24 2024 at 09:06):

I.e. use Syntax.getSubstring?

Damiano Testa (Jul 24 2024 at 10:18):

Thanks! I did not know about Syntax.getSubstring?! This works as expected!

import Lean.Elab.Command

open Lean in
/-- flags lines longer than 70 characters in the following command -/
elab "longLines " cmd:command : command => do
  Elab.Command.elabCommand cmd
  let sstr := cmd.raw.getSubstring?
  let longLines := ((sstr.getD default).splitOn "\n").filter (70 < ·.toString.length)
  for l_idx in longLines do
    logWarningAt (.ofRange ⟨l_idx.startPos, l_idx.stopPos⟩) "too long"

longLines
example : True := trivial
/-
0         1         2         3         4         5         6         7
^ flagged as too long
-/

Damiano Testa (Jul 24 2024 at 10:19):

It does not flag long lines after #exit, but #exit already flags itself, so that's probably not a great deal!

Adam Topaz (Jul 24 2024 at 21:39):

In case you didn't know, this is also possible:

import Lean
open Lean

run_cmd do
  let ctx ← read
  println! ctx.fileMap.source

Damiano Testa (Jul 25 2024 at 03:50):

Adam, I did not know about this, thanks!

Sebastian Ullrich (Jul 26 2024 at 16:54):

@Damiano Testa The remaining readFile in the linter seems to break live.lean-lang.org

Damiano Testa (Jul 26 2024 at 22:09):

Oh, would putting that code behind an if pathExists then solve the issue?

Damiano Testa (Jul 26 2024 at 22:20):

#15179 hopefully this fixes it, but I do not know how to test it.

Mac Malone (Jul 27 2024 at 01:00):

@Damiano Testa I am curious, why are you re-reading the file in the linter rather than just using fileMap.source like Adam suggested?

Damiano Testa (Jul 27 2024 at 03:19):

Mostly, the answer is that I do not really understand the difference between the two, so I just kept using what I already had, instead of switching over.

Damiano Testa (Jul 27 2024 at 03:20):

So, I guess from your comment that using Adam's suggestion would still avoid the issue on the lean server and be more efficient?

Damiano Testa (Jul 27 2024 at 03:20):

If so, then I'll adapt the linter!

Kyle Miller (Jul 27 2024 at 03:34):

Another benefit to using the filemap is that it gives you a version whose line ending have all been normalized to \n (vs on Windows, where they might otherwise be \r\n, depending on how you've got everything configured!)

Kyle Miller (Jul 27 2024 at 03:35):

Oh, never mind, I just read the linter code, and I see it goes to mkInputContext anyway. That function normalizes line endings by default.

Damiano Testa (Jul 27 2024 at 03:50):

As far as I can see, #15180 should therefore be a better solution, right? It does look better to me!

Mac Malone (Jul 27 2024 at 03:56):

Damiano Testa said:

Mostly, the answer is that I do not really understand the difference between the two, so I just kept using what I already had, instead of switching over.

The key difference is that the fileMap.source uses the text of the current file that is already available to Lean processes, whereas the readFile solution attempts to find the source file on disk and read it again. The later can fail if the file cannot be accessed (e.g., on the server or when there is no file, such as with lean --stdin), and, even when successful, will not contain unsaved edits to the file. There are cases were these properties can be useful, but it does not seem like this linter is one of them.

Mac Malone (Jul 27 2024 at 03:58):

Also, as a minor style note, there is a helper called getFileMap (like getFileName) that can be used instead of (<- read).fileMap .

Damiano Testa (Jul 27 2024 at 04:01):

Thanks for the explanation: that clarified some haziness that I had not even realized I had!

Also, I definitely want to access the not-necessarily-saved latest version of the file.

I'll edit the PR with you getFileMap suggestion.

Damiano Testa (Jul 27 2024 at 04:02):

(It looks like Kyle also mentioned it in the PR! :smile: )

Last updated: May 02 2025 at 03:31 UTC