Zulip Chat Archive

Stream: mathlib4

Topic: slow syntax linters


Floris van Doorn (Feb 12 2025 at 19:04):

The syntax linters are quite slow for me. It was ~10x faster for @Michael Rothgang on Linux (on a different machine of course). This is causing noticeable slowdowns when compiling a large file for me.

import Mathlib.Tactic.Core

set_option profiler.threshold 10
set_option profiler true
set_option profiler true -- linting takes about 90ms on average
set_option profiler true

example : Nat := 1
example : Nat := 1 -- linting takes about 75ms on average
example : Nat := 1

Damiano Testa (Feb 13 2025 at 09:28):

Floris, I am happy to try unifying at least some of the linters and see whether that helps. I am not sure that I will have time today, though.

Damiano Testa (Feb 13 2025 at 09:29):

I will probably try a very simple-minded unification of just the linters that detect some SyntaxNodeKind and act on that, check that it has a measurable improvement and, if that is the case, continue to unify more.

Damiano Testa (Feb 13 2025 at 09:30):

Would you be willing to test out the initial implementation?

Floris van Doorn (Feb 13 2025 at 11:29):

Yes, I can test implementations (not always within a day, though). Thanks for looking into this!

Michael Rothgang (Feb 13 2025 at 12:53):

Arend and I were wondering about the performance difference - that could be OS-related. The header linter does some substantial IO, which could explain things.

Michael Rothgang (Feb 13 2025 at 12:53):

Is the following different, disabling the header linter?

import Mathlib.Tactic.Core

set_option linter.style.header false
set_option profiler.threshold 10
set_option profiler true
set_option profiler true -- linting takes about 90ms on average
set_option profiler true

example : Nat := 1
example : Nat := 1 -- linting takes about 75ms on average
example : Nat := 1

Michael Rothgang (Feb 13 2025 at 12:55):

(Fun fact: on the playground, which runs on some fast Linux server, the above to threshold set to 1 becomes very slightly slower when setting the option, from 1 to 2ms per line.)

Floris van Doorn (Feb 13 2025 at 15:14):

Not significantly. In fact, the linting time seems to roughly linearly decrease in the following code snippet (with a lot of noise):

import Mathlib.Tactic.Core

set_option profiler.threshold 10
set_option profiler true -- linting takes about 80ms on average
set_option linter.style.header false
set_option linter.style.cdot false
set_option linter.style.dollarSyntax false
set_option linter.style.lambdaSyntax false
set_option linter.style.longFile 10000
set_option linter.style.longLine false
set_option linter.style.missingEnd false
set_option linter.style.multiGoal false
set_option linter.style.setOption false
set_option linter.hashCommand false
set_option linter.constructorNameAsVariable false
set_option linter.deprecated false
set_option linter.docPrime false
set_option linter.dupNamespace false
set_option linter.globalAttributeIn false
set_option linter.hashCommand false
set_option linter.missingDocs false
set_option linter.oldObtain false
set_option linter.omit false
set_option linter.refine false
set_option linter.suspiciousUnexpanderPatterns false
set_option linter.unnecessarySimpa false
set_option linter.unusedRCasesPattern false
set_option linter.unusedSectionVars false
set_option linter.unusedTactic false
set_option linter.unusedVariables false


example : Nat := 1
example : Nat := 1 -- linting takes about 30ms on average
example : Nat := 1

Floris van Doorn (Feb 13 2025 at 15:15):

An option to turn off all linters at once would be nice!

Sebastian Ullrich (Feb 13 2025 at 18:35):

If linter.all false doesn't work, that's a bug in the respective linter

Floris van Doorn (Feb 13 2025 at 22:29):

set_option linter.all false is the default value for that option, so setting that probably doesn't turn off linters. The description states enable all linters. Maybe it should be a 3-valued setting: enable all (true), disable all (false) or default (leave it up to the linter). (There are other places where 3-valued settings make sense that are currently 2-valued.)

Patrick Massot (Feb 13 2025 at 22:34):

Sebastian Ullrich said:

If linter.all false doesn't work, that's a bug in the respective linter

Could you comment on the long discussion we had in #triage > PR !4#19556: perf: disable the unused tactic linter @ 💬 ? Is it all about bugs in the respective linters?

Damiano Testa (Feb 14 2025 at 08:01):

In the simple example below, the unused tactic linter does seem to heed the linter.all option:

import Mathlib.Tactic.Linter.UnusedTactic

-- commenting this makes the unused tactic linter flag `skip`
set_option linter.all false

example : True := by
  skip
  trivial

Damiano Testa (Feb 14 2025 at 08:02):

Floris, I think that the options almost have the 3 states that you describe: I think that Lean knows whether the option has been explicitly set to something or is at its default value (which is different than setting it with set_option to its default value).

Patrick Massot (Feb 14 2025 at 08:27):

Crucially we want the option to set once and for all, not in every file.

Floris van Doorn (Feb 14 2025 at 08:35):

Damiano Testa said:

Floris, I think that the options almost have the 3 states that you describe: I think that Lean knows whether the option has been explicitly set to something or is at its default value (which is different than setting it with set_option to its default value).

I think we shouldn't treat it as having 3 values unless we can actually (manually) set the option value to all 3 values ourselves.

Sebastian Ullrich (Feb 14 2025 at 08:53):

Patrick Massot said:

Crucially we want the option to set once and for all, not in every file.

Are you saying linter.all false does not work in a lakefile?

Damiano Testa (Feb 14 2025 at 08:57):

I am preparing a single file which triggers all (or almost all) the mathlib linters and I will then add a test to mathlib that linter.all "works".

Damiano Testa (Feb 14 2025 at 08:58):

I can also add the all option to the lakefile of the DownstreamProject and add the test there as well.

Damiano Testa (Feb 14 2025 at 09:41):

This is proving much harder than I expected.

Damiano Testa (Feb 14 2025 at 09:42):

I think that the inconsistencies that I am observing are due to the following differences.

Damiano Testa (Feb 14 2025 at 09:43):

Some mathlib linters are by default true (e.g. the unusedTactic linter), others are by default false but set to true in the lakefile (e.g. the hashCommand linter).

Damiano Testa (Feb 14 2025 at 09:43):

The ones that use their default value from the place where the option is registered, do follow the linter.all option as expected.

Damiano Testa (Feb 14 2025 at 09:44):

However, setting the option in the lakefile does count as setting the option explicitly, so they ignore the linter.all option.

Damiano Testa (Feb 14 2025 at 09:46):

This is made trickier by the fact that the online server appears to use yet another default option setting.

Damiano Testa (Feb 14 2025 at 09:46):

For instance, in the example below, I get different behaviour on the online server and locally.

import Mathlib.Init

set_option linter.all false

#guard true

example : True := by
  skip
  trivial

Damiano Testa (Feb 14 2025 at 09:47):

Locally, the #guard is always flagged, regardless of whether the set_option line is commented or not.

Damiano Testa (Feb 14 2025 at 09:47):

In the online server, the #guard is never flagged, regardless of whether the set_option line is commented or not.

Damiano Testa (Feb 14 2025 at 09:48):

The skip behaves as expected in both settings: it is flagged when the linter.all is not set and not flagged with linter.all set to false.

Damiano Testa (Feb 14 2025 at 09:51):

My conclusions are the following.

  1. We should not trust the online server when it comes to testing linter options.
  2. Maybe, all mathlib linters should have their default value set to what is now implied by the lakefile, so that linter.all works as expected downstream.

Damiano Testa (Feb 14 2025 at 09:51):

A possibly unwanted consequence of 2 is that all the style decisions that mathlib makes would be enforced by default on downstream projects, unless they explicitly opt out.

Damiano Testa (Feb 14 2025 at 09:53):

My mental model, though, was that setting options in the lakefile was "weaker" than using set_option and was more analogous to saying "pretend that I registered the option with this value instead of what I actually wrote".

Damiano Testa (Feb 14 2025 at 09:58):

Here is probably a better mental model (I am writing this now mostly for my own clarity).

Setting the option in the lakefile is essentially the same as writing set_option whatever value at the beginning of every file.

It is not analogous to replacing the defValue of the option to what you want.

Eric Wieser (Feb 14 2025 at 09:58):

I think maybe the lakefile is a distraction, and already the semantics of

set_option linter.all true
set_option linter.foo false -- does this override `all`?

and

set_option linter.all false
set_option linter.foo true -- does this override `all`?

are confusing

Damiano Testa (Feb 14 2025 at 09:59):

Yes, the lakefile is a further twist of who sets what when.

Damiano Testa (Feb 14 2025 at 10:00):

The fact that the online server has yet another convention does not simplify the situation.

Eric Wieser (Feb 14 2025 at 10:01):

One resolution here would be to drop linter.all as a "real" option, and have it just mean "right now, change all the other options to this value"

Damiano Testa (Feb 14 2025 at 10:04):

I think that would clarify a lot.

Damiano Testa (Feb 14 2025 at 10:05):

So, basically, every option is effectively linter.all && linter.option. Not really. :man_facepalming:

Eric Wieser (Feb 14 2025 at 10:13):

No, what you crossed out is how things are today

Mario Carneiro (Feb 14 2025 at 13:27):

I think how things are today is linter.option.getD (linter.all.getD linter_option_default)

Mario Carneiro (Feb 14 2025 at 13:27):

keeping in mind that options are indeed 3-state

Damiano Testa (Feb 14 2025 at 13:58):

I would personally find it clearer if set_option linter.all x meant "all options are x", rather than "all options that are not explicitly set, even in the lakefile, are x".

Damiano Testa (Feb 14 2025 at 13:59):

And, if later you write set_option myOption y, then I would expect myOption to be y and all others x.

Mario Carneiro (Feb 14 2025 at 14:57):

but those two statements are explicitly contradictory?

Patrick Massot (Feb 14 2025 at 15:29):

Sebastian Ullrich said:

Patrick Massot said:

Crucially we want the option to set once and for all, not in every file.

Are you saying linter.all false does not work in a lakefile?

I’m saying the only think that worked for me was to use set_option in every single file to kill the unwanted linters. In particular setting it in the lakefile of a project depending on Mathlib didn’t work.

Eric Wieser (Feb 14 2025 at 15:41):

Mario Carneiro said:

keeping in mind that options are indeed 3-state

Is there an unset_option to re-enter this third state?

Damiano Testa (Feb 14 2025 at 17:36):

Below are the 3 "states" that options can have. Note that the #guard_msgs refer to the "local" values, not the ones in the online server. For the online server, all the status reports are none, as though they were all unset.

import Mathlib.Init

open Lean Elab.Command

def Lean.Option.readStatus (os : Options) (o : Lean.Option Bool) : Option Bool :=
  os.get? o.name

elab "#option_status " opt:ident : command => do
  match ( getEnv).find? opt.getId with
  | some (.opaqueInfo v) =>
    let leanOptionExpr := Expr.app (mkConst `Lean.Option) (mkConst `Bool)
    if v.type == leanOptionExpr then
      elabCommand ( `(
        run_cmd
          Lean.logInfo m!"{Lean.Option.readStatus (← getOptions) $(⟨mkIdent v.name⟩)}")
        )
    else
      logWarning m!"'{opt}' is not an option."
  | _ => logWarning m!"'{opt}' is not an option."

/-- info: some (true) -/
#guard_msgs in
#option_status Mathlib.Linter.linter.hashCommand

/-- info: some (false) -/
#guard_msgs in
#option_status Mathlib.Linter.linter.docPrime

/-- info: none -/
#guard_msgs in
#option_status Mathlib.Linter.linter.unusedTactic

Damiano Testa (Feb 14 2025 at 17:36):

Eric Wieser said:

Is there an unset_option to re-enter this third state?

I looked for this, and I could not find an unset_option.

Damiano Testa (Feb 14 2025 at 17:38):

Mario Carneiro said:

but those two statements are explicitly contradictory?

I was hoping that they would not be!

What I meant to say is that

set_option linter.all true
-- here all options are `true`
...
set_option linter.unusedVariables false
-- here all options are `true`, except for `linter.unusedVariables`

Is this more meaningful now?

Eric Wieser (Feb 14 2025 at 17:43):

What do you want

#eval return Lean.Linter.linter.all.get? ( Lean.getOptions)

to produce in those states?

Damiano Testa (Feb 14 2025 at 17:48):

First, the easy bit:

  • before the first set_option, it should return none,
  • between the two set_options, it should return some true.

After the second set_option, I am not sure. Probably, I would still expect it to be some true.

Damiano Testa (Feb 14 2025 at 17:49):

And even if you systematically set all options (except linter.all!) to false, I would probably still expect the linter.all option to be some true.

Damiano Testa (Feb 14 2025 at 17:51):

My mental model is

  • none means that set_option opt x has not be written;
  • some x means that set_option opt x has been written above and set_option opt (not x) has not been written since.

Damiano Testa (Feb 14 2025 at 18:03):

Btw, the #option_status command above does not actually work as intended, as it does not seem to be aware of intervening set_options. :man_facepalming:

Damiano Testa (Feb 14 2025 at 18:07):

Anyway, summarising, I was also hoping that setting the linter.all option in the lakefile to false would set all linter options to false, whereas it seems that it only sets to false the ones that have not been set to some value in the lakefile.

Mario Carneiro (Feb 14 2025 at 18:13):

Damiano Testa said:

My mental model is

  • none means that set_option opt x has not be written;
  • some x means that set_option opt x has been written above and set_option opt (not x) has not been written since.

that is correct

Mario Carneiro (Feb 14 2025 at 18:16):

Damiano Testa said:

What I meant to say is that

set_option linter.all true
-- here all options are `true`
...
set_option linter.unusedVariables false
-- here all options are `true`, except for `linter.unusedVariables`

Is this more meaningful now?

What you should be asking is whether

set_option linter.all true
set_option linter.unusedVariables false

and

set_option linter.unusedVariables false
set_option linter.all true

result in the same state. Currently this is the case, because setting an option only sets that option, and all the logic about option interaction happens at get time

Mario Carneiro (Feb 14 2025 at 18:17):

In both cases, the raw option state is {all ↦ true, unusedVariables ↦ false}

Mario Carneiro (Feb 14 2025 at 18:18):

and the way we find out whether unusedVariables is on is to check the unusedVariables option, and if that's not present then check the all option, and if that's not present either then use the default value of the unusedVariables option

Mario Carneiro (Feb 14 2025 at 18:18):

which is what I wrote in code before

Mario Carneiro (Feb 14 2025 at 18:19):

so setting linter.all has the effect of changing the values of any linters that are not set_option'd themselves (before or after the set_option linter.all line)

Mario Carneiro (Feb 14 2025 at 18:20):

Eric Wieser said:

Mario Carneiro said:

keeping in mind that options are indeed 3-state

Is there an unset_option to re-enter this third state?

No but it would not be hard to write one

Mario Carneiro (Feb 14 2025 at 18:30):

import Lean

open Lean Elab Command
elab "unset_option " id:Parser.identWithPartialTrailingDot : command => do
  let ref  getRef
  addCompletionInfo <| CompletionInfo.option ref
  let optionName := id.raw.getId.eraseMacroScopes
  let decl  IO.toEIO (fun (ex : IO.Error) => Exception.error ref ex.toString) (getOptionDecl optionName)
  pushInfoLeaf <| .ofOptionInfo { stx := id, optionName, declName := decl.declName }
  let options := ( getOptions).erase optionName
  modify fun s => { s with maxRecDepth := maxRecDepth.get options }
  modifyScope fun scope => { scope with opts := options }

set_option linter.all true

unset_option linter.all

Bhavik Mehta (Feb 17 2025 at 19:48):

I have a case where turning off the unusedVariables linter takes linting from 12 seconds to 117ms. I'm guessing this is unexpected behaviour?

Damiano Testa (Feb 17 2025 at 19:57):

Can you share the example?

Bhavik Mehta (Feb 17 2025 at 21:06):

Of course! It's the instance on line 124 of https://github.com/leanprover-community/mathlib4/blob/314dd15079ad88e363cbc5f605dcb52e518cc742/Counterexamples/AharoniKorman.lean

Bhavik Mehta (Feb 17 2025 at 21:08):

I believe it should be almost trivial to make this mathlib-free: for this declaration virtually nothing from mathlib is actually needed, save for Equiv.refl and the notation for nat.

Damiano Testa (Feb 17 2025 at 22:30):

I copied your example in the online server, but I observe virtually no difference between adding or not set_option linter.unusedVariables false.

Damiano Testa (Feb 17 2025 at 22:37):

(With or without the linter, it takes approximately 8-8.5s and the linter fails to notice an unused x.)

Bhavik Mehta (Feb 17 2025 at 23:31):

For me there's a pretty stark difference in the output of set_option profiler true when I change that option

Bhavik Mehta (Feb 17 2025 at 23:37):

The absolute difference is less stark on the online server, but for me it outputs that lint took around 2.5s with the option on, and 26ms with it off, so still the same order of magnitude of relative difference.

Bhavik Mehta (Feb 17 2025 at 23:49):

To make sure we're on the same page, here's the example showing a 10x slowdown in linting with that option changed: link

Eric Wieser (Feb 17 2025 at 23:57):

trace.profiler reveals

[5.394990] running linters 
  [] [4.566265] running linter: Lean.Linter.UnusedVariables.unusedVariables
  [] [0.816414] running linter: Mathlib.Linter.globalAttributeInLinter.globalAttributeIn

Sebastian Ullrich (Feb 18 2025 at 09:54):

Thanks for the reproducer, lean#7129 brings the overhead down to at or below other tasks that have to traverse the created expression (which as you might guess is huge in this example!)

cumulative profiling times:
        attribute application 0.0469ms
        blocked 2.89ms
        compilation 8.79ms
        elaboration 741ms
        fix level params 0.508ms
        import 37.5ms
        initialization 16ms
        instantiate metavars 491ms
        interpretation 1.38ms
        linting 339ms
        parsing 3.65ms
        process pre-definitions 91.4ms
        share common exprs 516ms
        simp 9.49ms
        tactic execution 48.1ms
        type checking 335ms
        typeclass inference 11.7ms

Bhavik Mehta (Feb 19 2025 at 09:30):

Amazing, thank you!

Sebastian Ullrich said:

(which as you might guess is huge in this example!)

This is a sign of my ignorance, but why is it so huge in this case? There are lots of omega happening and quite a few cases, but it seems to me like the expr here shouldn't be much bigger than that in some of the more advanced algebraic geometry files, for instance. My impression is that heavy tactics can cause a large proof term, but most of this declaration is in term mode already, and omega seems to me like one of those tactics which makes fairly nice proof terms. But clearly one of my assumptions is wrong!

Damiano Testa (Feb 19 2025 at 13:51):

And a side question: does the unused x get flagged by the quicker linter?


Last updated: May 02 2025 at 03:31 UTC