Zulip Chat Archive

Stream: lean4

Topic: hanging Lean 4 in VS Code


Kevin Buzzard (Mar 29 2025 at 15:54):

Christian Merten was visiting me last week and we were trying to understand the slowness in JacobiZariski. During the process I found that I was able to reliably hang Lean (or at least it seemed to hang). Can anyone else reproduce this? I'm on a mac using VS Code and this is mathlib commit eff9da572aaa6b532c82a3994deceb722cd92cb5, so leanprover/lean4:v4.18.0-rc1.

In mathlib, create a scratch file containing this:

import Mathlib.RingTheory.Kaehler.JacobiZariski

namespace Algebra.Generators.H1Cotangent

open KaehlerDifferential TensorProduct MvPolynomial

attribute [local instance] SMulCommClass.of_commMonoid

universe u₁ u₂ u₃ u₄ w' w u v uT

variable {R : Type u} {S : Type v} [CommRing R] [CommRing S] [Algebra R S]
    {P : Generators.{w} R S} {T : Type uT}
    [CommRing T] [Algebra R T] [Algebra S T] [IsScalarTower R S T] (Q : Generators.{w} S T)
      (P : Generators.{w'} R S)

#count_heartbeats in -- this is what seems to trigger the problem
lemma map_comp_cotangentComplex_baseChange' :
    (Extension.CotangentSpace.map (Q.toComp P).toExtensionHom).liftBaseChange T ∘ₗ
      P.toExtension.cotangentComplex.baseChange T =
    (Q.comp P).toExtension.cotangentComplex ∘ₗ
      (Extension.Cotangent.map (Q.toComp P).toExtensionHom).liftBaseChange T := by
  ext x
  --have : P.Ring = P.toExtension.Ring := rfl

  simp [Extension.CotangentSpace.map_cotangentComplex]

This is an example of a lemma which compiles very slowly, and I can talk at length about why this is but perhaps this is not the place. Wait for the orange bars to disappear (this can take anything from between 4 and 15 seconds depending on your machine). And then in the blank line (line 24) if I type the code from the commented-out line just above, not too fast and not too slow (e.g. at my personal regular typing speed, but don't cut and paste because this is too fast, and not at a snail's pace because this doesn't work either), so the proof ends up as

  ext x
  --have : P.Ring = P.toExtension.Ring := rfl
  have : P.Ring = P.toExtension.Ring := rfl
  simp [Extension.CotangentSpace.map_cotangentComplex]

then, by the time I've got to the end of the line, I have what seems to be permanent orange bars on the statement and proof. My CPU usage rockets up to max, the fan comes on and as far as I can see Lean never recovers. In this state, if I mash the keyboard on e.g. line 10 of the file, then the orange bars just jump up so that they cover from line 10 to the end of the file, and I have to e.g. restart Lean on the file, which cause things to go back to normal, although I can re-trigger the problem by again typing something not too fast and not too slow in the proof.

No doubt people who know something about how Lean works can explain why the "not too fast and not too slow" part is relevant, and what is probably happening here. Can anyone else reproduce? Is it a VS Code thing?

Aaron Liu (Mar 29 2025 at 16:56):

Tried this on the web server, crashed after 4 minutes

Eric Wieser (Mar 29 2025 at 18:17):

I would guess that each keypress is launching a task that ignores the cancellation token for more than 30s

Eric Wieser (Mar 29 2025 at 18:18):

#16666 is an example of how to fix this for one tactic at a time

Eric Wieser (Mar 29 2025 at 18:19):

Though probably in this case it's the kernel / elaborator that is timing out, not a tactic itself

Eric Wieser (Mar 29 2025 at 18:20):

At any rate for your example this would mean a core change, not a mathlib change.

Kevin Buzzard (Mar 29 2025 at 21:01):

Eric Wieser said:

I would geuss that each keypress is launching a task that ignores the cancellation token for more than 30s

Are you saying that it will terminate in time 30s * number of keypresses? I don't understand several of the words in this claim.

Ruben Van de Velde (Mar 29 2025 at 21:05):

My understanding is - whenever you write something, lean says "hey, start checking this code", and when you then write something more, it says "hey, never mind the previous code, here's some new code to check" - but some code ignores the "never mind" command and just keeps running in the background, so when you write "abc", you've got code running for "a", for "ab", and for "abc" simultaneously

Aaron Liu (Mar 29 2025 at 21:06):

That matches my experience

Kevin Buzzard (Mar 29 2025 at 21:10):

Here's an example which uses far less mathlib:

import Mathlib.Util.CountHeartbeats

#count_heartbeats in
theorem foo : 2 + 2 = 4 := by
  sleep 5000
  -- have : hjsdlgkjhldfshfkghsldfjkghsdlfjkghsldkfjghsldkjghslkdfjgh

  sleep 5000
  rfl

If I start filling in the blank line with have : randomcharactersdfkldlgkjdfgkl (typing not too fast and not too slow) and watch CPU usage whilst typing in more random characters after have : then suddenly it spikes to 2000% (or 1100% on another machine, but basically "use all availble CPUs") and that's when you know you've had it.

Kevin Buzzard (Mar 29 2025 at 22:20):

Ruben Van de Velde said:

My understanding is - whenever you write something, lean says "hey, start checking this code", and when you then write something more, it says "hey, never mind the previous code, here's some new code to check" - but some code ignores the "never mind" command and just keeps running in the background, so when you write "abc", you've got code running for "a", for "ab", and for "abc" simultaneously

What's the 30s thing about?

Eric Wieser (Mar 30 2025 at 00:10):

I just picked a random number such that n / keypress_interval > cpu_cores

Eric Wieser (Mar 30 2025 at 00:11):

Eric Wieser said:

#16666 is an example of how to fix this for one tactic at a time

This Pr also added a check_timeouts combinator that can tell you if a particular tactic is ignoring this signatl

Eric Wieser (Mar 30 2025 at 00:12):

You could copy the start of that test and use check_timeouts 500 => in your proof to check that none of the tactics run for longer than 500ms beyond their cancellation

Eric Wieser (Mar 30 2025 at 00:12):

(check_timeouts 1 => will crash your PC, so type it in a comment first)

Junyan Xu (Mar 30 2025 at 04:59):

have : P.Ring = P.toExtension.Ring := rfl

This looks like another structure eta projection defeq that the kernel fails to check; if you follow the link in #lean4 > changing a proof can break a later proof @ 💬 you can see we've observed 2-3 occurrences.

Kevin Buzzard (Apr 02 2025 at 19:08):

I agree, what I was actually analysing at the time is another story :-) This appears to be an independent count_heartbeats bug.

Johan Commelin (Apr 03 2025 at 04:16):

But it's good to have a small way to reproduce it now.

Jovan Gerbscheid (Apr 17 2025 at 10:54):

Ruben Van de Velde said:

Lean says "hey, start checking this code", and when you then write something more, it says "hey, never mind the previous code, here's some new code to check" - but some code ignores the "never mind" command and just keeps running in the background, so when you write "abc", you've got code running for "a", for "ab", and for "abc" simultaneously

I found a way to visibly reproduce this on the webeditor:

  1. go to the top of a large file
  2. start typing some nonsensical tactic
  3. watch as the file gets re-elaborated over and over

In the video, I type the tactic apply qwertyui, and the error message slowly changes from unknown identifier 'q', to 'qw', to 'qwe', etc.

elaborationSlowness.mp4

I'm using Mathlib stable (v4.18.0) to show it's not caused by the recent elaboration changes in v4.19.0

Jovan Gerbscheid (Apr 17 2025 at 10:55):

Note that here the code isn't running simultaneously, but all in a row

Sebastian Ullrich (Apr 17 2025 at 11:22):

I can reproduce this in the web editor but not locally using the same file and Lean version (and, in fact, hardware)

Jovan Gerbscheid (Apr 17 2025 at 13:33):

What is different in the web editor that could cause this discrepancy?


Last updated: May 02 2025 at 03:31 UTC