Zulip Chat Archive

Stream: lean4

Topic: Debugging (kernel) deep recursion detected


Geoffrey Irving (Sep 06 2024 at 19:28):

How does one debug deep kernel recursion? Is there an option I can set to get a stack trace or similar when it occurs?

Jovan Gerbscheid (Sep 07 2024 at 00:03):

When this happens to me, I can see the stack trace in the ouput, which in VSCode is one of the tabs next to terminal.

Joachim Breitner (Sep 07 2024 at 08:22):

I have heard that this stack trace may possibly only be visible on Linux, not sure

Jovan Gerbscheid (Sep 07 2024 at 12:19):

I have it on Windows :)

Geoffrey Irving (Sep 07 2024 at 19:50):

I see the Output tab, but alas it doesn't show anything. (I'm on an M2 MacBook Air.)

Jovan Gerbscheid (Sep 07 2024 at 20:29):

The other thing I get in the output tab are panic messages. Do you get them there?

Geoffrey Irving (Sep 07 2024 at 20:30):

No, they show up as popups.

Jovan Gerbscheid (Sep 07 2024 at 20:32):

For me panics and deep recursion usually show up as pop ups, and the pop up contains a link to the output tab, where I see the history of both kinds of errors

Jovan Gerbscheid (Sep 07 2024 at 23:25):

Actually the behaviour for panics has changed for me: they now appear as messages in the infoview. And it shows up with a blue squigle, instead of red so that it doesn't look like an error message :(

Philogy (Oct 30 2024 at 18:19):

encountering this mystical error too, how do I see where it's coming from?

Jovan Gerbscheid (Oct 30 2024 at 18:59):

I discovered that whether or not the stack is shown to me depends on whether the code is run in compiled or interpreted form. So I sometimes alternate between setting precompileModules := true for speed, and then back off to see the stack trace in a stack overflow.

Philogy (Oct 30 2024 at 19:02):

Jovan Gerbscheid said:

I discovered that whether or not the stack is shown to me depends on whether the code is run in compiled or interpreted form. So I sometimes alternate between setting precompileModules := true for speed, and then back off to see the stack trace in a stack overflow.

Haven't heard of this option before, what's the syntax for toggling it?

Philogy (Oct 30 2024 at 19:26):

I just have a default math template lake project (new to Lean so still a bit unfamiliar with wider project configuration / commands)

Jovan Gerbscheid (Oct 30 2024 at 19:26):

In the lakefile.lean, you can put it in like this:

lean_lib «LibraryName» where
  precompileModules := true
  -- add library configuration options here

It compiles the functions in your other files when importing them. But it is off by default.

However, there can be functions from other packages that give stack overflows if those other packages are compiled. Also all functions that come with Lean automatically, such as in Lean and Std namespaces are compiled.

Jovan Gerbscheid (Oct 30 2024 at 19:27):

But setting this to true won't help with debugging this.

Jovan Gerbscheid (Oct 30 2024 at 19:29):

Do you have an idea of whether the stack overflow is from on of your own functions, or a built in function?

Philogy (Oct 30 2024 at 19:32):

Jovan Gerbscheid said:

Do you have an idea of whether the stack overflow is from on of your own functions, or a built in function?

Unsure, I have a fairly large proof but I have identified the specific line where if I comment everything from the bottom up including it and add a sorry it goes away. But it's a simple rw [have_lemma_defined_earlier] any ideas why that would cause it and not the definition use of the have lemma itself?

Jovan Gerbscheid (Oct 30 2024 at 19:34):

Oh I see you get it from a tactic call, I was thinking you got it from executing some code you wrote. Can you make a minimal example of this problem?

Philogy (Oct 30 2024 at 19:37):

Jovan Gerbscheid said:

Oh I see you get it from a tactic call, I was thinking you got it from executing some code you wrote. Can you make a minimal example of this problem?

Is this minimal enough? https://gist.github.com/Philogy/127f5d0e6753b6c7ed304123deb0fc04#file-basic-lean-L224

I've highlighted the line that makes the error disappear if it's commented out. Although the error shows on the line where my theorem is defined (L136)

Jovan Gerbscheid (Oct 30 2024 at 19:52):

I see. Can you minimize it further? So remove all the bits that aren't relevant to the error showing up. #mwe

Jovan Gerbscheid (Oct 30 2024 at 20:04):

Jovan Gerbscheid said:

Do you have an idea of whether the stack overflow is from on of your own functions, or a built in function?

The anwer to this is: a built in function, namely the implementation of rw. The only way to see some sort of stack trace would be to copy paste the definition of rw :sweat_smile:

Philogy (Oct 30 2024 at 20:05):

Jovan Gerbscheid said:

Jovan Gerbscheid said:

Do you have an idea of whether the stack overflow is from on of your own functions, or a built in function?

The anwer to this is: a built in function, namely the implementation of rw. The only way to see some sort of stack trace would be to copy paste the definition of rw :sweat_smile:

Why to built in functions suppress the error like this? Currently working to minimize the example...

Philogy (Oct 30 2024 at 20:10):

Jovan Gerbscheid said:

I see. Can you minimize it further? So remove all the bits that aren't relevant to the error showing up. #mwe

Minimized: https://gist.github.com/Philogy/1e01d255732740edb19bd29ad3225ed7

Jovan Gerbscheid (Oct 30 2024 at 20:13):

I got confused about where the deep recursion is happening. I thought it was in the tactic implementation, but it is in the kernel that checks the proof term. So I'm not actually sure about that

Philogy (Oct 30 2024 at 20:16):

Jovan Gerbscheid said:

I got confused about where the deep recursion is happening. I thought it was in the tactic implementation, but it is in the kernel that checks the proof term. So I'm not actually sure about that

Does Lean not give you some way to dump a stack trace or anything? How is one meant to debug something like this? Should I open an RFC to improve error messaging maybe?

Jovan Gerbscheid (Oct 30 2024 at 20:16):

In your minimized example, replacing the definition of m0 by a small number, the proof works without a deep recursion in the kernel. This means that the kernel is trying to reduce this number for some reason (reducing into a sequence of applications of Nat.succ.

Jovan Gerbscheid (Oct 30 2024 at 20:18):

The idea in lean is that the tactics do all the hard work, and they produce a proofterm that should be easy to check for the kernel. So generally, error messages from the kernel aren't that common and also tend to not be that helpful. I think Ideally, such an error should be thrown by the tactic before the kernel gets a chance to get stuck.

Philogy (Oct 30 2024 at 20:19):

Jovan Gerbscheid said:

In your minimized example, replacing the definition of m0 by a small number, the proof works without a deep recursion in the kernel. This means that the kernel is trying to reduce this number for some reason (reducing into a sequence of applications of Nat.succ.

Is this thread/fact by anyway related? https://leanprover.zulipchat.com/#narrow/channel/270676-lean4/topic/Kernel.20reduction.20of.20Nat.2Esucc.20is.20more.20strict.20that.20expected/near/468136717

Jovan Gerbscheid (Oct 30 2024 at 20:20):

It seems to be.

Jovan Gerbscheid (Oct 30 2024 at 20:21):

In your case, it's probably an easy fix of proving the lemmas in a more general form instead of just for this specific big natural number.

Jovan Gerbscheid (Oct 30 2024 at 20:22):

By the way you can post minimal examples between triple back-ticks like this, and then it comes with a link to the web version of lean for anyone to try it quickly:

import Mathlib.Data.Nat.ModEq

abbrev m0 :=
  11577
abbrev m1 := m0 - 1

lemma succ_coprime (n : ): Nat.Coprime (n + 1) n := by simp

theorem remco_equiv_naive(x y : BitVec 256) : 42 = (x.toNat * y.toNat) % (m0 * m1) := by
  let z := x.toNat * y.toNat
  let x0 := z % m0
  let x1 := z % m1

  have comp_z_chinese :=
    Nat.chineseRemainder_modEq_unique (succ_coprime m1) (a := x0) (b := x1) (z := z)
    ( by simp [Nat.ModEq, x0, m0])
    ( by simp [Nat.ModEq, x1, m0, m1])

  simp [Nat.chineseRemainder, Nat.chineseRemainder'] at comp_z_chinese

  rw [(by simp : x.toNat * y.toNat = z)]
  simp [m0]
  rw [comp_z_chinese]
  sorry

Philogy (Oct 30 2024 at 20:25):

Jovan Gerbscheid said:

By the way you can post minimal examples between triple back-ticks like this, and then it comes with a link to the web version of lean for anyone to try it quickly:

import Mathlib.Data.Nat.ModEq

abbrev m0 :=
  11577
abbrev m1 := m0 - 1

lemma succ_coprime (n : ): Nat.Coprime (n + 1) n := by simp

theorem remco_equiv_naive(x y : BitVec 256) : 42 = (x.toNat * y.toNat) % (m0 * m1) := by
  let z := x.toNat * y.toNat
  let x0 := z % m0
  let x1 := z % m1

  have comp_z_chinese :=
    Nat.chineseRemainder_modEq_unique (succ_coprime m1) (a := x0) (b := x1) (z := z)
    ( by simp [Nat.ModEq, x0, m0])
    ( by simp [Nat.ModEq, x1, m0, m1])

  simp [Nat.chineseRemainder, Nat.chineseRemainder'] at comp_z_chinese

  rw [(by simp : x.toNat * y.toNat = z)]
  simp [m0]
  rw [comp_z_chinese]
  sorry

Ah yes, thanks for the note. So how lazy is Lean? If I prove this in the general case or maybe split out Nat.chineseRemainder' into a lemma and then put my big number as an input will it work? Or will it still try to expand things?

Jovan Gerbscheid (Oct 30 2024 at 20:28):

I would suggest doing the problematic rewrite in a context that doesn't have access to the big number. This means proving your theorem remco_equiv_naive for general m0/m1, and any extra hypotheses about them that you need.

I don't know if there is any better workaround.


Last updated: May 02 2025 at 03:31 UTC