Zulip Chat Archive

Stream: lean4

Topic: Memory issues with `Vector.ofFn`.

Ayhon (May 27 2025 at 09:31):

Consider the following snippet of code.

def N := 800
abbrev foo(state: Vector Bool N): List Bool :=
  state.toArray.toList -- EDIT: There used to be a `.val` here, it was a typo

theorem problem
: N = (foo (Vector.ofFn (fun _ => false))).length
:= by
  /- simp only [foo] -/
  sorry

If I uncomment simp only [foo], my RAM usage spikes by around 5GB and stays used for a while. If I increase N to 1600, the spike rises to about 15GB (which in a bigger project made my computer run out of memory). When N is 800 again, I've observed that if I comment and uncomment the simp only [foo] line successively, the RAM spike seems to happen as many times as this line was uncommented.

I was able to reproduce this in a fresh new project, Lean version 4.19 (stable at the time of writing).

I have recorded a video in which I show these spikes, but unfortunately the quality and contrast are not great, which means that you can't really tell what the RAM usage percentage reads in htop. At least the usage bar gives an idea as to what the spike looks like. The code shown is also slightly different, an earlier version.

lean-memory-issues.webm

Ayhon (May 27 2025 at 09:31):

As suggested by @Son Ho, marking Array.ofFn.go as irreducible seems to fix the problem.

def N := 800
abbrev foo(state: Vector Bool N): List Bool :=
  let aux: {ls: List Bool // ls.length = N} := {
    val := state.1.1
    property := state.2
  }
  aux.val

/- ADAPTED FROM STDLIB -/
def Array.ofFn' {n} (f : Fin n → α) : Array α := go 0 (emptyWithCapacity n) where
  @[irreducible]
  go (i : Nat) (acc : Array α) : Array α :=
    if h : i < n then go (i+1) (acc.push (f ⟨i, h⟩)) else acc
  decreasing_by simp_wf; decreasing_trivial_pre_omega

@[simp] theorem Array.size_ofFn'_go {n} {f : Fin n → α} {i acc} :
    (ofFn'.go f i acc).size = acc.size + (n - i) := by
  if hin : i < n then
    unfold ofFn'.go
    have : 1 + (n - (i + 1)) = n - i :=
      Nat.sub_sub .. ▸ Nat.add_sub_cancel' (Nat.le_sub_of_add_le (Nat.add_comm .. ▸ hin))
    rw [dif_pos hin, size_ofFn'_go, size_push, Nat.add_assoc, this]
  else
    have : n - i = 0 := Nat.sub_eq_zero_of_le (Nat.le_of_not_lt hin)
    unfold ofFn'.go
    simp [hin, this]
termination_by n - i

@[inline] def Vector.ofFn' (f : Fin n → α) : Vector α n :=
  ⟨Array.ofFn' f, by simp [Array.ofFn']⟩
/- ADAPTED FROM STDLIB -/

theorem problem
: N = (foo (Vector.ofFn' (fun _ => false))).length
:= by
  /- simp only [foo] -/
  sorry

Son Ho (May 27 2025 at 09:51):

Adding a question on my side: my understanding is that the Lean kernel repeatedly unfolds Array.ofFn.go when type-checking the theorem, leading to a huge term because this definition uses well-founded recursion. As far as I know, this issue is the reason why definitions which use well-founded recursion are irreducible by default. So I was wondering: why has Array.ofFn.go been marked as semireducible (what is the use case)?

Mario Carneiro (May 27 2025 at 11:28):

if you mark it as irreducible, you won't be able to compute the values of the array anymore?

Ayhon (May 27 2025 at 11:34):

What do you mean by "not being able to compute the values"?

When I tried using #eval with the definition that fixed the issue, it seems to work fine (up to me not understanding something).

Code

def Array.ofFn' {n} (f : Fin n → α) : Array α := go 0 (emptyWithCapacity n) where
  go (i : Nat) (acc : Array α) : Array α :=
    if h : i < n then go (i+1) (acc.push (f ⟨i, h⟩)) else acc
  decreasing_by simp_wf; decreasing_trivial_pre_omega

@[simp] theorem Array.size_ofFn'_go {n} {f : Fin n → α} {i acc} :
    (ofFn'.go f i acc).size = acc.size + (n - i) := by
  if hin : i < n then
    unfold ofFn'.go
    have : 1 + (n - (i + 1)) = n - i :=
      Nat.sub_sub .. ▸ Nat.add_sub_cancel' (Nat.le_sub_of_add_le (Nat.add_comm .. ▸ hin))
    rw [dif_pos hin, size_ofFn'_go, size_push, Nat.add_assoc, this]
  else
    have : n - i = 0 := Nat.sub_eq_zero_of_le (Nat.le_of_not_lt hin)
    unfold ofFn'.go
    simp [hin, this]
termination_by n - i

@[inline] def Vector.ofFn' (f : Fin n → α) : Vector α n :=
  ⟨Array.ofFn' f, by simp [Array.ofFn']⟩

#eval (Vector.ofFn' (fun _ => false) : Vector Bool 10)

Mario Carneiro (May 27 2025 at 11:35):

I mean computing with rfl or decide

Son Ho (May 27 2025 at 12:04):

But I guess you could use simp?

Kim Morrison (May 27 2025 at 12:34):

Ayhon said:

Consider the following snippet of code.

def N := 800
abbrev foo(state: Vector Bool N): List Bool :=
  state.toArray.toList.val

theorem problem
: N = (foo (Vector.ofFn (fun _ => false))).length
:= by
  /- simp only [foo] -/
  sorry

@Ayhon, could you give a #mwe? This isn't illustrating your problem for me, but just gives:

invalid field 'val', the environment does not contain 'List.val'
  state.toArray.toList
has type
  List Bool

(You can use https://live.lean-lang.org/#project=lean-nightly to prepare reproducible examples.)

Ayhon (May 27 2025 at 12:34):

Sorry, the val is a leftover from my previous implementation which used subtypes. It should be fine to remove it. Let me confirm.

Kim Morrison (May 27 2025 at 12:35):

Yes, that works for me, thanks.

Kim Morrison (May 27 2025 at 13:22):

@Ayhon, @Son Ho, this is fixed in lean#8499, which should hopefully land in nightly-2025-05-28.

Ayhon (May 27 2025 at 13:45):

I understand then that the spikes in memory came from the kernel type checking the proof. This also seems to be a common footgun of using well-founded recursion.

However, the real issue for me was that the spike in memory usage continued even after I commented out the simp only call. From a user experience perspective, I would like to be able to tell Lean to cancel doing a large computation without having to close the editor. I thought I'd achieve this by simply changing the state of my proof, but that wasn't the case. Is there a reason why this happens?

Son Ho (May 27 2025 at 14:46):

Kim Morrison said:

Ayhon, Son Ho, this is fixed in lean#8499, which should hopefully land in nightly-2025-05-28.

Thanks a lot for this! :)

Ayhon (Jun 06 2025 at 13:02):

Ayhon said:

I understand then that the spikes in memory came from the kernel type checking the proof. This also seems to be a common footgun of using well-founded recursion.

However, the real issue for me was that the spike in memory usage continued even after I commented out the simp only call. From a user experience perspective, I would like to be able to tell Lean to cancel doing a large computation without having to close the editor. I thought I'd achieve this by simply changing the state of my proof, but that wasn't the case. Is there a reason why this happens?

After the most recent Lean FRO's office hours, Sebastian Ullrich encouraged me to mention this problem in a new Lean issue. I have now opened that issue here. Referencing it in this topic for completeness.

Last updated: Feb 28 2026 at 14:05 UTC