Zulip Chat Archive
Stream: batteries
Topic: Large vector hangs
Violeta Hernández (Sep 27 2024 at 22:25):
I'm running into this bizzare issue: a vector of length 32 can be defined almost instantly, but a vector of length 33 seemingly hangs indefinitely.
import Batteries.Data.Vector.Basic
-- instant
example : Batteries.Vector Nat 32 :=
#v[1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1]
-- (deterministic) timeout at `whnf`, maximum number of heartbeats (200000) has been reached
example : Batteries.Vector Nat 33 :=
#v[1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1]
Violeta Hernández (Sep 27 2024 at 22:31):
Batteries.Vector.mkVector 33 1
works just fine, so I suspect this is an issue with the #v
macro
Violeta Hernández (Sep 27 2024 at 22:42):
Actually, this still reproduces the bug:
import Batteries.Data.Vector.Basic
-- instant
example : Batteries.Vector Nat 32 :=
⟨List.toArray [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1], rfl⟩
-- (deterministic) timeout at `whnf`, maximum number of heartbeats (200000) has been reached
example : Batteries.Vector Nat 33 :=
⟨List.toArray [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1], rfl⟩
Violeta Hernández (Sep 27 2024 at 22:43):
But replacing rfl
by sorry
gets the bug to disappear
Violeta Hernández (Sep 27 2024 at 22:44):
As does replacing rfl
with by simp
Shreyas Srinivas (Sep 27 2024 at 22:50):
Are you able to get the same bug with arrays?
François G. Dorais (Sep 27 2024 at 22:51):
Very interesting! Issues in Batteries have been very underused but this may change in the near future. Please feel free to file a bug report!
That said, the definition of Vector
is pretty straightforward so there is a decent chance the underlying issue is in core. Can you minimize further by defining your own Vector
type?
structure MyVector (α : Type u) (n : Nat) where
toArray : Array α
size_eq : toArray.size = n
Shreyas Srinivas (Sep 27 2024 at 22:51):
Specifically, array notation?
Violeta Hernández (Sep 27 2024 at 22:52):
The issue doesn't seem to appear with array
import Init.Prelude
-- length 32, also works
example : Array Nat :=
List.toArray [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1]
-- length 33, also works
example : Array Nat :=
List.toArray [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1]
Shreyas Srinivas (Sep 27 2024 at 22:52):
No that's converting lists to arrays
Shreyas Srinivas (Sep 27 2024 at 22:52):
I think there is a separate array notation.
Shreyas Srinivas (Sep 27 2024 at 22:52):
That uses mkArray
Violeta Hernández (Sep 27 2024 at 22:53):
Yep, works too
import Init.Prelude
example : Array Nat :=
#[1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1]
example : Array Nat :=
#[1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1]
François G. Dorais (Sep 27 2024 at 22:53):
The issue is almost surely with rfl
.
Shreyas Srinivas (Sep 27 2024 at 22:53):
Are you getting the same issue with your custom vector type + notation?
Violeta Hernández (Sep 27 2024 at 22:54):
I suspect as much, this reproduces the bug
import Init.Prelude
structure MyVector (α : Type u) (n : Nat) where
toArray : Array α
size_eq : toArray.size = n
-- works
example : MyVector Nat 32 :=
⟨#[1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1], rfl⟩
-- doesn't work
example : MyVector Nat 33 :=
⟨#[1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1], rfl⟩
Violeta Hernández (Sep 27 2024 at 22:54):
I'll file a bug under core Lean
Shreyas Srinivas (Sep 27 2024 at 22:55):
Then the issue is with rfl. How does rfl interact with the new wf reduction opaqueness policy?
François G. Dorais (Sep 27 2024 at 22:55):
That's what I would recommend. Thank you for letting us know about this!
Violeta Hernández (Sep 27 2024 at 22:57):
Huh? This issue doesn't reproduce in Lean nightly
https://live.lean-lang.org/#project=lean-nightly&codez=JYWwDg9gTgLgBASQHbBgOgApQKYBsCuAJtgFAkDOMU+AxjPjnALICeAatndHABSCNwHABccACosw2OPgCUvJELgA5AIYxZAdwAW2HCThwYEAIJQoylgpNmLfPXHLAAXtgD62AI4LDV82gfO4AF44JDJsAA9lcFxJYVYOLiglVTgAZgAmIUC7QAvyAGIAbQBGABpS8rLKiuqq2pr6usb6gF0SuCgAM1xAS/I4MMjo2OZ2TkMklXhU1Kzcwob5psWF5aXW9q7uoA
Shreyas Srinivas (Sep 27 2024 at 22:57):
Are you able to get the proof done with some other tactic, say simp?
Violeta Hernández (Sep 27 2024 at 22:57):
simp
does work in this instance, yes
Shreyas Srinivas (Sep 27 2024 at 23:02):
Then maybe we ought to modify the notation to get the proof from simp
Shreyas Srinivas (Sep 27 2024 at 23:03):
I'll fix this on the batteries side
Shreyas Srinivas (Sep 27 2024 at 23:09):
Is it linked to this by any strange chance: https://github.com/leanprover/lean4/pull/5446
François G. Dorais (Sep 27 2024 at 23:12):
The issue is the same in lean-nightly, you just have to wait a while longer to see it happen on the web interface.
Violeta Hernández (Sep 27 2024 at 23:12):
I'm starting to think this might have nothing to do with arrays. I'm running into a separate issue where making a list of length 33 causes Lean to eat up all of my memory
image.png
This one I can't quite as easily reproduce though
Violeta Hernández (Sep 27 2024 at 23:13):
(and again, 32 is the magic number where it works fine)
Jannis Limperg (Sep 27 2024 at 23:17):
Afaik there are different elaborators for large and small array/list literals. Perhaps 32 is the cutoff and the large elaborator is somehow producing code that is difficult for rfl
to evaluate.
François G. Dorais (Sep 27 2024 at 23:17):
@Kim Morrison is working a lot with array code right now, so I'm pinging them to take a look.
François G. Dorais (Sep 27 2024 at 23:19):
It's actually the weekend for them so we shouldn't expect a response in the immediate future.
François G. Dorais (Sep 27 2024 at 23:20):
Anyway, a bug report in core is the right way to go.
Kyle Miller (Sep 27 2024 at 23:20):
Jannis Limperg said:
Perhaps 32 is the cutoff
Yes, you can see it with
#check #[1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1]
and then adding an extra 1
.
Kyle Miller (Sep 27 2024 at 23:24):
Interestingly, by rfl
makes it elaborate instantly.
example : MyVector Nat 33 :=
⟨#[1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 1], by rfl⟩
I guess somehow rfl
is causing some bad defeq problems if it happens too soon?
Violeta Hernández (Sep 27 2024 at 23:24):
I've managed to reproduce the bug on bare lists! Give me a second
Violeta Hernández (Sep 27 2024 at 23:27):
import Init.Prelude
example : List Nat :=
let l := [1]
[l[0], l[0], l[0], l[0], l[0], l[0], l[0], l[0], l[0], l[0], l[0], l[0], l[0], l[0], l[0], l[0], l[0], l[0], l[0], l[0], l[0], l[0], l[0], l[0], l[0], l[0], l[0], l[0], l[0], l[0], l[0], l[0]]
Adding one more instance of l[0]
makes Lean eat up all your RAM :cold_sweat:
Violeta Hernández (Sep 27 2024 at 23:27):
yet again, 32 is the magic number
Kyle Miller (Sep 27 2024 at 23:28):
I suspect that's a different performance bug, with indexing notation.
Kyle Miller (Sep 27 2024 at 23:30):
Though on second thought, maybe it's related, and it has to do with a bad interaction with the let
s... (never mind, I think it's unrelated.)
Violeta Hernández (Sep 27 2024 at 23:30):
Oh nevermind, in this example it just takes a bit longer. In my actual code where the indexing list was more complicated it was in fact eating all my RAM
Violeta Hernández (Sep 27 2024 at 23:33):
I've opened #5502
Shreyas Srinivas (Sep 27 2024 at 23:34):
you might want to add the array example too
Shreyas Srinivas (Sep 27 2024 at 23:34):
It's clearly not isolated to vectors
Shreyas Srinivas (Sep 27 2024 at 23:35):
under steps to reproduce
Violeta Hernández (Sep 27 2024 at 23:35):
Which example? Declaring an array of length 33 worked just fine
Mario Carneiro (Sep 27 2024 at 23:38):
Yes, this has been known for a while (or at least I've complained about it on multiple occasions). There is a switchover to a different compilation strategy for large lists, but the strategy is actually much worse than the simple one
Mario Carneiro (Sep 27 2024 at 23:39):
it creates nested let bindings that break a bunch of fast paths in the compiler and elaborator
Violeta Hernández (Sep 27 2024 at 23:40):
Is there a way to disable it? Or at least increase the limit at which it takes hold?
Mario Carneiro (Sep 27 2024 at 23:40):
The reason it exists is because without it large list literals create very deep terms that lean doesn't handle well, but this could also be done by generating trees of appends instead
Mario Carneiro (Sep 27 2024 at 23:41):
also, the compiler unfolds all the let bindings and creates the deep terms anyway, so it seems kind of pointless
Mario Carneiro (Sep 27 2024 at 23:42):
You can disable it by locally overriding list notation to not use the bad macro
Shreyas Srinivas (Sep 27 2024 at 23:42):
So I tried all the failing examples with rfl
above and by rfl
works
Shreyas Srinivas (Sep 27 2024 at 23:42):
For example:
import Init.Prelude
structure MyVector (α : Type u) (n : Nat) where
toArray : Array α
size_eq : toArray.size = n
-- works
example : MyVector Nat 32 :=
⟨#[1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1], rfl⟩
-- doesn't work
example : MyVector Nat 33 :=
⟨#[1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1], (by rfl)⟩
Shreyas Srinivas (Sep 27 2024 at 23:43):
I didn't remove your comments from the original example, but they both work
Violeta Hernández (Sep 27 2024 at 23:43):
Mario Carneiro said:
You can disable it by locally overriding list notation to not use the bad macro
how?
Kyle Miller (Sep 27 2024 at 23:44):
(@Shreyas Srinivas I mentioned that here)
Shreyas Srinivas (Sep 27 2024 at 23:45):
oh yeah. sorry
Shreyas Srinivas (Sep 27 2024 at 23:45):
I must have passed over it.
Mario Carneiro (Sep 27 2024 at 23:46):
universe u
structure MyVector (α : Type u) (n : Nat) where
toArray : Array α
size_eq : toArray.size = n
open Lean
macro_rules
| `([ $elems,* ]) => do
let rec expandListLit (i : Nat) (skip : Bool) (result : TSyntax `term) : MacroM Syntax := do
match i, skip with
| 0, _ => pure result
| i+1, true => expandListLit i false result
| i+1, false => expandListLit i true (← ``(List.cons $(⟨elems.elemsAndSeps.get! i⟩) $result))
let size := elems.elemsAndSeps.size
expandListLit size (size % 2 == 0) (← ``(List.nil))
-- works
example : MyVector Nat 32 :=
⟨#[1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1], rfl⟩
-- works
example : MyVector Nat 33 :=
⟨#[1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1], rfl⟩
Mario Carneiro (Sep 27 2024 at 23:47):
this is just a copy paste of the original macro with the slow path removed
Kyle Miller (Sep 27 2024 at 23:47):
It's possible that the difference between by rfl
and rfl
is that in this case the rfl
tactic can and does use kernel defeq
François G. Dorais (Sep 27 2024 at 23:49):
Since Vector
have known length, it is possible to figure out much better strategies before parsing the array. So maybe Vector
needs an independent implementation of the said macro for maximal efficiency.
Shreyas Srinivas (Sep 27 2024 at 23:50):
I started the PR with by_rfl
, but maybe I can add this macro Mario provided
Shreyas Srinivas (Sep 27 2024 at 23:51):
Violeta Hernández (Sep 27 2024 at 23:52):
Mario Carneiro said:
universe u structure MyVector (α : Type u) (n : Nat) where toArray : Array α size_eq : toArray.size = n open Lean macro_rules | `([ $elems,* ]) => do let rec expandListLit (i : Nat) (skip : Bool) (result : TSyntax `term) : MacroM Syntax := do match i, skip with | 0, _ => pure result | i+1, true => expandListLit i false result | i+1, false => expandListLit i true (← ``(List.cons $(⟨elems.elemsAndSeps.get! i⟩) $result)) let size := elems.elemsAndSeps.size expandListLit size (size % 2 == 0) (← ``(List.nil)) -- works example : MyVector Nat 32 := ⟨#[1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1], rfl⟩ -- works example : MyVector Nat 33 := ⟨#[1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1], rfl⟩
My code with nested indexing doesn't hang anymore with this workaround! Thanks!
Shreyas Srinivas (Sep 27 2024 at 23:53):
OTOH, this macro makes no sense for vector since the existing one calls the array notation
Mario Carneiro (Sep 27 2024 at 23:58):
Here's a version of the slow path that constructs append trees instead of nested lets:
macro_rules
| `([ $elems,* ]) => do
let elems := elems.getElems
let rec expand (lo i : Nat) (result : TSyntax `term) : MacroM (TSyntax `term) := do
match i with
| 0 => pure result
| i+1 => expand lo i (← ``(List.cons $(⟨elems[lo+i]!⟩) $result))
let nil ← ``(List.nil)
let rec bisect (lo hi : Nat) : MacroM (TSyntax `term) := do
if lo + 32 < hi then
let mid := (lo + hi) / 2
``(List.append $(← bisect lo mid) $(← bisect mid hi))
else
expand lo (hi - lo) nil
bisect 0 elems.size
Mario Carneiro (Sep 28 2024 at 00:02):
Also this macro is inefficient in that it constructs a big term to be processed by the elaborator, instead of just constructing the term directly, saving on a lot of metavariable construction and application elaboration
Mario Carneiro (Sep 28 2024 at 00:03):
e.g. lean has to work out that the type variables in all the list applications are the same, but this also interacts nontrivially with coercions inside the elements; this work can be shortcut by doing one big unification
Violeta Hernández (Sep 28 2024 at 00:52):
I'm running into yet another issue with vectors: indexing is really slow!
import Batteries.Data.Vector.Basic
-- Works, really slow.
example : #v[1,1,1,1,1,1,1,1,1,1,1,1,1,1][4] = 1 := rfl
-- Timeout
example : #v[1,1,1,1,1,1,1,1,1,1,1,1,1,1,1][4] = 1 := rfl
Violeta Hernández (Sep 28 2024 at 00:53):
This is even with Mario's workaround
Violeta Hernández (Sep 28 2024 at 00:54):
And even with by rfl
Violeta Hernández (Sep 28 2024 at 00:55):
Actually, this times out even if the proof is sorried.
Violeta Hernández (Sep 28 2024 at 00:57):
This issue doesn't appear with lists or even with arrays
Mario Carneiro (Sep 28 2024 at 00:57):
this is in part caused by List.toArray
, for which a fix recently landed in lean core
Shreyas Srinivas (Sep 28 2024 at 00:57):
The by rfl
fix is ready and passed both batteries CI and mathlib build: https://github.com/leanprover-community/batteries/pull/968#issuecomment-2380322025
Mario Carneiro (Sep 28 2024 at 00:59):
here's a demonstration of the effect that the List.toArray
to Array.mk
change will have:
-- today: slow
example : (Vector.mk (n := 15) (List.toArray [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1]) rfl)[4] = 1 := rfl
-- tomorrow: fast
example : (Vector.mk (n := 15) (Array.mk [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1]) rfl)[4] = 1 := rfl
Mario Carneiro (Sep 28 2024 at 00:59):
that expression is the result of macro expansion on #v[...]
syntax
Mario Carneiro (Sep 28 2024 at 01:00):
cc @Kim Morrison in case you thought it didn't matter :)
Violeta Hernández (Sep 28 2024 at 01:02):
Is the relevant PR this one? https://github.com/leanprover/lean4/pull/5446
Mario Carneiro (Sep 28 2024 at 01:04):
lean4#5452 is attempt number 2 at that PR
Violeta Hernández (Sep 28 2024 at 01:07):
Overrode the #[]
notation in my code, and yeah, this is working like a charm
Kim Morrison (Sep 28 2024 at 03:54):
Violeta Hernández said:
Is the relevant PR this one? https://github.com/leanprover/lean4/pull/5446
This one is still potentially in flight. I think it should have no effect (hence not worrying about it too much!), but it fails in rebootstrapping. If anyone could explain to me what is going on there I'd be grateful.
Last updated: May 02 2025 at 03:31 UTC