Zulip Chat Archive

Stream: new members

Topic: Arrays vs. vectors vs. ...?

Timothy Chow (Nov 23 2025 at 22:54):

I have some combinatorial data that, if I were using a conventional programming language or CAS, I would simply "hardcode" as a one-dimensional array of unsigned integers. What is the recommended way to do this in Lean, if I have a few hundred or maybe a few thousand unsigned integers? Two options that I'm aware of are:

def v1 : Array Nat := #[1, 2, 3, 4, 5]
def v2 := ![1, 2, 3, 4, 5]

#eval v1[0]
#eval v2 0

However, I don't fully understand what I'm doing in either case. The Array option actually doesn't seem to work for me; more precisely, it works in the above example, but when I create a large Array then I get the dreaded "maximum recursion depth exceeded" error. I can circumvent that with v1[0]! instead of v1[0] but I don't understand all the implications of omitting the bounds checking.

The second option is what I'm currently using, but it also has some unexpected (to me) behavior. In the above example, v1[8] generates an error but v2 8 simply "wraps around." Why doesn't v2 8 generate an error?

Aaron Liu (Nov 23 2025 at 23:46):

because (8 : Fin 5) is valid so v2 8 also works

Aaron Liu (Nov 23 2025 at 23:46):

Timothy Chow said:

I have some combinatorial data that, if I were using a conventional programming language or CAS, I would simply "hardcode" as a one-dimensional array of unsigned integers.

I would probably read it in from another file

Aaron Liu (Nov 23 2025 at 23:47):

can you tell me what you are using all these numbers for?

Aaron Liu (Nov 23 2025 at 23:48):

for example, do you need to know at compile time the length of the array, or the contents of the array, or just that it's a big array of numbers?

Jakub Nowak (Nov 23 2025 at 23:50):

Timothy Chow said:

The second option is what I'm currently using, but it also has some unexpected (to me) behavior. In the above example, v1[8] generates an error but v2 8 simply "wraps around." Why doesn't v2 8 generate an error?

v2 indices are of type Fin n (where n is a length). The conversion of integer literals to Fin n is defined to wrap around. Also arithmetic operations on type Fin n do wrap around (in other words, it's arithmetic modulo n). If you want to avoid wrapping around, you could work with indices of type Nat. But then, to construct Fin n from idx : Nat, you have to provide the proof idx < n. Which is basically the same proof that array requires.

Timothy Chow said:

The Array option actually doesn't seem to work for me; more precisely, it works in the above example, but when I create a large Array then I get the dreaded "maximum recursion depth exceeded" error.

I think, you can report that as a bug. Here's #mwe:

def v1 : Array Nat := #[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121]
#eval v1[0]

This works though:

def v1 : Array Nat := #[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121]

def foo (v : Array Nat) (hv : v.size = 122 := by get_elem_tactic) : Nat :=
  v[0]

#eval foo v1

It seems like Lean has some issues with handling big array literals. You can try parsing data from a file, or using something like above as a workaround.

Aaron Liu (Nov 23 2025 at 23:50):

and do you want this to be optimized for kernel indexing into the array or runtime indexing into the array

Aaron Liu (Nov 23 2025 at 23:52):

Jakub Nowak said:

Timothy Chow said:

The Array option actually doesn't seem to work for me; more precisely, it works in the above example, but when I create a large Array then I get the dreaded "maximum recursion depth exceeded" error.

I think, you can report that as a bug. Here's #mwe:

def v1 : Array Nat := #[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121]
#eval v1[0]

This works though:

def v1 : Array Nat := #[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121]

def foo (v : Array Nat) (hv : v.size = 122 := by get_elem_tactic) : Nat :=
  v[0]

#eval foo v1

It seems like Lean has some issues with handling big array literals. You can try parsing data from a file, or using something like above as a workaround.

I think, it is unfolding the array and taking a while to compute its length one by one. But it works if you fix the length

def v1 : Vector Nat 122 := #v[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121]

#eval v1[0]

Jakub Nowak (Nov 23 2025 at 23:53):

But #eval v1.size works, even with much longer array. So the issue is not in calculating the size.

Aaron Liu (Nov 23 2025 at 23:54):

but the elaborator doesn't use #eval

Jakub Nowak (Nov 23 2025 at 23:55):

Ah, you're right.

Jakub Nowak (Nov 23 2025 at 23:55):

Although get_elem_tactic was able to prove v.size = 122, no problem ~~(works with much longer too)~~.

Aaron Liu (Nov 23 2025 at 23:58):

but can it prove 0 < v1.size

Jakub Nowak (Nov 24 2025 at 00:00):

Ah, wait, no, sorry, it doesn't work with v.size = 200.

Jakub Nowak (Nov 24 2025 at 00:07):

I guess the tactic shouldn't try calculate the size. Instead, it should give the proof 0 < v.size just by checking it has at least one element. Not sure if that would be possible?

Timothy Chow (Nov 24 2025 at 02:07):

Aaron Liu said:

can you tell me what you are using all these numbers for?

There are a variety of situations I have in mind, but in general the idea is that there's some moderately large, fixed combinatorial object that I want to prove has some property, and use to prove other general facts. The object might be a finite graph, or a combinatorial design, or a Young tableau, or any of a long list of possibilities. I picked a one-dimensional array as the simplest case; an integer partition or a permutation could be represented as a one-dimensional array, for example.

In particular, the data isn't going to change. I'm not going to be modifying the numbers or increasing the size of the array.

Aaron Liu (Nov 24 2025 at 02:10):

then if it's not too too big I would probably store it in a docs#Vector (for easy writing down) or in a docs#Lean.RArray (for optimized access time)

Aaron Liu (Nov 24 2025 at 02:13):

but I feel like docs#Lean.RArray you should only use that one if the only thing you're going to be doing with it is looking up its entries over and over

Aaron Liu (Nov 24 2025 at 02:14):

if it's too big to reasonably fit in a Lean file without being an significant interruption then I would write some meta-code to read it in from another file

Timothy Chow (Nov 24 2025 at 02:19):

Thanks; I think probably Vector with an explicitly specified length is the way to go, at least for the initial examples I have in mind. What about a 2-dimensional array, though? Matrix, I guess?

Aaron Liu (Nov 24 2025 at 02:20):

oh actually I guess if it's for proving things maybe List.Vector would be better

Aaron Liu (Nov 24 2025 at 02:20):

for matrices I would just flatten it I feel like 2d stuff is too complicated

Aaron Liu (Nov 24 2025 at 02:21):

another option is a vector of vectors

Aaron Liu (Nov 24 2025 at 02:22):

Matrix is really not yeah actually no Matrix is probably fine

Eric Wieser (Nov 24 2025 at 02:43):

Maybe it's also worth noting that it's easy to convert between the representation, since v1 = .ofFn v2 (docs#Array.ofFn) and v2 = v1.get (docs#Array.get)

Timothy Chow (Nov 24 2025 at 02:43):

What's the difference between Vector and List.Vector?

Eric Wieser (Nov 24 2025 at 02:43):

The same as the difference between docs#Array and docs#List

Timothy Chow (Nov 24 2025 at 02:45):

Let me try asking the question another way. What are the pros and cons of using Vector versus List.Vector for the application I described above?

Matt Diamond (Nov 24 2025 at 05:00):

the docs for docs#List.Vector briefly describe the difference: List.Vector is better when you're reasoning about the list mathematically and Vector is better when it's just a data store for programming / verification

Timothy Chow (Nov 24 2025 at 12:00):

Yes, I saw that. But why is List.Vector better when you're reasoning about the list mathematically and why is Vector better when it's just a data store?

Ruben Van de Velde (Nov 24 2025 at 12:25):

I thought we were going to remove List.Vector

Timothy Chow (Nov 24 2025 at 14:36):

Jakub Nowak said:

I think, you can report that as a bug.

I forgot to mention that, while searching the Zulip archives for help, I noticed that something similar was reported a while ago. But I wasn't able to extract a clear answer to my question from that discussion.

Shreyas Srinivas (Nov 24 2025 at 14:55):

Matt Diamond said:

the docs for docs#List.Vector briefly describe the difference: List.Vector is better when you're reasoning about the list mathematically and Vector is better when it's just a data store for programming / verification

you can get the underlying list of a vector from docs#Vector . Technically Vector is a size limited array which in turn is a list (from the proof side). Given v : Vector a n, you can get the list with v.toArray.toList

Kevin Buzzard (Nov 24 2025 at 14:56):

Timothy Chow said:

Yes, I saw that. But why is List.Vector better when you're reasoning about the list mathematically and why is Vector better when it's just a data store?

I don't know the answer to this, so probably shouldn't be contributing to this thread, but for the analogous question with natural numbers the standard unary representation with zero and succ is better for proving (because the generated induction principle is the one you want) but lousy for calculating (because a number like 10^6 takes up 10^6 space as S(S(S(S(S(...S(zero))...), whereas the standard binary representation for nats (WithZero PosNum, where PosNum has the three constructors one, double and double_then_add_one) is going to be much fiddlier for proving (even defining multiplication is going to be fiddly, let alone proving that induction works) but much better for computing (becasue 10^6 now is stored in size log(10^6)).

Shreyas Srinivas (Nov 24 2025 at 14:57):

Fwiw, depending on what you want to prove, the Array API (and Vector API) might be adequate. So you probably don't need to access the underlying list.

Wrenna Robson (Nov 24 2025 at 17:47):

Ruben Van de Velde said:

I thought we were going to remove List.Vector

I wish!

Wrenna Robson (Nov 24 2025 at 17:48):

I do think that Vector is the best choice a lot of the time. In practice if one wants to think more mathematically, Fin n -> _ is better supported by the library anyway - List.Vector is just not very functional.

Wrenna Robson (Nov 24 2025 at 17:49):

(I think we should have a Tuple abbreviation for that but that is a different argument.)

Last updated: Dec 20 2025 at 21:32 UTC