Zulip Chat Archive

Stream: new members

Topic: Array indexing strangeness


Malcolm Langfield (May 28 2022 at 03:24):

Can anyone explain why this is expected behavior in Lean 4:

-- Array literal declaration and subscripts.
#eval #['a', 'b', 'c'][4]      'A'
#eval #['b', 'c', 'd'][4]      'A'
#eval #[1, 2, 3][4]      0
#eval #["a", "aa", "aaa"][4]      ""

I expect it has something to do with wanting the "subscript" function to be total. This seems like javascript-level silent failure inducing behavior to me. I thought one of the big selling points of dependent types was the whole Vector example thing (I'm thinking of one of Edwin Brady's talks), where we can't get index errors because the length of our containers is encoded in the type?

Is Array a type that most users aren't supposed to touch, or something that only exists because it's very fast in certain special cases or something?

Chris Bailey (May 28 2022 at 03:35):

Length isn't in the type of Array. The notation here uses Array.getOp, so it's returning the default element of the type. Collections generally have methods like Array.get that take a proof the access is in-bounds to ensure safe access.

I think the three options for default behavior in this case are 1. crash, 2. return an Option A, or 3. return a default element. I wouldn't call the third option failure, but I sort of agree it's the most likely to hide a bug. I'm not sure if there's a reason they went with that or if it's set in stone.

Chris Bailey (May 28 2022 at 03:39):

I'm not sure if there's a public indexing function that panics yet, so that might be part of it.

Malcolm Langfield (May 28 2022 at 03:43):

Thanks very much for the explanation! I understand length isn't in the type of Array. I guess my question is more why Array exists at all, and in particular why something that is unsafe like this is given "priority" notation, if that makes any sense. When would you need something that doesn't take a proof of in-bounds-ness?

Chris Bailey (May 28 2022 at 04:00):

You need it if you don't know that your index is in-bounds statically. get with some default to cover failure is the same thing as trying to generate the proof at runtime. Even if you made every sequence in your code Vec A n at the type level, it's unlikely that you'll always be able to tell that a given i is in-bounds without testing it, at which point you need some behavior if it's not actually in-bounds.

As for why the default is what it is, I think it's probably some combination of what I mentioned. There might not be a preferable alternative they can implement with what's public at this point in development, and I don't think they've spent a lot of time trying to put a razor fine edge on the public API. They might change it in the future.

Malcolm Langfield (May 28 2022 at 04:03):

I see, that makes a lot of sense, thanks!

Kyle Miller (May 28 2022 at 06:37):

Another thing is that sometimes it's easier to write an algorithm and prove that it's correct than to prove each step is safe inline. That's easier said than done at the moment with Array, though, since there aren't so many lemmas about it.

Kyle Miller (May 28 2022 at 06:40):

Array is an implementation of List that at runtime is represented by a contiguous block of memory. As far as I know, it's meant to be used. It has the special feature that the Lean compiler will turn functional modifications of an Array into actual mutations.


Last updated: Dec 20 2023 at 11:08 UTC