Zulip Chat Archive
Stream: new members
Topic: A name for bundling indices and a data structure together
Mark Fischer (Feb 08 2024 at 16:53):
I feel like this happens a lot, but I've never heard a name for the process. It comes down to passing around indices into a data structure while stating properties about the parts of the data structure those indices point at.
I've been building out some helper functions with the idea of viewing a dependent pair like (l : List α) × Finset (Fin l.length) as a List α, except that indexing is done through the set of indices in the pair.
def foo : (l : List Char) × Finset (Fin l.length) := ...
-- index the list at the 4th smallest ℕ in the Finset
#check foo[3] -- Char
-- the Finset contains an index that references 'a' in the list
#check 'a' ∈ foo -- Prop
This has been making it a bit easier, notationally, to talk about parts of a larger array: I can write (l,s)[i] instead of l[s.sort[i]], especially in scenarios where it's nice to ignore that some term is actually a dependent pair.
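A minimal sketch of how such helpers might be set up (my own guess, not the code from the thread; IndexedList and the instances below are hypothetical):
import Mathlib
-- Bundle a list with a set of indices into it.
abbrev IndexedList (α : Type*) := (l : List α) × Finset (Fin l.length)
-- `p[i]` indexes the underlying list at the i-th smallest index in the Finset.
instance {α : Type*} : GetElem (IndexedList α) ℕ α (fun p i => i < p.2.card) where
  getElem p i h :=
    p.1.get ((p.2.sort (· ≤ ·)).get ⟨i, by simpa using h⟩)
-- `a ∈ p` says some index in the Finset points at `a` in the list.
-- (The argument order of Membership.mem differs in newer Lean versions.)
instance {α : Type*} : Membership α (IndexedList α) where
  mem a p := ∃ i ∈ p.2, p.1.get i = a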
This seems to be the sort of thing many algorithms use (especially when doing recursion with arrays). When I'm not trying to prove anything about what's happening, I just pass around a list of indices and a data structure, and hope to maintain the necessary invariants ambiently.
Is there a name for this?
Matt Diamond (Feb 09 2024 at 04:37):
What's the purpose of keeping the indices separate? Is the idea that the array is large and sparsely populated? Or are your index Finsets more like different "views" on shared data?
Matt Diamond (Feb 09 2024 at 04:39):
if it's a sparse array thing, that's kind of like docs#Finsupp
Matt Diamond (Feb 09 2024 at 04:40):
instead of a list it's a function from indices to values
Matt Diamond (Feb 09 2024 at 04:41):
with the potential downside that storing 0 is the same as storing nothing at all (and the type of values has to have a Zero element)
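A tiny illustration of that point (my own example, not from the thread):
import Mathlib
-- `ℕ →₀ ℤ` is a finitely supported function: only finitely many inputs
-- map to something other than 0.
example : ℕ →₀ ℤ := Finsupp.single 3 7   -- sends 3 to 7, everything else to 0
-- storing 0 is the same as storing nothing at all
example : (Finsupp.single 3 (0 : ℤ) : ℕ →₀ ℤ) = 0 := by simp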
Mark Fischer (Feb 09 2024 at 14:48):
Yeah, hmmm, this might be a bit of an XY problem.
The problem I've been working through is proving properties about Sudoku boards. Generally, the properties of a region of the board can be stated without carrying around references to indices. So the first row doesn't need to be the first 9 indices; I can just make a corresponding list of cells and prove stuff about the first row that way.
But at least for Sudoku, it ends up mattering which cell in the bigger board has a certain property, since that cell also belongs to two other regions (a column and a box) that come into play later.
So without passing around indices, I'm writing strange-looking theorems connecting the proofs back to the original board. In a sense, it becomes an API issue: I want to be able to state what's happening at roughly the level of abstraction that human Sudoku-solving techniques are described at.
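A rough sketch of the kind of shapes being described (my own guess at the setup; Board, row, col, box, and regionFilled are hypothetical names):
import Mathlib
-- The board is indexed by Fin 81; a region is just a Finset of board
-- indices, so a fact proved about a row still points at concrete cells.
abbrev Cell := Option (Fin 9)
abbrev Board := Fin 81 → Cell
def row (r : Fin 9) : Finset (Fin 81) :=
  Finset.univ.filter fun i => i.val / 9 = r.val
def col (c : Fin 9) : Finset (Fin 81) :=
  Finset.univ.filter fun i => i.val % 9 = c.val
def box (b : Fin 9) : Finset (Fin 81) :=
  Finset.univ.filter fun i => (i.val / 27) * 3 + (i.val % 9) / 3 = b.val
-- A region-level property that is still stated in terms of board indices.
def regionFilled (board : Board) (region : Finset (Fin 81)) : Prop :=
  ∀ i ∈ region, (board i).isSome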
Though more generally, I think this happens in algorithms where you keep pointers into an array. An undergrad programming problem might be to decide whether any two numbers with a specific sum exist in a sorted array ([1, 1, 2, 3, 4, 6, 8, 9] with the sum 11 is true for 2, 9 or 3, 8, for example). A technique you'd expect here is to start with a pointer at either end of the array and move them inward until you find the sum or they meet.
If you want to prove that this algorithm always terminates with the correct answer, there's a nice recursive proof (your array gets smaller at each step), but if you then also want to know which indices the summed numbers were found at, it starts to get a bit messier. Your recursive call effectively passes those pointers around. It feels nice to bundle the array and pointers together, since they logically only matter as a unit. Also, a longer proof may look cleaner in places that way? I'm not sure.
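A minimal sketch of that two-pointer search with the array and the two indices bundled into one structure (my own code, not from the thread; Window and findPairSum are hypothetical names, and partial sidesteps the termination proof mentioned above):
structure Window where
  arr : Array Nat
  lo  : Nat
  hi  : Nat
-- Shrink the window from both ends until the target sum is found or the
-- pointers meet; the recursive call passes the whole bundle around.
partial def findPairSum (w : Window) (target : Nat) : Option (Nat × Nat) :=
  if w.lo < w.hi then
    match compare (w.arr[w.lo]! + w.arr[w.hi]!) target with
    | .eq => some (w.lo, w.hi)
    | .lt => findPairSum { w with lo := w.lo + 1 } target  -- sum too small
    | .gt => findPairSum { w with hi := w.hi - 1 } target  -- sum too large
  else none
#eval findPairSum ⟨#[1, 1, 2, 3, 4, 6, 8, 9], 0, 7⟩ 11  -- some (2, 7), i.e. 2 + 9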
Mark Fischer (Feb 09 2024 at 17:23):
I think your description of different "views" on shared data is pretty much spot on.
Last updated: May 02 2025 at 03:31 UTC