Zulip Chat Archive
Stream: lean4
Topic: indexmap
Arthur Paulino (May 12 2025 at 15:42):
Hi all, I'm trying to implement an IndexMap with a guarantee of well-formedness. I'm needing some help with these sorries:
import Std.Data.HashMap
structure IndexMap (α : Type u) (β : Type v) [BEq α] [Hashable α] where
indices : Std.HashMap α Nat
pairs : Array (α × β)
wellFormed : ∀ a : α, indices[a]? = some i → i < pairs.size
namespace IndexMap
variable [BEq α] [Hashable α] (m : IndexMap α β) (a : α) (b : β)
instance : Inhabited (IndexMap α β) where
default :=
let indices := default
let pairs := #[]
have {i} : ∀ (a : α), indices[a]? = some i → i < pairs.size := by
intro a ha
have : indices[a]? = none := Std.HashMap.getElem?_emptyc
simp [this] at ha
⟨indices, pairs, this⟩
def insert : IndexMap α β :=
match h : m.indices[a]? with
| none =>
let newIndices := m.indices.insert a m.pairs.size
let newPairs := m.pairs.push (a, b)
have {i} : ∀ (a : α), newIndices[a]? = some i → i < newPairs.size := by
intro a' ha'
sorry
⟨newIndices, newPairs, this⟩
| some idx =>
let newIndices := m.indices.insert a idx
have := m.wellFormed a h
let newPairs := m.pairs.set idx (a, b)
have {i} : ∀ (a : α), newIndices[a]? = some i → i < newPairs.size := by
intro a' ha'
sorry
⟨newIndices, newPairs, this⟩
Does this look like the right direction? Thanks!
Mario Carneiro (May 12 2025 at 17:46):
your well formedness assertion is awfully weak? I would expect you to say that indices[a]? = some i implies fst <$> pairs[i]? = some a
Arthur Paulino (May 12 2025 at 17:51):
Not sure what you classify awfully weak. What I have is enough for me to say
have := m.wellFormed a h
let newPairs := m.pairs.set idx (a, b)
I'll try your suggestion and see where that will take me to
Mario Carneiro (May 12 2025 at 17:52):
I'm looking at the proof and it will need EquivBEq and LawfulHashable at least
Mario Carneiro (May 12 2025 at 18:00):
def insert : IndexMap α β :=
match h : m.indices[a]? with
| none =>
let newIndices := m.indices.insert a m.pairs.size
let newPairs := m.pairs.push (a, b)
have {i} : ∀ (a : α), newIndices[a]? = some i → i < newPairs.size := by
intro a' ha'
simp [newPairs, newIndices, Std.HashMap.getElem?_insert] at *
split at ha' <;> simp_all
have := m.wellFormed _ ha'; omega
⟨newIndices, newPairs, this⟩
| some idx =>
let newIndices := m.indices.insert a idx
have := m.wellFormed a h
let newPairs := m.pairs.set idx (a, b)
have {i} : ∀ (a : α), newIndices[a]? = some i → i < newPairs.size := by
intro a' ha'
simp [newPairs, newIndices, Std.HashMap.getElem?_insert] at *
split at ha' <;> simp_all
· exact ha' ▸ this
· exact m.wellFormed _ ha'
⟨newIndices, newPairs, this⟩
Mario Carneiro (May 12 2025 at 18:03):
I guess erasure will be hard in this data structure, but maybe you don't need that
Arthur Paulino (May 12 2025 at 18:04):
Yeah not even the Rust lib supports that well. That's fine
Arthur Paulino (May 12 2025 at 18:06):
Your implementation works beautifully, thanks a lot. I'll study it
Mario Carneiro (May 12 2025 at 18:08):
unnecessary golf:
def insert : IndexMap α β := by
refine
match h : m.indices[a]? with
| none => ⟨m.indices.insert a m.pairs.size, m.pairs.push (a, b), ?_⟩
| some idx =>
have := m.wellFormed a h
⟨m.indices.insert a idx, m.pairs.set idx (a, b), ?_⟩
all_goals
intro i a' ha'
simp [Std.HashMap.getElem?_insert] at *
split at ha' <;> simp_all
have := m.wellFormed _ ha'; omega
Arthur Paulino (May 12 2025 at 20:14):
I've renamed wellFormed to validIndices. Well-formedness requires stronger constraints, even stronger than the one you suggested. Maybe constraining sizes to be equal would be enough (as an addition to your suggestion).
Mario Carneiro (May 13 2025 at 02:31):
If the condition I mentioned is written as a biconditional, then you get a bijection and therefore the sizes are the same
Mario Carneiro (May 13 2025 at 02:33):
Oh, but actually there is an issue in the statement: unless you know that the type is LawfulBEq, it could be the case that indices[a]? = some i and indices[a']? = some i because they are hitting the same element of the map and a == a'. The assertion has to be restricted to just talk about elements that are actually in the map, i.e. (a, i) ∈ indices.toList
Arthur Paulino (May 13 2025 at 10:19):
I think an efficient IndexMap with formal & strong guarantees of well-formedness would be a great addition to Batteries. But I'll leave that to someone who's more skilled and/or has more time available than myself.
My only suggestion is not to adhere to some names the Rust API adopted because they're confusing. For example, in my code:
- Instead of
get, I definedgetByKey
@[inline] def getByKey : Option β :=
m.indices[a]? >>= (m.pairs[·]?.map Prod.snd)
- Instead of
get_index/getIndex, I definedgetByIdx
@[inline] def getByIdx (i : Nat) : Option (α × β) :=
m.pairs[i]?
Mario Carneiro (May 15 2025 at 01:19):
I think we would want to follow the naming convention from other lean data structures, which suggest get and getIdx for these operations
Mario Carneiro (May 15 2025 at 01:20):
If you have an API design, you could submit that as a PR to batteries using proof_wanted
Mario Carneiro (May 15 2025 at 01:20):
getting the API design is 80% of the work
Arthur Paulino (May 15 2025 at 01:44):
The reason why I don't like "get index" is because an index map also has a "get index of". And those have very similar meanings when you say them out loud
Arthur Paulino (May 15 2025 at 01:45):
"get"/"getByIdx"/"getIdxOf" would work
Mario Carneiro (May 15 2025 at 02:36):
another reason to make an API proposal PR: you get first dibs at naming everything :)
Robin Arnez (May 15 2025 at 08:53):
We have TreeMap.getEntryAtIdx? so that's probably a reasonable scheme to adopt
Arthur Paulino (May 19 2025 at 20:56):
I was able to optimize insert by avoiding inserting data to the HashMap when the index is already present in indices:
def insert : IndexMap α β := by
refine
match h : m.indices[a]? with
| none => ⟨m.pairs.push (a, b), m.indices.insert a m.pairs.size, ?_⟩
| some idx => ⟨m.pairs.set idx (a, b) (m.validIndices a h), m.indices, ?_⟩
all_goals
intro i a' ha'
simp [Std.HashMap.getElem?_insert] at *
· split at ha'
simp_all
exact Nat.lt_succ_of_lt (m.validIndices a' ha')
· exact m.validIndices a' ha'
Last updated: Dec 20 2025 at 21:32 UTC