Zulip Chat Archive

Stream: std4

Topic: HashMap insert or modify


Patrick Massot (Dec 16 2023 at 19:03):

I was surprised to not find

def Std.HashMap.insertOrModify {α : Type*} {_ : BEq α} {_ : Hashable α} {β : Type*} (self : Std.HashMap α β)
  (a : α) (f : α  β  β) (b : β): Std.HashMap α β :=
if self.contains a then
  self.modify a f
else
  self.insert a b

Is it somewhere? Is it a bad idea?

Kyle Miller (Dec 16 2023 at 19:16):

It would be nice if there were an interface where you could avoid computing b if it's not needed.

You could implement insertOrModify with this interface:

def Std.HashMap.modify' {α : Type*} {_ : BEq α} {_ : Hashable α} {β : Type*} (self : Std.HashMap α β)
  (a : α) (f : α  Option β  β) : Std.HashMap α β

The function f would be passed the original key and, if present, the existing value.

More generally, it could be f : α → Option β → Option β so that it can decide to remove the value (and I guess when inserting, to not insert).

Kyle Miller (Dec 16 2023 at 19:18):

Implementation-wise, it would be nice if it didn't find the element twice. The modify function finds the bucket where the value is stored and modifies it directly, rather than finding the value, updating it, then re-inserting.

Mario Carneiro (Dec 16 2023 at 19:21):

this is kind of similar to the alter function which exists on RBMap

Kyle Miller (Dec 16 2023 at 19:22):

A use case for having f return an Option is that you can use it to help implement the addition operation for data structures like sparse polynomials, where you don't want to store any 0's

Joachim Breitner (Dec 16 2023 at 20:31):

I added a slight variation of this to the Trie functions once.

Lean.Data.Trie.upsert Lean.Data.Trie : {α : Type}  Lean.Data.Trie α  String  (Option α  α)  Lean.Data.Trie α

It’s a bit like Kyle’s modify', but with f not taking a key (which is always a anyways, at least in this case).

Patrick Massot (Dec 16 2023 at 20:43):

Ok, I'll use my version until a nicer one gets implemented.

Scott Morrison (Dec 17 2023 at 00:23):

I like the f : α → Option β → Option β signature. Can we bikeshed a name for this? I will likely want to implement some of these soon.

Jannis Limperg (Dec 18 2023 at 09:23):

Haskell precedent is also alter. While you're at it, you could also have the function look up the element that was previously associated with the key, like Haskell's updateLookupWithKey. I think that would be the most general function operating on a single entry.

Kyle Miller (Dec 18 2023 at 15:36):

A downside though is that if you return the original value you can lose linearity. Imagine a HashMap of Arrays and you want to push an element to the end of one of the Arrays. If you return the original value, then the Array will necessarily be copied.

Jannis Limperg (Dec 18 2023 at 19:29):

Sure. So I guess the alterAndLookup function should exist (it's not a footgun since the loss of linearity is inevitable), but similar functions should not be implemented in terms of it.

Jireh Loreaux (Dec 18 2023 at 19:39):

Of course that's something that should be mentioned in the docstring. :wink:

Joe Hendrix (Dec 18 2023 at 19:48):

You don't necessarily lose linearity if you replace the value before returning it. It's worth taking at how Array.modifyM achieves this.

Mario Carneiro (Dec 19 2023 at 02:20):

I think in the version Kyle describes you necessarily have a copy because the extended array is in the hash map and the unextended array is in the return value and they are being returned as a pair

Mario Carneiro (Dec 19 2023 at 02:22):

If you returned the new value just added to the hashmap, then there would not necessarily need to be a copy although you would be passing back a value which is shared and should probably be discarded if you want to keep using the value linearly

Mario Carneiro (Dec 19 2023 at 02:23):

(FYI: that the haskell version is also called alter is not a coincidence, I borrowed the name and API from haskell maps)


Last updated: Dec 20 2023 at 11:08 UTC