Zulip Chat Archive

Stream: lean4

Topic: PersistentHashMap vs. HashMap?


Thomas Murrills (Apr 30 2024 at 16:44):

I’m wondering what the difference between PersistentHashMap and HashMap is.

  1. What sort of “persistence” is maintained by the former and not the latter?
  2. When might I want to use one over the other?

PersistentHashMap is used in some core data structures—e.g. the local context maintains one to associate fvarids with decls—so I imagine they’re the right choice for these applications, and I’d like to understand why. :)

Jannis Limperg (Apr 30 2024 at 17:01):

If you have hash maps a and b := a.insert x y and you have live pointers to a and b at the same time, then b is a full copy of a with one additional element. So modifications of hash maps are only efficient if they are made linearly, i.e. a is not used after b has been constructed (similar to Array). By contrast, persistent hash maps support reasonably efficient non-linear modifications, but they are less efficient when used linearly.

Core data structures often use persistent hash maps because these data structures end up in long-lived snapshots used by the server, among other things.

Kyle Miller (Apr 30 2024 at 17:03):

In particular, b tries to be efficient by sharing internal structure with a when they're persistent hash maps — they're like trees. If they're plain HashMaps, insertion copies the whole backing Array (unless there's just one reference and it can be updated in-place)

Thomas Murrills (Apr 30 2024 at 17:08):

Ah, okay, knowing it’s essentially similar to the List/Array difference re: linearity makes sense, thanks! :)

Thomas Murrills (Apr 30 2024 at 17:35):

But, interestingly, I notice we have PersistentArray too. I’m guessing this is like a shareable version of Array with more efficient access-by-index than List (to match the typical Array operations)?

Henrik Böving (Apr 30 2024 at 17:38):

Yes. Its basically a tree where the tips are individual arrays so you get something that is very close to constant access time and you only have to copy individual parts of the tree when using it in a non linear fashion

Shreyas Srinivas (Apr 30 2024 at 17:46):

Persistent lazy data structures have good amortized time complexity.

Shreyas Srinivas (Apr 30 2024 at 17:50):

I don't know if lean uses lazy data structures much though

Shreyas Srinivas (Apr 30 2024 at 17:50):

Is it useful? EDIT: More specifically, beyond the theoretical trade offs between amortized and worst case complexity, what are the practical considerations?

Henrik Böving (Apr 30 2024 at 17:54):

There is a bit of lazyness in Lean's implementation here and there but it is not as much as we use persistent or linear structures by far


Last updated: May 02 2025 at 03:31 UTC