Zulip Chat Archive

Stream: lean4

Topic: Lean.Hashmap vs Std.HashMap


Shreyas Srinivas (Apr 21 2023 at 11:52):

I was looking for Hashmaps and found a Lean.Hashmap and Std.Hashmap that seem similar on a cursory glance. How should I choose between them?

Ruben Van de Velde (Apr 21 2023 at 12:05):

My understanding is that the lean one exists primarily for the compiler itself to use, while the std one is more general purpose (also easier to update if needed)

Mario Carneiro (Apr 21 2023 at 12:54):

Shreyas Srinivas said:

How should I choose between them?

Use the Std one, if possible. Reasons you might not use it:

  • You don't depend on the std package, possibly because you are developing a very foundational package or are working inside lean core
  • You are working directly with a lean API which uses Lean.HashMap
  • You profiled both and Lean.HashMap performs better in your application. (This is very unlikely as the implementations are almost carbon copies, except for some elided bounds checks in the Std version, but if you run into issues please open an issue about it.)

The main advantages the Std.HashMap provides are:

  • Proved correct operations (you don't have to import the file containing the proofs if you don't need them)
  • A stable(r) interface (nothing in the lean ecosystem is really stable at this point, but Std will be thinking about stability more than most packages, including Lean)
  • Good performance is a target, although right now we're just copying the Lean.HashMap implementation so as to get the same compiler optimizations. But optimization PRs are welcome.

Shreyas Srinivas (Apr 21 2023 at 13:18):

Mario Carneiro said:

Shreyas Srinivas said:

How should I choose between them?

Use the Std one, if possible. Reasons you might not use it:

  • You don't depend on the std package, possibly because you are developing a very foundational package or are working inside lean core
  • You are working directly with a lean API which uses Lean.HashMap
  • You profiled both and Lean.HashMap performs better in your application. (This is very unlikely as the implementations are almost carbon copies, except for some elided bounds checks in the Std version, but if you run into issues please open an issue about it.)

The main advantages the Std.HashMap provides are:

  • Proved correct operations (you don't have to import the file containing the proofs if you don't need them)
  • A stable(r) interface (nothing in the lean ecosystem is really stable at this point, but Std will be thinking about stability more than most packages, including Lean)
  • Good performance is a target, although right now we're just copying the Lean.HashMap implementation so as to get the same compiler optimizations. But optimization PRs are welcome.

Thanks. Right now performance is not a priority for me. I am trying to use it as a small symbol table. Having theorems about correctness might be handy though.

Mario Carneiro (Apr 21 2023 at 13:29):

I doubt they are very distinguishable for most purposes, but if you want a quick decision procedure, it is "use Std"

Shreyas Srinivas (Apr 21 2023 at 13:35):

On a slight tangent, do we have nice notation for HashMaps, like Python dictionaries?

Mario Carneiro (Apr 21 2023 at 13:57):

no

Mario Carneiro (Apr 21 2023 at 13:57):

but it's fairly easy to make one if you so desire

Mario Carneiro (Apr 21 2023 at 15:20):

import Std

def Std.HashMap.ofList' [BEq α] [Hashable α] (l : List (α × β)) : HashMap α β := .ofList l

open Lean
syntax kvPair := term ": " term
syntax "!{" (term ": " term),* "}" : term
macro_rules | `(!{$[$k : $v],*}) => `(Std.HashMap.ofList' [$[($k, $v)],*])

#eval !{1: 2, 3: 4}.contains 1

(The first line is to work around a bug in the declaration of HashMap.ofList)


Last updated: Dec 20 2023 at 11:08 UTC