Documentation

Std.Data.DHashMap.Internal.Defs

This is an internal implementation file of the hash map. Users of the hash map should not rely on the contents of this file.

File contents: Definition of all operations on Raw₀, definition of WFImp.

Hash map implementation notes #

This is a simple separate-chaining hash table. The data of the hash map (DHashMap.Raw) consists of a cached size and an array of buckets, where each bucket is an AssocList α β (which is the same as a List ((a : α) × β a) but with one less level of indirection). The number of buckets is always a power of two. The hash map doubles its size upon inserting an element such that the number of elements is more than 75% of the number of buckets.

Because we need DHashMap.Raw to be nested-inductive-safe, we cannot bundle the fact that there is at least one bucket. We there for define a type Raw₀ which is just a Raw together with the fact that the size is positive. Almost all internal work on the hash map happens on Raw₀ so that we do not have to perform size check all of the time. The operations defined on Raw perform this size check once to transform the Raw into a Raw₀ and then operate on that (therefore, each operation on Raw will only do a single size check). The operations defined on DHashMap conclude that the size is positive from the well-formedness predicate, use that to build a Raw₀ and then operate on that. So the operations on DHashMap are exactly the same as the operations on Raw₀ and the operations on Raw are the same as the operations on Raw₀ as long as we can prove that the size check will succeed.

The operations on Raw₀ are defined in this file. They are built for performance. The IR is hand-checked to ensure that there are no unnecessary allocations, inlining problems or linearity bugs. Note that for many operations the IR only becomes meaningful once it has been specialized with concrete Hashable and BEq instances as is the case in actual applications.

Of the internal files, only Internal.Index, Internal.Defs and Internal.AssocList.Basic contain any executable code. The rest of the files set up the verification framework which is described in the remainder of this text.

The basic idea is to translate statements about hash maps into statements about lists using the function toListModel defined in this file. The function toListModel simply concatenates all buckets into a List ((a : α) × β a). The file Internal.List.Associative then contains a complete verification of associative lists. The theorems relating the operations on Raw₀ to the operations on List ((a : α) × β a) are located in Internal.WF and have such names as contains_eq_containsKey or toListModel_insert. In the file Internal.RawLemmas we then state all of the lemmas for Raw₀ and use a tactic to apply the results from Internal.WF to reduce to the results from Internal.List.Associative. From there we can state the actual lemmas for DHashMap.Raw, DHashMap, HashMap.Raw, HashMap, HashSet.Raw and HashSet in the non-internal *.Lemmas files and immediately reduce them to the results about Raw₀ in Internal.RawLemmas.

There are some additional indirections to this high-level strategy. First, we have an additional layer of so-called "model functions" on Raw₀, defined in the file Internal.Model. These have the same signature as their counterparts defined in this file, but may have a slighly simpler implementation. For example, Raw₀.erase has a linearity optimization which is not present in the model function Raw₀.eraseₘ. We prove that the functions are equal to their model implementations in Internal.Model, then verify the model implementation. This makes the verification more robust against implemenation details, future performance improvements, etc.

Second, reducing hash maps to lists only works if the hash map is well-formed. Our internal well-formedness predicate is called Raw.WFImp (defined in this file) and states that (a) each bucket only contains items with the correct hash, (b) the cached size is equal to the actual number of elements in the buckets, and (c) after concatenating the buckets the keys in the resulting list are pairwise not equal. The third condition is a priori stronger than the slightly more natural condition that the keys in each bucket are pairwise not equal, but they are equivalent in well-behaved cases and our condition works better. The user-visible well-formedness predicate Raw.WF is equivalent to WFImp, as is shown in Internal.WF. The user-visible version exists to postpone the proofs that all operations preserve well-formedness to a later file so that it is possible to import DHashMap.Basic without pulling in all of the proof machinery.

The framework works very hard to make adding and verifying new operations very easy and maintainable. To this end, we provide theorems apply_bucket, apply_bucket_with_proof, toListModel_updateBucket and toListModel_updateAllBuckets, which do all of the heavy lifting in a general way. The verification for each actual operation in Internal.WF is then extremely straightward, requiring only to plug in some results about lists. See for example the functions containsₘ_eq_containsKey and the section on eraseₘ for prototypical examples of this technique.

Here is a summary of the steps required to add and verify a new operation:

  1. Write the executable implementation
  1. Write the list implementation
  1. Write the model implementation
  1. Verify the model implementation
  1. Connect Raw and Raw₀
  1. Prove Raw₀ versions of user-facing lemmas
  1. State and prove the user-facing lemmas

This sounds like a lot of work (and it is if you have to add a lot of user-facing lemmas), but the framework is set up in such a way that each step is really easy and the proofs are all really short and clean.

def Std.DHashMap.Internal.toListModel {α : Type u} {β : αType v} (buckets : Array (Std.DHashMap.Internal.AssocList α β)) :
List ((a : α) × β a)

Internal implementation detail of the hash map

Equations
Instances For
    @[inline]
    def Std.DHashMap.Internal.computeSize {α : Type u} {β : αType v} (buckets : Array (Std.DHashMap.Internal.AssocList α β)) :

    Internal implementation detail of the hash map

    Equations
    Instances For
      @[reducible, inline]
      abbrev Std.DHashMap.Internal.Raw₀ (α : Type u) (β : αType v) :
      Type (max 0 u v)

      Internal implementation detail of the hash map

      Equations
      Instances For
        @[inline]
        def Std.DHashMap.Internal.Raw₀.empty {α : Type u} {β : αType v} (capacity : optParam Nat 8) :

        Internal implementation detail of the hash map

        Equations
        • One or more equations did not get rendered due to their size.
        Instances For
          @[inline]
          def Std.DHashMap.Internal.Raw₀.reinsertAux {α : Type u} {β : αType v} (hash : αUInt64) (data : { d : Array (Std.DHashMap.Internal.AssocList α β) // 0 < d.size }) (a : α) (b : β a) :
          { d : Array (Std.DHashMap.Internal.AssocList α β) // 0 < d.size }

          Internal implementation detail of the hash map

          Equations
          • One or more equations did not get rendered due to their size.
          Instances For
            def Std.DHashMap.Internal.Raw₀.expand {α : Type u} {β : αType v} [Hashable α] (data : { d : Array (Std.DHashMap.Internal.AssocList α β) // 0 < d.size }) :
            { d : Array (Std.DHashMap.Internal.AssocList α β) // 0 < d.size }

            Copies all the entries from buckets into a new hash map with a larger capacity.

            Equations
            Instances For
              @[irreducible]
              def Std.DHashMap.Internal.Raw₀.expand.go {α : Type u} {β : αType v} [Hashable α] (i : Nat) (source : Array (Std.DHashMap.Internal.AssocList α β)) (target : { d : Array (Std.DHashMap.Internal.AssocList α β) // 0 < d.size }) :
              { d : Array (Std.DHashMap.Internal.AssocList α β) // 0 < d.size }

              Inner loop of expand. Copies elements source[i:] into target, destroying source in the process.

              Equations
              • One or more equations did not get rendered due to their size.
              Instances For
                @[inline]

                Internal implementation detail of the hash map

                Equations
                • One or more equations did not get rendered due to their size.
                Instances For
                  @[inline]
                  def Std.DHashMap.Internal.Raw₀.insert {α : Type u} {β : αType v} [BEq α] [Hashable α] (m : Std.DHashMap.Internal.Raw₀ α β) (a : α) (b : β a) :

                  Internal implementation detail of the hash map

                  Equations
                  • One or more equations did not get rendered due to their size.
                  Instances For
                    @[inline]
                    def Std.DHashMap.Internal.Raw₀.containsThenInsert {α : Type u} {β : αType v} [BEq α] [Hashable α] (m : Std.DHashMap.Internal.Raw₀ α β) (a : α) (b : β a) :

                    Internal implementation detail of the hash map

                    Equations
                    • One or more equations did not get rendered due to their size.
                    Instances For
                      @[inline]

                      Internal implementation detail of the hash map

                      Equations
                      • One or more equations did not get rendered due to their size.
                      Instances For
                        @[inline]
                        def Std.DHashMap.Internal.Raw₀.insertIfNew {α : Type u} {β : αType v} [BEq α] [Hashable α] (m : Std.DHashMap.Internal.Raw₀ α β) (a : α) (b : β a) :

                        Internal implementation detail of the hash map

                        Equations
                        • One or more equations did not get rendered due to their size.
                        Instances For
                          @[inline]
                          def Std.DHashMap.Internal.Raw₀.getThenInsertIfNew? {α : Type u} {β : αType v} [BEq α] [Hashable α] [LawfulBEq α] (m : Std.DHashMap.Internal.Raw₀ α β) (a : α) (b : β a) :

                          Internal implementation detail of the hash map

                          Equations
                          • One or more equations did not get rendered due to their size.
                          Instances For
                            @[inline]
                            def Std.DHashMap.Internal.Raw₀.get? {α : Type u} {β : αType v} [BEq α] [LawfulBEq α] [Hashable α] (m : Std.DHashMap.Internal.Raw₀ α β) (a : α) :
                            Option (β a)

                            Internal implementation detail of the hash map

                            Equations
                            • One or more equations did not get rendered due to their size.
                            Instances For
                              @[inline]
                              def Std.DHashMap.Internal.Raw₀.contains {α : Type u} {β : αType v} [BEq α] [Hashable α] (m : Std.DHashMap.Internal.Raw₀ α β) (a : α) :

                              Internal implementation detail of the hash map

                              Equations
                              • One or more equations did not get rendered due to their size.
                              Instances For
                                @[inline]
                                def Std.DHashMap.Internal.Raw₀.get {α : Type u} {β : αType v} [BEq α] [LawfulBEq α] [Hashable α] (m : Std.DHashMap.Internal.Raw₀ α β) (a : α) (hma : m.contains a = true) :
                                β a

                                Internal implementation detail of the hash map

                                Equations
                                Instances For
                                  @[inline]
                                  def Std.DHashMap.Internal.Raw₀.getD {α : Type u} {β : αType v} [BEq α] [LawfulBEq α] [Hashable α] (m : Std.DHashMap.Internal.Raw₀ α β) (a : α) (fallback : β a) :
                                  β a

                                  Internal implementation detail of the hash map

                                  Equations
                                  • One or more equations did not get rendered due to their size.
                                  Instances For
                                    @[inline]
                                    def Std.DHashMap.Internal.Raw₀.get! {α : Type u} {β : αType v} [BEq α] [LawfulBEq α] [Hashable α] (m : Std.DHashMap.Internal.Raw₀ α β) (a : α) [Inhabited (β a)] :
                                    β a

                                    Internal implementation detail of the hash map

                                    Equations
                                    Instances For
                                      @[inline]
                                      def Std.DHashMap.Internal.Raw₀.erase {α : Type u} {β : αType v} [BEq α] [Hashable α] (m : Std.DHashMap.Internal.Raw₀ α β) (a : α) :

                                      Internal implementation detail of the hash map

                                      Equations
                                      • One or more equations did not get rendered due to their size.
                                      Instances For
                                        @[inline]
                                        def Std.DHashMap.Internal.Raw₀.filterMap {α : Type u} {β : αType v} {γ : αType w} (f : (a : α) → β aOption (γ a)) (m : Std.DHashMap.Internal.Raw₀ α β) :

                                        Internal implementation detail of the hash map

                                        Equations
                                        • One or more equations did not get rendered due to their size.
                                        Instances For
                                          @[inline]
                                          def Std.DHashMap.Internal.Raw₀.map {α : Type u} {β : αType v} {γ : αType w} (f : (a : α) → β aγ a) (m : Std.DHashMap.Internal.Raw₀ α β) :

                                          Internal implementation detail of the hash map

                                          Equations
                                          Instances For
                                            @[inline]
                                            def Std.DHashMap.Internal.Raw₀.filter {α : Type u} {β : αType v} (f : (a : α) → β aBool) (m : Std.DHashMap.Internal.Raw₀ α β) :

                                            Internal implementation detail of the hash map

                                            Equations
                                            • One or more equations did not get rendered due to their size.
                                            Instances For
                                              @[inline]
                                              def Std.DHashMap.Internal.Raw₀.insertMany {α : Type u} {β : αType v} {ρ : Type w} [ForIn Id ρ ((a : α) × β a)] [BEq α] [Hashable α] (m : Std.DHashMap.Internal.Raw₀ α β) (l : ρ) :
                                              { m' : Std.DHashMap.Internal.Raw₀ α β // ∀ (P : Std.DHashMap.Internal.Raw₀ α βProp), (∀ {m'' : Std.DHashMap.Internal.Raw₀ α β} {a : α} {b : β a}, P m''P (m''.insert a b))P mP m' }

                                              Internal implementation detail of the hash map

                                              Equations
                                              • One or more equations did not get rendered due to their size.
                                              Instances For
                                                @[inline]
                                                def Std.DHashMap.Internal.Raw₀.Const.get? {α : Type u} {β : Type v} [BEq α] [Hashable α] (m : Std.DHashMap.Internal.Raw₀ α fun (x : α) => β) (a : α) :

                                                Internal implementation detail of the hash map

                                                Equations
                                                • One or more equations did not get rendered due to their size.
                                                Instances For
                                                  @[inline]
                                                  def Std.DHashMap.Internal.Raw₀.Const.get {α : Type u} {β : Type v} [BEq α] [Hashable α] (m : Std.DHashMap.Internal.Raw₀ α fun (x : α) => β) (a : α) (hma : m.contains a = true) :
                                                  β

                                                  Internal implementation detail of the hash map

                                                  Equations
                                                  • One or more equations did not get rendered due to their size.
                                                  Instances For
                                                    @[inline]
                                                    def Std.DHashMap.Internal.Raw₀.Const.getD {α : Type u} {β : Type v} [BEq α] [Hashable α] (m : Std.DHashMap.Internal.Raw₀ α fun (x : α) => β) (a : α) (fallback : β) :
                                                    β

                                                    Internal implementation detail of the hash map

                                                    Equations
                                                    • One or more equations did not get rendered due to their size.
                                                    Instances For
                                                      @[inline]
                                                      def Std.DHashMap.Internal.Raw₀.Const.get! {α : Type u} {β : Type v} [BEq α] [Hashable α] [Inhabited β] (m : Std.DHashMap.Internal.Raw₀ α fun (x : α) => β) (a : α) :
                                                      β

                                                      Internal implementation detail of the hash map

                                                      Equations
                                                      Instances For
                                                        @[inline]
                                                        def Std.DHashMap.Internal.Raw₀.Const.getThenInsertIfNew? {α : Type u} {β : Type v} [BEq α] [Hashable α] (m : Std.DHashMap.Internal.Raw₀ α fun (x : α) => β) (a : α) (b : β) :
                                                        Option β × Std.DHashMap.Internal.Raw₀ α fun (x : α) => β

                                                        Internal implementation detail of the hash map

                                                        Equations
                                                        • One or more equations did not get rendered due to their size.
                                                        Instances For
                                                          @[inline]
                                                          def Std.DHashMap.Internal.Raw₀.Const.insertMany {α : Type u} {β : Type v} {ρ : Type w} [ForIn Id ρ (α × β)] [BEq α] [Hashable α] (m : Std.DHashMap.Internal.Raw₀ α fun (x : α) => β) (l : ρ) :
                                                          { m' : Std.DHashMap.Internal.Raw₀ α fun (x : α) => β // ∀ (P : (Std.DHashMap.Internal.Raw₀ α fun (x : α) => β)Prop), (∀ {m'' : Std.DHashMap.Internal.Raw₀ α fun (x : α) => β} {a : α} {b : β}, P m''P (m''.insert a b))P mP m' }

                                                          Internal implementation detail of the hash map

                                                          Equations
                                                          • One or more equations did not get rendered due to their size.
                                                          Instances For
                                                            @[inline]
                                                            def Std.DHashMap.Internal.Raw₀.Const.insertManyUnit {α : Type u} {ρ : Type w} [ForIn Id ρ α] [BEq α] [Hashable α] (m : Std.DHashMap.Internal.Raw₀ α fun (x : α) => Unit) (l : ρ) :
                                                            { m' : Std.DHashMap.Internal.Raw₀ α fun (x : α) => Unit // ∀ (P : (Std.DHashMap.Internal.Raw₀ α fun (x : α) => Unit)Prop), (∀ {m'' : Std.DHashMap.Internal.Raw₀ α fun (x : α) => Unit} {a : α} {b : Unit}, P m''P (m''.insert a b))P mP m' }

                                                            Internal implementation detail of the hash map

                                                            Equations
                                                            • One or more equations did not get rendered due to their size.
                                                            Instances For
                                                              structure Std.DHashMap.Internal.List.HashesTo {α : Type u} {β : αType v} [BEq α] [Hashable α] (l : List ((a : α) × β a)) (i : Nat) (size : Nat) :

                                                              Internal implementation detail of the hash map

                                                              Instances For
                                                                theorem Std.DHashMap.Internal.List.HashesTo.hash_self {α : Type u} {β : αType v} [BEq α] [Hashable α] {l : List ((a : α) × β a)} {i : Nat} {size : Nat} (self : Std.DHashMap.Internal.List.HashesTo l i size) (h : 0 < size) (p : (a : α) × β a) :
                                                                p l(Std.DHashMap.Internal.mkIdx size h (hash p.fst)).val.toNat = i

                                                                Internal implementation detail of the hash map

                                                                structure Std.DHashMap.Internal.IsHashSelf {α : Type u} {β : αType v} [BEq α] [Hashable α] (m : Array (Std.DHashMap.Internal.AssocList α β)) :

                                                                Internal implementation detail of the hash map

                                                                Instances For
                                                                  theorem Std.DHashMap.Internal.IsHashSelf.hashes_to {α : Type u} {β : αType v} [BEq α] [Hashable α] {m : Array (Std.DHashMap.Internal.AssocList α β)} (self : Std.DHashMap.Internal.IsHashSelf m) (i : Nat) (h : i < m.size) :

                                                                  Internal implementation detail of the hash map

                                                                  structure Std.DHashMap.Internal.Raw.WFImp {α : Type u} {β : αType v} [BEq α] [Hashable α] (m : Std.DHashMap.Raw α β) :

                                                                  This is the actual well-formedness predicate for hash maps. Users should never need to interact with this and should use WF instead.

                                                                  Instances For

                                                                    Internal implementation detail of the hash map

                                                                    theorem Std.DHashMap.Internal.Raw.WFImp.size_eq {α : Type u} {β : αType v} [BEq α] [Hashable α] {m : Std.DHashMap.Raw α β} (self : Std.DHashMap.Internal.Raw.WFImp m) :
                                                                    m.size = (Std.DHashMap.Internal.toListModel m.buckets).length

                                                                    Internal implementation detail of the hash map

                                                                    Internal implementation detail of the hash map