Zulip Chat Archive

Stream: new members

Topic: Proof that HashSets don't have any overlap


aron (Apr 09 2025 at 18:11):

I'm making a data structure to store disjoint sets of keys that point to values:

import Std.Data.HashMap
import Std.Data.HashSet
import Mathlib
open Std

variable {k v : Type*} [instBeq : BEq k] [instHash : Hashable k]

instance : Hashable (HashSet k) where
  hash := hash  HashSet.toArray


abbrev DisjointSetMap' (k : Type u) (v : Type u') [instBeq : BEq k] [instHash : Hashable k] :=
  HashMap (HashSet k) v


structure DisjointSetMap (k : Type*) [BEq k] [Hashable k] (v : Type*) where
  /-- This maps from _sets_ of `k`s to `v`s -/
  innerMap : DisjointSetMap' k v
  /-- How to merge values together when key sets are merged -/
  mergeFn : v  v  (keysNew : HashSet k)  v

However for this data structure to be correct I want to include a proof in the DisjointSetMap structure that:

  • none of the hashsets are empty
  • no two hashsets have any overlap. In other words, that for every single key there is at most one key set that contains that key

Here's the full code, including my attempt at a proof so far

but I don't think I'm going down the right path here. Is there a better and ideally simpler way to prove this property?

Matt Diamond (Apr 10 2025 at 01:40):

why are your predicates defined as inductives? why not define them like this?

def NoEmptyKeys (map : DisjointSetMap' k v) : Prop :=
 keySet  map, ¬keySet.isEmpty

def NoKeyOverlap (map : DisjointSetMap' k v) : Prop :=
 (keySet1 keySet2 : HashSet k), keySet1  map  keySet2  map  (keySet1.intersection keySet2).isEmpty

Matt Diamond (Apr 10 2025 at 01:41):

also, notice that NoKeyOverlap doesn't need to check if the sets are nonempty because NoEmptyKeys already takes care of that (assuming you include NoEmptyKeys in the DisjointSetMap structure)

aron (Apr 10 2025 at 11:54):

Matt Diamond said:

why are your predicates defined as inductives?

Good point – because I needed them to be inductives for this other data structure I made, but you're right that since this property isn't recursive it can just be a regular function

aron (Apr 10 2025 at 12:00):

Your definition of NoKeyOverlap looks good, but I probably also need a proof that keySet1 and keySet2 are not the same key set, right?

So something like:

def NoKeyOverlap (map : DisjointSetMap' k v) : Prop :=
   (keySet1 keySet2 : HashSet k),
  keySet1  map 
  keySet2  map 
  keySet1  keySet2  -- <-- distinctness proof
  (keySet1.intersection keySet2).isEmpty

Johannes Tantow (Apr 10 2025 at 12:03):

Why would you need that? If they are equal their intersection won't be empty (as long as not both are empty)

aron (Apr 10 2025 at 16:30):

@Johannes Tantow right but if you don't require keySet1 and keySet2 to be different sets then you could never construct a valid proof because you could not prove that (keySet1.intersection keySet2).isEmpty for all keySets – namely when keySet1 = keySet2

Johannes Tantow (Apr 10 2025 at 16:37):

I have missed the forall true. I thought it is just a property for two general hash sets

Johannes Tantow (Apr 10 2025 at 16:41):

In that case, you are right and can ignore everything I said

Johannes Tantow (Apr 10 2025 at 16:55):

Though perhaps map.keys.Pairwise (fun x y => (x.intersection y).isEmpty) might be more succint, but that is probably not an easier proof.


Last updated: May 02 2025 at 03:31 UTC