Zulip Chat Archive

Stream: lean4

Topic: Slowness in converting an array of pairs to a HashMap


Anand Rao Tadipatri (Oct 12 2024 at 09:43):

I have some code for converting an array of pairs into a HashMap that seems unreasonably slow to me. Here is an #mwe:

import Lean

def generateCache {α β} [BEq α] [Hashable α] (results : Array (α × β)) (capacity := 8) : Std.HashMap α (Array β) :=
  results.foldl (init := Std.HashMap.empty (capacity := capacity)) fun map fst, snd 
    match map[fst]? with
    | some arr =>
      let map := map.erase fst
      map.insert fst (arr.push snd)
    | none => map.insert fst #[snd]

def dummyData (n : Nat) : Array (String × String) := Id.run do
  let mut arr := #[]
  for i in [0:n] do
    for j in [0:n] do
      arr := arr.push (s!"Key{i}", s!"Value{j}")
  return arr

-- slower than expected
/-
122511589|Starting ...
122511589|Generated data ...
122516170|Generated cache of size 750 ...
-/
#eval show IO _ from do
  IO.println s!"{← IO.monoMsNow}|Starting ..."
  let data := dummyData 750
  IO.println s!"{← IO.monoMsNow}|Generated data ..."
  let cache := generateCache (capacity := 2048) data
  IO.println s!"{← IO.monoMsNow}|Generated cache of size {cache.size} ..."

Is this kind of slowdown to be expected while converting between the datastructures? I'd appreciate any feedback in optimizing this code to reduce its runtime.

Henrik Böving (Oct 12 2024 at 10:01):

There's no need to do map.erase and then map.insert for the same value, only do map.insert

Henrik Böving (Oct 12 2024 at 10:04):

But the operation that you actually want to use here is modify which doesn't exist yet unfortunately

Henrik Böving (Oct 12 2024 at 10:04):

(which is my fault)

Anand Rao Tadipatri (Oct 12 2024 at 11:00):

Okay, thanks for the clarification.


Last updated: May 02 2025 at 03:31 UTC