Zulip Chat Archive
Stream: lean4
Topic: Slow sort then dedup
Sofia (Dec 30 2021 at 17:42):
@Sebastian Ullrich Now that I have the method to measure the slowness.... any ideas how to speed this up?
instance [Hashable a] : Hashable (Array a) := ⟨fun ar => ar.foldl (fun a b => hash (a, b)) 0⟩
-- quick and dirty deduplication from pre-sorted array. must not be empty.
def dedup [BEq a] [Inhabited a] (arr : Array a) : Array a × Array a := Id.run do
let mut res := Array.mkEmpty arr.size
let mut dups := Array.mkEmpty arr.size
let mut last := arr.get! 0
res := res.push last
for e in arr.toList.drop 1 do
if e != last then
res := res.push e
last := e
else
dups := dups.push e
(res, dups)
@[noinline] def dedup' (arr : Array (Array UInt64)) : IO (Array (Array UInt64)) := do
arr.qsort (hash . < hash .) |> dedup |>.1
Sofia (Dec 30 2021 at 17:44):
I suspect the toList and new array won't help, in the dedup part. Is hashing the array going to be slow under the usage of quicksort? I don't need any specific sort, I only need a consistent ordering for the dedup. Pointer-level sort would be interesting.
Sofia (Dec 30 2021 at 17:47):
For context. The outer array is on the order of thousands and the inner array is on the order of ~16-64.
Sofia (Dec 30 2021 at 17:50):
Eliminated the toList
, no improvement.
Sofia (Dec 30 2021 at 17:50):
def dedup [BEq a] [Inhabited a] (arr : Array a) : Array a × Array a := Id.run do
let mut res := Array.mkEmpty arr.size
let mut dups := Array.mkEmpty arr.size
let mut last := Inhabited.default
for e in arr do
if e != last then
res := res.push e
last := e
else
dups := dups.push e
(res, dups)
Sofia (Dec 30 2021 at 17:52):
Isolated out the dedup part, it isn't the problem. So the issue seems to be in quicksort and/or that hash-based sort.
Sofia (Dec 30 2021 at 17:56):
~1.2s to generate my graph. ~17.5 seconds to sort ~5000 elements !
Sebastian Ullrich (Dec 30 2021 at 17:56):
Is that compiled or interpreted?
Sofia (Dec 30 2021 at 17:57):
Compiled.
Sofia (Dec 30 2021 at 17:57):
Make that 8434, not 5000.
Sebastian Ullrich (Dec 30 2021 at 18:22):
That definitely sounds slower than expected. I don't have an immediate suggestion, but perf top
is great for getting a quick idea on where time is actually spent (on Linux).
Sofia (Dec 30 2021 at 18:28):
It is quite late.. I should call the night. I might try perf top
tomorrow. Not sure I've ever used perf
before. Known about it at least.
I have a few ideas for how to accelerate the pruning, so we'll see how that goes.
Thank you. TTYL. o/
Sebastian Ullrich (Dec 30 2021 at 18:34):
Depending on how often qsort
calls the comparison function, it might be better to compute the hashes beforehand. But it still doesn't sound like it should be that slow.
pcpthm (Dec 31 2021 at 01:57):
The current Array.qsort
has quadratic time complexity if all elements are equal because the partitioning is not doing a three-way comparison for the pivot <https://en.wikipedia.org/wiki/Quicksort#Repeated_elements>. I stumbled upon this some time ago but forgot to report it.
Sofia (Dec 31 2021 at 02:01):
@pcpthm Thanks for confirming the issue. I certainly hit this worst case with MANY duplicates. The solution (besides fixing the algorithm) is to just run it more often to keep it small.
Sofia (Dec 31 2021 at 02:03):
A better solution would be to collect sorted short-passes, then order-preserve merge.
Huỳnh Trần Khanh (Dec 31 2021 at 02:42):
ouch, pathological quicksort again :joy: why can't we use pattern defeating quicksort? it's fast and its worst case time complexity is O(nlogn) because the algorithm falls back to heapsort when the input is too pathological :joy:
Mario Carneiro (Dec 31 2021 at 02:46):
FYI there is a heapSort function in mathlib4
Last updated: Dec 20 2023 at 11:08 UTC