Zulip Chat Archive

Stream: lean4

Topic: Slow sort then dedup


Sofia (Dec 30 2021 at 17:42):

@Sebastian Ullrich Now that I have the method to measure the slowness.... any ideas how to speed this up?

instance [Hashable a] : Hashable (Array a) := fun ar => ar.foldl (fun a b => hash (a, b)) 0

-- quick and dirty deduplication from pre-sorted array. must not be empty.
def dedup [BEq a] [Inhabited a] (arr : Array a) : Array a × Array a := Id.run do
    let mut res := Array.mkEmpty arr.size
    let mut dups := Array.mkEmpty arr.size
    let mut last := arr.get! 0

    res := res.push last
    for e in arr.toList.drop 1 do
        if e != last then
            res := res.push e
            last := e
        else
            dups := dups.push e

    (res, dups)

@[noinline] def dedup' (arr : Array (Array UInt64)) : IO (Array (Array UInt64)) := do
    arr.qsort (hash . < hash .) |> dedup |>.1

Sofia (Dec 30 2021 at 17:44):

I suspect the toList and new array won't help, in the dedup part. Is hashing the array going to be slow under the usage of quicksort? I don't need any specific sort, I only need a consistent ordering for the dedup. Pointer-level sort would be interesting.

Sofia (Dec 30 2021 at 17:47):

For context. The outer array is on the order of thousands and the inner array is on the order of ~16-64.

Sofia (Dec 30 2021 at 17:50):

Eliminated the toList, no improvement.

Sofia (Dec 30 2021 at 17:50):

def dedup [BEq a] [Inhabited a] (arr : Array a) : Array a × Array a := Id.run do
    let mut res := Array.mkEmpty arr.size
    let mut dups := Array.mkEmpty arr.size
    let mut last := Inhabited.default

    for e in arr do
        if e != last then
            res := res.push e
            last := e
        else
            dups := dups.push e

    (res, dups)

Sofia (Dec 30 2021 at 17:52):

Isolated out the dedup part, it isn't the problem. So the issue seems to be in quicksort and/or that hash-based sort.

Sofia (Dec 30 2021 at 17:56):

~1.2s to generate my graph. ~17.5 seconds to sort ~5000 elements !

Sebastian Ullrich (Dec 30 2021 at 17:56):

Is that compiled or interpreted?

Sofia (Dec 30 2021 at 17:57):

Compiled.

Sofia (Dec 30 2021 at 17:57):

Make that 8434, not 5000.

Sebastian Ullrich (Dec 30 2021 at 18:22):

That definitely sounds slower than expected. I don't have an immediate suggestion, but perf top is great for getting a quick idea on where time is actually spent (on Linux).

Sofia (Dec 30 2021 at 18:28):

It is quite late.. I should call the night. I might try perf top tomorrow. Not sure I've ever used perf before. Known about it at least.

I have a few ideas for how to accelerate the pruning, so we'll see how that goes.

Thank you. TTYL. o/

Sebastian Ullrich (Dec 30 2021 at 18:34):

Depending on how often qsort calls the comparison function, it might be better to compute the hashes beforehand. But it still doesn't sound like it should be that slow.

pcpthm (Dec 31 2021 at 01:57):

The current Array.qsort has quadratic time complexity if all elements are equal because the partitioning is not doing a three-way comparison for the pivot <https://en.wikipedia.org/wiki/Quicksort#Repeated_elements>. I stumbled upon this some time ago but forgot to report it.

Sofia (Dec 31 2021 at 02:01):

@pcpthm Thanks for confirming the issue. I certainly hit this worst case with MANY duplicates. The solution (besides fixing the algorithm) is to just run it more often to keep it small.

Sofia (Dec 31 2021 at 02:03):

A better solution would be to collect sorted short-passes, then order-preserve merge.

Huỳnh Trần Khanh (Dec 31 2021 at 02:42):

ouch, pathological quicksort again :joy: why can't we use pattern defeating quicksort? it's fast and its worst case time complexity is O(nlogn) because the algorithm falls back to heapsort when the input is too pathological :joy:

Mario Carneiro (Dec 31 2021 at 02:46):

FYI there is a heapSort function in mathlib4


Last updated: Dec 20 2023 at 11:08 UTC