Zulip Chat Archive

Stream: general

Topic: Large array/bytearray


Carbon (Nov 06 2024 at 19:27):

I tried to replicate a python test case where one has to instantiate a large array:

# snip
h.update(b"\x00" * 1)
h.update(b"\x00" * 4294967295)
print(h.hexdigest())

The following works, but it takes a very long time to instantiate b using Lean 4.13.0. Is it normal and/or is there a better way to do it?

 snip
  let a := ByteArray.mk $ mkArray 1 0
  let b := ByteArray.mk $ mkArray 4294967295 0
  let state := SHA3_224.mk |> (SHA3_224.update ·  a) |> (SHA3_224.update · b)
  IO.println $ HexString.toHexString $ SHA3_224.final state

Niels Voss (Nov 06 2024 at 22:13):

I've not used ByteArray before, but is it possible python is compiling the bytes into a single 4GB object before timing it (possibly creating a massive .pyc file)? Also, maybe mkArray is taking up most of the time, and there might be a faster way to create it? This is just speculation though.

Carbon (Nov 07 2024 at 16:13):

Possible. I did not observe the creation of a .pyc file but python is incredibly faster for sure. Not a show-stopper for me for now, but I was wondering if there were better ways to do it.

Sebastian Ullrich (Nov 07 2024 at 16:31):

Using mkEmpty with the correct capacity + push is likely faster but really there should be a version of Array.mkArray. Could you open an issue?

Carbon (Nov 07 2024 at 17:21):

Something like the code below? Still looks very slow. Happy to open an issue! Is the request specifically to make version of Array.mkArray that is efficient for large arrays? As a side note, glancing at the process memory size, it gets much larger than the requested size in both implementations.

def fill  := Id.run do
  let mut x : Array UInt8 := Array.mkEmpty 4294967295
  for _ in [:4294967295] do
    x := x.push 0
  x

Sebastian Ullrich (Nov 08 2024 at 10:08):

I meant using ByteArray.mkEmpty. Which is still not very fast, but should be memory-efficient.


Last updated: May 02 2025 at 03:31 UTC