Zulip Chat Archive

Stream: std4

Topic: no kernel reduction of HashMap.insert


Scott Morrison (Nov 24 2023 at 03:08):

See:

import Std.Data.HashMap.Basic

open Std

#eval (HashMap.empty : HashMap Nat Nat).isEmpty  -- true
#eval (HashMap.empty.insert 1 1).isEmpty  -- false
example : (HashMap.empty : HashMap Nat Nat).isEmpty = true := rfl -- okay
example : (HashMap.empty.insert 1 1).isEmpty = false := rfl -- type mismatch

Two questions:

  1. How to I determine where this is failing, besides jumping to definition recursively on HashMap.insert?
  2. Is there any prospect of fixing this?

Scott Morrison (Nov 24 2023 at 03:34):

(The same behaviour happens with Lean.HashMap: change open Std to open Lean above.)

Scott Morrison (Nov 24 2023 at 03:47):

Oh:

example : (7 : UInt64) % 2 = 1 := rfl
example : (7 : USize) % 2 = 1 := rfl -- type mismatch

Scott Morrison (Nov 24 2023 at 04:00):

I see, so it all comes down to:

#eval System.Platform.numBits
example : System.Platform.numBits = 64 := rfl  -- type mismatch

James Gallicchio (Nov 24 2023 at 04:09):

is the dependency on numBits from hashable or from hashmap?

Scott Morrison (Nov 24 2023 at 04:13):

@James Gallicchio, see std4#390

Scott Morrison (Nov 24 2023 at 04:13):

I've no idea if it has significant performance problems, but it makes my problem go away. :-)

James Gallicchio (Nov 24 2023 at 04:15):

aha :-) 32bit platforms were out of date a decade ago, I feel okay with not caring heavily about performance on 32bit machines!

Henrik Böving (Nov 24 2023 at 08:35):

Couldn't we make System.Platform.numBits a thing that is specially handled by the kernel? That seems like a very conservative extension to the kernel.

Henrik Böving (Nov 24 2023 at 08:36):

Like in general the underlying issue here is not that you can't compute with HashMap.insert but instead with USize in the kernel which seems like something that we might want to do in general?

Scott Morrison (Nov 24 2023 at 08:43):

Be careful not to prove false by calculating numBits on one platform, then loading that .olean on a different platform and calculating again for a different answer...

Henrik Böving (Nov 24 2023 at 08:51):

But thats a technical issue right? one could put in how many bits you expect on this platform etc. and don't load if the configuration doesn't match.

Also Lean4checker should still notice one of the two configurations as false I think?

Scott Morrison (Nov 24 2023 at 08:53):

Yes, it's solvable. I'm just speculating as to why it's as it is now.

James Gallicchio (Nov 24 2023 at 16:43):

what is the motivation for handling it in the kernel? we can design the simp normal form to push as much computation as possible out of USize if the concern is proof-stuff

Mario Carneiro (Nov 24 2023 at 17:57):

James Gallicchio said:

aha :-) 32bit platforms were out of date a decade ago, I feel okay with not caring heavily about performance on 32bit machines!

I'm more concerned with making sure that Scott's change doesn't regress performance on 64-bit machines. I see some changes from uset to set in the PR

Mario Carneiro (Nov 24 2023 at 18:04):

Given the other thread about Hashable instances also blocking hash computation, I'm inclined to just leave things as is and use a simpler model for kernel computation like association lists

Mario Carneiro (Nov 24 2023 at 18:07):

Another reason not to make the hash algorithm visible to the kernel is that it will make details of it show up in proofs, which will make them more brittle to changes to the algorithm or the internals of HashMap. I don't think we want this to be public API

Scott Morrison (Nov 24 2023 at 23:55):

I really don't like the idea that we're going to say "no fast lookup tables in verified algorithms".

The Hashable problem is easy to work around: just write your own hash function.

Mario Carneiro (Nov 25 2023 at 04:57):

I think I've already said this in another thread, but to be clear, HashMap is not a "fast lookup table in verified algorithms". It has the same performance as AssocList when run in the kernel (probably much worse since it also needs to execute a hash function)

Scott Morrison (Nov 25 2023 at 04:58):

Got it! I will try to remember to add some documentation to that effect at some point. :-)


Last updated: Dec 20 2023 at 11:08 UTC