Zulip Chat Archive
Stream: batteries
Topic: Unifying `Batteries.HashMap` and `Std.HashMap`.
Kim Morrison (Oct 16 2024 at 02:10):
We're now in the situation of having both Batteries.HashMap
and Std.HashMap
(we also have Lean.HashMap
, but that is already being dealt with). I'd like to consolidate these, but I need some help.
There are no longer any uses of Batteries.HashMap
in repositories that I visit regularly (outside of deprecated files). (I'm happy to hear about other current uses.)
I think the steps going forward are:
- Identify currently existing
Batteries.HashMap
functionality/theorems that could not be implemented on top ofStd.HashMap
. - Put these on @Henrik Böving's TODO list. :-)
- Verify that any remaining functionality/theorems in
Batteries.HashMap
could be implemented on top ofStd.HashMap
. - Replace
Batteries.HashMap
with a synonym forStd.HashMap
, implementing any remaining functionality/theorems on top, still in Batteries. - (Optional/later): move that remaining functionality to
Std.HashMap
- (Optional/later): deprecate
Batteries.HashMap
.
Does this seem reasonable? Pinging in particular @Mario Carneiro to confirm this matches what you had in mind. If not, I'm happy to discuss alternative plans to reduce duplication.
If this is the right approach, then the current blocker is step 1. @Mario Carneiro, is this something you would be willing to take on? I can then drive 2. and 3. and perhaps 4. when appropriate.
Kim Morrison (Nov 24 2024 at 01:48):
Kim Morrison said:
We're now in the situation of having both
Batteries.HashMap
andStd.HashMap
(we also haveLean.HashMap
, but that is already being dealt with). I'd like to consolidate these, but I need some help.There are no longer any uses of
Batteries.HashMap
in repositories that I visit regularly (outside of deprecated files). (I'm happy to hear about other current uses.)I think the steps going forward are:
- Identify currently existing
Batteries.HashMap
functionality/theorems that could not be implemented on top ofStd.HashMap
.- Put these on Henrik Böving's TODO list. :-)
- Verify that any remaining functionality/theorems in
Batteries.HashMap
could be implemented on top ofStd.HashMap
.- Replace
Batteries.HashMap
with a synonym forStd.HashMap
, implementing any remaining functionality/theorems on top, still in Batteries.- (Optional/later): move that remaining functionality to
Std.HashMap
- (Optional/later): deprecate
Batteries.HashMap
.Does this seem reasonable? Pinging in particular Mario Carneiro to confirm this matches what you had in mind. If not, I'm happy to discuss alternative plans to reduce duplication.
If this is the right approach, then the current blocker is step 1. Mario Carneiro, is this something you would be willing to take on? I can then drive 2. and 3. and perhaps 4. when appropriate.
Could I ping on this? I'm happy to do 1. as well, but would like there to be some plan in this direction.
Mario Carneiro (Nov 24 2024 at 02:02):
With my batteries maintainer hat on I don't see any reason to push this process along, but I agree those are the steps that the FRO needs to do before we can deprecate Batteries.HashMap
.
Yury G. Kudryashov (Nov 24 2024 at 02:17):
About Lean.HashMap
, could you please add deprecation tags to functions, not only types? I'm trying to port code written by someone else, and I have to search for the corresponding function every time.
Kim Morrison (Nov 24 2024 at 02:57):
Mario Carneiro said:
With my batteries maintainer hat on I don't see any reason to push this process along, but I agree those are the steps that the FRO needs to do before we can deprecate
Batteries.HashMap
.
It turns out there is nothing missing except for findEntry?
, which has a slightly complicated spec: m.findEntry? a
returns some (a', b)
, for some a' == a
already in the HashMap, but this a'
may not actually be a
if BEq
ignores some data.
If it's okay to just replace this with (a, ·) <$> self[a]?
, then otherwise we're good to go, and steps 1-4 are all finished by batteries#1063.
Kim Morrison (Nov 24 2024 at 03:08):
Yury G. Kudryashov said:
About
Lean.HashMap
, could you please add deprecation tags to functions, not only types? I'm trying to port code written by someone else, and I have to search for the corresponding function every time.
Mario Carneiro (Nov 24 2024 at 15:21):
No, findEntry?
exists exactly because you might want the element in the map and not the one you have already
Mario Carneiro (Nov 24 2024 at 15:22):
in particular, it is used as a building block in other functions on sets IIRC (at least, in the RBMap version which has a more complete API)
Kim Morrison (Nov 24 2024 at 20:35):
Can we find any usages of Batteries.HashMap.findEntry?
or Lean.HashMap.findEntry?
? In what I've seen so far, I think it was just copied over from the Lean.HashMap
API when it was forked, but isn't actually used.
Kim Morrison (Nov 24 2024 at 20:36):
(I agree that Batteries.RBMap.findEntry?
is a different story.)
Mario Carneiro (Nov 24 2024 at 20:36):
No, it's copied from the RBMap API
Mario Carneiro (Nov 24 2024 at 20:36):
the other thing I want to know is how far along the lemmas are wrt the RBMap API
Kim Morrison (Nov 24 2024 at 20:36):
You mean for Std.HashMap
compared to Batteries.RBMap
? Or Markus's new work on ordered maps compared to Batteries.RBMap
?
Mario Carneiro (Nov 24 2024 at 20:37):
Is RBMap on the chopping block too? :face_with_diagonal_mouth:
Mario Carneiro (Nov 24 2024 at 20:38):
If these are supposed to be replacements, then they should have the lemmas too. That was the whole reason we were doing this, I thought
Mario Carneiro (Nov 24 2024 at 20:38):
How far along is that work?
Mario Carneiro (Nov 24 2024 at 20:39):
I got so much flak for not having all the theorems about hashmaps when this was under discussion
Kim Morrison (Nov 24 2024 at 20:39):
I think the new implementation Markus is working on it not actually an RBMap, it is something else, but I don't recall the details right now. Certainly the intention is that it will have at least as thorough verification API.
Kim Morrison (Nov 24 2024 at 20:40):
@Markus Himmel, just pinging you on this when you're available.
Mario Carneiro (Nov 24 2024 at 20:40):
But right now I'm talking about hashmaps
Mario Carneiro (Nov 24 2024 at 20:40):
we can deal with the other stuff later
Kim Morrison (Nov 24 2024 at 20:40):
Oh, then there is quite a bit more verification than there was previously.
Mario Carneiro (Nov 24 2024 at 20:41):
I'm not comparing to what currently exists in batteries master (which is almost nothing), I'm comparing to the RBMap API and also the work that wojciech did on a PR which was never merged
Kim Morrison (Nov 24 2024 at 20:41):
(Std.HashMap vs Batteries.HashMap)
Kim Morrison (Nov 24 2024 at 20:41):
Sorry, I thought you said we were talking about HashMaps.
Mario Carneiro (Nov 24 2024 at 20:41):
I am
Mario Carneiro (Nov 24 2024 at 20:42):
there is a bunch of proof work on batteries hashmaps which was deliberately blocked because of the work on Std.HashMap
Kim Morrison (Nov 24 2024 at 20:42):
I'm happy to go make a comparison, but I think the only code that I can actually compare against is Std.HashMap vs Batteries.RBMap?
Kim Morrison (Nov 24 2024 at 20:42):
Where is this?
Mario Carneiro (Nov 24 2024 at 20:42):
Which part? The batteries RBMap API is pretty close to what I would like to see from a complete map verification API
Mario Carneiro (Nov 24 2024 at 20:43):
Plus there's a copy and paste to hashsets
Mario Carneiro (Nov 24 2024 at 20:43):
That's all to scope out what lemmas we should have, not what lemmas we do have
Kim Morrison (Nov 24 2024 at 20:43):
Are you talking about batteries#279?
Mario Carneiro (Nov 24 2024 at 20:44):
Yes, that's what we do have
Kim Morrison (Nov 24 2024 at 20:44):
Okay, but there's no point doing a lemma-by-lemma comparison there, because that work never actually reached the point of the user-facing API, right?
Mario Carneiro (Nov 24 2024 at 20:45):
HashMap has never been completed, I want to know if it's complete now
Kim Morrison (Nov 24 2024 at 20:45):
The toList
function is has no verification.
Mario Carneiro (Nov 24 2024 at 20:45):
and if not I want to put pressure until it is
Kim Morrison (Nov 24 2024 at 20:45):
Otherwise there is a fair bit.
Kim Morrison (Nov 24 2024 at 20:45):
And someone (again, this will have to wait until Markus arrives) is working on the toList
part of the story.
Mario Carneiro (Nov 24 2024 at 20:46):
it is frustrating that proof_wanted
never really worked out here, and it seems likely that we're about to lose track entirely of what we even want from this API
Kim Morrison (Nov 24 2024 at 20:46):
https://github.com/leanprover/lean4/blob/master/src/Std/Data/HashMap/Lemmas.lean
Kim Morrison (Nov 24 2024 at 20:46):
is the current state of things
Kim Morrison (Nov 24 2024 at 20:47):
(although note this file comes in four versions, for the dependent and raw versions as well)
Kim Morrison (Nov 24 2024 at 20:48):
Sometime later today I'll try to go lemma-by-lemma through Batteries.RBMap and Std.HashMap and see how they compare, and create issues if I find something missing.
Mario Carneiro (Nov 24 2024 at 20:49):
is there anything about permutations in this file?
Mario Carneiro (Nov 24 2024 at 20:49):
maybe that's what you mean by toList
having no verification
Mario Carneiro (Nov 24 2024 at 20:50):
How do the benchmarks look between the three hashmap implementations?
Kim Morrison (Nov 24 2024 at 20:55):
Again, I'll wait for Markus, but he benchmarked rather carefully, and from what I remember he was happy with the outcomes.
Kim Morrison (Nov 24 2024 at 20:56):
Okay --- https://gist.github.com/kim-em/73bc3e4516271244e1a9702a5490189c is just the theorem statements from Batteries.Data.RBMap.Lemmas
. Am I missing anything else?
Kim Morrison (Nov 24 2024 at 20:56):
Later I'll go through these and make a comparison.
Mario Carneiro (Nov 24 2024 at 20:56):
IIRC there is a lot more in the previous section?
Mario Carneiro (Nov 24 2024 at 20:57):
most of the theorems are on RBNode
Mario Carneiro (Nov 24 2024 at 20:58):
(Note the use of findEntry?
in one of those lemmas. That would be incorrect with the new findEntry?
)
Kim Morrison (Nov 24 2024 at 21:00):
Sorry, theorems about RBNode are implementation details. I don't want to compare something other than the surface API.
Mario Carneiro (Nov 24 2024 at 21:00):
There are also two notions of membership, namely "find would succeed" and "the entry is in the map". I think these are called Mem
and EMem
respectively in RBMap
Mario Carneiro (Nov 24 2024 at 21:00):
No, RBNode
is the equivalent of HashSet
Mario Carneiro (Nov 24 2024 at 21:01):
but also, the low level API is also important, especially depending on how transparent you expect the internals to be
Kim Morrison (Nov 24 2024 at 21:01):
Can you point me to specific lemmas you're interested in, that are relevant for a comparison to a hashmap?
Kim Morrison (Nov 24 2024 at 21:02):
I'm not seeing anything in the RBNode section that seems relevant to the discussion here, sorry.
Mario Carneiro (Nov 24 2024 at 21:03):
I don't see an equivalent for
theorem mem_toList_insert [@TransCmp α cmp] {t : RBSet α cmp} :
v' ∈ toList (t.insert v) ↔ (v' ∈ toList t ∧ t.find? v ≠ some v') ∨ v' = v := by
Kim Morrison (Nov 24 2024 at 21:04):
Std.HashMap
is just so far ahead of Lean.HashMap
and Batteries.HashMap
(hand-checked IR, benchmarks, simp lemmas for pairwise combinations of lemmas, thorough framework for verifying further operations, including custom tactics to reduce proof obligations to list-based models, etc., etc.). I'm sort of at a loss as to what the obstacle to switching out Batteries.HashMap is.
Kim Morrison (Nov 24 2024 at 21:04):
Yes --- the HashMap.toList
function doesn't have any verification API yet.
Mario Carneiro (Nov 24 2024 at 21:05):
This lemma isn't really about toList, a \in toList map
is how this API says "exact membership"
Mario Carneiro (Nov 24 2024 at 21:06):
I guess you could do the same for hashmap but of course that means you will need a lot of lemmas about toList
Kim Morrison (Nov 24 2024 at 21:07):
"exact membership" is an implementation detail that is not meant to be exposed. My understanding is that the verification of toList
will be under appropriate Lawful
hypotheses. (Again, I'm not involved in verifying toList
, I may be wrong here.)
Mario Carneiro (Nov 24 2024 at 21:07):
?? It's externally visible
Mario Carneiro (Nov 24 2024 at 21:07):
It's not at all an implementation detail, it's what the user put into the map
Mario Carneiro (Nov 24 2024 at 21:08):
the precise ordering of toList
is an implementation detail, but up to permutation toList
on a hashmap is an external API
Mario Carneiro (Nov 24 2024 at 21:09):
hence a \in toList map
matters for hashmaps too
Kim Morrison (Nov 24 2024 at 21:09):
Yes, and a ∈ map.toList ↔ a ∈ map
with appropriate Lawful
hypotheses.
Mario Carneiro (Nov 24 2024 at 21:09):
no, that is false
Mario Carneiro (Nov 24 2024 at 21:10):
unless by appropriate Lawful hypotheses you mean that beq is equivalent to eq
Mario Carneiro (Nov 24 2024 at 21:10):
which is too strong of an assumption
Kim Morrison (Nov 24 2024 at 21:11):
In principle there's no reason why this can't be done. My understanding was that it wasn't useful to do it, but I'm not involved in verifying toList
. :woman_shrugging:
I think we're getting a bit off track --- this is not something that Batteries.HashMap allows doing in any case.
Mario Carneiro (Nov 24 2024 at 21:11):
I'm not sure what you mean, of course you can do it
Mario Carneiro (Nov 24 2024 at 21:12):
and the lemmas Wojciech wrote let you verify that the right things have been replaced or not
Kim Morrison (Nov 24 2024 at 21:12):
But this is non-existent work, Mario.
Kim Morrison (Nov 24 2024 at 21:12):
The lemma just isn't there.
Kim Morrison (Nov 24 2024 at 21:13):
I agreee that one might one it
Kim Morrison (Nov 24 2024 at 21:13):
so if there's a use case for it, lets make it happen as part of Std.HashMap.toList verification.
Mario Carneiro (Nov 24 2024 at 21:14):
Is it possible to prove such a lemma as a user without breaking the API boundary?
Mario Carneiro (Nov 24 2024 at 21:14):
it's not great to provide a partial API and also get annoyed when people work around it
Mario Carneiro (Nov 24 2024 at 21:15):
if you say the internals are open, then it's possible for people to just do this stuff on their own dime
Mario Carneiro (Nov 24 2024 at 21:16):
but if you seal the internals then you need enough lemmas at the low level so that people won't want to break the boundary
Kim Morrison (Nov 24 2024 at 21:16):
Isn't the relevant lemma map.toList.contains a ↔ a ∈ map
anyway? I'm a bit confused why you want to mix up BEq and Eq here.
Mario Carneiro (Nov 24 2024 at 21:17):
The question is essentially "what happens to the keys (up to Eq
) when you insert/find/... in a map, where the BEq
relation is a quotient of the type?"
Mario Carneiro (Nov 24 2024 at 21:17):
Mario Carneiro said:
I don't see an equivalent for
theorem mem_toList_insert [@TransCmp α cmp] {t : RBSet α cmp} : v' ∈ toList (t.insert v) ↔ (v' ∈ toList t ∧ t.find? v ≠ some v') ∨ v' = v := by
Lemmas like this are how you answer that question
Kim Morrison (Nov 24 2024 at 21:19):
Mario Carneiro said:
Is it possible to prove such a lemma as a user without breaking the API boundary?
I'm pretty sure yes: there's nothing forbidding anyone from following the usual mechanism to provide more lemmas.
Mario Carneiro (Nov 24 2024 at 21:19):
the usual mechanism meaning what exactly?
Kim Morrison (Nov 24 2024 at 21:19):
There's a long discussion of this at https://github.com/leanprover/lean4/blob/master/src/Std/Data/DHashMap/Internal/Defs.lean
Kim Morrison (Nov 24 2024 at 21:20):
Sorry, I'd assumed you'd seen that.
Mario Carneiro (Nov 24 2024 at 21:21):
to be clear, you mean PRing to lean
Kim Morrison (Nov 24 2024 at 21:21):
I don't think so?
Kim Morrison (Nov 24 2024 at 21:21):
I mean, of course welcome to.
Mario Carneiro (Nov 24 2024 at 21:21):
I think the summary of steps to add a new operation is being written from that perspective?
Kim Morrison (Nov 24 2024 at 21:22):
Oh, in the sense that it tells you where to put content? Yes. But I don't know of anything reason it can't be done in separate files.
Kim Morrison (Nov 24 2024 at 21:22):
Lean doesn't complain. :-)
Mario Carneiro (Nov 24 2024 at 21:22):
but FRO members might
Mario Carneiro (Nov 24 2024 at 21:23):
I feel like I've been blamed for using the array API in formalizations of HashMap and UnionFind in the past
Kim Morrison (Nov 24 2024 at 21:23):
I mean, if it's good work we'd love to have it PR'd to the repository? It would be sad if good work on operations or lemmas for HashMap had to be implemented multiple times.
Mario Carneiro (Nov 24 2024 at 21:24):
I think the next stage for batteries.hashmap should be making it an overlay API for std.hashmap and encouraging lemma contributions there
Kim Morrison (Nov 24 2024 at 21:25):
Concretely what does that look like relative to batteries#1063?
Mario Carneiro (Nov 24 2024 at 21:26):
Nothing, that's the first part, and the second part is adding a bunch of proof_wanted
for all the things that have come out in this discussion
Kim Morrison (Nov 24 2024 at 21:26):
I agree that the tutorial on extending that I linked to above would be even better with a "worked example" outside of the Lean repo.
Kim Morrison (Nov 24 2024 at 21:26):
So one can see by example what exactly is needed at each layer.
Henrik Böving (Nov 24 2024 at 21:28):
Kim Morrison said:
I agree that the tutorial on extending that I linked to above would be even better with a "worked example" outside of the Lean repo.
I think linking to a PR that just implements the full procedure for one function might be fine? (Shameless plug: https://github.com/leanprover/lean4/pull/5244/ though that does it for 4 operations, a more minimal example would truly be just one)
Kim Morrison (Nov 24 2024 at 21:28):
How about we deprecate Batteries.HashMap.findEntry?
, to make sure that if there is actually a user of it, they see the warning about the changed behaviour, and then add some proof_wanteds or module docs about the desire to replace it?
Mario Carneiro (Nov 24 2024 at 21:29):
Why not just test the process you are advocating by implementing it instead?
Mario Carneiro (Nov 24 2024 at 21:29):
As I mentioned earlier, it plays a role in some theorem statements too
Mario Carneiro (Nov 24 2024 at 21:29):
the proposed implementation is just wrong
Kim Morrison (Nov 24 2024 at 21:30):
But it plays a role in theorem statements that haven't to date been written. :-)
Mario Carneiro (Nov 24 2024 at 21:31):
I have wanted from the beginning to have a robust HashMap library in lean. We're closer now, but having one person do all the work is not a scalable solution
Kim Morrison (Nov 24 2024 at 21:31):
Well, it's not one person: note that PR Henrik linked above is person 2. :-) And there's WIP about toList
, but I forget who is working on it.
Henrik Böving (Nov 24 2024 at 21:32):
People from TU Dresden.
Mario Carneiro (Nov 24 2024 at 21:32):
As soon as this stops being a FRO priority I expect the lemmas to stop there
Mario Carneiro (Nov 24 2024 at 21:33):
and I don't know that there is a good pathway for external contributors to influence that
Mario Carneiro (Nov 24 2024 at 21:33):
hence why I want Batteries.HashMap to be the receiver for that energy instead
Mario Carneiro (Nov 24 2024 at 21:34):
we actively discouraged people from getting involved, I would like that to end once we do this
Kim Morrison (Nov 24 2024 at 21:35):
Mario Carneiro said:
we actively discouraged people from getting involved, I would like that to end once we do this
Sorry, wasn't that just pointing out that the lemma existed already? It's not discouraging people to tell them that the work has already been done.
Mario Carneiro (Nov 24 2024 at 21:36):
that and the complete lack of other proof_wanted
s in this file I'm quite sure has been effective in keeping people away from working on hashmaps
Mario Carneiro (Nov 24 2024 at 21:36):
I know it has been effective on me and Wojciech at least
Kim Morrison (Nov 24 2024 at 21:37):
but, but... "the complete lack of other proof_wanted
s in this file" is on you, isn't it? Who else would have written them? I'm confused.
Mario Carneiro (Nov 24 2024 at 21:38):
The messaging at me to stop messing with hashmaps has been coming loud and clear for the past few months
Kim Morrison (Nov 24 2024 at 21:38):
re: Wojciech, I started reviewing their PR, and it then immediately went silent?
Kim Morrison (Nov 24 2024 at 21:38):
Mario Carneiro said:
The messaging at me to stop messing with hashmaps has been coming loud and clear for the past few months
Sure, of course, as soon as we planned to write a replacement we said so.
Mario Carneiro (Nov 24 2024 at 21:39):
right, so now that this work is settled down I want hashmap to be back in business for me and others
Kim Morrison (Nov 24 2024 at 21:39):
Great! Yes.
Kim Morrison (Nov 24 2024 at 21:40):
I think it will be great if there are people (including you) who would like to contribute more operations and verification lemmas for Std.HashMap
.
Kim Morrison (Nov 24 2024 at 21:40):
and speaking for myself at least, I don't see any problem with that incubating at least in Batteries.
Mario Carneiro (Nov 24 2024 at 21:40):
I'm not going to write a PR to lean though, I'm going to write PRs for batteries and you can take them to lean
Mario Carneiro (Nov 24 2024 at 21:40):
this is a much less stressful approach for everyone
Kim Morrison (Nov 24 2024 at 21:41):
and hopefully if there is good work on Std.HashMap in Batteries, then authors will be pleased when/if we ask if we can incorporate it into the Lean repo.
Kim Morrison (Nov 24 2024 at 21:41):
Great! It sounds like furious agreement all round, then. :-)
Mario Carneiro (Nov 24 2024 at 21:42):
but this does mean that batteries.hashmap isn't going to be deprecated any time soon
Mario Carneiro (Nov 24 2024 at 21:42):
and in fact it might become the default for anyone downstream of batteries
Kim Morrison (Nov 24 2024 at 21:42):
Why not change the namespace over to Std, though? Batteries (the library) can add code to Std (the namespace).
Mario Carneiro (Nov 24 2024 at 21:43):
That is a possibility as well
Mario Carneiro (Nov 24 2024 at 21:44):
am I getting your permission to not make a stink about batteries having theorems about std internals?
Kim Morrison (Nov 24 2024 at 21:56):
Let's pause until Markus is available, but in principle yes, I think the sounds great:
- We merge batteries#1063 (either as is, or probably better with
findEntry?
deprecated for now) - Preferably we subsequently switch over the namespace too.
- You or others are welcome to add operations and lemmas about Std.HashMap in Batteries
- preferably following the standard approach from https://github.com/leanprover/lean4/blob/master/src/Std/Data/DHashMap/Internal/Defs.lean (ideally even we organize files in Batteries to match that file structure)
- preferably being sure to maintain parity across the four dependent/raw variants
- Said authors understand that the optimal outcome for good/stable code added in this way to Batteries is that it is upstreamed subsequently to the Lean repo, but there's no expectation or obligation that they do it themselves.
- Given that there will also be people contributing HashMap code directly to the Lean repo, I attempt to provide communication in both directions about things I know people are working on.
Eric Wieser (Nov 24 2024 at 22:11):
I think if no implementation of findEntry?
is available in the new shim, it would be better to replace it with a panic! "not implemented"
than replace it with an implementation that does something different to what it did before
Kim Morrison (Nov 24 2024 at 22:45):
I've replaced findEntry?
with a panic in batteries#1063.
Jannis Limperg (Nov 24 2024 at 23:46):
I feel like panic!
is strictly worse than removing findEntry?
(temporarily). In both cases I need to fix my code, but if it's removed I at least get a compiler error, rather than just a warning. Having a function in the library, even a deprecated one, that just straight up panics would be extremely surprising.
Mario Carneiro (Nov 24 2024 at 23:46):
I also don't think there is any need to remove the function at all, it's not much more than a copy paste job on find?
Eric Wieser (Nov 24 2024 at 23:49):
Having a copy-paste job on one function still seems like an improvement over the status quo where the
Jannis Limperg said:
I feel like
panic!
is strictly worse than removingfindEntry?
(temporarily). In both cases I need to fix my code, but if it's removed I at least get a compiler error, rather than just a warning. Having a function in the library, even a deprecated one, that just straight up panics would be extremely surprising.
I guess ideally you want want a compiler error saying "this has been removed, see discussion" rather than "this name is missing, good luck working out whether you're missing an import or something else is going on"
Mario Carneiro (Nov 24 2024 at 23:51):
it would be nice to have another deprecation-like attribute for post-removal assistance
Kim Morrison (Nov 25 2024 at 00:07):
My preference was just to deprecate it and change the behaviour.
Mario Carneiro (Nov 25 2024 at 00:07):
but why though? The new behavior is clearly worse
Kim Morrison (Nov 25 2024 at 00:08):
If someone would like to include an new implementation of findEntry?
in batteries#1063, that would be great: please let me know! If someone would like to write a new implementation after batteries#1063, that would also be great.
Mario Carneiro (Nov 25 2024 at 00:08):
I don't see any justification for it other than it's less work given that the correct implementation doesn't exist yet
Kim Morrison (Nov 25 2024 at 00:08):
Particularly if it's just a copy paste job?
Kim Morrison (Nov 25 2024 at 00:11):
I don't want to drag this out longer for a function that is currently unused. We've got a path, as discussed above, for anyone who needs it to add it, either PRing to the Lean repo or the Batteries repo.
Kim Morrison (Nov 25 2024 at 00:27):
Jannis Limperg said:
In both cases I need to fix my code,
@Jannis Limperg, are you using findEntry?
? I thought Aesop wasn't using Batteries.HashMap at all anymore.
Jannis Limperg (Nov 25 2024 at 00:28):
No, I meant "I" as in "a hypothetical downstream user". Maybe not my clearest formulation. :sweat_smile:
Johannes Tantow (Nov 25 2024 at 06:53):
Kim Morrison schrieb:
- :-) And there's WIP about
toList
, but I forget who is working on it.
Henrik Böving schrieb:
People from TU Dresden.
Not sure if other people are working on it, but @Lukas Gerlach and I are working on ofList
not toList
and with that on insertMany
for Lists.
Markus Himmel (Nov 25 2024 at 07:20):
Regarding findEntry?
: a possible compromise might be to use
def findEntry? [BEq α] [Hashable α] (m : Std.HashMap α β) (k : α) : Option (α × β) :=
if h : k ∈ m then some (m.getKey k h, m.get k h) else none
in the shim, which isn't particularly efficient but should have the same behavior as the current implementation in batteries.
Kim Morrison (Nov 25 2024 at 10:51):
Thanks Markus, great to know that is possible. I will put that in, with a note.
Kim Morrison (Nov 25 2024 at 10:54):
Okay, I've now restored findEntry?
with the original behaviour.
Kim Morrison (Nov 25 2024 at 10:54):
I think batteries#1063 is ready for review?
Kim Morrison (Nov 25 2024 at 11:13):
(Thanks for the review comments, @Markus Himmel, I've dealt with these now.)
𝚠𝚘𝚓𝚌𝚒𝚎𝚌𝚑 𝚗𝚊𝚠𝚛𝚘𝚌𝚔𝚒 (Nov 25 2024 at 16:44):
Kim Morrison said:
re: Wojciech, I started reviewing their PR, and it then immediately went silent?
Thank you for looking at that (batteries#279), and sorry for not getting back to you. The reason I abandoned it is that at that point I was feeling burnt out on specifically the "building a hashmap API for other people to use" front as a previous PR to that end existed for about a year, code-rotted, and never got merged, and I had also heard some rumours about the FRO starting new work on hashmaps. I should have communicated this better. I am personally happy with any solution that provides the necessary API (see below).
Kim Morrison said:
Okay, but there's no point doing a lemma-by-lemma comparison there, because that work never actually reached the point of the user-facing API, right?
It has reached the user-facing API, I just never made an honest attempt to upstream it for reasons outlined above. I am using the user-facing lemmas in this verification project. So, for me the ideal new Std.HashMap
would provide replacements for everything I use there, at least in the sense that if some fact I use is not directly available in Std.HashMap
, it is still trivial to prove given what is there in Std.HashMap
. Note that that project assumes LawfulBEq
, so subtleties of Eq
vs BEq
mostly don't apply.
Johannes Tantow (Nov 25 2024 at 20:28):
I have only looked briefly in this and didn't find everything yet in Std, but this feels wrong to me: theorem ext_findEntry? [LawfulBEq α] (m₁ m₂ : HashMap α β) : (∀ a, m₁.findEntry? a = m₂.findEntry? a) → m₁ = m₂
. The order of elements in the bucket shouldn't matter for findEntry?
, but in order for both maps to be equal the order in every buckets must be the same.
Mario Carneiro (Nov 25 2024 at 20:29):
yeah I think it should say m₁ ~ m₂
Mario Carneiro (Nov 25 2024 at 20:30):
this being an equivalence relation exposed by the API with lemmas about how all the operations respect it
Kim Morrison (Nov 25 2024 at 22:18):
(Just to be clear for context, this problem with ext_findEntry?
is about the project Wojciech linked just above, not about the PR discussed here batteries#1063.)
𝚠𝚘𝚓𝚌𝚒𝚎𝚌𝚑 𝚗𝚊𝚠𝚛𝚘𝚌𝚔𝚒 (Nov 25 2024 at 22:50):
Oops, that's what I get for leaving a sorry
. Thanks for spotting that! It is used to prove insert_comm
which is wrong for the same reason.
𝚠𝚘𝚓𝚌𝚒𝚎𝚌𝚑 𝚗𝚊𝚠𝚛𝚘𝚌𝚔𝚒 (Nov 26 2024 at 02:59):
Okay fixed it here. As far as I can see, a big thing that I need that is not there for Std.HashMap
atm are lemmas about fold
.
Kim Morrison (Nov 26 2024 at 03:41):
@Wojciech Nawrocki, just double checking --- lemmas about fold, which are indeed not written yet for Std.HashMap
(they will be presumably just say that fold
agrees with toList.fold
; i.e. they are part of the the verification lemmas for toList
), also don't exist anywhere for Batteries.HashMap
, right?
Mario Carneiro (Nov 26 2024 at 10:47):
they exist in Wojciech's project
Kim Morrison (Nov 29 2024 at 00:29):
Are we ready to merge batteries#1063?
Kim Morrison (Mar 20 2025 at 04:39):
Pinging on this again, @𝚠𝚘𝚓𝚌𝚒𝚎𝚌𝚑 𝚗𝚊𝚠𝚛𝚘𝚌𝚔𝚒. Std.HashMap
now has fairly complete lemmas about the toList
function and fold
.
Kim Morrison (Mar 27 2025 at 01:30):
Okay, I'm going to interpret that emoji response as "yes, seen, and acknowledged", and I'll merge tomorrow. Just let me know if there's further reason to wait!
𝚠𝚘𝚓𝚌𝚒𝚎𝚌𝚑 𝚗𝚊𝚠𝚛𝚘𝚌𝚔𝚒 (Mar 27 2025 at 02:38):
Sorry, I just haven't had the chance to see if the lemmas that are now in Std suffice to reproduce my proofs. At any rate I am not opposed to the unification.
Last updated: May 02 2025 at 03:31 UTC