Zulip Chat Archive
Stream: lean4
Topic: HashMap moved to std
Jannis Limperg (Oct 05 2022 at 17:11):
I see std
now has its own implementation of HashMap
, Std.HashMap
, distinct from Lean.HashMap
. What's the advantage of this, compared to changing the implementation in core (and adding functions/lemmas in std
as necessary)? I feel like having two types which fulfil the same purposes and have almost the same implementation is not ideal.
James Gallicchio (Oct 05 2022 at 18:10):
I think the reasoning is that Std.HashMap is intended to be more stable than Lean.HashMap. Having people rely on the Lean one makes it more difficult if the compiler devs need to make changes. Also there's some issues in the bootstrap that affect its API, which Std.HashMap can avoid since it's not in the bootstrap
Jannis Limperg (Oct 06 2022 at 09:45):
I'm sure you all have given this issue more consideration than I have, but I already see myself wanting to use some library (which uses Std.HashMap
) for metaprogramming (which uses Lean.HashMap
). At that point, my options are
- conversions back and forth, probably linear-time;
- really unsafe casts if the representations are sufficiently similar;
- duplicating the relevant part of the library's API for
Lean.HashMap
.
I would take occasional breakage over this.
Out of curiosity, what are the bootstrapping concerns?
James Gallicchio (Oct 06 2022 at 15:19):
I'm not super familiar with the metaprogramming API, does it explicitly expose hashmaps somewhere?
Jannis Limperg (Oct 06 2022 at 16:11):
Yes. For example, simp
now collects used simp lemmas and these are stored in a HashMap
. More importantly, PersistentHashMap
s, which I guess should also move to std
by the same reasoning, are used to implement all the core data structures.
In general, the metaprogramming API is now the entire non-private
compiler API. So whatever appears in a type signature there is exposed. (Technically, defeq also makes all implementations part of the API, but let's not go there.)
Mario Carneiro (Oct 06 2022 at 17:19):
I was planning to add a conversion between the two types to Std. I think it has to be the linear time conversion because the invariants aren't the same. I think it is an acceptable cost for conversion (the zero-cost conversion is a little sketchy and will again tie the representations together, so I don't want to promise its continued availability), but in applications I would generally want to #xy why you actually need the conversion: it's rare to be literally giving a whole hashmap from one module to another. In most cases you should just use the data structure required by the API you are interacting with, or default to Std.HashMap if you have a free choice.
Mario Carneiro (Oct 06 2022 at 17:23):
Out of curiosity, what are the bootstrapping concerns?
Std wants to put invariants on all the data structures such that they can be verified to have the correct behavior under reasonable assumptions, and this sometimes front-loads the proving effort. Lean core is not currently in a position to be able to do those proofs however unless we move a decent chunk of Std to core.
Jannis Limperg (Oct 07 2022 at 09:52):
The potential issue I'm describing hasn't occurred yet, so I can't xy. I agree that the issue won't be terribly common in practice, but at the same time, passing around a map doesn't sound like an outlandish thing either.
One concrete issue I already have is that I have some utility methods for Lean.HashMap
, Lean.PersistentHashMap
, etc. lying around. I was planning to move them to std, but will now have to duplicate them for the Std
versions as well. These are things like insertWith
, update
, merge
, ForIn
, BEq
. In the long run, I think we will want much larger APIs for these data structures along the lines of Haskell's Data.Map
. Duplicating these APIs will be annoying.
Could the bootstrapping issue be addressed by having the unverified data structure in core and verifying it post-hoc (and perhaps packaging up the verified version as a subtype, as usual)?
Mario Carneiro (Oct 07 2022 at 09:57):
I would eagerly add those APIs to std, and backport them to lean core as they become necessary
Mario Carneiro (Oct 07 2022 at 09:58):
I think it is feasible to have the unverified version in core, but leo doesn't want to deal with the maintenance overhead of that at the moment. I don't think this is a permanent state of affairs (but it could be a while before this is changed)
Jannis Limperg (Oct 07 2022 at 10:15):
All right, I'll deal with it for now and we'll see whether the duplication becomes an issue in practice.
Siddhartha Gadgil (Oct 07 2022 at 10:57):
Just in case it was not noticed, in the master of mathlib3port there is exactly this collision in Mathlib.Tactic.Sat.FromLRAT
:
error: stdout:
./lean_packages/mathlib/././Mathlib/Tactic/Sat/FromLRAT.lean:251:30: error: ambiguous, possible interpretations
Std.HashMap ℕ Clause : Type
Mario Carneiro (Oct 07 2022 at 11:14):
that's only a name collision, it can be fixed by using open Lean hiding HashMap
or the other way around
Mario Carneiro (Oct 07 2022 at 11:17):
you shouldn't be getting that error if the version of std4 and mathlib4 match. Note that this was fixed in the std4 bump for 10-03
Siddhartha Gadgil (Oct 07 2022 at 11:52):
Mario Carneiro said:
you shouldn't be getting that error if the version of std4 and mathlib4 match. Note that this was fixed in the std4 bump for 10-03
That commit is for mathlib4
. I am getting the error for mathlib3port
.
Mario Carneiro (Oct 07 2022 at 11:58):
you probably have the wrong lean-toolchain
Mario Carneiro (Oct 07 2022 at 11:58):
it needs to match whatever version of std4 / mathlib4 is being used
Siddhartha Gadgil (Oct 07 2022 at 14:05):
Mario Carneiro said:
you probably have the wrong
lean-toolchain
Thanks. I will try to get things in sync.
Last updated: Dec 20 2023 at 11:08 UTC