Zulip Chat Archive
Stream: general
Topic: association lists, finmap
Sean Leather (Sep 20 2018 at 06:57):
As some people know, I've been working on association lists and finite maps for mathlib (https://github.com/spl/lean-finmap). Mario has decided he wants to get it into mathlib soon, so he has started rewriting it in his own fashion (https://github.com/leanprover-community/mathlib/tree/finmap). I wanted to open up the discussion to get feedback from anyone else who might be interested.
Sean Leather (Sep 20 2018 at 07:05):
One question is whether there is the need for a structure
to wrap a list without duplicate keys. Mario thinks it might be useful:
structure alist (α : Type u) (β : α → Type v) : Type (max u v) := (val : list (sigma β)) (nd : val.knodup)
I went down this path and found that, given that (a) there are so many useful definitions and theorems regarding lists and (b) the structure
is such a thin wrapper, every definition or proof for the structure
is simple, it created a large number of really simple statements. Considering that alist
is really a list with a property and I believe it is useful to use other list definitions that don't involve this property, I think the alist
structure either (a) makes it more difficult to work with the wrapped list or (b) creates the problem of scaling the number of statements about alist
to match those already about list
.
Sean Leather (Sep 20 2018 at 07:09):
In another way of putting it, you could look at an alist
as something unique on its own or you could look at a list with a property that it has no duplicate keys. In the process of developing my library, I found the latter view much more useful.
Simon Hudon (Sep 20 2018 at 07:10):
Why is latter more useful?
Sean Leather (Sep 20 2018 at 07:12):
On a related note, that's also how I ended up with the finmap
structure
:
structure finmap (α : Type u) (β : α → Type v) : Type (max u v) := (val : multiset (sigma β)) (nodupkeys : val.nodupkeys)
You could say that this is a similar situation: a finmap
is just a multiset
with no duplicate keys. However, I found that it's more useful to look at it as a finmap
because I didn't find a lot of the other multiset
definitions to be very useful for a finmap
.
Sean Leather (Sep 20 2018 at 07:13):
Why is latter more useful?
The most important reason for me is that I care about the structure of the internal list: nil
, cons
, append
, etc.
Sean Leather (Sep 20 2018 at 07:13):
Also, regarding finmap
, I don't care about the structure of the internal multiset
.
Sean Leather (Sep 20 2018 at 07:15):
... which makes sense, since a multiset
is a quotient
.
Sean Leather (Sep 20 2018 at 07:17):
@Simon Hudon Thanks for that question, BTW. It gets to a key reason I don't think alist
is useful, a reason that hadn't yet come to mind.
Sean Leather (Sep 20 2018 at 07:20):
That is, I found that it was useful to wrap a quotient
like multiset
in a structure
but not that useful to wrap a list
in a structure
because the structure of the list is important when working with the structure
. I ended up creating definitions to wrap nil
, cons
, append
, etc., and all of that already existed for the list
, so I came to ask why it was necessary. I couldn't come up with a good answer.
Simon Hudon (Sep 20 2018 at 07:25):
You could say that this is a similar situation: a finmap is just a multiset with no duplicate keys. However, I found that it's more useful to look at it as a finmap because I didn't find a lot of the other multiset definitions to be very useful for a finmap.
I don't get this nuance
Sean Leather (Sep 20 2018 at 07:30):
I don't get this nuance
I think it's similar to how you don't see a lot of the multiset
definitions used for finset
. There's a particular subset (mostly starting with nd
) that are used in finset
, but many of them are not.
Sean Leather (Sep 20 2018 at 07:31):
multiset
seems to be useful as a substrate for finset
and finmap
, but it also has a larger API that is oriented towards counting duplicates.
Sean Leather (Sep 20 2018 at 07:32):
Also, the subset of the multiset
API used for finset
is not useful for finmap
.
Mario Carneiro (Sep 20 2018 at 13:16):
I think the comparison is apt, and applicable to alist
. A finset is just a multiset with a property, but the API is different (some basic operations are not applicable, like cons
, and others are slightly weird and become more useful on the subtype, like ndinsert
).
Mario Carneiro (Sep 20 2018 at 13:18):
In the case of alist
, there are some operations that don't work anymore, like cons
, and others are slightly weird and have been prefixed with k
for lists. cons
is just not a valid alist
operation, insert
is.
Sean Leather (Sep 20 2018 at 13:21):
I think the comparison is apt, and applicable to
alist
.
So does this mean you agree with me, since my comparison is apt, which is to say that alist
is not a useful structure?
Sean Leather (Sep 20 2018 at 13:23):
In the case of
alist
, there are some operations that don't work anymore, likecons
, and others are slightly weird and have been prefixed withk
for lists.cons
is just not a validalist
operation,insert
is.
This hints of disagreement. My point is that lists are constructed with nil
and cons
, and it is useful to have that construction, and therefore more useful to have a bare list with no duplicate keys than to have a structure hide that construction.
Mario Carneiro (Sep 20 2018 at 13:25):
In short: I think finset
is useful, and I think alist
is useful
Mario Carneiro (Sep 20 2018 at 13:26):
I don't see why alist
should provide a nil/cons construction. It's not natural. If you want to do such a construction then you should unpack the alist
first
Sean Leather (Sep 20 2018 at 13:26):
That's kind of my point. :wink:
Mario Carneiro (Sep 20 2018 at 13:26):
Instead it has empty
/insert
Mario Carneiro (Sep 20 2018 at 13:27):
My point is, why are you doing list things on alist
s?
Sean Leather (Sep 20 2018 at 13:28):
My point is that I want to construct lists with no duplicate keys as well as use any other existing definitions and theorems on lists.
Mario Carneiro (Sep 20 2018 at 13:30):
then do so, on list
. You don't need alist
for that
Sean Leather (Sep 20 2018 at 13:30):
Quite right, which is why I don't think alist
is necessary.
Mario Carneiro (Sep 20 2018 at 13:31):
alist
provides a compositional structure for safety-preserving functions on lists
Mario Carneiro (Sep 20 2018 at 13:32):
as well as use any other existing definitions and theorems on lists.
Most list operations don't preserve the invariant. What do you do with these?
Sean Leather (Sep 20 2018 at 13:33):
Well, if you want it, add it. I'm only trying to save you work.
Mario Carneiro (Sep 20 2018 at 13:33):
This is an honest question. What is missing from the API of alist
that you would want, that requires resorting to list
?
Mario Carneiro (Sep 20 2018 at 13:35):
Except for union
, there's nothing I've missed that comes to mind
Sean Leather (Sep 20 2018 at 13:36):
I'm not sure about alist
, but I defined the minimum necessary to use list
s with no duplicate keys. For example, has_mem
and append
don't need to be redefined. You only need some extra properties to use append
, for example.
Mario Carneiro (Sep 20 2018 at 13:36):
To answer your question from earlier, alist
should also be available as a tool for programmers. "I need a map, I only have equality" -> use alist
or finmap
Mario Carneiro (Sep 20 2018 at 13:37):
Note that the has_mem
on alist
is different from the one on lists
Mario Carneiro (Sep 20 2018 at 13:37):
and append
doesn't apply
Sean Leather (Sep 20 2018 at 13:37):
Which I don't think it should be.
Sean Leather (Sep 20 2018 at 13:37):
As you'll see in my library.
Mario Carneiro (Sep 20 2018 at 13:39):
I think in common programming parlance, x \in m
where m is a dictionary / map / associative array, means x
is a key contained in the map m
, not x
is a key-value pair in the map
Mario Carneiro (Sep 20 2018 at 13:40):
The idiom for the other interpretation is <x, y> \in m.val
, where val
is the operation sometimes called entries
in other libraries
Mario Carneiro (Sep 20 2018 at 13:40):
(indeed, maybe it should be renamed to that)
Mario Carneiro (Sep 20 2018 at 13:45):
Note also that if it meant membership in the underlying list, then it wouldn't be decidable unless you assume decidability of the values, which is not needed for anything else
Sean Leather (Sep 20 2018 at 13:46):
Fair enough. With a : alist
, you can use a.val
/a.entries
.
Mario Carneiro (Sep 20 2018 at 13:47):
right, I think this helps to view them as different things (a finite map vs a list of pairs)
Mario Carneiro (Sep 20 2018 at 13:48):
You pointed out that it is possible to define operations like append
if you assume side conditions (i.e. the key sets are disjoint). Does this come up often?
Sean Leather (Sep 20 2018 at 13:49):
I don't know about “often,” but it has come up.
Sean Leather (Sep 20 2018 at 13:51):
I'm now using finmap
instead of association lists directly, and you have the same thing with union there.
Sean Leather (Sep 20 2018 at 13:51):
I'm assuming you'll use has_union
for alist
instead of has_append
?
Mario Carneiro (Sep 20 2018 at 13:51):
yes
Mario Carneiro (Sep 20 2018 at 13:52):
since it's union, not append :P
Mario Carneiro (Sep 20 2018 at 13:52):
it's not even possible to give an append
instance that uses list.append
Sean Leather (Sep 20 2018 at 13:52):
Well, it's debatable that it's not a form of append.
Mario Carneiro (Sep 20 2018 at 13:52):
I prefer set terminology when duplicates are being dropped
Sean Leather (Sep 20 2018 at 13:53):
It's still appending. :slight_smile:
Sean Leather (Sep 20 2018 at 13:53):
When it comes to the finmap
, you can forget that there is any appending. But with alist
, you can't.
Mario Carneiro (Sep 20 2018 at 13:54):
Not sure what you mean. The underlying operation is not an append
Mario Carneiro (Sep 20 2018 at 13:55):
It is true that union
would be order-respecting on alist
Mario Carneiro (Sep 20 2018 at 13:55):
but then again, so is union
on list
Sean Leather (Sep 20 2018 at 13:56):
def append : list α → list α → list α | [] l := l | (h :: s) t := h :: (append s t) def kunion : list (sigma β) → list (sigma β) → list (sigma β) | [] l := l | (hd :: tl) l := hd :: kunion tl (kerase hd.1 l)
Sean Leather (Sep 20 2018 at 13:56):
Are you doing it differently?
Mario Carneiro (Sep 20 2018 at 13:57):
I think the erase
makes that a rather different kind of operation. But also I think we need two functions here - that is unionl
Mario Carneiro (Sep 20 2018 at 13:58):
unionr
would be
def kunionr : list (sigma β) → list (sigma β) → list (sigma β) | [] l := l | (hd :: tl) l := kinsert hd.1 hd.2 (kunionr tl l)
Mario Carneiro (Sep 20 2018 at 13:58):
I'm not even sure if it should get a has_union
notation
Sean Leather (Sep 20 2018 at 13:59):
I'm not sure if you need it. I don't foresee people asking for it.
Mario Carneiro (Sep 20 2018 at 14:00):
You need left and right preference. For finmap
you can ignore it since the order of parameters gives the preference, but on alist
you have order and preference separately
Sean Leather (Sep 20 2018 at 14:02):
Why do you need it? Just because it's possible doesn't make it a need.
Mario Carneiro (Sep 20 2018 at 14:03):
I'm okay with just assuming right preference (which is what list.union
does), but you want left preference?
Sean Leather (Sep 20 2018 at 14:04):
I stuck with the Haskell Data.Map
approach because it seemed to the most intuitive to me.
Mario Carneiro (Sep 20 2018 at 14:05):
Keep in mind that haskell has laziness to contend with, making your definition more natural for them
Sean Leather (Sep 20 2018 at 14:05):
Yep.
Sean Leather (Sep 20 2018 at 14:06):
∪
is infixl
, so could that also indicate left-biased is preferred?
Mario Carneiro (Sep 20 2018 at 14:06):
It seems like the difference between foldl
and foldr
Mario Carneiro (Sep 20 2018 at 14:06):
and we have both of those
Mario Carneiro (Sep 20 2018 at 14:07):
I'd rather not commit to one camp here
Mario Carneiro (Sep 20 2018 at 14:08):
particularly since there are also performance characteristics to worry about, and the lean 4 compiler may change this as well
Sean Leather (Sep 20 2018 at 14:08):
You have to commit to one for ∪
at the very least.
Sean Leather (Sep 20 2018 at 14:09):
I guess if you're going with the list.union
precedent, you would go with the right-biased option.
Mario Carneiro (Sep 20 2018 at 14:10):
I think alist
shouldn't have a notation, finmap
can use left preference by default since the API properties take precedence there
Mario Carneiro (Sep 20 2018 at 14:11):
I should look at other languages like java or python and see what preference is preferred
Sean Leather (Sep 20 2018 at 14:12):
I would consider Haskell, ML, OCaml, Coq, Agda, Idris, etc. before looking at Java or Python.
Mario Carneiro (Sep 20 2018 at 14:12):
hm, python has a mutation operation which makes the precedence obvious
Mario Carneiro (Sep 20 2018 at 14:14):
Same with java. OCaml takes a merge function that resolves differences
Mario Carneiro (Sep 20 2018 at 14:18):
Haskell has both union (left bias) and unionWith that takes a merge function
Mario Carneiro (Sep 20 2018 at 14:22):
I couldn't find anything for Coq. Idris uses merge left bias and mergeWith
Mario Carneiro (Sep 20 2018 at 14:22):
ML is roll your own dict
Sean Leather (Sep 21 2018 at 09:35):
I'm coming to realize that what would be better for me than a finmap
union that updates by insert
or erase
is something like an internal disjoint union. At least that's what I've gathered it would be called. [1, 2] For finmap
, perhaps the simplest definition would be:
def djunion (f g : finmap α β) : finmap α β := if disjoint f.keys g.keys then f ∪ g else ∅
But a more efficient version would probably use list.append
instead of list.kunion
as ∪
does.
I'm only dealing with unions of disjoint finmap
s, and it would simplify some things if I didn't have to track all of the disjoint f.keys g.keys
.
Sean Leather (Sep 21 2018 at 09:57):
Likewise, a disjoint insert would be good. Conceptually:
def djinsert (a : α) (b : β a) (f : finmap α β) : finmap α β := if a ∉ f.keys then insert a b f else ∅
But cons
would be better than kinsert
.
Sean Leather (Oct 08 2018 at 14:16):
@Mario Carneiro What is your plan with https://github.com/leanprover-community/mathlib/tree/finmap ? I'd like to see something get into mathlib. However, given the somewhat large difference between your rewrite and my library, I'm a bit reluctant at this point to invest the time to port more definitions and theorems over to that branch without knowing what's going to happen to it. Do you plan to continue working on that branch?
Mario Carneiro (Oct 08 2018 at 14:25):
would you like me to merge it?
Sean Leather (Oct 08 2018 at 14:27):
would you like me to merge it?
“It” being the leanprover-community/mathlib
branch? Sure, if you're happy with it in its current state. I would feel more comfortable extending it at that point.
Mario Carneiro (Oct 08 2018 at 14:28):
I noticed you renamed some things. Not a fan of knodup
?
Sean Leather (Oct 08 2018 at 14:29):
I noticed you renamed some things. Not a fan of
knodup
?
:slight_smile: No. I think nodupkeys
is better.
Mario Carneiro (Oct 08 2018 at 14:31):
when I look at it I think "nine letters is long for a name segment"
Mario Carneiro (Oct 08 2018 at 14:31):
You could have nodup_keys
but then it confuses with nodup
of keys
Mario Carneiro (Oct 08 2018 at 14:31):
what about nodup_fst
?
Sean Leather (Oct 08 2018 at 14:31):
You could have
nodup_keys
but then it confuses withnodup
ofkeys
Exactly.
Sean Leather (Oct 08 2018 at 14:32):
what about
nodup_fst
?
How is that better than nodupkeys
since nodup
and keys
already exist?
Sean Leather (Oct 08 2018 at 14:32):
... exist in the association list/finmap API that is.
Mario Carneiro (Oct 08 2018 at 14:33):
I mean for the list version, before the "map" interpretation
Sean Leather (Oct 08 2018 at 14:37):
I'd prefer nodupkeys
for both for consistency's sake. I think the name is clear in that it links nodup
and keys
and as short as it's needed to be.
Sean Leather (Oct 08 2018 at 14:38):
Anyway, I have to go now. If you want to discuss any other issues, I'll pick it up tomorrow.
Last updated: Dec 20 2023 at 11:08 UTC