Zulip Chat Archive

Stream: general

Topic: association lists, finmap


view this post on Zulip Sean Leather (Sep 20 2018 at 06:57):

As some people know, I've been working on association lists and finite maps for mathlib (https://github.com/spl/lean-finmap). Mario has decided he wants to get it into mathlib soon, so he has started rewriting it in his own fashion (https://github.com/leanprover-community/mathlib/tree/finmap). I wanted to open up the discussion to get feedback from anyone else who might be interested.

view this post on Zulip Sean Leather (Sep 20 2018 at 07:05):

One question is whether there is the need for a structure to wrap a list without duplicate keys. Mario thinks it might be useful:

structure alist (α : Type u) (β : α  Type v) : Type (max u v) :=
(val : list (sigma β))
(nd : val.knodup)

https://github.com/leanprover-community/mathlib/blob/e5ddd1d4dca984f6d7a77a87a1608b414296208f/data/list/alist.lean

I went down this path and found that, given that (a) there are so many useful definitions and theorems regarding lists and (b) the structure is such a thin wrapper, every definition or proof for the structure is simple, it created a large number of really simple statements. Considering that alist is really a list with a property and I believe it is useful to use other list definitions that don't involve this property, I think the alist structure either (a) makes it more difficult to work with the wrapped list or (b) creates the problem of scaling the number of statements about alist to match those already about list.

view this post on Zulip Sean Leather (Sep 20 2018 at 07:09):

In another way of putting it, you could look at an alist as something unique on its own or you could look at a list with a property that it has no duplicate keys. In the process of developing my library, I found the latter view much more useful.

view this post on Zulip Simon Hudon (Sep 20 2018 at 07:10):

Why is latter more useful?

view this post on Zulip Sean Leather (Sep 20 2018 at 07:12):

On a related note, that's also how I ended up with the finmap structure:

structure finmap (α : Type u) (β : α  Type v) : Type (max u v) :=
(val : multiset (sigma β))
(nodupkeys : val.nodupkeys)

https://github.com/spl/lean-finmap/blob/035f1faa218e47b9f411c4a243900955f4714a56/src/data/finmap.lean#L8-L10

You could say that this is a similar situation: a finmap is just a multiset with no duplicate keys. However, I found that it's more useful to look at it as a finmap because I didn't find a lot of the other multiset definitions to be very useful for a finmap.

view this post on Zulip Sean Leather (Sep 20 2018 at 07:13):

Why is latter more useful?

The most important reason for me is that I care about the structure of the internal list: nil, cons, append, etc.

view this post on Zulip Sean Leather (Sep 20 2018 at 07:13):

Also, regarding finmap, I don't care about the structure of the internal multiset.

view this post on Zulip Sean Leather (Sep 20 2018 at 07:15):

... which makes sense, since a multiset is a quotient.

view this post on Zulip Sean Leather (Sep 20 2018 at 07:17):

@Simon Hudon Thanks for that question, BTW. It gets to a key reason I don't think alist is useful, a reason that hadn't yet come to mind.

view this post on Zulip Sean Leather (Sep 20 2018 at 07:20):

That is, I found that it was useful to wrap a quotient like multiset in a structure but not that useful to wrap a list in a structure because the structure of the list is important when working with the structure. I ended up creating definitions to wrap nil, cons, append, etc., and all of that already existed for the list, so I came to ask why it was necessary. I couldn't come up with a good answer.

view this post on Zulip Simon Hudon (Sep 20 2018 at 07:25):

You could say that this is a similar situation: a finmap is just a multiset with no duplicate keys. However, I found that it's more useful to look at it as a finmap because I didn't find a lot of the other multiset definitions to be very useful for a finmap.

I don't get this nuance

view this post on Zulip Sean Leather (Sep 20 2018 at 07:30):

I don't get this nuance

I think it's similar to how you don't see a lot of the multiset definitions used for finset. There's a particular subset (mostly starting with nd) that are used in finset, but many of them are not.

view this post on Zulip Sean Leather (Sep 20 2018 at 07:31):

multiset seems to be useful as a substrate for finset and finmap, but it also has a larger API that is oriented towards counting duplicates.

view this post on Zulip Sean Leather (Sep 20 2018 at 07:32):

Also, the subset of the multiset API used for finset is not useful for finmap.

view this post on Zulip Mario Carneiro (Sep 20 2018 at 13:16):

I think the comparison is apt, and applicable to alist. A finset is just a multiset with a property, but the API is different (some basic operations are not applicable, like cons, and others are slightly weird and become more useful on the subtype, like ndinsert).

view this post on Zulip Mario Carneiro (Sep 20 2018 at 13:18):

In the case of alist, there are some operations that don't work anymore, like cons, and others are slightly weird and have been prefixed with k for lists. cons is just not a valid alist operation, insert is.

view this post on Zulip Sean Leather (Sep 20 2018 at 13:21):

I think the comparison is apt, and applicable to alist.

So does this mean you agree with me, since my comparison is apt, which is to say that alist is not a useful structure?

view this post on Zulip Sean Leather (Sep 20 2018 at 13:23):

In the case of alist, there are some operations that don't work anymore, like cons, and others are slightly weird and have been prefixed with k for lists. cons is just not a valid alist operation, insert is.

This hints of disagreement. My point is that lists are constructed with nil and cons, and it is useful to have that construction, and therefore more useful to have a bare list with no duplicate keys than to have a structure hide that construction.

view this post on Zulip Mario Carneiro (Sep 20 2018 at 13:25):

In short: I think finset is useful, and I think alist is useful

view this post on Zulip Mario Carneiro (Sep 20 2018 at 13:26):

I don't see why alist should provide a nil/cons construction. It's not natural. If you want to do such a construction then you should unpack the alist first

view this post on Zulip Sean Leather (Sep 20 2018 at 13:26):

That's kind of my point. :wink:

view this post on Zulip Mario Carneiro (Sep 20 2018 at 13:26):

Instead it has empty/insert

view this post on Zulip Mario Carneiro (Sep 20 2018 at 13:27):

My point is, why are you doing list things on alists?

view this post on Zulip Sean Leather (Sep 20 2018 at 13:28):

My point is that I want to construct lists with no duplicate keys as well as use any other existing definitions and theorems on lists.

view this post on Zulip Mario Carneiro (Sep 20 2018 at 13:30):

then do so, on list. You don't need alist for that

view this post on Zulip Sean Leather (Sep 20 2018 at 13:30):

Quite right, which is why I don't think alist is necessary.

view this post on Zulip Mario Carneiro (Sep 20 2018 at 13:31):

alist provides a compositional structure for safety-preserving functions on lists

view this post on Zulip Mario Carneiro (Sep 20 2018 at 13:32):

as well as use any other existing definitions and theorems on lists.

Most list operations don't preserve the invariant. What do you do with these?

view this post on Zulip Sean Leather (Sep 20 2018 at 13:33):

Well, if you want it, add it. I'm only trying to save you work.

view this post on Zulip Mario Carneiro (Sep 20 2018 at 13:33):

This is an honest question. What is missing from the API of alist that you would want, that requires resorting to list?

view this post on Zulip Mario Carneiro (Sep 20 2018 at 13:35):

Except for union, there's nothing I've missed that comes to mind

view this post on Zulip Sean Leather (Sep 20 2018 at 13:36):

I'm not sure about alist, but I defined the minimum necessary to use lists with no duplicate keys. For example, has_mem and append don't need to be redefined. You only need some extra properties to use append, for example.

view this post on Zulip Mario Carneiro (Sep 20 2018 at 13:36):

To answer your question from earlier, alist should also be available as a tool for programmers. "I need a map, I only have equality" -> use alist or finmap

view this post on Zulip Mario Carneiro (Sep 20 2018 at 13:37):

Note that the has_mem on alist is different from the one on lists

view this post on Zulip Mario Carneiro (Sep 20 2018 at 13:37):

and append doesn't apply

view this post on Zulip Sean Leather (Sep 20 2018 at 13:37):

Which I don't think it should be.

view this post on Zulip Sean Leather (Sep 20 2018 at 13:37):

As you'll see in my library.

view this post on Zulip Mario Carneiro (Sep 20 2018 at 13:39):

I think in common programming parlance, x \in m where m is a dictionary / map / associative array, means x is a key contained in the map m, not x is a key-value pair in the map

view this post on Zulip Mario Carneiro (Sep 20 2018 at 13:40):

The idiom for the other interpretation is <x, y> \in m.val, where val is the operation sometimes called entries in other libraries

view this post on Zulip Mario Carneiro (Sep 20 2018 at 13:40):

(indeed, maybe it should be renamed to that)

view this post on Zulip Mario Carneiro (Sep 20 2018 at 13:45):

Note also that if it meant membership in the underlying list, then it wouldn't be decidable unless you assume decidability of the values, which is not needed for anything else

view this post on Zulip Sean Leather (Sep 20 2018 at 13:46):

Fair enough. With a : alist, you can use a.val/a.entries.

view this post on Zulip Mario Carneiro (Sep 20 2018 at 13:47):

right, I think this helps to view them as different things (a finite map vs a list of pairs)

view this post on Zulip Mario Carneiro (Sep 20 2018 at 13:48):

You pointed out that it is possible to define operations like append if you assume side conditions (i.e. the key sets are disjoint). Does this come up often?

view this post on Zulip Sean Leather (Sep 20 2018 at 13:49):

I don't know about “often,” but it has come up.

view this post on Zulip Sean Leather (Sep 20 2018 at 13:51):

I'm now using finmap instead of association lists directly, and you have the same thing with union there.

view this post on Zulip Sean Leather (Sep 20 2018 at 13:51):

I'm assuming you'll use has_union for alist instead of has_append?

view this post on Zulip Mario Carneiro (Sep 20 2018 at 13:51):

yes

view this post on Zulip Mario Carneiro (Sep 20 2018 at 13:52):

since it's union, not append :P

view this post on Zulip Mario Carneiro (Sep 20 2018 at 13:52):

it's not even possible to give an append instance that uses list.append

view this post on Zulip Sean Leather (Sep 20 2018 at 13:52):

Well, it's debatable that it's not a form of append.

view this post on Zulip Mario Carneiro (Sep 20 2018 at 13:52):

I prefer set terminology when duplicates are being dropped

view this post on Zulip Sean Leather (Sep 20 2018 at 13:53):

It's still appending. :slight_smile:

view this post on Zulip Sean Leather (Sep 20 2018 at 13:53):

When it comes to the finmap, you can forget that there is any appending. But with alist, you can't.

view this post on Zulip Mario Carneiro (Sep 20 2018 at 13:54):

Not sure what you mean. The underlying operation is not an append

view this post on Zulip Mario Carneiro (Sep 20 2018 at 13:55):

It is true that union would be order-respecting on alist

view this post on Zulip Mario Carneiro (Sep 20 2018 at 13:55):

but then again, so is union on list

view this post on Zulip Sean Leather (Sep 20 2018 at 13:56):

def append : list α  list α  list α
| []       l := l
| (h :: s) t := h :: (append s t)

def kunion : list (sigma β)  list (sigma β)  list (sigma β)
| []         l := l
| (hd :: tl) l := hd :: kunion tl (kerase hd.1 l)

view this post on Zulip Sean Leather (Sep 20 2018 at 13:56):

Are you doing it differently?

view this post on Zulip Mario Carneiro (Sep 20 2018 at 13:57):

I think the erase makes that a rather different kind of operation. But also I think we need two functions here - that is unionl

view this post on Zulip Mario Carneiro (Sep 20 2018 at 13:58):

unionr would be

def kunionr : list (sigma β) → list (sigma β) → list (sigma β)
| []         l := l
| (hd :: tl) l := kinsert hd.1 hd.2 (kunionr tl l)

view this post on Zulip Mario Carneiro (Sep 20 2018 at 13:58):

I'm not even sure if it should get a has_union notation

view this post on Zulip Sean Leather (Sep 20 2018 at 13:59):

I'm not sure if you need it. I don't foresee people asking for it.

view this post on Zulip Mario Carneiro (Sep 20 2018 at 14:00):

You need left and right preference. For finmap you can ignore it since the order of parameters gives the preference, but on alist you have order and preference separately

view this post on Zulip Sean Leather (Sep 20 2018 at 14:02):

Why do you need it? Just because it's possible doesn't make it a need.

view this post on Zulip Mario Carneiro (Sep 20 2018 at 14:03):

I'm okay with just assuming right preference (which is what list.union does), but you want left preference?

view this post on Zulip Sean Leather (Sep 20 2018 at 14:04):

I stuck with the Haskell Data.Map approach because it seemed to the most intuitive to me.

view this post on Zulip Mario Carneiro (Sep 20 2018 at 14:05):

Keep in mind that haskell has laziness to contend with, making your definition more natural for them

view this post on Zulip Sean Leather (Sep 20 2018 at 14:05):

Yep.

view this post on Zulip Sean Leather (Sep 20 2018 at 14:06):

is infixl, so could that also indicate left-biased is preferred?

view this post on Zulip Mario Carneiro (Sep 20 2018 at 14:06):

It seems like the difference between foldl and foldr

view this post on Zulip Mario Carneiro (Sep 20 2018 at 14:06):

and we have both of those

view this post on Zulip Mario Carneiro (Sep 20 2018 at 14:07):

I'd rather not commit to one camp here

view this post on Zulip Mario Carneiro (Sep 20 2018 at 14:08):

particularly since there are also performance characteristics to worry about, and the lean 4 compiler may change this as well

view this post on Zulip Sean Leather (Sep 20 2018 at 14:08):

You have to commit to one for at the very least.

view this post on Zulip Sean Leather (Sep 20 2018 at 14:09):

I guess if you're going with the list.union precedent, you would go with the right-biased option.

view this post on Zulip Mario Carneiro (Sep 20 2018 at 14:10):

I think alist shouldn't have a notation, finmap can use left preference by default since the API properties take precedence there

view this post on Zulip Mario Carneiro (Sep 20 2018 at 14:11):

I should look at other languages like java or python and see what preference is preferred

view this post on Zulip Sean Leather (Sep 20 2018 at 14:12):

I would consider Haskell, ML, OCaml, Coq, Agda, Idris, etc. before looking at Java or Python.

view this post on Zulip Mario Carneiro (Sep 20 2018 at 14:12):

hm, python has a mutation operation which makes the precedence obvious

view this post on Zulip Mario Carneiro (Sep 20 2018 at 14:14):

Same with java. OCaml takes a merge function that resolves differences

view this post on Zulip Mario Carneiro (Sep 20 2018 at 14:18):

Haskell has both union (left bias) and unionWith that takes a merge function

view this post on Zulip Mario Carneiro (Sep 20 2018 at 14:22):

I couldn't find anything for Coq. Idris uses merge left bias and mergeWith

view this post on Zulip Mario Carneiro (Sep 20 2018 at 14:22):

ML is roll your own dict

view this post on Zulip Sean Leather (Sep 21 2018 at 09:35):

I'm coming to realize that what would be better for me than a finmap union that updates by insert or erase is something like an internal disjoint union. At least that's what I've gathered it would be called. [1, 2] For finmap, perhaps the simplest definition would be:

def djunion (f g : finmap α β) : finmap α β := if disjoint f.keys g.keys then f  g else 

But a more efficient version would probably use list.append instead of list.kunion as does.

I'm only dealing with unions of disjoint finmaps, and it would simplify some things if I didn't have to track all of the disjoint f.keys g.keys.

view this post on Zulip Sean Leather (Sep 21 2018 at 09:57):

Likewise, a disjoint insert would be good. Conceptually:

def djinsert (a : α) (b : β a) (f : finmap α β) : finmap α β := if a  f.keys then insert a b f else 

But cons would be better than kinsert.

view this post on Zulip Sean Leather (Oct 08 2018 at 14:16):

@Mario Carneiro What is your plan with https://github.com/leanprover-community/mathlib/tree/finmap ? I'd like to see something get into mathlib. However, given the somewhat large difference between your rewrite and my library, I'm a bit reluctant at this point to invest the time to port more definitions and theorems over to that branch without knowing what's going to happen to it. Do you plan to continue working on that branch?

view this post on Zulip Mario Carneiro (Oct 08 2018 at 14:25):

would you like me to merge it?

view this post on Zulip Sean Leather (Oct 08 2018 at 14:27):

would you like me to merge it?

“It” being the leanprover-community/mathlib branch? Sure, if you're happy with it in its current state. I would feel more comfortable extending it at that point.

view this post on Zulip Mario Carneiro (Oct 08 2018 at 14:28):

I noticed you renamed some things. Not a fan of knodup?

view this post on Zulip Sean Leather (Oct 08 2018 at 14:29):

I noticed you renamed some things. Not a fan of knodup?

:slight_smile: No. I think nodupkeys is better.

view this post on Zulip Mario Carneiro (Oct 08 2018 at 14:31):

when I look at it I think "nine letters is long for a name segment"

view this post on Zulip Mario Carneiro (Oct 08 2018 at 14:31):

You could have nodup_keys but then it confuses with nodup of keys

view this post on Zulip Mario Carneiro (Oct 08 2018 at 14:31):

what about nodup_fst?

view this post on Zulip Sean Leather (Oct 08 2018 at 14:31):

You could have nodup_keys but then it confuses with nodup of keys

Exactly.

view this post on Zulip Sean Leather (Oct 08 2018 at 14:32):

what about nodup_fst?

How is that better than nodupkeys since nodup and keys already exist?

view this post on Zulip Sean Leather (Oct 08 2018 at 14:32):

... exist in the association list/finmap API that is.

view this post on Zulip Mario Carneiro (Oct 08 2018 at 14:33):

I mean for the list version, before the "map" interpretation

view this post on Zulip Sean Leather (Oct 08 2018 at 14:37):

I'd prefer nodupkeys for both for consistency's sake. I think the name is clear in that it links nodup and keys and as short as it's needed to be.

view this post on Zulip Sean Leather (Oct 08 2018 at 14:38):

Anyway, I have to go now. If you want to discuss any other issues, I'll pick it up tomorrow.


Last updated: May 08 2021 at 19:11 UTC