Zulip Chat Archive

Stream: general

Topic: typeclasses for data structures


Scott Morrison (Sep 14 2019 at 10:24):

Do we have examples in mathlib of typeclasses specifying the behaviour of a data structure?

While working on a complicated tactic, I found myself wanting a priority queue, and ended up just using a list with ordered_insert (from data/list/sort.lean), because we don't have a good priority queue available.

It occurred to me that I really should be writing against an API for a priority queue, so that if one was ever implemented I could drop that in easily, and I suppose that the right way to do this is with a typeclass describing how to think of some type Q as a priority queue. However I'm a bit disconcerted by not knowing examples of us doing this in mathlib at present. (Obviously the question of what to do here generalises to any other interesting data structure.)

Does anyone have advice on best practice here?

Scott Morrison (Sep 14 2019 at 10:25):

My API might look something like:

class priority_queue_ops (Q : Π (α : Type u) [decidable_linear_order α], Type u) :=
(size   : Π {α : Type u} [decidable_linear_order α], Q α → ℕ)
(insert : Π {α : Type u} [decidable_linear_order α], α → Q α → Q α)
(peek   : Π {α : Type u} [decidable_linear_order α], Π (q : Q α), 0 < size q → α)
(pull   : Π {α : Type u} [decidable_linear_order α], Π (q : Q α), 0 < size q → (α × Q α))

class priority_queue (Q : Π (α : Type u) [decidable_linear_order α], Type u) extends priority_queue_ops Q :=
(peek_eq_pull : ∀ {α : Type u} [decidable_linear_order α] (q : Q α) (h : 0 < size q), peek q h = (pull q h).1)
(insert_size  : ∀ {α : Type u} [decidable_linear_order α] (q : Q α) (a : α), size (insert a q) = size q + 1)
(pull_size    : ∀ {α : Type u} [decidable_linear_order α] (q : Q α) (h : 0 < size q), size (pull q h).2 = size q - 1)
(count        : Π {α : Type u} [decidable_linear_order α], α → Q α → ℕ)
(pull_min     : ∀ {α : Type u} [decidable_linear_order α] (q : Q α) (h : 0 < size q) (a : α) (w : 0 < count a (pull q h).2), (pull q h).1 ≤ a)
(count_insert : ∀ {α : Type u} [decidable_linear_order α] (a b : α) (q : Q α), count a (insert b q) = count a q + (if a = b then 1 else 0))
(count_pull   : ∀ {α : Type u} [decidable_linear_order α] (a : α) (q : Q α) (h : 0 < size q), count a (pull q h).2 = count a q - (if a = (pull q h).1 then 1 else 0))

instance sorted_list_priority_queue : priority_queue (λ α _, by exactI (Σ' L : list α, L.sorted (≤))) := ...

Scott Morrison (Sep 14 2019 at 10:26):

(A separate more specific question is whether this is a reasonable API for a priority queue, but I really want to hear about the general idea of using typeclasses for this purpose in mathlib.)

Scott Morrison (Sep 14 2019 at 10:26):

(If people are happy with it, I have an implementation of that final instance.)

Jason Rute (Sep 14 2019 at 12:06):

One idea is to look how type classes for priority queues (and other data structures) are is implemented in other languages like Haskell.

Jason Rute (Sep 14 2019 at 12:06):

While in Lean, you can say 0 < size q explicitly, it might still be good to also have an option version of some of these functions. In other programming languages at least, it allows for easy composibility. For example it makes pull-or-else easy to implement and possible to chain a bunch of pulls together. (And since there is only one failure case with a pull or peek, there is no loss of information when it returns none.)

Jason Rute (Sep 14 2019 at 12:06):

Also for implementing this and other such data structures in an efficient way, you might look at this thesis (which I guess is also a book, neither of which I’ve read).

Jason Rute (Sep 14 2019 at 12:07):

Finally since it is Lean, you could have a separate lawful priority queue class which includes proofs that it really is a priority queue.

Jason Rute (Sep 14 2019 at 12:08):

(I don’t really program in Lean, so take my advice with a grain of salt.)

Jason Rute (Sep 14 2019 at 12:16):

Oh, sorry. You already showed it is lawful.

Jason Rute (Sep 14 2019 at 12:39):

Should you have an empty function and proofs that size and count of empty are 0?

Jason Rute (Sep 14 2019 at 13:04):

empty isn’t a good name, but I think you get the idea. I can’t see how to use the API in an implementation free way without being able to have a starting point.

Jason Rute (Sep 14 2019 at 13:10):

Also more advance ideas: Priority queues are monads (similar to the set monad). I am not sure this fact is useful in practice and it is probably not worth implementing your priority queue as a monad.

Scott Morrison (Sep 14 2019 at 13:19):

Thanks, Jason, that's all helpful.

Scott Morrison (Sep 14 2019 at 13:19):

I'd first like to get some opinions on the higher level question --- should we even have APIs for data structures in Lean, or just build particular implementations of them? It's not totally obvious to me that we need the APIs yet, given how little programming infrastructure there is generally.

Scott Morrison (Sep 14 2019 at 13:21):

On some of the other points --- I actually added empty locally in the meantime. :-) I agree that the option-valued pull and peek are essential in use, but I struggled to write the laws in terms of those, so thought I'd tack them on afterwards.

Jason Rute (Sep 14 2019 at 13:31):

As to that question, I know monad and such are implemented without the laws. Should the first class be the standard priority queue class? For tactics, verified implementations of algorithms and data structures are good, but not essential. And I could imagine it would be more inviting to first implement a fast priority queue and unit test it rather than do that and formally verify its correctness at the same time.

Jason Rute (Sep 14 2019 at 13:32):

(I know this is now beside the point, but) I think it is possible to satisfy your contract in a few paradoxical ways. First an ordered list where size is one minus the length of the list should satisfy it I think.


Last updated: Dec 20 2023 at 11:08 UTC