Zulip Chat Archive

Stream: new members

Topic: Programming with sets


Mike Hamburg (Jun 24 2025 at 16:32):

Hi all, I'm new-ish to Lean, and I'm looking to use it as a general-purpose programming language where I can write some code and later prove correctness properties about it.

Mike Hamburg (Jun 24 2025 at 16:35):

One thing I've been struggling with is how to program with sets. (Edit: by this of course I mean finite sets, such as Python's frozenset or Haskell's Data.Set.) I have some inductive type T (e.g. a basic algebraic datatype), and I can derive BEq and Ord on T, and then I can make TreeSet T. But TreeSet doesn't expose some pretty basic operations such as intersection, and also doesn't have particularly many lemmas about it, especially compared to Finset T. But Finset T depends on having a LinearOrder. The automatically derived Ord instance ought to be a LinearOrder, but as far as I can tell the compiler won't derive LinearOrder for that type and it's a pain to prove for an inductive type with a few constructors.

Mike Hamburg (Jun 24 2025 at 16:36):

Also it seems to me that Finset, since it's based on unordered lists, will be quite slow. Eg it ought to take quadratic time for union and intersection, unless I'm missing something

Mike Hamburg (Jun 24 2025 at 16:38):

Is there a set type that makes this easy, where I get the usual set operations like \cap, \cup, \in etc, and can prove theorems about them, without having to write a lot of boilerplate?

Matthew Ballard (Jun 24 2025 at 16:45):

The only other "set" I can remember off the top of my head is docs#Std.HashSet

Mario Carneiro (Jun 24 2025 at 17:00):

Finset doesn't depend on linear orders, it only needs equality on the type

Mario Carneiro (Jun 24 2025 at 17:01):

If you want better performance you need either a linear order (give or take) or a hash function on the key type

Mario Carneiro (Jun 24 2025 at 17:02):

You can use RBSet as an alternative to TreeSet, it has more lemmas

Mike Hamburg (Jun 24 2025 at 17:08):

Yeah, RBTree could be another option, seems pretty similar to TreeSet. Finset doesn't need linear orders by itself, but to get the answer out as a list, without running into noncomputable, you have to sort it which requires linear orders unless I'm missing something.

Mike Hamburg (Jun 24 2025 at 17:12):

Out of curiosity, none of these "practical" sets implement Inter or have named intersection or disjointness functions, do they? Only Finset does?

Mike Hamburg (Jun 24 2025 at 17:14):

Also is there a way to derive a LinearOrder for inductive types? The derived Ord ought to induce a LinearOrder (if this is true for all of the fields, but it's true for common ones like Nat and Bool and String) and proving this should be more or less mechanical but also kind of a pain to do by hand

Kenny Lau (Jun 24 2025 at 17:16):

@Mike Hamburg I mean, you can't have everything, lol

Robin Arnez (Jun 24 2025 at 17:17):

Mike Hamburg schrieb:

derive a LinearOrder for inductive types

That would need deriving Std.TransOrd and deriving Std.LawfulEqOrd which doesn't exist yet and also it's very weird to induce a linear order on a type that is not like, linear

Robin Arnez (Jun 24 2025 at 17:18):

Intersection would be s1.filter s2.contains btw and I would use Std.ExtHashSet instead Std.TreeSet for having filter lemmas and extensionality

Robin Arnez (Jun 24 2025 at 17:21):

e.g.

import Std.Data.ExtHashSet

theorem inter_comm [BEq α] [Hashable α] [LawfulBEq α] (s₁ s₂ : Std.ExtHashSet α) :
    s₁.filter s₂.contains = s₂.filter s₁.contains := by
  ext k
  simp [and_comm]

although we don't specifically have intersection on these as a function or operator

Mike Hamburg (Jun 24 2025 at 17:30):

ah ok, that's not too bad for the inter_comm lemma.

I'm not sure what you mean about inducing a linear order on a type that's not linear?

Kenny Lau (Jun 24 2025 at 17:59):

@Mike Hamburg it might be worth asking this, what are you planning to do with sets?

Robin Arnez (Jun 24 2025 at 18:23):

I mean you usually have a linear order when "makes sense", e.g. for natural numbers, rationals, reals. Notably α × β has an which doesn't form a linear order (x.1 ≤ y.1 ∧ x.2 ≤ y.2). For the purposes described you would certainly use Ord instead which has a lot of lemmas under e.g. Std.TransCmp but no deriving handler currently (but I mean LinearOrder is not different in that regard).

Mike Hamburg (Jun 24 2025 at 20:37):

Oh hey Kenny! Long time no see, unless you're a different Kenny Lau. Basically code generation, which uses sets for various analysis purposes. The code generation doesn't have to be particularly fast, but being gratuitously quadratic would be a bummer

Mike Hamburg (Jun 24 2025 at 20:39):

@Robin Arnez I see. I was thinking of using a lexicographic order for the types in question, as Ord does. I guess that's why Lean doesn't auto-derive Ord for product types unless you use Lex... I was wondering why that would be the case

Robin Arnez (Jun 25 2025 at 10:31):

You can find it under the name lexOrd. Not sure why it's not an instance but if you need it:
attribute [instance] lexOrd

Kevin Buzzard (Jun 25 2025 at 11:48):

My guess is that it's not an instance because making it an instance would be tantamount to saying "this is the only order you are allowed on a product" which is certainly not true in general.

Aaron Liu (Jun 25 2025 at 12:02):

Products already have an order instance (the product order), it's just not a linear order for most of the time

Robin Arnez (Jun 25 2025 at 12:04):

Yeah right but there is no reason to make Ord compatible with LE. I'd say Ord is the programming order and LE is the mathematical order.

Aaron Liu (Jun 25 2025 at 12:04):

Robin Arnez said:

You can find it under the name lexOrd. Not sure why it's not an instance but if you need it:
attribute [instance] lexOrd

Since it would conflict with the usual order instance. If you want it as an instance there is docs#Prod.Lex.instOrdLexProd

Aaron Liu (Jun 25 2025 at 12:05):

Robin Arnez said:

Yeah right but there is no reason to make Ord compatible with LE. I'd say Ord is the programming order and LE is the mathematical order.

docs#LinearOrder bundles a compatible Ord instance

Robin Arnez (Jun 25 2025 at 12:05):

Yeah but the usual product order is not a linear order so there are no instance diamonds to worry about

Robin Arnez (Jun 25 2025 at 12:11):

I understand why you'd want LE and LT to be lexicograhical only on Lex but Ord?

Aaron Liu (Jun 25 2025 at 12:12):

Robin Arnez said:

Yeah but the usual product order is not a linear order so there are no instance diamonds to worry about

If either of the components is a subsingleton you get a linear order

Robin Arnez (Jun 25 2025 at 12:14):

Then it's equivalent to the lexicograhical order though..?

Robin Arnez (Jun 25 2025 at 12:15):

Also, we don't have that as an instance, do we?


Last updated: Dec 20 2025 at 21:32 UTC