Zulip Chat Archive

Stream: lean4

Topic: Subtype vs Set

Henrik Böving (Mar 12 2022 at 11:14):

While writing docs4#BoundedRandom (which was originally implemented using sets that only have elements within a certain range: docs#bounded_random) I started wondering when to use a subtype vs a set. I can see that you want to use a set when e.g. defining a subgroup because you'd have to be polymorphic over whatever is the property of the subtype as well which is just unnecessary when you can use a set. But when the property is known like in this case, when and why should I use a subtype over a set or a set over a subtype?

Kevin Buzzard (Mar 12 2022 at 11:25):

One is a type, one is a term. This helps sometimes.

Kyle Miller (Mar 12 2022 at 19:35):

Ignoring the dependent types, this reduces to the design question of whether you want to pass around multiple arguments and return values or bundle them up into a structure. There's more freedom when you keep things separate, but it can be more convenient conglomerating them, if in practice the things really do tend to be passed around together. One example: do you have a function take a pair? or two values? Relatedly: do you have the theorem take a proof of an And or two proofs? (In mathlib, almost no theorems take an And, but many do take a custom And via a Prop-valued structure.)

For sets vs subtypes, it's a question of whether you pass around two things or one thing. One way to think about it is that there are a bunch of functions that only take one value, and by using Subtype.mk you can effectively make them take two (the term and a proof that a predicate holds for it).

Maybe an example is the theorems about nonzero natural numbers. The theorems are generally stated using n : Nat and h : n ≠ 0, but they could also be stated using n' : {n : Nat // n ≠ 0}. However, these would be less applicable because you can't rw with them if the natural numbers aren't literally bundled with their ≠ 0 proofs like this.

Kyle Miller (Mar 12 2022 at 19:35):

Lean has a handful of built-in (non)-dependent pairs:

And for Prop/Prop (non-dependent, Prop-valued)
Prod for Type/Type (non-dependent)
Exists for Sort/Prop (dependent, Prop-valued)
Sigma for Type/Type (dependent)
PSigma for Sort/Sort (dependent)
I tend to think of them as being for when you sort of would want a custom two-field structure, but for whatever reason it's not worth all the effort of doing so.

Last updated: May 02 2025 at 03:31 UTC