Zulip Chat Archive

Stream: Is there code for X?

Topic: More API for pair sets?


Terence Tao (Jan 28 2024 at 17:49):

I'm finding that there is very little explicit API for pair sets {a,b}, but a lot of things can be done by simp plus maybe one more lemma. For instance:

import Mathlib

example {α:Type*} (x a b: α) : x  ({a,b} : Set α)  x = a  x = b := by
  simp

example {α:Type*} (a b: α) (E: Set α): {a,b}  E  a  E  b  E := by
  simp [Set.insert_subset_iff]

example {α:Type*} (a b: α): ({a,b} : Set α) = {a}  {b} := by
  simp [Set.pair_comm a b]

Would it be worth making these basic lemmas more explicit in Mathlib (similar to how there is plenty of API for singletons {a}? (I feel like the second example could also be a simp lemma.)

Yaël Dillies (Jan 28 2024 at 17:59):

The reason for this situation is that all finite explicit sets are built out of three things:

along with a class stating that insert a ∅ = {a} (docs#IsLawfulSingleton)

Yaël Dillies (Jan 28 2024 at 18:00):

Hence {a} is a primitive while {a, b} is derived as insert a {b}

Kevin Buzzard (Jan 28 2024 at 18:00):

Maybe the first two could be simp lemmas and the third one should be stated the other way around and then made a simp lemma if you think that {a,b} is a good "canonical form" for {a} union {b}.

Kevin Buzzard (Jan 28 2024 at 18:01):

At what level of reducibility is {a,b} equal to insert a {b}?

Ruben Van de Velde (Jan 28 2024 at 18:02):

Isn't it insert a insert b empty?

Yaël Dillies (Jan 28 2024 at 18:02):

Kevin Buzzard said:

At what level of reducibility is {a,b} equal to insert a {b}?

Syntactic

Yaël Dillies (Jan 28 2024 at 18:02):

Ruben Van de Velde said:

Isn't it insert a insert b empty?

Nope, that's the point of this setup.

Yaël Dillies (Jan 28 2024 at 18:03):

You want x ∈ {a, b} to be defeq to x = a ∨ x = b. If {a, b} were notation for insert a (insert b ∅) then x ∈ {a, b} would instead be defeq to x = a ∨ x = b ∨ False

Kevin Buzzard (Jan 28 2024 at 18:03):

We avoid the junk "or False" which was being appended when the naive implication of set notation was used

Kevin Buzzard (Jan 28 2024 at 18:05):

Similarly x \in {a} is presumably syntactic for x = a. For ages in lean 3 it was definitonally something else and it was really annoying :-)

Terence Tao (Jan 28 2024 at 18:05):

Kevin Buzzard said:

Maybe the first two could be simp lemmas and the third one should be stated the other way around and then made a simp lemma if you think that {a,b} is a good "canonical form" for {a} union {b}.

The first one is already a simp lemma (or more precisely the composition of two existing simp lemmas, Set.mem_insert_iff and Set.mem_singleton_iff). As for the third lemma, given the lack of API for pair sets currently, it doesn't seem like a great idea to simp into a pair set.

The main issue from my viewpoint as a user is that these obvious lemmas were not picked up by the usual searches (library search, exact?, apply?, etc.). If one isn't aware that {a,b} is syntactic sugar for insert a {b} (which, by the way, is not made very explicit in the current documentation), it actually becomes quite hard to manipulate pair sets with the current API, other than by hoping to get lucky with simp.

Mario Carneiro (Jan 28 2024 at 18:07):

Kevin Buzzard said:

Similarly x \in {a} is presumably syntactic for x = a. For ages in lean 3 it was definitonally something else and it was really annoying :-)

No, x \in {a} is syntactic for Mem x (singleton a), it is (regular) defeq to x = a though

Yaël Dillies (Jan 28 2024 at 18:07):

Then we should make the documentation clearer. This setup has been around for ages

Kevin Buzzard (Jan 28 2024 at 18:08):

In my experience what happens when you start formalising interesting stuff is that you discover missing pieces of the API for the objects you're using, and then the best favour you can do to the community is to make a short PR adding them. PRs containing no new definitions and which fill in holes in the library are very easy to review and merge, and once they're in, you don't have to maintain them yourself.

Yaël Dillies (Jan 28 2024 at 18:09):

Yes but here the situation is different, because the correct answer to "We are missing API for pairs" is "Make more API for insert then"

Kevin Buzzard (Jan 28 2024 at 18:09):

Right but this can all be discussed on GitHub

Kevin Buzzard (Jan 28 2024 at 18:10):

Yael if you take one look at it and see a better abstraction then you just make a suggestion of eg a complete rewrite

Kevin Buzzard (Jan 28 2024 at 18:10):

It's ten lines of code

Mario Carneiro (Jan 28 2024 at 18:10):

Yaël Dillies said:

Yes but here the situation is different, because the correct answer to "We are missing API for pairs" is "Make more API for insert then"

and no matter how many times we do that the problem of "we appear to be missing API for pairs" doesn't go away

Terence Tao (Jan 28 2024 at 18:11):

Yaël Dillies said:

Then we should make the documentation clearer. This setup has been around for ages

As far as I can tell the only documentation currently that states that pair sets {a,b} are defined as insert a {b} is a cryptic sentence in https://leanprover-community.github.io/mathlib4_docs/Std/Classes/SetNotation.html#%C2%ABterm{_}%C2%BB

Mario Carneiro (Jan 28 2024 at 18:11):

the docs on that should definitely be improved to state the desugaring

Mario Carneiro (Jan 28 2024 at 18:12):

does that docstring show up when you hover {1, 2}?

Kevin Buzzard (Jan 28 2024 at 18:12):

This subtlety is all due to the fact that for counting "things" (like elements of a set) it's sometimes better to start at 1

Yaël Dillies (Jan 28 2024 at 18:12):

Yeah well first example is by simp (or even rfl), second example would be by simp if Set.insert_subset_iff were a simp lemma (which I personally think is reasonable, but possibly unforeseen drawbacks), and the third one is rfl again

Kevin Buzzard (Jan 28 2024 at 18:13):

Thanks for the review!

Yaël Dillies (Jan 28 2024 at 18:14):

Mario Carneiro said:

does that docstring show up when you hover {1, 2}?

Just to expand on that, many notations currently don't display any documentation on hovers, which is a very poor state of things. #9942 is a first attempt at improving this.

Yaël Dillies (Jan 28 2024 at 18:15):

This all comes from our Lean 3 habit of trusting notation to know what docstring it should display (which is usually obvious, but not always)

Kevin Buzzard (Jan 28 2024 at 18:17):

Someone could now open a mathlib PR, tag it WIP, add simp to the lemma Yael suggests, and then use CI to see if mathlib compiles, and then run the benchmarker on the results to see if mathlib is now easier or harder to compile.

Mario Carneiro (Jan 28 2024 at 18:17):

I think the inherit_doc algorithm is a bit more conservative than lean 3, but the fact that the default is no docs is very ungreat

Terence Tao (Jan 28 2024 at 18:17):

The immediate fix then is to expand on the docstring for «term{_}» to explicitly state that {a,b} is syntactically equal to insert a {b}, and maybe also that {a,b,c} is syntactically equal to insert a {b,c}. Given that this docstring shows up on hover, that should be enough of a guide for a typical user to know where to look for API.

Mario Carneiro (Jan 28 2024 at 18:18):

Re: #9942, when adding explicit docs to a notation one thing I like to do is to make sure to actually mention the declaration name in the docstring, because it's not obvious or easy to get to this information otherwise

Yaël Dillies (Jan 28 2024 at 18:18):

Mario Carneiro said:

I think the inherit_doc algorithm is a bit more conservative than lean 3, but the fact that the default is no docs is very ungreat

Strong agreed. It has the advantage that we now have more opportunities to write better docstrings, though.

Yaël Dillies (Jan 28 2024 at 18:19):

Isn't the declaration name something that's accessible on hover?

Mario Carneiro (Jan 28 2024 at 18:19):

you see the docs on hover

Yaël Dillies (Jan 28 2024 at 18:19):

Don't you see the expected type too?

Mario Carneiro (Jan 28 2024 at 18:19):

yes

Terence Tao (Jan 28 2024 at 18:19):

image.png
This is what one sees for instance in the playground. No declaration name, just the docstring and type.

Mario Carneiro (Jan 28 2024 at 18:20):

but that doesn't tell you what function it is

Yaël Dillies (Jan 28 2024 at 18:21):

Expected type got a lot more confusing in Lean 4, though, because Lean 4 displays

very_long_expected_term
_across_several_lines : ex
pected_type

where Lean 3 would just give

expected_type

which I find more useful

Mario Carneiro (Jan 28 2024 at 18:21):

I don't think that's true?

Mario Carneiro (Jan 28 2024 at 18:21):

hover on terms just shows the expected type

Mario Carneiro (Jan 28 2024 at 18:21):

hover on constants shows the constant signature

Yaël Dillies (Jan 28 2024 at 18:22):

...unless I'm blind
image.png

Mario Carneiro (Jan 28 2024 at 18:22):

and I was just about to suggest that @[inherit_doc] and @[inherit_doc foo] should set things up so that you also see the constant hover for foo, instead of just the docstring for it

Yaël Dillies (Jan 28 2024 at 18:23):

Ìs the thing in my image what you mean by "constant signature"?

Mario Carneiro (Jan 28 2024 at 18:23):

no

Mario Carneiro (Jan 28 2024 at 18:24):

image.png

Yaël Dillies (Jan 28 2024 at 18:25):

This sounds useful, but should be displayed after the docstring

Mario Carneiro (Jan 28 2024 at 18:25):

isn't doc-gen the same way?

Terence Tao (Jan 28 2024 at 18:26):

I also recommend adding text to the (currently quite pitiful) section on API on pair sets near docs#Set.pair_eq_singleton to the effect that one can prove many results about {a,b} by using that it is syntactically equal to insert a {b}.

Yaël Dillies (Jan 28 2024 at 18:26):

Yeah, which is very confusing now that we have docstrings on structure fields.

Mario Carneiro (Jan 28 2024 at 18:27):

there was some hubbub about this because the docs were going before the declaration ("to match the docstring syntax in the lean file"), but it has since been changed to have the docs after the definition

Mario Carneiro (Jan 28 2024 at 18:27):

which IMO makes more sense

Mario Carneiro (Jan 28 2024 at 18:28):

docs#Dvd doesn't look that unreasonable to me

Terence Tao (Jan 28 2024 at 18:29):

I still think it's worth explicitly putting the second example into Mathlib though. Right now it is not picked up by exact?, hint, apply?, or simp.

Yaël Dillies (Jan 28 2024 at 18:35):

Yeah this should be fixed by making Set.insert_subset_iff a simp lemma.

Yaël Dillies (Jan 28 2024 at 18:36):

Mario Carneiro said:

docs#Dvd doesn't look that unreasonable to me

Actually, this looks better than it used to. I was remembering the time where the structure docstring would come after the fields (and their docstrings).

Yaël Dillies (Jan 28 2024 at 18:47):

Mario Carneiro said:

Re: #9942, when adding explicit docs to a notation one thing I like to do is to make sure to actually mention the declaration name in the docstring, because it's not obvious or easy to get to this information otherwise

Should I actually do this or should I assume that hovers will soon work differently and include the declaration name automatically?

Ruben Van de Velde (Jan 28 2024 at 19:04):

Yaël Dillies said:

Yeah this should be fixed by making Set.insert_subset_iff a simp lemma.

Isn't docs#Set.insert_subset_iff a bad simp lemma because it duplicates t?

Mario Carneiro (Jan 28 2024 at 19:05):

so does mem_insert though

Mario Carneiro (Jan 28 2024 at 19:05):

I think it's a reasonable simp lemma exactly because it works nicely on insert chains

Mario Carneiro (Jan 28 2024 at 19:10):

then again, unlike mem_insert, with insert_subset_iff both sides of the subset can be pairs; we probably don't want {a, b, c, d, e} ⊆ {x, y, z, u, v} to explode into 25 cases

Yaël Dillies (Jan 28 2024 at 19:14):

Sometimes we might? So maybe that indicates it should be in a non-default simp set

Terence Tao (Jan 28 2024 at 19:15):

I think that strengthens the case of making {a,b} ⊆ E ↔ a ∈ E ∧ b ∈ E a standalone lemma (and possibly also a simp lemma), for instance to be added to the small collection of "Lemmas about pairs" in https://leanprover-community.github.io/mathlib4_docs/Mathlib/Data/Set/Basic.html

Yaël Dillies (Jan 28 2024 at 19:17):

Actually, Ruben is right that even that would be a bad simp lemma since it duplicates E. If E is a huge expression, this might give exponential slowdown.

Mario Carneiro (Jan 28 2024 at 19:17):

quadratic slowdown is more likely

Yaël Dillies (Jan 28 2024 at 19:18):

Probably something that could be fixed with pre vs post simp lemmas, even

Mario Carneiro (Jan 28 2024 at 19:19):

it should be in a non-default simp set

aka it should not be simp and people should just pass it to simp when they want it

Yaël Dillies (Jan 28 2024 at 19:22):

That's one solution (status quo), but my hope is that we could make it more discoverable by having a simp set gathering a bunch of lemmas like this one which are not so easy to find

Terence Tao (Jan 28 2024 at 19:25):

actually, just having a bunch more lemmas with the word pair in their name (Set.mem_pair, Set.pair_subset_iff, etc.), would make searching for lemmas involving pairs quite easy.

Terence Tao (Jan 28 2024 at 19:27):

Do you envision some larger set of simp lemmas that a more desperate version simp! of simp would try using if simp itself was unsuccessful? ["I really, really need this simplified, regardless of computational cost."]


Last updated: May 02 2025 at 03:31 UTC