Zulip Chat Archive

Stream: mathlib4

Topic: Should Subtype.val '' be related to Lean's internal cast?


Terence Tao (Nov 14 2023 at 18:59):

If E is a set in X and F is a set in E (viewed as a subtype of X), I was wondering if Subtype.val '' F is intended to match with (F : Set X), i.e.,

import Mathlib

example {X : Type*} {E : Set X} {F : Set E} : (Subtype.val '' F) = (F : Set X) := by
  sorry

It seems that (F: Set X) uses a method Lean.Internal.coeM which the documentation warns not to actually use in practice. So does this mean that the casting operator ( : Set X) is deprecated, and I should be using Subtype.val '' throughout instead?

Yaël Dillies (Nov 14 2023 at 19:00):

Yes exactly. Lean.Internal.coeM is not something you should ever see. But we haven't managed to enforce that invisibility yet.

Terence Tao (Nov 14 2023 at 19:01):

A pity, because the casting (F : Set X) is intuitive to me and sounds like something I would want to use.

Yaël Dillies (Nov 14 2023 at 19:02):

We're not claiming we shouldn't have such casting, simply that Lean.internal.coeM really is not what it's meant to be using

Patrick Massot (Nov 14 2023 at 19:03):

If you don't want to see it then you can hide it using attribute [coe] Lean.Internal.coeM somewhere near the top of your file.

Patrick Massot (Nov 14 2023 at 19:04):

But of course this is a very superficial thing to do.

Yaël Dillies (Nov 14 2023 at 19:05):

Are those definitionally equal, at least?

Patrick Massot (Nov 14 2023 at 19:09):

What do you mean by "those"?

Patrick Massot (Nov 14 2023 at 19:09):

Are you asking whether the example above is rfl? I don't think, but it is definitely true.

Terence Tao (Nov 14 2023 at 19:10):

I can't see how to prove it though given that there are no methods relating to Lean.internal.coeM.

Patrick Massot (Nov 14 2023 at 19:10):

You can unfold the definition.

Eric Wieser (Nov 14 2023 at 19:11):

import Mathlib

example {X : Type*} {E : Set X} {F : Set E} : (Subtype.val '' F) = (F : Set X) := by
  dsimp [Lean.Internal.coeM, CoeT.coe, CoeHTCT.coe, CoeHTC.coe, CoeOTC.coe, CoeTC.coe]
  show Subtype.val '' F =  i  F, {i}
  sorry

Eric Wieser (Nov 14 2023 at 19:11):

This definitely shouldn't happen, none of those coes are supposed to ever be visible

Patrick Massot (Nov 14 2023 at 19:12):

A dirty proof is

import Mathlib

attribute [coe] Lean.Internal.coeM

example {X : Type*} {E : Set X} {F : Set E} : (Subtype.val '' F) = (F : Set X) := by
  dsimp [Lean.Internal.coeM]
  ext x
  simp only [Set.iUnion_coe_set, Set.mem_iUnion, Set.mem_singleton_iff, exists_prop]
  constructor
  · rintro a, ha, rfl
    use a, a.property, ha
    rfl
  · rintro _, xE, xF, rfl
    use x, xE

Patrick Massot (Nov 14 2023 at 19:13):

But really I'm worried that needing this lemma reveals a deeper issue with your setup. Type theory doesn't play nicely with sets in sets. Usually there is a better way of saying the same thing.

Kyle Miller (Nov 14 2023 at 19:17):

Patrick makes a good point, but just to see how we can get this coercion to work, I stumbled upon this:

import Mathlib

attribute [-instance] Set.monad
instance {X : Type*} (s : Set X) (t : Set s) : CoeT (Set s) t (Set X) := ⟨(Subtype.val '' t)⟩

example {X : Type*} {E : Set X} {F : Set E} : (Subtype.val '' F) = (F : Set X) := by
  -- ⊢ Subtype.val '' F = Subtype.val '' F
  sorry

(Do we want Set to actually be a Monad? Sets don't have computational content in Lean.)

Sebastien Gouezel (Nov 14 2023 at 19:18):

This is an instance of dependent type hell. Another one you might meet is the following: if you have two conditions p and q on a type X, and build the subtype made of the points satisfying p, and then in this one build the subsubtype made of the points also satisfying q, then the thing you get is not the same as if you do things in the other order, using q first and then p. Of course, there is a canonical bijection between them, but it's not even trivial to write down. On the other hand, if you use subsets of X, then the two things are equal (although not definitionally) and a lot of the pain goes away.

Sebastien Gouezel (Nov 14 2023 at 19:19):

If this becomes too painful, you may consider working only with subsets and never with subtypes.

Patrick Massot (Nov 14 2023 at 19:20):

Let me be more explicit than Kyle: Kyle says that we could simply change the definition of the coercion from Set E to Set X to make the equality true by definition. This would require to give up the monad instance on Set whose utility isn't clear anyway.

Kyle Miller (Nov 14 2023 at 19:24):

And we don't have to give up the monad entirely. There's this trick for enabling do notation and all that on a case-by-case basis:

def SetM (α : Type*) := Set α

instance : Monad SetM := Set.monad

def Set.run {α : Type*} (s : SetM α) : Set α := s

def pairs {α : Type*} (s : Set α) : Set (α × α) := Set.run do
  let x  s
  let y  s
  return (x, y)

Terence Tao (Nov 14 2023 at 19:30):

Patrick Massot said:

But really I'm worried that needing this lemma reveals a deeper issue with your setup. Type theory doesn't play nicely with sets in sets. Usually there is a better way of saying the same thing.

Here's what I was actually trying to prove:

import Mathlib

attribute [instance] MeasureTheory.Measure.Subtype.measureSpace

example {X : Type*} [hX: MeasureTheory.MeasureSpace X] {E : Set X} (hE: MeasurableSet E) {F : Set E} (hF : MeasurableSet
F) : MeasureTheory.MeasureSpace.volume F = MeasureTheory.MeasureSpace.volume (F : Set X) := by
  sorry

I can prove this with (F: Set X) replaced by Subtype.val '' F, hence my above question. Now I guess I can just do the rest of my project without ever touching the casting operator to Set X and using Subtype.val '' throughout, but I think it's a pity because to me the above statement is really naturally stated using the casting operator.

Eric Wieser (Nov 14 2023 at 19:30):

Independently of whether Set should be a monad, I think the design of coeM is wrong and it should be inlined like any other coercion

Eric Wieser (Nov 14 2023 at 19:31):

Especially since it doesn't currently inline the coercion inside it

Patrick Massot (Nov 14 2023 at 19:33):

Terry, this statement already has a Set E with E : Set X, so you need to tell use more in order to see whether you really want this statement.

Terence Tao (Nov 14 2023 at 19:35):

Ultimately I want to prove the formula P[ F | E ] = P [ F ] / P [ E ] where E is an event in a probability space X (let's take it to be positive probability to avoid technicalities), and F is an event in the conditioned probability space E. I'm modeling E as a subset of X and F as a subset of E.

Terence Tao (Nov 14 2023 at 19:37):

One could I guess view the conditioned probability space as still a space on X, and view F as a subset of X directly that happens to lie in E, but I find it more conceptual to think of E as the universal set of the conditioned space, rather than X. I was hoping that Lean could tolerate this small amount of dependency in the types.

Patrick Massot (Nov 14 2023 at 19:39):

Lean can definitely tolerate this, but it is not necessarily the easiest path.

Patrick Massot (Nov 14 2023 at 19:39):

If you don't need anything more nested than that then it's probably ok.

Terence Tao (Nov 14 2023 at 19:42):

The reason why I'd like to narrow Set.univ down from X to E is that I will need some sort of congr type statement to the effect that if two events or random variables agree on E, but not necessarily on all of X, then their probabilities (or expectations, entropies or whatever) will also agree when conditioned to E. If I were to keep the universal set fixed at X then I would need some auxiliary lemmas that track the essential support of various probability measures in order to do congr properly, and it seems simpler to me to allow for a dependent universal set.

Terence Tao (Nov 14 2023 at 19:44):

I guess for now I'll just take one of your proofs of the equivalence of Subtype.val '' F and (F : Set X) so that I can use (F : Set X) in the statements of my lemmas even if I don't plan to do anything with this cast other than turn it right back into Subtype.val '' F, just because I think it is aesthetically more natural.

Yaël Dillies (Nov 14 2023 at 19:44):

Getting congr to play ball is a more minor issue than fighting dependent hell in my experience.

Terence Tao (Nov 14 2023 at 20:32):

Yaël Dillies said:

Getting congr to play ball is a more minor issue than fighting dependent hell in my experience.

Hmm, after thinking about it a bit more I am inclined to agree. I can condition to E while retaining X as the ambient MeasurableSpace and add a lemma noting that E has full probability in this conditioned space, and write the congr lemmas in terms of almost sure equality which is what is natural in probability theory anyway. It does mean that there are now multiple MeasureSpace structures on X besides the canonical one, so notation such as P[ ] that was intended to hide the MeasureSpace structure will sometimes have to be modified (I have already made a notation P[ E ; ‹ MeasureSpace X › ] in anticipation of having to explicitly specify the measure space structure on occasion), but I think I can do this using the @ operator to override default instances of MeasureSpace as necessary. Now there are no more dependent subtypes or Subtype.val to work with.

Terence Tao (Nov 14 2023 at 20:36):

Actually this is better in a number of ways. I now get the full conditional probability formula P[ F | E ] = P[ F ∩ E ] / P[ E ] for all F : Set X (assuming measurability etc. of course) rather than just for F : Set E like I had previously.

Kyle Miller (Nov 14 2023 at 22:00):

Terence Tao said:

I can prove this with (F: Set X) replaced by Subtype.val '' F, hence my above question. Now I guess I can just do the rest of my project without ever touching the casting operator to Set X and using Subtype.val '' throughout, but I think it's a pity because to me the above statement is really naturally stated using the casting operator.

I suppose it's irrelevant for your applications now, but if there's a mathlib maintainer that thinks it's a good idea, I experimented with making casting for Set behave nicer for this in #8413

Peter Nelson (Nov 15 2023 at 14:02):

This seems analogous to some of the decisions I made for matroids in mathlib (see the discussion of 'ground sets' in the docs of docs#Matroid).
Coercions/DTT stuff gets pretty bad when sets in types get involved, so I decided to have all my predicates defined on an ambient type, even when everything is happening in some particular subset.

Terence Tao (Nov 15 2023 at 17:20):

Peter Nelson said:

This seems analogous to some of the decisions I made for matroids in mathlib (see the discussion of 'ground sets' in the docs of docs#Matroid).
Coercions/DTT stuff gets pretty bad when sets in types get involved, so I decided to have all my predicates defined on an ambient type, even when everything is happening in some particular subset.

I also noticed this when working with symmetric polynomials in n variables. Initially I modeled these variables by a function on Fin n, but I quickly abandoned this due to not understanding how to use coercions at the time and instead they are defined as functions on even though I don't use the values of this function on all but finitely many of the inputs.

Ideally at some point these sorts of recommendations should be added to a "helpful tips" section of some introductory text for how to use Lean for the working mathematician.

Kyle Miller (Nov 15 2023 at 17:31):

Another trick there is to model the variables by a function on some type X of cardinality n. Abstract indexing like this tends to work nicely -- and help you avoid unnecessary arithmetic.


Last updated: Dec 20 2023 at 11:08 UTC