Zulip Chat Archive

Stream: Polynomial Freiman-Ruzsa conjecture

Topic: Is this the right way to introduce independent copies?

Terence Tao (Nov 18 2023 at 03:24):

In https://teorth.github.io/pfr/blueprint/sect0006.html#a0000000014 we will need the following construction: starting with two random variables $X_1,X_2$ (on different probability spaces), create four independent variables $X'_1, X'_2, X''_1, X''_2$ of $X_1,X_2,X_1,X_2$ respectively on the same space. There are also a number of places elsewhere where we will generate two independent copies of two variables, or three independent copies of three variables.

I was trying to think about the general procedure for creating independent copies of some finite number of variables and came up with this beast of a statement:

import Mathlib

open MeasureTheory ProbabilityTheory

lemma independent_copies' {I: Type*} [Fintype I] (l : List I) (S : I → Type*) (mS : ∀ i : I, MeasurableSpace (S i))
(Ω : I → Type*) (mΩ : ∀ i : I, MeasurableSpace (Ω i)) (X : ∀ i : I, Ω i → S i) (hX : ∀ i : I, Measurable (X i))
(μ: ∀ i : I, Measure (Ω i)) :
  ∃ ν : Measure (List.TProd S l), IsProbabilityMeasure ν ∧
  ∃ X' : (∀ i : I, (List.TProd S l) → S i),
  (iIndepFun mS X' ν) ∧
  ∀ i : I, Measurable (X' i) ∧ IdentDistrib (X' i) (X i) ν (μ i) := by sorry

(The Measurable conclusions are perhaps redundant, being almost covered by the IdentDistrib hypothesis, except that that hypothesis only gives almost everywhere measurability.)

II am wondering if I am overthinking this and there is some simpler way to create four independent copies of some random variables. For creating copies of just two random variables, the situation is a little bit better but still rather cumbersome:

import Mathlib

open MeasureTheory ProbabilityTheory

lemma independent_copies (X : Ω → S) (Y : Ω' → T) (hX : Measurable X) (hY: Measurable Y)
(μ: Measure Ω) (μ': Measure Ω') :
  ∃ ν : Measure (S × T), ∃ X' : S × T → S, ∃ Y' : S × T → T,
  IsProbabilityMeasure ν ∧ Measurable X' ∧ Measurable Y'
  ∧ (IndepFun X' Y' ν) ∧ IdentDistrib X' X ν μ ∧ IdentDistrib Y' Y ν μ' := by sorry

In this latter case one can also explicitly describe \nu as (μ.map X).prod (μ'.map Y) and X', Y' as fun x ↦ x.1 and fun x ↦ x.2 respectively. If there was a similarly compact way to describe four independent copies then perhaps the first big lemma could be broken up into more manageable looking pieces.

Sebastien Gouezel (Nov 18 2023 at 07:59):

I think your approach is good. I started doing the 4 variables case by hand, and it was really cumbersome. A more general statement like the one you give will probably be nicer to prove. I would probably hide the implementation detail that the space we get is List.TProd, though, with something like that:

import Mathlib

open MeasureTheory ProbabilityTheory
universe u v

lemma independent_copies' {I: Type*} [Fintype I] {S : I → Type u}
    [mS : ∀ i : I, MeasurableSpace (S i)] {Ω : I → Type v}
    [mΩ : ∀ i : I, MeasurableSpace (Ω i)] (X : ∀ i : I, Ω i → S i) (hX : ∀ i : I, Measurable (X i))
    (μ : ∀ i : I, Measure (Ω i)) :
    ∃ (A : Type (max u v)) (mA : MeasurableSpace A) (μA : Measure A) (X' : ∀ i, A → S i),
    IsProbabilityMeasure μA ∧
    (iIndepFun mS X' μA) ∧
    ∀ i : I, Measurable (X' i) ∧ IdentDistrib (X' i) (X i) μA (μ i) := by sorry

When trying to use this in the concrete situation of four random variables, I stumbled on a notational issue, though. When A, B, C, D have the same type, we have the notation ![A, B, C, D] to form the function from Fin 4 = {0, 1, 2, 3} mapping 0 to A and 1 to B and so on. So, when applying the lemma, we can use ![Ω, Ω, Ω', Ω'] to say which space we want to use on each coordinate. However, we can not use ![X, X, X', X'] to write which variables we want to copy, because these don't have the same type (the first two have type Ω → G, the other ones Ω' → G). It's certainly just a matter of setting up the right notation.

Vincent Beffara (Nov 18 2023 at 10:00):

Why not first state this:

lemma pre_exindep {I : Type*} [Fintype I] (Ω : I → Type*) [mΩ : ∀ i, MeasurableSpace (Ω i)]
  (μ : ∀ i, Measure (Ω i)) :
  ∃ Ω' : Type*, ∃ mΩ' : MeasurableSpace Ω', ∃ ν : Measure Ω', ∃ X : ∀ i, Ω' → Ω i,
  IsProbabilityMeasure ν ∧ iIndepFun mΩ X ν ∧ (∀ i : I, Measurable (X i) ∧ ν.map (X i) = μ i) := by sorry

and then instantiate it where the μ i are the laws of the X i?

Sebastien Gouezel (Nov 18 2023 at 10:35):

Yes, that's one of the workarounds I had in mind. There is a tension that things look easier to formalize by using measures (i.e., the distribution of the random variables instead of the random variables themselves), but the intuition is much better in terms of random variables, and the conditioning is also much more natural to phrase with random variables.

For another example, I'm formalizing the existence of a minimizer for the tau functional (whatever that is), and in the end you want to get a random variable with a minimizing distribution, but the proof is naturally done in the space of probability measures on G.

Adam Topaz (Nov 18 2023 at 15:19):

This is very far from my area, but I do wonder if we can make Lean keep track of taking products of the sample spaces? I know that the infrastructure for this project has already been set up, so this isn't a suggestion, but more a curiosity -- what do people think about the following approach? It lets you write x + y for random variables x and y, and automatically takes the product of the sample spaces (modulo some sorries):

import Mathlib

noncomputable section

universe u

open MeasureTheory CategoryTheory

structure ProbCat : Type (u+1) where
  carrier : MeasCat.{u}
  vol : ProbabilityMeasure carrier

structure RV (α : Type u) [MeasurableSpace α] where
  Ω : ProbCat.{u}
  f : Ω.carrier → α
  hf : Measurable f

def MeasCat.prod (X Y : MeasCat.{u}) : MeasCat.{u} where
  α := X × Y

def ProbCat.prod (X Y : ProbCat.{u}) : ProbCat.{u} where
  carrier := X.carrier.prod Y.carrier
  vol := ⟨Measure.prod X.vol.val Y.vol.val, sorry⟩

instance (α : Type u) [MeasurableSpace α] [Add α] [MeasurableAdd₂ α] : Add (RV α) where
  add := fun ⟨Ω,f,hf⟩ ⟨Ω', f',hf'⟩ => ⟨Ω.prod Ω', fun (x,y) => f x + f' y, sorry⟩

example {α : Type u} [MeasurableSpace α] [Add α] [MeasurableAdd₂ α]
    (x y : RV α) : (x + y).Ω = x.Ω.prod y.Ω :=
  rfl

example {α : Type u} [MeasurableSpace α] [Add α] [MeasurableAdd₂ α]
    (x y : RV α) (u : x.Ω.carrier) (v : y.Ω.carrier) : (x + y).f (u,v) = x.f u + y.f v :=
  rfl

Vincent Beffara (Nov 18 2023 at 15:40):

I would guess that the most common situation would be to have RVs defined on the same probability space and we would like to add them the usual way as functions (so x + y would also be defined on the same space), which I'm not sure is compatible with your suggestion. But perhaps a related option would be to define the "independent pair" of RVs with bundled probability space (and then compose with addition or multiplication or whatever if needed).

In general I do not believe that bundling the probability space would fit the usual mathematical way that probability is done. It's true that we usually don't make it explicit, but it is still there :-)

Adam Topaz (Nov 18 2023 at 15:49):

Again, I’m not at all a probability theorist :) but I was under the impression that talking explicitly about a sample space of a random variable is as “evil” as talking about equality of objects in a category. Is that not the case?

Adam Topaz (Nov 18 2023 at 15:54):

In any case, bundling would at lease solve the issue raised by Sebastien about matrix notation.

Terence Tao (Nov 18 2023 at 17:23):

Sebastien Gouezel said:

I think your approach is good. I started doing the 4 variables case by hand, and it was really cumbersome. A more general statement like the one you give will probably be nicer to prove. I would probably hide the implementation detail that the space we get is List.TProd, though, with something like that:

OK, thanks. I had initially tried this but couldn't figure out what type to assign to A without explicitly constructing it, but with the universes made explicit it works. (Amusing to me, though, that such a concrete piece of mathematics requires just a little bit of awareness about universes.)

When trying to use this in the concrete situation of four random variables, I stumbled on a notational issue, though. When A, B, C, D have the same type, we have the notation ![A, B, C, D] to form the function from Fin 4 = {0, 1, 2, 3} mapping 0 to A and 1 to B and so on. So, when applying the lemma, we can use ![Ω, Ω, Ω', Ω'] to say which space we want to use on each coordinate. However, we can not use ![X, X, X', X'] to write which variables we want to copy, because these don't have the same type (the first two have type Ω → G, the other ones Ω' → G). It's certainly just a matter of setting up the right notation.

Well, in this specific project we only ever take copies of at most four variables, so we could just set up a few helper lemmas to use independent_copies' in a way that hides all the notational issues, hopefully.

Terence Tao (Nov 18 2023 at 17:34):

Adam Topaz said:

Again, I’m not at all a probability theorist :) but I was under the impression that talking explicitly about a sample space of a random variable is as “evil” as talking about equality of objects in a category. Is that not the case?

Yeah, I think so. The way I like to think of it (outside of Lean) is that in probability theory that there are one or more ambient probability spaces in play at a given time, with each random variable being studied attached to one of these spaces. (Some notions between random variables, such as independence, only make sense if they are attached to the same space; other notions, such as identical distribution or convergence in distribution, make sense even if they are attached to different spaces.) However, these spaces should be viewed as "dynamic" or "fluid"; at any given time, one has the right to "flip some more coins" and replace any one of the probability spaces with an extension of that space, with all random variables that were attached to the previous space, now attached to the extension. All probability theory concepts are designed to be invariant under this "base change" operation: for instance, if the original random variable X had an expectation E[X], this expectation would remain unchanged under such an extension operation. The main reason one would perform such an extension is to introduce new random variables that were independent of the preceding ones. With this setup it becomes possible to not refer to the underlying probability spaces at all, with the understanding that they are automatically being managed by the ambient context.

(I had some thoughts about actually setting up a filter on the category of probability spaces (where the morphisms are extension maps) and defining a random variable to be something like a "pro-functor" on this category (defined on certain probability spaces and their extensions, but not on all probability spaces), to try to capture this picture formally, but I've never quite figured out exactly how to do it properly.)

In this project we aren't taking this approach (I imagine it would generate way too much "dependent type hell" in Lean) but an intermediate approach in which every time one needs to introduce some variables that are independent of each other and have the distribution of some existing random variables, we use a product space construction to create independent copies of those random variables, and then work with that space going forward. I was persuaded to not try to introduce custom structures to explicitly attach the probability spaces to the random variables, but keep everything deconstructed into existing Mathlib classes and just record the probability spaces and measures in the statements of the lemmas. This makes the statements a little longer than they would in Mathematical English probability, where one would indeed hide the probability space as much as possible, but I think it is a tolerable compromise.

Adam Topaz (Nov 18 2023 at 18:19):

Thanks! In general I think there are a number of people here (myself included) who are very interested in figuring out how to formalize things in a way that’s as close as possible to the way humans usually do mathematics, not just in probability theory but also in sheaf/topos theory, condensed mathematics, etc. It would be nice to keep this discussion going but I don’t want to distract too much from this project so I’ll post something in another stream.

Sebastien Gouezel (Nov 18 2023 at 19:27):

I think we should be able to switch from [MeasurableSpace] to [MeasureSpace] in most statements, thereby hiding the measure from the notation and making them much closer to the paper notation. I'll experiment on this later.

Terence Tao (Nov 18 2023 at 19:35):

Sebastien Gouezel said:

I think we should be able to switch from [MeasurableSpace] to [MeasureSpace] in most statements, thereby hiding the measure from the notation and making them much closer to the paper notation. I'll experiment on this later.

Yes, Remy did some proof of concept demonstrations of this in entropy_basic.lean using := by volume_tac to hide the measure in case one had a MeasureSpace instance. It should be doable, and perhaps worth switching to if the notation becomes too cluttered.

Last updated: May 02 2025 at 03:31 UTC