Zulip Chat Archive

Stream: Is there code for X?

Topic: the law of a random variable is a probability measure

Kalle Kytölä (Aug 02 2021 at 23:48):

If Pr is a probability measure on a sample space α, and X is a β-valued random variable, then I suppose the right way in Lean to talk about the distribution of X (i.e. the law of X) is as measure.map X Pr, right?

Do we have an instance that the law of a random variable is a probability measure? Is instance even the right way to implement this, to free the user from ever having to explicitly explain that the distribution of a random variable, which is a (measure.map X Pr : measure β) by construction, is moreover a probability_measure?

In other words, is there something like the following? Or is there a better way?

import measure_theory.integration

open measure_theory

instance probability_measure.map_is_probability_measure
  {α β : Type} [measurable_space α] [measurable_space β]
  (Pr : measure α) [probability_measure Pr] {X : α → β} (X_mble : measurable X) :
  probability_measure (measure.map X Pr) :=
{ measure_univ := begin
    rw [measure.map_apply X_mble measurable_set.univ, set.preimage_univ],
    exact measure_univ,
  end, }

Mario Carneiro (Aug 02 2021 at 23:52):

is there a bundled morphism type for measurable functions? Because you could have that be an instance of probability_measure (measure.map (\u X) Pr) where X is a bundled morphism

Mario Carneiro (Aug 02 2021 at 23:53):

alternatively, you could add a wrapper around measure.map that takes an extra argument for X_mble and give that the probability_measure instance

Floris van Doorn (Aug 03 2021 at 01:09):

I believe the closest thing we have to bundled measurable maps is docs#measure_theory.ae_eq_fun, which is about a.e.-measurable maps, and also takes a quotient...

Hunter Monroe (Aug 03 2021 at 05:07):

Note that probability_theory/integration.lean has lemmas regarding real-valued random variables, without specifying that the measure is a probability measure. @Eric Wieser pointed out a problem here with too many measurable_space instances. My tentative definition of expected value for real-valued random variables (which I have not committed) is:

def expected_value [M : measure_space α] {μ : measure α} (X : α → ℝ≥0∞) : ennreal = ∫⁻ a, X a ∂μ

I am also working on a proof of the Borel-Cantelli partial converse where the constraint that the measure be a probability measure is an added assumption:

lemma measure_limsup_eq_one {s : ℕ → set α} (hs : ∀ i, measurable_set (s i)) (hs' : ∑' i, μ (s i) = ∞) (hi: Indep_set s μ) [probability_measure μ ]:  μ (limsup at_top s) = 1 := sorry

Kalle Kytölä (Aug 03 2021 at 07:42):

Hunter Monroe said:

Note that probability_theory/integration.lean has lemmas regarding real-valued random variables, without specifying that the measure is a probability measure.

I agree one doesn't always need to have the assumption that a measure is a probability measure, even for "probability" results. In particular I fully agree the expected value should be written as just an integral or a lintegral, depending on whether one is doing expected values of nonnegative or expected values of integrable random variables.

But there are cases when you do need to know that the law of a random variable is a probability measure. My suggestion was that if there is already a probability_measure instance on the (probability) measure Pr on the sample space, then it should be up to the typeclass inference to get a probabability measure instance on the law measure.map X Pr of a random variable X. The alternatives are to leave it to the user (which I would not like), or some more clever implementation.

Kalle Kytölä (Aug 03 2021 at 07:43):

Eric Wieser pointed out a problem here with too many measurable_space instances.

I read that somewhat differently: my takeaway was that the typeclass inference is designed to have only one instance of a class, whereas in probability theory one often needs several sigma-algebras (measurable_space "instances") on the same type. The appropriate implementation for conveniently working with several sigma-algebras is a good question. But I was anyway not suggesting adding any new measurable_space instances above, only a probability_measure instance on an existing measure (namely measure.map X Pr). I am pretty sure one doesn't ever want many probability_measure instances on the same measure :wink:, so here the typeclass seems a priori quite appropriate. (Although I realize that if there is no consensus on the best implementation, then playing with any alternatives a bit might be good before deciding about the mathlib way.)

Kalle Kytölä (Aug 03 2021 at 07:53):

Floris van Doorn said:

I believe the closest thing we have to bundled measurable maps is docs#measure_theory.ae_eq_fun, which is about a.e.-measurable maps, and also takes a quotient...

Thanks Floris! I also thought there is no bundled measurable maps as such. (Although I don't know if there should be; for example bounded_continuous_functions seem to be working nicely.) And I would have guessed indeed the issue is that occasionally one would prefer ae_measurable and occasionally measurable (in particular one doesn't always have a measure to determine the meaning of a.e., or one wants measurability irrespective of the measure).

I don't know if creating bundled versions of both "measurable functions" and "a.e. measurable functions" separately would be a reasonable option... (I have nothing in principle against it, but looks like a nonnegligible refactor). I was anyway hoping that the probability_measure instance on the law of a random variable, when one already has a probability_measure on the probability measure on the sample space, would be a minor thing.

Kalle Kytölä (Aug 03 2021 at 08:01):

Maybe to return more to the law of a random variable, a related question:

What is the canonical mathlib way to express that "these random variables are identically distributed"?

In terms of measure.map, one can write:

import measure_theory.integration

open measure_theory

def identically_distributed {α β ι : Type} [measurable_space α] [measurable_space β]
  (Pr : measure α) [probability_measure Pr]
  {Xs : ι → α → β} (Xs_mbles : ∀ i, measurable (Xs i)) : Prop :=
∀ (i j : ι), measure.map (Xs i) Pr = measure.map (Xs i) Pr

def identically_distributed_with_law {α β ι : Type} [measurable_space α] [measurable_space β]
  (Pr : measure α) [probability_measure Pr]
  {Xs : ι → α → β} (Xs_mbles : ∀ i, measurable (Xs i)) (law : measure β) : Prop :=
∀ i, measure.map (Xs i) Pr = law

Is this the intended way?

(If it is, then I'm back to wanting a probability_measure (measure.map (Xs i) Pr) instance. :wink:)

Eric Wieser (Aug 03 2021 at 08:34):

~~You can probably spell measure.map (Xs i) Pr as Pr.map (Xs i), not that that answers any of your questions~~

Kalle Kytölä (Aug 03 2021 at 08:41):

I actually tried Pr.map X etc., since I recently started to like the dot-notation a lot (that would be a nice notation here, in particular).

But it gives the error:
invalid field notation, function 'measure_theory.measure.map' does not have explicit argument with type (measure_theory.measure ...).

Kalle Kytölä (Aug 03 2021 at 08:43):

This is in fact quite puzzling to me, since I think there is an explicit argument of type measure α --- just right of colon.

Kalle Kytölä (Aug 03 2021 at 08:45):

Oh, is it because the right-of-colon argument in docs#measure_theory.measure.map is not mapped simply by → but by the bundled →ₗ[ℝ≥0∞]?

Eric Wieser (Aug 03 2021 at 08:57):

Yes, that's exactly what the problem is!

Hunter Monroe (Aug 05 2021 at 01:31):

Kalle Kytölä said:

What is the canonical mathlib way to express that "these random variables are identically distributed"?

A canonical approach to "identical and independently distributed random variables" should build on Indep_fun. The formal-ml repository, has random_variables_IID--this repository is not canonical and hard-to-make canonical.

Rish Vaishnav (Feb 27 2022 at 20:58):

Kalle Kytölä said:

Oh, is it because the right-of-colon argument in docs#measure_theory.measure.map is not mapped simply by → but by the bundled →ₗ[ℝ≥0∞]?

I've been thinking about this recently, see #12345. Could we not just make a wrapper like docs#measure_theory.measure.restrict does over the linear map? I would be happy to do any needed refactoring, just not sure if we had a particular reason against it.

Rish Vaishnav (Feb 28 2022 at 02:24):

#12350

Last updated: May 02 2025 at 03:31 UTC