Zulip Chat Archive

Stream: new members

Topic: Handling Probability with Uniform variables on I


Mathias Sven (Dec 15 2025 at 20:11):

I have been struggling with how to prove or perhaps even state my problem in Lean. Essentially, I just want to show that the distribution of X\sqrt X and max(X,Y)\max(X, Y) are equivalent, where XX and YY are independent variables uniform on [0,1][0, 1].

I have not been able to find many resources as to how to state this sort of problem correctly, but from what I have found, it seems it would be something like this?

lemma sqrt_uniform_identDistrib_max_uniform_uniform
  [IsProbabilityMeasure ( : Measure I)]
  (U₁ U₂ : I  )
  (hU₁ : pdf.IsUniform U₁ unitInterval )
  (hU₂ : pdf.IsUniform U₂ unitInterval )
  (hindep : IndepFun U₁ U₂) :
  IdentDistrib (λω  (U₁ ω)) (λω  max (U₁ ω) (U₂ ω)) := by

If this is correct, I have tried to show this in quite a few ways, all starting with the same boilerplate:

  have aemU₁ : AEMeasurable U₁  := by simp [hU₁.aemeasurable]
  have aemU₂ : AEMeasurable U₂  := by simp [hU₂.aemeasurable]
  have aemeasurable_fst := aemU₁.sqrt
  have aemeasurable_snd := aemU₁.max aemU₂
  have := Measure.isProbabilityMeasure_map aemeasurable_fst
  have := Measure.isProbabilityMeasure_map aemeasurable_snd
  refine { aemeasurable_fst, aemeasurable_snd, map_eq := ?_ }
  apply Measure.eq_of_cdf

But regardless of what I try, I seem to always get stuck on a by_cases where I can't show the negative.

In my last attempt, instead of using apply Measure.eq_of_cdf, I found Measure.ext_of_Iic which seems like it would allow me to follow a similar logic to how I wanted to show it, but in the end I still get the same issue:

  apply Measure.ext_of_Iic
  intro t
  repeat rw [Measure.map_apply_of_aemeasurable ‹_› measurableSet_Iic]
  by_cases ht : t  I
  calc
     ((fun ω => (U₁ ω)) ⁻¹' Set.Iic t)
      =  (U₁ ⁻¹' Set.Iic (t ^ 2)) := by
      congr 1; ext ω
      simp only [Set.mem_preimage, Set.mem_Iic]
      exact Real.sqrt_le_left ht.out.left
    _ =  (I  Set.Iic (t ^ 2)) := by
      simp [hU₁.measure_preimage (by simp) (by simp) measurableSet_Iic]
    _ =  (Set.Icc 0 (t ^ 2)) := by
      congr 1; ext x; simp
      intro hxt _
      exact hxt.trans ((sq_le_one_iff₀ ht.left).mpr ht.right)
    _ = ENNReal.ofReal (t ^ 2) := by simp [Real.volume_Icc]

  symm

  calc
     ((fun ω => max (U₁ ω) (U₂ ω)) ⁻¹' (Set.Iic t))
      =  ((U₁ ⁻¹' Set.Iic t)  (U₂ ⁻¹' Set.Iic t)) := by
      congr 1; ext; simp
    _ =  (U₁ ⁻¹' Set.Iic t) *  (U₂ ⁻¹' Set.Iic t) := by
      simp [hindep.measure_inter_preimage_eq_mul]
    _ =  (I  Set.Iic t) *  (I  Set.Iic t) := by
      simp [hU₂.measure_preimage _ _, hU₁.measure_preimage _ _]

  have : I  Set.Iic t = Set.Icc 0 t := by
    ext x : 1
    simp [Set.mem_inter_iff, Set.mem_Icc, Set.mem_Iic]
    intro hxt _
    exact hxt.trans ht.right
  simp [this, sq, ENNReal.ofReal_mul, ht.out]

Which leaves me with the goal:

ht : t  I
  ((fun ω => (U₁ ω)) ⁻¹' Set.Iic t) =  ((fun ω => max (U₁ ω) (U₂ ω)) ⁻¹' Set.Iic t)

Etienne Marion (Dec 15 2025 at 20:39):

This would be a more idiomatic way to state it:

import Mathlib

open MeasureTheory ProbabilityTheory
open scoped unitInterval

variable {Ω : Type*} { : MeasurableSpace Ω} {P : Measure Ω} [IsProbabilityMeasure P]

lemma sqrt_uniform_identDistrib_max_uniform_uniform
    {U₁ U₂ : Ω  } (hU₁ : pdf.IsUniform U₁ I P) (hU₂ : pdf.IsUniform U₂ I P)
    (hindep : U₁ ⟂ᵢ[P] U₂) :
    IdentDistrib (fun ω  (U₁ ω)) (fun ω  max (U₁ ω) (U₂ ω)) P P := by
  sorry

Etienne Marion (Dec 15 2025 at 20:50):

And to finish your proof it looks like you just need to do a split on whether t < 0 or 1 < t.

Mathias Sven (Dec 15 2025 at 21:47):

That last part is what I am struggling with. Given my assumptions, it seems like there should be an easy way to show that it is 0 on both sides in one case, and 1 on the other, but I can't seem to figure it out

Mathias Sven (Dec 15 2025 at 21:49):

Perhaps my approach is also needlessly convoluted

Etienne Marion (Dec 15 2025 at 22:15):

The missing lemma saying that a random variable uniform over a set is almost surely in this set:

  have h1 : ∀ᵐ ω P, U₁ ω  I := by
    change P (U₁ ⁻¹' I) = 0
    rw [hU₁.measure_preimage (by simp) (by simp)]
    simp
    measurability
  have h2 : ∀ᵐ ω P, U₂ ω  I := by
    change P (U₂ ⁻¹' I) = 0
    rw [hU₂.measure_preimage (by simp) (by simp)]
    simp
    measurability

Etienne Marion (Dec 15 2025 at 22:37):

Ok here you go:

import Mathlib

open MeasureTheory ProbabilityTheory
open scoped unitInterval

variable {Ω : Type*} { : MeasurableSpace Ω} {P : Measure Ω} [IsProbabilityMeasure P]

lemma sqrt_uniform_identDistrib_max_uniform_uniform
    {U₁ U₂ : Ω  } (hU₁ : pdf.IsUniform U₁ I P) (hU₂ : pdf.IsUniform U₂ I P)
    (hindep : U₁ ⟂ᵢ[P] U₂) :
    IdentDistrib (fun ω  (U₁ ω)) (fun ω  max (U₁ ω) (U₂ ω)) P P := by
  have m1 := (hU₁.aemeasurable (by simp) (by simp)).sqrt
  have m2 := (hU₁.aemeasurable (by simp) (by simp)).max (hU₂.aemeasurable (by simp) (by simp))
  have := P.isProbabilityMeasure_map m1
  have := P.isProbabilityMeasure_map m2
  refine m1, m2, Measure.eq_of_cdf _ _ ?_⟩
  ext t
  have h1 : ∀ᵐ ω P, U₁ ω  I := by
    change P (U₁ ⁻¹' I) = 0
    rw [hU₁.measure_preimage (by simp) (by simp)]
    simp
    measurability
  have h2 : ∀ᵐ ω P, U₂ ω  I := by
    change P (U₂ ⁻¹' I) = 0
    rw [hU₂.measure_preimage (by simp) (by simp)]
    simp
    measurability
  simp only [unitInterval, Set.mem_Icc] at h1 h2
  rw [cdf_eq_real, cdf_eq_real]
  simp_rw [measureReal_def]
  congr 1
  rw [Measure.map_apply_of_aemeasurable, Measure.map_apply_of_aemeasurable]
  · obtain ht | ht := lt_or_ge t 0
    · have h : (fun ω  (U₁ ω)) ⁻¹' Set.Iic t = ( : Set Ω) := by
        ext ω
        simp
        linarith [Real.sqrt_nonneg (U₁ ω)]
      have h' : (fun ω  max (U₁ ω) (U₂ ω)) ⁻¹' Set.Iic t =ᵐ[P] ( : Set Ω) := by
        filter_upwards [h2] with ω 
        rw [eq_iff_iff]
        change ω  (fun ω  max (U₁ ω) (U₂ ω)) ⁻¹' Set.Iic t  ω  ( : Set Ω)
        simp only [Set.mem_preimage, Set.mem_Iic, sup_le_iff, Set.mem_empty_iff_false, iff_false,
          not_and, not_le]
        intro
        linarith
      rw [h, measure_congr h']
    · have h : (fun ω  (U₁ ω)) ⁻¹' Set.Iic t = U₁ ⁻¹' Set.Iic (t ^ 2) := by
        ext
        simp [Real.sqrt_le_iff, ht]
      have h' : (fun ω  max (U₁ ω) (U₂ ω)) ⁻¹' Set.Iic t =ᵐ[P] ((U₁ ⁻¹' Set.Iic t)  (U₂ ⁻¹' Set.Iic t) : Set Ω) := by
        filter_upwards [h1, h2] with ω hω1 hω2
        rw [eq_iff_iff]
        change ω  (fun ω  max (U₁ ω) (U₂ ω)) ⁻¹' Set.Iic t  ω  (U₁ ⁻¹' Set.Iic t)  (U₂ ⁻¹' Set.Iic t)
        simp
      rw [h, measure_congr h', hindep.measure_inter_preimage_eq_mul, hU₁.measure_preimage (by simp) (by simp),
        hU₁.measure_preimage (by simp) (by simp), hU₂.measure_preimage (by simp) (by simp),
        unitInterval]
      obtain ht' | ht' := le_total t 1
      · have : t ^ 2  1 := by exact (sq_le_one_iff₀ ht).mpr ht'
        have : Set.Icc 0 1  Set.Iic (t ^ 2) = Set.Icc 0 (t ^ 2) := by ext; grind
        rw [this]
        have : Set.Icc 0 1  Set.Iic t = Set.Icc 0 t := by ext; grind
        rw [this]
        simp [ht, pow_two]
      · have : 1  t ^ 2 := by exact (one_le_sq_iff₀ ht).mpr ht'
        have : Set.Icc 0 1  Set.Iic (t ^ 2) = Set.Icc 0 1 := by ext; grind
        rw [this]
        have : Set.Icc 0 1  Set.Iic t = Set.Icc 0 1 := by ext; grind
        rw [this]
        simp
      any_goals measurability
  any_goals measurability

It is pretty ugly, lots of things should be added to the library as separate lemmas to make this proof much shorter.

Mathias Sven (Dec 16 2025 at 21:52):

I have now had the time to understand the proof you provided better, and I see. It is indeed a little larger than I was expecting. Thank you for providing this, it should serve as a good blueprint for proving similar problems.

Philippe Duchon (Dec 17 2025 at 10:53):

I think probability arguments are likely to always require a lot of boilerplate, because there are a lot of elementary facts that one doesn't explicitly write in an informal proof. Each of the individual steps in Etienne's proof is simple enough to follow, but it still makes up for a longer than expected proof.

In an informal setting, you'd go with "Let us compute the probability distribution function of both random variables and check that they are the same", and proceed by picking a value between 0 and 1 since it is obvious that the pdfs agree outside of the unit interval. This is really what Etienne's proof does, but it has to make explicit all the "obvious" parts.


Last updated: Dec 20 2025 at 21:32 UTC