Zulip Chat Archive

Stream: new members

Topic: Lebesgue Measure on Unit Interval

Dominic Steinitz (Oct 16 2024 at 10:42):

How would I define the probability triple: $([0, 1], {\mathcal{B}}[0, 1], {\mathcal{L}})$ . I've grepped the Mathlib repo, googled and asked ChatGPT but am still none the wiser :-(

Yaël Dillies (Oct 16 2024 at 10:43):

Do you really want to define it, or rather to use it?

Dominic Steinitz (Oct 16 2024 at 10:48):

I'd like to have it as I want to use it but it would be interesting to see e.g. how the Borel $\sigma$ -algebra and Lebesgue measure are constructed.

Rémy Degenne (Oct 16 2024 at 10:48):

There has been several similar discussions in the past weeks on the subject of "how do I write the probability related thing X in Mathlib?". I started to write a post explaining the most common ones: https://github.com/RemyDegenne/lean-community-blog/blob/master/posts/basic-probabliity-in-lean.md
This is still very much a work in progress, but I hope to finish it soon.

Rémy Degenne (Oct 16 2024 at 10:49):

I hope it helps. It does not cover you exact example though.

Rémy Degenne (Oct 16 2024 at 10:51):

For your case: the interval [0,1] should have a MeasureSpace instance as a subset of the reals, which is exactly the sigma-algebra and the measure you want.

Eric Wieser (Oct 16 2024 at 10:52):

The discussion here might be relevant

Rémy Degenne (Oct 16 2024 at 10:52):

import Mathlib
open MeasureTheory

noncomputable
example : MeasureSpace (Set.Icc (0 : ℝ) 1) := inferInstance

Vincent Beffara (Oct 16 2024 at 10:59):

Also, docs#unitInterval.instMeasureSpaceElemReal

Vincent Beffara (Oct 16 2024 at 11:00):

In fact, Mathlib.MeasureTheory.Constructions.UnitInterval is exactly what you are after.

Rémy Degenne (Oct 16 2024 at 11:01):

I never encountered unitInterval! Thanks.

Dominic Steinitz (Oct 17 2024 at 07:16):

I was expecting to get an error

import Mathlib.MeasureTheory.Constructions.UnitInterval
import Mathlib.Probability.Independence.Basic
import Mathlib.Probability.Distributions.Gaussian

open scoped unitInterval
open MeasureTheory ProbabilityTheory

#check volume I
#check volume ∅
#check volume (Set.Icc (1/2 : ℝ) (3/4 : ℝ))

noncomputable
def ν : (Measure I) := volume

def Y : I → ℝ := λ ⟨x, _⟩ => x^2

variable (hYY : Measure.map Y ν = gaussianReal 0 1)

but I guess hYY is a hypothesis which I know to be false but Lean does not yet know.

Be that as it may, what I really want is to be able to sample from a normally distributed random variable (perhaps I should ask this on a new thread) and my thinking had been to get such a random variable and provide a supply of random numbers on the unit interval and apply the random variable to this supply.

Do I really have to provide the CDF and an inverse to it (as a provable definition of Y)? In a way the work has already been done by (in ProbabilityTheory somewhere):

noncomputable
def gaussianPDFReal (μ : ℝ) (v : ℝ≥0) (x : ℝ) : ℝ :=
  (√(2 * π * v))⁻¹ * rexp (- (x - μ)^2 / (2 * v))

Dominic Steinitz (Oct 17 2024 at 08:19):

@Olga Ponomarenko ^

Olga Ponomarenko (Oct 17 2024 at 08:41):

@Rémy Degenne thank you for doing the write-up! It definitely is helpful to frame one's thinking.

I also want to mention that there are several translated (natural language to Lean 4) problem sets, and I found this Hugging Face repo to be very useful:

import pandas as pd

df = pd.read_parquet("hf://datasets/internlm/Lean-Workbook/wkbk_1009.parquet")
df.loc[df["natural_language_statement"].apply(lambda x: "random var" in x.lower())].to_dict(orient="records")

e.g.
id :
lean_workbook_plus_52330

theorem lean_workbook_plus_52330 (X Y Z : ℝ) (hx : X ∈ Set.Icc 0 1) (hy : Y ∈ Set.Icc 0 1) (hz : Z ∈ Set.Icc 0 1) : (X * Y) ^ Z ∈ Set.Icc 0 1   :=  by sorry

state_before :

X Y Z : ℝ
hx : X ∈ Set.Icc 0 1
hy : Y ∈ Set.Icc 0 1
hz : Z ∈ Set.Icc 0 1
⊢ (X * Y) ^ Z ∈ Set.Icc 0 1

state_after :

case left
X Y Z : ℝ
left : 0 ≤ X
right : X ≤ 1
left_1 : 0 ≤ Y
right_1 : Y ≤ 1
left_2 : 0 ≤ Z
right_2 : Z ≤ 1
⊢ 0 ≤ (X * Y) ^ Z

case right

X Y Z : ℝ
left : 0 ≤ X
right : X ≤ 1
left_1 : 0 ≤ Y
right_1 : Y ≤ 1
left_2 : 0 ≤ Z
right_2 : Z ≤ 1
⊢ (X * Y) ^ Z ≤ 1

natural_language_statement :
Let $ X, Y, Z$ be independent random variables uniformly distributed on the unit interval $ [0,1]$ . Prove that $ (XY)^Z$ is also uniformly distributed (yes, I didn't believe it either).

Josha Dekker (Oct 17 2024 at 08:58):

The lean statement

theorem lean_workbook_plus_52330 (X Y Z : ℝ) (hx : X ∈ Set.Icc 0 1) (hy : Y ∈ Set.Icc 0 1) (hz : Z ∈ Set.Icc 0 1) : (X * Y) ^ Z ∈ Set.Icc 0 1   :=  by sorry

translates to: if X, Y and Z lie in [0,1], then (XY)^Z lies in [0,1]. I don't think the statement makes any claims about uniform distributions.

Josha Dekker (Oct 17 2024 at 09:00):

but apparently it is true? https://math.stackexchange.com/questions/261783/distribution-of-xyz-if-x-y-z-is-i-i-d-uniform-on-0-1

Vincent Beffara (Oct 17 2024 at 09:41):

If you want to "explicitly" sample Gaussians, probably the easiest way is to implement something like Box-Muller.

Dominic Steinitz (Oct 17 2024 at 10:15):

Vincent Beffara said:

If you want to "explicitly" sample Gaussians, probably the easiest way is to implement something like Box-Muller.

And I've even used that method in the past but forgotten that I had (until just now).

Last updated: May 02 2025 at 03:31 UTC