Zulip Chat Archive
Stream: mathlib4
Topic: characterization of geometrical distribution, memorylessness
Matthias Liesenfeld (Aug 11 2025 at 16:18):
Hello,
I'm new to Lean/Mathlib 4 and wanted to make my first contribution, but I'm stuck. Either way, I'm interested in what you think about it.
In Mathlib/Probability/Distributions/Geometric.lean I wanted to add a section about memorylessness: first, define memorylessness for discrete random variables (as in https://en.wikipedia.org/wiki/Memorylessness), and then characterize the geometric distribution as memoryless.
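Concretely, for an ℕ-valued random variable X the property I have in mind is

$$\mathbb{P}(X > n + m \mid X \ge n) = \mathbb{P}(X > m) \quad \text{for all } n, m \in \mathbb{N},$$

which is exactly what the definition below states. I started off as follows: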
```lean
section Memorylessness

def memorylessness_discrete {Ω : Type*} [MeasurableSpace Ω] (X : Ω → ℕ) (P : Measure Ω)
    [IsProbabilityMeasure P] : Prop :=
  ∀ (n m : ℕ), P[{ω | X ω > n + m} | {ω | X ω ≥ n}] = P {ω | X ω > m}

lemma aux2 (x y z : ℝ≥0∞) : x = y → z * x = z * y := by
  exact fun a ↦ congrArg (HMul.hMul z) a

lemma aux1 (x y z : ℝ≥0∞) : x ≠ 0 → x ≠ ⊤ → x⁻¹ * y = z → y = x * z := by
  intro x_ne_zero x_ne_top h
  have h' : x * (x⁻¹ * y) = x * z := aux2 _ _ _ h
  rwa [ENNReal.mul_inv_cancel_left x_ne_zero x_ne_top] at h'
theorem memorylessness_discrete_iff_geometric {Ω : Type*} [MeasurableSpace Ω] (X : Ω → ℕ)
    (hX : Measurable X) (P : Measure Ω) [IsProbabilityMeasure P]
    (hXnn : ∀ (n : ℕ), P {ω | X ω ≥ n} ≠ 0) (hp_pos : 0 < ENNReal.toNNReal (P {ω | X ω = 0}))
    (hp_le_one : ENNReal.toNNReal (P {ω | X ω = 0}) ≤ 1) :
    memorylessness_discrete X P ↔ Measure.map X P = geometricMeasure hp_pos hp_le_one := by
  constructor
  · intro hm
    have h : ∀ (n m : ℕ), P {ω | X ω > n + m} = P {ω | X ω > m} * P {ω | X ω ≥ n} := by
      intro n m
      have h' : P[{ω | X ω > n + m} | {ω | X ω ≥ n}] = P {ω | X ω > m} := hm n m
      have h'' : P[{ω | X ω > n + m} | {ω | X ω ≥ n}]
          = (P {ω | X ω ≥ n})⁻¹ • P ({ω | X ω > n + m} ∩ {ω | X ω ≥ n}) := by
        simp [cond]
        rw [Measure.restrict_apply']
        refine measurableSet_le ?_ ?_
        exact measurable_const
        exact hX
      have h''' : P ({ω | X ω > n + m} ∩ {ω | X ω ≥ n}) = P {ω | X ω > n + m} := by
        apply DFunLike.congr_arg
        apply Set.inter_eq_self_of_subset_left
        intro ω hω
        simp at hω
        simp
        trans n+m
        exact Nat.le_add_right n m
        exact Nat.le_of_succ_le hω
      rw [h'', h'''] at h'
      have hmul : (P {ω | X ω ≥ n})⁻¹ * P {ω | X ω > n + m} = P {ω | X ω > m} := by
        exact h'
      have hc : P {ω | X ω ≥ n} ≠ 0 := hXnn n
      have hfin : P {ω | X ω ≥ n} ≠ ⊤ := by simp
      have hgoal : P {ω | X ω > n + m} = P {ω | X ω ≥ n} * P {ω | X ω > m} :=
        aux1 _ _ _ hc hfin hmul
      rwa [mul_comm]
    -- iterate the factorization to get the geometric tail probabilities
    have h' : ∀ (n : ℕ), P {ω | X ω > n} = (P {ω | X ω > 0}) ^ (n + 1) := by
      intro n
      induction' n with n ih
      · simp
      · rw [h (n + 1) 0]
        have ih' : P {ω | X ω ≥ n + 1} = P {ω | X ω > 0} ^ (n + 1) := ih
        rw [ih']
        ring
    -- ... and from the tails the individual point masses
    have h'' : ∀ (n : ℕ), P {ω | X ω = n} = (P {ω | X ω > 0}) ^ n * (1 - P {ω | X ω > 0}) := by
      intro n
      by_cases h : n = 0
      · rw [h]
        simp
        refine ENNReal.eq_sub_of_add_eq' ?_ ?_
        simp
        apply?
      · -- here n ≠ 0, so subtracting and re-adding 1 stays inside ℕ
        have hn : n - 1 + 1 = n := by omega
        calc
          P {ω | X ω = n} = P {ω | X ω > n - 1} - P {ω | X ω > n} := by sorry
          _ = (P {ω | X ω > 0}) ^ n - (P {ω | X ω > 0}) ^ (n + 1) := by sorry
          _ = (P {ω | X ω > 0}) ^ n * (1 - P {ω | X ω > 0}) := by sorry
    -- register the two `IsProbabilityMeasure` instances so that `.toPMF` elaborates
    haveI h'4 : IsProbabilityMeasure (Measure.map X P) := by
      sorry -- presumably `isProbabilityMeasure_map` together with `hX`
    /-
    calc
      (Measure.map X P) univ = (Measure.map X P) {_n | True}
      --(Measure.map X P) {_n | True} = (Measure.map X P) (Set ℕ) := by sorry
        := sorry
    -/
    haveI h'5 : IsProbabilityMeasure (geometricMeasure hp_pos hp_le_one) :=
      isProbabilityMeasureGeometric hp_pos hp_le_one
    have h''' : (Measure.map X P).toPMF = (geometricMeasure hp_pos hp_le_one).toPMF := by sorry
    have : Measure.map X P = geometricMeasure hp_pos hp_le_one := by sorry
    exact this
  · intro X_geo
    let p := P {ω | X ω = 0}
    have h' : ∀ (n : ℕ), P {ω | X ω < n} = 1 - (1 - p) ^ n := by
      intro n
      calc
        P {ω | X ω < n} = P {ω | X ω ∈ Finset.range n} := by simp
        _ = (Measure.map X P) (Finset.range n) := by sorry
        _ = (geometricMeasure hp_pos hp_le_one) (Finset.range n) := by rw [X_geo]
        _ = ∑ m ∈ Finset.range n, (geometricPMF hp_pos hp_le_one) m := by sorry
        _ = ENNReal.ofReal (∑ m ∈ Finset.range n, geometricPMFReal p.toReal m) := by sorry
        _ = ∑ m ∈ Finset.range n, (1 - p) ^ m * p := by sorry
        _ = (1 - (1 - p) ^ n) * p / (1 - (1 - p)) := by sorry
        _ = 1 - (1 - p) ^ n := by sorry
    have h'' : ∀ (n : ℕ), P {ω | X ω ≥ n} = (1 - p) ^ n := by
      intro n
      sorry -- apply prob_compl_eq_one_sub _
    have h'3 : ∀ (n m : ℕ), P[{ω | X ω > n + m} | {ω | X ω ≥ n}] = (1 - p) ^ (m + 1) := by sorry
    -- still to do: unfold `memorylessness_discrete` and combine `h''` with `h'3`
    sorry

end Memorylessness

end ProbabilityTheory
```
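Informally, the calculation behind h, h' and h'' in the forward direction is the usual one: memorylessness gives

$$\mathbb{P}(X > n + m) = \mathbb{P}(X > m)\,\mathbb{P}(X \ge n) \quad \text{for all } n, m,$$

and since X is ℕ-valued, ℙ(X ≥ n+1) = ℙ(X > n), so taking m = 0 and inducting on n gives ℙ(X > n) = ℙ(X > 0)^(n+1). For n ≥ 1 this yields

$$\mathbb{P}(X = n) = \mathbb{P}(X > n-1) - \mathbb{P}(X > n) = \mathbb{P}(X > 0)^{n}\bigl(1 - \mathbb{P}(X > 0)\bigr),$$

which is the geometric PMF with parameter p = 1 - ℙ(X > 0) = ℙ(X = 0); the n = 0 case gives the same formula.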
There are a lot of open things to do, but I already want to share this to see what you think, whether I'm even on the right track, and to get some feedback, maybe even some help or advice. There is so much I don't know that I can't list it all. But here are some questions that seem fundamental to me:
- Do you think the notion of memoryless discrete random variables is something that makes sense in mathlib? Would Geometric.lean be a good spot for it?
- In the definition: should it be formulated for random variables? (I feel that in probability one thinks of random variables rather than of measures.)
- A special case is p = 1, where the geometric measure becomes the Dirac measure at 0. How should one treat that case? (See the small check below.)
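About the last point, here is a quick sanity check of what p = 1 means concretely, only unfolding the geometricPMFReal definition from the file (so nothing here depends on the proposed theorem):

```lean
import Mathlib.Probability.Distributions.Geometric

open ProbabilityTheory

-- For p = 1 the defining formula (1 - p) ^ n * p equals 1 at n = 0 and 0 for n ≥ 1,
-- i.e. the distribution degenerates to the point mass at 0.
example : geometricPMFReal 1 0 = 1 := by
  simp [geometricPMFReal]

example (n : ℕ) (hn : n ≠ 0) : geometricPMFReal 1 n = 0 := by
  simp [geometricPMFReal, zero_pow hn]
```

Note that the hypothesis hXnn in my theorem already excludes this case: if the distribution of X is the Dirac measure at 0, then P {ω | X ω ≥ 1} = 0.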
Kenny Lau (Aug 11 2025 at 16:23):
if P is memoryless iff P is geometric then why do we need a separate def IsMemoryless : Prop?
Wuyang (Aug 11 2025 at 19:03):
Framing memorylessness for discrete random variables and tying it to the geometric distribution seems quite natural here. I’d keep it in Geometric.lean. Also, LeanFinder (www.leanfinder.org) might help you find related lemmas and definitions quickly.
Wuyang (Aug 11 2025 at 19:03):
@leanfinder memorylessness property discrete random variable geometric distribution characterization in mathlib
leanfinder (Aug 11 2025 at 19:03):
Here’s what I found:
- definition ProbabilityTheory.geometricPMF (hp_pos : 0 < p) (hp_le_one : p ≤ 1) : PMF ℕ := ⟨fun n ↦ ENNReal.ofReal (geometricPMFReal p n), by apply ENNReal.hasSum_coe.mpr rw [← toNNReal_one] exact (geometricPMFRealSum hp_pos hp_le_one).toNNReal (fun n ↦ geometricPMFReal_nonneg hp_pos hp_le_one)⟩ "For a real number \( p \in (0,1] \), the geometric probability mass function (PMF) on the natural numbers is defined by \( \text{geometricPMF}(p)(n) = (1-p)^n \cdot p \) for each \( n \in \mathbb{N} \), where the values are represented as non-negative extended real numbers. This defines a valid probability distribution since the sum of all probabilities equals 1." (score: 0.716)
- theorem ProbabilityTheory.geometricPMFReal_pos {n : ℕ} (hp_pos : 0 < p) (hp_lt_one : p < 1) : 0 < geometricPMFReal p n := by rw [geometricPMFReal] have : 0 < 1 - p := sub_pos.mpr hp_lt_one positivity "For any natural number \( n \) and real number \( p \in (0,1) \), the geometric probability mass function \( (1-p)^n p \) is strictly positive." (score: 0.693)
- definition ProbabilityTheory.geometricPMFReal (p : ℝ) (n : ℕ) : ℝ := (1-p) ^ n * p "For a real number \( p \in (0,1] \) and a natural number \( n \), the function \( \text{geometricPMFReal}(p, n) \) is defined as \( (1-p)^n \cdot p \), which represents the probability mass function of a geometric distribution with success probability \( p \)." (score: 0.683)
- theorem ProbabilityTheory.geometricPMFReal_nonneg {n : ℕ} (hp_pos : 0 < p) (hp_le_one : p ≤ 1) : 0 ≤ geometricPMFReal p n := by rw [geometricPMFReal] have : 0 ≤ 1 - p := sub_nonneg.mpr hp_le_one positivity "For any natural number \( n \) and real number \( p \) satisfying \( 0 < p \le 1 \), the geometric probability mass function evaluated at \( n \) is nonnegative, i.e., \( 0 \le (1-p)^n p \)." (score: 0.683)
- definition ProbabilityTheory.geometricMeasure (hp_pos : 0 < p) (hp_le_one : p ≤ 1) : Measure ℕ := (geometricPMF hp_pos hp_le_one).toMeasure "For a success probability \( p \in (0,1] \), the geometric measure on the natural numbers \( \mathbb{N} \) is the probability measure induced by the geometric probability mass function \( \text{geometricPMF}(p)(n) = (1-p)^n \cdot p \) for each \( n \in \mathbb{N} \). This measure assigns to any subset \( A \subseteq \mathbb{N} \) the sum of the probabilities \( (1-p)^n \cdot p \) for all \( n \in A \)." (score: 0.677)
- definition ProbabilityTheory.poissonPMF (r : ℝ≥0) : PMF ℕ := by refine ⟨fun n ↦ ENNReal.ofReal (poissonPMFReal r n), ?_⟩ apply ENNReal.hasSum_coe.mpr rw [← toNNReal_one] exact (poissonPMFRealSum r).toNNReal (fun n ↦ poissonPMFReal_nonneg) "The probability mass function of the Poisson distribution with rate parameter \( \lambda \ge 0 \) is an extended non-negative real-valued function on \( \mathbb{N} \), defined by:
$$
\text{poissonPMF}(\lambda, n) = e^{-\lambda} \frac{\lambda^n}{n!}
$$
for each \( n \in \mathbb{N} \), where the expression is interpreted in \( [0, \infty] \)." (score: 0.672)
- theorem ProbabilityTheory.stronglyMeasurable_geometricPMFReal : StronglyMeasurable (geometricPMFReal p) := stronglyMeasurable_iff_measurable.mpr measurable_geometricPMFReal "For any real number \( p \), the function \( n \mapsto (1-p)^n p \) is strongly measurable with respect to the discrete \( \sigma \)-algebra on \( \mathbb{N} \) and the Borel \( \sigma \)-algebra on \( \mathbb{R} \)." (score: 0.671)
- theorem ProbabilityTheory.geometricPMFRealSum (hp_pos : 0 < p) (hp_le_one : p ≤ 1) : HasSum (fun n ↦ geometricPMFReal p n) 1 := by unfold geometricPMFReal have := hasSum_geometric_of_lt_one (sub_nonneg.mpr hp_le_one) (sub_lt_self 1 hp_pos) apply (hasSum_mul_right_iff (hp_pos.ne')).mpr at this simp only [sub_sub_cancel] at this rw [inv_mul_eq_div, div_self hp_pos.ne'] at this exact this "For a real number \( p \in (0,1] \), the sum over all natural numbers \( n \) of the geometric probability mass function \( (1-p)^n p \) equals 1, i.e.,
$$
\sum_{n=0}^\infty (1-p)^n p = 1.
$$" (score: 0.669)
- theorem ProbabilityTheory.isProbabilityMeasureGeometric (hp_pos : 0 < p) (hp_le_one : p ≤ 1) : IsProbabilityMeasure (geometricMeasure hp_pos hp_le_one) := PMF.toMeasure.isProbabilityMeasure (geometricPMF hp_pos hp_le_one) "For any success probability \( p \in (0,1] \), the geometric measure \( \mu \) on \( \mathbb{N} \), defined by \( \mu(\{n\}) = (1-p)^n p \) for each \( n \), is a probability measure, i.e., \( \mu(\mathbb{N}) = 1 \)." (score: 0.668)
- theorem PMF.bernoulli_apply : bernoulli p h b = cond b p (1 - p) := rfl "For a Bernoulli distribution with parameter \( p \) where \( p \le 1 \), the probability mass function evaluated at a boolean value \( b \) is given by:
$$
\text{bernoulli}(p, h)(b) = \begin{cases}
p & \text{if } b = \text{true} \\
1 - p & \text{if } b = \text{false}
\end{cases}
$$" (score: 0.665)
Matthias Liesenfeld (Aug 12 2025 at 08:13):
Kenny Lau wrote:
if P is memoryless iff P is geometric then why do we need a separate def IsMemoryless : Prop?
Perhaps using def is not the intended way of doing this, but from a mathematical point of view there is a notion of memorylessness for discrete distributions of random variables, so in a mathematical text one would define it by stating the formal definition. Is there a more appropriate way of modelling that, i.e. of making a definition in the mathematical sense?
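What I had in mind is the usual pattern of a Prop-valued def for the property itself plus a separate iff-lemma for the equivalence, for instance (a toy illustration with made-up names, not a naming proposal):

```lean
import Mathlib

-- Toy illustration of the `def` + characterization pattern: the property gets its own
-- Prop-valued definition, and the equivalent description is recorded as a separate lemma.
def IsConstantZero (f : ℕ → ℕ) : Prop :=
  ∀ n, f n = 0

theorem isConstantZero_iff (f : ℕ → ℕ) : IsConstantZero f ↔ f = fun _ => 0 := by
  constructor
  · intro h
    funext n
    exact h n
  · intro h n
    simp [h]
```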
Kenny Lau (Aug 12 2025 at 08:57):
I'm not sure
Last updated: Dec 20 2025 at 21:32 UTC