Mathlib.Probability.StrongLaw

The strong law of large numbers

We prove the strong law of large numbers, in ProbabilityTheory.strong_law_ae: if X n is a sequence of independent identically distributed integrable real-valued random variables, then (∑ i in range n, X i) / n converges almost surely to 𝔼[X 0]. We give here the strong version, due to Etemadi, that only requires pairwise independence.

This file also contains the Lᵖ version of the strong law of large numbers, ProbabilityTheory.strong_law_Lp, which shows that (∑ i in range n, X i) / n converges in Lᵖ to 𝔼[X 0], provided that the X n are pairwise independent, identically distributed, and in Lᵖ, with 1 ≤ p < ∞.

Implementation

We follow the proof by Etemadi [Etemadi, An elementary proof of the strong law of large numbers][etemadi_strong_law], which goes as follows.

It suffices to prove the result for nonnegative X, as one can prove the general result by splitting a general X into its positive part and negative part. Consider Xₙ a sequence of nonnegative integrable identically distributed pairwise independent random variables, with the same distribution as X. Let Yₙ be the truncation of Xₙ up to n. The key estimate is a variance control along the sequence c^k, for fixed c > 1 and ε > 0. We claim that

  ∑_k ℙ (|∑_{i=0}^{c^k - 1} Yᵢ - 𝔼[Yᵢ]| > c^k ε)
    ≤ ∑_k (c^k ε)^{-2} ∑_{i=0}^{c^k - 1} Var[Yᵢ]    (by Chebyshev's inequality and pairwise independence)
    ≤ ∑_i (C/i^2) Var[Yᵢ]                           (as ∑_{c^k > i} 1/(c^k)^2 ≤ C/i^2)
    ≤ ∑_i (C/i^2) 𝔼[Yᵢ^2]
    ≤ 2C 𝔼[X]                                       (see `sum_variance_truncation_le`)
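
The first inequality in the chain above is a Chebyshev-type bound. As a minimal sketch, the following checks the Mathlib lemma that is assumed to provide it. The lemma name is not listed on this page and is an assumption about the variance file this module builds on.

```lean
import Mathlib.Probability.StrongLaw

-- Assumption: this lemma lives in Mathlib's variance file and is available
-- through the import above. It is expected to bound, for a square-integrable
-- X and c > 0, the probability ℙ {ω | c ≤ |X ω - 𝔼[X]|} by Var[X] / c ^ 2.
#check @ProbabilityTheory.meas_ge_le_variance_div_sq
```

Applied to the partial sums of the Yᵢ with threshold c^k ε, a bound of this shape gives the factor (c^k ε)^{-2}. Pairwise independence is what allows the variance of the sum to be replaced by the sum of the variances.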

Prerequisites on truncations

def ProbabilityTheory.truncation {α : Type u_1} (f : α → ℝ) (A : ℝ) :
α → ℝ

Truncating a real-valued function to the interval (-A, A].
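
As an illustration of the description above, here is a minimal stand-alone sketch of such a truncation. The name `truncationSketch` is hypothetical, and the library's own definition may be phrased differently.

```lean
import Mathlib.Probability.StrongLaw

/-- Hypothetical stand-alone version of the truncation described above:
keep `f x` when it lies in the interval `(-A, A]`, and return `0` otherwise. -/
noncomputable def truncationSketch {α : Type*} (f : α → ℝ) (A : ℝ) : α → ℝ :=
  -- `Set.indicator s g` agrees with `g` on `s` and is `0` elsewhere.
  Set.indicator (Set.Ioc (-A) A) id ∘ f
```

Writing the truncation as an indicator composed with `f` keeps it a function of `f x` alone, which is convenient when expressing its integral against the law of `f`, as the interval-integral lemmas below do.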

    theorem ProbabilityTheory.abs_truncation_le_bound {α : Type u_1} (f : α → ℝ) (A : ℝ) (x : α) :
    @[simp]
    theorem ProbabilityTheory.truncation_zero {α : Type u_1} (f : α → ℝ) :
    theorem ProbabilityTheory.abs_truncation_le_abs_self {α : Type u_1} (f : α → ℝ) (A : ℝ) (x : α) :
    theorem ProbabilityTheory.truncation_eq_self {α : Type u_1} {f : α → ℝ} {A : ℝ} {x : α} (h : |f x| < A) :
    theorem ProbabilityTheory.truncation_eq_of_nonneg {α : Type u_1} {f : α → ℝ} {A : ℝ} (h : ∀ (x : α), 0 ≤ f x) :
    theorem ProbabilityTheory.truncation_nonneg {α : Type u_1} {f : α → ℝ} (A : ℝ) {x : α} (h : 0 ≤ f x) :
    theorem ProbabilityTheory.moment_truncation_eq_intervalIntegral {α : Type u_1} {m : MeasurableSpace α} {μ : MeasureTheory.Measure α} {f : α → ℝ} (hf : MeasureTheory.AEStronglyMeasurable f μ) {A : ℝ} (hA : 0 ≤ A) {n : ℕ} (hn : n ≠ 0) :
    ∫ (x : α), ProbabilityTheory.truncation f A x ^ n ∂μ = ∫ (y : ℝ) in -A..A, y ^ n ∂MeasureTheory.Measure.map f μ
    theorem ProbabilityTheory.moment_truncation_eq_intervalIntegral_of_nonneg {α : Type u_1} {m : MeasurableSpace α} {μ : MeasureTheory.Measure α} {f : α → ℝ} (hf : MeasureTheory.AEStronglyMeasurable f μ) {A : ℝ} {n : ℕ} (hn : n ≠ 0) (h'f : 0 ≤ f) :
    ∫ (x : α), ProbabilityTheory.truncation f A x ^ n ∂μ = ∫ (y : ℝ) in 0 ..A, y ^ n ∂MeasureTheory.Measure.map f μ
    theorem ProbabilityTheory.integral_truncation_eq_intervalIntegral {α : Type u_1} {m : MeasurableSpace α} {μ : MeasureTheory.Measure α} {f : α → ℝ} (hf : MeasureTheory.AEStronglyMeasurable f μ) {A : ℝ} (hA : 0 ≤ A) :
    ∫ (x : α), ProbabilityTheory.truncation f A x ∂μ = ∫ (y : ℝ) in -A..A, y ∂MeasureTheory.Measure.map f μ
    theorem ProbabilityTheory.integral_truncation_eq_intervalIntegral_of_nonneg {α : Type u_1} {m : MeasurableSpace α} {μ : MeasureTheory.Measure α} {f : α → ℝ} (hf : MeasureTheory.AEStronglyMeasurable f μ) {A : ℝ} (h'f : 0 ≤ f) :
    ∫ (x : α), ProbabilityTheory.truncation f A x ∂μ = ∫ (y : ℝ) in 0 ..A, y ∂MeasureTheory.Measure.map f μ
    theorem ProbabilityTheory.integral_truncation_le_integral_of_nonneg {α : Type u_1} {m : MeasurableSpace α} {μ : MeasureTheory.Measure α} {f : α → ℝ} (hf : MeasureTheory.Integrable f) (h'f : 0 ≤ f) {A : ℝ} :
    ∫ (x : α), ProbabilityTheory.truncation f A x ∂μ ≤ ∫ (x : α), f x ∂μ
    theorem ProbabilityTheory.tendsto_integral_truncation {α : Type u_1} {m : MeasurableSpace α} {μ : MeasureTheory.Measure α} {f : α → ℝ} (hf : MeasureTheory.Integrable f) :
    Filter.Tendsto (fun A => ∫ (x : α), ProbabilityTheory.truncation f A x ∂μ) Filter.atTop (nhds (∫ (x : α), f x ∂μ))

    If a function is integrable, then the integral of its truncated versions converges to the integral of the whole function.
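
A minimal usage sketch of this convergence statement, written against the signature shown on this page. The hypothesis is stated with the measure `μ` explicit, which is an assumption about how the page's `Integrable f` should be read. Identifiers may differ in other Mathlib versions.

```lean
import Mathlib.Probability.StrongLaw

open MeasureTheory ProbabilityTheory Filter

-- As the truncation level `A` tends to infinity, the integral of the
-- truncated function tends to the integral of `f` itself.
example {α : Type*} {m : MeasurableSpace α} {μ : Measure α} {f : α → ℝ}
    (hf : Integrable f μ) :
    Tendsto (fun A => ∫ x, truncation f A x ∂μ) atTop (nhds (∫ x, f x ∂μ)) :=
  tendsto_integral_truncation hf
```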

    theorem ProbabilityTheory.sum_prob_mem_Ioc_le {Ω : Type u_1} [MeasureTheory.MeasureSpace Ω] [MeasureTheory.IsProbabilityMeasure MeasureTheory.volume] {X : Ω → ℝ} (hint : MeasureTheory.Integrable X) (hnonneg : 0 ≤ X) {K : ℕ} {N : ℕ} (hKN : K ≤ N) :
    (Finset.sum (Finset.range K) fun j => ↑↑MeasureTheory.volume {ω | X ω ∈ Set.Ioc ↑j ↑N}) ≤ ENNReal.ofReal ((∫ (a : Ω), X a) + 1)
    theorem ProbabilityTheory.tsum_prob_mem_Ioi_lt_top {Ω : Type u_1} [MeasureTheory.MeasureSpace Ω] [MeasureTheory.IsProbabilityMeasure MeasureTheory.volume] {X : Ω → ℝ} (hint : MeasureTheory.Integrable X) (hnonneg : 0 ≤ X) :
    ∑' (j : ℕ), ↑↑MeasureTheory.volume {ω | X ω ∈ Set.Ioi ↑j} < ⊤
    theorem ProbabilityTheory.sum_variance_truncation_le {Ω : Type u_1} [MeasureTheory.MeasureSpace Ω] [MeasureTheory.IsProbabilityMeasure MeasureTheory.volume] {X : Ω → ℝ} (hint : MeasureTheory.Integrable X) (hnonneg : 0 ≤ X) (K : ℕ) :
    (Finset.sum (Finset.range K) fun j => (↑j ^ 2)⁻¹ * ∫ (a : Ω), (ProbabilityTheory.truncation X ↑j ^ 2) a) ≤ 2 * ∫ (a : Ω), X a
    theorem ProbabilityTheory.strong_law_aux1 {Ω : Type u_1} [MeasureTheory.MeasureSpace Ω] [MeasureTheory.IsProbabilityMeasure MeasureTheory.volume] (X : ℕ → Ω → ℝ) (hint : MeasureTheory.Integrable (X 0)) (hindep : Pairwise fun i j => ProbabilityTheory.IndepFun (X i) (X j)) (hident : ∀ (i : ℕ), ProbabilityTheory.IdentDistrib (X i) (X 0)) (hnonneg : ∀ (i : ℕ) (ω : Ω), 0 ≤ X i ω) {c : ℝ} (c_one : 1 < c) {ε : ℝ} (εpos : 0 < ε) :
    ∀ᵐ (ω : Ω), ∀ᶠ (n : ℕ) in Filter.atTop, |(Finset.sum (Finset.range ⌊c ^ n⌋₊) fun i => ProbabilityTheory.truncation (X i) (↑i) ω) - ∫ (a : Ω), Finset.sum (Finset.range ⌊c ^ n⌋₊) (fun i => ProbabilityTheory.truncation (X i) ↑i) a| < ε * ↑⌊c ^ n⌋₊
    theorem ProbabilityTheory.strong_law_aux2 {Ω : Type u_1} [MeasureTheory.MeasureSpace Ω] [MeasureTheory.IsProbabilityMeasure MeasureTheory.volume] (X : ℕ → Ω → ℝ) (hint : MeasureTheory.Integrable (X 0)) (hindep : Pairwise fun i j => ProbabilityTheory.IndepFun (X i) (X j)) (hident : ∀ (i : ℕ), ProbabilityTheory.IdentDistrib (X i) (X 0)) (hnonneg : ∀ (i : ℕ) (ω : Ω), 0 ≤ X i ω) {c : ℝ} (c_one : 1 < c) :
    ∀ᵐ (ω : Ω), (fun n => (Finset.sum (Finset.range ⌊c ^ n⌋₊) fun i => ProbabilityTheory.truncation (X i) (↑i) ω) - ∫ (a : Ω), Finset.sum (Finset.range ⌊c ^ n⌋₊) (fun i => ProbabilityTheory.truncation (X i) ↑i) a) =o[Filter.atTop] fun n => ↑⌊c ^ n⌋₊
    theorem ProbabilityTheory.strong_law_aux3 {Ω : Type u_1} [MeasureTheory.MeasureSpace Ω] [MeasureTheory.IsProbabilityMeasure MeasureTheory.volume] (X : ℕ → Ω → ℝ) (hint : MeasureTheory.Integrable (X 0)) (hident : ∀ (i : ℕ), ProbabilityTheory.IdentDistrib (X i) (X 0)) :
    (fun n => (∫ (a : Ω), Finset.sum (Finset.range n) (fun i => ProbabilityTheory.truncation (X i) ↑i) a) - ↑n * ∫ (a : Ω), X 0 a) =o[Filter.atTop] Nat.cast

    The expectation of the truncated version of Xᵢ behaves asymptotically like the whole expectation. This follows from convergence and Cesàro averaging.

    theorem ProbabilityTheory.strong_law_aux4 {Ω : Type u_1} [MeasureTheory.MeasureSpace Ω] [MeasureTheory.IsProbabilityMeasure MeasureTheory.volume] (X : ℕ → Ω → ℝ) (hint : MeasureTheory.Integrable (X 0)) (hindep : Pairwise fun i j => ProbabilityTheory.IndepFun (X i) (X j)) (hident : ∀ (i : ℕ), ProbabilityTheory.IdentDistrib (X i) (X 0)) (hnonneg : ∀ (i : ℕ) (ω : Ω), 0 ≤ X i ω) {c : ℝ} (c_one : 1 < c) :
    ∀ᵐ (ω : Ω), (fun n => (Finset.sum (Finset.range ⌊c ^ n⌋₊) fun i => ProbabilityTheory.truncation (X i) (↑i) ω) - ↑⌊c ^ n⌋₊ * ∫ (a : Ω), X 0 a) =o[Filter.atTop] fun n => ↑⌊c ^ n⌋₊
    theorem ProbabilityTheory.strong_law_aux5 {Ω : Type u_1} [MeasureTheory.MeasureSpace Ω] [MeasureTheory.IsProbabilityMeasure MeasureTheory.volume] (X : ℕ → Ω → ℝ) (hint : MeasureTheory.Integrable (X 0)) (hident : ∀ (i : ℕ), ProbabilityTheory.IdentDistrib (X i) (X 0)) (hnonneg : ∀ (i : ℕ) (ω : Ω), 0 ≤ X i ω) :
    ∀ᵐ (ω : Ω), (fun n => (Finset.sum (Finset.range n) fun i => ProbabilityTheory.truncation (X i) (↑i) ω) - Finset.sum (Finset.range n) fun i => X i ω) =o[Filter.atTop] fun n => ↑n

    The truncated and non-truncated versions of Xᵢ have the same asymptotic behavior, as they almost surely coincide at all but finitely many steps. This follows from a probability computation and Borel-Cantelli.

    theorem ProbabilityTheory.strong_law_aux6 {Ω : Type u_1} [MeasureTheory.MeasureSpace Ω] [MeasureTheory.IsProbabilityMeasure MeasureTheory.volume] (X : ℕ → Ω → ℝ) (hint : MeasureTheory.Integrable (X 0)) (hindep : Pairwise fun i j => ProbabilityTheory.IndepFun (X i) (X j)) (hident : ∀ (i : ℕ), ProbabilityTheory.IdentDistrib (X i) (X 0)) (hnonneg : ∀ (i : ℕ) (ω : Ω), 0 ≤ X i ω) {c : ℝ} (c_one : 1 < c) :
    ∀ᵐ (ω : Ω), Filter.Tendsto (fun n => (Finset.sum (Finset.range ⌊c ^ n⌋₊) fun i => X i ω) / ↑⌊c ^ n⌋₊) Filter.atTop (nhds (∫ (a : Ω), X 0 a))
    theorem ProbabilityTheory.strong_law_aux7 {Ω : Type u_1} [MeasureTheory.MeasureSpace Ω] [MeasureTheory.IsProbabilityMeasure MeasureTheory.volume] (X : ℕ → Ω → ℝ) (hint : MeasureTheory.Integrable (X 0)) (hindep : Pairwise fun i j => ProbabilityTheory.IndepFun (X i) (X j)) (hident : ∀ (i : ℕ), ProbabilityTheory.IdentDistrib (X i) (X 0)) (hnonneg : ∀ (i : ℕ) (ω : Ω), 0 ≤ X i ω) :
    ∀ᵐ (ω : Ω), Filter.Tendsto (fun n => (Finset.sum (Finset.range n) fun i => X i ω) / ↑n) Filter.atTop (nhds (∫ (a : Ω), X 0 a))

    Xᵢ satisfies the strong law of large numbers along all integers. This follows from the corresponding fact along the sequences c^n, and the fact that any integer can be sandwiched between c^n and c^(n+1) with comparably small error if c is close enough to 1 (which is formalized in tendsto_div_of_monotone_of_tendsto_div_floor_pow).

    theorem ProbabilityTheory.strong_law_ae {Ω : Type u_1} [MeasureTheory.MeasureSpace Ω] [MeasureTheory.IsProbabilityMeasure MeasureTheory.volume] (X : ℕ → Ω → ℝ) (hint : MeasureTheory.Integrable (X 0)) (hindep : Pairwise fun i j => ProbabilityTheory.IndepFun (X i) (X j)) (hident : ∀ (i : ℕ), ProbabilityTheory.IdentDistrib (X i) (X 0)) :
    ∀ᵐ (ω : Ω), Filter.Tendsto (fun n => (Finset.sum (Finset.range n) fun i => X i ω) / ↑n) Filter.atTop (nhds (∫ (a : Ω), X 0 a))

    Strong law of large numbers, almost sure version: if X n is a sequence of independent identically distributed integrable real-valued random variables, then (∑ i in range n, X i) / n converges almost surely to 𝔼[X 0]. We give here the strong version, due to Etemadi, that only requires pairwise independence.
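
A minimal usage sketch of the almost sure version, written against the signature displayed on this page. Other Mathlib versions may state the theorem differently, so the hypotheses below are assumptions tied to this page.

```lean
import Mathlib.Probability.StrongLaw

open MeasureTheory ProbabilityTheory Filter

-- Pairwise independent, identically distributed, integrable real random
-- variables on a probability space: the empirical means converge almost
-- surely to the common expectation.
example {Ω : Type*} [MeasureSpace Ω] [IsProbabilityMeasure (volume : Measure Ω)]
    (X : ℕ → Ω → ℝ) (hint : Integrable (X 0))
    (hindep : Pairwise fun i j => IndepFun (X i) (X j))
    (hident : ∀ i, IdentDistrib (X i) (X 0)) :
    ∀ᵐ ω, Tendsto (fun n : ℕ => (Finset.sum (Finset.range n) fun i => X i ω) / (n : ℝ))
      atTop (nhds (∫ a, X 0 a)) :=
  strong_law_ae X hint hindep hident
```

Only `X 0` needs an integrability hypothesis, since `hident` transfers it to every `X i`.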

    theorem ProbabilityTheory.strong_law_Lp {Ω : Type u_1} [MeasureTheory.MeasureSpace Ω] [MeasureTheory.IsProbabilityMeasure MeasureTheory.volume] {p : ENNReal} (hp : 1 ≤ p) (hp' : p ≠ ⊤) (X : ℕ → Ω → ℝ) (hℒp : MeasureTheory.Memℒp (X 0) p) (hindep : Pairwise fun i j => ProbabilityTheory.IndepFun (X i) (X j)) (hident : ∀ (i : ℕ), ProbabilityTheory.IdentDistrib (X i) (X 0)) :
    Filter.Tendsto (fun n => MeasureTheory.snorm (fun ω => (Finset.sum (Finset.range n) fun i => X i ω) / ↑n - ∫ (a : Ω), X 0 a) p MeasureTheory.volume) Filter.atTop (nhds 0)

    Strong law of large numbers, Lᵖ version: if X n is a sequence of independent identically distributed real-valued random variables in Lᵖ, then (∑ i in range n, X i) / n converges in Lᵖ to 𝔼[X 0].
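
As a sketch of how the Lᵖ version can be specialized, the example below takes p = 1, where membership in L¹ follows from integrability. It uses the names `Memℒp` and `snorm` as they appear on this page. The lemmas `memℒp_one_iff_integrable` and `ENNReal.one_ne_top` are not listed here and are assumed to be available in the matching Mathlib version.

```lean
import Mathlib.Probability.StrongLaw

open MeasureTheory ProbabilityTheory Filter

-- With p = 1: integrable, pairwise independent, identically distributed
-- variables have empirical means converging to the expectation in L¹.
example {Ω : Type*} [MeasureSpace Ω] [IsProbabilityMeasure (volume : Measure Ω)]
    (X : ℕ → Ω → ℝ) (hint : Integrable (X 0))
    (hindep : Pairwise fun i j => IndepFun (X i) (X j))
    (hident : ∀ i, IdentDistrib (X i) (X 0)) :
    Tendsto (fun n : ℕ => snorm
        (fun ω => (Finset.sum (Finset.range n) fun i => X i ω) / (n : ℝ) - ∫ a, X 0 a)
        1 volume) atTop (nhds 0) :=
  -- `le_rfl` gives 1 ≤ 1, and `ENNReal.one_ne_top` gives 1 ≠ ⊤.
  strong_law_Lp le_rfl ENNReal.one_ne_top X (memℒp_one_iff_integrable.2 hint) hindep hident
```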