Sub-Gaussian random variables #
This presentation of sub-Gaussian random variables is inspired by section 2.5 of [Ver18].
Let X be a random variable. Consider the following five properties, in which Kᵢ are positive reals:
- (i) for all t ≥ 0, ℙ(|X| ≥ t) ≤ 2 * exp(-t^2 / K₁^2),
- (ii) for all p : ℕ with 1 ≤ p, 𝔼[|X|^p]^(1/p) ≤ K₂ sqrt(p),
- (iii) for all |t| ≤ 1/K₃, 𝔼[exp (t^2 * X^2)] ≤ exp (K₃^2 * t^2),
- (iv) 𝔼[exp(X^2 / K₄)] ≤ 2,
- (v) for all t : ℝ, 𝔼[exp (t * X)] ≤ exp (K₅ * t^2 / 2).
Properties (i) to (iv) are equivalent, in the sense that there exists a constant C such that if X satisfies one of those properties with constant K, then it satisfies any other one with constant at most CK.
If 𝔼[X] = 0 then properties (i)-(iv) are equivalent to (v) in that same sense.
Property (v) implies that X has expectation zero: by Jensen's inequality, exp (t * 𝔼[X]) ≤ 𝔼[exp (t * X)] ≤ exp (K₅ * t^2 / 2), hence t * 𝔼[X] ≤ K₅ * t^2 / 2 for all t, which forces 𝔼[X] = 0.
The name sub-Gaussian is used by various authors to refer to any one of (i)-(v). We will say that a random variable has sub-Gaussian moment generating function (mgf) with constant K₅ to mean that property (v) holds with that constant. The function exp (K₅ * t^2 / 2), which appears in property (v), is the mgf of a Gaussian with variance K₅.
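For instance, if X is a centered Gaussian with variance σ^2, a direct computation of the Gaussian integral gives
𝔼[exp (t * X)] = exp (σ^2 * t^2 / 2) for all t : ℝ,
so X satisfies (v) with equality and constant K₅ = σ^2; property (v) thus says exactly that the mgf of X is dominated by the mgf of some centered Gaussian.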
Property (v) is the most convenient one to work with if one wants to prove concentration inequalities using Chernoff's method.
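As a reminder of how Chernoff's method uses (v): for ε ≥ 0 and any t ≥ 0, Markov's inequality applied to exp (t * X) gives
ℙ(X ≥ ε) ≤ exp (-t * ε) * 𝔼[exp (t * X)] ≤ exp (K₅ * t^2 / 2 - t * ε),
and optimizing the right-hand side at t = ε / K₅ yields ℙ(X ≥ ε) ≤ exp (-ε^2 / (2 * K₅)).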
TODO: implement definitions for (i)-(iv) when it makes sense. For example the smallest constant K₄ such that (iv) is true is an Orlicz norm. Prove relations between those properties.
Conditionally sub-Gaussian random variables and kernels #
A related notion to sub-Gaussian random variables is that of conditionally sub-Gaussian random variables. A random variable X is conditionally sub-Gaussian in the sense of (v) with respect to a sigma-algebra m and a measure μ if for all t : ℝ, exp (t * X) is μ-integrable and the conditional mgf of X conditioned on m is almost surely bounded by exp (c * t^2 / 2) for some constant c.
As in other parts of Mathlib's probability library (notably the independence and conditional independence definitions), we express both sub-Gaussian and conditionally sub-Gaussian properties as special cases of a notion of sub-Gaussianity with respect to a kernel and a measure.
Main definitions #
- Kernel.HasSubgaussianMGF: a random variable X has a sub-Gaussian moment generating function with parameter c with respect to a kernel κ and a measure ν if for ν-almost all ω', for all t : ℝ, the moment generating function of X with respect to κ ω' is bounded by exp (c * t ^ 2 / 2).
- HasCondSubgaussianMGF: a random variable X has a conditionally sub-Gaussian moment generating function with parameter c with respect to a sigma-algebra m and a measure μ if for all t : ℝ, exp (t * X) is μ-integrable and the moment generating function of X conditioned on m is almost surely bounded by exp (c * t ^ 2 / 2) for all t : ℝ. The actual definition uses Kernel.HasSubgaussianMGF: HasCondSubgaussianMGF is defined as sub-Gaussian with respect to the conditional expectation kernel for m and the restriction of μ to the sigma-algebra m.
- HasSubgaussianMGF: a random variable X has a sub-Gaussian moment generating function with parameter c with respect to a measure μ if for all t : ℝ, exp (t * X) is μ-integrable and the moment generating function of X is bounded by exp (c * t ^ 2 / 2) for all t : ℝ. This is equivalent to Kernel.HasSubgaussianMGF with a constant kernel; see HasSubgaussianMGF_iff_kernel and the sketch below.
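A minimal sketch of that equivalence, in the context of this file (open MeasureTheory ProbabilityTheory); the exact form of the Lean statement may differ slightly:

-- hypothetical usage sketch of HasSubgaussianMGF_iff_kernel
example {Ω : Type*} [MeasurableSpace Ω] (X : Ω → ℝ) (c : ℝ≥0) (μ : Measure Ω) :
    HasSubgaussianMGF X c μ ↔
      Kernel.HasSubgaussianMGF X c (Kernel.const Unit μ) (Measure.dirac ()) :=
  HasSubgaussianMGF_iff_kernel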
Main statements #
- measure_sum_ge_le_of_iIndepFun: Hoeffding's inequality for sums of independent sub-Gaussian random variables.
- measure_sum_ge_le_of_HasCondSubgaussianMGF: the Azuma-Hoeffding inequality for sub-Gaussian random variables.
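For orientation, both results give bounds of the same shape as the single-variable Chernoff bound: if the summands X i are independent (respectively, conditionally sub-Gaussian with respect to a filtration) with parameters c i, then for ε ≥ 0 one obtains, up to the exact hypotheses of the Lean statements,
ℙ(∑ i ∈ range n, X i ≥ ε) ≤ exp (-ε ^ 2 / (2 * ∑ i ∈ range n, c i)).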
Implementation notes #
Definition of Kernel.HasSubgaussianMGF #
The definition of sub-Gaussian with respect to a kernel and a measure is the following:
structure Kernel.HasSubgaussianMGF (X : Ω → ℝ) (c : ℝ≥0)
    (κ : Kernel Ω' Ω) (ν : Measure Ω' := by volume_tac) : Prop where
  integrable_exp_mul : ∀ t, Integrable (fun ω ↦ exp (t * X ω)) (κ ∘ₘ ν)
  mgf_le : ∀ᵐ ω' ∂ν, ∀ t, mgf X (κ ω') t ≤ exp (c * t ^ 2 / 2)
An interesting point is that the integrability condition is not integrability of exp (t * X) with respect to κ ω' for ν-almost all ω', but integrability with respect to κ ∘ₘ ν. This is a stronger condition; the weaker one would not allow us to prove interesting results about the sum of two sub-Gaussian random variables.
For the conditional case, that integrability condition reduces to integrability of exp (t * X) with respect to μ.
Definition of HasCondSubgaussianMGF #
We define HasCondSubgaussianMGF as a special case of Kernel.HasSubgaussianMGF with the conditional expectation kernel for m, condExpKernel μ m, and the restriction of μ to m, μ.trim hm (where hm states that m is a sub-sigma-algebra).
Note that condExpKernel μ m ∘ₘ μ.trim hm = μ. The definition is equivalent to the two conditions
- for all t, exp (t * X) is μ-integrable,
- for μ.trim hm-almost all ω, for all t, the mgf with respect to the conditional distribution condExpKernel μ m ω is bounded by exp (c * t ^ 2 / 2).
For any t, we can write the mgf of X with respect to the conditional expectation kernel as a conditional expectation, (μ.trim hm)-almost surely:
mgf X (condExpKernel μ m ·) t =ᵐ[μ.trim hm] μ[fun ω' ↦ exp (t * X ω') | m].
References #
- [Ver18] R. Vershynin, High-Dimensional Probability: An Introduction with Applications in Data Science. Cambridge University Press, 2018.
Sub-Gaussian with respect to a kernel and a measure #
A random variable X has a sub-Gaussian moment generating function with parameter c with respect to a kernel κ and a measure ν if for ν-almost all ω', for all t : ℝ, the moment generating function of X with respect to κ ω' is bounded by exp (c * t ^ 2 / 2).
This implies in particular that X has expectation 0.
- integrable_exp_mul (t : ℝ) : MeasureTheory.Integrable (fun (ω : Ω) => Real.exp (t * X ω)) (MeasureTheory.Measure.bind ν ⇑κ)
- mgf_le : ∀ᵐ ω' ∂ν, ∀ t, ProbabilityTheory.mgf X (κ ω') t ≤ Real.exp (c * t ^ 2 / 2)
Chernoff bound on the right tail of a sub-Gaussian random variable.
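In the notation of the definition above, the bound has the following shape (the precise Lean statement should be consulted for the exact hypotheses and coercions): for ε ≥ 0, for ν-almost all ω',
(κ ω') {ω | X ω ≥ ε} ≤ exp (-ε ^ 2 / (2 * c)).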
For ν : Measure Ω', κ : Kernel Ω' Ω and η : Kernel (Ω' × Ω) Ω'', if a random variable X : Ω → ℝ has a sub-Gaussian mgf with respect to κ and ν and another random variable Y : Ω'' → ℝ has a sub-Gaussian mgf with respect to η and ν ⊗ₘ κ : Measure (Ω' × Ω), then X + Y (a random variable on the measurable space Ω × Ω'') has a sub-Gaussian mgf with respect to κ ⊗ₖ η : Kernel Ω' (Ω × Ω'') and ν.
For ν : Measure Ω', κ : Kernel Ω' Ω and η : Kernel Ω Ω'', if a random variable X : Ω → ℝ has a sub-Gaussian mgf with respect to κ and ν and another random variable Y : Ω'' → ℝ has a sub-Gaussian mgf with respect to η and κ ∘ₘ ν : Measure Ω, then X + Y (a random variable on the measurable space Ω × Ω'') has a sub-Gaussian mgf with respect to κ ⊗ₖ prodMkLeft Ω' η : Kernel Ω' (Ω × Ω'') and ν.
Conditionally sub-Gaussian moment generating function #
A random variable X has a conditionally sub-Gaussian moment generating function with parameter c with respect to a sigma-algebra m and a measure μ if for all t : ℝ, exp (t * X) is μ-integrable and the moment generating function of X conditioned on m is almost surely bounded by exp (c * t ^ 2 / 2) for all t : ℝ.
This implies in particular that X has expectation 0.
The actual definition uses Kernel.HasSubgaussianMGF: HasCondSubgaussianMGF is defined as sub-Gaussian with respect to the conditional expectation kernel for m and the restriction of μ to the sigma-algebra m.
Equations
- ProbabilityTheory.HasCondSubgaussianMGF m hm X c μ = ProbabilityTheory.Kernel.HasSubgaussianMGF X c (ProbabilityTheory.condExpKernel μ m) (μ.trim hm)
Sub-Gaussian moment generating function #
A random variable X has a sub-Gaussian moment generating function with parameter c with respect to a measure μ if for all t : ℝ, exp (t * X) is μ-integrable and the moment generating function of X is bounded by exp (c * t ^ 2 / 2) for all t : ℝ.
This implies in particular that X has expectation 0.
This is equivalent to Kernel.HasSubgaussianMGF X c (Kernel.const Unit μ) (Measure.dirac ()), as proved in HasSubgaussianMGF_iff_kernel.
Properties about sub-Gaussian moment generating functions should be proved first for Kernel.HasSubgaussianMGF when possible.
- integrable_exp_mul (t : ℝ) : MeasureTheory.Integrable (fun (ω : Ω) => Real.exp (t * X ω)) μ
- mgf_le (t : ℝ) : ProbabilityTheory.mgf X μ t ≤ Real.exp (c * t ^ 2 / 2)
Chernoff bound on the right tail of a sub-Gaussian random variable.
Hoeffding inequality for sub-Gaussian random variables.
Hoeffding inequality for sub-Gaussian random variables.
If X is sub-Gaussian with parameter cX with respect to the restriction of μ to a sub-sigma-algebra m and Y is conditionally sub-Gaussian with parameter cY with respect to m and μ, then X + Y is sub-Gaussian with parameter cX + cY with respect to μ.
HasSubgaussianMGF X cX (μ.trim hm) can be obtained from HasSubgaussianMGF X cX μ if X is m-measurable. See HasSubgaussianMGF.trim.
Let Y be a random process adapted to a filtration ℱ, such that for all i : ℕ, Y i is conditionally sub-Gaussian with parameter cY i with respect to ℱ (i - 1). In particular, n ↦ ∑ i ∈ range n, Y i is a martingale.
Then the sum ∑ i ∈ range n, Y i is sub-Gaussian with parameter ∑ i ∈ range n, cY i.
Azuma-Hoeffding inequality for sub-Gaussian random variables.
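In the notation of the previous result, the bound has the familiar Chernoff shape (up to the exact hypotheses of the Lean statement): for ε ≥ 0,
ℙ(∑ i ∈ range n, Y i ≥ ε) ≤ exp (-ε ^ 2 / (2 * ∑ i ∈ range n, cY i)).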