Basic properties of the risk of an estimator #

Main statements #

iSup_bayesRisk_le_minimaxRisk: the maximal Bayes risk is less than or equal to the minimax risk.
bayesRisk_le_bayesRisk_comp: data-processing inequality for the Bayes risk with respect to a prior: if we compose the data generating kernel P with a Markov kernel, then the Bayes risk increases.
bayesRisk_le_iInf: for P a Markov kernel, the Bayes risk is less than ⨅ y, ∫⁻ θ, ℓ θ y ∂π.

In several cases, there is no information in the data about the parameter and the Bayes risk takes its maximal value.

bayesRisk_const: if the data generating kernel is constant, then the Bayes risk is equal to ⨅ y, ∫⁻ θ, ℓ θ y ∂π.
bayesRisk_of_subsingleton: if the observation space is a subsingleton, then the Bayes risk is equal to ⨅ y, ∫⁻ θ, ℓ θ y ∂π.

TODO #

In many cases, the maximal Bayes risk and the minimax risk are equal (by a so-called minimax theorem).

theorem ProbabilityTheory.avgRisk_le_iSup_risk {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} {mΘ : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} (ℓ : Θ → 𝓨 → ENNReal) (P : Kernel Θ 𝓧) (κ : Kernel 𝓧 𝓨) (π : MeasureTheory.Measure Θ) [MeasureTheory.IsProbabilityMeasure π] :

avgRisk ℓ P κ π ≤ ⨆ (θ : Θ), ∫⁻ (y : 𝓨), ℓ θ y ∂(κ.comp P) θ

source

theorem ProbabilityTheory.bayesRisk_le_avgRisk {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} {mΘ : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} (ℓ : Θ → 𝓨 → ENNReal) (P : Kernel Θ 𝓧) (κ : Kernel 𝓧 𝓨) (π : MeasureTheory.Measure Θ) [hκ : IsMarkovKernel κ] :

bayesRisk ℓ P π ≤ avgRisk ℓ P κ π

source

theorem ProbabilityTheory.bayesRisk_le_minimaxRisk {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} {mΘ : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} (ℓ : Θ → 𝓨 → ENNReal) (P : Kernel Θ 𝓧) (π : MeasureTheory.Measure Θ) [MeasureTheory.IsProbabilityMeasure π] :

bayesRisk ℓ P π ≤ minimaxRisk ℓ P

source

theorem ProbabilityTheory.iSup_bayesRisk_le_minimaxRisk {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} {mΘ : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} (ℓ : Θ → 𝓨 → ENNReal) (P : Kernel Θ 𝓧) :

⨆ (π : MeasureTheory.Measure Θ), ⨆ (_ : MeasureTheory.IsProbabilityMeasure π), bayesRisk ℓ P π ≤ minimaxRisk ℓ P

The maximal Bayes risk is less than or equal to the minimax risk.

source

theorem ProbabilityTheory.avgRisk_const_left {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} {mΘ : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} (ℓ : Θ → 𝓨 → ENNReal) (μ : MeasureTheory.Measure 𝓧) (κ : Kernel 𝓧 𝓨) (π : MeasureTheory.Measure Θ) :

avgRisk ℓ (Kernel.const Θ μ) κ π = ∫⁻ (θ : Θ), ∫⁻ (y : 𝓨), ℓ θ y ∂μ.bind ⇑κ ∂π

See avgRisk_const_left' for a similar result with integrals swapped.

source

theorem ProbabilityTheory.avgRisk_const_left' {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} {mΘ : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} {ℓ : Θ → 𝓨 → ENNReal} (hl : Measurable (Function.uncurry ℓ)) (μ : MeasureTheory.Measure 𝓧) [MeasureTheory.SFinite μ] (κ : Kernel 𝓧 𝓨) [IsSFiniteKernel κ] (π : MeasureTheory.Measure Θ) [MeasureTheory.SFinite π] :

avgRisk ℓ (Kernel.const Θ μ) κ π = ∫⁻ (y : 𝓨), ∫⁻ (θ : Θ), ℓ θ y ∂π ∂μ.bind ⇑κ

See avgRisk_const_left for a similar result with integrals swapped.

source

theorem ProbabilityTheory.avgRisk_const_right' {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} {mΘ : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} (ℓ : Θ → 𝓨 → ENNReal) (P : Kernel Θ 𝓧) (ν : MeasureTheory.Measure 𝓨) (π : MeasureTheory.Measure Θ) :

avgRisk ℓ P (Kernel.const 𝓧 ν) π = ∫⁻ (θ : Θ), (P θ) Set.univ * ∫⁻ (y : 𝓨), ℓ θ y ∂ν ∂π

See avgRisk_const_right for a simpler result when P is a Markov kernel.

source

theorem ProbabilityTheory.avgRisk_const_right {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} {mΘ : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} (ℓ : Θ → 𝓨 → ENNReal) (P : Kernel Θ 𝓧) [IsMarkovKernel P] (ν : MeasureTheory.Measure 𝓨) (π : MeasureTheory.Measure Θ) :

avgRisk ℓ P (Kernel.const 𝓧 ν) π = ∫⁻ (θ : Θ), ∫⁻ (y : 𝓨), ℓ θ y ∂ν ∂π

See avgRisk_const_right' for a similar result when P is not a Markov kernel.

source

theorem ProbabilityTheory.bayesRisk_le_iInf' {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} {mΘ : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} {ℓ : Θ → 𝓨 → ENNReal} (hl : Measurable (Function.uncurry ℓ)) (P : Kernel Θ 𝓧) (π : MeasureTheory.Measure Θ) :

bayesRisk ℓ P π ≤ ⨅ (y : 𝓨), ∫⁻ (θ : Θ), ℓ θ y * (P θ) Set.univ ∂π

See bayesRisk_le_iInf for a simpler result when P is a Markov kernel.

source

theorem ProbabilityTheory.bayesRisk_le_iInf {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} {mΘ : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} {ℓ : Θ → 𝓨 → ENNReal} (hl : Measurable (Function.uncurry ℓ)) (P : Kernel Θ 𝓧) [IsMarkovKernel P] (π : MeasureTheory.Measure Θ) :

bayesRisk ℓ P π ≤ ⨅ (y : 𝓨), ∫⁻ (θ : Θ), ℓ θ y ∂π

See bayesRisk_le_iInf' for a similar result when P is not a Markov kernel.

source

theorem ProbabilityTheory.bayesRisk_const' {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} {mΘ : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} {ℓ : Θ → 𝓨 → ENNReal} (hl : Measurable (Function.uncurry ℓ)) (μ : MeasureTheory.Measure 𝓧) [MeasureTheory.SFinite μ] (π : MeasureTheory.Measure Θ) [MeasureTheory.SFinite π] (hl_pos : μ Set.univ = ⊤ → ⨅ (y : 𝓨), ∫⁻ (θ : Θ), ℓ θ y ∂π = 0 → ∃ (y : 𝓨), ∫⁻ (θ : Θ), ℓ θ y ∂π = 0) (h_zero : μ = 0 → Nonempty 𝓨) :

bayesRisk ℓ (Kernel.const Θ μ) π = ⨅ (y : 𝓨), ∫⁻ (θ : Θ), ℓ θ y * μ Set.univ ∂π

source

theorem ProbabilityTheory.bayesRisk_const_of_neZero {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} {mΘ : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} {ℓ : Θ → 𝓨 → ENNReal} (hl : Measurable (Function.uncurry ℓ)) (μ : MeasureTheory.Measure 𝓧) [NeZero μ] [MeasureTheory.IsFiniteMeasure μ] (π : MeasureTheory.Measure Θ) [MeasureTheory.SFinite π] :

bayesRisk ℓ (Kernel.const Θ μ) π = ⨅ (y : 𝓨), ∫⁻ (θ : Θ), ℓ θ y * μ Set.univ ∂π

source

theorem ProbabilityTheory.bayesRisk_const_of_nonempty {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} {mΘ : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} {ℓ : Θ → 𝓨 → ENNReal} [Nonempty 𝓨] (hl : Measurable (Function.uncurry ℓ)) (μ : MeasureTheory.Measure 𝓧) [MeasureTheory.IsFiniteMeasure μ] (π : MeasureTheory.Measure Θ) [MeasureTheory.SFinite π] :

bayesRisk ℓ (Kernel.const Θ μ) π = ⨅ (y : 𝓨), ∫⁻ (θ : Θ), ℓ θ y * μ Set.univ ∂π

source

theorem ProbabilityTheory.bayesRisk_const {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} {mΘ : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} {ℓ : Θ → 𝓨 → ENNReal} (hl : Measurable (Function.uncurry ℓ)) (μ : MeasureTheory.Measure 𝓧) [MeasureTheory.IsProbabilityMeasure μ] (π : MeasureTheory.Measure Θ) [MeasureTheory.SFinite π] :

bayesRisk ℓ (Kernel.const Θ μ) π = ⨅ (y : 𝓨), ∫⁻ (θ : Θ), ℓ θ y ∂π

source

theorem ProbabilityTheory.avgRisk_le_mul' {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} {mΘ : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} {ℓ : Θ → 𝓨 → ENNReal} (P : Kernel Θ 𝓧) (κ : Kernel 𝓧 𝓨) (π : MeasureTheory.Measure Θ) {C : NNReal} (hℓC : ∀ (θ : Θ) (y : 𝓨), ℓ θ y ≤ ↑C) :

avgRisk ℓ P κ π ≤ ↑C * κ.bound * P.bound * π Set.univ

See avgRisk_le_mul for the usual case in which π is a probability measure and the kernels are Markov.

source

theorem ProbabilityTheory.avgRisk_le_mul {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} {mΘ : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} {ℓ : Θ → 𝓨 → ENNReal} (P : Kernel Θ 𝓧) [IsMarkovKernel P] (κ : Kernel 𝓧 𝓨) [IsMarkovKernel κ] (π : MeasureTheory.Measure Θ) [MeasureTheory.IsProbabilityMeasure π] {C : NNReal} (hℓC : ∀ (θ : Θ) (y : 𝓨), ℓ θ y ≤ ↑C) :

avgRisk ℓ P κ π ≤ ↑C

source

theorem ProbabilityTheory.bayesRisk_le_mul' {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} {mΘ : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} {ℓ : Θ → 𝓨 → ENNReal} [h𝓨 : Nonempty 𝓨] (P : Kernel Θ 𝓧) (π : MeasureTheory.Measure Θ) {C : NNReal} (hℓC : ∀ (θ : Θ) (y : 𝓨), ℓ θ y ≤ ↑C) :

bayesRisk ℓ P π ≤ ↑C * P.bound * π Set.univ

For a bounded loss, the Bayes risk with respect to a prior is bounded by a constant. See bayesRisk_le_mul for the usual cases where all measures are probability measures.

source

theorem ProbabilityTheory.bayesRisk_le_mul {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} {mΘ : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} {ℓ : Θ → 𝓨 → ENNReal} [Nonempty 𝓨] (P : Kernel Θ 𝓧) [IsMarkovKernel P] (π : MeasureTheory.Measure Θ) [MeasureTheory.IsProbabilityMeasure π] {C : NNReal} (hℓC : ∀ (θ : Θ) (y : 𝓨), ℓ θ y ≤ ↑C) :

bayesRisk ℓ P π ≤ ↑C

For a bounded loss, the Bayes risk with respect to a prior is bounded by a constant.

source

theorem ProbabilityTheory.bayesRisk_lt_top {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} {mΘ : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} {ℓ : Θ → 𝓨 → ENNReal} [Nonempty 𝓨] (P : Kernel Θ 𝓧) [IsFiniteKernel P] (π : MeasureTheory.Measure Θ) [MeasureTheory.IsFiniteMeasure π] {C : NNReal} (hℓC : ∀ (θ : Θ) (y : 𝓨), ℓ θ y ≤ ↑C) :

bayesRisk ℓ P π < ⊤

For a bounded loss, the Bayes risk with respect to a prior is finite.

source

theorem ProbabilityTheory.bayesRisk_discard {Θ : Type u_1} {𝓨 : Type u_4} {mΘ : MeasurableSpace Θ} {m𝓨 : MeasurableSpace 𝓨} {ℓ : Θ → 𝓨 → ENNReal} (hl : Measurable (Function.uncurry ℓ)) (π : MeasureTheory.Measure Θ) [MeasureTheory.SFinite π] :

bayesRisk ℓ (Kernel.discard Θ) π = ⨅ (y : 𝓨), ∫⁻ (θ : Θ), ℓ θ y ∂π

source

theorem ProbabilityTheory.bayesRisk_eq_iInf_measure_of_subsingleton {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} {mΘ : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} {ℓ : Θ → 𝓨 → ENNReal} {P : Kernel Θ 𝓧} {π : MeasureTheory.Measure Θ} [Subsingleton 𝓧] [Nonempty 𝓨] :

bayesRisk ℓ P π = ⨅ (μ : MeasureTheory.Measure 𝓨), ⨅ (_ : MeasureTheory.IsProbabilityMeasure μ), avgRisk ℓ P (Kernel.const 𝓧 μ) π

source

theorem ProbabilityTheory.bayesRisk_of_subsingleton' {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} {mΘ : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} {ℓ : Θ → 𝓨 → ENNReal} {P : Kernel Θ 𝓧} {π : MeasureTheory.Measure Θ} [Subsingleton 𝓧] [Nonempty 𝓨] [MeasureTheory.SFinite π] (hl : Measurable (Function.uncurry ℓ)) :

bayesRisk ℓ P π = ⨅ (y : 𝓨), ∫⁻ (θ : Θ), ℓ θ y * (P θ) Set.univ ∂π

source

theorem ProbabilityTheory.bayesRisk_of_subsingleton {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} {mΘ : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} {ℓ : Θ → 𝓨 → ENNReal} {P : Kernel Θ 𝓧} {π : MeasureTheory.Measure Θ} [Subsingleton 𝓧] [Nonempty 𝓨] [IsMarkovKernel P] [MeasureTheory.SFinite π] (hl : Measurable (Function.uncurry ℓ)) :

bayesRisk ℓ P π = ⨅ (y : 𝓨), ∫⁻ (θ : Θ), ℓ θ y ∂π

source

theorem ProbabilityTheory.bayesRisk_le_bayesRisk_comp {Θ : Type u_1} {𝓧 : Type u_2} {𝓧' : Type u_3} {𝓨 : Type u_4} {mΘ : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓧' : MeasurableSpace 𝓧'} {m𝓨 : MeasurableSpace 𝓨} (ℓ : Θ → 𝓨 → ENNReal) (P : Kernel Θ 𝓧) (π : MeasureTheory.Measure Θ) (η : Kernel 𝓧 𝓧') [IsMarkovKernel η] :

bayesRisk ℓ P π ≤ bayesRisk ℓ (η.comp P) π

Data processing inequality for the Bayes risk with respect to a prior: composition of the data generating kernel by a Markov kernel increases the risk.

source

theorem ProbabilityTheory.bayesRisk_le_bayesRisk_map {Θ : Type u_1} {𝓧 : Type u_2} {𝓧' : Type u_3} {𝓨 : Type u_4} {mΘ : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓧' : MeasurableSpace 𝓧'} {m𝓨 : MeasurableSpace 𝓨} (ℓ : Θ → 𝓨 → ENNReal) (P : Kernel Θ 𝓧) (π : MeasureTheory.Measure Θ) {f : 𝓧 → 𝓧'} (hf : Measurable f) :

bayesRisk ℓ P π ≤ bayesRisk ℓ (P.map f) π

Data processing inequality for the Bayes risk with respect to a prior: taking the map of the data generating kernel by a function increases the risk.

source

theorem ProbabilityTheory.bayesRisk_compProd_le_bayesRisk {Θ : Type u_1} {𝓧 : Type u_2} {𝓧' : Type u_3} {𝓨 : Type u_4} {mΘ : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓧' : MeasurableSpace 𝓧'} {m𝓨 : MeasurableSpace 𝓨} (ℓ : Θ → 𝓨 → ENNReal) (P : Kernel Θ 𝓧) [IsSFiniteKernel P] (π : MeasureTheory.Measure Θ) (η : Kernel (Θ × 𝓧) 𝓧') [IsMarkovKernel η] :

bayesRisk ℓ (P.compProd η) π ≤ bayesRisk ℓ P π

Documentation

Mathlib.Probability.Decision.Risk.Basic

Basic properties of the risk of an estimator #

Main statements #

TODO #