Documentation

Mathlib.Probability.Decision.Risk.Basic

Basic properties of the risk of an estimator #

Main statements #

In several cases, there is no information in the data about the parameter and the Bayes risk takes its maximal value.

TODO #

In many cases, the maximal Bayes risk and the minimax risk are equal (by a so-called minimax theorem).

theorem ProbabilityTheory.avgRisk_le_iSup_risk {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} { : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} ( : Θ𝓨ENNReal) (P : Kernel Θ 𝓧) (κ : Kernel 𝓧 𝓨) (π : MeasureTheory.Measure Θ) [MeasureTheory.IsProbabilityMeasure π] :
avgRisk P κ π ⨆ (θ : Θ), ∫⁻ (y : 𝓨), θ y (κ.comp P) θ
theorem ProbabilityTheory.bayesRisk_le_avgRisk {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} { : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} ( : Θ𝓨ENNReal) (P : Kernel Θ 𝓧) (κ : Kernel 𝓧 𝓨) (π : MeasureTheory.Measure Θ) [ : IsMarkovKernel κ] :
bayesRisk P π avgRisk P κ π
theorem ProbabilityTheory.bayesRisk_le_minimaxRisk {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} { : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} ( : Θ𝓨ENNReal) (P : Kernel Θ 𝓧) (π : MeasureTheory.Measure Θ) [MeasureTheory.IsProbabilityMeasure π] :
bayesRisk P π minimaxRisk P
theorem ProbabilityTheory.iSup_bayesRisk_le_minimaxRisk {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} { : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} ( : Θ𝓨ENNReal) (P : Kernel Θ 𝓧) :

The maximal Bayes risk is less than or equal to the minimax risk.

theorem ProbabilityTheory.avgRisk_const_left {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} { : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} ( : Θ𝓨ENNReal) (μ : MeasureTheory.Measure 𝓧) (κ : Kernel 𝓧 𝓨) (π : MeasureTheory.Measure Θ) :
avgRisk (Kernel.const Θ μ) κ π = ∫⁻ (θ : Θ), ∫⁻ (y : 𝓨), θ y μ.bind κ π

See avgRisk_const_left' for a similar result with integrals swapped.

theorem ProbabilityTheory.avgRisk_const_left' {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} { : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} { : Θ𝓨ENNReal} (hl : Measurable (Function.uncurry )) (μ : MeasureTheory.Measure 𝓧) [MeasureTheory.SFinite μ] (κ : Kernel 𝓧 𝓨) [IsSFiniteKernel κ] (π : MeasureTheory.Measure Θ) [MeasureTheory.SFinite π] :
avgRisk (Kernel.const Θ μ) κ π = ∫⁻ (y : 𝓨), ∫⁻ (θ : Θ), θ y π μ.bind κ

See avgRisk_const_left for a similar result with integrals swapped.

theorem ProbabilityTheory.avgRisk_const_right' {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} { : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} ( : Θ𝓨ENNReal) (P : Kernel Θ 𝓧) (ν : MeasureTheory.Measure 𝓨) (π : MeasureTheory.Measure Θ) :
avgRisk P (Kernel.const 𝓧 ν) π = ∫⁻ (θ : Θ), (P θ) Set.univ * ∫⁻ (y : 𝓨), θ y ν π

See avgRisk_const_right for a simpler result when P is a Markov kernel.

theorem ProbabilityTheory.avgRisk_const_right {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} { : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} ( : Θ𝓨ENNReal) (P : Kernel Θ 𝓧) [IsMarkovKernel P] (ν : MeasureTheory.Measure 𝓨) (π : MeasureTheory.Measure Θ) :
avgRisk P (Kernel.const 𝓧 ν) π = ∫⁻ (θ : Θ), ∫⁻ (y : 𝓨), θ y ν π

See avgRisk_const_right' for a similar result when P is not a Markov kernel.

theorem ProbabilityTheory.bayesRisk_le_iInf' {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} { : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} { : Θ𝓨ENNReal} (hl : Measurable (Function.uncurry )) (P : Kernel Θ 𝓧) (π : MeasureTheory.Measure Θ) :
bayesRisk P π ⨅ (y : 𝓨), ∫⁻ (θ : Θ), θ y * (P θ) Set.univ π

See bayesRisk_le_iInf for a simpler result when P is a Markov kernel.

theorem ProbabilityTheory.bayesRisk_le_iInf {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} { : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} { : Θ𝓨ENNReal} (hl : Measurable (Function.uncurry )) (P : Kernel Θ 𝓧) [IsMarkovKernel P] (π : MeasureTheory.Measure Θ) :
bayesRisk P π ⨅ (y : 𝓨), ∫⁻ (θ : Θ), θ y π

See bayesRisk_le_iInf' for a similar result when P is not a Markov kernel.

theorem ProbabilityTheory.bayesRisk_const' {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} { : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} { : Θ𝓨ENNReal} (hl : Measurable (Function.uncurry )) (μ : MeasureTheory.Measure 𝓧) [MeasureTheory.SFinite μ] (π : MeasureTheory.Measure Θ) [MeasureTheory.SFinite π] (hl_pos : μ Set.univ = ⨅ (y : 𝓨), ∫⁻ (θ : Θ), θ y π = 0∃ (y : 𝓨), ∫⁻ (θ : Θ), θ y π = 0) (h_zero : μ = 0Nonempty 𝓨) :
bayesRisk (Kernel.const Θ μ) π = ⨅ (y : 𝓨), ∫⁻ (θ : Θ), θ y * μ Set.univ π
theorem ProbabilityTheory.bayesRisk_const_of_neZero {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} { : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} { : Θ𝓨ENNReal} (hl : Measurable (Function.uncurry )) (μ : MeasureTheory.Measure 𝓧) [NeZero μ] [MeasureTheory.IsFiniteMeasure μ] (π : MeasureTheory.Measure Θ) [MeasureTheory.SFinite π] :
bayesRisk (Kernel.const Θ μ) π = ⨅ (y : 𝓨), ∫⁻ (θ : Θ), θ y * μ Set.univ π
theorem ProbabilityTheory.bayesRisk_const_of_nonempty {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} { : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} { : Θ𝓨ENNReal} [Nonempty 𝓨] (hl : Measurable (Function.uncurry )) (μ : MeasureTheory.Measure 𝓧) [MeasureTheory.IsFiniteMeasure μ] (π : MeasureTheory.Measure Θ) [MeasureTheory.SFinite π] :
bayesRisk (Kernel.const Θ μ) π = ⨅ (y : 𝓨), ∫⁻ (θ : Θ), θ y * μ Set.univ π
theorem ProbabilityTheory.bayesRisk_const {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} { : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} { : Θ𝓨ENNReal} (hl : Measurable (Function.uncurry )) (μ : MeasureTheory.Measure 𝓧) [MeasureTheory.IsProbabilityMeasure μ] (π : MeasureTheory.Measure Θ) [MeasureTheory.SFinite π] :
bayesRisk (Kernel.const Θ μ) π = ⨅ (y : 𝓨), ∫⁻ (θ : Θ), θ y π
theorem ProbabilityTheory.avgRisk_le_mul' {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} { : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} { : Θ𝓨ENNReal} (P : Kernel Θ 𝓧) (κ : Kernel 𝓧 𝓨) (π : MeasureTheory.Measure Θ) {C : NNReal} (hℓC : ∀ (θ : Θ) (y : 𝓨), θ y C) :
avgRisk P κ π C * κ.bound * P.bound * π Set.univ

See avgRisk_le_mul for the usual case in which π is a probability measure and the kernels are Markov.

theorem ProbabilityTheory.avgRisk_le_mul {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} { : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} { : Θ𝓨ENNReal} (P : Kernel Θ 𝓧) [IsMarkovKernel P] (κ : Kernel 𝓧 𝓨) [IsMarkovKernel κ] (π : MeasureTheory.Measure Θ) [MeasureTheory.IsProbabilityMeasure π] {C : NNReal} (hℓC : ∀ (θ : Θ) (y : 𝓨), θ y C) :
avgRisk P κ π C
theorem ProbabilityTheory.bayesRisk_le_mul' {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} { : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} { : Θ𝓨ENNReal} [h𝓨 : Nonempty 𝓨] (P : Kernel Θ 𝓧) (π : MeasureTheory.Measure Θ) {C : NNReal} (hℓC : ∀ (θ : Θ) (y : 𝓨), θ y C) :
bayesRisk P π C * P.bound * π Set.univ

For a bounded loss, the Bayes risk with respect to a prior is bounded by a constant. See bayesRisk_le_mul for the usual cases where all measures are probability measures.

theorem ProbabilityTheory.bayesRisk_le_mul {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} { : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} { : Θ𝓨ENNReal} [Nonempty 𝓨] (P : Kernel Θ 𝓧) [IsMarkovKernel P] (π : MeasureTheory.Measure Θ) [MeasureTheory.IsProbabilityMeasure π] {C : NNReal} (hℓC : ∀ (θ : Θ) (y : 𝓨), θ y C) :
bayesRisk P π C

For a bounded loss, the Bayes risk with respect to a prior is bounded by a constant.

theorem ProbabilityTheory.bayesRisk_lt_top {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} { : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} { : Θ𝓨ENNReal} [Nonempty 𝓨] (P : Kernel Θ 𝓧) [IsFiniteKernel P] (π : MeasureTheory.Measure Θ) [MeasureTheory.IsFiniteMeasure π] {C : NNReal} (hℓC : ∀ (θ : Θ) (y : 𝓨), θ y C) :
bayesRisk P π <

For a bounded loss, the Bayes risk with respect to a prior is finite.

theorem ProbabilityTheory.bayesRisk_discard {Θ : Type u_1} {𝓨 : Type u_4} { : MeasurableSpace Θ} {m𝓨 : MeasurableSpace 𝓨} { : Θ𝓨ENNReal} (hl : Measurable (Function.uncurry )) (π : MeasureTheory.Measure Θ) [MeasureTheory.SFinite π] :
bayesRisk (Kernel.discard Θ) π = ⨅ (y : 𝓨), ∫⁻ (θ : Θ), θ y π
theorem ProbabilityTheory.bayesRisk_eq_iInf_measure_of_subsingleton {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} { : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} { : Θ𝓨ENNReal} {P : Kernel Θ 𝓧} {π : MeasureTheory.Measure Θ} [Subsingleton 𝓧] [Nonempty 𝓨] :
bayesRisk P π = ⨅ (μ : MeasureTheory.Measure 𝓨), ⨅ (_ : MeasureTheory.IsProbabilityMeasure μ), avgRisk P (Kernel.const 𝓧 μ) π
theorem ProbabilityTheory.bayesRisk_of_subsingleton' {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} { : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} { : Θ𝓨ENNReal} {P : Kernel Θ 𝓧} {π : MeasureTheory.Measure Θ} [Subsingleton 𝓧] [Nonempty 𝓨] [MeasureTheory.SFinite π] (hl : Measurable (Function.uncurry )) :
bayesRisk P π = ⨅ (y : 𝓨), ∫⁻ (θ : Θ), θ y * (P θ) Set.univ π
theorem ProbabilityTheory.bayesRisk_of_subsingleton {Θ : Type u_1} {𝓧 : Type u_2} {𝓨 : Type u_4} { : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓨 : MeasurableSpace 𝓨} { : Θ𝓨ENNReal} {P : Kernel Θ 𝓧} {π : MeasureTheory.Measure Θ} [Subsingleton 𝓧] [Nonempty 𝓨] [IsMarkovKernel P] [MeasureTheory.SFinite π] (hl : Measurable (Function.uncurry )) :
bayesRisk P π = ⨅ (y : 𝓨), ∫⁻ (θ : Θ), θ y π
theorem ProbabilityTheory.bayesRisk_le_bayesRisk_comp {Θ : Type u_1} {𝓧 : Type u_2} {𝓧' : Type u_3} {𝓨 : Type u_4} { : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓧' : MeasurableSpace 𝓧'} {m𝓨 : MeasurableSpace 𝓨} ( : Θ𝓨ENNReal) (P : Kernel Θ 𝓧) (π : MeasureTheory.Measure Θ) (η : Kernel 𝓧 𝓧') [IsMarkovKernel η] :
bayesRisk P π bayesRisk (η.comp P) π

Data processing inequality for the Bayes risk with respect to a prior: composition of the data generating kernel by a Markov kernel increases the risk.

theorem ProbabilityTheory.bayesRisk_le_bayesRisk_map {Θ : Type u_1} {𝓧 : Type u_2} {𝓧' : Type u_3} {𝓨 : Type u_4} { : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓧' : MeasurableSpace 𝓧'} {m𝓨 : MeasurableSpace 𝓨} ( : Θ𝓨ENNReal) (P : Kernel Θ 𝓧) (π : MeasureTheory.Measure Θ) {f : 𝓧𝓧'} (hf : Measurable f) :
bayesRisk P π bayesRisk (P.map f) π

Data processing inequality for the Bayes risk with respect to a prior: taking the map of the data generating kernel by a function increases the risk.

theorem ProbabilityTheory.bayesRisk_compProd_le_bayesRisk {Θ : Type u_1} {𝓧 : Type u_2} {𝓧' : Type u_3} {𝓨 : Type u_4} { : MeasurableSpace Θ} {m𝓧 : MeasurableSpace 𝓧} {m𝓧' : MeasurableSpace 𝓧'} {m𝓨 : MeasurableSpace 𝓨} ( : Θ𝓨ENNReal) (P : Kernel Θ 𝓧) [IsSFiniteKernel P] (π : MeasureTheory.Measure Θ) (η : Kernel (Θ × 𝓧) 𝓧') [IsMarkovKernel η] :
bayesRisk (P.compProd η) π bayesRisk P π