Zulip Chat Archive

Stream: Is there code for X?

Topic: total variation distance between two PMFs

Yannick Seurin (Jan 05 2026 at 18:06):

What would be the standard way to define the total variation distance between two PMFs p : PMF α and q : PMF α?
According to this, it can be defined using the total variation of signed measures, but I'm not familiar enough with measure theory to figure it out how to turn p.toMeasure - q.toMeasure into a signed measure...

Etienne Marion (Jan 05 2026 at 18:32):

You can use MeasureTheory.Measure.toSignedMeasure

Yannick Seurin (Jan 05 2026 at 21:05):

Thanks! I searched for toSignedMeasure but it actually appears pretty low in the search results...
So totalVariation (p.toMeasure.toSignedMeasure - q.toMeasure.toSignedMeasure) gives us a measure on α, but how do we translate it to the TV distance?

I also found this blog post which suggests another definition based on the L1 distance between the Radon-Nikodym derivatives of p.toMeasure and q.toMeasure. But again, I'm not qualified enough to express it in Lean...

Yongxi Lin (Aaron) (Jan 05 2026 at 21:56):

You get the the total variation distance by evaluating totalVariation (p.toMeasure.toSignedMeasure - q.toMeasure.toSignedMeasure) on the univ.

Yannick Seurin (Jan 06 2026 at 12:11):

Thanks! Would it make sense to have it (and other distances) defined in Mathlib? For someone like me who's not versed in measure theory but wants to work on "basic" probability results with PMFs, it would be quite handy :)

Etienne Marion (Jan 06 2026 at 12:24):

The first step would be to add the total variation distance of two arbitrary measures.

Kexing Ying (Jan 06 2026 at 12:41):

Note that there's #26156 which defines total variation for any vector measures

Yannick Seurin (Jan 06 2026 at 12:46):

Like this?

import Mathlib

open MeasureTheory.Measure Set

noncomputable def MeasureTheory.Measure.totalVariationDistance
    {α : Type u} [MeasurableSpace α]
    (μ ν : Measure α) [hμ : IsFiniteMeasure μ] [hν : IsFiniteMeasure ν ] :=
  (μ.toSignedMeasure - ν.toSignedMeasure) univ

noncomputable def PMF.totalVariationDistance
    {α : Type u} [MeasurableSpace α]
    (p q : PMF α) :=
  MeasureTheory.Measure.totalVariationDistance p.toMeasure q.toMeasure

Would that make a good start to then prove equality with other definitions, such as the supremum one, the L1 distance between the Radon-Nikodym derivatives of μ and ν, or the PMF tsum form for Fintype α?

Rémy Degenne (Jan 06 2026 at 12:52):

In #27579 I wrote a definition of a TV distance between two Measure. The (somewhat obscure) definition is line 117 of this file: https://github.com/leanprover-community/mathlib4/pull/27579/changes#diff-73a759938aafd3b48c0176b0baaeec1ef494f0feeb6b08bdcc3d6bd81d16ed84R117 . The lemmas below show equality to several more familiar expressions.

Rémy Degenne (Jan 06 2026 at 12:57):

This is a big PR that I did not work on much recently, so unless I start being more active with the PR that definition is sadly not going to end up in Mathlib soon.

Yannick Seurin (Jan 06 2026 at 13:00):

Oh, I see, thanks. Out of curiosity, why choose this definition rather than the one based on signed measures, which seems standard as far as I can tell?

Rémy Degenne (Jan 06 2026 at 13:03):

I was doing work that used the DeGroot statistical information, so I coded that. And that's a family of divergences that includes TV, so I thought I might as well introduce that important special case.

Rémy Degenne (Jan 06 2026 at 13:16):

And to be clear: I'm not saying anybody should wait for my PR to land (which might be long). If you want to PR another, more direct definition, do it.

Yannick Seurin (Jan 06 2026 at 13:52):

I'll see what I can come with, might be a good way to get more familiar with measure theory...

Bolton Bailey (Jan 06 2026 at 17:59):

If you aren't inclined to learn measure theory for the sake of this, then I don't see anything wrong with providing the definition in terms of discrete math. For example, Arklib has this definition.

Bolton Bailey (Jan 06 2026 at 18:14):

In fact, I don't think we need either the Fintype or MeasurableSpace typeclass arguments to provide this definition

See this prior discussion here: #Is there code for X? > Statistical distance for `PMF`s? @ 💬

Bolton Bailey (Jan 06 2026 at 18:31):

(Unfortunately, I don't think I ever tried to clean up or PR the code in that thread, probably because I got sidetracked by the return type issue.)

Bolton Bailey (Jan 06 2026 at 19:18):

Courtesy of claude, here is a cleaned up and sorry free version of the definition from that thread

import Mathlib.Probability.ProbabilityMassFunction.Basic

theorem PMF.summable_toReal_abs {α : Type*} (p : PMF α) :
    Summable fun i => |(p i).toReal| := by
  simp only [abs_of_nonneg ENNReal.toReal_nonneg]
  exact ENNReal.summable_toReal p.tsum_coe_ne_top

theorem PMF.dist_summable {α : Type*} (X Y : PMF α) :
    Summable fun i => dist (X i).toReal (Y i).toReal := by
  apply Summable.of_nonneg_of_le (f := fun i ↦ |(X i).toReal| + |(Y i).toReal|)
    (fun i ↦ abs_nonneg _) _
  · apply Summable.add
    · apply PMF.summable_toReal_abs
    · apply PMF.summable_toReal_abs
  · exact fun b ↦ abs_sub (X b).toReal (Y b).toReal

noncomputable instance StatisticalDistance {α} : PseudoMetricSpace (PMF α) where
  dist X Y := (1 / 2) * ∑' i, dist ((X i).toReal) ((Y i).toReal)
  dist_self X := by
    simp
  dist_comm X Y := by
    congr
    ext i
    exact PseudoMetricSpace.dist_comm (X i).toReal (Y i).toReal
  dist_triangle X Y Z := by
    rw [<-mul_add]
    rw [<-Summable.tsum_add (PMF.dist_summable X Y) (PMF.dist_summable Y Z)]
    simp only [one_div, inv_pos, Nat.ofNat_pos, mul_le_mul_iff_right₀]
    exact
      Summable.tsum_mono (PMF.dist_summable X Z)
        (Summable.add (PMF.dist_summable X Y) (PMF.dist_summable Y Z))
        fun i ↦ PseudoMetricSpace.dist_triangle (X i).toReal (Y i).toReal (Z i).toReal

Yannick Seurin (Jan 06 2026 at 23:12):

I don't see anything wrong with providing the definition in terms of discrete math

Indeed, and for Fintype α it's pretty straightforward as done in Arklib. For arbitrary α, I was tempted to use the supremum definition and let

import Mathlib

def TVD (p q : PMF α) : ℝ :=
  ⨆ (s : Set α), |(∑' (a : s), p a).toReal - (∑' (a : s), q a).toReal|

but I wasn't sure this was correct, hence my question about a measure-based definition that follows the textbooks.

See this prior discussion here: #Is there code for X? > Statistical distance for PMFs? @ 💬

Nice! It would be interesting to check whether these definitions coincide...

Bolton Bailey (Jan 07 2026 at 19:43):

Yannick Seurin said:

Nice! It would be interesting to check whether these definitions coincide...

I think so! You could also make a few other variants (the absolute value sign shouldn't be necessary in what you've written, or you could just choose s to be the set for which p > q). Whatever we do, we should try to produce proofs of the equivalences to other forms of the definition, etc and encapsulate the definition itself.

I have been working on a PR here #33680, you can contribute to that or make a PR of your own, I would be happy to review it if so.

Last updated: Feb 28 2026 at 14:05 UTC