Zulip Chat Archive

Stream: general

Topic: Definition of variance


Jason KY. (Sep 28 2022 at 10:12):

Currently, the variance is defined as the integral š”¼[(X - š”¼[X])^2] so in particular the variance of X is defined to be 0 if X is not LĀ². I think it might be better to define variance as a lintegral instead since then the variance of X would be infinity if X is not LĀ². If this were the case, Chebyshev's inequality would remain to hold for LĀ¹ functions .

I came across this while looking at the proof of the WLLN which should hold for LĀ¹ functions but as a result of the above, I can't use the current Chebyshev's inequality directly for it.

Floris van Doorn (Sep 28 2022 at 10:20):

It might be annoying to have variance take values in ennreal...
My thought would be: try to change it to be ennreal valued, and if it turns out to make certain arguments harder, then we should consider having both definitions side-by-side.
But that said, I've hardly used the probability theory part of mathlib, so let's also wait for others to chime in.

Jason KY. (Sep 28 2022 at 10:38):

For Chebyshev's inequality, the RHS has aennreal.of_realso at least for Chebyshev's inequality it seems natural to define variance to take value in ennreal. Though I can certainly imagine cases where ennreal is annoying...

Sebastien Gouezel (Sep 28 2022 at 11:47):

In probability theory, one uses the variance all the time as a real number (and one only uses it for functions in L^2), so I think it is important that the main definition is a real number. If you find uses for the ennreal-valued version, you can introduce it as evariance or something, of course!

Jason KY. (Sep 28 2022 at 12:42):

Would it be reasonable to redefine variance as the ennreal.to_real of someevariance defined using a lintegral?

Sebastien Gouezel (Sep 28 2022 at 13:09):

If most useful lemmas remain true, it shouldn't do any harm!


Last updated: Dec 20 2023 at 11:08 UTC