Zulip Chat Archive

Stream: Is there code for X?

Topic: Probability Distributions


Aaron Bies (May 10 2023 at 13:57):

I had a look at the probability module in mathlib and it seems there are lots of fairly advanced statements about probability, but I can't find any concrete probability distributions (like uniform, normal, exponential, etc).

Are those distributions well hidden or not implemented? If it's the latter, I would like to work on that.

Johan Commelin (May 10 2023 at 13:59):

(cc @Peter Pfaffelhuber who is also interested in that area, and I think has distributions high on his wish list)

Yaël Dillies (May 10 2023 at 14:11):

I personally have the Erdös-Réniy distribution in http://github.com/YaelDillies/LeanCamCombi

Rémy Degenne (May 10 2023 at 14:14):

The answer is that they are mostly no implemented, although we have some results like the value of the Gaussian integral (but no Gaussian probability distribution). Any work towards getting those distributions is welcome!

I am currently PRing work that will give us the definition of the cdf of a real probability distribution as a by-product (#18834, needs to be split into smaller pieces).

Jason KY. (May 10 2023 at 14:14):

One way of expressing distribution is by using measure_theory.has_pdf, e.g. the uniform distribution can be written using measure_theory.pdf.is_uniform

Peter Pfaffelhuber (May 10 2023 at 15:22):

As a newbie to Lean, I started to define the binomial distribution (as an outer measure on \R) using pmf(=probability mass functions) from the measure-theory section. (I can share the code, if you are interested.) Hypergeometric distributions should work very similarly, geometric and Poisson are different, since they are not defined on a finset.
However, I am not sure if it is best to use pmf's. There is also measure.has_density -- or .has_pdf, as Rèmy said -- which is more general in my opinion. So, you could define a binomial distribution as a measure with density with respect to the counting measure on range (n-1), and a Poisson distribution with respect to the counting measure on \N.

Jason KY. (May 10 2023 at 15:38):

I think in general it is best to use pmf for discrete distributions (note that you can use pmf for countably infinite support since we are using tsumto define it) since you can get pdfs from pmfs for free after we've developed more API. Moreover, pmfs are easier to use since most measurability/integrablity problems won't apply.

Shreyas Srinivas (May 10 2023 at 16:15):

Further on this subject, do we have basic bounds like the union bound or concentration inequalities? For CS work, these inequalities are (part of) our bread and butter.

Rémy Degenne (May 10 2023 at 16:36):

Union bound : docs#measure_theory.measure_Union_le
For concentration inequalities, it depends on which inequality you want. Markov inequality: docs#measure_theory.mul_meas_ge_le_lintegral
I have an open PR with Hoeffding's inequality somewhere, but it is on pause because I want to have the tools to define conditionally sub-Gaussian random variables first

Yaël Dillies (May 10 2023 at 16:59):

Jason KY. said:

I think in general it is best to use pmf for discrete distributions (note that you can use pmf for countably infinite support since we are using tsumto define it) since you can get pdfs from pmfs for free after we've developed more API. Moreover, pmfs are easier to use since most measurability/integrablity problems won't apply.

Yes and no. The thing is that you can either define the distribution concretely and prove things about it, or you can define the predicate of a random variable being distributed according to the distribution and prove things about those. Your message applies to the first case, but Bhavik, you and I agreed before that the second approach was better because more flexible.

Yaël Dillies (May 10 2023 at 17:00):

I too need a lot of concentration inequalities for probabilistic combinatorics. Not sure how much overlap with CS there is.

Jason KY. (May 10 2023 at 17:05):

What Bhavik, you and I talked about before was about whats is the best way to say X is distributed according to a specific distribution. Here I am talking about whats the best way of setting up an abstract distribution.

Peter Pfaffelhuber (May 10 2023 at 18:52):

Quick question: measure_theory.has_pdfneeds three tings, ℙ, X, and μ as an input, but in the end, only μ and the push-forward of ℙ along X (i.e. a probability distribution on E) is used. Why doesn't the structure need only two things, μ and a probability distribution on E, as input? With such a definition, some X could be has_pdfif its push-forward has this property. Wouldn't this be even more flexible?

Shreyas Srinivas (May 10 2023 at 18:54):

Yaël Dillies said:

I too need a lot of concentration inequalities for probabilistic combinatorics. Not sure how much overlap with CS there is.

Chernoff, multiplicative chernoff, hoeffding, azuma, mcdiarmird (and if the most general versions of these use measure theoretic stuff, then APIs for discrete versions)

Shreyas Srinivas (May 10 2023 at 18:56):

Additionally, some distributions : binomial, geometric, bernoulli, poisson.

Shreyas Srinivas (May 10 2023 at 18:58):

If the theorem statements get tedious then it might be nice to have suitable domain specific tactics to work with them.

Shreyas Srinivas (May 10 2023 at 19:00):

The lovasz local lemma would also be very useful (including constructive versions. see : https://dl.acm.org/doi/pdf/10.1145/2049697.2049702?casa_token=qAxZX-_SYNcAAAAA:efAO3p4clkT5J7xqMVbEwj2Vohxpuz28V_Yw9sm7_oc1T-GPRi6ZaDg1sdC4SJgy-n4eqmDGakqO5Q)

Shreyas Srinivas (May 10 2023 at 19:03):

Yaël Dillies said:

I too need a lot of concentration inequalities for probabilistic combinatorics. Not sure how much overlap with CS there is.

I suspect a non-trivial amount of overlap would exist. You might also want the Chebyshev inequality because of the second moment method.

Yaël Dillies (May 10 2023 at 19:07):

Yes, I definitely need Chebyshev. Chernoff is one of those standard tools that is interchangeable with a bunch of other inequalities, so it would be useful to me but not crucial. I don't know about the other ones.

Rémy Degenne (May 10 2023 at 19:07):

Shreyas Srinivas said:

Chernoff, multiplicative chernoff, hoeffding, azuma, mcdiarmird (and if the most general versions of these use measure theoretic stuff, then APIs for discrete versions)

We are really not far from having these. In fact we have Chernoff bounds: docs#probability_theory.measure_ge_le_exp_cgf . I have code for Hoeffding (#15141). Azuma requires conditionally sub-Gaussian to have it in the generality I want, but I will add that as soon as I finish PRing stuff related to the existence of conditional distributions.

Yaël Dillies (May 10 2023 at 19:08):

Look up Lovazs Local Lemma on Zulip. It's already formalised.

Shreyas Srinivas (May 10 2023 at 19:09):

Basically everything in Appendix C and D here : https://cs.nyu.edu/~mohri/mlbook/

Jason KY. (May 10 2023 at 19:11):

Peter Pfaffelhuber said:

Quick question: measure_theory.has_pdfneeds three tings, ℙ, X, and μ as an input, but in the end, only μ and the push-forward of ℙ along X (i.e. a probability distribution on E) is used. Why doesn't the structure need only two things, μ and a probability distribution on E, as input? With such a definition, some X could be has_pdfif its push-forward has this property. Wouldn't this be even more flexible?

I defined has_pdffor the sole purpose of saying that a random variable has a density. What you suggest can already be expressed using measure.have_lebesgue_decomposition (+ absolute continuity) and would not work well if I want it to be a type class on a random variable.

Peter Pfaffelhuber (May 10 2023 at 19:17):

Thanks, this makes sense.

Peter Pfaffelhuber (May 10 2023 at 19:20):

Rémy Degenne schrieb:

Shreyas Srinivas said:

Chernoff, multiplicative chernoff, hoeffding, azuma, mcdiarmird (and if the most general versions of these use measure theoretic stuff, then APIs for discrete versions)

We are really not far from having these. In fact we have Chernoff bounds: docs#probability_theory.measure_ge_le_exp_cgf . I have code for Hoeffding (#15141). Azuma requires conditionally sub-Gaussian to have it in the generality I want, but I will add that as soon as I finish PRing stuff related to the existence of conditional distributions.

It appears to me that you are suggesting to formalize parts of large deviation theory...

Rémy Degenne (May 10 2023 at 19:28):

Well a small part, yes. Large deviations and concentration inequalities have a lot in common.

Peter Pfaffelhuber (May 11 2023 at 22:04):

@Rémy Degenne Thanks for helping me with the ennreal -> real -business today.
@Aaron Bies
Here
https://github.com/pfaffelh/some_probability/blob/master/src/binomial.lean
is an implementation of the binomial distribution using pmf's. Other distributions with finite support should work similarly, once you show that the pmf is non-negative and sums to one.

Peter Pfaffelhuber (May 12 2023 at 20:41):

Actually, for characterizing probability distributions, is there something like the following implemented?
(In words: Two random variables X and Y have the same distribution iff E[f(X)]=E[f(Y)] for all bounded and continuous f.
(The → should follow from some lemma in probability_theory.ident_distrib, but the ← requires results from measure theory. I am not sure if E must be metric.)

variables {Ω Ω' E : Type*} [measurable_space Ω] [measurable_space Ω'] [topological_space E] [measurable_space E]
[borel_space E] [metric_space E]  {P : measure Ω} [is_probability_measure P] {P' : measure Ω'} [is_probability_measure P']
{X : Ω  E} {Y : Ω'  E} [ae_measurable X P] [ae_measurable Y P']

theorem ident_distrib_iff  : (probability_theory.ident_distrib X Y P P') 
 (f : bounded_continuous_function E ),  ω, f (X ω) P =  ω', f (Y ω') P'

This would be a first step in proving that characteristic functions characterize the distribution of a random variable.

Kalle Kytölä (May 13 2023 at 14:48):

The API on these is not in a great shape yet... I intend to work on improving the API, but during the semesters I have had no time for Lean/mathlib. I was also thinking of doing the improvements directly in mathlib4, so as not to add material to be ported.

But essentially these types of results are in mathlib in a rudimentary form. For example docs#measure_theory.tendsto_lintegral_thickened_indicator_of_is_closed is close to what you are asking for (it assumes pseudo_emetric_space so covers the metric space case), although it needs to be combined with the facts that closed sets form a pi-system that generate the Borel sigma-algebra, and that such pi-systems determine finite measures.

We also have somewhat cleaner looking related things like docs#measure_theory.finite_measure.tendsto_of_forall_integral_tendsto, but this should be combined with uniqueness of limits in the topology of weak convergence of measures, which is essentially equivalent to your question.

So currently you can certainly get this with finite amount of work from existing results, but in the summer I hope to clean up the API about these and once that is done, a better direct statement should be available (maybe in mathlib4?).

Of course if you want to add a good version of the statement, it is certainly something we'd like to have!

Peter Pfaffelhuber (May 14 2023 at 22:29):

@Kalle Kytölä Many thanks! I was not aware that weak convergence is essentially implemented. And yes, my question is equivalent to uniqueness of the weak limit (which is surprisingly missing, although Portmanteau Theorem etc are implemented). Since my far-in-the-future-goal are weak convergence results (on continuous-time stochastic processes), I will invest some time in the next few weeks.

Kalle Kytölä (May 14 2023 at 22:44):

Even portmanteau in mathlib is still missing a few implications plus lots of API (like the uniqueness of the weak limits and other standard things). But I do intend to PR the missing parts, too.

Nevertheless, for example the uniqueness of the limits is very natural and independent of the rest (and amounts to combining a few already existing results), so feel free to add it if you are quicker (I still have a few weeks of teaching and grading and admin for the semester left before I have hope to resume Leaning).

One thing I don't know is what is the policy on mathlib3 PRs at the moment --- I had basically mentally prepared to continue weak convergence in Lean4 once I have Lean time again. Maybe someone closer to the porting effort can comment on mathlib3 vs mathlib4 PRs. (I guess probability theory and much of measure theory is not yet ported, so at the moment, one can't directly start in mathlib4, but the porting tide also looks not too far away.)

Peter Pfaffelhuber (May 14 2023 at 23:03):

I am still new to Lean, and never wrote a single line of Lean4, but I can maybe find out about the policy. For me, showing uniqueness of the weak limit is certainly a good exercise, even when using mathlib3.
@Kalle Kytölä
By the way, you have nice lecture notes on "Metric spaces" and "Large random systems".

Kalle Kytölä (May 14 2023 at 23:11):

The Lean3 result on uniqueness of weak limits should indeed be a rather nice thing to do. But it might involve passing between some lintegrals and integrals, some bounded_continuous_functions with codomain nnreal and some with codomain real, coercions to ennreals, and other usual formalization things. A key is of course also to locate the lemmas saying that pi-systems determine finite measures (this exists for sure, but I forgot the name) etc. I hope you'll have fun with it (but the current API holes may be a source of some hiccups).

Rémy Degenne (May 15 2023 at 07:18):

The lemma stating that pi-systems determine finite measures is docs#measure_theory.ext_of_generate_finite . The more general lemma about induction using pi-systems is docs#measurable_space.induction_on_inter . I always have trouble remembering that last name, because I don't really understand the inter in it.

Eric Wieser (May 16 2023 at 20:54):

It sounds like that lemma deserves a docstring!

Peter Pfaffelhuber (May 19 2023 at 20:49):

Just working on the uniqueness of the weak limit... One question: In textbooks, I never saw the ae_measurable property of a random variable, only measurable. (The ae_measurable is e.g. used in probability_theory.ident_distrib.) Does this distinction have any real use? So, is there an example, where you really want the image measure of a random variable which is only ae_measurable rather than measurable?

Anatole Dedecker (May 19 2023 at 21:06):

This is useful because an a.e limit of measurable functions is not always measurable since we don’t require the sigma algebra to be complete with respect to a measure. I’m not sure I can give a more specific example in the context you’re interested in, but the high level answer is that we try to assume ae_measurable as much as possible so that we don’t get stuck when working in non-complete measure spaces.

Peter Pfaffelhuber (May 19 2023 at 22:35):

Ok, I see. You avoid the usual completeness of the sigma-algebra. I think measure_theory.integral_map_of_strongly_measurable could use a version with only assuming hφ : ae_measurable φ then. (Will also try this...)

Anatole Dedecker (May 19 2023 at 22:37):

Isn't that docs#measure_theory.integral_map ? Of course you need to be careful about the measures then...

Peter Pfaffelhuber (May 19 2023 at 22:42):

@Anatole Dedecker
Yes, you are right!

Peter Pfaffelhuber (Jun 07 2023 at 19:25):

Finally, here is a version of uniqueness of the weak limit.

Rémy Degenne (Jun 07 2023 at 19:50):

This is great!
It would be easier to read if you cut the lines at 100 characters though (and in general see https://leanprover-community.github.io/contribute/style.html).

Rémy Degenne (Jun 07 2023 at 19:51):

The first lemma is docs#ennreal.coe_to_nnreal

Rémy Degenne (Jun 07 2023 at 20:03):

Also I see that you declare all the types each time you write a lemma: you could write variables {E : Type*} [measurable_space E] at the top of the file to define it once and have it available everywhere.

Peter Pfaffelhuber (Jun 07 2023 at 20:26):

Thanks for the feedback! I got rid of the first lemma now.
I tried to use variables ..., but it gave me error messages I could not get rid of. Maybe you can show me next week...

Rémy Degenne (Jun 07 2023 at 20:35):

A potential error you could get with variables is that if you name an instance it will not be included automatically everywhere. To ensure that your measurable_space hypothesis is found you should not name the instance (you should write [measurable_space E] and not [m : measurable_space E]).

Rémy Degenne (Jun 07 2023 at 20:39):

I don't know why it behaves that way. If something is an instance there should be only one so there is arguably no need to give it a name anyway, but that's not an explanation of the change of behaviour.


Last updated: Dec 20 2023 at 11:08 UTC