Zulip Chat Archive

Stream: Is there code for X?

Topic: Is this a good foundational definition for random variable?


Lars Ericson (Dec 05 2020 at 01:23):

I don't see a definition of random_variable in Lean. I would like to build Definition 5, page 7 and Remark 1, page 8 in Lean. I would appreciate your critical eye on whether this is a good foundation:

"Let (๐‘‹,ฮฃ) be a measure space.

Let (๐‘Œ,๐‘‘) be a metric space.

Let ๐‘ฅ be a map from ๐‘‹ to ๐‘Œ such that ๐‘ฅโˆ’1(๐ต)โˆˆฮฃ for any Borel set ๐ต in ๐‘Œ .

We say that ๐‘ฅ:(๐‘‹,ฮฃ)โ†’(๐‘Œ,๐‘‘) is a ๐‘Œ -valued random function and ๐‘ฅ is a ฮฃ -measurable function.

We say that ๐‘ฅ is a ๐‘Œ -valued random variable iff

๐‘ฅโˆ’1(๐‘‚)โˆˆฮฃ for any open set ๐‘‚
๐‘ฅโˆ’1(๐‘†)โˆˆฮฃ for any closet set ๐‘†
If ๐‘Œ=โ„œ , then {๐‘คโˆˆ๐‘‹:๐‘ฅ(๐‘ค)โ‰ค๐›ผ}โˆˆฮฃ for any ๐›ผโˆˆโ„œ ."

Mario Carneiro (Dec 05 2020 at 07:32):

Definition 5 is just another way to say x : X -> Y is a measurable function

Mario Carneiro (Dec 05 2020 at 07:32):

where the measurable space on Y is the borel sigma algebra (which it is by default for metric spaces)

Mario Carneiro (Dec 05 2020 at 07:49):

import measure_theory.borel_space

variables (X Y : Type) [measurable_space X]
  [measurable_space Y] [metric_space Y] [borel_space Y] (x : X โ†’ Y)
#check measurable x -- x is a Y-valued RV on X

Mario Carneiro (Dec 05 2020 at 07:54):

Remark 1 isn't a definition, it's a theorem about this definition of measurable function. (1) is docs#measurable_of_is_open, (2) is docs#measurable_of_is_closed, (3) is docs#measurable_of_Iic

Mario Carneiro (Dec 05 2020 at 07:57):

My suggestion, if you want to pursue this topic, is to read the borel_space theorems to get a sense for what's available, and try to match things up with the textbook. Most likely we won't hit everything, and then you can try to prove what remains. This will help to familiarize yourself with the library

Lars Ericson (Dec 05 2020 at 16:47):

Thanks @Mario Carneiro . So a random variable V is a measurable function. I quoted from Mark Dean at Brown's course notes for a chapter on measure theory in a course on decision theory. It's a small point but I'm making it because I found a wide variety of definitions of random variable in other people's course notes. Also Wikipedia restricts the range of V to โ„: "a random variable is understood as a measurable function defined on a probability space that maps from the sample space to the real numbers". Clearly this is too restrictive, it doesn't have to be โ„, any metric space will do.

I will study the borel_space in mathlib. In this context I am trying to fill out the following picture with mathlib definitions and examples:
Screenshot-from-2020-12-05-11-43-17.png

where so far from Zulip we have the following examples:

  • Set: {0,1,2}, โ„, [0,1], [0,oo)
  • Metric Space: taxicab, euclidean
  • Measurable Space: {0,T}, P(T), B((0,1)), B([0,1]), B([0,oo))
    There is a lot of probability theory mechanics in the measure_theory part of mathlib. It seems I need to add some definitions to get probability_space and random_variable.

Mario Carneiro (Dec 05 2020 at 16:48):

The mathlib way is to generalize everything to the broadest context where it continues to make sense and have the desired properties

Lars Ericson (Dec 05 2020 at 16:49):

The broadest context for random_variable is to have a range of Type but that is too much for my purposes.

Mario Carneiro (Dec 05 2020 at 16:49):

In this case, you can probably do most of the stuff about random variables on arbitrary measurable functions; if you start talking about adding things then you have to add an additive group structure and so on

Mario Carneiro (Dec 05 2020 at 16:50):

Type is too much, you need a measurable space in the codomain

Mario Carneiro (Dec 05 2020 at 16:50):

but you can just lazily add assumptions as they become necessary

Lars Ericson (Dec 05 2020 at 16:50):

What Mark Dean is saying is metric_space in the codomain, hence my question to begin with: Is Mark Dean wrong?

Lars Ericson (Dec 05 2020 at 16:51):

That's the point of mathematical taste that I'm asking about.

Mario Carneiro (Dec 05 2020 at 16:51):

mathematicians like to set up some fairly constraining assumptions at the start just to clarify what they are talking about and not have to worry about edge cases, but this gets annoying when you are simultaneously serving many distinct theories

Mario Carneiro (Dec 05 2020 at 16:52):

maybe @Kevin Buzzard has something to say about "all rings are commutative" :)

Lars Ericson (Dec 05 2020 at 16:52):

Mark Dean's notes are building a theory with metric_space as the codomain of random variable rather than (another, not necessarily the same as the domain) measurable_space. Is that inherently wrong, from your experience as a professional mathematician, or just a different choice?

Mario Carneiro (Dec 05 2020 at 16:52):

in this case, one implies the other

Mario Carneiro (Dec 05 2020 at 16:53):

so just assume the weaker thing until it makes a theorem false, then add the assumption then

Lars Ericson (Dec 05 2020 at 16:53):

So all metric_space's are measurable_space's, or vice versa?

Lars Ericson (Dec 05 2020 at 16:53):

I.e. I can't construct a metric_space which is not a measurable_space or vice versa.

Mario Carneiro (Dec 05 2020 at 16:54):

Why don't you answer that question?

Lars Ericson (Dec 05 2020 at 16:55):

Prior to Lean I would just go to Stack Exchange for this, but challenge accepted!

Lars Ericson (Dec 05 2020 at 16:56):

I fear dragons though: "There is no relationship between these two notions. โ€“ Crostul Aug 19 '15 at 15:46"

Lars Ericson (Dec 05 2020 at 16:56):

Stack Exchange is telling me they are unrelated, so I still have to make a choice. I'm not sure how to go about proving "Unrelated".

Lars Ericson (Dec 05 2020 at 16:57):

I guess Exists M: metric_space M, not (measurable_space M) and Exists M: measurable_space(M), not (metric_space M)

Lars Ericson (Dec 05 2020 at 16:58):

I think I can do that. Which, when I'm done, leaves us back at my original question on this thread: Is metric_space, Mark Dean's choice, the nicest choice to develop a theory of random variable?

Mario Carneiro (Dec 05 2020 at 16:58):

Hint: It was brought up earlier in this discussion

Lars Ericson (Dec 05 2020 at 16:58):

Thanks! I will study the thread.

Mario Carneiro (Dec 05 2020 at 16:59):

I'm saying that this is the wrong question to ask

Mario Carneiro (Dec 05 2020 at 16:59):

You don't need to determine in advance the global assumptions for every theorem you will prove

Mario Carneiro (Dec 05 2020 at 16:59):

Use metric_space if you need it, don't if you don't

Lars Ericson (Dec 05 2020 at 17:00):

Wise advice. I will start with Type for the codomain and see where I get stuck.

Mario Carneiro (Dec 05 2020 at 17:00):

and at least with Mark Dean's definition, there is no need for a definition of random_variable at all because that's just measurable f

Mario Carneiro (Dec 05 2020 at 17:02):

it is better not to redefine a notion if you don't have to because every new definition comes with an obligation to provide an API for it and if you can leverage an existing library of theorems that's time saved

Kevin Buzzard (Dec 05 2020 at 17:03):

Mario Carneiro said:

maybe Kevin Buzzard has something to say about "all rings are commutative" :)

https://stacks.math.columbia.edu/tag/0006 a.k.a. "if you print out these notes they will be over 7000 pdf pages; the word "ring" is mentioned over 5000 times and it means "commutative ring" each time"

Mario Carneiro (Dec 05 2020 at 17:04):

yeah but how many of those uses actually require the commutativity assumption? I would wager not all of them

Lars Ericson (Dec 05 2020 at 17:05):

So when proving theorems I can prove them in the original domain, like measurable. However when considering an application domain, I think it would be helpful to introduce some type aliases that let people know say def random_variable := measurable, even if it's just an equality. I can locate those renamings as syntactic sugar at the end of development.

Mario Carneiro (Dec 05 2020 at 17:06):

nope, use a comment

Mario Carneiro (Dec 05 2020 at 17:06):

that alias will get in your way more than you would think

Lars Ericson (Dec 05 2020 at 17:11):

What about context? I assume the domain of my measurable has gone into/come out of a definition of a probability_space. I like to keep the idea of a probability_space because it is a thing that people talk about, even though, in mathlib terms, it might disappear in the same way as ๐œŽ_algebra is not immediately visible as such. So in that sense my notion of random_variable depends on a prior construction of probability_space to make contextual sense. I would like to formalize these contextual relationships. The mathlib style seems to be to leave a lot of context in the comments and to avoid formalizing it if is not strictly necessary to make the steps of a proof on a core concept go through. Which is "lean", but maybe a little dry. Not that there's anything wrong with that, please don't take offense, I'm just saying.

Mario Carneiro (Dec 05 2020 at 17:22):

that all sounds about right. Formalize the proof

Mario Carneiro (Dec 05 2020 at 17:22):

everything else is comments

Mario Carneiro (Dec 05 2020 at 17:23):

probability_space does have some additional content over measure_space since the measure of the whole space is 1 and a forteriori the measure is always real valued and at most 1

Mario Carneiro (Dec 05 2020 at 17:24):

but I haven't yet seen any evidence that the term "random variable" contributes much over "measurable function" besides licensing the author to use notation like X+YX+Y without worrying about the fact that it's a function and not a number

Lars Ericson (Dec 05 2020 at 18:26):

If I wanted to create aliases in spite of good advice, I don't know how to do it in Lean. For example the last line here:

import measure_theory.borel_space

#check measurable_space (set.Ioo (0:โ„) 1)
#check (by apply_instance : measurable_space (set.Ioo (0:โ„) 1))

def Bo0o1 := measurable_space (set.Ioo (0:โ„) 1)
#check Bo0o1
#check (by apply_instance : Bo0o1)

results in the error

tactic.mk_instance failed to generate instance for
  Bo0o1
state:
โŠข Bo0o1

when it should do the same thing as #check (by apply_instance : measurable_space (set.Ioo (0:โ„) 1)). I'm guessing that def a := b doesn't make a a pointer to the b but acts instead as a renaming of b which will block unification for theorems on the original expression.

Alternatively, I was hoping to turn comments into theorems. For example, rather than define a type random_variable, make a theorem random_variable out of

import measure_theory.borel_space

variables (X Y : Type) [measurable_space X]
  [measurable_space Y] [metric_space Y] [borel_space Y] (x : X โ†’ Y)
#check measurable x -- x is a Y-valued RV on X

i.e. something like

theorem random_variable  (X Y : Type) [measurable_space X]  [measurable_space Y] [metric_space Y] [borel_space Y] (x : X โ†’ Y) :=
    measurable x

For that, Lean gives the error

don't know how to synthesize placeholder
context:
X Y : Type,
_inst_5 : measurable_space X,
_inst_6 : measurable_space Y,
_inst_7 : metric_space Y,
_inst_8 : borel_space Y,
x : X โ†’ Y
โŠข Sort ?

Eric Wieser (Dec 05 2020 at 22:45):

I think the abbreviation keyword would be appropriate for the type of alias being suggested here, right?

Eric Wieser (Dec 05 2020 at 22:45):

Like how we alias vector_space to semimodule

Rรฉmy Degenne (Dec 06 2020 at 10:22):

@Mario Carneiro , on the topic of random variable vs measurable function: I started working on formalizing some probability theory (following Williams' book "Probability with martingales") and after a lot of messing with definitions I would argue that defining random variables as the measurable property is not convenient for what people want to do in probability theory and that you really want to put the probability space in the random variable definition.
Example of statement you want to be able to make easily: "let X and Y be two iid Normal random variables with law N(0,1), then X+Y is Normal with law N(0,2)."
In general an important property of a random variable is its law and on the contrary you don't care about the probability space it is defined on and you would like to avoid speaking about it.

Now consider two ways of defining random variables : 1) measurable functions from measurable space A to measurable space B, and have a probability measure on A on the side (or a hypothesis probability_space A), 2) a random variable definition contains the measurability statement and the probability measure.

With definition 2, the statement "let X be a random variable with law N(0,1)" is implemented as follows: the user inputs the target space B = R and the law, and mathlib chooses any suitable Ax, probability Px and function to represent X (ex: Ax = B, Px is the desired law, the function is the identity). In probability theory, the probability space (Ax,Px) does not matter anyway. Idem for the definition of Y, with measurable spaces Ay and B and probability Py. Then when we write X+Y, we are defining a random variable on an extended space Axy with probability measure Pxy. All those probability space definitions and extensions are things that we do not care about in probability theory, and they are conveniently done behind the scene. (X+Y and Y+X may not lead to the same Axy but again, laws are the same so you don't care)

With definition 1, we have to specify Ax, Px and the function for X manually. Then when we want to talk about X+Y we have to explicitly call a function to build the extended probability space. My point is that it is both less convenient and not in line with what probability theory is about, namely the study of properties that are unchanged when you extend the probability space. In my mind, the implementation of probability theory is a success if the user never has to write what the probability space is.

Well I guess I just wrote a wall of text to say that I think that the correct and most useful definition of random variable is the usual one: measurable function from a probability space to a measurable space. But my point is that there is indeed a need for a random_variable definition that is not simply the measurable property but should include the probability measure.

Mario Carneiro (Dec 06 2020 at 10:25):

I'm not convinced that the event space can really be hidden all that well, because non-independent RVs are a thing

Mario Carneiro (Dec 06 2020 at 10:26):

sometimes you want two RVs to have the same event space, and sometimes you want them to be "freely adjoined" using a product to get independence, and I don't think either one dominates the usage

Kevin Buzzard (Dec 06 2020 at 10:27):

People have struggled to do this sort of thing in Lean, for reasons I've never completely internalised, and my instinct would just be to have a go at trying to do something and to see what happens. Any code which manages to get something working would be a step forward

Mario Carneiro (Dec 06 2020 at 10:28):

@Reid Barton has talked about some ideas for how to formalize proofs where the event space acts more like a context (in the type theory sense) where you can add new variables and extend your space in the middle of the argument, but it seems like the sort of thing that needs a Framework to work

Mario Carneiro (Dec 06 2020 at 10:29):

I agree with Kevin. Currently I'm at the stage where (due to ignorance of complications, perhaps) everything looks like it can just be formalized straightforwardly, so I would be inclined to just push ahead until the going starts to get rough and we have an idea where the naive approach falls over

Rรฉmy Degenne (Dec 06 2020 at 10:36):

Well I have some things implemented for the random_variable A P B approach where I specify the probability measure in the definition (up to statements about independence of random variables and Kolmogorov 0-1 law). I'll clean it somewhat and make a branch with it in order to facilitate discussion in the coming week(s).

Rรฉmy Degenne (Dec 06 2020 at 10:39):

And I will try writing the code for the same things with random variables as measurable functions. I think it will lead to complications, but I am not 100% sure and it will be interesting.

Mario Carneiro (Dec 06 2020 at 10:40):

yep, what's important is to see both options and compare

Sebastien Gouezel (Dec 06 2020 at 12:50):

If I understand correctly, the Giry monad is exactly designed for this: a setup where you can add new variables that are independent (or conditionally independent) of previous ones, and it will extend the probability space for you.

In probability theory, though, the choice of representation is sometimes important, as it can contain more information than the mere distribution. Here is a (very important) example. When you define real-valued continuous time processes, in a general framework the process is a random function from a probability space to the space C of cadlag real functions (continuous on the right with a limit on the left functions) with a topology called the Skorokhod topology. In this context, the brownian motion can be prescribed by giving the joint distribution of the values of the process at each finite set (a Gaussian vector with suitable variances), and in this sense the law of the brownian motion is then just a probabily measure mu on C. A representation of the brownian motion is thus a measurable map X from a probability space Omega to E, mapping the probability measure on Omega to mu. An important property of the brownian motion is that one can choose such an X such that X (omega) is continuous for almost every omega, but this does not just follow from the distribution: there are also examples of X representing the brownian motion for which X (omega) is continuous for no omega!

TLDR: trying to hide the event space and the measurable map may seem appealing, but in practice it is not so clear.

Lars Ericson (Dec 06 2020 at 14:03):

Re @Rรฉmy Degenne : "...on the contrary you don't care about the probability space it is defined on and you would like to avoid speaking about it....All those probability space definitions and extensions are things that we do not care about in probability theory, and they are conveniently done behind the scene....In my mind, the implementation of probability theory is a success if the user never has to write what the probability space is.": The definition of the "usual" probability space of real continuous random variables is why I am here on Lean. I don't think probability space is a throwaway concept to be shoved to the back of the closet. Real continuous RVs can be supported by B(R), B([0,1]), B((0,1)) and other ways. It's interesting to know that there is not a single canonical support and to know how to construct these different supports and to know why they are different.

Also "the correct and most useful definition of random variable is the usual one: measurable function from a probability space to a measurable space." Earlier I gave an example where a competent practitioner defined it as a function from a measure space to a metric space. It's interesting to know that there is not a canonical definition of random variable, at least in the realm of application.

Sebastien Gouezel (Dec 06 2020 at 14:34):

Lars Ericson said:

Also "the correct and most useful definition of random variable is the usual one: measurable function from a probability space to a measurable space." Earlier I gave an example where a competent practitioner defined it as a function from a measure space to a metric space. It's interesting to know that there is not a canonical definition of random variable, at least in the realm of application.

The definition you quoted is a particular case of the general definition (as metric spaces are a particular case of measurable space, with the Borel sigma-algebra), there is no conflict here. It's just like in early curriculum textbooks, you define continuous functions from reals to reals. Then you define continuous functions from metric spaces to metric spaces. You can not say that the first definition is wrong, because it is just a special instance of the second definition. And in more advanced books you will have continuous functions between topological spaces, which is again a generalization (and the final one). In mathlib, we go directly for the more general definition, of continuous functions between topological spaces.

Lars Ericson (Dec 06 2020 at 14:38):

@Sebastien Gouezel it's not that clear cut because I may be able to construct instances of metric_space which do not satisfy measurable_space and vice versa.

Rรฉmy Degenne (Dec 06 2020 at 14:59):

@Sebastien Gouezel : I totally agree on the fact that the measurable map can matter, but indeed I did not think about those continuous processes when writing (my usual work is entirely in discrete time). My point is that we are considering two possibilities for the information that we put inside a "random_variable" object: either a random variable is a measurable function (and in that case we don't define a new structure, we use measurable) with the probability measure separate from it, or it is a structure with a measurable function and a probability measure. The second one still contains all the information about the map.
That second formulation may not be advisable because it adds notation for something similar to what we already have, but I just think that adding the measure information allows doing some things easier when the distribution is the main information we need (and that case should cover the whole of probability with discrete time processes?).

But that is all idle speculation. I'll take the advice from Mario and Kevin and compare the two implementations in actual code... and whatever we end up choosing, I'll have some code written :)

Sebastien Gouezel (Dec 06 2020 at 17:19):

Lars Ericson said:

@Sebastien Gouezel it's not that clear cut because I may be able to construct instances of metric_space which do not satisfy measurable_space and vice versa.

No, any metric_space is canonically a measurable space, for the Borel sigma-algebra. It's not registered in mathlib as an instance by default, as one might want to do some more exotic things (completing sigma-algebras for a measure), but it's definitely what the textbook you mentioned means when it talks about measurable functions from a probability space to a metric space. You can register it as an instance with local attribute [instance] borel in mathlib.

Sebastien Gouezel (Dec 06 2020 at 17:19):

Rรฉmy Degenne said:

But that is all idle speculation. I'll take the advice from Mario and Kevin and compare the two implementations in actual code... and whatever we end up choosing, I'll have some code written :)

That's absolutely the wisest path!

Lars Ericson (Dec 06 2020 at 19:15):

In the Wikipedia sense, (I'm not sure if that coincides with the Lean sense), the carrier set of a metric space can be any set. For example, โ„, so (โ„,d) is a metric space with say def d (x: โ„) (y: โ„) := (x*x-y*y)*(x*x-y*y). However, โ„ contains subsets which are not measurable. So (โ„,d) is a metric space which is not a measurable space.

At least, that's what I get from StackExchange and also this comment in StackExchange: "There is no relationship between these two notions. โ€“ Crostul Aug 19 '15 at 15:46", from a guy with 32,375 reputation (I don't know what that is in Zulip reputation points, but for Mathematics Stack Exchange, that's really a lot of points).

Que t'en pense d'y a ca? What are the Stack Exchange folks missing about this?

Alex J. Best (Dec 06 2020 at 19:20):

If you look at the answer on that stackechange post instead of that comment the same Borel measure Sebastien mentioned is described.
Being a measure space doesn't mean that all subsets are measurable, but that there is a set of subsets which form a sigma algebra that are, the Borel sigma algebra is such a choice that can be used for any metric space.

Reid Barton (Dec 06 2020 at 19:27):

Even saying that the reals "contains subsets which are not measurable" implies that you already know how to make it a measurable space!

Damiano Testa (Dec 06 2020 at 19:52):

Not to mention have some form of weak axiom of choice...

Lars Ericson (Dec 06 2020 at 21:25):

You can make a sigma algebra B(R) from R, so (R, B(R)) is a measurable space.
You can define a distance metric d between any two points in R, so (R, d) is a metric space.
So we can say that R is a carrier set for a measurable space and a metric space.

So R is not a good example.

The challenge is to find a set X which has no sigma algebra. How about

C := {A โŠ‚ โ„ค : is_finite A โˆจ is_finite (โ„ค \ A)}

Reid Barton (Dec 06 2020 at 21:55):

there aren't any such sets X

Lars Ericson (Dec 06 2020 at 23:27):

Jacky Chong gives this example:

Let ๐‘‹=[0,โˆž) and consider A={finite union of [๐‘Ž,๐‘)}.
It's clear that [๐‘Ž,๐‘)โˆฉ[๐‘,๐‘‘)=[max{๐‘Ž,๐‘},min{๐‘,๐‘‘}) and [๐‘Ž,๐‘)๐ถ=[0,๐‘Ž)โˆช[๐‘,โˆž).
So A is an algebra but clearly not a ๐œŽ-algebra since it doesn't contain any open interval.
Take the obvious measure ๐‘š([๐‘Ž,๐‘))=๐‘โˆ’๐‘Ž.

Here is another presentation in another Stack Exchange article on the same topic of the example above. Apparently it is in somebody's textbook. Here is the recap:

The class ๐’œ={๐นโŠ‚โ„•:๐น is finite or ๐น๐‘ is finite}is an algebra of subsets of โ„• but not a ๐œŽ-algebra. Then let ๐ด1,โ€ฆ,๐ด๐‘› sets in ๐’œ, then or every ๐ด is finite or at last one ๐ด๐‘– is infinite (and therefore A^i^c is finite) so or the union is finite or the complement of the union has finite complement.

Clearly ๐’œ is closed to complement.

The union of even numbers do not lies in ๐’œ, so ๐’œ is not a ๐œŽ-รกlgebra.

This would be a pretty nice example to use Lean to prove that say ๐’œ, as constructed, is empty, or ๐’œ, as constructed, is not empty and is in fact a sigma algebra.

Eric Wieser (Dec 07 2020 at 08:51):

Are you confusing "a type X for which no sigma algebra can be constructed" with "a set of sets A over X which is not a sigma algebra for X"?

Lars Ericson (Dec 07 2020 at 13:54):

OK, you can generate a sigma algebra for any set X. So if (X,d) was a metric space then you can make (X,sigma(X)). So there is no reason not to make the codomain of a random variable be a measurable space.

For Mark Dean's purposes in writing his course notes, he sees utility in specializing his RVs to codomain metric space. That is a specialized choice.

RV is any observation of an experiment which is of interest to the experimenter. So the codomain of RV is completely in the eye of the beholder, there is no constraint. That supports making the codomain beType in mathlib, if you were to formalize random_variable.


Last updated: Dec 20 2023 at 11:08 UTC