Zulip Chat Archive

Stream: new members

Topic: conditional expected values as an exercise

Kalle Kytölä (Apr 02 2021 at 00:43):

As a way to try to familiarize myself with writing math in Lean, I tried conditional expected values (they were the easiest actual math definition I could think of). I did not immediately find them implemented in mathlib yet, but even if they exist or if someone serious is already working on them, it doesn't matter at all --- my only goal here is to see if I can get to the point of doing anything in Lean. (I also only have time for trying this very irregularly.)

I imagined a certain sequence of mathematically trivial steps that might constitute a good exercise for myself, starting from the definition(s) and sanity-check lemmas, including a.e.-uniqueness. Next would be some standard properties, of which I would probably pick and choose for the purposes of my exercise. Many of the early steps would actually not yet involve the existence of conditional expected values. Existence is slightly nontrivial --- it commonly done with the Radon-Nikodym theorem (which I believe is not yet in mathlib), although in my exercise I would like to actually approach it via first doing it only in $L^2$ by orthogonal projections to closed subspaces (completeness of $L^2(\mu)$ appeared to be there, other stuff I didn't check). Anyways, the point is I had a (rough) sequence of mathematically very easy definitions and lemmas in mind. And the other important thing: the purpose was an exercise, not a contribution to mathlib (I'm nowhere close to the level yet).

Predictably, I not only failed at the sanity-check stage, but I even largely failed to reach that stage...

I will have some questions below about complete trivialities. If someone is patient and willing to help despite the fact I'm stuck at the syntax and trivialities at step 0, and despite the fact that this is not an attempt at a mathlib contribution, I will be very grateful!

Kalle Kytölä (Apr 02 2021 at 00:44):

I don't know what is the best way to pose the questions. For now, I created a single file <https://github.com/kkytola/lean-questions/blob/main/cond_exp-question-20210401.lean>, which contains some of the first steps I already stumbled with. I will try to describe some actual questions below, in particular because the file is not a mwe.

Kalle Kytölä (Apr 02 2021 at 00:49):

It seemed like the first thing should be to define sub-sigma-algebras $\mathscr{G} \subset \mathscr{F}$ of a given sigma-algebra $\mathscr{F}$ on $S$ .

def is_subsigmaalg {S : Type*} (Γ₁ Γ₂ : measurable_space S ) : Prop :=
  ∀ (E : set S ) , (Γ₁.is_measurable' E) → (Γ₂.is_measurable' E)

Of course, after actually doing this, I realized that the set of measurable_spaces on a given set is an instance of a complete_lattice, so the appropriate le is unsurprisingly already there in mathlib :oh_no:. Well, at least my definition coincided with the actual one :joy: --- I could prove:

lemma is_subsigmaalg_iff_le {S : Type*} {Γ₁ Γ₂ : measurable_space S } :
  is_subsigmaalg Γ₁ Γ₂ ↔ Γ₁ ≤ Γ₂ := by sorry

Nevertheless, I still had trouble using some of the syntax for the existing complete_lattice. For example, I don't know how to get the trivial sigma-algebra $\mathscr{G} = \{ \emptyset , S \}$ as the minimal element of the complete lattice. I ended up using the following definition, which certainly feels redundant given what already exists...

def is_trivial_sigmaalg {S : Type*} (Γ : measurable_space S ) : Prop :=
  ∀ (E : set S ) , (Γ.is_measurable' E) → (E = ∅ ∨ E = univ)

Any hints on using complete_lattice for this directly?

Yakov Pechersky (Apr 02 2021 at 00:52):

Have you looked at how bundled subobjects are done in mathlib?

Kalle Kytölä (Apr 02 2021 at 00:52):

My current (almost surely not satisfactory) definition of a conditional expected value is:

structure cond_exp_enn {S : Type*} (f : S → ennreal)
  (Γsub Γfull : measurable_space S ) [hsub : Γsub ≤ Γfull]
  (μ : @measure_theory.measure S Γfull) :=
    ( to_fun : S → ennreal )
    ( is_submeasurable : @measurable S ennreal Γsub _ to_fun )
    ( equal_subintegrals : ∀ (E : set S) , Γsub.is_measurable' E →
        ∫⁻ x in E , to_fun(x) ∂ μ = ∫⁻ x in E , f(x) ∂ μ )

I don't know if this is any good... I would appreciate comments on:

Is structure at all reasonable for such definitions?
In the current version I am not requiring measurability and integrability of $f$ $f$ , or that the measure $\mu$ $μ$ is a probability measure. All of these would almost always be needed, so should they be a part of the definition?
- (I suppose constructing instances would become harder, but lemmas could then avoid repeatedly the same assumptions.)
Here I'm already using some @s to specify implicit arguments. Later I ended up doing much much more of that, to the extent that I think I'm doing something wrong. Does this sound like a common beginners' tendency that I could learn to avoid?
The above works with functions and is going to be only a.e. unique, but I suppose one can have a parallel definition with equivalence classes like in the $L^p$ -spaces (?). Then one will have actual uniqueness etc.
Anything else that an experienced person can tell me :big_smile:

Quite embarrassingly, I failed to do a coercion to a function f : S → ennreal. I have seen some analogous coercions in the tutorials I did, but none of my imitations of the syntax seemed to work. Besides "Go to definition", I only found this documentation <https://leanprover-community.github.io/mathlib_docs/init/coe.html#has_coe_to_fun>. After seeing both I am at loss... even about whether is_coe_to_fun is a structure or a class (and I frankly forgot what little I learned about their differences anyway).

Here's my attempt at a minimal example in which I can't figure out the syntax...

structure foo {A B : Type*} :=
  ( to_fun : A → B )

@[instance]
def foo.has_coe_to_fun {A B : Type*} :
  has_coe_to_fun (@foo A B) :=
{ coe := sorry ,
  F := sorry ,
}

Kalle Kytölä (Apr 02 2021 at 00:54):

Yakov Pechersky said:

Have you looked at how bundled subobjects are done in mathlib?

No, I'm afraid not... I did some tutorials, but I don't remember encountering those (or possibly I wasn't paying attention or understanding). Where would I find information about them?

Bryan Gin-ge Chen (Apr 02 2021 at 00:55):

Quite embarrassingly, I failed to do a coercion to a function f : S → ennreal. I have seen some analogous coercions in the tutorials I did, but none of my imitations of the syntax seemed to work. Besides "Go to definition", I only found this documentation <https://leanprover-community.github.io/mathlib_docs/init/coe.html#has_coe_to_fun>.

Coercions are explained at the very end of TPiL.

Yakov Pechersky (Apr 02 2021 at 00:57):

You could read the subgroup code -- you bundle the carrier set and the properties retained. I think data/set_like was merged in, which gives a nice general framework for such bundles.

Bryan Gin-ge Chen (Apr 02 2021 at 00:58):

I should look at the set_like code too, since I've always been a bit uncomfortable with subobjects...

Kalle Kytölä (Apr 02 2021 at 00:58):

Oops. I did subgroups as a part of the first half of Kevin Buzzard's course, so I evidently didn't learn something I should have... I will take another look.

Just a clarification: is the bundled thing supposed to be for the definition of the conditional expected value itself or something else I'm getting wrong? (I suppose there's quite a bit...)

Yakov Pechersky (Apr 02 2021 at 01:03):

Just for the subsigmaalgebra bit

Yakov Pechersky (Apr 02 2021 at 01:05):

For the has_coe_to_fun, in most cases, you won't need to fill out the coe field if you fill in the F field. In your case, F := foo.to_fun

Heather Macbeth (Apr 02 2021 at 01:34):

Kalle Kytölä said:

Existence is slightly nontrivial --- it commonly done with the Radon-Nikodym theorem (which I believe is not yet in mathlib), although in my exercise I would like to actually approach it via first doing it only in $L^2$ by orthogonal projections to closed subspaces (completeness of $L^2(\mu)$ appeared to be there, other stuff I didn't check).

@Kalle Kytölä Some pointers:

yes, we have the inner product structure on $L^2(\mu)$ and its completeness as of last month! #6596, #6563
orthogonal projection: docs#orthogonal_projection
@Martin Zinkevich has a proof of Radon-Nikodym at
https://github.com/google/formal-ml/blob/master/src/formal_ml/radon_nikodym.lean
which he has gradually been PR'ing.

For other things, feel free to ask!

Sebastien Gouezel (Apr 02 2021 at 05:45):

Kalle Kytölä said:

Nevertheless, I still had trouble using some of the syntax for the existing complete_lattice. For example, I don't know how to get the trivial sigma-algebra $\mathscr{G} = \{ \emptyset , S \}$ as the minimal element of the complete lattice. I ended up using the following definition, which certainly feels redundant given what already exists...
def is_trivial_sigmaalg {S : Type*} (Γ : measurable_space S ) : Prop :=
  ∀ (E : set S ) , (Γ.is_measurable' E) → (E = ∅ ∨ E = univ)
Any hints on using complete_lattice for this directly?

The trivial subalgebra is the smallest element in the lattice, so it is called bot (for bottom), and typed as ⊥ (\bot in vscode).

Sebastien Gouezel (Apr 02 2021 at 06:32):

I think the conditional expectation of a function should just be a function, together with statements that assert it behaves like you want. Something along the lines of

import measure_theory.bochner_integration

open_locale ennreal
open measure_theory

variables {α : Type*} (F : measurable_space α) [measurable_space α] (μ : measure α)

def cond_exp_enn {α : Type*} (F : measurable_space α) [measurable_space α] (μ : measure α)
  (f : α → ℝ≥0∞) (x : α) : ℝ≥0∞ :=
sorry

lemma measurable_cond_exp_enn {f : α → ℝ≥0∞} (hf : measurable f) :
  @measurable _ _ F _ (cond_exp_enn F μ f) := sorry

lemma int_cond_exp_enn {E : set α} (hE : measurable_set E) {f : α → ℝ≥0∞} (hf : measurable f) :
  ∫⁻ x in E, cond_exp_enn F μ f x ∂μ = ∫⁻ x in E, f x ∂μ :=
sorry

The real work is in the definition of cond_exp_enn to make sure that the other lemmas work, of course.

Kalle Kytölä (Apr 03 2021 at 14:22):

Thank you Sebastien!

To be honest, it took me a while to get what you mean at all (but I knew that you are an expert at this, so tried to unravel).

Why I was very puzzled at first is the following. In my opinion, in math we give the defining property of conditional expected value and then we prove (1) that the property uniquely defines the conditional expected value (modulo modification on a zero measure set) and (2) that the conditional expected value of an integrable random variable exists. My exercise was structured like this, too. At first I could not see at all what you meant by defining conditional expected values and then proving that (what I think of as) the defining property is satisfied by the given definition...

But I think I ultimately got what you mean --- please correct me if I'm wrong. The analogy with real numbers might be that we want the Lean definition to be something, say equivalence classes of Cauchy sequences of rationals, and then the axiomatic ("defining") properties are proven from that definition (as opposed to giving axioms and proving existence and uniqueness). So do I get it right that with conditional expected values you suggest to take the Lean definition to be an actual construction? Such a construction might be e.g. (1) a suitable Radon-Nikodym derivative or (2) an $L^1(\mathsf{P})$ -limit of orthogonal projections to $L^2(\Omega,\mathscr{G},\mathsf{P}) \subset L^2(\Omega,\mathscr{F},\mathsf{P})$ of truncations which are in $L^2(\mathsf{P}) \subset L^1(\mathsf{P})$ (no point taking an equivalence class of such sequences, I guess, since a good sequence is quite constructive?).

I have no doubt your suggestion is the better formalization strategy. For the purposes of my exercise, however, I will still consider whether it is your suggestion or simply a "defining property" structure that is easier for incremental playing around. In principle, if the second approach would ever reach a sufficiently complete stage, I suppose refactoring towards your suggested approach (or just writing a second version from scratch) might be a way to get to mathlib quality. (Just don't count on me ever getting that far!)

I really appreciate your insight, thank you! Your advice definitely helps me understand good formalization. But given that I'm still very much stuck at the syntax it will take me very very long before I can put that into actual practice.

Kalle Kytölä (Apr 04 2021 at 02:34):

Sebastien Gouezel said:

Kalle Kytölä said:

Nevertheless, I still had trouble using some of the syntax for the existing complete_lattice. For example, I don't know how to get the trivial sigma-algebra $\mathscr{G} = \{ \emptyset , S \}$ as the minimal element of the complete lattice.

The trivial subalgebra is the smallest element in the lattice, so it is called bot (for bottom), and typed as ⊥ (\bot in vscode).

Thank you!

I had actually guessed that bot or ⊥ would be the right thing, but I had not figured out the syntax. With variable (α : Type*) I had tried things like (measurable_space α).⊥ and (measurable_space α).complete_lattice.⊥, which to me seemed like the most likely things that instance : complete_lattice (measurable_space α) might do (well, it doesn't).

The one thing I had not tried was (⊥ : measurable_space α), because that would have been too simple. And of course it not only works :sweat_smile:, but it is probably the most natural syntax a mathematician could hope for...

What I guess I learned (at least until the next time) in this exercise is the usefulness of #check. It is way quicker (although I prefer not to elaborate on how long I spent on, e.g., ⊥) to test with #check than to try to get the syntax right where it is used. So finally, for any other new members, the following is the trivial sigma algebra, and it actually works:

import measure_theory.measurable_space
open measurable_space
variable (α : Type*)
#check (⊥ : measurable_space α)

Kalle Kytölä (Apr 04 2021 at 02:37):

The following is not important at all, but it is somewhat related to the above, and I feel like these sort of things should not be as complicated as I make them look like.

I checked that my one line condition for triviality coincides with that in ⊥ : measurable_space S. What I came up with was not short. Among other things, it takes me three lines to prove that univ : set S is measurable :grimacing: (I have used this a number of times, and it always takes me three lines). There are surely other terrible choices as well. In case someone has hints on more efficient writing, I'll appreciate. However, this is obviously not of any importance (besides my own very very slow learning).

import measure_theory.measurable_space

open set
open measurable_space

def is_trivial_sigmaalg {S : Type*} (Γ : measurable_space S ) : Prop :=
  ∀ (E : set S ) , (Γ.measurable_set' E) → (E = ∅ ∨ E = univ)

lemma is_trivial_sigmaalg_iff_bot {S : Type*} (Γ : measurable_space S ):
  (is_trivial_sigmaalg Γ) ↔ (Γ = ⊥) :=
begin
  split ,
  { intro htriv ,
    suffices : ( ∀ (E : set S) ,
        Γ.measurable_set' E ↔ (⊥ : measurable_space S).measurable_set' E) ,
    { exact ext this , } ,
    intro E ,
    split ,
    { intro hE ,
      apply measurable_set_bot_iff.mpr ,
      exact htriv E hE , } ,
    { intro hE ,
      have key := measurable_set_bot_iff.mp hE ,
      cases key with hEemp hEuniv,
      { rw hEemp ,
        exact measurable_set_empty Γ , } ,
      { rw hEuniv ,
        have mbleempc := Γ.measurable_set_compl ∅ Γ.measurable_set_empty ,
        simp at mbleempc ,
        assumption , } ,
      } ,
    } ,
  { intros hbot E hE ,
    rw hbot at hE ,
    exact measurable_set_bot_iff.mp hE , } ,
end

Kalle Kytölä (Apr 04 2021 at 02:41):

I still have a lot of trivial questions about the syntax, but I will first try to internalize the many very helpful answers to the questions I already posed before posing new ones...

Thank you to all who have helped me so far!

Rémy Degenne (Apr 05 2021 at 15:47):

I started implementing a conditional expectation during the last weeks, not for ennreal functions as in the code suggested by Sebastien, but for functions with values in a Hilbert space. Most of the work is on branch#conditional_expectation (mostly in the measure_theory/conditional_expectation file). It is full of sorry for mostly simple but tedious lemmas and globally a big mess, but a lot of the work is done. (it is also somewhat outdated and I should merge master to use some recent results about bounded functions and Lp)

My strategy is to define the conditional expectation first for L2 as an orthogonal projection to a subspace of L2, then express it as a continuous linear map from the simple functions of L1 to L1 (that it is linear is easy, since I build it by composing three linear maps L1simple -> L2 -> L2 subspace -> L1, but proving the continuity needs work), then extend to a continuous linear map from L1 to L1.
It got more difficult than expected from the very first step: I wanted to write the conditional expectation with respect to sigma-algebra m as the orthogonal projection of functions of L^2 (defined with sigma-algebra m0) to the subspace of m-measurable elements of L^2. But with the objects that we have the 0 element of L^2 may not even be m-measurable, so no subspace there. That 0 is however µ-almost-everywhere equal to an m-measurable function for a measure µ defined on m0, and that is the property that should define the subspace. Proving the completeness of that subspace was something that did not follow that easily from results that we already had, which surprised me.

I will not detail all steps of the construction but here are a few remarks about how the experience went:

our library is written with only one measurable_space structure in mind, which is most often an implicit argument in lemmas. As soon as there are two different ones on the same space the code gets more tedious (and you need to write a lot of @).
there are many different structures involved, which could all represent the same function: simple functions, simple functions seen as elements of L1, L1 functions, L2 functions... the API for converting between those needs more lemmas.
and an important point, not lean-related, which I was not aware of: many things are not as quick to prove in a generic Hilbert space as they are for R. For some results I used the following approach: first prove it for R, then prove it for indicator functions by writing an indicator function as the scalar product of a real function and an element of the space, then go to simple functions and so on. The book "analysis in Banach spaces" (volume 1) was useful.

Mario Carneiro (Apr 05 2021 at 17:05):

Regarding the first point, if you want to work with multiple measures you can, the type measure X is there for this. If you want multiple measurable spaces this is less common but should still be possible. Do you have any particular examples in mind?

Rémy Degenne (Apr 05 2021 at 17:32):

It is indeed possible to have multiple measurable_space and that is what I did. I don't really have a problem there, and the main point of my message is just to let @Kalle Kytölä know that I am also working on this.

The library is simply less convenient if there are several measurable_space on a same space.
The issue is that since every lemma in the library uses [measurable_space α] and not an explicit argument you will need to use @ if you have two and you want to use the one that is not guessed. In the particular example of the conditional expectation for functions in L1: those (equivalence classes of) functions are defined with respect to a measure µ on a m0 : measurable_space α, and the conditional expectation with respect to m <= m0 is an m-measurable function. There are naturally two measurable_space around, m0 and m. Then if I want to say that a function f is m0-measurable I may write measurable fbut to write the same thing with m I have to use @measurable _ _ m _ f. And for some more involved statements the number of _ may be large. One of my proofs is full of _ _ _ _ _.

So working with several measurable_space is totally feasible, it is just not very convenient. As a last remark: until I started working on the conditional expectation, not having to type the name of the measurable_space everywhere was great.

Mario Carneiro (Apr 05 2021 at 18:15):

One trick that is used in the topology library when dealing with e.g. continuous functions with multiple topologies on the same space is to define a local notation like notation `measurable'` m f := @measurable _ _ m _ f so that you can supply it when you need to, and use measurable when the default is fine

Mario Carneiro (Apr 05 2021 at 18:16):

for example src#continuous_iff_coinduced_le

Rémy Degenne (Apr 05 2021 at 18:24):

Thanks, I did not know that trick. That can indeed be useful. Although looking at my code, the lemmas that produce the _ _ _ _ are quite diverse. At least, @measurable appears several times and will most probably get a notation. Thanks!

Kalle Kytölä (Apr 05 2021 at 18:57):

Hello @Rémy Degenne!

Thank you for letting me know! I am excited to hear that someone serious is working on this!

I found the branch and file you mentioned, thanks! In principle I would like to be able to offer help, but unfortunately I'm not yet at the level to be able to do so, and I also only have time very irregularly. I will instead try to follow what you do; I will almost surely learn various things from that. Meanwhile I'll probably play a bit more with my own exercise also just to learn a bit more.

Nice to hear that you also chose the $L^2$ -approach to conditional expected values. I hardly even got started so far, but a number of the things you mention corresponded to my observations as well. Multiple measurable spaces simultaneously on the same type were ultimately not that bad (except for lots of @s and learning syntax). But exactly as you say, the inclusions among the spaces of suitable functions were not easy. In particular, while it is quite straightforward (with lots of @s) to define $L^2(\Omega,\mathscr{G},\mathsf{P}|_{\mathscr{G}})$ , I did not yet manage to realize this as a closed subspace of $L^2(\Omega,\mathscr{F},\mathsf{P})$ , so I did not get to use any orthogonal projections yet. As you say, more measure theory lemmas for embedding functions from various spaces to each other would seem useful.

Thanks again for letting me know, and very glad to hear you're working on this topic!

Last updated: May 02 2025 at 03:31 UTC