Zulip Chat Archive

Stream: Is there code for X?

Topic: Brownian Motion

Olga Ponomarenko (Sep 17 2024 at 21:34):

I'm arriving a bit late here, but have there been further updates to the potential ProbabilitySpace class? I just want to define Brownian motion :crying_cat:

Notification Bot (Sep 17 2024 at 23:01):

A message was moved here from #Is there code for X? > Should there be a ProbabilitySpace class? by Jason Rute.

Jason Rute (Sep 17 2024 at 23:05):

Hi Olga, I moved your message to a new topic. (I'm not sure if the ProbabilitySpace class is the barrier to defining Brownian Motion.) I also want to link your two PA.SX questions:

Maybe someone more knowledgable like @Jason KY. or @Rémy Degenne can answer.

Jason Rute (Sep 17 2024 at 23:11):

But I think it would also be good to explain why you are looking for Brownian motion in Lean (and that particular definition of it that you gave on PA.SX). If you are new to Lean, I think probability theory might be a tough place to jump in as it gets very subtle. Moreover, Brownian motion depends on a lot of things so it is more likely basic facts about say the normal distribution or about continuous processes have not yet been formalized. (I haven't formalized any probability myself, but I've been around others at IBM working on formalizing it in Coq, and my PhD work was very probability adjacent.) Nonetheless, if you give us some more details about your motivations one of us might be able to help direct you.

Rémy Degenne (Sep 18 2024 at 05:59):

@Peter Pfaffelhuber is working on defining the Brownian motion. Jason is right that there are missing prerequisites (about characteristic functions and Gaussian measures notably, as well as the Kolmogorov-Chentsov theorem), and Peter is working on adding those. I agree with Jason that the existence or not of a ProbabilitySpace class is not very relevant.

Rémy Degenne (Sep 18 2024 at 06:25):

Also if you want to work on things for which you need a Brownian motion, you can just assume it exists. Create a def for it with sorry, write lemmas with proofs sorry for its main properties, and go on from there.

Peter Pfaffelhuber (Sep 18 2024 at 11:51):

Hi Olga! Yes, we (i.e. @Jakob Stiefel and myself) are working in this. We are following
this route. Jakob has finished Proposition 1.3 (no PR yet), but Kolmogorov-Chentsov is completely open. Could you write a bit about your plans?

Jason Rute (Sep 18 2024 at 14:13):

@Olga Ponomarenko I realize we haven’t been tagging you in this thread. Please see the above conversation.

Jason Rute (Sep 18 2024 at 14:15):

Also, for other people here, is it even possible to define the rest of @Olga Ponomarenko’s theorem (in her PA.SX post mentioned above) besides the Brownian motion part? Or are there definitions missing there such as “normally distributed”.

Rémy Degenne (Sep 18 2024 at 14:25):

docs#ProbabilityTheory.gaussianReal is the normal distribution on R. "Normally distributed with mean m and variance v" would be expressed by saying that the law is equal to gaussianReal x v. Independence of a family of random variables is docs#ProbabilityTheory.iIndepFun

Peter Pfaffelhuber (Sep 18 2024 at 14:28):

However, I think there is no multi-dimensional part as far as I can see. In a definition of Brownian Motion, I would state that $X$ is a Brownian Motion iff $X_0=0$ , it has continuous paths ae and its distribution is such that $(X_{t_1},...X_{t_n})$ (for $0 \leq t_1 <... < t_n$ ) is Gaussian with mean $0$ and covariance matrix $C$ with $C_{ij} = t_i \wedge t_j$ . The independence of increments then follows. In this way it is actually easy to show that the family of finite-dimensional distributions is projective.

Rémy Degenne (Sep 18 2024 at 14:29):

That's right. The only Gaussian we have for now is on the real line. Nothing multi-dimensional.

Peter Pfaffelhuber (Sep 18 2024 at 14:30):

As for stochastic calculus, I think you can easily start to define the discrete stochastic integral and make use of the martingale results, which could be nice. For continuous stochastic calculus, I think too much is missing at the moment (apart from the less interesting cases of Stieltjes integrals if the integrator has bounded variation).

Kevin Buzzard (Sep 18 2024 at 14:40):

Rémy Degenne said:

That's right. The only Gaussian we have for now is on the real line. Nothing multi-dimensional.

Martin Hairer has stressed the importance of formalising the Gaussian in the sense that basically I can't get him to engage with mathlib until we've done it :-) @Oliver Nash I think has a good understanding of the generality in which Hairer thinks we should be working (in the sense that I once watched them talking about it but I was completely lost).

Peter Pfaffelhuber (Sep 18 2024 at 14:46):

@Kevin Buzzard Thanks for the motivation! My guess that it boils down to the generality by which we define Gaussian distributions. Right now we only have them as distributions on $\R^1$ , for regular Brownian Motion we would need them on $\R^d$ , but it would of course be more general to have them on some Banach space (needed e.g. to have SPDEs later on.)

Vincent Beffara (Sep 18 2024 at 14:49):

Something like this https://en.wikipedia.org/wiki/Abstract_Wiener_space

Oliver Nash (Sep 18 2024 at 21:18):

Despite Kevin's generous remarks above, I know hardly anything about this beautiful area.

I think the main thing to stress is what Kevin has already said: a Fields medalist openly declared that their interest in Mathlib is contingent upon it having a definition of Gaussian measures.

I remember looking at Mathlib shortly after chatting with Hairer to see if we could define these and we couldn't quite but looking now, it seems like we can:

import Mathlib

open MeasureTheory ProbabilityTheory

variable {E : Type*} [AddCommGroup E] [Module ℝ E]
  [TopologicalSpace E] [TopologicalAddGroup E] [ContinuousSMul ℝ E]
  [MeasurableSpace E] [BorelSpace E]

/-- Definition of Gaussian measure.

Copied from: https://mathoverflow.net/questions/407465/ -/
def Measure.IsGaussian (μ : Measure E) [IsProbabilityMeasure μ] : Prop :=
  ∀ f : E →L[ℝ] ℝ, ∃ m v, μ.map f = gaussianReal m v

/- TODO
1. Show Dirac measures are Gaussian
2. Show finite-dimensional measures defined via usual exponential formula are Gaussians
3. Construct the Wiener measure on `C(Icc a b, ℝ)` and prove it is a Gaussian
...
-/

Olga Ponomarenko (Sep 18 2024 at 21:52):

Following the Lean tutorial, I'm definitely the "try to launch the dishwasher by randomly pressing some buttons". I was hopeful that it would be "straightforward" to write out all five proofs of the Black-Scholes formula (and some other more interesting models like SABR), but it looks like I need to dig deeper. Which is also great, especially given all this other interest in the topic! (Also, I'm just a practitioner, but happy to be helpful, I'll reach out to @Peter Pfaffelhuber )

Dominic Steinitz (Sep 19 2024 at 08:33):

Practitioner (former) here also :smile: This is an area that could really stand some formalisation. I just picked up my old copy of Protter (Stochastic Integration and Differential Equations) and was amused at how many notes I had written in the margin.

My practitioner experience was writing a (first-order) functional language in which one could express exotic financial derivatives and generate e.g. prices via martingales and Monte Carlo / PDEs, schedules of payments etc. We could even generate latex (which the risk folks seemed to prefer to a formal language description).

Kexing Ying (Sep 19 2024 at 09:20):

Martin has some really nice notes (Chap. 4 here) on how to define general Gaussians. What's quite nice is that the existence of Brownian motion can obtained as a corollary of the existence of Gaussian measures. Although one will need Kolmogorov extension either way.

Violeta Hernández (Sep 21 2024 at 09:15):

We don't have Kolgomorov yet? That's a project I could get behind!

Etienne Marion (Sep 21 2024 at 09:23):

Kolmogorov is at https://github.com/RemyDegenne/kolmogorov_extension4, and only needs to make its way to Mathlib.

Rémy Degenne (Sep 21 2024 at 09:39):

For the Brownian motion definition with the proof Peter mentioned you need two theorems named after Kolmogorov: the extension theorem which is in that repository, and the continuity theorem (also named Kolmogorov-Chentsov) which has not been formalized yet.

Dominic Steinitz (Sep 22 2024 at 07:55):

There is a variety of constructions of Brownian motion in the literature. The approach we have
followed goes back to one of the great pioneers of Brownian motion, the French mathematician
Paul Lévy. Lévy’s construction has the advantage that continuity properties of
Brownian motion can be obtained from the construction. An alternative is to first show that a
Markov process with the correct transition probabilities can be constructed, and then to use an
abstract criterion, like Kolmogorov’s criterion for the existence of a continuous version of the
process.

Why wouldn't one use Lévy rather than Kolmogorov (or indeed one of the other methods)?

Peter Pfaffelhuber (Sep 22 2024 at 19:15):

Dominic schrieb:

Why wouldn't one use Lévy rather than Kolmogorov (or indeed one of the other methods)?

For me, it seems that the Kolmogorov-Chentsov criterion will be applicable to other scenarios as well. In addition, defining the approximating linearly interpolating random walks in Lévy's construction isn't even easy on paper, if done properly. I see no reasons why this could not be done in Lean, so feel free to start a project.

Vincent Beffara (Sep 22 2024 at 20:19):

It's true that the dyadic piecewise linear interpolation of Brownian motion is a bit too specific, and that showing that the covariance is the right one is a little tricky if you want to do it right. But OTOH the definition itself uses only the Borel-Cantelli lemma and the completeness of C(I,R) which are both in Mathlib, so it would likely be the quickest approach.

Etienne Marion (Sep 22 2024 at 21:07):

Doesn’t it also require an infinite family of independent gaussian variables?

Thomas Zhu (Sep 25 2024 at 22:30):

I had briefly thought about how to prove Kolmogorov–Chentsov in Lean many months ago, and had a rough plan but I didn't continue working on it. There's basically only one proof strategy, namely reasoning about integer multiples of 2^n and using a lot of floor functions, which is kind of an engineering work in Lean. We may also want to aim for the more general Kolmogorov–Chentsov on R^d instead of R, if there is interest (so we need to reason about dyadic points in R^d, perhaps ZSpan)

Edit: I'm so sorry, I didn't see the full message above. It seems this is already worked on now.

Thomas Zhu (Sep 26 2024 at 03:28):

Peter Pfaffelhuber said:

Hi Olga! Yes, we (i.e. Jakob Stiefel and myself) are working in this. We are following
this route. Jakob has finished Proposition 1.3 (no PR yet), but Kolmogorov-Chentsov is completely open. Could you write a bit about your plans?

This is very great news. I fully agree with the more general epsilon-cover route. @Peter Pfaffelhuber What is the status of this project? Is it actively being worked on now?

Dominic Steinitz (Oct 08 2024 at 07:48):

Etienne Marion said:

Doesn’t it also require an infinite family of independent gaussian variables?

Can it be done like this?

theorem BrowianExistence
    {m : ∀ x, MeasurableSpace ℝ}
    {f : ∀ i, Ω → ℝ} (hf_Indep : iIndepFun m f μ) (h_Normal : Measure.map (f i) μ = gaussianReal 0 1)
    {i j : Ω -> ℝ} (hij : i ≠ j) :
    IndepFun (f i) (f j) μ := sorry

Ignore the conclusion (IndepFun (f i) (f j) μ) - I just put that there to make lean happy (in Haskell I would write undefined or even _ which would then tell me which type was required). My next task is to find out how to assert the existence of a continuous function in lean :smile:

Vincent Beffara (Oct 08 2024 at 08:02):

Taking a family of iid Gaussians as an input makes sense, you can always plug them in later to prove existence. But then you are not proving the existence of Brownian motion, rather you are constructing Brownian motion from this input, so you are in fact writing a def.

Vincent Beffara (Oct 08 2024 at 08:04):

(Also, you can write theorem bla (whatever) (whatever else) : True := sorry if you want to make it clear that the conclusion is just a placeholder.)

Dominic Steinitz (Oct 08 2024 at 08:13):

I am happy to be roughly on the right lines.

I thought math lib had a construction for independent variables (at least I saw Chapter 4 of Williams' excellent book referenced)? Once I have my proof that Brownian motion exists given that a (countably) infinite supply of normally distributed random variables exist then I can just plug in this fact? So maybe what I have written is a lemma :wink:.

Etienne Marion (Oct 08 2024 at 08:19):

Infinite independent families of random variables are not in Mathlib yet, unless I am much mistaken. I formalized infinite products of probability measures at https://github.com/sgouezel/kolmogorov_extension4 depending on https://github.com/RemyDegenne/kolmogorov_extension4, but these have to make their way to Mathlib.

Vincent Beffara (Oct 08 2024 at 08:20):

What I'm saying is that your are proving more than existence, you are providing a construction i.e. "data" which might turn out to be more useful than just existence. And then you can if you want write a separate existence lemma with a proof like obtain ... := existence_of_iid followed by applying your def to get a witness.

Dominic Steinitz (Oct 08 2024 at 08:27):

Vincent Beffara said:

What I'm saying is that your are proving more than existence, you are providing a construction i.e. "data" which might turn out to be more useful than just existence. And then you can if you want write a separate existence lemma with a proof like obtain ... := existence_of_iid followed by applying your def to get a witness.

Oh ok - so a constructive proof should be stated as a definition? Thank you.

Vincent Beffara (Oct 08 2024 at 08:34):

I mean, you can always hide the construction in the depth of the proof of a theorem that says "there exists a measure such that blah blah happens" if you want. And then it's not a def it's a theorem even though the proof is constructive. But because of proof irrelevance there will be no way to get access to the construction, and if you want to sample a Brownian motion later you will need to use choose. (Which may very well be fine.)

I would write the definition separately, if only to be able to show properties as separate lemmas for better code readability.

Bjørn Kjos-Hanssen (Oct 08 2024 at 22:09):

Etienne Marion said:

Infinite independent families of random variables are not in Mathlib yet, unless I am much mistaken. I formalized infinite products of probability measures at https://github.com/sgouezel/kolmogorov_extension4 depending on https://github.com/RemyDegenne/kolmogorov_extension4, but these have to make their way to Mathlib.

Awesome, I've been looking for a workable definition of the fair-coin measure on ℕ → Bool.
I was able to prove that the probability μ {A | ∀ k : Fin s, A k.1 = b k} = (1/2)^s using productMeasure_boxes from your project, very nice library.

Dominic Steinitz (Oct 11 2024 at 13:09):

image.png

The Levy approximation on ((D 10) : List ℚ) where

-- The dyadic points
def D {α : Type} [LinearOrderedField α] (n : ℕ) : List α :=
  List.range (2^n + 1) |>.map (λ k => k / (2^n : α))

Vincent Beffara (Oct 11 2024 at 13:53):

This looks wrong, you shouldn't be able to see the dyadic nature of the approximation that clearly on the picture. There are 16 clearly separated local minima here... plus the graph is almost symmetric across x=1/2.

Dominic Steinitz (Oct 11 2024 at 14:05):

Vincent Beffara said:

This looks wrong, you shouldn't be able to see the dyadic nature of the approximation that clearly on the picture. There are 16 clearly separated local minima here... plus the graph is almost symmetric across x=1/2.

That's very useful - I will investigate

Peter Pfaffelhuber (Mar 21 2025 at 10:04):

My goal for 2025 is to formalize Brownian Motion.

Together with @Rémy Degenne we have formalized the Kolmogorov Extension Theorem. I am still on my way of improving some of the results and slowly PRing them. @Jakob Stiefel is about to finish PRing that characteristic functions uniquely characterize measures., which will be helpful in other places as well.

The main next steps will be (see also the outline):

Define the finite-dimensional distributions. (Here we will need the characteristic functions.)
Show the projective property of the latter. (Again, using characteristic functions.)
Implementing the Kolmogorov-Chentson criterion, which will become computation-heavy, and which should be more generat than taking an interval as the set of possible times.
Bringing everything together.

Any comments and suggestions are welcome!

Sébastien Gouëzel (Mar 21 2025 at 10:08):

In what kind of generality are you planning to do things? If you can organize things so that it's possible to reuse some parts when doing, say, Lévy processes, or Brownian motion on a Euclidean space (or even a Riemannian manifold), that would be awesome!

Peter Pfaffelhuber (Mar 21 2025 at 10:15):

The Kolmogorov-Chentsov Theorem is general anyway, but only gives finite-dimensional distributions. The main point when generalizing to Lévy processes will be to define the finite dimensional distributions, and show their projectivity. Since they are rarely continuous, one would need to start working on càdlàg functions, tightness on this space, and so on...
For The Kolmogorov-Chentsov, I plan to work with a space, which locally can be covered with $O(\varepsilon^{-d})$ balls as $\varepsilon \to 0$ , which should cover the cases you mention.

Yan Yablonovskiy (Mar 21 2025 at 11:10):

(deleted)

Dominic Steinitz (Mar 21 2025 at 11:13):

Dominic Steinitz said:

Vincent Beffara said:

This looks wrong, you shouldn't be able to see the dyadic nature of the approximation that clearly on the picture. There are 16 clearly separated local minima here... plus the graph is almost symmetric across x=1/2.

That's very useful - I will investigate

Some time ago I did investigate and it was indeed wrong (my supply of IID RVs had run out if that makes any sense). I still think there is some value in reproducing Levy's construction but I don't either @Olga Ponomarenko or I are going to be doing anything on it in the foreseeable future. Anyway I think the relationship between measures and characteristic functions will be useful here so keep up the good work :clap:

Dimitri Vulis (Apr 05 2025 at 16:21):

Olga Ponomarenko said:

I was hopeful that it would be "straightforward" to write out all five proofs of the Black-Scholes formula (and some other more interesting models like SABR), but it looks like I need to dig deeper. Which is also great, especially given all this other interest in the topic! )

I gave the following prompt: "Write a proof of Black-Scholes option pricing formula in Lean language using Itô's lemma. Clearly identify all the assumptions." to Google Bard (generative AI), whose output had many gaps (where the AI said "sorry"), but was mostly reasonable overall. It's too long to quote here, but I posted on Stackexchange. An interesting project, for someone looking for low hanging fruit in quantitative finance, would be to start with a book that has rigorous proofs (for example, Steven Shreve, Stochastic Calculus for Finance II: Continuous-Time Models, which I used to teach an undergraduate class) and try to work on the various theorems and little lemmas that lead up to Black-Scholes proofs and other more complicated results.

Last updated: May 02 2025 at 03:31 UTC