Zulip Chat Archive

Stream: Brownian motion

Topic: Second phase of the project


Rémy Degenne (Jul 29 2025 at 07:44):

We have reached the goal of the project, which was to define a Brownian motion. Now we are moving it to Mathlib, and we will also need to write a paper about it, but the "proving theorems in Lean" part is done.
Several people have expressed to me their willingness to contribute and their surprise that the project was done so quickly and that they did not have time to join. These people are not busy with making PRs to Mathlib and are presumably still looking for a project, so I propose that we start a second phase of the project, which will be slower, less guided but more collaborative.

Let me first reflect on how the project worked in its first phase

  • with Peter, we discussed the goal of the project and the path we should take towards that goal 2 years ago. Peter wrote a short document to detail the proof. It was meant as our private formalization project but we had other commitments and it did not move forward.
  • 3 weeks before the project became public, we decided to open it to external collaboration and in preparation I started writing a detailed blueprint (around 40 pages): first list every lemma and link them correctly in the graph (with proofs for most of them), then write all the Lean definitions, and finally for most lemmas on the Kolmogorov-Chentsov side I wrote Lean statements (by copying the latex into a docstring, waiting for copilot to propose a statement, then fixing it... it was not perfect, but I needed it to not take ages). The Gaussian side was less detailed because I expected people to be familiar with the standard objects used there. That was not a full time job for 3 weeks, but I spend a few hours whenever I could to advance the blueprint.
  • after the start of the main phase of the project, the Lean formalization proceeded quickly thanks to an amazing dedication of several contributors, and in 3 weeks we were done.

My main point is that the preparation took as long as the project, and I presume it's mostly because it was not collaborative. Perhaps the blueprint was too detailed and I could have gone faster, perhaps the public phase of the project should have started while the blueprint was in construction. The bottom line is that I am not willing to write a second blueprint document for a next phase of the project right now, but I'd like to try a more experimental and collaborative way to use the leanblueprint framework.

Next phase of the project

I propose we target Itô's lemma for the stochastic integral (for which we don't have a definition yet). It's theorem 18.18 in Kallenberg's Foundations of Modern Probability (3rd ed.). We will need several new definitions and many intermediate results. In practice that means that we need to:

  • discuss that goal. Perhaps there is a better goal for the next phase of the project.
  • find the key results that we need on the way
  • start new sections of the blueprint. First write the big theorems and definitions, then add intermediate lemmas as needed. It's important to add links for the dependency graph. Working on the blueprint is the same as working on the code: just fork the repo and open a PR.
  • discuss and add Lean definitions.
  • prove everything.

A last point: during the first phase, I intentionally used a collaboration model in which people coordinate by talking in a zulip topic to know what's claimed and what remains to be done. That's suitable for a small number of collaborators who follow the project closely and needs the continuous involvement of someone who posts updated claim lists, which was the contribution model I was looking for. However it's not great for new people who are just looking for something small to prove. For a longer, more open project, we might need to switch to github issues.

What do you think? Does that sound like a nice project?

Etienne Marion (Jul 29 2025 at 08:19):

I think it's a great goal!

Rémy Degenne (Aug 02 2025 at 09:17):

I added a rough outline of what we'll need in this new chapter: https://remydegenne.github.io/brownian-motion/blueprint/sect0002.html
Feel free to pick a part of that chapter and make a PR to expand it. Note that I am not an expect in that field, so don't hesitate to correct errors, suggest a better approach, or do any other required change.
For now the chapter contains only a list of definitions (some of which are of the form "there exists a process such that" and require big theorems). All of that needs proofs, and lemmas extracted from those proofs, until we have many small results, each of which could correspond to one Lean def/lemma.

A concrete and relatively simple result that one could already prove is Doob's Lp inequality for non-negative sub-martingales, which should be a consequence of this: MeasureTheory.maximal_ineq. I thought we proved it years ago with @Kexing Ying , but apparently not.

Something else we need to figure out is Stieltjes integrals (https://en.wikipedia.org/wiki/Lebesgue%E2%80%93Stieltjes_integration, https://dec41.user.srcf.net/h/III_L/stochastic_calculus_and_applications/1). In Mathlib we have the fact that a monotone right continuous function defines a measure (StieltjesFunction.measure), so we can integrate against those. We'll also need to get a signed measure from a function of locally bounded variation. We know from LocallyBoundedVariationOn.exists_monotoneOn_sub_monotoneOn that such a function is a difference of monotone functions. If we add some continuity properties to that we could define the signed measure as the difference of the two Stieltjes measures of the monotone functions. And then there is the issue that we don't have a definition of an integral against a signed measure.

Yongxi Lin (Aaron) (Aug 02 2025 at 13:13):

I want to try section 7.2 (Theorem 7.4, Definition 7.5, and Theorem 7.6).

Rémy Degenne (Aug 02 2025 at 14:10):

Great! In the beginning of that second phase, I think the focus should be on generating many lemmas and Lean statements with sorry proofs, to enable other people to contribute. So it would be great if you could add details to the blueprint and formalize statements and proofs at a high level, leaving smaller lemmas to be proved by others. The idea is to be as collaborative as possible.

About the mathematical content: the blueprint is written for martingales in R right now because my sources suppose that (this page in particular), but I guess it should work fine in a Hilbert space (?). The norm of the martingale is a non-negative sub-martingale and Doob's Lp inequality will be applied to the norm.

Yongxi Lin (Aaron) (Aug 02 2025 at 17:11):

I see, so I guess I'll start with writing an outline of the proof in the blueprint for section 7.2. I'll find some references for Doob's Lp inequality in Hilbert space.

Rémy Degenne (Aug 02 2025 at 17:18):

No I don't mean that the Lp inequality should be in a Hilbert space, I'm talking about the space of square integrable martingales. The Lp inequality should be about real nonnegative sub-martingales : we would apply the Lp inequality to the norm of the martingale.

Yongxi Lin (Aaron) (Aug 02 2025 at 17:25):

Yes, this is what I meant. Sorry for not being clear earlier.

Kexing Ying (Aug 03 2025 at 12:15):

It seems that I only did the maximal inequality for discrete time since we didn't have cadlag (in the case right continuity seems to be enough). So theres some work to be done there

Kexing Ying (Aug 03 2025 at 13:47):

For the integral against a signed measure, one can probably define it like this:

Kexing Ying (Aug 03 2025 at 13:47):

noncomputable def SignedMeasure.integral {α : Type*} {_ : MeasurableSpace α}
    (σ : SignedMeasure α) (f : α  ) :  :=
   x, f x σ.toJordanDecomposition.posPart -  x, f x σ.toJordanDecomposition.negPart

Kexing Ying (Aug 03 2025 at 13:49):

This is well defined since the p and q obtained by exists_monotoneOn_sub_monotoneOn will be such thatStieltjesFunction.measure p is mutually singingular wrt StieltjesFunction.measure q and that the Jordan decomposition is unique

Sébastien Gouëzel (Aug 03 2025 at 13:55):

Please don't. Signed measure are just a special case of vector valued measures (just like complex measures). Having a general theory seems to me to be a much better idea than using hacks for the real case and the complex case.

Yoh Tanimoto (Aug 03 2025 at 14:06):

With @Oliver Butterley I was thinking how to implement integrals of complex measures, and I suspected that it would be better to define it for general VectorMeasure, but it would be a considerable work. Is there anyone working on it?

Etienne Marion (Aug 03 2025 at 14:09):

I am not aware of anything other than https://leanprover.zulipchat.com/#narrow/stream/116395-maths/topic/Integral.20with.20respect.20to.20a.20complex.20measure/near/523498031.

Etienne Marion (Aug 03 2025 at 14:13):

I guess it would be nice to make this part of the blueprint to ease collaboration.

Yongxi Lin (Aaron) (Aug 03 2025 at 15:04):

I have seen (extended) signed measures defined as a function taking values in EReal instead of R\mathbb{R}. In this case, I think we should use the Jordan decomposition to define it.

Lorenzo Luccioli (Aug 03 2025 at 23:06):

Hello,

@Alessio Rondelli, @Pietro Monticone, and I started a private formalization project around two months ago focused on the definition of the stochastic integral. We plan to open a PR in the next few days to move our work into the Brownian Motion project.

We followed a different reference book: Andrea Pascucci, Probability Theory II (ISBN 978-3-031-63193-1), which is very detailed and might be useful for other parts of the project as well.

Our current material includes:

  • a blueprint for the Debùt theorem (quite advanced, though not complete)

  • formalization of most statements related to the Debùt theorem, with some easy proofs

  • an almost complete blueprint for the Doob-Meyer theorem (not covered in Kallenberg, but can serve as a drop-in replacement to prove the existence of the quadratic variation, Definition 7.11 in the blueprint)

  • a very preliminary blueprint for Doob’s L^p inequality

Rémy Degenne (Aug 04 2025 at 06:36):

That's great! All those things will be nice to have and it's great to see more people involved in the project!

The Doob-Meyer theorem is covered by Kallenberg in chapter 10. It's just not used to define the integral in chapter 18 (although he mentions that it could be used as an alternative path). But actually I find that chapter 18 is not the clearest chapter in the book, hence why I referred also to some other lecture notes. Thanks for pointing out that other book!

Rémy Degenne (Aug 30 2025 at 11:56):

I see that Lorenzo Luccioli and Alessio Rondelli have opened several PRs for this blueprint material. That's great! I'll review them next week.

Peter Pfaffelhuber (Sep 09 2025 at 19:57):

Being at a conference, I have an idea for a Lean probability project, which might be mathematically interesting and also provides something new on the mathematical side:
In stochastic processes, people often say that "All stochastic processes are Markov, if you enlarge the state space enough." (See e.g. here). I think, this can be proved using Mathlib, based on our Kolmogorov Extension theorem. The main point is that our extension theorem gives a Measure (Π i, α i), which is a measure on a dependent type. So, for the corresponding stochastic process YY, this means that the state space of YtY_t might depend on tt. Assume we are given a stochastic process XX with (complete and separable) state space EE. Then, I can construct a process YY with Yt=(Xs)stY_t = (X_s)_{s\le t} (which has type (Icc 0 t) \to E at time tt. Apparently, the sigma-algebra generated by (Xs)st(X_s)_{s\le t} is the same as the sigma-algebra generated by YtY_t, therefore, P(Yt+s.(Yr)rt)=P(Yt+s.Yt)P(Y_{t+s} \in . | (Y_r)_{r\le t}) = P(Y_{t+s} \in . | Y_t), which is the Markov property for YY.
So: I claim that we have indeed extended the extension theorem in a useful way, since we are able to construct stochastic processes where the state space depends on time. The above argument proves a "well-known" saying in probability, where no real proof exists.


Last updated: Dec 20 2025 at 21:32 UTC