Zulip Chat Archive

I'd be interested in formalising a few foundational results that are critical for the analysis of randomised algorithms (the randomness and computing course was certainly one of my favourite courses in my PhD).

1) how much preliminary material would need to be formalised to be able to state and prove concentration inequalities like the Azuma-hoeffding bound (I've played around with PMF and I'm only marginally less efficient than I was in Rocq)

2) if it does seem feasible to do right now, where should this live? any particular sublibraries of mathlib it could be added to? or cslib?

3) are there any particular textbooks/writeups of the proofs of these bounds that would follow the established definitions in mathlib the closest, that I could use as a reference for trying to formalise these bounds?

Yaël Dillies (Aug 15 2025 at 14:20):

Azuma-Hoeffding is either done or doable. In general, basically all concentration inequalities are within scope right now. I've done a few myself to formalise my PhD

Yaël Dillies (Aug 15 2025 at 14:21):

Such results should definitely live in mathlib, but I am not aware of any write-up that follows particularly closely mathlib's conventions

Kiran (Aug 15 2025 at 14:28):

awesome! it's good to know that they're now within scope! for the ones you've formalised, where did they end up living in mathlib?

Orthogonal question, is there an established definition for random processes (discrete time I suppose) in mathlib? I'd default to functions of Nat -> PMF A, but given my unfamiliarity with the conventions in mathlib, maybe there are other representations that might interface better with the existing definitions etc.

Rémy Degenne (Aug 15 2025 at 14:29):

We have a sub-Gaussian version of Azuma-Hoeffding: ProbabilityTheory.measure_sum_ge_le_of_HasCondSubgaussianMGF

Rémy Degenne (Aug 15 2025 at 14:31):

For how to work with stochastic processes, you can look at the files in the Probability/Process folder. For example https://leanprover-community.github.io/mathlib4_docs/Mathlib/Probability/Process/FiniteDimensionalLaws.html

Etienne Marion (Aug 15 2025 at 14:32):

A general random process is written as a function X : ι → Ω → 𝓧 where ι is a general index type, Ω is a measurable space equipped with a probability measure, and 𝓧 is measurable space.

Rémy Degenne (Aug 15 2025 at 14:34):

None of those files use PMF. They use the Measure type for probability measures. On this point, you can read the blog post https://leanprover-community.github.io/blog/posts/basic-probability-in-mathlib/

Kiran (Aug 15 2025 at 14:34):

A general random process is written as a function X : ι → Ω → 𝓧 where ι is a general index type, Ω is a measurable space equipped with a probability measure, and 𝓧 is measurable space.

Aha, I see I see, yes, that makes sense! I'm glad I asked! I wasn't sure of the right level of generality to pick there.

Rémy Degenne (Aug 15 2025 at 14:40):

Azuma-Hoeffding is usually written for conditionally bounded random variables. We have Hoeffding's lemma (ProbabilityTheory.hasSubgaussianMGF_of_mem_Icc) to show that a bounded random variable is sub-Gaussian. We might need to add a conditional version of that to get Azuma-Hoeffding for conditionally bounded from our current result for sub-Gaussian.

Kiran (Aug 15 2025 at 14:45):

mhhm I see, I see. I think I understand. For context, I've mostly seen these bounds in the context of analysis of randomised algorithms (and it's been a while so apologies if my definitions are a little rusty right now), but the form that I've seen them is as follows (from Mitzenmacher & Upfal):

Screenshot 2025-08-15 at 9.44.52 AM.png

and

Screenshot 2025-08-15 at 9.45.15 AM.png

Would these be derivable from the definitions in mathlib?

Kiran (Aug 15 2025 at 14:49):

Thank you for the link about the encoding of probability in mathlib. That's pretty helpful. I think I'm realising it might be worthwhile also making a separate thread in the computer science topic about notations and an instantiation of mathlib's probability classes more specialised for analysing discrete computations in something like cslib

Rémy Degenne (Aug 15 2025 at 14:53):

The martingale definition is this: MeasureTheory.Martingale

The theorem you linked is not in Mathlib yet. We have a version of it with the bounded assumption replaced by conditionally sub-Gaussian (ProbabilityTheory.measure_sum_ge_le_of_HasCondSubgaussianMGF) and we have the fact that bounded implies sub-Gaussian (ProbabilityTheory.hasSubgaussianMGF_of_mem_Icc), so it should not be very hard to prove.

Kiran (Aug 15 2025 at 14:56):

Ah, awesome! okay thanks for explaining this all to me! Cool cool, in that case, I'll try playing around with it and getting the theorem in the form that I'm more familiar with by composing the theorems you mentioned!

Last updated: Dec 20 2025 at 21:32 UTC

leanprover-community / mathlib