Zulip Chat Archive
Stream: new members
Topic: How difficult would it be to formalise Azuma-hoeffding?
Kiran (Aug 15 2025 at 14:15):
I'd be interested in formalising a few foundational results that are critical for the analysis of randomised algorithms (the randomness and computing course was certainly one of my favourite courses in my PhD).
1) how much preliminary material would need to be formalised to be able to state and prove concentration inequalities like the Azuma-hoeffding bound (I've played around with PMF and I'm only marginally less efficient than I was in Rocq)
2) if it does seem feasible to do right now, where should this live? any particular sublibraries of mathlib it could be added to? or cslib?
3) are there any particular textbooks/writeups of the proofs of these bounds that would follow the established definitions in mathlib the closest, that I could use as a reference for trying to formalise these bounds?
Yaël Dillies (Aug 15 2025 at 14:20):
Azuma-Hoeffding is either done or doable. In general, basically all concentration inequalities are within scope right now. I've done a few myself to formalise my PhD
Yaël Dillies (Aug 15 2025 at 14:21):
Such results should definitely live in mathlib, but I am not aware of any write-up that follows particularly closely mathlib's conventions
Kiran (Aug 15 2025 at 14:28):
awesome! it's good to know that they're now within scope! for the ones you've formalised, where did they end up living in mathlib?
Orthogonal question, is there an established definition for random processes (discrete time I suppose) in mathlib? I'd default to functions of Nat -> PMF A, but given my unfamiliarity with the conventions in mathlib, maybe there are other representations that might interface better with the existing definitions etc.
Rémy Degenne (Aug 15 2025 at 14:29):
We have a sub-Gaussian version of Azuma-Hoeffding: ProbabilityTheory.measure_sum_ge_le_of_HasCondSubgaussianMGF
Rémy Degenne (Aug 15 2025 at 14:31):
For how to work with stochastic processes, you can look at the files in the Probability/Process folder. For example https://leanprover-community.github.io/mathlib4_docs/Mathlib/Probability/Process/FiniteDimensionalLaws.html
Etienne Marion (Aug 15 2025 at 14:32):
A general random process is written as a function X : ι → Ω → 𝓧 where ι is a general index type, Ω is a measurable space equipped with a probability measure, and 𝓧 is measurable space.
Rémy Degenne (Aug 15 2025 at 14:34):
None of those files use PMF. They use the Measure type for probability measures. On this point, you can read the blog post https://leanprover-community.github.io/blog/posts/basic-probability-in-mathlib/
Kiran (Aug 15 2025 at 14:34):
A general random process is written as a function
X : ι → Ω → 𝓧whereιis a general index type,Ωis a measurable space equipped with a probability measure, and𝓧is measurable space.
Aha, I see I see, yes, that makes sense! I'm glad I asked! I wasn't sure of the right level of generality to pick there.
Rémy Degenne (Aug 15 2025 at 14:40):
Azuma-Hoeffding is usually written for conditionally bounded random variables. We have Hoeffding's lemma (ProbabilityTheory.hasSubgaussianMGF_of_mem_Icc) to show that a bounded random variable is sub-Gaussian. We might need to add a conditional version of that to get Azuma-Hoeffding for conditionally bounded from our current result for sub-Gaussian.
Kiran (Aug 15 2025 at 14:45):
mhhm I see, I see. I think I understand. For context, I've mostly seen these bounds in the context of analysis of randomised algorithms (and it's been a while so apologies if my definitions are a little rusty right now), but the form that I've seen them is as follows (from Mitzenmacher & Upfal):
Screenshot 2025-08-15 at 9.44.52 AM.png
and
Screenshot 2025-08-15 at 9.45.15 AM.png
Would these be derivable from the definitions in mathlib?
Kiran (Aug 15 2025 at 14:49):
Thank you for the link about the encoding of probability in mathlib. That's pretty helpful. I think I'm realising it might be worthwhile also making a separate thread in the computer science topic about notations and an instantiation of mathlib's probability classes more specialised for analysing discrete computations in something like cslib
Rémy Degenne (Aug 15 2025 at 14:53):
The martingale definition is this: MeasureTheory.Martingale
The theorem you linked is not in Mathlib yet. We have a version of it with the bounded assumption replaced by conditionally sub-Gaussian (ProbabilityTheory.measure_sum_ge_le_of_HasCondSubgaussianMGF) and we have the fact that bounded implies sub-Gaussian (ProbabilityTheory.hasSubgaussianMGF_of_mem_Icc), so it should not be very hard to prove.
Kiran (Aug 15 2025 at 14:56):
Ah, awesome! okay thanks for explaining this all to me! Cool cool, in that case, I'll try playing around with it and getting the theorem in the form that I'm more familiar with by composing the theorems you mentioned!
Last updated: Dec 20 2025 at 21:32 UTC