Zulip Chat Archive

Stream: mathlib4

Topic: On Potentially Formalizing Donsker Varadhan

Anthony Wang (Sep 29 2025 at 05:06):

Hi,

I'm interested in formalizing the Donsker Varadhan variational formula in mathlib. A couple questions:

Is this a good fit for mathlib, or would this be better for a project like RemyDegenne/testing-lower-bounds? Would we rather prove a more general statement about f-divergences? (In general, what is the philosophy for answering these questions?)
To formalize the statement, we need to be able to talk about EReal valued integrals where the integrand may not be $L^1$ (i.e. via $\int f d\mu =\int f^+ d\mu -\int f^- d\mu$). Does this exist in mathlib? I did a quick check, and I didn't find it, but I may be wrong. Is there a recommended way to check if something exists already? And if it doesn't exist, does it make sense to add it?

Thanks!

Rémy Degenne (Sep 29 2025 at 07:28):

I formalized the Donsker-Varadhan formula a while ago in a branch (https://github.com/leanprover-community/mathlib4/compare/master...RD_kl , in the KullbackLeibler file) but my code is out of date for several reasons:

it's not for the definition of KL that is now in Mathlib but for a previous attempt at that definition (it was written before the project you linked to)
it uses tricks to go around the annoying fact that Real.exp (Real.log 0) is 1 and not 0. This was before we introduced the log and exp with values in ENNReal/EReal, which don't have that issue.
it's on an old version of Mathlib and probably needs many changes to adapt to latest versions.

The TestingLowerBounds project has the same issue of mismatch of definitions: we wrote thousands of lines of code in that project, and then I became convinced that our definition of divergences was bad because of an unfortunate choice of types (Real vs ENNReal vs EReal for domain and codomain of functions, Lebesgue integral vs Bochner integral, etc). So the KL definition in Mathlib is not the one on the master branch of that project.
The project is now inactive: it's still useful because we can copy and adapt some proofs from there when adding facts about the right definitions to Mathlib, but nobody is doing any development over there anymore. You should contribute to Mathlib directly.

Anthony Wang (Oct 18 2025 at 21:23):

Hi! Thanks for the reply and sorry for the delay.

It seems that I have also fallen for this trap of writing hundreds of lines of code before realizing that dealing with types is quite tricky. In particular, currently the definition of llr is log (μ.rnDeriv ν x).toReal. This seems to run into the same Real.exp (Real.log 0) = 1 issue, e.g. we have something like ∫⁻ x, exp (llr μ ν x) ∂ν = 1 + ν { x | μ.rnDeriv ν x = 0 } instead of the expected 1 (fwiw your workaround log ((μ.rnDeriv ν x).toReal + u) for u → 0 is equivalent to my workaround llr μ ν x - { x | μ.rnDeriv ν x = 0 }.indicator fun _ ↦ log ε for ε → 0). On the other hand, if we make it EReal valued, EReal is no longer a group so we cannot use the Bochner integral on it.

In addition, the Bochner integral seems insufficient for expressing some of the nonfinite guarantees of the version of Donsker-Varadhan I am familiar with (as I alluded to in the original post). For example, we should have something like "if the KL-divergence is finite, then ∫⁻ x, ENNReal.ofReal (f x) ∂μ is finite too" for any f such that Measurable f ∧ Integrable (exp ∘ f) ν.

I'm thinking of replacing the log in llr with ENNReal.log and using noncomputable def integral_ereal (μ : Measure α) (f : α → EReal) : EReal := ∫⁻ x, (f x).toENNReal ∂μ - ∫⁻ x, (-f x).toENNReal ∂μ everywhere. What do you think? Alternatively, we can only do the former and then try to break up all the nonfinite guarantees into separate theorems.

You can find my very messy diff from main here https://github.com/leanprover-community/mathlib4/compare/master...cosmicgenius:mathlib4:donsker-varadhan. Thanks again for your response.

Last updated: Feb 28 2026 at 14:05 UTC