Zulip Chat Archive

Since the project is now building, I've just made the following experiment: modify the parameters of the construction (notably the value of D, which is $2^{100 a^2}$ in the blueprint, and which is the main player in all the constants we get) and see at what point the argument breaks down. It turns out that, without modifying anything to the logic of the argument, the argument works for D as low as $2^{7 a^2}$ (but not for 6!). The constant in the main theorem in the blueprint is $2 ^ {472 a^3}$ , but with the new value of $D$ one gets $2 ^ {46 a^3}$ -- an improvement by an order of magnitude. And all this following the usual crude strategy that all constants in the lemmas are of the form $2^{n a^3}$ for some integer $n$ that increases slowly along the course of the argument.

Patrick Massot (Jul 15 2025 at 09:49):

Sébastien, you really got addicted to this constant game!

Sébastien Gouëzel (Jul 15 2025 at 09:51):

When you have a tool, you want to use it for what it can do well. Formalization is very efficient for this kind of things!

Patrick Massot (Jul 15 2025 at 09:52):

Sure, this is really a great use of this tool!

Sébastien Gouëzel (Jul 15 2025 at 09:56):

Also, since I have to play a little bit with all the files, it gives me the impression to master all the proof. Which is definitely wrong, by the way: I have a good understanding of the big picture strategy, of the small scale details I have formalized, but I lack the middle scale picture in some parts of the proof! (This is pretty unusual and a little bit frustrating, by the way. And the blueprint is not really helping in this respect, since the goal is to focus on the tiny details, not to give overviews)

Floris van Doorn (Jul 15 2025 at 12:37):

Nice!
I don't think it's worth changing the blueprint with these optimized constants, since I think it's much more important that the constants in the blueprints are consistent with each other than finding optimal ones. But it's nice (and not surprising) that such questions can be answered quite easily with a Lean proof.

Do you want to push this optimization to a branch?

Sébastien Gouëzel (Jul 15 2025 at 13:39):

I've pushed the optimization to https://github.com/fpvandoorn/carleson/pull/458.

I agree we should definitely keep the blueprint in its current form. Optimization of constants is rather a nice byproduct of the formalization, that couldn't have been done reasonably to start with.

Georges Gonthier (Jul 24 2025 at 17:44):

From earlier notes, I believe one can take D as low as $2^{a(2a+3)}$ , or $2^{3a^2}$ if making use of the $a \geq 4$ assumption. The trick is to adjust the size of the balls used in the grid and tiling construction to get tighter bounds in the squeezing properties, as well as tweaking Chapter 5 constants so as to accomodate a much weaker Lemma 2.1.2:

In Section 4.1, using balls of radius $\frac{93}{256}D^k$ instead of $D^k$ strengthens (2.0.10) to

$c(I) \in B(c(I), \frac{87}{256}D^k) \subset I \subset B(c(I), \frac{3}{4}D^k)$

(Note that $\frac{1}{4}<\frac{5}{16}<\frac{87}{256}$ .)

Changing 100 to 7 in (5.1.8), (5.1.4), Lemma 5.4.1, and 10 to 6 in (5.3.4), as well as the conclusion of (5.3.5) to $6\frak{p} \lesssim 29\frak{q}$ and $4\frak{p} \lesssim 15\frak{q}$ , makes it possible to make Chapter 4 work with the last statement of Lemma 2.1.2 weakened to

$d_{I^\circ}(\vartheta, \theta) \leq \frac{1}{8}d_{J^\circ}(\vartheta, \theta)$

The latter can be proved with $D=2^{a(2a + 3)}$ , and the stronger version of (2.0.10).
Similarly by using balls of radius $\frac{7}{16}$ , (2.0.15) can be improved to

$Q(\frak{p}) \in B_{\frak{p}}(Q(\frak{p}), \frac{5}{16}) \subset \Omega(\frak{p}) \subset B_{\frak{p}}(Q(\frak{p}), 1)$

These adjustments improve various bounds in the rest of the proof, e.g., the $2^{9a}$ in Lemma 5.2.8 can be reduced to $2^{5a}$ .

It's also possible to reduce the constant factor introduced by Lemma 5.1.2, which is dominant in the final bound, by adjusting most of the Chapter 5 constants involved in the definition of the exceptional sets - though you may not want to go that far.

Cheers, and congrats on seeing the whole project through!

Georges Gonthier (Jul 25 2025 at 08:18):

I should perhaps clarify that the new ball sizes for the grid decomposition are tailored to work for any $a>0$ (that is, $a\geq 1$ as $a$ is an integer), as this ensures $$D=2^{a{2a+3)}\geq 32$$. If one assumes $a\geq 4$ then one can get the lower bound radius in (2.0.10) much closer to $\frac{3}{8}D^k$ .

This is because I've also pondered whether the $a\geq 4$ assumption is really useful. Apart from reducing the size of the constants (in a fairly limited way, given the dominance of the $D=2^{100a^2}$ contributions), $a\geq 4$ is only used two inconspicuous places in the metric Carleson proof, and to derive classical Carleson.

In the metric proof the assumption is used in Lemmas 4.1.9 and 7.4.4; oddly, neither of these uses are flagged in the text, in part because the use in 7.4.4 was masked by an error error in the statement of 7.4.6. In both cases only $a \geq 2$ is actually neededand even that constraint can be removed by adjusting the definition of $\kappa$ and $Z$ .

In the classical Carleson proof $a\geq 4$ is used to show that the Hilbert kernel satisfies the Lipschitz constant $2^{a^3}$ used in (1.0.14), (1.0.15), (1.0.18), (1.0.23), (10.0.1), and (10.0.3). With a tighter constants management in Chapter 11 $a \geq 2$ would again suffice: the dominant constraint for (10.0.3) is $86\sqrt{3} < 150 < 2^{a^3}$ . However the choice of the $2^{a^3}$ Lipschitz constant is completely arbitrary, as it can be absorbed by rescaling the kernel $K$ as is done in Chapter 10, so why not set it to $1$ instead to simplify the main proof, and scale the Hilbert kernel by $\frac{1}{150}$ (possibly through a scaling lemma)? Finally, note that the constant in (1.0.18) is only used for the non-dominant term (7.2.3) in the proof of Lemma 7.2.2, and for the non-dominant antichain term (5.1.33) in the proof of Proposition 2.0.2; thus it could be set to a larger value (at least $D^a2^ {5a}$ according to my notes!), which could absorb the increase in Chapter 10 (I haven't checked the details, though).

Also, I spotted two other unnecessary assumptions in the definition of compatible functions:

The $\vartheta(\circ)=0$ assumption is not used in the proof (is it necessary to derive classical formulations?).
Assuming that (1.0.8) and (1.0.10) hold for non-concentric balls only improves marginally the constants of the theorem, but is not coherent with the assumption (1.0.5) for doubling metrics.

Jeremy Tan (Jul 25 2025 at 23:46):

Georges Gonthier said:

The $\vartheta(\circ)=0$ assumption is not used in the proof (is it necessary to derive classical formulations?).

I did use this assumption to prove a continuity property in Lemma 3.0.1

Georges Gonthier (Jul 28 2025 at 07:43):

Jeremy Tan said:

Georges Gonthier said:

The $\vartheta(\circ)=0$ assumption is not used in the proof (is it necessary to derive classical formulations?).

I did use this assumption to prove a continuity property in Lemma 3.0.1

Point taken, but wouldn't the proof of Lemma 3.0.1 work without this assumption (and be in fact a little simpler) if the statement of the lemma is changed to use

$G(x,\vartheta,R_1,R_2) := \int_{R_1<\rho(x,y)<R_2}K(x,y)f(y)e(\vartheta(y)-\vartheta(x))d\mu(y)$

instead? As we would still have

$|G(x,\vartheta,R_1,R_2) | = \left|\int_{R_1<\rho(x,y)<R_2}K(x,y)f(y)e(\vartheta(y))d\mu(y)\right|$

this wouldn't change the proof of Theorem 1.0.2 below.

Last updated: Dec 20 2025 at 21:32 UTC

leanprover-community / mathlib