Zulip Chat Archive

Stream: Carleson

Topic: Constants


Sébastien Gouëzel (Jul 14 2025 at 20:17):

Since the project is now building, I've just made the following experiment: modify the parameters of the construction (notably the value of D, which is 2100a22^{100 a^2} in the blueprint, and which is the main player in all the constants we get) and see at what point the argument breaks down. It turns out that, without modifying anything to the logic of the argument, the argument works for D as low as 27a22^{7 a^2} (but not for 6!). The constant in the main theorem in the blueprint is 2472a32 ^ {472 a^3}, but with the new value of DD one gets 246a32 ^ {46 a^3} -- an improvement by an order of magnitude. And all this following the usual crude strategy that all constants in the lemmas are of the form 2na32^{n a^3} for some integer nn that increases slowly along the course of the argument.

Patrick Massot (Jul 15 2025 at 09:49):

Sébastien, you really got addicted to this constant game!

Sébastien Gouëzel (Jul 15 2025 at 09:51):

When you have a tool, you want to use it for what it can do well. Formalization is very efficient for this kind of things!

Patrick Massot (Jul 15 2025 at 09:52):

Sure, this is really a great use of this tool!

Sébastien Gouëzel (Jul 15 2025 at 09:56):

Also, since I have to play a little bit with all the files, it gives me the impression to master all the proof. Which is definitely wrong, by the way: I have a good understanding of the big picture strategy, of the small scale details I have formalized, but I lack the middle scale picture in some parts of the proof! (This is pretty unusual and a little bit frustrating, by the way. And the blueprint is not really helping in this respect, since the goal is to focus on the tiny details, not to give overviews)

Floris van Doorn (Jul 15 2025 at 12:37):

Nice!
I don't think it's worth changing the blueprint with these optimized constants, since I think it's much more important that the constants in the blueprints are consistent with each other than finding optimal ones. But it's nice (and not surprising) that such questions can be answered quite easily with a Lean proof.

Do you want to push this optimization to a branch?

Sébastien Gouëzel (Jul 15 2025 at 13:39):

I've pushed the optimization to https://github.com/fpvandoorn/carleson/pull/458.

I agree we should definitely keep the blueprint in its current form. Optimization of constants is rather a nice byproduct of the formalization, that couldn't have been done reasonably to start with.

Georges Gonthier (Jul 24 2025 at 17:44):

From earlier notes, I believe one can take D as low as 2a(2a+3)2^{a(2a+3)}, or 23a22^{3a^2} if making use of the a4a \geq 4 assumption. The trick is to adjust the size of the balls used in the grid and tiling construction to get tighter bounds in the squeezing properties, as well as tweaking Chapter 5 constants so as to accomodate a much weaker Lemma 2.1.2:

  • In Section 4.1, using balls of radius 93256Dk\frac{93}{256}D^k instead of DkD^k strengthens (2.0.10) to

c(I)B(c(I),87256Dk)IB(c(I),34Dk)c(I) \in B(c(I), \frac{87}{256}D^k) \subset I \subset B(c(I), \frac{3}{4}D^k)

(Note that 14<516<87256\frac{1}{4}<\frac{5}{16}<\frac{87}{256}.)

  • Changing 100 to 7 in (5.1.8), (5.1.4), Lemma 5.4.1, and 10 to 6 in (5.3.4), as well as the conclusion of (5.3.5) to 6p29q6\frak{p} \lesssim 29\frak{q} and 4p15q4\frak{p} \lesssim 15\frak{q}, makes it possible to make Chapter 4 work with the last statement of Lemma 2.1.2 weakened to

dI(ϑ,θ)18dJ(ϑ,θ)d_{I^\circ}(\vartheta, \theta) \leq \frac{1}{8}d_{J^\circ}(\vartheta, \theta)

  • The latter can be proved with D=2a(2a+3)D=2^{a(2a + 3)}, and the stronger version of (2.0.10).
  • Similarly by using balls of radius 716\frac{7}{16}, (2.0.15) can be improved to

Q(p)Bp(Q(p),516)Ω(p)Bp(Q(p),1)Q(\frak{p}) \in B_{\frak{p}}(Q(\frak{p}), \frac{5}{16}) \subset \Omega(\frak{p}) \subset B_{\frak{p}}(Q(\frak{p}), 1)

  • These adjustments improve various bounds in the rest of the proof, e.g., the 29a2^{9a} in Lemma 5.2.8 can be reduced to 25a2^{5a}.

It's also possible to reduce the constant factor introduced by Lemma 5.1.2, which is dominant in the final bound, by adjusting most of the Chapter 5 constants involved in the definition of the exceptional sets - though you may not want to go that far.

Cheers, and congrats on seeing the whole project through!

Georges Gonthier (Jul 25 2025 at 08:18):

I should perhaps clarify that the new ball sizes for the grid decomposition are tailored to work for any a>0a>0 (that is, a1a\geq 1 as aa is an integer), as this ensures $$D=2^{a{2a+3)}\geq 32$$. If one assumes a4a\geq 4 then one can get the lower bound radius in (2.0.10) much closer to 38Dk\frac{3}{8}D^k.

This is because I've also pondered whether the a4a\geq 4 assumption is really useful. Apart from reducing the size of the constants (in a fairly limited way, given the dominance of the D=2100a2D=2^{100a^2} contributions), a4a\geq 4 is only used two inconspicuous places in the metric Carleson proof, and to derive classical Carleson.

In the metric proof the assumption is used in Lemmas 4.1.9 and 7.4.4; oddly, neither of these uses are flagged in the text, in part because the use in 7.4.4 was masked by an error error in the statement of 7.4.6. In both cases only a2a \geq 2 is actually neededand even that constraint can be removed by adjusting the definition of κ\kappa and ZZ.

In the classical Carleson proof a4a\geq 4 is used to show that the Hilbert kernel satisfies the Lipschitz constant 2a32^{a^3} used in (1.0.14), (1.0.15), (1.0.18), (1.0.23), (10.0.1), and (10.0.3). With a tighter constants management in Chapter 11 a2a \geq 2 would again suffice: the dominant constraint for (10.0.3) is 863<150<2a386\sqrt{3} < 150 < 2^{a^3}. However the choice of the 2a32^{a^3} Lipschitz constant is completely arbitrary, as it can be absorbed by rescaling the kernel KK as is done in Chapter 10, so why not set it to 11 instead to simplify the main proof, and scale the Hilbert kernel by 1150\frac{1}{150} (possibly through a scaling lemma)? Finally, note that the constant in (1.0.18) is only used for the non-dominant term (7.2.3) in the proof of Lemma 7.2.2, and for the non-dominant antichain term (5.1.33) in the proof of Proposition 2.0.2; thus it could be set to a larger value (at least Da25aD^a2^ {5a} according to my notes!), which could absorb the increase in Chapter 10 (I haven't checked the details, though).

Also, I spotted two other unnecessary assumptions in the definition of compatible functions:

  • The ϑ()=0\vartheta(\circ)=0 assumption is not used in the proof (is it necessary to derive classical formulations?).
  • Assuming that (1.0.8) and (1.0.10) hold for non-concentric balls only improves marginally the constants of the theorem, but is not coherent with the assumption (1.0.5) for doubling metrics.

Jeremy Tan (Jul 25 2025 at 23:46):

Georges Gonthier said:

  • The ϑ()=0\vartheta(\circ)=0 assumption is not used in the proof (is it necessary to derive classical formulations?).

I did use this assumption to prove a continuity property in Lemma 3.0.1

Georges Gonthier (Jul 28 2025 at 07:43):

Jeremy Tan said:

Georges Gonthier said:

  • The ϑ()=0\vartheta(\circ)=0 assumption is not used in the proof (is it necessary to derive classical formulations?).

I did use this assumption to prove a continuity property in Lemma 3.0.1

Point taken, but wouldn't the proof of Lemma 3.0.1 work without this assumption (and be in fact a little simpler) if the statement of the lemma is changed to use

G(x,ϑ,R1,R2):=R1<ρ(x,y)<R2K(x,y)f(y)e(ϑ(y)ϑ(x))dμ(y)G(x,\vartheta,R_1,R_2) := \int_{R_1<\rho(x,y)<R_2}K(x,y)f(y)e(\vartheta(y)-\vartheta(x))d\mu(y)

instead? As we would still have

G(x,ϑ,R1,R2)=R1<ρ(x,y)<R2K(x,y)f(y)e(ϑ(y))dμ(y)|G(x,\vartheta,R_1,R_2) | = \left|\int_{R_1<\rho(x,y)<R_2}K(x,y)f(y)e(\vartheta(y))d\mu(y)\right|

this wouldn't change the proof of Theorem 1.0.2 below.


Last updated: Dec 20 2025 at 21:32 UTC