Zulip Chat Archive

Stream: PrimeNumberTheorem+

Topic: Let us formalize a lemma

Harald Helfgott (Jan 08 2026 at 22:47):

I'm trying to get a handle of the entire process: (human paper) -> blueprint -> (formal proof). I shall give some examples from a short paper I'm finishing. I am open to suggestions on how to write this and future papers so as to make the first step in the process easier.

Let me start with the first lemma in the paper. I will also open a thread about a short, self-contained appendix giving a proof of an explicit version of a standard estimate on $\zeta(s)$.

Harald Helfgott (Jan 08 2026 at 22:47):

In what follows, $Q(x)$ is the number of square-free integers $$1\leq n\leq x$, $R(x) = Q(x) - x/\zeta(2)$ , and $M(x) = \sum_{n\leq x} \mu(n)$ . (This is standard notation.) I do want $x$ to be an arbitrary non-negative real number - when it comes to using the results, assuming that $x$ is an integer gets in the way.

Harald Helfgott (Jan 08 2026 at 22:48):

Here are the Lemma and its proof, as they are written in my draft. (Hopefully LaTeX source is OK -- is there an automatic way to convert it?)

\begin{lemma}\label{lem:difintsq}
For any $x>0$,
\begin{equation}\label{eq:antenor}
R(x) = \sum_{k\leq x} M\left(\sqrt{\frac{x}{k}}\right) - \int_0^x M\left(\sqrt{\frac{x}{u}}\right) du.\end{equation}
Moreover, for any integer $K\geq 0$,
\begin{equation}\label{eq:singdot}\begin{aligned}
R(x) &= \sum_{k\leq K} M\left(\sqrt{\frac{x}{k}}\right)  -
\int_0^{K+\frac{1}{2}} M\left(\sqrt{\frac{x}{u}}\right) du\\
&-\sum_{K<k\leq x+1} \int_{k-\frac{1}{2}}^{k+\frac{1}{2}} \left(M\left(\sqrt{\frac{x}{u}}\right) -
M\left(\sqrt{\frac{x}{k}}\right)\right) du
% - 2 x \sum_{K<k\leq x} \int_{\sqrt{\frac{x}{k+1/2}}}^{\sqrt{\frac{x}{k-1/2}}} \frac{M(t)- M\left(\sqrt{\frac{x}{k}}\right)}{t^3} dt
\end{aligned}
\end{equation}
\end{lemma}
Since our sums start from $1$, the sum $\sum_{k\leq K}$ is empty for $K=0$.
\begin{proof}
The first equality is immediate from
$$Q(x) = \sum_{n\leq x} \sum_{d: d^2|n} \mu(d) = \sum_{k, d: k d^2\leq x} \mu(d) = \sum_{k\leq x} M\left(\sqrt{\frac{x}{k}}\right)$$
and (by exchanging the order of $\sum$ and $\int$, as is justified by
$\sum_n |\mu(n)|\int_0^{x/n^2} du \leq \sum_n x/n^2 < \infty$)
$$\int_0^x M\left(\sqrt{\frac{x}{u}}\right) du = \int_0^x \sum_{n\leq \sqrt{\frac{x}{u}}} \mu(n) du
=\sum_n \mu(n) \int_0^{\frac{x}{n^2}} du = x \sum_n \frac{\mu(n)}{n^2} = \frac{x}{\zeta(2)}.$$
In the second equality, it is clear that the terms of the form $M(\sqrt{x/k})$ can all be
grouped as $\sum_{k\leq x+1} M(\sqrt{x/k})$; the term $M(\sqrt{x/k})$ for $k = \lfloor x+1\rfloor > x$ equals $0$. For $f(u) = M(\sqrt{x/u})$,
\[\sum_{K<k\leq x+1} \int_{k-\frac{1}{2}}^{k+\frac{1}{2}} f(u) du = \int_{K+\frac{1}{2}}^{\lfloor x\rfloor + \frac{3}{2}} f(u) du
= \int_{K+\frac{1}{2}}^x f(u) du,\]
since $f(u) = M(\sqrt{x/u}) = 0$ for $x>u$. Here we are assuming that
$K\leq x$. If $K>x$, the second line of \eqref{eq:singdot}
is empty, and the first one equals \eqref{eq:antenor}, by
$M(t)=0$ for $t<1$, so \eqref{eq:singdot} holds.
\end{proof}

Terence Tao (Jan 08 2026 at 22:56):

I think Zulip can only do rudimentary LaTeX conversions of text within double dollar signs. You can put LaTeX code inside a code block (which is the button at the bottom of the input text bar with the < > signs) for slightly increased readability. I can go ahead and create a test file to hold this lemma and create an initial blueprint out of it (modulo the issue that the blueprint is currently not compiling); let me do that now.

Harald Helfgott (Jan 08 2026 at 22:57):

Thanks. But how do I create labels so that I can refer to equations? Or should I just put LaTeX code inside a code block, as I have just done?

Terence Tao (Jan 08 2026 at 23:04):

I think this is the best one can do within Zulip (one could link to a separate Overleaf file or PDF if one wanted to make it readable). I'm in the process of converting it into the LeanArchitect blueprint format, it should be human readable then.

Harald Helfgott (Jan 08 2026 at 23:11):

OK. If this format is socially acceptable, then I'll open the other thread and paste the appendix there now. Should I remove the explanatory passages (including references to previous work)? There's just one lemma and one proposition, so I don't need to reorder anything.

Terence Tao (Jan 08 2026 at 23:22):

One may as well remove any unnecessary text. If the appendix already appears say on the arXiv then you can also post a link to the appropriate PDF or whatever so that we can see a human-readable version, but it is the LaTeX code I will need to convert to a blueprint.

I just created a blueprint from the lemma at PNT#490; it has to pass through CI and the blueprint has to somehow build, but then it should be visible within the project.

Harald Helfgott (Jan 08 2026 at 23:24):

Understood. OTOH I'm a bit reluctant to remove text where I (a) explain why to care about the result, (b) give credit to other people. Let me leave that in this time, since this is an appendix that hasn't appeared in the arXiv yet, and let me remove similar passages when I post things that have already appeared in the arXiv.

Terence Tao (Jan 09 2026 at 06:49):

Both the blueprint for the lemma and the appendix are now up: https://alexkontorovich.github.io/PrimeNumberTheoremAnd/blueprint/sect0007.html#a0000000031 https://alexkontorovich.github.io/PrimeNumberTheoremAnd/blueprint/sect0007.html#a0000000032 . I broke up the lemma into three pieces that look relatively tractable, so one could soon add those to the list of outstanding tasks. As for the appendix, the proofs given are quite lengthy; they should probably be broken up into more manageable pieces, roughly speaking each displayed equation in the proof might end up being its own blueprint lemma. I won't be able to devote time to doing so in the next day or two, but if perhaps you want to try breaking up things into such pieces, be my guest.

Harald Helfgott (Jan 09 2026 at 10:45):

Thanks for this! I'll set to work.

One thing that I think is necessary, for the sake of legibility, is greater structure.

At a macro level: already the category of "primary estimates" seems to be asking for a division into (i) estimates on $\zeta$ , (ii) estimates on some basic sums ( $\psi(x)$ , $M(x)$ , perhaps $Q(x)$ ) that often use those estimates on $\zeta$ . I would put estimates on $\zeta(s)$ at an even more foundational level.

The categorization will never be completely unambiguous (are estimates on $Q(x)$ primary or secondary? arguably the former: (a) the derivation is arguably not routine, at least not now that there are several ways to do it, (b) estimates on $Q(x)$ based on bounds on $M(x)$ get fed back to the procedure to improve bounds on $M(x)$ ) but then it doesn't have to be -- it just has to be useful.

At a micro level: on the one hand, we are being asked to break up lemmas much more than most of us would in a paper. On the other hand, the resulting structure is completely flat and linear. If some of us tend to write lemmas with longer proofs in civilian life, it is precisely because the structure risks being obscured rather than clarified if lemmas are broken down too much.

Instead of labeling everything as a "theorem", we could have more names for results at different scales. "Sublemma" is an obvious choice - that would arguably be best if we want something that inherits the assumptions of a lemma, as "sub" suggests "substructure". An independent micro-lemma could be a λημμάτιον (ChatGPT suggests the more whimsical "λημμάριον"). In real life, we would use "Fact." for a really small sublemma within a Lemma.

All of this feels to me as self-evident as structured programming, or more so, since a blueprint is meant to be read only by humans, and so there is no analogue of the pain of computer-side implementation.

Terence Tao (Jan 10 2026 at 03:32):

OK, I have refactored the blueprint to move several results to a new "Zeta function estimate" chapter. The Mobius lemma is now split up into five sublemmas which are suitable for formalization separately: PNT#526 PNT#527 PNT#528 PNT#529 PNT#530. So they are now in the queue!

You can see the latest formalization of the Mobius lemma at https://alexkontorovich.github.io/PrimeNumberTheoremAnd/blueprint/primary-chapter.html#a0000000031 (it may take about 10 mins from this post for the final subdivision to show up, due to the time it takes to build the blueprint) . Hopefully this gives an idea of what level of granularity would be useful for formalizing the appendix.

Harald Helfgott (Jan 10 2026 at 08:10):

Thanks, Terry! I'm now working on the appendix. I'll first break things up as far as one could do in a human-readable text without making it unreadable or artificial - then I'll the .tex/.pdf here (purpleprint?) and you guys can tell me whether it needs to be broken down further. Sounds like a good idea?

Terence Tao (Jan 28 2026 at 16:31):

As of today it looks like all the components of the Mobius lemma are now formalized! Thanks to all contributors to getting it over the finish line.

Harald Helfgott (Jan 28 2026 at 16:37):

Wait, I thought this was done a while back. Was it the final bit that was missing up to now?

Harald Helfgott (Jan 28 2026 at 16:40):

It seems everything else was done by Aristotle two weeks ago. I guess the final bit was something that looks easy to humans but Aristotle got stuck on for some mysterious reason? (Wouldn't be the first time - I am trying to get a good sense of when that happens.) Or did nobody give it to Aristotle?

Steven Rossi (Jan 28 2026 at 17:37):

I gave it to Aristotle and it either timed out or made some very verbose proofs, but it helped for things like integrability for subsections. I also had this proven like a week ago, but rewrote it as a practice for writing more idiomatic proofs before submitting.

Harald Helfgott (Jan 28 2026 at 19:28):

I'm still trying to discern in what kind of situations Aristotle is likely to lose its way. Sometimes it works after firm guidance (option 3 in the menu).

Steven Rossi (Jan 28 2026 at 20:19):

I probably didn't give it enough context, usually when using the (3) option, I just paste the goals, and whatever definitions or lemmas it needs, if I gave the proof outline it might have been cleaner. I would blame my workflow more than the model for this one.

Last updated: Feb 28 2026 at 14:05 UTC