Zulip Chat Archive
Stream: PrimeNumberTheorem+
Topic: Merging with Morph
Alex Kontorovich (Sep 12 2025 at 13:32):
Given that the Morph folks have finished up the Strong PNT (woohoo!), maybe someone would like to volunteer to merge it into our project? As far as I can tell, we're just missing Borel-Caratheodory, so if we simply take their formalization of that (which had been claimed here by @Shuhao Song; sorry you got scooped...), then we should be able to have everything in one place; I'm guessing this will require only on the order of 1K lines (of their 22K). Anyone want to claim the job? Thanks!
Johan Commelin (Sep 12 2025 at 13:35):
How is Morph's output licensed? I don't see any mention of that in their repo.
Matthew Ballard (Sep 12 2025 at 13:35):
Ask gauss? Contributing back to the community useable code is going to be quite important for how this progresses.
Alex Kontorovich (Sep 12 2025 at 13:38):
Oh I didn't even think of that... If I see a theorem out in the world, I take it and use it (of course giving credit); I'm not used to how things work with code... The code is available online, so we can't just paste it with a note that it came from Morph?...
Johan Commelin (Sep 12 2025 at 13:39):
In my ideal world: yes.
In the real world: ask a lawyer
I think we should ask Morph to slap an open source license on their code.
Alex Kontorovich (Sep 12 2025 at 13:39):
(Anyway, I'm sure Jared/Jesse/Christian will give us permission, in fact they may already have in my private conversations with them about this weeks ago... But you're right, it'll be good to double check that.)
Alex Kontorovich (Sep 12 2025 at 14:31):
Got confirmation from Jesse - we can upstream anything we find useful.
Alex Kontorovich (Sep 12 2025 at 14:32):
So: any volunteers? :grinning_face_with_smiling_eyes:
Alastair Irving (Sep 13 2025 at 10:52):
This is very impressive. I think we'd need more of their code than you suggest, on top of Borel-Caratheodory we need the expansion of the log derivative as a sum over zeros and then the application to the zeta function. I'd guess that's about half their code and then we'd have to update the PNT proof in a number of places.
I'm not sure I see the value of copying lots of their code into our repo unles we're also committing to significant tidying of it to make it more suitable for Mathlib. Given the number of small lemmas they have in their blueprint that looks like a fairly thankless task.
Terence Tao (Sep 13 2025 at 11:55):
It seems the first step is to create a blueprint for StrongPNT designed for human formalizers (in particular identifying key subclaims such as Borel-Caratheodory (which might be of separate interest to Mathlib or other future projects). We can use the Morph blueprint as inspiration (in particular for any step which is more technical than conceptual, one can try to largely port over the Morph equivalent) but it would not be a one-to-one match.
Perhaps one can view this as a sort of accidental test case of how to import one formalization project into another, when the projects have been written with very different styles, to the extent that direct importation is not desirable for the purposes of anything other than the very narrow goal of getting some formal proof of the target claim. The process of doing so may uncover some of the unstated goals of our project - while ostensibly we only declared the explicit goal of achieving a formalization of various version of the PNT (and variants), I think we had some implicit goals about the structure of that formalization and how it was to be achieved that we did not articulate. But now we have an opportunity to do so.
Maksym Radziwill (Sep 23 2025 at 01:17):
Alex Kontorovich said:
(Anyway, I'm sure Jared/Jesse/Christian will give us permission, in fact they may already have in my private conversations with them about this weeks ago... But you're right, it'll be good to double check that.)
As far as I can tell they merged big chunks of our MediumPNT file into their StrongPNT file with minor changes (e.g I1Bound, I3Bound and various other lemmas).
On that note, since Lean has proof irrelevance for Prop what is the purpose of tidying the code (beyond replacing things like simp_all by simp only, etc to improve stability)?
Lawrence Wu (llllvvuu) (Sep 23 2025 at 01:55):
maintainability, compile time (likely a few OOMs of overhead here), development of theory/automation that benefits the broader project of formalizing mathematics
Terence Tao (Sep 23 2025 at 02:39):
I can imagine that if we want to generalize the PNT theorems to, say, the Chebotarev density theorem, that having PNT proofs that has been either written by humans, or refactored by humans from AI-generated proofs, will be a more useful starting point than pure AI-generated code, at least at the higher levels of the proof. (For technical lemmas that are "just" routine computations and applications of standard tools, though, I am happy to outsource to whatever SAT solver, hammer, smart tactic, or AI tool will do the job.)
Maksym Radziwill (Sep 23 2025 at 02:55):
Maybe to clarify my question, I definitely see the benefit of having code broken down into many small propositions (small in the sense of accomplishing a small isolated function), each with a well designed type signature, meaningful data structures etc. (Just as is the case for a well-written math paper).
My question is really about the proofs of these propositions. There is a clear preference among contributors of mathlib for shorter proofs, e.g "code golfing". I don't quite understand why this matters (besides the compile time and stability concerns) if the architecture of the project is well laid out? Perhaps the ability to consistently write short proof is the marker of successful design choices?
In any case I don't want to be polemic about this so I will refrain from pushing the question further...
Kevin Buzzard (Sep 23 2025 at 06:23):
If all you want is a compiling proof so you can say "I did this" then stability concerns are irrelevant. But if you want to actually create a usable artifact then stability is a major issue. Lean makes no attempt to be backwards compatible and if you follow the mathlib nightly testing branches you'll see that a large amount of time is being spent, daily, just ensuring that mathlib continues to compile with the daily changes to lean on the lean nightly testing branch. Code which is not optimal and continually breaks can cause big delays in this process, which is why the maintainers are so motivated to get it right the first time with their stringent review requirements etc. Here's a question -- does the morph work satisfy the mathlib linters? Most of those linters are there to make the maintainers' job easier. If it's not linting then that's already a sign that it's not maintainable.
Ruben Van de Velde (Sep 23 2025 at 06:50):
Though the claim that mathlib insists on golfing proofs to their absolute shortest form is somewhat overstated. It's more of a combination of
(a) independent parts of a proof should generally become a lemma of their own (serves as an abstraction layer, helps with compilation speed, helps with understanding the structure and interdependencies of the proof, allows the partial result to be reused, ...), which leads to quite some proofs that end up being little more than a straightforward combination of the constituent lemmas and
(b) steps that are mathematically trivial shouldn't take much code
Ruben Van de Velde (Sep 23 2025 at 06:51):
And (c) you shouldn't keep proving the same result over and over again
Alastair Irving (Sep 23 2025 at 17:56):
Kevin Buzzard said:
does the morph work satisfy the mathlib linters? Most of those linters are there to make the maintainers' job easier. If it's not linting then that's already a sign that it's not maintainable.
No, the linters are mostly disabled. Even the unused variable linter is turned off so there are a number of results with unneeded hypotheses.
Alex Kontorovich (Sep 25 2025 at 22:30):
FYI @Preston T has volunteered to upstream the LaTeX blueprint to PNT+, and from there we'll start new tasks to port the proof, hopefully streamlining it (significantly?) in the process... More soon!
Maksym Radziwill (Oct 02 2025 at 19:05):
Alex Kontorovich said:
FYI Preston T has volunteered to upstream the LaTeX blueprint to PNT+, and from there we'll start new tasks to port the proof, hopefully streamlining it (significantly?) in the process... More soon!
I'm happy to take a stab at Borel-Caratheodory (or if you prefer for me to wait please let me know). I'll pay attention to code quality this time.
Alex Kontorovich (Oct 03 2025 at 15:18):
That would be great, thanks Maks!! Borel-Caratheodory is all yours! I'll try to find time later to whip up a proper "Outstanding Tasks" list...
Maksym Radziwill (Oct 06 2025 at 04:22):
Alex Kontorovich said:
That would be great, thanks Maks!! Borel-Caratheodory is all yours! I'll try to find time later to whip up a proper "Outstanding Tasks" list...
I just sent a pull request with a proof of Borel-Caratheodory following the blueprint. Three comments:
- I created a StrongPNT folder and put the proof of Borel-Caratheodory in StrongPNT/BorelCaratheodory.lean . I imported this in StrongPNT.lean. Proceeding in this way for each major theorem/lemma should help keep the code modular.
- I tried my best to have the proof up to mathlib4 standards but I am sure I am falling short, if somebody could review it and improve it that would be greatly appreciated
- I think it would be good if we could upstream these lemmas to mathlib4 as we go, otherwise we might end up with a huge pile of code that needs to be either reviewed or streamlined, and then nobody wants to do it...
I'm happy to take a stab at Lemma 99 DerivativeBound, if that's OK with everybody...
Ruben Van de Velde (Oct 06 2025 at 13:48):
Are you volunteering to help move code to mathlib? Because that would be great :)
Ruben Van de Velde (Oct 06 2025 at 13:49):
I'm also working on enabling more mathlib linters, to make sure we catch issues earlier
Kevin Buzzard (Oct 06 2025 at 13:54):
I think this is an important question; what to do with these thousands of lines of unmaintainable code which the AI wrote. math.inc are now in discussion with a number of groups, including groups working on things which I would like to see in mathlib, and I'm genuinely scared that if these systems start generating low-quality Lean code proving useful theorems and then writing the blog post and then moving on without any interest in getting the code tidied up and into mathlib then we are going to start accumulating technical debt at a slightly alarming rate.
Maksym Radziwill (Oct 06 2025 at 15:49):
Ruben Van de Velde said:
Are you volunteering to help move code to mathlib? Because that would be great :)
Yes, absolutely, my intent is to first upstream the code that I'll write for StrongPNT (and then circle back to it once it is up-streamed). With the new blueprint by Alex and Preston there might be hope in cleaning up the previous code from MediumPNT... I feel responsible for polluting the earlier version with some "write-only" code...
Ruben Van de Velde said:
I'm also working on enabling more mathlib linters, to make sure we catch issues earlier
That would be great! Hopefully at some point passing all the linters will be a reasonable indication that the code is in a good enough shape for inclusion to mathlib4...
Ruben Van de Velde (Oct 06 2025 at 15:54):
I'm afraid the linters will only give a necessary but not a sufficient condition :)
Alex Kontorovich (Oct 10 2025 at 14:01):
Thanks Maks!! I went through and gave it a more "proper" review than I usually would, with the idea that we'll get this code close to right this time, and try to port it to Mathlib as we go. Please let me know if any comments don't make sense to you. And yes, DerivativeBound is all yours as well, thanks!!
Maksym Radziwill (Oct 10 2025 at 20:59):
Alex Kontorovich said:
Thanks Maks!! I went through and gave it a more "proper" review than I usually would, with the idea that we'll get this code close to right this time, and try to port it to Mathlib as we go. Please let me know if any comments don't make sense to you. And yes, DerivativeBound is all yours as well, thanks!!
Thanks @Alex Kontorovich and @Alastair Irving , I really appreciate your comments and the time you took to look over the code!
Maksym Radziwill (Oct 10 2025 at 23:14):
I posted a PR to mathlib4, https://github.com/leanprover-community/mathlib4/pull/30424 once I receive and address comments that arise there I'll push an updated version back into PrimeNumberTheoremAnd+ .
Harald Helfgott (Oct 11 2025 at 17:53):
Just out of curiosity - what is the hard part in formalizing Borel-Carathéodory?
Maksym Radziwill (Oct 12 2025 at 21:01):
Alex Kontorovich said:
And yes, DerivativeBound is all yours as well, thanks!!
I'll also claim indiscreetly BorelCaratheodoryDeriv since it is tightly coupled with Lemma 99.
Maksym Radziwill (Oct 13 2025 at 04:28):
I finished BorelCaratheodoryDeriv with a better constant
2 (R + r) M / (R - r)^2 <= 4 R M / (R - r)^2 . This is better by at least a factor of 4 R / (R - r) compared to the current 16 M R^2 / (R - r)^3 = (4 R / (R - r)) * (4 M R / (R - r)^2) in the blueprint.
The code is currently in my local repo https://github.com/maksym-radziwill/PrimeNumberTheoremAnd/blob/main/PrimeNumberTheoremAnd/StrongPNT/DerivativeBound.lean , I'll tidy it a little bit before pushing to Alex's upstream.
I'd be happy to update the blueprint with the new bound/proof and at all places where it propagates if that's OK...
Alex Kontorovich (Oct 14 2025 at 00:04):
That'd be great, thanks Maks!
Last updated: Dec 20 2025 at 21:32 UTC