Zulip Chat Archive

Stream: PrimeNumberTheorem+

Topic: A small cautionary tale with autoformalization

Terence Tao (Jan 10 2026 at 20:20):

PNT#510 asked to formalize a certain inequality for the LCM subproject. It was very quickly resolved by AlphaProof (see PNT#520). However, as pointed out in PNT#533, the inequality was misformalized, because natural number division was used in several places where real division was intended instead. In particular, the term n^(3/2) was in fact interpreted as n^1 because of this, which was not at all what was intended. (This is likely my fault, or more precisely the fault is due to me not reviewing a Github Copilot formalization of this statement based on the informal LaTeX version with sufficient care, as the formalization typechecked correctly and "looked" right.) So I have approved PNT#533 which repairs the statement (and removes the previous proof) and opened a new issue PNT#534 to establish the actually intended result.

This is only a minor speedbump, but it does show that even with the correctness and typechecking guarantees of Lean, one has to be a little careful when blindly applying AI tools to autoformalize sublemmas as - in contrast to human formalizers - such tools may not "notice" that the sublemma was suspiciously easy to prove, or stated in a "weird" way. On the other hand, this issue would have eventually been picked up at some other stage of the project when the output of this misformalized lemma was to be applied elsewhere, so this would be an ultimately self-correcting issue so long as the final theorems to be proven had their statements formalized correctly. Still, this is a useful data point to keep in mind as we continue to use automated tools to significantly accelerate the formalization process: they are certainly a net positive as far as shortening the total time needed to fully formalize a proof, but can still occasionally generate local delays.

Last updated: Feb 28 2026 at 14:05 UTC