Zulip Chat Archive

Stream: lean4

Topic: Checking the Goldbach conjecture


Kevin Buzzard (Feb 07 2022 at 11:23):

The Goldbach conjecture is that every even number n4n\geq4 is the sum of two primes. It's still an open question. Computationally-inclined number theorists sometimes write computer programs to check it for all even nXn\leq X, where XX grows slowly over time as computers get better. How big can we make XX in Lean?

Here is an example of a paper from 1993 by Sinisalo about numerically verifying the Goldbach conjecture up to 4×10114\times10^{11} (using non-formal techniques, as do all the papers I cite below). It's published in Math Comp, which was at that time the top journal for computational mathematics, so one can imagine that it was state of the art at the time. It doesn't do anything fancy involving zero estimates for the Riemann zeta functions (some other papers in this area do); it just does a brute force variant of Eratosthenes' sieve. The details of the algorithm are in this earlier paper of Granville et al, where if you can HANDLE THE FORTRAN CODE they explain the very simple method (my understanding, at least, is that this is the algorithm Sinisalo used as well). It seems that in 1988 (Granville et al) they could get up to X=2×1010X=2\times10^{10}, in 1993 (Sinisalo) they had got as far as 4×10114\times 10^{11}, and the most recent paper I can find is by Oliviera e Silva, Herzog and Pardi here which push the calculations up to 4×10184\times10^{18} and again they are using very simple algorithms (they also give a history of the computational aspect of the problem at the beginning, and they explain their simple algorithms in the paper, but it's just Eratosthenes + some extra thought to make it computationally more efficient).

Obviously computers are many times faster now than they were in 1993 (at the time of Sinisalo's paper), however formal proofs as opposed to Fortran proofs slow things down again. I was wondering how far one could get in Lean 4, using GMP if necessary (I would also be interested in whether GMP speeds things up or whether for numbers of this size it's of no use). It would be a pretty simple job to knock off a very naive algorithm to start verifying Goldbach, and if anyone gets the bug and wants to start setting formalisation records then these more advanced algorithms which seem to be giving state of the art results are really still extremely elementary in nature as you can see from their description in the linked papers. I have no idea what the record is for formalised computational Goldbach.

Why am I interested? Well, the ternary Goldbach conjecture is the weaker statement that every odd natural >= 7 is the sum of three primes, and a proof was announced a few years ago by Helfgott. Helfgott's proof is in two completely different parts; he uses abstract techniques from analytic number theory to prove the results for odd numbers 8×1031\geq 8\times 10^{31} and then (with Platt -- see here does a brute force computer search to check the cases less than this bound. The proof of the abstract analytic number theory part of the argument is 300 pages long so is unlikely to be formalised any time soon, however the basic techniques used in it by Helfgott (the circle method etc) are used a lot in this area, and are being formalised by mathematicians in this Lean project here, so it led me to idly speculate about how feasible it would be to formalise the various parts of the proof of the ternary conjecture and in particular the computational part. For the computational part Helfgott uses the Oliviera e Silva, Herzog and Pardi result, which reduces the claim that ternary Goldbach is true up to 8×10318\times 10^{31} to the claim that there are no two consecutive primes in this range which are distance 4×1018\geq 4\times 10^{18} apart, and this is what he proves using a (unformalised) computer proof. I would imagine that formalised methods right now can't get close to Helfgott's unformalised bound but it seems to me that if the state of the art is being achieved with these essentially elementary arguments then it might make a good computational project for someone interested in seeing how far Lean 4 can get; certainly no maths library will be needed, for example.

Siddhartha Gadgil (Feb 07 2022 at 11:48):

This is very interesting. I guess a formal proof can simply be an enumeration, with even numbers listed as sums of pairs of primes and a proof of the equality.

Kevin Buzzard (Feb 07 2022 at 11:50):

Exactly. But I have no feeling at all as to the complexity of doing this to a non-trivial bound in a formal setting. I know that they computed the first million digits of pi in Coq although the non-formal people are up to about 10^13 digits by now.

Siddhartha Gadgil (Feb 07 2022 at 12:00):

A very nice feature of lean 4 compiled code is that proofs are eliminated at runtime after checking correctness while compiling. So there is hope the overhead is small.

Johan Commelin (Feb 07 2022 at 12:30):

If Lean 4 uses GMP, then the implementation of GMP (in C, I guess) is part of the TCB, right?

Johan Commelin (Feb 07 2022 at 12:31):

Or is there a verified reimplementation?

Mario Carneiro (Feb 07 2022 at 12:32):

there are plans for a verified (or at least homegrown and not GPL) reimplementation, but indeed that's the current state of things

Reid Barton (Feb 07 2022 at 12:35):

410184 \cdot 10^{18} is around 264/4.52^{64}/4.5; so I don't think GMP will be relevant to that part (and maybe this is why they stopped at that particular value).

Johan Commelin (Feb 07 2022 at 12:35):

Aha, because if you off-load the bignum computations and the primality checks to a non-verified GMP, then there is little value in doing the project in Lean, methinks.

Mario Carneiro (Feb 07 2022 at 12:36):

the part of GMP in the lean TCB only includes + - * / % on nat and int, not primality

Mario Carneiro (Feb 07 2022 at 12:37):

of course you can use FFI + implementedBy + reduceBool to add arbitrary C code to the TCB if you want

Mario Carneiro (Feb 07 2022 at 12:38):

so for instance you could link to those GMP primality routines and add them to lean

Johan Commelin (Feb 07 2022 at 12:38):

Might also be interesting to do this verified Goldbach in MM0

Mario Carneiro (Feb 07 2022 at 12:39):

lol, it's always on my mind... I'm thinking to wait until I have verified computational reflection before tackling these kinds of problems though

Mario Carneiro (Feb 07 2022 at 12:42):

Maybe it might be better to start a little simpler, for example calculating the number of primes <= 100000 in lean 4 compared to (pick your favorite other language)

Bhavik Mehta (Sep 12 2022 at 18:18):

image.png Here's a proof that the number of primes less than 100000 is 9592 in Lean 3

Reid Barton (Sep 12 2022 at 18:35):

And kernel typechecking time?

Bhavik Mehta (Sep 12 2022 at 18:36):

How can I measure this?

Bhavik Mehta (Sep 12 2022 at 18:38):

image.png Here's all the other things the profiler tells me (on this run it says execution was around 4.5s)

Kevin Buzzard (Sep 12 2022 at 23:13):

But you didn't do this by explicitly computing all the primes less than 10^5, right?

Kevin Buzzard (Sep 12 2022 at 23:14):

BTW if anyone has questions for Helfgott I'm at a conference with him this week

Bhavik Mehta (Sep 12 2022 at 23:33):

I didn't, but I did also separately compute all the primes less than 2 * 10^5

Bhavik Mehta (Sep 12 2022 at 23:33):

I've also managed to check Goldbach (again in Lean 3) up to 2*10^5. In particular the statement

lemma goldbach_kevin :  i  200000, 4  i  even i   p q : , p.prime  q.prime  p + q = i :=

is sorry-free

Chris Lovett (Sep 16 2022 at 21:02):

This looks like fun learning experience. I tried super simple Eratosthenes Sieve in Lean 4 but it runs out of stack space pretty quick:

partial def EratosthenesSieveSimple : List   List 
  | [] => []
  | x :: xs => x :: EratosthenesSieveSimple (xs.filter (λ y => y % x  0))

So I used a StateM monad to allow some tail recursion:

partial def EratosthenesSieve (xs : List ) : StateM (List ) Unit := do
  match xs with
  | [] => pure ()
  | x :: xs' => do
    let r  get
    set (x :: r)
    EratosthenesSieve (xs'.filter (λ y => y % x  0))

The compiled version of this can find the first 1 million primes in about 93 seconds on my AMD Ryzen 9 CPU. It found 78498 primes. I can then save this to a new lean source file containing def primes := [2, 3, 5, 7, ...] but when I try and compile this simple array the Lean compiler times out, which is odd:

.\.\.\Primes1Million.lean:1:14: error: (deterministic) timeout at 'isDefEq', maximum number of heartbeats (200000) has b                                             een reached (use 'set_option maxHeartbeats <num>' to set the limit)

I was thinking of saving this file as a cheat so we could use it to implement a more efficient Nat.prime function, e.g. use it to build a HashMap of all the numbers with a true false saying whether they are primes or not so p.prime becomes an O(1) lookup....

Mario Carneiro (Sep 16 2022 at 21:20):

how long is your list def primes := [2, 3, 5, 7, ...]?

Mario Carneiro (Sep 16 2022 at 21:29):

Here's a MWE:

macro "make_list" : term => do
  `([$(mkArray 5000 ( `(0))),*])
def foo := make_list

While I expect there to be problems eventually, 5000 seems unusually low for them to show up here. I think there are some issues in the scheme used for compiling list literals using nested let bindings.

Mario Carneiro (Sep 16 2022 at 21:49):

It seems like there is some quadratic behavior here with it spending a huge amount of time in elimMVarDeps traversing the term to ensure no metavariables contain references to the newly introduced let bindings produced by the [xs,* | tail] notation

Mario Carneiro (Sep 16 2022 at 21:50):

This is really unnecessary, this notation should just be an elab and produce a term directly instead of macro expanding to a pile of lets that have to be elaborated

Kyle Miller (Sep 16 2022 at 22:30):

@Chris Lovett FYI, that's not the true sieve of Eratosthenes (though it's the one you see as a functional programming example, and there's a paper about this).

Here's a simple implementation of the true one -- note that it doesn't have any modulo operators nor any checking that a given number is not divisible by all of the primes found so far:

def primes (n : Nat) : Array Nat := Id.run do
  let mut res : Array Nat := #[]
  let mut buf : Array Bool := .mkArray n True
  for i in [2 : n] do
    if buf[i]! then
      res := res.push i
      for j in [i * i : n : i] do
        buf := buf.set! j False
  return res

def main : IO Unit := do
  let a := primes 1000000
  for p in a do
    IO.println s!"{p}"

Compiled, I get

$ time ./build/bin/primes
...  78498 lines omitted ...
./build/bin/primes  0.11s user 0.17s system 86% cpu 0.323 total

93 seconds seemed excessive for the primes less than a million!

Chris Lovett (Sep 17 2022 at 01:50):

Ok, building on your nice fast primes function we can find the sum of 2 primes for every even i >= 4 with this:

abbrev NatMap := HashMap Nat Nat

def goldback (n: ) := do
  let primes := primes n
  let mut map := HashMap.empty
  for p in primes do
    map := map.insert p p
  for i in [4 : n : 2] do
    let mut found := false
    for p in primes do
      if p > i then
        break
      else
        let q := i - p
        if map.contains q then
          found := true
          break
    if not found then
      IO.println s!"{i} disproves goldback conjecture"
      break

and the compiled version can prove a pair p + q exists for all 5,761,455 primes found under 100 million in 9 seconds which is not bad.

Chris Lovett (Sep 17 2022 at 01:55):

The performance seems quite linear (it increases by a factor of 10 each time I increase n by a factor of 10) but of course this solution will not scale too far before running out of memory, we'd need a more compact collection for checking whether q is a prime number... or a SQLite database partitioned across a bunch of terabyte drives or something... but then of course all that will slow things down....

Chris Lovett (Sep 17 2022 at 02:21):

I could fit 1 billion in memory which it did in 101 seconds... but I only have a piddly little 32GB RAM so 10 billion is where I run out. I need one of those machine learning boxes with a terabyte of RAM :-)

Tom (Sep 19 2022 at 02:21):

For the sieve, you could start by using a bit-vector instead of a hashmap, which will reduce your space overhead significantly.
Then, you can also halve the memory used by only storing the odd numbers, since any prime must be odd. :smile:

Tom (Sep 19 2022 at 02:28):

There are also some sparse bitvector data structures which would allow you to take advantage of the fact that your bitvector will be mostly empty because of https://en.wikipedia.org/wiki/Prime_number_theorem.

Bhavik Mehta (Oct 11 2022 at 01:48):

I think these approaches do a different thing to what I did - evaluating whether such a counterexample exists doesn't give you a Lean 4 proof that none exist. Certainly for Lean 3 it's a lot faster to evaluate/run instead of proving, and I expect similar for Lean 4

joaogui1 (he/him) (Oct 20 2022 at 01:29):

What's GMP and TCB?

Bolton Bailey (Oct 20 2022 at 01:39):

GMP is the GNU MultiPrecision Library it's a C library for doing integer arithmetic on numbers that are too big to fit into typical computer registers. TCB stands for "trusted computing base" - it means the set of all software we use when doing formally verification that we assume is bug-free.

joaogui1 (he/him) (Oct 20 2022 at 09:19):

Thanks!


Last updated: Dec 20 2023 at 11:08 UTC