Zulip Chat Archive

Stream: lean4

Topic: Checking the Goldbach conjecture

Kevin Buzzard (Feb 07 2022 at 11:23):

The Goldbach conjecture is that every even number $n\geq4$ is the sum of two primes. It's still an open question. Computationally-inclined number theorists sometimes write computer programs to check it for all even $n\leq X$ , where $X$ grows slowly over time as computers get better. How big can we make $X$ in Lean?

Here is an example of a paper from 1993 by Sinisalo about numerically verifying the Goldbach conjecture up to $4\times10^{11}$ (using non-formal techniques, as do all the papers I cite below). It's published in Math Comp, which was at that time the top journal for computational mathematics, so one can imagine that it was state of the art at the time. It doesn't do anything fancy involving zero estimates for the Riemann zeta functions (some other papers in this area do); it just does a brute force variant of Eratosthenes' sieve. The details of the algorithm are in this earlier paper of Granville et al, where if you can HANDLE THE FORTRAN CODE they explain the very simple method (my understanding, at least, is that this is the algorithm Sinisalo used as well). It seems that in 1988 (Granville et al) they could get up to $X=2\times10^{10}$ , in 1993 (Sinisalo) they had got as far as $4\times 10^{11}$ , and the most recent paper I can find is by Oliviera e Silva, Herzog and Pardi here which push the calculations up to $4\times10^{18}$ and again they are using very simple algorithms (they also give a history of the computational aspect of the problem at the beginning, and they explain their simple algorithms in the paper, but it's just Eratosthenes + some extra thought to make it computationally more efficient).

Obviously computers are many times faster now than they were in 1993 (at the time of Sinisalo's paper), however formal proofs as opposed to Fortran proofs slow things down again. I was wondering how far one could get in Lean 4, using GMP if necessary (I would also be interested in whether GMP speeds things up or whether for numbers of this size it's of no use). It would be a pretty simple job to knock off a very naive algorithm to start verifying Goldbach, and if anyone gets the bug and wants to start setting formalisation records then these more advanced algorithms which seem to be giving state of the art results are really still extremely elementary in nature as you can see from their description in the linked papers. I have no idea what the record is for formalised computational Goldbach.

Why am I interested? Well, the ternary Goldbach conjecture is the weaker statement that every odd natural >= 7 is the sum of three primes, and a proof was announced a few years ago by Helfgott. Helfgott's proof is in two completely different parts; he uses abstract techniques from analytic number theory to prove the results for odd numbers $\geq 8\times 10^{31}$ and then (with Platt -- see here does a brute force computer search to check the cases less than this bound. The proof of the abstract analytic number theory part of the argument is 300 pages long so is unlikely to be formalised any time soon, however the basic techniques used in it by Helfgott (the circle method etc) are used a lot in this area, and are being formalised by mathematicians in this Lean project here, so it led me to idly speculate about how feasible it would be to formalise the various parts of the proof of the ternary conjecture and in particular the computational part. For the computational part Helfgott uses the Oliviera e Silva, Herzog and Pardi result, which reduces the claim that ternary Goldbach is true up to $8\times 10^{31}$ to the claim that there are no two consecutive primes in this range which are distance $\geq 4\times 10^{18}$ apart, and this is what he proves using a (unformalised) computer proof. I would imagine that formalised methods right now can't get close to Helfgott's unformalised bound but it seems to me that if the state of the art is being achieved with these essentially elementary arguments then it might make a good computational project for someone interested in seeing how far Lean 4 can get; certainly no maths library will be needed, for example.

Siddhartha Gadgil (Feb 07 2022 at 11:48):

This is very interesting. I guess a formal proof can simply be an enumeration, with even numbers listed as sums of pairs of primes and a proof of the equality.

Kevin Buzzard (Feb 07 2022 at 11:50):

Exactly. But I have no feeling at all as to the complexity of doing this to a non-trivial bound in a formal setting. I know that they computed the first million digits of pi in Coq although the non-formal people are up to about 10^13 digits by now.

Siddhartha Gadgil (Feb 07 2022 at 12:00):

A very nice feature of lean 4 compiled code is that proofs are eliminated at runtime after checking correctness while compiling. So there is hope the overhead is small.

Johan Commelin (Feb 07 2022 at 12:30):

If Lean 4 uses GMP, then the implementation of GMP (in C, I guess) is part of the TCB, right?

Johan Commelin (Feb 07 2022 at 12:31):

Or is there a verified reimplementation?

Mario Carneiro (Feb 07 2022 at 12:32):

there are plans for a verified (or at least homegrown and not GPL) reimplementation, but indeed that's the current state of things

Reid Barton (Feb 07 2022 at 12:35):

$4 \cdot 10^{18}$ is around $2^{64}/4.5$ ; so I don't think GMP will be relevant to that part (and maybe this is why they stopped at that particular value).

Johan Commelin (Feb 07 2022 at 12:35):

Aha, because if you off-load the bignum computations and the primality checks to a non-verified GMP, then there is little value in doing the project in Lean, methinks.

Mario Carneiro (Feb 07 2022 at 12:36):

the part of GMP in the lean TCB only includes + - * / % on nat and int, not primality

Mario Carneiro (Feb 07 2022 at 12:37):

of course you can use FFI + implementedBy + reduceBool to add arbitrary C code to the TCB if you want

Mario Carneiro (Feb 07 2022 at 12:38):

so for instance you could link to those GMP primality routines and add them to lean

Johan Commelin (Feb 07 2022 at 12:38):

Might also be interesting to do this verified Goldbach in MM0

Mario Carneiro (Feb 07 2022 at 12:39):

lol, it's always on my mind... I'm thinking to wait until I have verified computational reflection before tackling these kinds of problems though

Mario Carneiro (Feb 07 2022 at 12:42):

Maybe it might be better to start a little simpler, for example calculating the number of primes <= 100000 in lean 4 compared to (pick your favorite other language)

Bhavik Mehta (Sep 12 2022 at 18:18):

image.png Here's a proof that the number of primes less than 100000 is 9592 in Lean 3

Reid Barton (Sep 12 2022 at 18:35):

And kernel typechecking time?

Bhavik Mehta (Sep 12 2022 at 18:36):

How can I measure this?

Bhavik Mehta (Sep 12 2022 at 18:38):

image.png Here's all the other things the profiler tells me (on this run it says execution was around 4.5s)

Kevin Buzzard (Sep 12 2022 at 23:13):

But you didn't do this by explicitly computing all the primes less than 10^5, right?

Kevin Buzzard (Sep 12 2022 at 23:14):

BTW if anyone has questions for Helfgott I'm at a conference with him this week

Bhavik Mehta (Sep 12 2022 at 23:33):

I didn't, but I did also separately compute all the primes less than 2 * 10^5

Bhavik Mehta (Sep 12 2022 at 23:33):

I've also managed to check Goldbach (again in Lean 3) up to 2*10^5. In particular the statement

lemma goldbach_kevin : ∀ i ≤ 200000, 4 ≤ i → even i → ∃ p q : ℕ, p.prime ∧ q.prime ∧ p + q = i :=

is sorry-free

Chris Lovett (Sep 16 2022 at 21:02):

This looks like fun learning experience. I tried super simple Eratosthenes Sieve in Lean 4 but it runs out of stack space pretty quick:

partial def EratosthenesSieveSimple : List ℕ → List ℕ
  | [] => []
  | x :: xs => x :: EratosthenesSieveSimple (xs.filter (λ y => y % x ≠ 0))

So I used a StateM monad to allow some tail recursion:

partial def EratosthenesSieve (xs : List ℕ) : StateM (List ℕ) Unit := do
  match xs with
  | [] => pure ()
  | x :: xs' => do
    let r ← get
    set (x :: r)
    EratosthenesSieve (xs'.filter (λ y => y % x ≠ 0))

The compiled version of this can find the first 1 million primes in about 93 seconds on my AMD Ryzen 9 CPU. It found 78498 primes. I can then save this to a new lean source file containing def primes := [2, 3, 5, 7, ...] but when I try and compile this simple array the Lean compiler times out, which is odd:

.\.\.\Primes1Million.lean:1:14: error: (deterministic) timeout at 'isDefEq', maximum number of heartbeats (200000) has b                                             een reached (use 'set_option maxHeartbeats <num>' to set the limit)

I was thinking of saving this file as a cheat so we could use it to implement a more efficient Nat.prime function, e.g. use it to build a HashMap of all the numbers with a true false saying whether they are primes or not so p.prime becomes an O(1) lookup....

Mario Carneiro (Sep 16 2022 at 21:20):

how long is your list def primes := [2, 3, 5, 7, ...]?

Mario Carneiro (Sep 16 2022 at 21:29):

Here's a MWE:

macro "make_list" : term => do
  `([$(mkArray 5000 (← `(0))),*])
def foo := make_list

While I expect there to be problems eventually, 5000 seems unusually low for them to show up here. I think there are some issues in the scheme used for compiling list literals using nested let bindings.

Mario Carneiro (Sep 16 2022 at 21:49):

It seems like there is some quadratic behavior here with it spending a huge amount of time in elimMVarDeps traversing the term to ensure no metavariables contain references to the newly introduced let bindings produced by the [xs,* | tail] notation

Mario Carneiro (Sep 16 2022 at 21:50):

This is really unnecessary, this notation should just be an elab and produce a term directly instead of macro expanding to a pile of lets that have to be elaborated

Kyle Miller (Sep 16 2022 at 22:30):

@Chris Lovett FYI, that's not the true sieve of Eratosthenes (though it's the one you see as a functional programming example, and there's a paper about this).

Here's a simple implementation of the true one -- note that it doesn't have any modulo operators nor any checking that a given number is not divisible by all of the primes found so far:

def primes (n : Nat) : Array Nat := Id.run do
  let mut res : Array Nat := #[]
  let mut buf : Array Bool := .mkArray n True
  for i in [2 : n] do
    if buf[i]! then
      res := res.push i
      for j in [i * i : n : i] do
        buf := buf.set! j False
  return res

def main : IO Unit := do
  let a := primes 1000000
  for p in a do
    IO.println s!"{p}"

Compiled, I get

$ time ./build/bin/primes
...  78498 lines omitted ...
./build/bin/primes  0.11s user 0.17s system 86% cpu 0.323 total

93 seconds seemed excessive for the primes less than a million!

Chris Lovett (Sep 17 2022 at 01:50):

Ok, building on your nice fast primes function we can find the sum of 2 primes for every even i >= 4 with this:

abbrev NatMap := HashMap Nat Nat

def goldback (n: ℕ) := do
  let primes := primes n
  let mut map := HashMap.empty
  for p in primes do
    map := map.insert p p
  for i in [4 : n : 2] do
    let mut found := false
    for p in primes do
      if p > i then
        break
      else
        let q := i - p
        if map.contains q then
          found := true
          break
    if not found then
      IO.println s!"{i} disproves goldback conjecture"
      break

and the compiled version can prove a pair p + q exists for all 5,761,455 primes found under 100 million in 9 seconds which is not bad.

Chris Lovett (Sep 17 2022 at 01:55):

The performance seems quite linear (it increases by a factor of 10 each time I increase n by a factor of 10) but of course this solution will not scale too far before running out of memory, we'd need a more compact collection for checking whether q is a prime number... or a SQLite database partitioned across a bunch of terabyte drives or something... but then of course all that will slow things down....

Chris Lovett (Sep 17 2022 at 02:21):

I could fit 1 billion in memory which it did in 101 seconds... but I only have a piddly little 32GB RAM so 10 billion is where I run out. I need one of those machine learning boxes with a terabyte of RAM :-)

Tom (Sep 19 2022 at 02:21):

For the sieve, you could start by using a bit-vector instead of a hashmap, which will reduce your space overhead significantly.
Then, you can also halve the memory used by only storing the odd numbers, since any prime must be odd. :smile:

Tom (Sep 19 2022 at 02:28):

There are also some sparse bitvector data structures which would allow you to take advantage of the fact that your bitvector will be mostly empty because of https://en.wikipedia.org/wiki/Prime_number_theorem.

Bhavik Mehta (Oct 11 2022 at 01:48):

I think these approaches do a different thing to what I did - evaluating whether such a counterexample exists doesn't give you a Lean 4 proof that none exist. Certainly for Lean 3 it's a lot faster to evaluate/run instead of proving, and I expect similar for Lean 4

joaogui1 (he/him) (Oct 20 2022 at 01:29):

What's GMP and TCB?

Bolton Bailey (Oct 20 2022 at 01:39):

GMP is the GNU MultiPrecision Library it's a C library for doing integer arithmetic on numbers that are too big to fit into typical computer registers. TCB stands for "trusted computing base" - it means the set of all software we use when doing formally verification that we assume is bug-free.

joaogui1 (he/him) (Oct 20 2022 at 09:19):

Thanks!

Harald Helfgott (Mar 27 2024 at 22:00):

I've noticed this stream only now - not sure anybody is still reading.

A trick that most lazy schoolchildren learn: it is enough to check for divisibility by primes up to sqrt(N). Any composite number up to <=N must have a prime factor <=sqrt(N).
The way you deal with limited in space (in any language) is by using a segmented sieve (... and that is efficient because of point 0. above): you can run a sieve of Erathostenes on a segment of the form [N,N+sqrt(N)] (say) in time roughly linear on sqrt(N).
(In fact you can go down to N^(1/3) if you are tricky, though that is more complicated.)

Harald Helfgott (Mar 27 2024 at 22:10):

The code examples given here simply run a sieve of Eratosthenes in Lean - in pretty much the same way and in the same sense as they would run it in any functional language (AFAIK). They are not giving a formal proof that the array that is being returned is actually telling us the truth.

I imagine there are two options:
3a (faster but I was told a few years ago that this was science-fiction, at least in another theorem prover): write a formal proof that the code tells you the truth. The difficulty here (I was told) lies in proving that a loop does what it is supposed to do, without actually going through it N times. Not sure why this should be the case (obviously someone will say 'halting problem', but I am talking about this particular problem; living after Gödel doesn't keep us from doing math either)
3b: writing a formal proof in Lean that actually goes through N iterations of the loops and gives us a certificate at the end that an array is telling us the truth.

I see no reason why 3b should be difficult for someone with experience in Lean. I am a rank beginner, so I have to ask how to go about it. I'd gladly help, in part so that I can learn more Lean.

(Obviously we would want to keep the time complexity roughly linear; otherwise one might as well just churn a list of prime certificates in a more conventional way, one by one, using an existing primality checker.)

Mario Carneiro (Mar 27 2024 at 22:16):

The code examples given here simply run a sieve of Eratosthenes in Lean - in pretty much the same way and in the same sense as they would run it in any functional language (AFAIK). They are not giving a formal proof that the array that is being returned is actually telling us the truth.

Note that this is generally not an issue. Once you have some code running in Lean, you can prove the correctness of the algorithm in O(1) work (possibly messy, but generally much less computational work than running the algorithm itself) and then combine it with the results of the algorithm to get a full formal proof out the other end

Mario Carneiro (Mar 27 2024 at 22:17):

That is to say, 3a is not science fiction at all, and it doesn't require going through the loop multiple times, as it's pure symbolic reasoning

Mario Carneiro (Mar 27 2024 at 22:19):

I personally find proofs in the style of 3a ("extrinsic verification") preferable to 3b ("intrinsic verification") because it means it is easier to guarantee that the verification stuff does not interfere with the compiler producing efficient-enough code

Harald Helfgott (Mar 27 2024 at 22:27):

Note that this is generally not an issue. Once you have some code running in Lean, you can prove the correctness of the algorithm in O(1) work (possibly messy, but generally much less computational work than running the algorithm itself) and then combine it with the results of the algorithm to get a full formal proof out the other end
11:17 PM

That is to say, 3a is not science fiction at all, and it doesn't require going through the loop multiple times, as it's pure symbolic reasoning

That's excellent news - and would like to see how to do it, preferably on this particular example (which is easy and interesting, at least to me; it is a useful algorithm that is at least 19 centuries old -- possibly 24).

Assia Mahboubi (Mar 27 2024 at 23:28):

FWIW, some related work to material discussed in this thread:

The CoqPrime library
A verified analogue of GMP: paper and code

Mario Carneiro (Mar 28 2024 at 01:05):

@Harald Helfgott Currently, proving facts about do notation programs is a bit fiddly, so I've taken the liberty of rewriting the original function into a more continuation-passing style (which is compiled to better code, as well) rather than reasoning directly about all the forIn cominators etc. But we can still run it just as before, and we can prove properties about the function by using a bunch of lemmas about array operators and induction:

import Mathlib.Data.Nat.Prime

-- def primes (n : Nat) : Array Nat := Id.run do
--   let mut res : Array Nat := #[]
--   let mut buf : Array Bool := .mkArray n true
--   for i in [2 : n] do
--     if buf[i]! then
--       res := res.push i
--       for j in [i * i : n : i] do
--         buf := buf.set! j false
--   return res

def primes (n : Nat) : Array Nat :=
  let rec loop1 (i : {i : Nat // i > 0})
      (buf : { arr : Array Bool // arr.size = n }) (res : Array Nat) : Array Nat :=
    if hi : i < n then
      if buf.1[i.1]'(by rw [buf.2]; exact hi) then
        let rec loop2 (j : Nat)
            (buf : { arr : Array Bool // arr.size = n }) : { arr : Array Bool // arr.size = n } :=
          if hj : j < n then
            loop2 (j + i) ⟨buf.1.set ⟨j, by rw [buf.2]; exact hj⟩ false, by simp [buf.2]⟩
          else
            buf
        termination_by n - j
        loop1 ⟨i+1, Nat.succ_pos _⟩ (loop2 (i * i) buf) (res.push i)
      else loop1 ⟨i+1, Nat.succ_pos _⟩ buf res
    else
      res
  termination_by n - i.1
  loop1 ⟨2, by decide⟩ ⟨.mkArray n true, by simp⟩ #[]

Mario Carneiro (Mar 28 2024 at 01:06):

lemma mem_primes_loop2 {n i hi buf buf' j} (m)
    (eq : primes.loop1.loop2 n ⟨i, hi⟩ j buf = buf')
    (j_eq : j = i * i + m * i)
    {k} (hk : k < n) :
      buf'.1[k]'(by rw [buf'.2]; exact hk) = true ↔
      buf.1[k]'(by rw [buf.2]; exact hk) = true ∧
      (j ≤ k → ∀ m, k ≠ i * i + m * i) := by
  rw [primes.loop1.loop2] at eq; split at eq <;> rename_i hj
  · rw [mem_primes_loop2 (j := j+i) (m+1) eq (by rw [j_eq, Nat.succ_mul, add_assoc]) hk]
    simp; rw [Array.get_set (hj := by rw [buf.2]; exact hk)]; split <;> rename_i jk <;> simp
    · cases show j = k from jk
      exact fun _ => ⟨le_rfl, _, j_eq⟩
    · refine fun _ => ⟨fun H h m' eq => ?_, fun H h => H ((Nat.le_add_right ..).trans h)⟩
      refine H ?_ _ eq
      have := lt_of_le_of_ne h jk
      subst eq j_eq
      rw [add_lt_add_iff_left] at this
      have := Nat.lt_of_mul_lt_mul_right this
      rw [add_assoc, add_le_add_iff_left, ← Nat.succ_mul]
      exact Nat.mul_le_mul_right _ this
  · subst eq; simp [hk.trans_le (not_lt.1 hj)]
termination_by n - j

lemma mem_primes_loop1 {p n i hi buf res} (i2 : 2 ≤ i) (hi' : i ≤ max 2 n)
    (hres : p ∈ res.data ↔ p < i ∧ Nat.Prime p) (sz : buf.size = n)
    (hbuf : ∀ k (h : k < n), buf[k]'(sz ▸ h) = true ↔
      ∀ q < i, q.Prime → ∀ m, k ≠ q * q + m * q) :
    p ∈ primes.loop1 n ⟨i, hi⟩ ⟨buf, sz⟩ res ↔ p < n ∧ Nat.Prime p := by
  unfold primes.loop1; simp; split <;> rename_i h1
  · have : buf[i] = true ↔ i.Prime := by
      rw [hbuf _ h1]; constructor
      · refine fun h2 => Nat.prime_def_le_sqrt.2 ⟨i2, fun m m2 hm mi => ?_⟩
        let ⟨q, pq, qm⟩ := Nat.exists_prime_and_dvd (Nat.ne_of_gt m2)
        have hq := Nat.le_sqrt.1 <| (Nat.le_of_dvd (Nat.le_of_lt m2) qm).trans hm
        obtain ⟨c, rfl⟩ := qm.trans mi
        have qc := (mul_le_mul_iff_of_pos_left pq.pos).1 hq
        apply h2 q (lt_mul_right pq.pos (pq.two_le.trans qc)) pq (c - q)
        rw [← add_mul, Nat.add_sub_cancel' qc, Nat.mul_comm]
      · intro pi q qi pq m eq
        exact qi.ne' <| (pi.dvd_iff_eq pq.ne_one).1 ⟨q+m, by rw [eq, ← add_mul, mul_comm]⟩
    split <;> rename_i pi <;> simp only [this] at pi
    · generalize eq : primes.loop1.loop2 .. = buf'
      refine mem_primes_loop1 (Nat.le_succ_of_le i2) (le_max_of_le_right h1)
        ?_ buf'.2 fun k hk => ?_
      · simp [hres, Nat.lt_succ_iff_lt_or_eq, or_and_right]
        exact or_congr_right (and_iff_left_iff_imp.2 (· ▸ pi)).symm
      · rw [mem_primes_loop2 0 eq (by simp) hk, hbuf _ hk]
        simp [Nat.lt_succ_iff_lt_or_eq, or_imp, forall_and, pi]
        exact fun _ => or_iff_not_imp_left.2 fun h2 m eq => h2 (eq ▸ Nat.le_add_right ..)
    · refine mem_primes_loop1 (Nat.le_succ_of_le i2) (le_max_of_le_right h1) ?_ sz fun k hk => ?_
      · rw [hres, Nat.lt_succ_iff_lt_or_eq]
        refine and_congr_left fun pp => (or_iff_left_of_imp fun h => ?_).symm
        subst i; contradiction
      · simp [hbuf _ hk, Nat.lt_succ_iff_lt_or_eq, or_imp, forall_and, pi]
  · have := le_antisymm hi' (max_le i2 (not_lt.1 h1)); subst this
    rw [Array.mem_def, hres, lt_max_iff]
    exact and_congr_left fun pp => or_iff_right pp.two_le.not_lt
termination_by n - i

theorem mem_primes (n : Nat) {p : Nat} : p ∈ primes n ↔ p < n ∧ p.Prime :=
  mem_primes_loop1 le_rfl (le_max_left ..)
    (by simp; exact fun p2 pp => pp.two_le.not_lt p2) _
    (fun k hk => by simp; exact fun _ p2 pp => (pp.two_le.not_lt p2).elim)

Mario Carneiro (Mar 28 2024 at 01:12):

(Note that we are all aware that this kind of proof is very difficult to write and maintain, and have a variety of ideas for being able to write better monadic verification code. But this at least shows that it is possible to do this kind of thing today, and indeed many of the algorithms in mathlib and std are verified in this way.)

Kim Morrison (Mar 28 2024 at 01:14):

And then:

example : Nat.Prime 101 := ((mem_primes 200).mp (by native_decide)).2

Mario Carneiro (Mar 28 2024 at 01:15):

well, that's actually not the best way to verify primality of a single number. Eratosthenes is optimized for the case where you have a whole dense set of numbers you want to check primality for

Kim Morrison (Mar 28 2024 at 01:25):

Sure.

Harald Helfgott (Mar 28 2024 at 01:50):

Ah, nice!

Harald Helfgott (Mar 28 2024 at 11:15):

Note that we are all aware that this kind of proof is very difficult to write and maintain, and have a variety of ideas for being able to write better monadic verification code.

I'd say this is the natural next challenge: make the code and the proof readable, and the task of producing them accessible to a working mathematician who happens to use Lean.

Jonatas Miguel (Mar 28 2024 at 14:17):

Is this an appropriate place to talk about ideas related to the Goldbach Conjecture? Or is this more about Lean specific discussions/efforts related to the Goldbach Conjecture? If it's not an appropriate place for some general, non-Lean related, discussions on GC, what would be an appropriate place for someone like myself to share some ideas?

Johan Commelin (Mar 28 2024 at 14:22):

@Jonatas Miguel The thread in this stream is very much about formalization, and I would say only about weak GC, not GC.
If your ideas are a mix of math but with formalization in mind, then #maths would be a good place. Otherwise, I would suggest a blogpost, or math.stackexchange or other online place where math is discussed.

Jonatas Miguel (Mar 28 2024 at 14:25):

Got it, yeah, I have some ideas and observations around patterns I've seen while playing around with the GC on my free time. But, I don't have proofs for anything just yet.

I'm not very experienced with formalization. I'm trying to learn but it's a bit slow going for me. I'll keep on working at it though.

Thanks for the response.

Mario Carneiro (Mar 28 2024 at 20:50):

@Harald Helfgott said:

Note that we are all aware that this kind of proof is very difficult to write and maintain, and have a variety of ideas for being able to write better monadic verification code.

I'd say this is the natural next challenge: make the code and the proof readable, and the task of producing them accessible to a working mathematician who happens to use Lean.

The ideas of the proof (and the proof strategy) are not that difficult: There is one function for every loop, and one theorem for every function which asserts what properties hold before and after the function. These are usually called "loop invariants" in software verification. Most of the loop invariants are just saying exactly what was done to the buf variable, and then there is an important have : buf[i] = true ↔ i.Prime subgoal in mem_primes_loop1 which does the mathematically interesting work of proving that ∀ q < i, q.Prime → ∀ m, i ≠ q * q + m * q if and only if i.Prime (given 2 ≤ i), which is a slightly obscured version of ∀ q ≤ sqrt i, q.Prime → ¬q ∣ i.

Then there is a bunch of boring and unnecessarily difficult work dealing with all the array get and set functions, and also to some extent the numeric manipulations can probably be simplified with omega, linarith and aesop for the general stuff. But if you look at the proof in the editor, the subgoals should all look reasonable.

Harald Helfgott (Mar 28 2024 at 22:09):

Mario Carneiro: that's what I meant - it would be better for the system to be sufficiently developed for the ideas of the proof to be reasonably clear from a well-written formal proof, and for the code itself to remain relatively transparent.

Harald Helfgott (Mar 30 2024 at 11:27):

Just to get this right: from reading the code above, I get the impression that 'continuation-style code' consists in replacing each loop by a sub-procedure defined by means of tail recursion, with the loop variable becoming a parameter. Is this correct?

Mario Carneiro (Mar 30 2024 at 11:50):

Yes. When you use for loops in do notation, it's actually syntactic sugar for some combinators that are ultimately defined using tail recursion. It's the form that the compiler normally uses for loops, and using it directly gives you some additional flexibility compared to using the combinators. It definitely looks less clean than the imperative style though.

Harald Helfgott (Mar 30 2024 at 13:30):

Beginner's question: I'm trying to create a file to run the above code and the above proof. I followed the instructions in https://gist.github.com/jcommelin/1d45a0ea7a84a87db8a28a12e93cac32 , Scenario 1. However, all I am getting in Lean Infoview with VS Code is "Waiting for Lean server to start..."

What to do?

Jonatas Miguel (Mar 30 2024 at 14:10):

Harald Helfgott said:

[...] I followed the instructions in https://gist.github.com/jcommelin/1d45a0ea7a84a87db8a28a12e93cac32 , Scenario 1. However, all I am getting in Lean Infoview with VS Code is "Waiting for Lean server to start..." [...]

Are people still using lean v3? The codebase that that gist refers to (specifically the link https://github.com/leanprover/mathlib/blob/master/leanpkg.toml#L4) even says to use lean v4.

The way that I got lean v4 working in VS Code was to install the lean4 extension for VS Code and follow the instructions it outlines there once installed. Once I did that I was able to create projects from scratch or even download existing projects and run them, though, the latter required an extra step that I found here.

I hope this helps. :fingers_crossed:

Harald Helfgott (Mar 30 2024 at 14:41):

Well, I just came back from a "Lean for the curious mathematician" workshop - in which we used Lean 4, and in which everything seemed to go well. Once I came back home, I tried to just use VS code and start a new project, and the system told me that it knew of no such thing as mathlib. Hence my confusion (and my following outdated information).

Eric Wieser (Mar 30 2024 at 16:01):

I would encourage you to create a thread in #new members if you are having trouble getting set up.

Harald Helfgott (Mar 30 2024 at 17:22):

Thanks. All right, I am now working within the directory that I pulled using git during workshop, and Lean and mathlib are running. I'll figure out later how to install things on my own and not bother people here.

Now, however, it seems I do not know how to use the above code. What do I do in order to get a table of the primes <= 1000, or a proof that that table is correct?

Matt Diamond (Mar 30 2024 at 20:12):

#eval primes 1000
#check mem_primes 1000

Harald Helfgott (Mar 30 2024 at 21:14):

Thanks. Why does mem_primes take an argument? Shouldn't an extrinsic proof work for all n? (That is, it should prove that the algorithm gives the correct result for arbitrary input.)

Matt Diamond (Mar 30 2024 at 22:15):

Yes, sorry if that was unclear... mem_primes by itself is a proof that ∀ n, p ∈ primes n ↔ p < n ∧ p.Prime, so it doesn't need an input. However, you had asked about a proof that the table of primes generated by primes 1000 is correct, so I wrote mem_primes 1000 as a way of specifying that proof (since universal quantification is just a function from parameters to proofs).

Matt Diamond (Mar 30 2024 at 22:19):

Or are you asking why a universally quantified proposition can be treated like a function?

Harald Helfgott (Mar 30 2024 at 22:38):

Ah, all right. On the last question: no, that was not my question, but I guess the answer is that that is just how Lean's syntax works? (Or, more strongly, and if I understand correctly: for Lean, "prove that, for every $n\in \mathbb{N}$..." and "For every $n\in \mathbb{N}$, return a proof..." are one of the same thing.)

Last updated: May 02 2025 at 03:31 UTC