Zulip Chat Archive

Stream: mathlib4

Topic: Archive.Examples.MersennePrimes #5704


Johan Commelin (Jul 04 2023 at 08:04):

The remaining todo in this PR is a port of the tactic LucasLehmer.run_test. So if you have some meta skills, and love prime numbers, feel free to attack.

Thomas Murrills (Jul 04 2023 at 11:57):

Quick port of the tactic with a porting note! I added elab "lucas_lehmer_test" : tactic => LucasLehmer.run_test to get rid of ugly by run_tacs, since now lucas_lehmer_test is freed up ( I think).

Unfortunately, like a couple other mathlib tactics, the direct port of this means using set_option hygiene false in one place. I'm going to see if it's easy to make it hygienic or not for a second...

Thomas Murrills (Jul 04 2023 at 12:26):

@Kyle Miller re: lucas lehmer test—I would like to avoid making it unhygienic, but this means either inlining some replace code or (possibly better?) doing the stepwise computation without using the goal state, the latter of which is a bit tricky, since using a tactic call fills in the types for us nicely.

I also wonder if it's worth calling norm_num1 at h each time (or something equivalent), since the terms get rather large. (I'm not sure if this matters.) However, there's no extension for Int.mod yet, so this would really only serve to reduce the first argument of sMod, which becomes 0 + 1 + 1 + ....

Kyle Miller (Jul 04 2023 at 12:37):

@Thomas Murrills I'm looking into converting this into a norm_num extension right now

Kyle Miller (Jul 04 2023 at 12:40):

This should help avoid hygiene issues, and I hope it makes it'll make it more robust too (while also reducing the terms you're mentioning)

Thomas Murrills (Jul 04 2023 at 12:48):

Oh, nice! :D

Kyle Miller (Jul 04 2023 at 14:46):

@Johan Commelin @Thomas Murrills I changed the implementation to have the proof rely on kernel reductions. The Lean 4 kernel is good at normalizing natural number expressions, and with it we can have some tiny proof terms, while being able to reach past the start of the computer age (mersenne 521). It can prove that mersenne 4423 is prime in just a second.

Kyle Miller (Jul 04 2023 at 14:46):

I've set it back to awaiting review because it was a relatively large change to the PR

Kyle Miller (Jul 04 2023 at 14:47):

The only reason it can't go farther is because of deep recursion

Scott Morrison (Jul 05 2023 at 00:22):

This is amazing. Welcome, everyone, to the computer age. :-)

(I guess to be fair when making a comparison to Lean 3, this speed up depends on the fact we now include GMP in the trusted code base, right?)

Kyle Miller (Jul 05 2023 at 00:42):

Yeah, the speedup definitely depends on having a fast bigint library. It's also a more direct implementation without repeatedly using simp, norm_num, and the tactic evaluation system, which is significant as well.

In the Lean 3 implementation there also seems to be accidental unary here in the second argument to s_mod, like Thomas pointed out, though it doesn't seem like it'd add too much to the running time for small numbers like p = 521.

Scott Morrison (Jul 05 2023 at 01:09):

I think I've dealt with the deep recursion issue, by proving a tail recursive version of sMod'. Just threading through the proofs now.

Scott Morrison (Jul 05 2023 at 01:19):

Oh, I am completely wrong. I had thought that it was sMod' that was deeply recursing (I thought I even checked #eval sMod' (2^9689 - 1) (9689 - 2) errored...) but apparently it is not.

Scott Morrison (Jul 05 2023 at 01:35):

I don't understand what is going on, I guess:

def sMod' (q : ) :   
  | 0 => 4 % q
  | i + 1 => (sMod' q i ^ 2 + (q - 2)) % q

def sMod''_aux (q : ) :     
  | 0, acc => acc
  | i + 1, acc => sMod''_aux q i (((acc ^ 2) + (q - 2)) % q)

/-- tail recursive version of `sMod'` -/
def sMod'' (q a : ) :  := sMod''_aux q a (4 % q)

#reduce sMod' (2^61 - 1) (61 - 2) -- deep recursion!
#reduce sMod'' (2^61 - 1) (61 - 2) -- deep recursion!
#eval sMod' (2^9689 - 1) (9689 - 2) -- no problem
#eval sMod'' (2^9689 - 1) (9689 - 2) -- no problem

Scott Morrison (Jul 05 2023 at 01:37):

Why then are we getting:

-- no problem
example : (mersenne 4423).Prime :=
  lucas_lehmer_sufficiency _ (by norm_num) (by norm_num)

-- First failure ("deep recursion detected")
example : (mersenne 9689).Prime :=
  lucas_lehmer_sufficiency _ (by norm_num) (by norm_num)

Scott Morrison (Jul 05 2023 at 01:37):

(same behaviour whether we use sMod' or sMod''.)

Scott Morrison (Jul 05 2023 at 01:39):

If anyone wants to play with it, branch#lucas_lehmer_tailrec has the proof that sMod' and sMod'' agree, and uses (to no avail) sMod'' in the norm_num tactic.

Scott Morrison (Jul 05 2023 at 01:47):

I guess my fundamental mistake is thinking that kernel reduction cares about tail recursiveness?

Scott Morrison (Jul 05 2023 at 01:47):

I still don't understand why the limit is 9689 rather than 61. :-(

Kyle Miller (Jul 05 2023 at 14:20):

Scott Morrison said:

I still don't understand why the limit is 9689 rather than 61. :-(

Maybe it's that #reduce is using the frontend's reducer (docs#Lean.Meta.Reduce) rather than the kernel's?

From what I've understood reading the kernel code, it doesn't care about tail recursiveness, though it does cache reductions (so even the naive exponential implementation of the fibonacci sequence ends up being very fast since it's auto-memoized!)

Kyle Miller (Jul 05 2023 at 17:52):

@Scott Morrison I found a trick to get the kernel to reduce further, and now we've gotten to the year 1979 with mersenne 23209, which takes 1.51 seconds for norm_num to pre-evaluate the sMod'' function to make sure it's actually a mersenne prime, and then another 7.9 seconds for the kernel to verify the proof. The next one to have deep recursion is mersenne 44497.

The first part of the trick is this function:

/-- Ensure the `Nat` is reduced before evaluating `f` -/
def nat_seq (f : Nat  α) : Nat  α
  | 0 => f 0
  | n => f n

The pattern match causes the kernel to reduce the natural number first before unfolding f while calculating whnf. The next part is using this to break up the iteration into blocks:

def iterBlockSize : Nat := 800

def sMod_iter (q x : ) :   
  | 0 => x
  | steps + 1 => (sMod_iter q x steps ^ 2 + (q - 2)) % q

def sMod_iter' (q x : ) :   
  | 0 => x
  | i + 1 => nat_seq (sMod_iter' q · i) (sMod_iter q x iterBlockSize)

It might not be using tail recursion, but at least it's being more considerate of the stack depth.

Eric Rodriguez (Jul 05 2023 at 19:05):

Are these tricks universal or is there reason to make them be user choice?

Scott Morrison (Jul 06 2023 at 02:04):

@Kyle Miller, nice!

Scott Morrison (Jul 06 2023 at 02:06):

Does anyone know how to hook up further functions to GMP?

I'd like the left-hand-sides of both of these equalities to use native GMP implementations.

example (k n : ) : k >>> n = k / 2 ^ n := sorry
example (k n : ) : k &&& (2 ^ n - 1) = k % 2 ^ n := sorry

Scott Morrison (Jul 06 2023 at 02:07):

(I wrote the glue that replaces the modulus operation in sMod_iter with the "trick version"

k  k / 2 ^ n + k % 2 ^ n [MOD 2 ^ n - 1]

where you then know you only have to do a single subtraction to do the %. It still breaks at 44497 however.)

Scott Morrison (Jul 06 2023 at 03:51):

Oh, the problem at 44497 comes down to this difference:

example : 2^25744 - (2^25744 - 1) = 1 := by norm_num -- succeeds
example : 2^25745 - (2^25745 - 1) = 1 := by norm_num -- fails because of deep recursion

Scott Morrison (Jul 06 2023 at 05:27):

With a 6 line tweak to core, (mersenne 44497).Prime.

Scott Morrison (Jul 06 2023 at 06:27):

No further changes required, and it is plowing through them. Done with 132049. If there are no further obstacles, the current implementation could finish the pre-GIMP era today.

Johan Commelin (Jul 06 2023 at 06:28):

How long did 44497 and 132049 take?

Scott Morrison (Jul 06 2023 at 06:46):

  • 44497: 21s
  • 86243: 82s
  • 110503: 143s
  • 132049: 222s
  • 216091: 650s

Scott Morrison (Jul 06 2023 at 06:46):

it's nicely quadratic.

Scott Morrison (Jul 06 2023 at 06:48):

There's now a big gap until the last 3 pre-GIMP primes, that will probably take overnight to run (barring surprises).

Scott Morrison (Jul 06 2023 at 06:53):

The lean4 patch is at lean4#2310.

Mario Carneiro (Jul 06 2023 at 07:00):

Oh, this is a norm_num bug

Scott Morrison (Jul 06 2023 at 08:39):

Lean crashes after an hour or so of working on (mersenne 756839).Prime, and thus is unable to handle the 90s.

Mario Carneiro (Jul 06 2023 at 10:38):

What does this look like on your patch? This is on mine:

import Mathlib.Tactic.NormNum.Basic

set_option profiler true
example : 10^40000000 = 10^40000000 := by norm_num
-- norm_num took 582ms
-- norm_num took 563ms
-- type checking took 587ms

Mario Carneiro (Jul 06 2023 at 11:02):

PR'd in #5740

Scott Morrison (Jul 06 2023 at 11:19):

norm_num took 421ms
norm_num took 412ms
type checking took 415ms

Scott Morrison (Jul 06 2023 at 11:22):

and your PR on the same hardware:

norm_num took 824ms
norm_num took 816ms
type checking took 829ms

Mario Carneiro (Jul 06 2023 at 11:23):

In that case, I suggest not growing the TCB for this

Scott Morrison (Jul 06 2023 at 11:25):

I guess I had not appreciated that we are/could be drawing some line through GMP, and saying "this bit is trusted, this bit isn't necessarily". But I agree it makes sense.

Maybe we should try to record somewhere (perhaps even just in Mersenne example file) that some further speed-up is available if someone actually needs it, by trusting GMP's power function in the kernel, but that this is not the case for now?

Kyle Miller (Jul 06 2023 at 11:26):

Do you have the newest version pushed to the branch, or at least to another branch?

Mario Carneiro (Jul 06 2023 at 11:26):

I mean, if we want to support everything GMP has there are a few other operations I would want to put on the list, like Nat.gcd and Nat.prime

Scott Morrison (Jul 06 2023 at 11:26):

Is the difference between the two just caching?

Scott Morrison (Jul 06 2023 at 11:26):

The fact that your PR takes exactly twice as long is suggestive.

Scott Morrison (Jul 06 2023 at 11:27):

(In which case maybe it is even less of a big deal? Most people don't want to know 10^40000000 twice.)

Scott Morrison (Jul 06 2023 at 11:28):

@Kyle Miller, I just pushed on the lucas_lehmer_tailrec branch.

Scott Morrison (Jul 06 2023 at 11:28):

It won't be able to do the later examples without either merging Mario's #5740, or switching the toolchain to semorrison:release-20230706-nat-pow.

Scott Morrison (Jul 06 2023 at 11:29):

err... semorrison/lean4:release-20230706-nat-pow

Scott Morrison (Jul 06 2023 at 11:31):

I wrote most of the proofs for using the k ≡ k / 2 ^ n + k % 2 ^ n [MOD 2 ^ n - 1] trick (and taking advantage of the fact that you know you can compute the MOD by subtracting 2^n-1 at most once). Sadly it made things slower rather than faster!

Scott Morrison (Jul 06 2023 at 11:32):

I suspect that the / 2 ^ n and % 2 ^ n are killing it, because I was just using the default implementations of those rather than working out how to have them be done bitwise on GMP numbers.

Mario Carneiro (Jul 06 2023 at 11:33):

is this in the kernel or in the compiler?

Kyle Miller (Jul 06 2023 at 11:33):

It'd probably help to pre-compute 2 ^ n and such and pass the value in. That's what I was going to experiment with with Mario's #5740

Kyle Miller (Jul 06 2023 at 11:35):

Right now it's relying on kernel reduction for everything, no norm_num at all except for the argument to LucasLehmerTest

Scott Morrison (Jul 06 2023 at 11:36):

I did attempt to make it pre-compute 2 ^ n, but probably did it wrong. If you want to look I have the branch lucas_lehmer_trickMod. (It is still very messy!!)

Kyle Miller (Jul 06 2023 at 11:39):

It's easy to get this wrong too, since the kernel is very very happy to reduce anything that looks like a natural number expression (hence the opaque wrapper trick in Mario's PR), so you think you've precomputed something but the kernel will recompute it just to be sure...

Scott Morrison (Jul 06 2023 at 11:40):

(also, my branch is missing a few proofs about monus that I couldn't bear to finish. :-)

Kyle Miller (Jul 06 2023 at 14:41):

@Scott Morrison With #5740 and using norm_num to precompute 2 ^ p in the non-trickMod version, it'd able to go further without needing to change the kernel. For mersenne 86243,

norm_num took 38s
type checking took 62.5s

It's a bit of a shame that norm_num has to pre-compute the sMod function to make sure it has the right value before handing it off to the kernel, just for it to do the calculation again from scratch.

Kyle Miller (Jul 06 2023 at 14:44):

This version uses the purely tail-recursive calculation inside the norm_num extension (which is evaluated, so it's fine), and then uses sMod'' for the kernel reduction. It doesn't actually seem to save any time versus using sMod'' directly though.


Last updated: Dec 20 2023 at 11:08 UTC