Zulip Chat Archive

Stream: general

Topic: calculating fibonacci numbers


view this post on Zulip Johan Commelin (Apr 18 2020 at 12:05):

What is the correct way to calculate fibonacci numbers in Lean?

import data.nat.fib

open nat

example : fib 12 = 144 := by norm_num -- doesn't work

view this post on Zulip Johan Commelin (Apr 18 2020 at 12:11):

#eval fib 12

is instantaneous, but rfl doesn't work as proof of the above

view this post on Zulip Jason Rute (Apr 18 2020 at 12:18):

This is the archetypical example where the standard recursive definition is going to do a lot of extra computation (unless the lean compiler does something fancy). My guess is that if you need a fast version of fib, you can write your own (it’s pretty easy with some googling) and prove it is equal to fib. (But I’m not an expert.)

view this post on Zulip Kevin Buzzard (Apr 18 2020 at 12:23):

Is this the bad way? https://leanprover.github.io/theorem_proving_in_lean/induction_and_recursion.html?highlight=fibonacci#structural-recursion-and-induction

view this post on Zulip Kevin Buzzard (Apr 18 2020 at 12:24):

I clicked "try it online" and rfl failed to prove fib 12 = 144, but after staring at the definition I realised that -- oops -- it did prove fib 11 = 144 :-)

view this post on Zulip Kevin Buzzard (Apr 18 2020 at 12:24):

The canonical normalisation is the one Johan is using.

view this post on Zulip Johan Commelin (Apr 18 2020 at 12:26):

Wait, I get #eval fib 11 -- 89 in my Lean

view this post on Zulip Johan Commelin (Apr 18 2020 at 12:27):

But rfl is way too slow for these facts, it doesn't memoize

view this post on Zulip Kevin Buzzard (Apr 18 2020 at 12:28):

In TPIL they define their own Fibonacci numbers and I'm observing that rfl works for them

view this post on Zulip Kevin Buzzard (Apr 18 2020 at 12:28):

(or at least it did for me with "try it online")

view this post on Zulip Johan Commelin (Apr 18 2020 at 12:29):

Hmmm, weird... I think the definition in mathlib should actually memoize enough

view this post on Zulip Jason Rute (Apr 18 2020 at 12:37):

@Kevin Buzzard wrote

Is this the bad way? https://leanprover.github.io/theorem_proving_in_lean/induction_and_recursion.html?highlight=fibonacci#structural-recursion-and-induction

Generally speaking, from a computation standpoint, yes. From a math standpoint, that is the natural definition. I'll write some code showing the difference.

view this post on Zulip Kevin Buzzard (Apr 18 2020 at 12:39):

This is CS 101 coming up, isn't it :-) This is like the difference between foldl and foldr or something -- mathematically they are the same but computationally they couldn't be more different.

view this post on Zulip Kevin Buzzard (Apr 18 2020 at 12:41):

My problem is that even though I understand the basic principle that there are lots of mathematically equivalent ways to do something and they might have vastly different computational behaviour, when I see the definition like the one in TPIL I have so little understanding of what is happening under the hood that I am not capable of working out whether the definition is computationally good or not.

view this post on Zulip Kevin Buzzard (Apr 18 2020 at 12:42):

For all I know the equation compiler is cunningly storing all the auxiliary fib a computations on the way.

view this post on Zulip Jason Rute (Apr 18 2020 at 12:50):

I might be answering the wrong question. Johan isn't asking how does one compute with fib, but how does one prove fib 12 = 144. This works for me:

def fib:   
| 0 := 0
| 1 := 1
| (n + 2) := (fib n) + (fib (n + 1))

example : fib 12 = 144 := rfl

view this post on Zulip Johan Commelin (Apr 18 2020 at 12:51):

@Jason Rute But that one is slow, right?
Also, to be precise, I want to work with mathlib's fib...

view this post on Zulip Anne Baanen (Apr 18 2020 at 12:51):

According to TPiL "[The implementation of structural recursion] provides an efficient implementation of fib, avoiding the exponential blowup that would arise from evaluating each recursive call independently.", so it should not be slow.

view this post on Zulip Johan Commelin (Apr 18 2020 at 12:51):

What? Your fib is way faster than the one in mathlib... I'm really confused now

view this post on Zulip Shing Tak Lam (Apr 18 2020 at 12:52):

This works (I don't know why).

import data.nat.fib

open nat

example : fib 12 = 144 := by refl

view this post on Zulip Jason Rute (Apr 18 2020 at 12:53):

I'm more confused now too. I just looked at the mathlib implementation. It should be similar to this fast_fib which is much faster than the naive one:

def fast_fib_aux:    × 
| 0 := (0, 1)
| 1 := (1, 1)
| (n + 2) := let (a, b) := fast_fib_aux (n + 1) in (b, a + b)

def fast_fib (n: ):  := (fast_fib_aux n).1

example : (fast_fib 12) = 144 := rfl

view this post on Zulip Johan Commelin (Apr 18 2020 at 12:54):

#eval fib 30 -- 832040 ... fast!

example : fib 30 = 832040 := by refl -- slow

view this post on Zulip Jason Rute (Apr 18 2020 at 12:55):

I thought I remember seeing that #eval takes shortcuts with interger computation that rfl can't take.

view this post on Zulip Johan Commelin (Apr 18 2020 at 12:57):

Sure, that might be the case. I also don't care whether the proof is by rfl. I just want some way to prove this, that doesn't involve manually apply fib_succ_succ and computing all the 30 fibonacci numbers before it.

view this post on Zulip Jason Rute (Apr 18 2020 at 12:58):

Does the by refl trick work for you? (Why does by refl work and not rfl?)

view this post on Zulip Johan Commelin (Apr 18 2020 at 12:59):

It works for fib 12, but not for fib 30

view this post on Zulip Jason Rute (Apr 18 2020 at 13:01):

You can use my fast_fib above (it works for 30) and prove it is equivalent to fib.) Maybe that needs to be the definition in mathlib.

view this post on Zulip Jason Rute (Apr 18 2020 at 13:02):

Sorry, I thought it was working for 30, but it doesn't. My bad.

view this post on Zulip Reid Barton (Apr 18 2020 at 13:18):

The equation compiler uses a version of memoization automatically.

fib 30

Even adding two numbers of around this size by rfl is too slow, so this can't work.

view this post on Zulip Kevin Buzzard (Apr 18 2020 at 13:18):

Johan Commelin said:

Sure, that might be the case. I also don't care whether the proof is by rfl. I just want some way to prove this, that doesn't involve manually apply fib_succ_succ and computing all the 30 fibonacci numbers before it.

I thought the problem was that with some naive implementations you compute all the 30 fibonacci numbers before it many many times.

view this post on Zulip Kevin Buzzard (Apr 18 2020 at 13:19):

There's a way of computing fib n involving induction on binary expansion of n, because F2nF_{2n} and F2n+1F_{2n+1} are closely related to FnF_n by Binet.

view this post on Zulip Kevin Buzzard (Apr 18 2020 at 13:19):

That would be super-efficient, but I'm assuming that this isn't the question. Or is it?

view this post on Zulip Johan Commelin (Apr 18 2020 at 13:19):

So we turn that into simp-lemmas?

view this post on Zulip Kenny Lau (Apr 18 2020 at 13:26):

Kevin Buzzard said:

That would be super-efficient, but I'm assuming that this isn't the question. Or is it?

until you realize that dividing by 2 isn't O(1), but rather O(n)

view this post on Zulip Jason Rute (Apr 18 2020 at 13:29):

To be clear, I was originally mistaken on the definition used in mathlib. It is (what should be on paper) a fast definition. My first comment led us down the wrong track. As @Reid Barton said, apparently rfl isn't even good at adding large numbers (which is what exactly what any computation of fibonacci does). So I think this is still an open question how to prover stuff like this if rfl (and refl) are unusable.

view this post on Zulip Johan Commelin (Apr 18 2020 at 13:30):

But Kevin is talking about

F2n=i<nF2i+1,F2n+1=(i<nF2(i+1))+1F_{2n} = \sum_{i < n} F_{2i+1}, \qquad F_{2n+1} = \left(\sum_{i < n} F_{2(i+1)} \right) + 1

Am I stupid, or is there no division by two there?

view this post on Zulip Kevin Buzzard (Apr 18 2020 at 13:34):

I'm talking about F2n=F_{2n}= something like Fn2F_n^2, the same trick that one uses when computing aba^b if bb is large, by repeated squaring.

view this post on Zulip Johan Commelin (Apr 18 2020 at 13:36):

image.png ??

view this post on Zulip Johan Commelin (Apr 18 2020 at 13:36):

@Kevin Buzzard That one :up: ?

view this post on Zulip Kevin Buzzard (Apr 18 2020 at 13:39):

Right.

view this post on Zulip Kevin Buzzard (Apr 18 2020 at 13:39):

If n>>0n>>0 and one can do the arithmetic efficiently then this is a far quicker way of computing Fibonacci numbers.

view this post on Zulip Kevin Buzzard (Apr 18 2020 at 13:40):

But I suspect that this has nothing to do with your original question.

view this post on Zulip Johan Commelin (Apr 18 2020 at 13:40):

Well, I want to show e.g. that fib 34 is > 4 million

view this post on Zulip Kevin Buzzard (Apr 18 2020 at 13:43):

Reid Barton said:

Even adding two numbers of around this size by rfl is too slow, so this can't work.

@Reid Barton
I once found a counterexample to this in front of a room full of schoolchildren and it took me a while to realise what the heck was going on:

example : 1000000000000 + 1000000000000 = 2000000000000 := rfl -- works fine

view this post on Zulip Kevin Buzzard (Apr 18 2020 at 13:43):

I had just coherently argued that this would fail miserably.

view this post on Zulip Jason Rute (Apr 18 2020 at 14:17):

So I think the prerequisite question is how does one prove this in Lean?

example : 317811 + 514229 = 832040

view this post on Zulip Kevin Buzzard (Apr 18 2020 at 14:20):

by norm_num

view this post on Zulip Reid Barton (Apr 18 2020 at 14:21):

This one is by norm_num. The question is really how to make norm_num aware of fib, or fib aware of norm_num, or something which can do both definitional unfolding and norm_num where appropriate.

view this post on Zulip Kenny Lau (Apr 18 2020 at 14:29):

maybe if norm_num accepts tags then we can tag fib_bit0 and fib_bit1 as norm_num

view this post on Zulip Kenny Lau (Apr 18 2020 at 14:29):

this still wouldn't make it memoize though

view this post on Zulip Kenny Lau (Apr 18 2020 at 14:30):

Johan Commelin said:

image.png ??

there must be a formula without minus

view this post on Zulip Johan Commelin (Apr 18 2020 at 14:31):

Well, F n * L n doesn't contain a minus... maybe there are fast formulas for L n as well??

view this post on Zulip Kenny Lau (Apr 18 2020 at 14:48):

[F2n+1F2nF2nF2n1]=[Fn+1FnFnFn1]2\begin{bmatrix} F_{2n+1} & F_{2n} \\ F_{2n} & F_{2n-1} \end{bmatrix} = \begin{bmatrix} F_{n+1} & F_{n} \\ F_{n} & F_{n-1} \end{bmatrix}^2

20 = 2 * 10 = 2 * (2 * 5) = 2 * (2 * (2 * 2 + 1))
(F(0), F(1)) = (0, 1)
(F(1), F(2)) = (0*0+1*1, 1*(2*0+1)) = (1, 1)
(F(3), F(4)) = (1*1+1*1, 1*(2*1+1)) = (2, 3)
(F(4), F(5)) = (3, 2+3) = (3, 5)
(F(9), F(10)) = (3*3+5*5, 5*(2*3+5)) = (34, 55)
(F(19), F(20)) = (34*34+55*55, 55*(2*34+55)) = (4181, 6765)

proposed algorithm:

fib 20
= fib (bit0 10)
= fib (bit0 $ bit0 5)
= fib (bit0 $ bit0 $ bit1 2)
= fib (bit0 $ bit0 $ bit1 $ bit0 1)
 (fib_aux $ bit0 $ bit0 $ bit1 $ bit0 1).2
 (f0 $ fib_aux $ bit0 $ bit1 $ bit0 1).2
 (f0 $ f0 $ fib_aux $ bit1 $ bit0 1).2
 (f0 $ f0 $ fs $ f0 $ fib_aux $ bit0 1).2
 (f0 $ f0 $ fs $ f0 $ f0 $ fib_aux 1).2
 (f0 $ f0 $ fs $ f0 $ f0 (0, 1)).2
 (f0 $ f0 $ fs $ f0 (0*0+1*1, 1*(2*0+1))).2
 (f0 $ f0 $ fs $ f0 (1, 1)).2
 (f0 $ f0 $ fs (1*1+1*1, 1*(2*1+1))).2
 (f0 $ f0 $ fs (2, 3)).2
 (f0 $ f0 (3, 2+3)).2
 (f0 $ f0 (3, 5)).2
 (f0 (3*3+5*5, 5*(2*3+5))).2
 (f0 (34, 55)).2
 (34*34+55*55, 55*(2*34+55)).2
 (4181, 6765).2
 6765

view this post on Zulip Kevin Buzzard (Apr 18 2020 at 14:51):

That is much better than the foldl/foldr issue

view this post on Zulip Reid Barton (Apr 18 2020 at 14:51):

foldl/foldr?

view this post on Zulip Kenny Lau (Apr 18 2020 at 14:52):

def f0(s):
    return ("({0}*{0}+{1}*{1})".format(*s), "{1}*(2*{0}+{1})".format(*s))
def fs(s):
    return (s[1],"({0}+{1})".format(*s))

f20 = f0(f0(fs(f0(f0((0,1))))))[1]
print(f20)
print(eval(f20))

output:

(((0*0+1*1)*(0*0+1*1)+1*(2*0+1)*1*(2*0+1))+1*(2*0+1)*(2*(0*0+1*1)+1*(2*0+1)))*(2*1*(2*0+1)*(2*(0*0+1*1)+1*(2*0+1))+(((0*0+1*1)*(0*0+1*1)+1*(2*0+1)*1*(2*0+1))+1*(2*0+1)*(2*(0*0+1*1)+1*(2*0+1))))*(2*(1*(2*0+1)*(2*(0*0+1*1)+1*(2*0+1))*1*(2*0+1)*(2*(0*0+1*1)+1*(2*0+1))+(((0*0+1*1)*(0*0+1*1)+1*(2*0+1)*1*(2*0+1))+1*(2*0+1)*(2*(0*0+1*1)+1*(2*0+1)))*(((0*0+1*1)*(0*0+1*1)+1*(2*0+1)*1*(2*0+1))+1*(2*0+1)*(2*(0*0+1*1)+1*(2*0+1))))+(((0*0+1*1)*(0*0+1*1)+1*(2*0+1)*1*(2*0+1))+1*(2*0+1)*(2*(0*0+1*1)+1*(2*0+1)))*(2*1*(2*0+1)*(2*(0*0+1*1)+1*(2*0+1))+(((0*0+1*1)*(0*0+1*1)+1*(2*0+1)*1*(2*0+1))+1*(2*0+1)*(2*(0*0+1*1)+1*(2*0+1)))))

6765

view this post on Zulip Kevin Buzzard (Apr 18 2020 at 14:52):

I just mean asking a functional language to compute Fibonacci numbers might naively involve computing FndF_{n-d} about dd times. It's not foldl/foldr but it's the same sort of thing.

view this post on Zulip Kenny Lau (Apr 18 2020 at 14:53):

I don't know whether it will cause problem if the intermediate expressions are not immediately evaluated

view this post on Zulip Kenny Lau (Apr 18 2020 at 14:53):

it might undo our optimization by having exponential complexity (on top of logarithmic complexity)

view this post on Zulip Kenny Lau (Apr 18 2020 at 14:53):

then the overall complexity would just be O(n)

view this post on Zulip Reid Barton (Apr 18 2020 at 14:54):

None of this is really relevant if you can't compute these additions in subexponential time

view this post on Zulip Kevin Buzzard (Apr 18 2020 at 14:54):

Yes, you are evaluating 0*0+1*1 a lot of times there

view this post on Zulip Kenny Lau (Apr 18 2020 at 14:54):

Reid Barton said:

None of this is really relevant if you can't compute these additions in subexponential time

norm_num can

view this post on Zulip Reid Barton (Apr 18 2020 at 14:54):

But you can't solve the original problem using norm_num

view this post on Zulip Reid Barton (Apr 18 2020 at 14:55):

You are solving the second problem before solving the first problem.

view this post on Zulip Kenny Lau (Apr 18 2020 at 14:55):

if you're referring to writing a tactic that will calculate fib 20, then my algorithm can call norm_num in each intermediate step

view this post on Zulip Kenny Lau (Apr 18 2020 at 14:55):

what's the first problem?

view this post on Zulip Reid Barton (Apr 18 2020 at 14:56):

I don't want to write a new tactic to compute every function I write.

view this post on Zulip Kenny Lau (Apr 18 2020 at 14:56):

so I also proposed having a norm_num tag that you can tag my lemmas with, which include fib n = (fib_aux n).2 and fib_aux (bit0 n) = f0 (fib_aux n) and fib_aux(bit1 n) = fs (f0 (fib_aux n)), and my algorithm demonstrated how you can use these three lemmas to calculate fib 20 in logarithmic time

view this post on Zulip Kevin Buzzard (Apr 18 2020 at 14:57):

One could imagine a general "multiply these two matrices together using norm_num and also other tricks" function.

view this post on Zulip Kenny Lau (Apr 18 2020 at 14:57):

(removed)

view this post on Zulip Kevin Buzzard (Apr 18 2020 at 14:57):

Such a function will I think be hard coded into any decent computer algebra system

view this post on Zulip Kenny Lau (Apr 18 2020 at 14:57):

(removed)

view this post on Zulip Alex J. Best (Apr 18 2020 at 17:05):

@Mario Carneiro has mentioned that we need Coq's cbv tactic a few times now for similar questions I believe, but I don't know enough about coq to know what needs doing here (can we get a formal roadmap for a tactic)!

view this post on Zulip Mario Carneiro (Apr 18 2020 at 20:13):

As Reid suggests, the "right" answer here is to extend norm_num, and I've been meaning to use def_replacer for this purpose so that norm_num doesn't have to import every number theoretic function it will ever use

view this post on Zulip Mario Carneiro (Apr 18 2020 at 20:15):

There is nothing wrong with using - inside norm_num, because norm_num doesn't prove a - b = c, it proves b + c = a where a and b are the inputs and c is the output

view this post on Zulip Mario Carneiro (Apr 18 2020 at 20:20):

Extending norm_num is a bit more complicated than just marking fib_bit0 and fib_bit1, because it's not a simp set, it requires some carefully stated custom theorems and goes by recursion, which is much faster than simp

view this post on Zulip Mario Carneiro (Apr 18 2020 at 20:22):

But it also means that it is not difficult to get it to evaluate subterms as required to get the O(n log n) algorithm

view this post on Zulip Mario Carneiro (Apr 18 2020 at 20:24):

Johan Commelin said:

Well, I want to show e.g. that fib 34 is > 4 million

If that's all you want, you don't need to evaluate fib at all, unless that bound is particularly close. A quick estimation method would say that fib n is close to phi^n / sqrt 5 or something like that, and you can easily get proven bounds on phi using continued fractions

view this post on Zulip Kevin Buzzard (Apr 18 2020 at 20:36):

import data.nat.fib

open nat

example : fib 12 = 144 := begin
  rw fib_succ_succ,
  rw @fib_succ_succ 9,
  rw @fib_succ_succ 8,
  rw @fib_succ_succ 7,
  rw @fib_succ_succ 6,
  rw @fib_succ_succ 5,
  rw @fib_succ_succ 4,
  rw @fib_succ_succ 3,
  rw @fib_succ_succ 2,
  rw @fib_succ_succ 1,
  rw @fib_succ_succ 0,
  refl,
end

view this post on Zulip Kevin Buzzard (Apr 18 2020 at 20:37):

example : fib 12 = 144 := begin
  -- start the countdown
  rw @fib_succ_succ 10,
  rw @fib_succ_succ 9,
  rw @fib_succ_succ 8,
  rw @fib_succ_succ 7,
  rw @fib_succ_succ 6,
  rw @fib_succ_succ 5,
  rw @fib_succ_succ 4,
  rw @fib_succ_succ 3,
  rw @fib_succ_succ 2,
  rw @fib_succ_succ 1,
  rw @fib_succ_succ 0,
  rw @fib_one,
  rw @fib_zero,
end

Last updated: May 11 2021 at 13:22 UTC