## Stream: general

### Topic: calculating fibonacci numbers

#### Johan Commelin (Apr 18 2020 at 12:05):

What is the correct way to calculate fibonacci numbers in Lean?

```lean
import data.nat.fib

open nat

example : fib 12 = 144 := by norm_num -- doesn't work
```


#### Johan Commelin (Apr 18 2020 at 12:11):

```lean
#eval fib 12
```


is instantaneous, but rfl doesn't work as proof of the above

#### Jason Rute (Apr 18 2020 at 12:18):

This is the archetypical example where the standard recursive definition is going to do a lot of extra computation (unless the lean compiler does something fancy). My guess is that if you need a fast version of fib, you can write your own (it’s pretty easy with some googling) and prove it is equal to fib. (But I’m not an expert.)
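
As an illustration of this point (a Python sketch, not from the thread; the names fib_naive and fib_memo are made up), the naive recursion repeats work exponentially, while caching intermediate values makes it linear:

```python
import functools

def fib_naive(n):
    # recomputes the same subproblems exponentially many times
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)

@functools.lru_cache(maxsize=None)
def fib_memo(n):
    # each fib_memo(k) is computed once and then cached
    return n if n < 2 else fib_memo(n - 1) + fib_memo(n - 2)

print(fib_naive(12))  # 144
print(fib_memo(30))   # 832040
```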

#### Kevin Buzzard (Apr 18 2020 at 12:24):

I clicked "try it online" and rfl failed to prove fib 12 = 144, but after staring at the definition I realised that -- oops -- it did prove fib 11 = 144 :-)

#### Kevin Buzzard (Apr 18 2020 at 12:24):

The canonical normalisation is the one Johan is using.

#### Johan Commelin (Apr 18 2020 at 12:26):

Wait, I get #eval fib 11 -- 89 in my Lean

#### Johan Commelin (Apr 18 2020 at 12:27):

But rfl is way too slow for these facts, it doesn't memoize

#### Kevin Buzzard (Apr 18 2020 at 12:28):

In TPIL they define their own Fibonacci numbers and I'm observing that rfl works for them

#### Kevin Buzzard (Apr 18 2020 at 12:28):

(or at least it did for me with "try it online")

#### Johan Commelin (Apr 18 2020 at 12:29):

Hmmm, weird... I think the definition in mathlib should actually memoize enough

#### Jason Rute (Apr 18 2020 at 12:37):

@Kevin Buzzard wrote:

Generally speaking, from a computation standpoint, yes. From a math standpoint, that is the natural definition. I'll write some code showing the difference.

#### Kevin Buzzard (Apr 18 2020 at 12:39):

This is CS 101 coming up, isn't it :-) This is like the difference between foldl and foldr or something -- mathematically they are the same but computationally they couldn't be more different.

#### Kevin Buzzard (Apr 18 2020 at 12:41):

My problem is that even though I understand the basic principle that there are lots of mathematically equivalent ways to do something and they might have vastly different computational behaviour, when I see the definition like the one in TPIL I have so little understanding of what is happening under the hood that I am not capable of working out whether the definition is computationally good or not.

#### Kevin Buzzard (Apr 18 2020 at 12:42):

For all I know the equation compiler is cunningly storing all the auxiliary fib a computations on the way.

#### Jason Rute (Apr 18 2020 at 12:50):

I might be answering the wrong question. Johan isn't asking how to compute with fib, but how to prove fib 12 = 144. This works for me:

```lean
def fib : ℕ → ℕ
| 0 := 0
| 1 := 1
| (n + 2) := (fib n) + (fib (n + 1))

example : fib 12 = 144 := rfl
```


#### Johan Commelin (Apr 18 2020 at 12:51):

@Jason Rute But that one is slow, right?
Also, to be precise, I want to work with mathlib's fib...

#### Anne Baanen (Apr 18 2020 at 12:51):

According to TPiL "[The implementation of structural recursion] provides an efficient implementation of fib, avoiding the exponential blowup that would arise from evaluating each recursive call independently.", so it should not be slow.

#### Johan Commelin (Apr 18 2020 at 12:51):

What? Your fib is way faster than the one in mathlib... I'm really confused now

#### Shing Tak Lam (Apr 18 2020 at 12:52):

This works (I don't know why).

```lean
import data.nat.fib

open nat

example : fib 12 = 144 := by refl
```


#### Jason Rute (Apr 18 2020 at 12:53):

I'm more confused now too. I just looked at the mathlib implementation. It should be similar to this fast_fib which is much faster than the naive one:

```lean
def fast_fib_aux : ℕ → ℕ × ℕ
| 0 := (0, 1)
| 1 := (1, 1)
| (n + 2) := let (a, b) := fast_fib_aux (n + 1) in (b, a + b)

def fast_fib (n : ℕ) : ℕ := (fast_fib_aux n).1

example : (fast_fib 12) = 144 := rfl
```


#### Johan Commelin (Apr 18 2020 at 12:54):

```lean
#eval fib 30 -- 832040 ... fast!

example : fib 30 = 832040 := by refl -- slow
```


#### Jason Rute (Apr 18 2020 at 12:55):

I remember seeing that #eval takes shortcuts with integer computation that rfl can't take.

#### Johan Commelin (Apr 18 2020 at 12:57):

Sure, that might be the case. I also don't care whether the proof is by rfl. I just want some way to prove this that doesn't involve manually applying fib_succ_succ and computing all 30 Fibonacci numbers before it.

#### Jason Rute (Apr 18 2020 at 12:58):

Does the by refl trick work for you? (Why does by refl work and not rfl?)

#### Johan Commelin (Apr 18 2020 at 12:59):

It works for fib 12, but not for fib 30

#### Jason Rute (Apr 18 2020 at 13:01):

You can use my fast_fib above (it works for 30) and prove it is equivalent to fib. Maybe that needs to be the definition in mathlib.

#### Jason Rute (Apr 18 2020 at 13:02):

Sorry, I thought it was working for 30, but it doesn't. My bad.

#### Reid Barton (Apr 18 2020 at 13:18):

The equation compiler uses a version of memoization automatically.

But for fib 30, even adding two numbers of around this size by rfl is too slow, so this can't work.

#### Kevin Buzzard (Apr 18 2020 at 13:18):

Johan Commelin said:

Sure, that might be the case. I also don't care whether the proof is by rfl. I just want some way to prove this, that doesn't involve manually apply fib_succ_succ and computing all the 30 fibonacci numbers before it.

I thought the problem was that with some naive implementations you compute all 30 Fibonacci numbers before it many, many times.

#### Kevin Buzzard (Apr 18 2020 at 13:19):

There's a way of computing fib n by induction on the binary expansion of n, because $F_{2n}$ and $F_{2n+1}$ are closely related to $F_n$ by Binet.

#### Kevin Buzzard (Apr 18 2020 at 13:19):

That would be super-efficient, but I'm assuming that this isn't the question. Or is it?

#### Johan Commelin (Apr 18 2020 at 13:19):

So we turn that into simp-lemmas?

#### Kenny Lau (Apr 18 2020 at 13:26):

Kevin Buzzard said:

That would be super-efficient, but I'm assuming that this isn't the question. Or is it?

until you realize that dividing by 2 isn't O(1), but rather O(n)

#### Jason Rute (Apr 18 2020 at 13:29):

To be clear, I was originally mistaken about the definition used in mathlib. It is (on paper, at least) a fast definition. My first comment led us down the wrong track. As @Reid Barton said, apparently rfl isn't even good at adding large numbers (which is exactly what any computation of Fibonacci does). So I think it is still an open question how to prove facts like this if rfl (and refl) are unusable.

#### Johan Commelin (Apr 18 2020 at 13:30):

$F_{2n} = \sum_{i < n} F_{2i+1}, \qquad F_{2n+1} = \left(\sum_{i < n} F_{2(i+1)} \right) + 1$

Am I stupid, or is there no division by two there?
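
Those two identities are easy to sanity-check numerically; a quick Python sketch (the helper fibs is made up for the illustration):

```python
def fibs(n):
    # the Fibonacci numbers F(0), ..., F(n) with F(0) = 0, F(1) = 1
    fs = [0, 1]
    while len(fs) <= n:
        fs.append(fs[-1] + fs[-2])
    return fs

F = fibs(45)
for n in range(1, 20):
    # F(2n) = sum of the odd-indexed Fibonacci numbers below it
    assert F[2*n] == sum(F[2*i + 1] for i in range(n))
    # F(2n+1) = sum of the even-indexed ones, plus 1
    assert F[2*n + 1] == sum(F[2*(i + 1)] for i in range(n)) + 1
print("both identities hold for n < 20")
```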

#### Kevin Buzzard (Apr 18 2020 at 13:34):

I'm talking about $F_{2n}=$ something like $F_n^2$, the same trick that one uses when computing $a^b$ if $b$ is large, by repeated squaring.
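
The repeated-squaring idea can be sketched with the matrix form of the recurrence, $\begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}^n = \begin{bmatrix} F_{n+1} & F_n \\ F_n & F_{n-1} \end{bmatrix}$; a hypothetical Python illustration (not what mathlib does):

```python
def mat_mul(A, B):
    # 2x2 integer matrix product
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a*e + b*g, a*f + b*h), (c*e + d*g, c*f + d*h))

def mat_pow(M, k):
    # repeated squaring: O(log k) matrix multiplications
    R = ((1, 0), (0, 1))
    while k:
        if k & 1:
            R = mat_mul(R, M)
        M = mat_mul(M, M)
        k >>= 1
    return R

def fib(n):
    # [[1,1],[1,0]]^n = [[F(n+1), F(n)], [F(n), F(n-1)]]
    return mat_pow(((1, 1), (1, 0)), n)[0][1]

print(fib(12), fib(30))  # 144 832040
```

This takes O(log n) matrix multiplications instead of n additions, which is the win being described when n is large.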

#### Johan Commelin (Apr 18 2020 at 13:36):

@Kevin Buzzard That one :up: ?

Right.

#### Kevin Buzzard (Apr 18 2020 at 13:39):

If $n \gg 0$ and one can do the arithmetic efficiently, then this is a far quicker way of computing Fibonacci numbers.

#### Kevin Buzzard (Apr 18 2020 at 13:40):

But I suspect that this has nothing to do with your original question.

#### Johan Commelin (Apr 18 2020 at 13:40):

Well, I want to show e.g. that fib 34 is > 4 million

#### Kevin Buzzard (Apr 18 2020 at 13:43):

Reid Barton said:

Even adding two numbers of around this size by rfl is too slow, so this can't work.

@Reid Barton
I once found a counterexample to this in front of a room full of schoolchildren and it took me a while to realise what the heck was going on:

```lean
example : 1000000000000 + 1000000000000 = 2000000000000 := rfl -- works fine
```


#### Kevin Buzzard (Apr 18 2020 at 13:43):

I had just coherently argued that this would fail miserably.

#### Jason Rute (Apr 18 2020 at 14:17):

So I think the prerequisite question is: how does one prove this in Lean?

```lean
example : 317811 + 514229 = 832040
```


#### Kevin Buzzard (Apr 18 2020 at 14:20):

by norm_num

#### Reid Barton (Apr 18 2020 at 14:21):

This one is by norm_num. The question is really how to make norm_num aware of fib, or fib aware of norm_num, or something which can do both definitional unfolding and norm_num where appropriate.

#### Kenny Lau (Apr 18 2020 at 14:29):

maybe if norm_num accepts tags then we can tag fib_bit0 and fib_bit1 as norm_num

#### Kenny Lau (Apr 18 2020 at 14:29):

this still wouldn't make it memoize though

#### Kenny Lau (Apr 18 2020 at 14:30):

Johan Commelin said:

there must be a formula without minus

#### Johan Commelin (Apr 18 2020 at 14:31):

Well, F n * L n doesn't contain a minus... maybe there are fast formulas for L n as well??

#### Kenny Lau (Apr 18 2020 at 14:48):

$\begin{bmatrix} F_{2n+1} & F_{2n} \\ F_{2n} & F_{2n-1} \end{bmatrix} = \begin{bmatrix} F_{n+1} & F_{n} \\ F_{n} & F_{n-1} \end{bmatrix}^2$

```
20 = 2 * 10 = 2 * (2 * 5) = 2 * (2 * (2 * 2 + 1))
(F(0), F(1)) = (0, 1)
(F(1), F(2)) = (0*0+1*1, 1*(2*0+1)) = (1, 1)
(F(3), F(4)) = (1*1+1*1, 1*(2*1+1)) = (2, 3)
(F(4), F(5)) = (3, 2+3) = (3, 5)
(F(9), F(10)) = (3*3+5*5, 5*(2*3+5)) = (34, 55)
(F(19), F(20)) = (34*34+55*55, 55*(2*34+55)) = (4181, 6765)
```


proposed algorithm:

```
fib 20
= fib (bit0 10)
= fib (bit0 $ bit0 5)
= fib (bit0 $ bit0 $ bit1 2)
= fib (bit0 $ bit0 $ bit1 $ bit0 1)
→ (fib_aux $ bit0 $ bit0 $ bit1 $ bit0 1).2
→ (f0 $ fib_aux $ bit0 $ bit1 $ bit0 1).2
→ (f0 $ f0 $ fib_aux $ bit1 $ bit0 1).2
→ (f0 $ f0 $ fs $ f0 $ fib_aux $ bit0 1).2
→ (f0 $ f0 $ fs $ f0 $ f0 $ fib_aux 1).2
→ (f0 $ f0 $ fs $ f0 $ f0 (0, 1)).2
→ (f0 $ f0 $ fs $ f0 (0*0+1*1, 1*(2*0+1))).2
→ (f0 $ f0 $ fs $ f0 (1, 1)).2
→ (f0 $ f0 $ fs (1*1+1*1, 1*(2*1+1))).2
→ (f0 $ f0 $ fs (2, 3)).2
→ (f0 $ f0 (3, 2+3)).2
→ (f0 $ f0 (3, 5)).2
→ (f0 (3*3+5*5, 5*(2*3+5))).2
→ (f0 (34, 55)).2
→ (34*34+55*55, 55*(2*34+55)).2
→ (4181, 6765).2
→ 6765
```


#### Kevin Buzzard (Apr 18 2020 at 14:51):

That is much better than the foldl/foldr issue

foldl/foldr?

#### Kenny Lau (Apr 18 2020 at 14:52):

```python
def f0(s):
    return ("({0}*{0}+{1}*{1})".format(*s), "{1}*(2*{0}+{1})".format(*s))
def fs(s):
    return (s[1], "({0}+{1})".format(*s))

f20 = f0(f0(fs(f0(f0((0,1))))))[1]
print(f20)
print(eval(f20))
```


output:

```
(((0*0+1*1)*(0*0+1*1)+1*(2*0+1)*1*(2*0+1))+1*(2*0+1)*(2*(0*0+1*1)+1*(2*0+1)))*(2*1*(2*0+1)*(2*(0*0+1*1)+1*(2*0+1))+(((0*0+1*1)*(0*0+1*1)+1*(2*0+1)*1*(2*0+1))+1*(2*0+1)*(2*(0*0+1*1)+1*(2*0+1))))*(2*(1*(2*0+1)*(2*(0*0+1*1)+1*(2*0+1))*1*(2*0+1)*(2*(0*0+1*1)+1*(2*0+1))+(((0*0+1*1)*(0*0+1*1)+1*(2*0+1)*1*(2*0+1))+1*(2*0+1)*(2*(0*0+1*1)+1*(2*0+1)))*(((0*0+1*1)*(0*0+1*1)+1*(2*0+1)*1*(2*0+1))+1*(2*0+1)*(2*(0*0+1*1)+1*(2*0+1))))+(((0*0+1*1)*(0*0+1*1)+1*(2*0+1)*1*(2*0+1))+1*(2*0+1)*(2*(0*0+1*1)+1*(2*0+1)))*(2*1*(2*0+1)*(2*(0*0+1*1)+1*(2*0+1))+(((0*0+1*1)*(0*0+1*1)+1*(2*0+1)*1*(2*0+1))+1*(2*0+1)*(2*(0*0+1*1)+1*(2*0+1)))))

6765
```


#### Kevin Buzzard (Apr 18 2020 at 14:52):

I just mean that asking a functional language to compute Fibonacci numbers might naively involve computing $F_{n-d}$ roughly $F_d$ times (exponentially many). It's not foldl/foldr but it's the same sort of thing.
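
The amount of recomputation can be measured directly; a small Python sketch (the Counter instrumentation is purely illustrative). The repeat count itself grows like a Fibonacci number: fib (n − d) is evaluated $F_{d+1}$ times inside a naive fib n.

```python
from collections import Counter

calls = Counter()

def fib_naive(n):
    calls[n] += 1  # record every invocation at this argument
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)

fib_naive(20)
print(calls[10])  # 89: fib_naive(10) was evaluated F(11) times
```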

#### Kenny Lau (Apr 18 2020 at 14:53):

I don't know whether it will cause problems if the intermediate expressions are not immediately evaluated

#### Kenny Lau (Apr 18 2020 at 14:53):

it might undo our optimization by having exponential complexity (on top of logarithmic complexity)

#### Kenny Lau (Apr 18 2020 at 14:53):

then the overall complexity would just be O(n)

#### Reid Barton (Apr 18 2020 at 14:54):

None of this is really relevant if you can't compute these additions in subexponential time

#### Kevin Buzzard (Apr 18 2020 at 14:54):

Yes, you are evaluating 0*0+1*1 a lot of times there

#### Kenny Lau (Apr 18 2020 at 14:54):

Reid Barton said:

None of this is really relevant if you can't compute these additions in subexponential time

norm_num can

#### Reid Barton (Apr 18 2020 at 14:54):

But you can't solve the original problem using norm_num

#### Reid Barton (Apr 18 2020 at 14:55):

You are solving the second problem before solving the first problem.

#### Kenny Lau (Apr 18 2020 at 14:55):

if you're referring to writing a tactic that will calculate fib 20, then my algorithm can call norm_num in each intermediate step

#### Kenny Lau (Apr 18 2020 at 14:55):

what's the first problem?

#### Reid Barton (Apr 18 2020 at 14:56):

I don't want to write a new tactic to compute every function I write.

#### Kenny Lau (Apr 18 2020 at 14:56):

so I also proposed having a norm_num tag that you can tag my lemmas with, which include fib n = (fib_aux n).2, fib_aux (bit0 n) = f0 (fib_aux n), and fib_aux (bit1 n) = fs (f0 (fib_aux n)); my algorithm demonstrated how you can use these three lemmas to calculate fib 20 in logarithmic time
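
For illustration, those three rules can be run directly as a recursion on the binary digits; a Python sketch under the convention that fib_aux n is the pair (F(n−1), F(n)) (the names mirror Kenny's, but this is only a model of the proposed computation, not Lean code):

```python
def f0(p):
    # (F(n-1), F(n)) -> (F(2n-1), F(2n)): the bit0 step
    a, b = p
    return (a * a + b * b, b * (2 * a + b))

def fs(p):
    # (F(n-1), F(n)) -> (F(n), F(n+1)): shift by one, the extra bit1 step
    a, b = p
    return (b, a + b)

def fib_aux(n):
    # n >= 1; peel off binary digits: bit0 n = 2n, bit1 n = 2n+1
    if n == 1:
        return (0, 1)
    half = fib_aux(n // 2)
    return f0(half) if n % 2 == 0 else fs(f0(half))

def fib(n):
    return 0 if n == 0 else fib_aux(n)[1]

print(fib(20))  # 6765
```

Only about log₂ n applications of f0 and fs are needed, which is why fib 20 falls out in a handful of steps above.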

#### Kevin Buzzard (Apr 18 2020 at 14:57):

One could imagine a general "multiply these two matrices together using norm_num and also other tricks" function.


#### Kevin Buzzard (Apr 18 2020 at 14:57):

Such a function will I think be hard coded into any decent computer algebra system


#### Alex J. Best (Apr 18 2020 at 17:05):

@Mario Carneiro has mentioned a few times now that we need Coq's cbv tactic for similar questions, I believe, but I don't know enough about Coq to know what needs doing here (can we get a formal roadmap for a tactic?)

#### Mario Carneiro (Apr 18 2020 at 20:13):

As Reid suggests, the "right" answer here is to extend norm_num, and I've been meaning to use def_replacer for this purpose so that norm_num doesn't have to import every number theoretic function it will ever use

#### Mario Carneiro (Apr 18 2020 at 20:15):

There is nothing wrong with using - inside norm_num, because norm_num doesn't prove a - b = c, it proves b + c = a where a and b are the inputs and c is the output

#### Mario Carneiro (Apr 18 2020 at 20:20):

Extending norm_num is a bit more complicated than just marking fib_bit0 and fib_bit1, because it's not a simp set; it requires some carefully stated custom theorems and goes by recursion, which is much faster than simp

#### Mario Carneiro (Apr 18 2020 at 20:22):

But it also means that it is not difficult to get it to evaluate subterms as required to get the O(n log n) algorithm

#### Mario Carneiro (Apr 18 2020 at 20:24):

Johan Commelin said:

Well, I want to show e.g. that fib 34 is > 4 million

If that's all you want, you don't need to evaluate fib at all, unless that bound is particularly close. A quick estimation method would say that fib n is close to phi^n / sqrt 5 or something like that, and you can easily get proven bounds on phi using continued fractions
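
A quick numeric check of that estimate (Python floating point, so a heuristic rather than a proof):

```python
import math

phi = (1 + math.sqrt(5)) / 2         # golden ratio
estimate = phi ** 34 / math.sqrt(5)  # Binet: F(n) is the nearest integer to phi^n / sqrt(5)

print(round(estimate))  # 5702887 = fib 34, comfortably above 4 million
```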

#### Kevin Buzzard (Apr 18 2020 at 20:36):

```lean
import data.nat.fib

open nat

example : fib 12 = 144 := begin
  rw fib_succ_succ,
  rw @fib_succ_succ 9,
  rw @fib_succ_succ 8,
  rw @fib_succ_succ 7,
  rw @fib_succ_succ 6,
  rw @fib_succ_succ 5,
  rw @fib_succ_succ 4,
  rw @fib_succ_succ 3,
  rw @fib_succ_succ 2,
  rw @fib_succ_succ 1,
  rw @fib_succ_succ 0,
  refl,
end
```


#### Kevin Buzzard (Apr 18 2020 at 20:37):

```lean
example : fib 12 = 144 := begin
  -- start the countdown
  rw @fib_succ_succ 10,
  rw @fib_succ_succ 9,
  rw @fib_succ_succ 8,
  rw @fib_succ_succ 7,
  rw @fib_succ_succ 6,
  rw @fib_succ_succ 5,
  rw @fib_succ_succ 4,
  rw @fib_succ_succ 3,
  rw @fib_succ_succ 2,
  rw @fib_succ_succ 1,
  rw @fib_succ_succ 0,
  rw @fib_one,
  rw @fib_zero,
end
```


Last updated: May 11 2021 at 13:22 UTC