Zulip Chat Archive
Stream: general
Topic: calculating fibonacci numbers
Johan Commelin (Apr 18 2020 at 12:05):
What is the correct way to calculate fibonacci numbers in Lean?
import data.nat.fib open nat example : fib 12 = 144 := by norm_num -- doesn't work
Johan Commelin (Apr 18 2020 at 12:11):
#eval fib 12
is instantaneous, but rfl
doesn't work as proof of the above
Jason Rute (Apr 18 2020 at 12:18):
This is the archetypical example where the standard recursive definition is going to do a lot of extra computation (unless the lean compiler does something fancy). My guess is that if you need a fast version of fib
, you can write your own (it’s pretty easy with some googling) and prove it is equal to fib
. (But I’m not an expert.)
Kevin Buzzard (Apr 18 2020 at 12:23):
Is this the bad way? https://leanprover.github.io/theorem_proving_in_lean/induction_and_recursion.html?highlight=fibonacci#structural-recursion-and-induction
Kevin Buzzard (Apr 18 2020 at 12:24):
I clicked "try it online" and rfl
failed to prove fib 12 = 144
, but after staring at the definition I realised that -- oops -- it did prove fib 11 = 144
:-)
Kevin Buzzard (Apr 18 2020 at 12:24):
The canonical normalisation is the one Johan is using.
Johan Commelin (Apr 18 2020 at 12:26):
Wait, I get #eval fib 11 -- 89
in my Lean
Johan Commelin (Apr 18 2020 at 12:27):
But rfl
is way too slow for these facts, it doesn't memoize
Kevin Buzzard (Apr 18 2020 at 12:28):
In TPIL they define their own Fibonacci numbers and I'm observing that rfl works for them
Kevin Buzzard (Apr 18 2020 at 12:28):
(or at least it did for me with "try it online")
Johan Commelin (Apr 18 2020 at 12:29):
Hmmm, weird... I think the definition in mathlib should actually memoize enough
Jason Rute (Apr 18 2020 at 12:37):
@Kevin Buzzard wrote
Is this the bad way? https://leanprover.github.io/theorem_proving_in_lean/induction_and_recursion.html?highlight=fibonacci#structural-recursion-and-induction
Generally speaking, from a computation standpoint, yes. From a math standpoint, that is the natural definition. I'll write some code showing the difference.
Kevin Buzzard (Apr 18 2020 at 12:39):
This is CS 101 coming up, isn't it :-) This is like the difference between foldl and foldr or something -- mathematically they are the same but computationally they couldn't be more different.
Kevin Buzzard (Apr 18 2020 at 12:41):
My problem is that even though I understand the basic principle that there are lots of mathematically equivalent ways to do something and they might have vastly different computational behaviour, when I see the definition like the one in TPIL I have so little understanding of what is happening under the hood that I am not capable of working out whether the definition is computationally good or not.
Kevin Buzzard (Apr 18 2020 at 12:42):
For all I know the equation compiler is cunningly storing all the auxiliary fib a
computations on the way.
Jason Rute (Apr 18 2020 at 12:50):
I might be answering the wrong question. Johan isn't asking how does one compute with fib, but how does one prove fib 12 = 144. This works for me:
def fib: ℕ → ℕ | 0 := 0 | 1 := 1 | (n + 2) := (fib n) + (fib (n + 1)) example : fib 12 = 144 := rfl
Johan Commelin (Apr 18 2020 at 12:51):
@Jason Rute But that one is slow, right?
Also, to be precise, I want to work with mathlib's fib
...
Anne Baanen (Apr 18 2020 at 12:51):
According to TPiL "[The implementation of structural recursion] provides an efficient implementation of fib, avoiding the exponential blowup that would arise from evaluating each recursive call independently.", so it should not be slow.
Johan Commelin (Apr 18 2020 at 12:51):
What? Your fib
is way faster than the one in mathlib... I'm really confused now
Shing Tak Lam (Apr 18 2020 at 12:52):
This works (I don't know why).
import data.nat.fib open nat example : fib 12 = 144 := by refl
Jason Rute (Apr 18 2020 at 12:53):
I'm more confused now too. I just looked at the mathlib implementation. It should be similar to this fast_fib
which is much faster than the naive one:
def fast_fib_aux: ℕ → ℕ × ℕ | 0 := (0, 1) | 1 := (1, 1) | (n + 2) := let (a, b) := fast_fib_aux (n + 1) in (b, a + b) def fast_fib (n: ℕ): ℕ := (fast_fib_aux n).1 example : (fast_fib 12) = 144 := rfl
Johan Commelin (Apr 18 2020 at 12:54):
#eval fib 30 -- 832040 ... fast! example : fib 30 = 832040 := by refl -- slow
Jason Rute (Apr 18 2020 at 12:55):
I thought I remember seeing that #eval takes shortcuts with interger computation that rfl can't take.
Johan Commelin (Apr 18 2020 at 12:57):
Sure, that might be the case. I also don't care whether the proof is by rfl
. I just want some way to prove this, that doesn't involve manually apply fib_succ_succ
and computing all the 30 fibonacci numbers before it.
Jason Rute (Apr 18 2020 at 12:58):
Does the by refl
trick work for you? (Why does by refl
work and not rfl
?)
Johan Commelin (Apr 18 2020 at 12:59):
It works for fib 12
, but not for fib 30
Jason Rute (Apr 18 2020 at 13:01):
You can use my fast_fib above (it works for 30) and prove it is equivalent to fib.) Maybe that needs to be the definition in mathlib.
Jason Rute (Apr 18 2020 at 13:02):
Sorry, I thought it was working for 30, but it doesn't. My bad.
Reid Barton (Apr 18 2020 at 13:18):
The equation compiler uses a version of memoization automatically.
fib 30
Even adding two numbers of around this size by rfl
is too slow, so this can't work.
Kevin Buzzard (Apr 18 2020 at 13:18):
Johan Commelin said:
Sure, that might be the case. I also don't care whether the proof is by
rfl
. I just want some way to prove this, that doesn't involve manually applyfib_succ_succ
and computing all the 30 fibonacci numbers before it.
I thought the problem was that with some naive implementations you compute all the 30 fibonacci numbers before it many many times.
Kevin Buzzard (Apr 18 2020 at 13:19):
There's a way of computing fib n
involving induction on binary expansion of n, because and are closely related to by Binet.
Kevin Buzzard (Apr 18 2020 at 13:19):
That would be super-efficient, but I'm assuming that this isn't the question. Or is it?
Johan Commelin (Apr 18 2020 at 13:19):
So we turn that into simp
-lemmas?
Kenny Lau (Apr 18 2020 at 13:26):
Kevin Buzzard said:
That would be super-efficient, but I'm assuming that this isn't the question. Or is it?
until you realize that dividing by 2 isn't O(1), but rather O(n)
Jason Rute (Apr 18 2020 at 13:29):
To be clear, I was originally mistaken on the definition used in mathlib. It is (what should be on paper) a fast definition. My first comment led us down the wrong track. As @Reid Barton said, apparently rfl isn't even good at adding large numbers (which is what exactly what any computation of fibonacci does). So I think this is still an open question how to prover stuff like this if rfl (and refl) are unusable.
Johan Commelin (Apr 18 2020 at 13:30):
But Kevin is talking about
Am I stupid, or is there no division by two there?
Kevin Buzzard (Apr 18 2020 at 13:34):
I'm talking about something like , the same trick that one uses when computing if is large, by repeated squaring.
Johan Commelin (Apr 18 2020 at 13:36):
image.png ??
Johan Commelin (Apr 18 2020 at 13:36):
@Kevin Buzzard That one :up: ?
Kevin Buzzard (Apr 18 2020 at 13:39):
Right.
Kevin Buzzard (Apr 18 2020 at 13:39):
If and one can do the arithmetic efficiently then this is a far quicker way of computing Fibonacci numbers.
Kevin Buzzard (Apr 18 2020 at 13:40):
But I suspect that this has nothing to do with your original question.
Johan Commelin (Apr 18 2020 at 13:40):
Well, I want to show e.g. that fib 34
is > 4 million
Kevin Buzzard (Apr 18 2020 at 13:43):
Reid Barton said:
Even adding two numbers of around this size by
rfl
is too slow, so this can't work.
@Reid Barton
I once found a counterexample to this in front of a room full of schoolchildren and it took me a while to realise what the heck was going on:
example : 1000000000000 + 1000000000000 = 2000000000000 := rfl -- works fine
Kevin Buzzard (Apr 18 2020 at 13:43):
I had just coherently argued that this would fail miserably.
Jason Rute (Apr 18 2020 at 14:17):
So I think the prerequisite question is how does one prove this in Lean?
example : 317811 + 514229 = 832040
Kevin Buzzard (Apr 18 2020 at 14:20):
by norm_num
Reid Barton (Apr 18 2020 at 14:21):
This one is by norm_num
. The question is really how to make norm_num
aware of fib
, or fib
aware of norm_num
, or something which can do both definitional unfolding and norm_num
where appropriate.
Kenny Lau (Apr 18 2020 at 14:29):
maybe if norm_num
accepts tags then we can tag fib_bit0
and fib_bit1
as norm_num
Kenny Lau (Apr 18 2020 at 14:29):
this still wouldn't make it memoize though
Kenny Lau (Apr 18 2020 at 14:30):
Johan Commelin said:
image.png ??
there must be a formula without minus
Johan Commelin (Apr 18 2020 at 14:31):
Well, F n * L n
doesn't contain a minus... maybe there are fast formulas for L n
as well??
Kenny Lau (Apr 18 2020 at 14:48):
20 = 2 * 10 = 2 * (2 * 5) = 2 * (2 * (2 * 2 + 1)) (F(0), F(1)) = (0, 1) (F(1), F(2)) = (0*0+1*1, 1*(2*0+1)) = (1, 1) (F(3), F(4)) = (1*1+1*1, 1*(2*1+1)) = (2, 3) (F(4), F(5)) = (3, 2+3) = (3, 5) (F(9), F(10)) = (3*3+5*5, 5*(2*3+5)) = (34, 55) (F(19), F(20)) = (34*34+55*55, 55*(2*34+55)) = (4181, 6765)
proposed algorithm:
fib 20 = fib (bit0 10) = fib (bit0 $ bit0 5) = fib (bit0 $ bit0 $ bit1 2) = fib (bit0 $ bit0 $ bit1 $ bit0 1) → (fib_aux $ bit0 $ bit0 $ bit1 $ bit0 1).2 → (f0 $ fib_aux $ bit0 $ bit1 $ bit0 1).2 → (f0 $ f0 $ fib_aux $ bit1 $ bit0 1).2 → (f0 $ f0 $ fs $ f0 $ fib_aux $ bit0 1).2 → (f0 $ f0 $ fs $ f0 $ f0 $ fib_aux 1).2 → (f0 $ f0 $ fs $ f0 $ f0 (0, 1)).2 → (f0 $ f0 $ fs $ f0 (0*0+1*1, 1*(2*0+1))).2 → (f0 $ f0 $ fs $ f0 (1, 1)).2 → (f0 $ f0 $ fs (1*1+1*1, 1*(2*1+1))).2 → (f0 $ f0 $ fs (2, 3)).2 → (f0 $ f0 (3, 2+3)).2 → (f0 $ f0 (3, 5)).2 → (f0 (3*3+5*5, 5*(2*3+5))).2 → (f0 (34, 55)).2 → (34*34+55*55, 55*(2*34+55)).2 → (4181, 6765).2 → 6765
Kevin Buzzard (Apr 18 2020 at 14:51):
That is much better than the foldl/foldr issue
Reid Barton (Apr 18 2020 at 14:51):
foldl/foldr?
Kenny Lau (Apr 18 2020 at 14:52):
def f0(s): return ("({0}*{0}+{1}*{1})".format(*s), "{1}*(2*{0}+{1})".format(*s)) def fs(s): return (s[1],"({0}+{1})".format(*s)) f20 = f0(f0(fs(f0(f0((0,1))))))[1] print(f20) print(eval(f20))
output:
(((0*0+1*1)*(0*0+1*1)+1*(2*0+1)*1*(2*0+1))+1*(2*0+1)*(2*(0*0+1*1)+1*(2*0+1)))*(2*1*(2*0+1)*(2*(0*0+1*1)+1*(2*0+1))+(((0*0+1*1)*(0*0+1*1)+1*(2*0+1)*1*(2*0+1))+1*(2*0+1)*(2*(0*0+1*1)+1*(2*0+1))))*(2*(1*(2*0+1)*(2*(0*0+1*1)+1*(2*0+1))*1*(2*0+1)*(2*(0*0+1*1)+1*(2*0+1))+(((0*0+1*1)*(0*0+1*1)+1*(2*0+1)*1*(2*0+1))+1*(2*0+1)*(2*(0*0+1*1)+1*(2*0+1)))*(((0*0+1*1)*(0*0+1*1)+1*(2*0+1)*1*(2*0+1))+1*(2*0+1)*(2*(0*0+1*1)+1*(2*0+1))))+(((0*0+1*1)*(0*0+1*1)+1*(2*0+1)*1*(2*0+1))+1*(2*0+1)*(2*(0*0+1*1)+1*(2*0+1)))*(2*1*(2*0+1)*(2*(0*0+1*1)+1*(2*0+1))+(((0*0+1*1)*(0*0+1*1)+1*(2*0+1)*1*(2*0+1))+1*(2*0+1)*(2*(0*0+1*1)+1*(2*0+1))))) 6765
Kevin Buzzard (Apr 18 2020 at 14:52):
I just mean asking a functional language to compute Fibonacci numbers might naively involve computing about times. It's not foldl/foldr but it's the same sort of thing.
Kenny Lau (Apr 18 2020 at 14:53):
I don't know whether it will cause problem if the intermediate expressions are not immediately evaluated
Kenny Lau (Apr 18 2020 at 14:53):
it might undo our optimization by having exponential complexity (on top of logarithmic complexity)
Kenny Lau (Apr 18 2020 at 14:53):
then the overall complexity would just be O(n)
Reid Barton (Apr 18 2020 at 14:54):
None of this is really relevant if you can't compute these additions in subexponential time
Kevin Buzzard (Apr 18 2020 at 14:54):
Yes, you are evaluating 0*0+1*1
a lot of times there
Kenny Lau (Apr 18 2020 at 14:54):
Reid Barton said:
None of this is really relevant if you can't compute these additions in subexponential time
norm_num
can
Reid Barton (Apr 18 2020 at 14:54):
But you can't solve the original problem using norm_num
Reid Barton (Apr 18 2020 at 14:55):
You are solving the second problem before solving the first problem.
Kenny Lau (Apr 18 2020 at 14:55):
if you're referring to writing a tactic that will calculate fib 20
, then my algorithm can call norm_num
in each intermediate step
Kenny Lau (Apr 18 2020 at 14:55):
what's the first problem?
Reid Barton (Apr 18 2020 at 14:56):
I don't want to write a new tactic to compute every function I write.
Kenny Lau (Apr 18 2020 at 14:56):
so I also proposed having a norm_num
tag that you can tag my lemmas with, which include fib n = (fib_aux n).2
and fib_aux (bit0 n) = f0 (fib_aux n)
and fib_aux(bit1 n) = fs (f0 (fib_aux n))
, and my algorithm demonstrated how you can use these three lemmas to calculate fib 20
in logarithmic time
Kevin Buzzard (Apr 18 2020 at 14:57):
One could imagine a general "multiply these two matrices together using norm_num and also other tricks" function.
Kenny Lau (Apr 18 2020 at 14:57):
(removed)
Kevin Buzzard (Apr 18 2020 at 14:57):
Such a function will I think be hard coded into any decent computer algebra system
Kenny Lau (Apr 18 2020 at 14:57):
(removed)
Alex J. Best (Apr 18 2020 at 17:05):
@Mario Carneiro has mentioned that we need Coq's cbv
tactic a few times now for similar questions I believe, but I don't know enough about coq to know what needs doing here (can we get a formal roadmap for a tactic)!
Mario Carneiro (Apr 18 2020 at 20:13):
As Reid suggests, the "right" answer here is to extend norm_num
, and I've been meaning to use def_replacer
for this purpose so that norm_num
doesn't have to import every number theoretic function it will ever use
Mario Carneiro (Apr 18 2020 at 20:15):
There is nothing wrong with using -
inside norm_num
, because norm_num
doesn't prove a - b = c
, it proves b + c = a
where a
and b
are the inputs and c
is the output
Mario Carneiro (Apr 18 2020 at 20:20):
Extending norm_num
is a bit more complicated than just marking fib_bit0
and fib_bit1
, because it's not a simp set, it requires some carefully stated custom theorems and goes by recursion, which is much faster than simp
Mario Carneiro (Apr 18 2020 at 20:22):
But it also means that it is not difficult to get it to evaluate subterms as required to get the O(n log n) algorithm
Mario Carneiro (Apr 18 2020 at 20:24):
Johan Commelin said:
Well, I want to show e.g. that
fib 34
is > 4 million
If that's all you want, you don't need to evaluate fib
at all, unless that bound is particularly close. A quick estimation method would say that fib n
is close to phi^n / sqrt 5
or something like that, and you can easily get proven bounds on phi using continued fractions
Kevin Buzzard (Apr 18 2020 at 20:36):
import data.nat.fib open nat example : fib 12 = 144 := begin rw fib_succ_succ, rw @fib_succ_succ 9, rw @fib_succ_succ 8, rw @fib_succ_succ 7, rw @fib_succ_succ 6, rw @fib_succ_succ 5, rw @fib_succ_succ 4, rw @fib_succ_succ 3, rw @fib_succ_succ 2, rw @fib_succ_succ 1, rw @fib_succ_succ 0, refl, end
Kevin Buzzard (Apr 18 2020 at 20:37):
example : fib 12 = 144 := begin -- start the countdown rw @fib_succ_succ 10, rw @fib_succ_succ 9, rw @fib_succ_succ 8, rw @fib_succ_succ 7, rw @fib_succ_succ 6, rw @fib_succ_succ 5, rw @fib_succ_succ 4, rw @fib_succ_succ 3, rw @fib_succ_succ 2, rw @fib_succ_succ 1, rw @fib_succ_succ 0, rw @fib_one, rw @fib_zero, end
Last updated: Dec 20 2023 at 11:08 UTC