Zulip Chat Archive
Stream: Is there code for X?
Topic: Birthday Problem
Hunter Monroe (Apr 13 2021 at 05:21):
The birthday theorem is one of the 100 theorems not in Lean. To include this in lean, would it be sufficient to show 365/365 x 364/365 x ... x 343/365 < 1/2? In other theorem provers, the theorem is stated as follows. So it takes as input the number of people s=23 and the number of days in the year t=365, and then generically gives a function of s and t, such that the inequality holds for those particular s and t.
HOL: |- !s t. s HAS_SIZE 23 /\ t HAS_SIZE 365
==> 2 * CARD {f | f IN (s -> t) /\
?x y. x IN s /\ y IN s /\ ~(x = y) /\ f(x) = f(y)}
>= CARD (s -> t)
Isabelle: assumes "card S = 23" "card T = 365"
shows "2 * card {f ∈ S→\<^sub>E S T. ¬ inj_on f S} ≥ card (S →\<^sub>E T)"
Mizar: let l := picks 23 (enumerate 365) in
2 * length (filter collision l) > length l.
Johan Commelin (Apr 13 2021 at 05:24):
Personally, I would want to see it phrased in terms of probability theory.
Johan Commelin (Apr 13 2021 at 05:24):
And mathlib doesn't have much in that direction, unfortunately.
Johan Commelin (Apr 13 2021 at 06:06):
@Hunter Monroe Actually, I don't think it really has to be probability theory. But it should be a statement that is recognizably the Birthday problem, before any maths has been done.
Johan Commelin (Apr 13 2021 at 06:14):
import data.multiset
namespace multiset
variables {α : Type*} [decidable_eq α]
def likely_collision (s : multiset (multiset α)) : Prop :=
2 * (s.filter nodup).card < s.card
theorem birthday : likely_collision ((multiset.range 365).powerset_len 23) :=
sorry
end multiset
Johan Commelin (Apr 13 2021 at 06:16):
Or maybe
def likely_collision (s : multiset (multiset α)) : Prop :=
(s.filter nodup).card < (s.filter $ λ a, ¬ nodup a).card
Johan Commelin (Apr 13 2021 at 06:42):
Hmm, it seems that multiset.powerset_len
doesn't do what I thought it does.
Johan Commelin (Apr 13 2021 at 06:59):
I think this version is at least math-correct:
import data.vector2
import data.fintype.card
open fintype (card)
variables (α : Type*) [decidable_eq α] [fintype α]
def likely_collision (n : ℕ) : Prop :=
2 * (finset.univ.filter $ λ l : vector α n, l.to_list.nodup).card < card (vector α n)
theorem birthday : likely_collision (fin 365) 23 :=
begin
unfold likely_collision,
rw [card_vector, fintype.card_fin],
sorry
end
Bhavik Mehta (Apr 13 2021 at 13:26):
@Matt Kempster has also been working on this, and I think he's made a reasonable amount of progress
Matt Kempster (Apr 13 2021 at 14:28):
Yes I have, should I post what I have so far? It's very WIP but I think it's on the right track. It's not probability theory based at all - just counting numbers of injective functions from fin 23 to fin 365 and dividing by all functions from fun 23 to fin 365. This is basically how Metamath does it
Joseph Myers (Apr 14 2021 at 00:07):
The value of fintype.card (α ↪ β)
(for arbitrary [fintype α] [fintype β]
) seems like it belongs in mathlib proper, with only the evaluation for fin 23
and fin 365
going in the mathlib archive. (The denominator is given by fintype.card_fun
.)
Matt Kempster (Apr 14 2021 at 05:20):
Yep, very on the same page there. I'll push to a branch soon (tomorrow if I find time), but my work is pretty messy so far. Look forward to input and advice!
Hunter Monroe (Apr 15 2021 at 17:27):
We would just need to show (finset.filter (λ (l : vector (fin n) k), l.to_list.nodup) finset.univ).card = n* (left hand side but with n-1, k-1)
Yakov Pechersky (Apr 15 2021 at 17:28):
You can use list.of_fn
to skip the vector
Yakov Pechersky (Apr 15 2021 at 17:31):
You can also probably use things about powerset
to automatically get nodup subsets
Yakov Pechersky (Apr 15 2021 at 17:32):
Yakov Pechersky (Apr 15 2021 at 17:32):
docs#finset.powerset_len_eq_filter
Yakov Pechersky (Apr 15 2021 at 17:35):
You could generalize to any pi
fintype, via something like docs#fintype.card_pi and docs#fintype.card_fun
Yakov Pechersky (Apr 15 2021 at 17:44):
Ah I see Joseph said this already
Eric Rodriguez (Apr 22 2021 at 16:57):
So I've got a nearly finished version, actually, and just thought I should search on Zulip to check if anyone else has done it, and saw this... Sorry Matt!
Mario Carneiro (Apr 22 2021 at 16:58):
well, if a bunch of people work on different problems eventually two people will work on the same problem :grinning:
Eric Rodriguez (Apr 22 2021 at 16:58):
One thing that I found is that norm_num
is slow at evaluating stuff like 365!/(365-23)!
, so I made this class to help; do people think it belongs in mathlib proper?
def reverse_fac : ℕ → ℕ → ℕ
| 0 0 := 1
| 0 (k + 1) := 1
| (n + 1) 0 := 1
| (n + 1) (k + 1) := (n + 1) * reverse_fac n k
Eric Rodriguez (Apr 22 2021 at 16:59):
(and basically one lemma guaranteeing that it truly is equal to n! / (n-k)!)
Mario Carneiro (Apr 22 2021 at 17:00):
Yeah, if you write 365!/(365-23)!
that's what norm_num will evaluate
Mario Carneiro (Apr 22 2021 at 17:00):
if you want to evaluate the product of 23 terms instead of 365 you should use a function that evaluates that way
Eric Rodriguez (Apr 22 2021 at 17:02):
this is what reverse_fac does^^ just asking whether it's worth putting in mathlib or just keep it in the archive file
Mario Carneiro (Apr 22 2021 at 17:04):
isn't there already something like this?
Mario Carneiro (Apr 22 2021 at 17:04):
they are called rising and falling factorials IIRC
Mario Carneiro (Apr 22 2021 at 17:05):
Oh there is docs#pochhammer but it is only for polynomials
Eric Rodriguez (Apr 22 2021 at 17:05):
docs#ring_theory.polynomial.pochhammer, maybe? edit: sniped :b
Eric Rodriguez (Apr 22 2021 at 17:06):
i'll try retrofit it with the eval theorems it has there
Eric Rodriguez (Apr 22 2021 at 17:07):
because it ended up _not_ being trivial as expected and instead I ended up with this disaster:
theorem eval_reverse_fac : ∀ n k : ℕ, k ≤ n → n! / (n - k)! = reverse_fac n k
| 0 0 := by simp!
| 0 (k + 1) := by simp!
| (n + 1) 0 := λ _, by { rw [nat.sub_zero, nat.div_self], unfold reverse_fac, apply factorial_pos }
| (n + 1) (k + 1) := begin
intro h, replace h := le_of_succ_le_succ h,
simp only [reverse_fac, factorial, succ_eq_add_one, succ_sub_succ_eq_sub],
rw ←eval_reverse_fac _ _ h,
-- at this point, we have two slightly different divisions. we fix them both
apply mul_left_cancel' (factorial_ne_zero (n - k)),
obtain ⟨a, ha⟩ := factorial_dvd_factorial (sub_le n k),
rw nat.mul_div_cancel',
swap, {use a * (n + 1), rw ha, ring},
symmetry,
calc (n - k)! * ((n + 1) * (n! / (n - k)!)) = (n + 1) * ((n - k)! * (n! / (n - k)!)) : by ac_refl
... = (n + 1) * n! : by rw nat.mul_div_cancel' ⟨a, ha⟩
end
-- If I don't put the `skip` it errors?! (`simp` won't do this on its own btw)
@[simp] lemma rev_fac_zero (m : ℕ) : reverse_fac m 0 = 1 := by {cases m; simp!, skip}
@[simp] lemma zero_rev_fac (m : ℕ) : reverse_fac 0 m = 1 := by {cases m; simp!, skip}
Eric Rodriguez (Apr 22 2021 at 17:07):
but yeah factorials API isn't super well-developed
Mario Carneiro (Apr 22 2021 at 17:11):
Note that norm_num doesn't actually know how to compute factorials. It's just rewriting with the simp lemmas for factorial
Mario Carneiro (Apr 22 2021 at 17:12):
if you use simp lemmas for reverse_fac
it will be just as good
Mario Carneiro (Apr 22 2021 at 17:13):
The reason your proof is a disaster is because you used -
and /
on nat
Mario Carneiro (Apr 22 2021 at 17:13):
you should rewrite the theorem to use +
and *
Eric Rodriguez (Apr 22 2021 at 17:13):
ahh, so instead I could do it as (n + k)! = n! reverse_fac blah
Eric Rodriguez (Apr 22 2021 at 17:15):
is it worth expressing the card as a division still or as this function? just thinking as it's going in mathlib
Mario Carneiro (Apr 22 2021 at 17:19):
for the birthday problem itself you should use a division for presentation purposes
Mario Carneiro (Apr 22 2021 at 17:19):
but once you have proven the lemma with +
and *
it's not hard to postprocess it to the -
and /
version
Eric Rodriguez (Apr 22 2021 at 17:21):
I mean for the card_embedding
function that will be kind of like docs#fintype.card_fun
Matt Kempster (Apr 23 2021 at 01:00):
Eric Rodriguez said:
So I've got a nearly finished version, actually, and just thought I should search on Zulip to check if anyone else has done it, and saw this... Sorry Matt!
No worries! I actually would say I was only ~30% done, and was starting to hit some major road blocks. I look forward to seeing your proof to compare & contrast with my skeleton!
Eric Rodriguez (Apr 25 2021 at 13:15):
It's done! The birthday problem has now been reduced to a call to norm_num
:D Cleaning up the code, PR incoming soon :)
Eric Rodriguez (Apr 25 2021 at 14:24):
Hmm, how to make Lean give up on code versions that I've already overwritten? For example, say I'm editing a big file, and I add an import, it'll start screaming at me that data.finset.basi
doesn't exist, whilst of course I had typed data.finset.basic
and it was just sort of delayed and doing old code, before trying again
Eric Wieser (Apr 25 2021 at 14:27):
Restart vs code?
Eric Rodriguez (Apr 25 2021 at 14:35):
hmm, wasn't too helpful. I'll just let CI deal with it, I'm only copy-pasting stuff right now anyways
Eric Rodriguez (Apr 25 2021 at 14:36):
I changed stuff in embedding.lean
and fintype.lean
so I'm assuming it needs to recheck everything again, lol
Eric Rodriguez (Apr 25 2021 at 15:02):
Eric Rodriguez (Apr 25 2021 at 15:04):
There's still a couple things to be done, but all the code itself is there ready to review; the other things are basically admin
Last updated: Dec 20 2023 at 11:08 UTC