Zulip Chat Archive
Stream: Is there code for X?
Topic: negative binomial distribution
Frederick Pu (Mar 27 2024 at 02:46):
Is there a formalism of negative binomial distributions?
Josha Dekker (Mar 27 2024 at 06:51):
I wrote up the geometric distribution not too long ago: https://github.com/leanprover-community/mathlib4/blob/5bc68d9f5a6c4eca4ec422e6c5f9674e55ce8926/Mathlib/Probability/Distributions/Geometric.lean
It is rather straightforward to generalize this to negbin. I’d recommend first making a PR that adds negbin (you can largely follow my code), then after that merged make a PR that expresses the geometric distribution as a special case
Josha Dekker (Mar 27 2024 at 06:56):
Another route would be to first define compound Poisson distributions. Negbin can be represented as this, and compound Poisson would be very welcome imho! I’m not sure if we get the optimal API immediately however.
Josha Dekker (Mar 27 2024 at 06:59):
If you don’t want to /have time to write it up, let me know, and I can give it a shot this weekend!
Frederick Pu (Mar 28 2024 at 00:23):
Josha Dekker said:
Another route would be to first define compound Poisson distributions. Negbin can be represented as this, and compound Poisson would be very welcome imho! I’m not sure if we get the optimal API immediately however.
Hi John,
I have formalized most of the proof that the sum of the probability masses is equal to 1.
The only step I'm missing is showing that you can swap the infinite sum with the derivative
p: ℝ
hp0: 0 < p
hp1: p < 1
r: ℕ
f: ℕ → ℝ → ℝ := fun n p => ↑(Nat.choose (n + r) r) * (1 - p) ^ n
⊢ (deriv fun p => ∑' (n : ℕ), f n p) = fun p => ∑' (n : ℕ), deriv (f n) p
I also need a bit of guidance cleaning up the proof to be library ready and adjusting it mesh nicely with the PMF API.
I would definitely be interested to getting a PR for this tho :grinning_face_with_smiling_eyes:
Frederick Pu (Mar 28 2024 at 00:39):
here's the full proof so far:
PS please forgive my variable names:
theorem bruh (p : ℝ) (hp0 : 0 < p) (hp1 : p < 1) : (r : ℕ) →
(∑' (n : ℕ), Nat.choose (n + r) (r) * (1 - p) ^ n * p^(r+1) = 1)
| 0 => by {
have : (fun n => ↑(Nat.choose (n + 0) 0) * (1 - p) ^ n * p ^ (0 + 1)) = fun n => p * (1 - p) ^ n := by
ext n
simp [Nat.choose]
ring
rw [this, tsum_mul_left]
have hp1': 1 - p < 1 := by linarith
have hp0' : 0 ≤ 1 - p := by linarith
rw [tsum_geometric_of_lt_1 hp0' hp1']
have : 1 - (1 - p) = p :=
tsub_tsub_cancel_of_le (le_of_lt hp1)
rw [this]
apply mul_inv_cancel
linarith
}
| Nat.succ r => by {
have crux : ∑' (n : ℕ), (Nat.choose (n + r) (r+1) * (1 - p) ^ n) = (1 - p) * (p^(r + 2))⁻¹ := by
{
calc
_ = ∑' (n : ℕ), (n / (r + 1:ℕ) * Nat.choose (n + r) r * (1 - p) ^ n) := by {
apply congrArg
ext n
have : (r + 1 : ℝ) ≠ 0 := by linarith
field_simp only
rw [mul_right_comm, ← Nat.cast_mul]
rw [Nat.choose_succ_right_eq]
simp [add_tsub_cancel_right]
left
ring
}
_ = (1 - p) / (r + 1) * ∑' (n : ℕ), Nat.choose (n + r) r * (n * (1 - p) ^ (n - 1)) := by {
rw [← tsum_mul_left]
apply congrArg
ext n
cases n with
| zero => simp
| succ n => {
rw [pow_succ]
simp
ring
}
}
_ = (1 - p) / (r + 1) * ∑' (n : ℕ), (-1)*deriv (fun p:ℝ => Nat.choose (n + r) r * (1 - p) ^ n) p := by {
apply congrArg
apply congrArg
ext n
have h2: DifferentiableAt ℝ (fun p:ℝ => p^n) (1-p:ℝ) := differentiableAt_pow n
have h: DifferentiableAt ℝ (fun p:ℝ => 1 - p) p := by
rw [differentiableAt_const_sub_iff]
exact differentiableAt_id'
-- have h : deriv (fun p:ℝ => 1 - p) p = -1 := by
-- rw [deriv_const_sub]
-- simp only [deriv_id'']
have := deriv.comp p h2 h
simp at this
rw [deriv_const_sub, deriv_id''] at this
simp only [Function.comp] at this
rw [deriv_const_mul, this]
ring
exact DifferentiableAt.comp p h2 h
}
_ = (1 - p) / (r + 1) * (-1) * ∑' (n : ℕ), deriv (fun p:ℝ => Nat.choose (n + r) r * (1 - p) ^ n) p := by {
rw [tsum_mul_left]
ring
}
_ = (1 - p) / (r + 1) * (-1) * deriv (fun p:ℝ => (∑' (n : ℕ), Nat.choose (n + r) r * (1 - p) ^ n)) p := by {
apply congrArg
apply Eq.symm
let f := fun (n : ℕ) (p : ℝ) => ↑(Nat.choose (n + r) r) * (1 - p) ^ n
suffices : deriv (fun p => ∑' (n : ℕ), f n p) = fun p => (∑' (n : ℕ), deriv (f n) p)
simp only [f] at this
exact congrFun this p
sorry -- apply deriv_tsum
}
_ = (1 - p) / (r + 1) * (-1) * deriv (fun p:ℝ => (1 / p^(r + 1))) p := by {
apply congrArg
apply Filter.EventuallyEq.deriv_eq
simp [Filter.EventuallyEq]
rw [eventually_nhds_iff]
-- |1/2 - p| < ε => -ε < 1/2 - p < ε => 1/2 - ε < p < 1/2 + ε
-- 1 / 2 - ε = 1 / 2 - 1/ 2 = 0
-- 1/2 + ε = 1/ 2 + 1/ 2= 1
use Metric.ball (1 / 2) (1 / 2)
apply And.intro
intro x hx
simp only [Metric.ball, dist, Set.mem_setOf_eq] at hx
rw [abs_sub_lt_iff] at hx
have := bruh x (by linarith [hx.left]) (by linarith [hx.right]) r
rw [tsum_mul_right] at this
have w : x ≠ 0 := by linarith [hx.left, hx.right]
field_simp
rw [this]
apply And.intro
exact Metric.isOpen_ball
simp only [Metric.ball, dist, Set.mem_setOf_eq]
rw [abs_sub_lt_iff]
apply And.intro
linarith
linarith
}
_ = (1 - p) / (r + 1) * (-1) * (-(r+1) * (p^(r + 2))⁻¹) := by {
apply congrArg
have : (fun p:ℝ => 1 / p ^ (r + 1)) = (fun p:ℝ => (p ^ (r + 1))⁻¹) := by
ext x
rw [← inv_eq_one_div]
rw [this, deriv_inv'', deriv_pow]
simp
field_simp
ring
exact differentiableAt_pow (r + 1)
have : p ≠ 0 := by linarith
exact pow_ne_zero (r + 1) this
}
_ = (1 - p) * (p^(r + 2))⁻¹ := by field_simp; ring
}
calc
_ = ∑' (n : ℕ), (Nat.choose (n + r + 1) (r + 1)) * (1 - p) ^ n * p ^ (r + 1 + 1) := by {
apply congrArg
ext n
rw [Nat.succ_eq_add_one, ← add_assoc]
}
_ = ∑' (n : ℕ), (Nat.choose (n + r) r + Nat.choose (n + r) (r+1) :ℕ) * (1 - p) ^ n * p ^ (r + 1 + 1) := by {
simp [Nat.choose]
}
_ = ∑' (n : ℕ), (Nat.choose (n + r) r * (1 - p) ^ n * p ^ (r + 1 + 1) + Nat.choose (n + r) (r+1) * (1 - p) ^ n * p ^ (r + 1 + 1)) := by {
apply congrArg
ext n
push_cast
ring
}
_ = p * ∑' (n : ℕ), (Nat.choose (n + r) r * (1 - p) ^ n * p ^ (r+1)) + p ^ (r + 1 + 1) * ∑' (n : ℕ), (Nat.choose (n + r) (r+1) * (1 - p) ^ n) := by {
rw [← tsum_mul_left, ← tsum_mul_left]
rw [← tsum_add]
apply congrArg
ext n
push_cast
ring
apply Summable.mul_left
have := bruh p hp0 hp1 r
apply by_contradiction
intro h
simp [tsum_def, h] at this
apply Summable.mul_left
apply by_contradiction
intro h
simp [tsum_def, h] at crux
cases crux with
| inl l => linarith
| inr r => linarith
}
_ = p * ∑' (n : ℕ), (Nat.choose (n + r) r * (1 - p) ^ n * p ^ (r+1)) + p ^ (r + 1 + 1) * ((1 - p) * (p^(r + 2))⁻¹) := by {
rw [crux]
}
_ = 1 := by {
rw [bruh p hp0 hp1 r]
have : p ≠ 0 := by linarith
field_simp
ring
}
}
Frederick Pu (Mar 28 2024 at 00:40):
I used induction to make the proof but apparently this can also be done by formalising genereal binomials over real numbers first
Kim Morrison (Mar 28 2024 at 00:43):
Any proof this long needs to be chopped up into lots of little lemmas. The sooner you do it, the easier you life will be (if only because VSCode will be more responsive).
Frederick Pu (Mar 28 2024 at 02:23):
Scott Morrison said:
Any proof this long needs to be chopped up into lots of little lemmas. The sooner you do it, the easier you life will be (if only because VSCode will be more responsive).
yeah I started noticing that
Frederick Pu (Mar 28 2024 at 02:24):
also I couldn't find deriv_tsum in my version of mathlib only fderiv_tsum. Is deriv_tsum a very new theorem?
Josha Dekker (Mar 28 2024 at 06:57):
Why would you have derivatives in the definition? Isn’t it more natural to work directly with the PMF, which I think doesn’t have derivatives?
Also, have you tried proving the exact analogous statements of my proofs for the geometric distribution? I think that would give you quicker proofs. Also note that this is generally helpful, as it will ensure consistency in style with the rest of the distributions
Frederick Pu (Mar 28 2024 at 07:02):
Josha Dekker said:
Why would you have derivatives in the definition? Isn’t it more natural to work directly with the PMF, which I think doesn’t have derivatives?
Also, have you tried proving the exact analogous statements of my proofs for the geometric distribution? I think that would give you quicker proofs. Also note that this is generally helpful, as it will ensure consistency in style with the rest of the distributions
I don't have derivatives in the definition. I use it in the proof in order to compute $\sum_{n = 0}^{\infty} n x^n (n \choose r)$
Frederick Pu (Mar 28 2024 at 07:03):
it's a trick I learned in statistics in order to deal with the additional linear term
David Loeffler (Mar 28 2024 at 07:35):
Frederick Pu said:
also I couldn't find deriv_tsum in my version of mathlib only fderiv_tsum. Is deriv_tsum a very new theorem?
Looks like deriv_tsum
was added about 3 months ago, according to the git history (while fderiv_tsum
went in about 6 months earlier than that). If you're planning to contribute the new code you're writing to mathlib, you will save yourself a lot of pain if you update to the latest mathlib version before starting, since rebasing your code after you've written it is not a terribly pleasant job.
Josha Dekker (Mar 28 2024 at 08:37):
some guiding tips:
- I think that your
crux
should be (made into) a lemma. - You may also wish to begin the proof of
bruh
by computing in a separate lemmaNat.choose (n+r) (r) * a^n
(or evenNat.choose (n+r) (r) *a^(n+r)
if you find this easier). This lemma should be quicker to prove as you won't have to carry additional terms around. (Unless this lemma is already in Mathlib, of course!) - If such statements are not yet in Mathlib, you may consider adding them (to their corresponding files). In this case, I would recommend doing a combinatorial proof that avoids derivatives altogether, to avoid having to add new files to the import hierarchy.
Frederick Pu (Mar 28 2024 at 15:49):
right now I haven't cloned mathlib. I just imported mathlib in a lean project. Will I have all the mathlib cache if I'm working directly in the mathlib project? How does that work?
Ruben Van de Velde (Mar 28 2024 at 15:59):
Are you trying to make changes to the mathlib under YourProject/.lake
? Otherwise I don't understand your situation
Frederick Pu (Mar 28 2024 at 17:50):
I refactored the proof into 3 lemmas:
theorem tsum_choose_mul_compl (p : ℝ) (hp : p ∈ Set.Ioo 0 1) (r : ℕ)
(h : (∑' (n : ℕ), Nat.choose (n + r) (r) * (1 - p) ^ n * p^(r+1) = 1)) :
∑' (n : ℕ), Nat.choose (n + r) r * (1 - p) ^ n = (1 / p^(r + 1)) := by
rw [tsum_mul_right] at h
have : p ≠ 0 := by linarith [hp.left, hp.right]
field_simp
exact h
-- TODO: make better lemma name
theorem crux' (p : ℝ) (hp : p ∈ Set.Ioo 0 1) (r : ℕ)
(h : (∀ p ∈ Set.Ioo (0:ℝ) 1, ∑' (n : ℕ), Nat.choose (n + r) (r) * (1 - p) ^ n * p^(r+1) = 1)) :
∑' (n : ℕ), Nat.choose (n + r) r * (n * (1 - p) ^ (n-1)) = (r+1) * (p^(r + 2))⁻¹ := by
calc
_ = ∑' (n : ℕ), (-1)*deriv (fun p:ℝ => Nat.choose (n + r) r * (1 - p) ^ n) p := by {
apply congrArg
ext n
have h2: DifferentiableAt ℝ (fun p:ℝ => p^n) (1-p:ℝ) := differentiableAt_pow n
have h: DifferentiableAt ℝ (fun p:ℝ => 1 - p) p := by
rw [differentiableAt_const_sub_iff]
exact differentiableAt_id'
have := deriv.comp p h2 h
simp at this
rw [deriv_const_sub, deriv_id''] at this
simp only [Function.comp] at this
rw [deriv_const_mul, this]
ring
exact DifferentiableAt.comp p h2 h
}
_ = (-1) * deriv (fun p:ℝ => (∑' (n : ℕ), Nat.choose (n + r) r * (1 - p) ^ n)) p := by {
rw [tsum_mul_left]
apply congrArg
apply Eq.symm
let f := fun (n : ℕ) (p : ℝ) => ↑(Nat.choose (n + r) r) * (1 - p) ^ n
suffices : deriv (fun p => ∑' (n : ℕ), f n p) = fun p => (∑' (n : ℕ), deriv (f n) p)
simp only [f] at this
exact congrFun this p
sorry -- apply deriv_tsum
}
_ = (-1) * deriv (fun p:ℝ => (1 / p^(r + 1))) p := by {
apply congrArg
apply Filter.EventuallyEq.deriv_eq
simp only [Filter.EventuallyEq]
rw [eventually_nhds_iff]
use Set.Ioo 0 1
apply And.intro
intro x hx
have := tsum_choose_mul_compl x hx r (h x hx)
rw [this]
ring
exact ⟨isOpen_Ioo, hp⟩
}
_ = (-1) * (-(r+1) * (p^(r + 2))⁻¹) := by {
have : (fun p:ℝ => 1 / p ^ (r + 1)) = (fun p:ℝ => (p ^ (r + 1))⁻¹) := by
ext x
rw [← inv_eq_one_div]
rw [this, deriv_inv'', deriv_pow]
have : p ≠ 0 := by linarith [hp.left]
field_simp
ring
exact differentiableAt_pow (r + 1)
exact pow_ne_zero (r + 1) (by linarith [hp.left])
}
_ = (r+1) * (p^(r + 2))⁻¹ := by ring
theorem crux (p : ℝ) (hp : p ∈ Set.Ioo 0 1) (r : ℕ)
(h : (∀ p ∈ Set.Ioo (0:ℝ) 1, ∑' (n : ℕ), Nat.choose (n + r) (r) * (1 - p) ^ n * p^(r+1) = 1)) :
∑' (n : ℕ), (Nat.choose (n + r) (r+1) * (1 - p) ^ n) = (1 - p) * (p^(r + 2))⁻¹ := by
calc
_ = ∑' (n : ℕ), (n / (r + 1:ℕ) * Nat.choose (n + r) r * (1 - p) ^ n) := by {
apply congrArg
ext n
have : (r + 1 : ℝ) ≠ 0 := by linarith
field_simp only
rw [mul_right_comm, ← Nat.cast_mul]
rw [Nat.choose_succ_right_eq]
simp [add_tsub_cancel_right]
left
ring
}
_ = (1 - p) / (r + 1) * ∑' (n : ℕ), Nat.choose (n + r) r * (n * (1 - p) ^ (n - 1)) := by {
rw [← tsum_mul_left]
apply congrArg
ext n
cases n with
| zero => simp
| succ n => {
rw [pow_succ]
simp
ring
}
}
_ = (1 - p) / (r + 1) * ((r+1) * (p^(r + 2))⁻¹) := by {
apply congrArg
rw [crux' p hp]
intro x hx
exact h x hx
}
_ = (1 - p) * (p^(r + 2))⁻¹ := by field_simp
theorem bruh (p : ℝ) (hp0 : 0 < p) (hp1 : p < 1) : (r : ℕ) →
(∑' (n : ℕ), Nat.choose (n + r) (r) * (1 - p) ^ n * p^(r+1) = 1)
| 0 => by {
have : (fun n => ↑(Nat.choose (n + 0) 0) * (1 - p) ^ n * p ^ (0 + 1)) = fun n => p * (1 - p) ^ n := by
ext n
simp [Nat.choose]
ring
rw [this, tsum_mul_left]
have hp1': 1 - p < 1 := by linarith
have hp0' : 0 ≤ 1 - p := by linarith
rw [tsum_geometric_of_lt_1 hp0' hp1']
have : 1 - (1 - p) = p :=
tsub_tsub_cancel_of_le (le_of_lt hp1)
rw [this]
apply mul_inv_cancel
linarith
}
| Nat.succ r => by {
have crux := crux p ⟨hp0, hp1⟩ r (fun x hx => bruh x hx.left hx.right r)
calc
_ = ∑' (n : ℕ), (Nat.choose (n + r + 1) (r + 1)) * (1 - p) ^ n * p ^ (r + 1 + 1) := by {
apply congrArg
ext n
rw [Nat.succ_eq_add_one, ← add_assoc]
}
_ = ∑' (n : ℕ), (Nat.choose (n + r) r + Nat.choose (n + r) (r+1) :ℕ) * (1 - p) ^ n * p ^ (r + 1 + 1) := by {
simp [Nat.choose]
}
_ = ∑' (n : ℕ), (Nat.choose (n + r) r * (1 - p) ^ n * p ^ (r + 1 + 1) + Nat.choose (n + r) (r+1) * (1 - p) ^ n * p ^ (r + 1 + 1)) := by {
apply congrArg
ext n
push_cast
ring
}
_ = p * ∑' (n : ℕ), (Nat.choose (n + r) r * (1 - p) ^ n * p ^ (r+1)) + p ^ (r + 1 + 1) * ∑' (n : ℕ), (Nat.choose (n + r) (r+1) * (1 - p) ^ n) := by {
rw [← tsum_mul_left, ← tsum_mul_left]
rw [← tsum_add]
apply congrArg
ext n
push_cast
ring
apply Summable.mul_left
have := bruh p hp0 hp1 r
apply by_contradiction
intro h
simp [tsum_def, h] at this
apply Summable.mul_left
apply by_contradiction
intro h
simp [tsum_def, h] at crux
cases crux with
| inl l => linarith
| inr r => linarith
}
_ = p * ∑' (n : ℕ), (Nat.choose (n + r) r * (1 - p) ^ n * p ^ (r+1)) + p ^ (r + 1 + 1) * ((1 - p) * (p^(r + 2))⁻¹) := by {
rw [crux]
}
_ = 1 := by {
rw [bruh p hp0 hp1 r]
have : p ≠ 0 := by linarith
field_simp
ring
}
}
Frederick Pu (Mar 28 2024 at 17:51):
I still need to prove the ∑' deriv = deriv ∑'
step and I am also planning to use the theorems about geometric distributions to prove the base case.
Frederick Pu (Mar 28 2024 at 17:55):
also probably gonna use Summable.summable_of_eq_zero_or_self
to simplify a lot the logic in bruh
Frederick Pu (Mar 28 2024 at 17:56):
also also some one please let me know if there any tactics for proving derivative identities automatically for polynomials
Frederick Pu (Mar 28 2024 at 18:35):
Frederick Pu said:
also probably gonna use
Summable.summable_of_eq_zero_or_self
to simplify a lot the logic inbruh
I meant this theorem isn't in mathlib yet (if the tsum is nonzero then the function is summable):
theorem summable_of_tsum_ne_zero {α : Type u_1} {β : Type u_2} [AddCommMonoid β] [TopologicalSpace β] {f : α → β} [ContinuousAdd β]
(hf : (∑' (a : α), f a) ≠ 0) : Summable f := by
apply by_contradiction
intro h
simp [tsum_def, h] at hf
Frederick Pu (Apr 02 2024 at 22:54):
Josha Dekker said:
some guiding tips:
- I think that your
crux
should be (made into) a lemma.- You may also wish to begin the proof of
bruh
by computing in a separate lemmaNat.choose (n+r) (r) * a^n
(or evenNat.choose (n+r) (r) *a^(n+r)
if you find this easier). This lemma should be quicker to prove as you won't have to carry additional terms around. (Unless this lemma is already in Mathlib, of course!)- If such statements are not yet in Mathlib, you may consider adding them (to their corresponding files). In this case, I would recommend doing a combinatorial proof that avoids derivatives altogether, to avoid having to add new files to the import hierarchy.
Can I use the dual numbers. Instead or make a very simplified version of them within my formalism?
Last updated: May 02 2025 at 03:31 UTC