Zulip Chat Archive
Stream: general
Topic: Help Understanding Goal Manipulation
Jonathan Crall (Nov 20 2023 at 01:50):
Help with MCC proof
Not sure if General is the correct place to post this. I'm attempting to build a formal proof in lean to accompany an informal proof in a small paper I wrote: https://arxiv.org/abs/2305.00594 and I'm having a hard time porting concepts I've learned from Theorem Proving in Lean / Mathematics in Lean to a bigger problem.
For additional context, I also wrote a previous post related to help with the limit part: previous post.
The idea is that I want to prove that the Matthew's Correlation Coefficient
(MCC) tends towards the Fowlkes--Mallows index (FM) as the number of true
negatives in the MCC tends towards infinity.
I'm starting to get a feel for the lean4 syntax and how to go about using the
VSCode proof assistant, but there things I'm having trouble with. My thought it
that I can writeup how far I've gotten and hopefully ask questions and get
insights on how to proceed.
I think I have a reasonable statement of the problem via this code:
import Mathlib
import Lake
import Mathlib.Tactic.Positivity
open Filter
open Real
open NNReal
-- The definition of the MCC
noncomputable def MCC (tp tn fp fn : ℝ≥0) : ℝ :=
( tp * tn - fp * fn ) /
NNReal.sqrt (( tp + fp ) * ( tp + fn ) * ( tn + fp ) * ( tn + fn ) )
-- The definition of FM
noncomputable def FM (tp fp fn : ℝ≥0) : ℝ≥0 :=
NNReal.sqrt (( tp / ( tp + fn ) ) * ( tp / ( tp + fp ) ) )
--- The overall goal
variable (tp fp fn : ℝ≥0)
noncomputable def MCC_tn := fun (tn : ℝ≥0) => MCC tp tn fp fn
noncomputable def FM_ := FM tp fp fn
theorem tendsto_MCC_tn_atTop_FM : Tendsto MCC_tn atTop (nhds FM_) := by {
sorry
}
And now I need to fill in the details for tendsto_MCC_tn_atTop_FM
.
I've tried to outline what I think the steps of the proof should look like in pseudo code.
* Consider the MCC:
MCC
* Expanding the MCC:
( tp * tn - fp * fn ) / Real.sqrt (( tp + fp ) * ( tp + fn ) * ( tn + fp ) * ( tn + fn ) )
* Multiply by 1 = (tn⁻¹/tn⁻¹), so we also have to assert tn > 0
(tn⁻¹/tn⁻¹) * (( tp * tn - fp * fn ) / Real.sqrt (( tp + fp ) * ( tp + fn ) * ( tn + fp ) * ( tn + fn ) ))
* Bring tn⁻¹ into numer and denom terms
(tn⁻¹ * ( tp * tn - fp * fn )) / (tn⁻¹ * Real.sqrt ( ( tp + fp ) * ( tp + fn ) * ( tn + fp ) * ( tn + fn ) ))
--- Numerator Algebra ---
* Consider the numerator
Let numer = (tn⁻¹ * ( tp * tn - fp * fn ))
* Distribute tn⁻¹ over the addition
(tn⁻¹ * tp * tn) - (tn⁻¹ * fp * fn)
* Commute multiplication and cancel inverses
numer = (tp) - (tn⁻¹ * fp * fn)
--- Denominator Algebra ---
* Consider the denominator
Let denom = (tn⁻¹ * Real.sqrt ( ( tp + fp ) * ( tp + fn ) * ( tn + fp ) * ( tn + fn ) ))
* Bring tn⁻¹ inside the sqrt
Real.sqrt ( tn⁻¹ * tn⁻¹ * ( tp + fp ) * ( tp + fn ) * ( tn + fp ) * ( tn + fn ) )
* Commute the tn⁻¹'s
Real.sqrt (* ( tp + fp ) * ( tp + fn ) * tn⁻¹ * ( tn + fp ) * tn⁻¹ * ( tn + fn ) )
* Distribute the tn⁻¹'s
Real.sqrt (* ( tp + fp ) * ( tp + fn ) * ( tn⁻¹ * tn + tn⁻¹ * fp ) * (tn⁻¹ * tn + tn⁻¹ * fn ) )
* Cancel multiplicative inverses
denom = Real.sqrt ( ( tp + fp ) * ( tp + fn ) * (1 + tn⁻¹ * fp ) * (1 + tn⁻¹ * fn ) )
--- Limit Part ---
* Recall the new numer
numer = (tp) - (tn⁻¹ * fp * fn)
* Consider the new numer as tn tends to infinity
Limit(numer) = (tp) - (0 * fp * fn)
* Cancel the term multiplied by zero
Limit(numer) = (tp)
* Recall the new denom
denom = Real.sqrt ( ( tp + fp ) * ( tp + fn ) * (1 + tn⁻¹ * fp ) * (1 + tn⁻¹ * fn ) )
* Consider the new denom as tn tends to infinity
Limit(denom) = Real.sqrt ( ( tp + fp ) * ( tp + fn ) * (1 + 0 * fp ) * (1 + 0 * fn ) )
* Cancel the terms multiplied by zero, apply addative identity
Limit(denom) = Real.sqrt ( ( tp + fp ) * ( tp + fn ) * (1) * (1) )
* Apply multiplicative identity
Limit(denom) = Real.sqrt ( ( tp + fp ) * ( tp + fn ) )
* Apply: Limit(numer/denom) = Limit(numer)/Limit(denom)
Limit(MCC) = tp / Real.sqrt ( ( tp + fp ) * ( tp + fn ) )
--- Rearrange FM Part
* Consider the FM:
FM
* Given mul_fractions : (a/b) * (c/d) = (a * b) / (c * d)
* Given div_is_mul_inv : a / b = a * b⁻¹
* Given sqrt_of_square : Real.sqrt(a * a * c) = a * Real.sqrt(c)
* Given sqrt_of_frac : Real.sqrt(a / b) = Real.sqrt(a) * Real.sqrt(b)
* Given sqrt_one : Real.sqrt(1) = 1
* Expanding the FM:
Real.sqrt (( tp / ( tp + fn ) ) * ( tp / ( tp + fp ) ) )
* Apply div_is_mul_inv:
Real.sqrt (( tp * ( tp + fn )⁻¹ ) * ( tp * ( tp + fp )⁻¹ ) )
* Associativity and communitivty to rearange internal terms:
Real.sqrt ( tp * tp * ( tp + fn )⁻¹ * ( tp + fp )⁻¹ )
* Apply sqrt_of_square to extract tp:
tp * Real.sqrt ( ( tp + fn )⁻¹ * ( tp + fp )⁻¹ )
* Revert to division
tp * Real.sqrt ( 1 /( tp + fn ) * 1 / ( tp + fp ) )
* Apply: mul_fractions
tp * Real.sqrt ( 1 / (( tp + fn ) * ( tp + fp )) )
* Apply sqrt_of_frac
tp * Real.sqrt (1) / Real.sqrt( (( tp + fn ) * ( tp + fp )) )
* Apply sqrt_one and mul identity
FM = tp / Real.sqrt( ( tp + fn ) * ( tp + fp ) )
* Thus We find
Lim(MCC) = FM by exact
So given this pseudo code I'm trying to prove small parts, so I can then apply
those intermediate theorems to prove the big statement.
The one I'm currently trying to solve is the part where I simplify the FM to
look like what the LIM(MCC) will look like. I started with this:
-- Helper Theorems
theorem mul_fractions (a b c d : ℝ) : (a/b) * (c/d) = (a * c) / (b * d) := by {
ring
}
theorem div_is_mul_inv (a b: ℝ) : a / b = a * b⁻¹ := by {
ring
}
theorem sqrt_one : Real.sqrt 1 = 1 := by {
rw [sqrt_eq_one]
}
theorem sqrt_of_square (a c : ℝ≥0) : NNReal.sqrt (a * a * c) = a * NNReal.sqrt (c):= by {
rw [NNReal.sqrt_mul]
rw [NNReal.sqrt_mul_self]
}
theorem sqrt_of_frac (a b: ℝ≥0) : NNReal.sqrt (a / b) = (NNReal.sqrt a) / (NNReal.sqrt b) := by {
rw [NNReal.sqrt_div]
}
-- Going to show that FM_v1 simplifies to FM_v2
noncomputable def FM_v1 (tp fp fn : ℝ≥0) : ℝ≥0 :=
NNReal.sqrt (( tp / ( tp + fn ) ) * ( tp / ( tp + fp ) ) )
noncomputable def FM_v2 (tp fp fn : ℝ≥0) : ℝ≥0 :=
tp / NNReal.sqrt ( ( tp + fn ) * ( tp + fp ) )
theorem FM_simplify : FM_v1 tp fp fn = FM_v2 tp fp fn := by {
rw [FM_v1]
rw [FM_v2]
rw [div_eq_inv_mul]
rw [div_eq_inv_mul]
rw [div_eq_inv_mul]
have u1 := (tp + fn)⁻¹
have u2 := (tp + fp)⁻¹
-- ?? rw [u1]
--rw [mul_comm (tp + fn)⁻¹ tp]
--rw [mul_comm (tp + fp)⁻¹ tp]
--rw [← mul_assoc (tp + fn)⁻¹ tp (tp + fp)⁻¹]
--rw [← mul_assoc (tp + fp)⁻¹]
}
It was a useful exercise to get a handle on proving the simple helper theorems,
and I started to make some progress on FM_simplify
.
My current goal state is:
tp fp fn u1 u2: ℝ≥0
⊢ ↑NNReal.sqrt ((tp + fn)⁻¹ * tp * ((tp + fp)⁻¹ * tp)) = (↑NNReal.sqrt ((tp + fn) * (tp + fp)))⁻¹ * tp
But I was having trouble grouping the tp
terms. I wanted to try replacing the
(...)⁻¹
terms with u1
and u2
to make some of the mul_comm
and mul_assoc
rewrites easier to specify where they should be applied, but I rw [u1]
didn't
work and I don't understand why.
So in short, I'm looking to learn how to effectively manipulate the goal state.
Specifically:
-
If there is an identity that I want to apply, how I can specify exactly where it should be applied? Do I just have to rely on the pattern matching?
-
How can I do u-substitution on some part of an expression
-
If I have a large nested expression like
def foo x := f1 f2 f3 (f4 x + f5 x)
how can I "key-in" on part of it likef4 x
, and work on simplifying only that part? -
Is there something flawed in my code / thought process. I'm sure there are concepts here I'm not grasping yet, but I'm not sure if any are evident from what I've written.
Scott Morrison (Nov 20 2023 at 02:27):
Maybe you are looking for generalize h2 : (tp + fp)⁻¹ = u2
?
Jonathan Crall (Nov 20 2023 at 03:29):
That is helping me get further. Is there a quick way to undo it? i.e. sub the original expression back in for u2?
EDIT: Figured it out rw [← h2]
Thanks, I got the proof for the FM part!
theorem mul_fractions (a b c d : ℝ) : (a/b) * (c/d) = (a * c) / (b * d) := by {
ring
}
theorem div_is_mul_inv (a b: ℝ) : a / b = a * b⁻¹ := by {
ring
}
theorem sqrt_one : Real.sqrt 1 = 1 := by {
rw [sqrt_eq_one]
}
theorem sqrt_inv_rule (a b: ℝ≥0) : NNReal.sqrt (a⁻¹ * b⁻¹) = (NNReal.sqrt (a * b))⁻¹ := by {
rw [← NNReal.sqrt_inv]
rw [mul_inv_rev]
rw [mul_comm]
}
theorem sqrt_of_square (a c : ℝ≥0) : NNReal.sqrt (a * a * c) = a * NNReal.sqrt (c):= by {
rw [NNReal.sqrt_mul]
rw [NNReal.sqrt_mul_self]
}
theorem sqrt_of_frac (a b: ℝ≥0) : NNReal.sqrt (a / b) = (NNReal.sqrt a) / (NNReal.sqrt b) := by {
rw [NNReal.sqrt_div]
}
-- Going to show that FM_v1 simplifies to FM_v2
noncomputable def FM_v1 (tp fp fn : ℝ≥0) : ℝ≥0 :=
NNReal.sqrt (( tp / ( tp + fn ) ) * ( tp / ( tp + fp ) ) )
noncomputable def FM_v2 (tp fp fn : ℝ≥0) : ℝ≥0 :=
tp / NNReal.sqrt ( ( tp + fn ) * ( tp + fp ) )
theorem FM_simplify : FM_v1 tp fp fn = FM_v2 tp fp fn := by {
rw [FM_v1]
rw [FM_v2]
rw [div_eq_inv_mul]
rw [div_eq_inv_mul]
rw [div_eq_inv_mul]
generalize h1 : (tp + fn) = u1
generalize h2 : (tp + fp) = u2
generalize h3 : u1⁻¹ = u3
generalize h4 : u2⁻¹ = u4
rw [mul_comm u3 tp]
rw [mul_comm u4 tp]
rw [mul_assoc tp u3]
rw [← mul_assoc u3]
rw [mul_comm u3 tp]
rw [mul_assoc tp]
rw [← mul_assoc tp]
rw [sqrt_of_square]
rw [← h3]
rw [← h4]
rw [sqrt_inv_rule]
rw [mul_comm tp]
}
Any comments on how I could do this better would be appreciated.
Timo Carlin-Burns (Nov 20 2023 at 04:07):
Jonathan Crall said:
- If I have a large nested expression like
def foo x := f1 f2 f3 (f4 x + f5 x)
how can I "key-in" on part of it likef4 x
, and work on simplifying only that part?
This sounds like the exact purpose of conv
mode
Jonathan Crall (Nov 21 2023 at 14:54):
@Timo Carlin-Burns conv has been helpful at keying in on things. Thanks for the pointer.
I'm getting a lot farther, but I've found myself somewhat stuck again. First, I've figured out how to "get past" things I haven't solved yet, but know are true, using the axiom keyword.
axiom factor_rule1 (a b : ℝ≥0):
a * a * (1 + b/a) = a * (a + b)
axiom factor_rule2 (tn fn u1 u2 u3: ℝ≥0):
tn * u3 * u1 * u2 * (1 + fn / tn) = u3 * u1 * u2 * (tn + fn)
I found these above 2 rules useful in the next part of the proof, but they rely on the fact that a > 0
in the first case and tn > 0
in the second case, otherwise there can be a divide by zero. Thus I can't prove these as theorems because they aren't true. I'm wondering what is the right way to go about encoding these as theorems such they are aware of the fact that these variables will be positive in the proof I ultimately intend to use them in?
And on that note, here is the code that I use these two rules in. First the "working" version.
theorem mcc_denoms_eq
(tp tn fp fn : ℝ≥0)
--(tn_is_pos: tn > 0)
:
-- MCC_sqrd_denom_v1 tp tn fp fn =
-- MCC_sqrd_denom_v2 tp tn fp fn := by {
( tp + fp ) * ( tp + fn ) * ( tn + fp ) * ( tn + fn ) =
tn * tn * ( 1 + fp/tn ) * ( tp + fp ) * ( tp + fn ) * ( 1 + fn/tn ) := by {
--rw [MCC_sqrd_denom_v1]
--rw [MCC_sqrd_denom_v2]
rw [factor_rule1 tn fp]
generalize h1 : tp + fp = u1
generalize h2 : tp + fn = u2
generalize h3 : tn + fp = u3
generalize h4 : tn + fn = u4
rw [factor_rule2]
rw [← h1]
rw [← h2]
rw [← h3]
rw [← h4]
ring
You'll notice in the above code, I have some things commented out. First is my initial attempt to make tn positive. Second is in my statement of the theorem. While I have definitions of MCC_sqrd_denom_v1
and MCC_sqrd_denom_v2
, for some reason the proof doesn't seem to work unless I paste in their manual definitions. I would have thought a rw
would have fixed it, but this gives errors, and I'm not sure why:
noncomputable def MCC_sqrd_denom_v1 (tp tn fp fn : ℝ) :=
( tp + fp ) * ( tp + fn ) * ( tn + fp ) * ( tn + fn )
noncomputable def MCC_sqrd_denom_v2 (tp tn fp fn : ℝ) :=
tn * tn * ( 1 + fp/tn ) * ( tp + fp ) * ( tp + fn ) * ( 1 + fn/tn )
theorem mcc_denoms_eq
(tp tn fp fn : ℝ≥0)
:
MCC_sqrd_denom_v1 tp tn fp fn =
MCC_sqrd_denom_v2 tp tn fp fn := by {
rw [MCC_sqrd_denom_v1]
rw [MCC_sqrd_denom_v2]
rw [factor_rule1 tn fp]
generalize h1 : tp + fp = u1
generalize h2 : tp + fn = u2
generalize h3 : tn + fp = u3
generalize h4 : tn + fn = u4
rw [factor_rule2]
rw [← h1]
rw [← h2]
rw [← h3]
rw [← h4]
ring
}
It is complaining about applying factor_rule1
tactic 'rewrite' failed, did not find instance of the pattern in the target expression
tn * tn * (1 + fp / tn)
tptnfpfn: ℝ≥0
⊢ (↑tp + ↑fp) * (↑tp + ↑fn) * (↑tn + ↑fp) * (↑tn + ↑fn) =
↑tn * ↑tn * (1 + ↑fp / ↑tn) * (↑tp + ↑fp) * (↑tp + ↑fn) * (1 + ↑fn / ↑tn)
It looks like I may have introduced an unbound variable or something, such that my rw [MCC_sqrd_denom_v1]
and rw [MCC_sqrd_denom_v2]
are not replaced with the original tp, tn, fp, fn
variables? I'm not quite sure? Can anyone provide insight?
Kevin Buzzard (Nov 21 2023 at 15:20):
Adding things that you can't prove as axioms is a really good route towards accidentally assuming something slightly false and thus completely invalidating all of your proof work. As you yourself point out, dividing by zero is typically a bad idea. Just add relevant hypotheses like (ha : 0 < a)
in your theorem statements, and then figure out how to prove them; skipping proofs is a dangerous game.
Kevin Buzzard (Nov 21 2023 at 15:22):
import Mathlib.Tactic
open NNReal
theorem factor_rule1 (a b : ℝ≥0) : a * a * (1 + b/a) = a * (a + b) := by
by_cases ha : a = 0
· simp [ha]
· field_simp
ring
Last updated: Dec 20 2023 at 11:08 UTC