Zulip Chat Archive
Stream: new members
Topic: Zero-argument tactic usage in Mathlib
Isak Colboubrani (Feb 01 2024 at 23:09):
I analyzed the use of "zero-argument tactics" in the Mathlib repo (excluding tests), focusing on those used to prove theorems or lemmas in one step, like this:
theorem […] := by tactic
lemma […] := by tactic
While the method has obvious limitations (for example, missing tactics not on the same line as the theorem or lemma), it hopefully gives a rough idea of which tactics that are used in said situations in Mathlib:
Tactic | Defined in | Used in # files | # uses | |
---|---|---|---|---|
1. | simp |
Lean | 164 files | 385 uses |
2. | rfl |
Lean | 9 files | 13 uses |
3. | aesop_cat |
Mathlib | 6 files | 13 uses |
4. | decide |
Lean | 5 files | 29 uses |
5. | aesop |
Mathlib | 4 files | 6 uses |
6. | tauto |
Mathlib | 3 files | 4 uses |
7. | ext |
Mathlib | 2 files | 11 uses |
8. | infer_instance |
Lean | 2 files | 7 uses |
9. | linarith |
Mathlib | 2 files | 4 uses |
10. | trivial |
Lean | 2 files | 3 uses |
11. | norm_cast |
Mathlib | 2 files | 2 uses |
12. | map_fun_tac |
Mathlib | 1 file | 9 uses |
13. | congr |
Lean | 1 file | 1 use |
14. | constructor |
Lean | 1 file | 1 use |
15. | measurability |
Mathlib | 1 file | 1 use |
16. | positivity |
Lean | 1 file | 1 use |
17. | transfer |
Mathlib | 1 file | 1 use |
Any surprises?
I was surprised to see aesop_cat
being used more than aesop
.
Jon Eugster (Feb 02 2024 at 08:34):
I was surprised that you only found 13 := by rfl
proofs. But actually since you're not capturing := rfl
proofs that makes more sense, and in most of these 13 cases it would probably be good mathlib style to remove the by
, too.
You're probably getting quite a skewed picture if you're search is some regex ensitive to newlines, as the manual 100-char line breaks (and the style guide in general) will cause a lot of statements to be written on multiple lines...
Jireh Loreaux (Feb 02 2024 at 14:24):
@Jon Eugster note that sometimes you can't replace by rfl
with rfl
. The tactic version is "smarter" in certain ways.
Jon Eugster (Feb 02 2024 at 15:20):
That's of course absolutely true, my bad.
Isak Colboubrani (Feb 03 2024 at 13:49):
@Jon Eugster Indeed, your observation regarding the complications introduced by the 100-character line break is spot on. Thanks for notifying! I have subsequently developed an enhanced parser that addresses this specific problem. Here are the revised results:
Tactic | Defined in | # uses | |
---|---|---|---|
1. | simp |
Lean | 600 uses |
2. | rfl |
Lean | 87 uses |
3. | aesop_cat |
Mathlib | 67 uses |
4. | decide |
Lean | 30 uses |
5. | aesop |
Mathlib | 18 uses |
6. | linarith |
Mathlib | 11 uses |
7. | ext |
Mathlib | 11 uses |
8. | coherence |
Mathlib | 11 uses |
9. | ring |
Mathlib | 9 uses |
10. | map_fun_tac |
Mathlib | 9 uses |
11. | witt_truncateFun_tac |
Mathlib | 7 uses |
12. | norm_cast |
Mathlib | 7 uses |
13. | infer_instance |
Lean | 7 uses |
14. | simp_all |
Lean | 6 uses |
15. | pgame_wf_tac |
Mathlib | 5 uses |
16. | congr |
Lean | 5 uses |
17. | tauto |
Mathlib | 5 uses |
18. | simpa |
Mathlib | 4 uses |
19. | trivial |
Lean | 3 uses |
20. | bitwise_assoc_tac |
Mathlib | 3 uses |
21. | positivity |
Mathlib | 2 uses |
22. | mfld_set_tac |
Mathlib | 2 uses |
23. | continuity |
Mathlib | 1 use |
24. | constructor |
Lean | 1 use |
25. | measurability |
Mathlib | 1 use |
Heather Macbeth (Feb 03 2024 at 18:36):
This is kind of a weird sample, because often if you can prove a fact with one tactic application, that's a reason not to make it a stand-alone theorem at all.
Of course there are exceptions, like if you're stating it to tag it as a simp or ext lemma.
Heather Macbeth (Feb 03 2024 at 18:39):
IMO, a more interesting study would be to analyse the no-argument tactics which occur as "finishing" tactics (closing a goal) anywhere in a mathlib proof.
Heather Macbeth (Feb 03 2024 at 18:41):
That will give you a much higher count than 9 for ring
.
Isak Colboubrani (Feb 03 2024 at 20:46):
Heather Macbeth said:
IMO, a more interesting study would be to analyse the no-argument tactics which occur as "finishing" tactics (closing a goal) anywhere in a mathlib proof.
Thank you for the valuable feedback! I believe I can modify my parser to accommodate that extraction.
To make sure I understand the proposed modification, I will demonstrate how my somewhat heuristic parser processes theorems of varying complexity.
The numeration preceding each line of proof represents a tuple where the first number is a counter for each := by
block, and the second number counts the lines within such a "block."
In the following situation, aesop_cat
would be acknowledged for its role, consistent with the logic in previous analyses?
Theorem: theorem NatTrans.mapHomologicalComplex_id (c : ComplexShape ι) (F : V ⥤ W) [F.Additive] : NatTrans.mapHomologicalComplex (𝟙 F) c = 𝟙 (F.mapHomologicalComplex c)
[ 1. 1.] aesop_cat
In the following situation is simp
the tactic to be credited?
Theorem: theorem iSupLift_comp_inclusion {i : ι} (h : K i ≤ T) : (iSupLift K dir f hf T hT).comp (inclusion h) = f i
[ 1. 1.] ext; simp
In the following situation is rfl
the tactic to be credited?
Theorem: theorem symm_symm (e : A₁ ≃ₐ[R] A₂) : e.symm.symm = e
[ 1. 1.] ext
[ 1. 2.] rfl
In the following situation is linarith
the tactic to be credited?
Theorem: theorem sum_Ico_Ico_comm {M : Type*} [AddCommMonoid M] (a b : ℕ) (f : ℕ → ℕ → M) : (∑ i in Finset.Ico a b, ∑ j in Finset.Ico i b, f i j) = ∑ j in Finset.Ico a b, ∑ i in Finset.Ico a (j + 1), f i j
[ 1. 1.] rw [Finset.sum_sigma', Finset.sum_sigma']
[ 1. 2.] refine' sum_nbij' (fun x ↦ ⟨x.2, x.1⟩) (fun x ↦ ⟨x.2, x.1⟩) _ _ (fun _ _ ↦ rfl) (fun _ _ ↦ rfl)
[ 1. 3.] (fun _ _ ↦ rfl) <;>
[ 1. 4.] simp only [Finset.mem_Ico, Sigma.forall, Finset.mem_sigma] <;>
[ 1. 5.] rintro a b ⟨⟨h₁, h₂⟩, ⟨h₃, h₄⟩⟩ <;>
[ 1. 6.] refine' ⟨⟨_, _⟩, ⟨_, _⟩⟩ <;>
[ 1. 7.] linarith
In the following situation is assumption
the tactic to be credited?
Theorem: theorem prod_add_prod_le' (hi : i ∈ s) (h2i : g i + h i ≤ f i) (hgf : ∀ j ∈ s, j ≠ i → g j ≤ f j) (hhf : ∀ j ∈ s, j ≠ i → h j ≤ f j) : ((∏ i in s, g i) + ∏ i in s, h i) ≤ ∏ i in s, f i
[ 1. 1.] classical
[ 1. 2.] simp_rw [prod_eq_mul_prod_diff_singleton hi]
[ 1. 3.] refine' le_trans _ (mul_le_mul_right' h2i _)
[ 1. 4.] rw [right_distrib]
[ 1. 5.] apply add_le_add <;> apply mul_le_mul_left' <;> apply prod_le_prod' <;>
[ 1. 6.] simp only [and_imp, mem_sdiff, mem_singleton] <;>
[ 1. 7.] intros <;>
[ 1. 8.] apply_assumption <;>
[ 1. 9.] assumption
In the following situation is tauto
the tactic to be credited? Only for the tauto
on the last line?
Theorem: theorem mem_range_mulIndicator {r : M} {s : Set α} {f : α → M} : r ∈ range (mulIndicator s f) ↔ r = 1 ∧ s ≠ univ ∨ r ∈ f '' s
[ 1. 1.] simp only [mem_range, mulIndicator, ne_eq, mem_image]
[ 1. 2.] rw [eq_univ_iff_forall, not_forall]
[ 1. 3.] refine ⟨?_, ?_⟩
[ 1. 4.] · rintro ⟨y, hy⟩
[ 1. 5.] split_ifs at hy with hys
[ 1. 6.] · tauto
[ 1. 7.] · left
[ 1. 8.] tauto
[ 1. 9.] · rintro (⟨hr, ⟨x, hx⟩⟩ | ⟨x, ⟨hx, hxs⟩⟩) <;> use x <;> split_ifs <;> tauto
In the following "multiple := by
" situation: should multiple credits be given or only one?
Theorem: lemma sum_mul_sq_le_sq_mul_sq (s : Finset ι) (f g : ι → α) : (∑ i in s, f i * g i) ^ 2 ≤ (∑ i in s, f i ^ 2) * ∑ i in s, g i ^ 2
[ 1. 1.] nontriviality α
[ 1. 2.] obtain h' | h' := (sum_nonneg fun _ _ ↦ sq_nonneg <| g _).eq_or_lt
[ 1. 3.] · have h'' : ∀ i ∈ s, g i = 0 := fun i hi ↦ by
[ 1. 4.] simpa using (sum_eq_zero_iff_of_nonneg fun i _ ↦ sq_nonneg (g i)).1 h'.symm i hi
[ 1. 5.] rw [← h', sum_congr rfl (show ∀ i ∈ s, f i * g i = 0 from fun i hi ↦ by simp [h'' i hi])]
[ 1. 6.] simp
[ 1. 7.] refine le_of_mul_le_mul_of_pos_left
[ 1. 8.] (le_of_add_le_add_left (a := (∑ i in s, g i ^ 2) * (∑ j in s, f j * g j) ^ 2) ?_) h'
[ 1. 9.] calc
[ 1.10.] _ = ∑ i in s, 2 * (f i * ∑ j in s, g j ^ 2) * (g i * ∑ j in s, f j * g j)
[ 2. 1.] simp_rw [mul_assoc (2 : α), mul_mul_mul_comm, ← mul_sum, ← sum_mul]; ring
[ 2. 2.] _ ≤ ∑ i in s, ((f i * ∑ j in s, g j ^ 2) ^ 2 + (g i * ∑ j in s, f j * g j) ^ 2) :=
[ 2. 3.] sum_le_sum fun i _ ↦ two_mul_le_add_sq (f i * ∑ j in s, g j ^ 2) (g i * ∑ j in s, f j * g j)
[ 2. 4.] _ = _
[ 3. 1.] simp_rw [sum_add_distrib, mul_pow, ← sum_mul]; ring
In the following "multiple := by
" situation: should multiple credits be given or only one?
Theorem: theorem sub_convergents_eq {ifp : IntFractPair K} (stream_nth_eq : IntFractPair.stream v n = some ifp) : let g := of v let B := (g.continuantsAux (n + 1)).b let pB := (g.continuantsAux n).
b v - g.convergents n = if ifp.fr = 0 then 0 else (-1) ^ n / (B * (ifp.fr⁻¹ * B + pB))
[ 1. 1.] let g := of v
[ 1. 2.] let conts := g.continuantsAux (n + 1)
[ 1. 3.] let pred_conts := g.continuantsAux n
[ 1. 4.] have g_finite_correctness :
[ 1. 5.] v = GeneralizedContinuedFraction.compExactValue pred_conts conts ifp.fr :=
[ 1. 6.] compExactValue_correctness_of_stream_eq_some stream_nth_eq
[ 1. 7.] obtain (ifp_fr_eq_zero | ifp_fr_ne_zero) := eq_or_ne ifp.fr 0
[ 1. 8.] · suffices v - g.convergents n = 0 by simpa [ifp_fr_eq_zero]
[ 1. 9.] replace g_finite_correctness : v = g.convergents n;
[ 1.10.] · simpa [GeneralizedContinuedFraction.compExactValue, ifp_fr_eq_zero] using g_finite_correctness
[ 1.11.] exact sub_eq_zero.2 g_finite_correctness
[ 1.12.] · -- more shorthand notation
[ 1.13.] let A := conts.a
[ 1.14.] let B := conts.b
[ 1.15.] let pA := pred_conts.a
[ 1.16.] let pB := pred_conts.b
[ 1.17.] suffices v - A / B = (-1) ^ n / (B * (ifp.fr⁻¹ * B + pB)) by simpa [ifp_fr_ne_zero]
[ 1.18.] replace g_finite_correctness : v = (pA + ifp.fr⁻¹ * A) / (pB + ifp.fr⁻¹ * B)
[ 1.19.] · simpa [GeneralizedContinuedFraction.compExactValue, ifp_fr_ne_zero, nextContinuants,
[ 1.20.] nextNumerator, nextDenominator, add_comm] using g_finite_correctness
[ 1.21.] suffices
[ 1.22.] (pA + ifp.fr⁻¹ * A) / (pB + ifp.fr⁻¹ * B) - A / B = (-1) ^ n / (B * (ifp.fr⁻¹ * B + pB)) by
[ 1.23.] rwa [g_finite_correctness]
[ 1.24.] have n_eq_zero_or_not_terminated_at_pred_n : n = 0 ∨ ¬g.TerminatedAt (n - 1)
[ 2. 1.] cases' n with n'
[ 2. 2.] · simp
[ 2. 3.] · have : IntFractPair.stream v (n' + 1) ≠ none
[ 3. 1.] simp [stream_nth_eq]
[ 3. 2.] have : ¬g.TerminatedAt n' :=
[ 3. 3.] (not_congr of_terminatedAt_n_iff_succ_nth_intFractPair_stream_eq_none).2 this
[ 3. 4.] exact Or.inr this
[ 3. 5.] have determinant_eq : pA * B - pB * A = (-1) ^ n :=
[ 3. 6.] determinant_aux n_eq_zero_or_not_terminated_at_pred_n
[ 3. 7.] have pB_ineq : (fib n : K) ≤ pB :=
[ 3. 8.] haveI : n ≤ 1 ∨ ¬g.TerminatedAt (n - 2)
[ 4. 1.] cases' n_eq_zero_or_not_terminated_at_pred_n with n_eq_zero not_terminated_at_pred_n
[ 4. 2.] · simp [n_eq_zero]
[ 4. 3.] · exact Or.inr <| mt (terminated_stable (n - 1).pred_le) not_terminated_at_pred_n
[ 4. 4.] fib_le_of_continuantsAux_b this
[ 4. 5.] have B_ineq : (fib (n + 1) : K) ≤ B :=
[ 4. 6.] haveI : n + 1 ≤ 1 ∨ ¬g.TerminatedAt (n + 1 - 2)
[ 5. 1.] cases' n_eq_zero_or_not_terminated_at_pred_n with n_eq_zero not_terminated_at_pred_n
[ 5. 2.] · simp [n_eq_zero, le_refl]
[ 5. 3.] · exact Or.inr not_terminated_at_pred_n
[ 5. 4.] fib_le_of_continuantsAux_b this
[ 5. 5.] have zero_lt_B : 0 < B := B_ineq.trans_lt' <| cast_pos.2 <| fib_pos.2 n.succ_pos
[ 5. 6.] have : 0 ≤ pB := (cast_nonneg _).trans pB_ineq
[ 5. 7.] have : 0 < ifp.fr :=
[ 5. 8.] ifp_fr_ne_zero.lt_of_le' <| IntFractPair.nth_stream_fr_nonneg stream_nth_eq
[ 5. 9.] have : pB + ifp.fr⁻¹ * B ≠ 0
[ 6. 1.] positivity
[ 6. 2.] calc
[ 6. 3.] (pA + ifp.fr⁻¹ * A) / (pB + ifp.fr⁻¹ * B) - A / B =
[ 6. 4.] ((pA + ifp.fr⁻¹ * A) * B - (pB + ifp.fr⁻¹ * B) * A) / ((pB + ifp.fr⁻¹ * B) * B)
[ 7. 1.] rw [div_sub_div _ _ this zero_lt_B.ne']
[ 7. 2.] _ = (pA * B + ifp.fr⁻¹ * A * B - (pB * A + ifp.fr⁻¹ * B * A)) / _
[ 8. 1.] repeat' rw [add_mul]
[ 8. 2.] _ = (pA * B - pB * A) / ((pB + ifp.fr⁻¹ * B) * B)
[ 9. 1.] ring
[ 9. 2.] _ = (-1) ^ n / ((pB + ifp.fr⁻¹ * B) * B)
[10. 1.] rw [determinant_eq]
[10. 2.] _ = (-1) ^ n / (B * (ifp.fr⁻¹ * B + pB))
[11. 1.] ac_rfl
Last updated: May 02 2025 at 03:31 UTC