Zulip Chat Archive

Stream: new members

Topic: Zero-argument tactic usage in Mathlib


Isak Colboubrani (Feb 01 2024 at 23:09):

I analyzed the use of "zero-argument tactics" in the Mathlib repo (excluding tests), focusing on those used to prove theorems or lemmas in one step, like this:

theorem […] := by tactic
lemma […] := by tactic

While the method has obvious limitations (for example, missing tactics not on the same line as the theorem or lemma), it hopefully gives a rough idea of which tactics that are used in said situations in Mathlib:

Tactic Defined in Used in # files # uses
1. simp Lean 164 files 385 uses
2. rfl Lean 9 files 13 uses
3. aesop_cat Mathlib 6 files 13 uses
4. decide Lean 5 files 29 uses
5. aesop Mathlib 4 files 6 uses
6. tauto Mathlib 3 files 4 uses
7. ext Mathlib 2 files 11 uses
8. infer_instance Lean 2 files 7 uses
9. linarith Mathlib 2 files 4 uses
10. trivial Lean 2 files 3 uses
11. norm_cast Mathlib 2 files 2 uses
12. map_fun_tac Mathlib 1 file 9 uses
13. congr Lean 1 file 1 use
14. constructor Lean 1 file 1 use
15. measurability Mathlib 1 file 1 use
16. positivity Lean 1 file 1 use
17. transfer Mathlib 1 file 1 use

Any surprises?

I was surprised to see aesop_cat being used more than aesop.

Jon Eugster (Feb 02 2024 at 08:34):

I was surprised that you only found 13 := by rfl proofs. But actually since you're not capturing := rfl proofs that makes more sense, and in most of these 13 cases it would probably be good mathlib style to remove the by, too.

You're probably getting quite a skewed picture if you're search is some regex ensitive to newlines, as the manual 100-char line breaks (and the style guide in general) will cause a lot of statements to be written on multiple lines...

Jireh Loreaux (Feb 02 2024 at 14:24):

@Jon Eugster note that sometimes you can't replace by rfl with rfl. The tactic version is "smarter" in certain ways.

Jon Eugster (Feb 02 2024 at 15:20):

That's of course absolutely true, my bad.

Isak Colboubrani (Feb 03 2024 at 13:49):

@Jon Eugster Indeed, your observation regarding the complications introduced by the 100-character line break is spot on. Thanks for notifying! I have subsequently developed an enhanced parser that addresses this specific problem. Here are the revised results:

Tactic Defined in # uses
1. simp Lean 600 uses
2. rfl Lean 87 uses
3. aesop_cat Mathlib 67 uses
4. decide Lean 30 uses
5. aesop Mathlib 18 uses
6. linarith Mathlib 11 uses
7. ext Mathlib 11 uses
8. coherence Mathlib 11 uses
9. ring Mathlib 9 uses
10. map_fun_tac Mathlib 9 uses
11. witt_truncateFun_tac Mathlib 7 uses
12. norm_cast Mathlib 7 uses
13. infer_instance Lean 7 uses
14. simp_all Lean 6 uses
15. pgame_wf_tac Mathlib 5 uses
16. congr Lean 5 uses
17. tauto Mathlib 5 uses
18. simpa Mathlib 4 uses
19. trivial Lean 3 uses
20. bitwise_assoc_tac Mathlib 3 uses
21. positivity Mathlib 2 uses
22. mfld_set_tac Mathlib 2 uses
23. continuity Mathlib 1 use
24. constructor Lean 1 use
25. measurability Mathlib 1 use

Heather Macbeth (Feb 03 2024 at 18:36):

This is kind of a weird sample, because often if you can prove a fact with one tactic application, that's a reason not to make it a stand-alone theorem at all.

Of course there are exceptions, like if you're stating it to tag it as a simp or ext lemma.

Heather Macbeth (Feb 03 2024 at 18:39):

IMO, a more interesting study would be to analyse the no-argument tactics which occur as "finishing" tactics (closing a goal) anywhere in a mathlib proof.

Heather Macbeth (Feb 03 2024 at 18:41):

That will give you a much higher count than 9 for ring.

Isak Colboubrani (Feb 03 2024 at 20:46):

Heather Macbeth said:

IMO, a more interesting study would be to analyse the no-argument tactics which occur as "finishing" tactics (closing a goal) anywhere in a mathlib proof.

Thank you for the valuable feedback! I believe I can modify my parser to accommodate that extraction.

To make sure I understand the proposed modification, I will demonstrate how my somewhat heuristic parser processes theorems of varying complexity.

The numeration preceding each line of proof represents a tuple where the first number is a counter for each := by block, and the second number counts the lines within such a "block."

In the following situation, aesop_cat would be acknowledged for its role, consistent with the logic in previous analyses?

Theorem: theorem NatTrans.mapHomologicalComplex_id (c : ComplexShape ι) (F : V  W) [F.Additive] : NatTrans.mapHomologicalComplex (𝟙 F) c = 𝟙 (F.mapHomologicalComplex c)
  [ 1. 1.]  aesop_cat

In the following situation is simp the tactic to be credited?

Theorem: theorem iSupLift_comp_inclusion {i : ι} (h : K i  T) : (iSupLift K dir f hf T hT).comp (inclusion h) = f i
  [ 1. 1.]  ext; simp

In the following situation is rfl the tactic to be credited?

Theorem: theorem symm_symm (e : A₁ ≃ₐ[R] A₂) : e.symm.symm = e
  [ 1. 1.]   ext
  [ 1. 2.]   rfl

In the following situation is linarith the tactic to be credited?

Theorem: theorem sum_Ico_Ico_comm {M : Type*} [AddCommMonoid M] (a b : ) (f :     M) : ( i in Finset.Ico a b,  j in Finset.Ico i b, f i j) =  j in Finset.Ico a b,  i in Finset.Ico a (j + 1), f i j
  [ 1. 1.]   rw [Finset.sum_sigma', Finset.sum_sigma']
  [ 1. 2.]   refine' sum_nbij' (fun x  x.2, x.1⟩) (fun x  x.2, x.1⟩) _ _ (fun _ _  rfl) (fun _ _  rfl)
  [ 1. 3.]     (fun _ _  rfl) <;>
  [ 1. 4.]   simp only [Finset.mem_Ico, Sigma.forall, Finset.mem_sigma] <;>
  [ 1. 5.]   rintro a b ⟨⟨h₁, h₂⟩, h₃, h₄⟩⟩ <;>
  [ 1. 6.]   refine' ⟨⟨_, _⟩, _, _⟩⟩ <;>
  [ 1. 7.]   linarith

In the following situation is assumption the tactic to be credited?

Theorem: theorem prod_add_prod_le' (hi : i  s) (h2i : g i + h i  f i) (hgf :  j  s, j  i  g j  f j) (hhf :  j  s, j  i  h j  f j) : (( i in s, g i) +  i in s, h i)   i in s, f i
  [ 1. 1.]   classical
  [ 1. 2.]     simp_rw [prod_eq_mul_prod_diff_singleton hi]
  [ 1. 3.]     refine' le_trans _ (mul_le_mul_right' h2i _)
  [ 1. 4.]     rw [right_distrib]
  [ 1. 5.]     apply add_le_add <;> apply mul_le_mul_left' <;> apply prod_le_prod' <;>
  [ 1. 6.]             simp only [and_imp, mem_sdiff, mem_singleton] <;>
  [ 1. 7.]           intros <;>
  [ 1. 8.]         apply_assumption <;>
  [ 1. 9.]       assumption

In the following situation is tauto the tactic to be credited? Only for the tauto on the last line?

Theorem: theorem mem_range_mulIndicator {r : M} {s : Set α} {f : α  M} : r  range (mulIndicator s f)  r = 1  s  univ  r  f '' s
  [ 1. 1.]   simp only [mem_range, mulIndicator, ne_eq, mem_image]
  [ 1. 2.]   rw [eq_univ_iff_forall, not_forall]
  [ 1. 3.]   refine ?_, ?_
  [ 1. 4.]   · rintro y, hy
  [ 1. 5.]     split_ifs at hy with hys
  [ 1. 6.]     · tauto
  [ 1. 7.]     · left
  [ 1. 8.]       tauto
  [ 1. 9.]   · rintro (⟨hr, x, hx⟩⟩ | x, hx, hxs⟩⟩) <;> use x <;> split_ifs <;> tauto

In the following "multiple := by" situation: should multiple credits be given or only one?

Theorem: lemma sum_mul_sq_le_sq_mul_sq (s : Finset ι) (f g : ι  α) : ( i in s, f i * g i) ^ 2  ( i in s, f i ^ 2) *  i in s, g i ^ 2
  [ 1. 1.]   nontriviality α
  [ 1. 2.]   obtain h' | h' := (sum_nonneg fun _ _  sq_nonneg <| g _).eq_or_lt
  [ 1. 3.]   · have h'' :  i  s, g i = 0 := fun i hi  by
  [ 1. 4.]       simpa using (sum_eq_zero_iff_of_nonneg fun i _  sq_nonneg (g i)).1 h'.symm i hi
  [ 1. 5.]     rw [ h', sum_congr rfl (show  i  s, f i * g i = 0 from fun i hi  by simp [h'' i hi])]
  [ 1. 6.]     simp
  [ 1. 7.]   refine le_of_mul_le_mul_of_pos_left
  [ 1. 8.]     (le_of_add_le_add_left (a := ( i in s, g i ^ 2) * ( j in s, f j * g j) ^ 2) ?_) h'
  [ 1. 9.]   calc
  [ 1.10.]     _ =  i in s, 2 * (f i *  j in s, g j ^ 2) * (g i *  j in s, f j * g j)
  [ 2. 1.]         simp_rw [mul_assoc (2 : α), mul_mul_mul_comm,  mul_sum,  sum_mul]; ring
  [ 2. 2.]     _   i in s, ((f i *  j in s, g j ^ 2) ^ 2 + (g i *  j in s, f j * g j) ^ 2) :=
  [ 2. 3.]         sum_le_sum fun i _  two_mul_le_add_sq (f i *  j in s, g j ^ 2) (g i *  j in s, f j * g j)
  [ 2. 4.]     _ = _
  [ 3. 1.]  simp_rw [sum_add_distrib, mul_pow,  sum_mul]; ring

In the following "multiple := by" situation: should multiple credits be given or only one?

Theorem: theorem sub_convergents_eq {ifp : IntFractPair K} (stream_nth_eq : IntFractPair.stream v n = some ifp) : let g := of v let B := (g.continuantsAux (n + 1)).b let pB := (g.continuantsAux n).
b v - g.convergents n = if ifp.fr = 0 then 0 else (-1) ^ n / (B * (ifp.fr⁻¹ * B + pB))
  [ 1. 1.]   let g := of v
  [ 1. 2.]   let conts := g.continuantsAux (n + 1)
  [ 1. 3.]   let pred_conts := g.continuantsAux n
  [ 1. 4.]   have g_finite_correctness :
  [ 1. 5.]     v = GeneralizedContinuedFraction.compExactValue pred_conts conts ifp.fr :=
  [ 1. 6.]     compExactValue_correctness_of_stream_eq_some stream_nth_eq
  [ 1. 7.]   obtain (ifp_fr_eq_zero | ifp_fr_ne_zero) := eq_or_ne ifp.fr 0
  [ 1. 8.]   · suffices v - g.convergents n = 0 by simpa [ifp_fr_eq_zero]
  [ 1. 9.]     replace g_finite_correctness : v = g.convergents n;
  [ 1.10.]     · simpa [GeneralizedContinuedFraction.compExactValue, ifp_fr_eq_zero] using g_finite_correctness
  [ 1.11.]     exact sub_eq_zero.2 g_finite_correctness
  [ 1.12.]   · -- more shorthand notation
  [ 1.13.]     let A := conts.a
  [ 1.14.]     let B := conts.b
  [ 1.15.]     let pA := pred_conts.a
  [ 1.16.]     let pB := pred_conts.b
  [ 1.17.]     suffices v - A / B = (-1) ^ n / (B * (ifp.fr⁻¹ * B + pB)) by simpa [ifp_fr_ne_zero]
  [ 1.18.]     replace g_finite_correctness : v = (pA + ifp.fr⁻¹ * A) / (pB + ifp.fr⁻¹ * B)
  [ 1.19.]     · simpa [GeneralizedContinuedFraction.compExactValue, ifp_fr_ne_zero, nextContinuants,
  [ 1.20.]         nextNumerator, nextDenominator, add_comm] using g_finite_correctness
  [ 1.21.]     suffices
  [ 1.22.]       (pA + ifp.fr⁻¹ * A) / (pB + ifp.fr⁻¹ * B) - A / B = (-1) ^ n / (B * (ifp.fr⁻¹ * B + pB)) by
  [ 1.23.]       rwa [g_finite_correctness]
  [ 1.24.]     have n_eq_zero_or_not_terminated_at_pred_n : n = 0  ¬g.TerminatedAt (n - 1)
  [ 2. 1.]       cases' n with n'
  [ 2. 2.]       · simp
  [ 2. 3.]       · have : IntFractPair.stream v (n' + 1)  none
  [ 3. 1.]  simp [stream_nth_eq]
  [ 3. 2.]         have : ¬g.TerminatedAt n' :=
  [ 3. 3.]           (not_congr of_terminatedAt_n_iff_succ_nth_intFractPair_stream_eq_none).2 this
  [ 3. 4.]         exact Or.inr this
  [ 3. 5.]     have determinant_eq : pA * B - pB * A = (-1) ^ n :=
  [ 3. 6.]       determinant_aux n_eq_zero_or_not_terminated_at_pred_n
  [ 3. 7.]     have pB_ineq : (fib n : K)  pB :=
  [ 3. 8.]       haveI : n  1  ¬g.TerminatedAt (n - 2)
  [ 4. 1.]         cases' n_eq_zero_or_not_terminated_at_pred_n with n_eq_zero not_terminated_at_pred_n
  [ 4. 2.]         · simp [n_eq_zero]
  [ 4. 3.]         · exact Or.inr <| mt (terminated_stable (n - 1).pred_le) not_terminated_at_pred_n
  [ 4. 4.]       fib_le_of_continuantsAux_b this
  [ 4. 5.]     have B_ineq : (fib (n + 1) : K)  B :=
  [ 4. 6.]       haveI : n + 1  1  ¬g.TerminatedAt (n + 1 - 2)
  [ 5. 1.]         cases' n_eq_zero_or_not_terminated_at_pred_n with n_eq_zero not_terminated_at_pred_n
  [ 5. 2.]         · simp [n_eq_zero, le_refl]
  [ 5. 3.]         · exact Or.inr not_terminated_at_pred_n
  [ 5. 4.]       fib_le_of_continuantsAux_b this
  [ 5. 5.]     have zero_lt_B : 0 < B := B_ineq.trans_lt' <| cast_pos.2 <| fib_pos.2 n.succ_pos
  [ 5. 6.]     have : 0  pB := (cast_nonneg _).trans pB_ineq
  [ 5. 7.]     have : 0 < ifp.fr :=
  [ 5. 8.]       ifp_fr_ne_zero.lt_of_le' <| IntFractPair.nth_stream_fr_nonneg stream_nth_eq
  [ 5. 9.]     have : pB + ifp.fr⁻¹ * B  0
  [ 6. 1.]  positivity
  [ 6. 2.]     calc
  [ 6. 3.]       (pA + ifp.fr⁻¹ * A) / (pB + ifp.fr⁻¹ * B) - A / B =
  [ 6. 4.]           ((pA + ifp.fr⁻¹ * A) * B - (pB + ifp.fr⁻¹ * B) * A) / ((pB + ifp.fr⁻¹ * B) * B)
  [ 7. 1.]         rw [div_sub_div _ _ this zero_lt_B.ne']
  [ 7. 2.]       _ = (pA * B + ifp.fr⁻¹ * A * B - (pB * A + ifp.fr⁻¹ * B * A)) / _
  [ 8. 1.]  repeat' rw [add_mul]
  [ 8. 2.]       _ = (pA * B - pB * A) / ((pB + ifp.fr⁻¹ * B) * B)
  [ 9. 1.]  ring
  [ 9. 2.]       _ = (-1) ^ n / ((pB + ifp.fr⁻¹ * B) * B)
  [10. 1.]  rw [determinant_eq]
  [10. 2.]       _ = (-1) ^ n / (B * (ifp.fr⁻¹ * B + pB))
  [11. 1.]  ac_rfl

Last updated: May 02 2025 at 03:31 UTC