Zulip Chat Archive

Stream: Machine Learning for Theorem Proving

Topic: Learning to golf with Aristotle.

J. J. Issai (project started by GRP) (Feb 12 2026 at 21:20):

I am trying the approach of having Aristotle teach me Lean. I take some simple but nontrivial results and sketch a proof, and then feed the sketch to Aristotle. I analyze the output and want to have simpler, or more understandable, or briefer versions of the output. For discussion, here is a proof fragment from Aristotle (more context below this fragment):

  -- By induction on $k = \left\lceil \frac{n}{2} \right\rceil$, we can show that $\pi(2k) \leq k$.
  have h_ind : ∀ k : ℕ, k ≥ 1 → Nat.primeCounting (2 * k) ≤ k := by
    intro k hk
    induction' k with k ih;
    · native_decide +revert;
    · rcases k with ( _ | k ) <;> simp_all +decide [ Nat.mul_succ, Nat.primeCounting ];
      simp_all +arith +decide [ Nat.primeCounting' ];
      rw [ Nat.count_succ ];
      split_ifs <;> simp_all +arith +decide [ Nat.count_succ ];
      · cases ‹Nat.Prime ( 2 * k + 4 ) ›.eq_two_or_odd <;> simp_all +arith +decide [ Nat.add_mod, Nat.mul_mod ];
      · grind;

I think (as a subgoal to the theorem provided to the theorem below) the comment is a reasonable one (show $\pi(2k) \leq k$ ), but in pursuit of this goal, I wonder if native_decide is needed to handle $\pi(2) \leq 1$ . Isn't there a better or more clear way to do this? Also, what aid does +revert give?
Also, is there a good rational for handling k> 1 with rcases? The original prompt included the suggestion that Card.{ p : 2k<p<=2k+2 and p prime} <= Card{p . p =2k+1 } = 1, and that suggestion does not seem to appear in this fragment. What is going on with this fragment?

Here is more context. In running this in the sandbox the suggestion is made that
"This simp argument is unused:
Nat.mul_mod"
and that it can be omitted from the arguments to simp_all just before the call to grind.

import Mathlib

set_option linter.mathlibStandardSet false

open scoped BigOperators
open scoped Real
open scoped Nat
open scoped Classical
open scoped Pointwise

set_option maxHeartbeats 0
set_option maxRecDepth 4000
set_option synthInstance.maxHeartbeats 20000
set_option synthInstance.maxSize 128

set_option relaxedAutoImplicit false
set_option autoImplicit false

noncomputable section

#print Nat.primeCounting
#print Nat.ceil
#check Nat.floor

/-
For all n > 0, pi(n) <= ceil(n/2).
-/
theorem lemma_apple (n : ℕ) (h : 0 < n) : Nat.primeCounting n ≤ Nat.ceil ((n : ℝ) / 2) := by
  -- By induction on $k = \left\lceil \frac{n}{2} \right\rceil$, we can show that $\pi(2k) \leq k$.
  have h_ind : ∀ k : ℕ, k ≥ 1 → Nat.primeCounting (2 * k) ≤ k := by
    intro k hk
    induction' k with k ih;
    · native_decide +revert;
    · rcases k with ( _ | k ) <;> simp_all +decide [ Nat.mul_succ, Nat.primeCounting ];
      simp_all +arith +decide [ Nat.primeCounting' ];
      rw [ Nat.count_succ ];
      split_ifs <;> simp_all +arith +decide [ Nat.count_succ ];
      · cases ‹Nat.Prime ( 2 * k + 4 ) ›.eq_two_or_odd <;> simp_all +arith +decide [ Nat.add_mod, Nat.mul_mod ];
      · grind;
  rcases Nat.even_or_odd' n with ⟨ k, rfl | rfl ⟩;
  · simpa [ show ⌈ ( k : ℝ ) ⌉₊ = k by exact_mod_cast Nat.ceil_natCast _ ] using h_ind k ( by linarith );
  · convert h_ind ( k + 1 ) ( by linarith ) |> le_trans _ using 1;
    · exact Nat.ceil_eq_iff ( by positivity ) |>.2 ⟨ by norm_num; linarith, by norm_num; linarith ⟩;
    · exact Nat.monotone_primeCounting ( Nat.le_succ _ )

I would like suggestions on how to golf this (and similar lemmata). Alternatively, suggestions which would improve readability are also welcome. Most importantly, how can I ask Aristotle (or other agents) to simplify the fragment provided? (I haven't learned enough Lean yet to do it myself.)

Kim Morrison (Feb 12 2026 at 21:30):

I suspect you might not get that much traction asking humans to help you golf AI output. It's pretty unrewarding work. That said, Claude Code running Opus will handle this no problem, and can happily explain to you what golfing opportunities exist, which tactic steps here are weird or dodgy, or why some are necessary and can't be simplified further.

J. J. Issai (project started by GRP) (Feb 12 2026 at 21:52):

Thank you for the suggestion on Claude Code running Opus. I wonder if similar things have been tried with Aristotle, and if someone in this forum can respond in that vein. Also, I am more interested in the issues involved in trying this. If you have comments on this trial example that highlights some of the issues (or a pointer to a resource that talks about the issues), that would also be welcome. The goal is to have humans who have tried this (so experts) tell me some more of what to expect, and to illustrate some specific concerns using the example above.

Kim Morrison (Feb 12 2026 at 22:29):

Kim Morrison said:

I suspect you might not get that much traction asking humans to help you golf AI output.

Kim Morrison (Feb 12 2026 at 22:31):

If someone were to say they were going to write a high quality public how-to-golf-AI-proofs document, particularly one that was usable for AI prompting, then I would be happy to contribute.

J. J. Issai (project started by GRP) (Feb 12 2026 at 23:04):

I don't know about high-quality: I am trying this path to finish my current project (see chats about Sylvester-Schur) and am writing a blog post about my experience. If you know of someone who will write such material, I will be happy to contribute my post as well as supporting prompt queries and results.

Cameron Freer (Feb 13 2026 at 18:53):

There's a bunch of guidance about how to golf contained within the Claude skill in the reference files

proof-golfing
proof-golfing-patterns
proof-golfing-safety
as well as the slashcommand
/golf
and script
find_golfable

Feel free to borrow from this in your projects, or to contribute ideas/fixes via issues or PRs.

Bolton Bailey (Feb 13 2026 at 20:05):

I have found these tools in particular very useful, thanks for making them!

Vasily Ilin (Feb 20 2026 at 04:06):

I want to second this. @Cameron Freer , thank you for making the Lean skills for CC!

Last updated: Feb 28 2026 at 14:05 UTC