Zulip Chat Archive

Stream: Machine Learning for Theorem Proving

Topic: Assessing the adequacy of an LLM-generated reasoning trace


Suhaib Farooqui (Nov 02 2025 at 07:25):

Hello everyone,

I’m conducting a small research study exploring whether open-source LLMs can generate reasoning traces that are sufficiently structured to guide Lean-style formalization.

In this experiment, the model was explicitly instructed to organize its reasoning under five headings:

  1. My High-Level Proof Plan 

  2. Key Lemma and Theorem Mapping 

  3. Implicit Formal Steps (Gap Analysis) 

  4. Step-by-Step Formalization 

  5. Key Translation Patterns I Found

Below, I’ve included excerpts from Section 4 (“Step-by-Step Formalization”), which represents the model’s detailed tactical reasoning and its attempt to connect informal mathematical reasoning to Lean tactics.

I’d appreciate your feedback on whether this reasoning seems conceptually adequate to generate the Lean proof that follows — or whether it falls short of genuine proof-theoretic understanding.

:small_blue_diamond:Excerpts from the LLM’s reasoning trace

theorem lean_workbook_plus_12908  (f : ℝ → ℝ)

(h₀ : ∀ x y, x ≤ y → f x ≤ f y) :

∀ a b, a + b ≥ 0 → f a + f b ≥ f (-a) + f (-b)   := by

1.  -- INFORMAL: "If a, b ∈ ℝ, and a+b ≥ 0..."

-- MY REASONING:

-- Informal Intent: The proof begins by assuming the premises of the proposition.

-- Goal State Analysis: My goal is ∀ a b, a + b ≥ 0 → f a + f b ≥ f (-a) + f (-b). It's a sequence of universal quantifiers and an implication.

-- My Tactic Selection: The standard tactic to handle this structure is intro. It will introduce a, b, and the hypothesis a + b ≥ 0 into my local context.

-- Expected Outcome: The context will contain a b : ℝ and h : a + b ≥ 0, and the goal will become f a + f b ≥ f (-a) + f (-b).

intro a b h

2.  -- INFORMAL: "Since a+b ≥ 0, ∴ a ≥ -b, And since f(x) is an increasing function on ℝ, ∴ f(a) ≥ f(-b)"

-- MY REASONING:

-- Informal Intent: The proof establishes the first of two key intermediate inequalities.

-- Goal State Analysis: My goal is f a + f b ≥ f (-a) + f (-b). My main hypotheses are h₀ : f is increasing and h : a + b ≥ 0.

-- My Tactic Selection: I'll use a have statement to create this intermediate result, which I'll name h1. The statement will be h1 : f a ≥ f (-b). To prove this, I need to apply h₀. The lemma h₀ requires a proof that the arguments are in the correct order, i.e., -b ≤ a. The informal proof justifies this with "Since a+b ≥ 0". This is a simple linear inequality that linarith can solve from hypothesis h. So, the proof for my have statement will be h₀ _ _ (by linarith). The underscores let Lean infer the arguments a and -b from the type of the have statement.

-- Expected Outcome: A new hypothesis h1 : f a ≥ f (-b) will be added to my context.

have h1 : f a ≥ f (-b) := h₀ _ _ (by linarith)

3.  -- INFORMAL: "Similarly, f(b) ≥ f(-a)"

-- MY REASONING:

-- Informal Intent: The proof states the second intermediate inequality, noting it follows by the same logic as the first.

-- Goal State Analysis: Same as before. I have h1 and now I need the other piece.

-- My Tactic Selection: I'll follow the same pattern. I'll create h2 : f b ≥ f (-a) using have. The proof will again be an application of h₀. This time, h₀ needs a proof of -a ≤ b. Just like before, this follows from h : a + b ≥ 0, so by linarith will prove it.

-- Expected Outcome: A new hypothesis h2 : f b ≥ f (-a) will be added to my context.

have h2 : f b ≥ f (-a) := h₀ _ _ (by linarith)

4.  -- INFORMAL: "Therefore, f(a)+f(b) ≥ f(-a)+f(-b)"

-- MY REASONING:

-- Informal Intent: The proof concludes by adding the two intermediate inequalities.

-- Goal State Analysis: My goal is f a + f b ≥ f (-a) + f (-b). My context now contains h1 : f a ≥ f (-b) and h2 : f b ≥ f (-a).

-- My Tactic Selection: The goal is a direct linear combination of my hypotheses h1 and h2. The linarith tactic is designed to solve exactly these kinds of goals. I will provide h1 and h2 as arguments to linarith to focus its search.

-- Expected Outcome: The goal will be solved.

linarith [h1, h2]

:small_blue_diamond: Corresponding Informal Proof

Informal Solution:

Since $a+b \geqslant 0$, $\therefore a \geqslant -b$, And since $f(x)$ is an increasing function on $\mathbb{R}$, $\therefore f(a) \geqslant f(-b)$…(2 points) Similarly, $f(b) \geqslant f(-a)$   …(3 points) Therefore, $f(a)+f(b) \geqslant f(-a)+f(-b)$ …(5 points) (Ⅱ) The converse proposition is: "Given that $f(x)$ is an increasing function on $\mathbb{R}$, if $f(a)+f(b) \geqslant f(-a)+f(-b)$, then $a+b \geqslant 0$"…(7 points) This proposition is true.…(8 points) The proof by contradiction is as follows: Assume $a+b < 0$, then $a < -b$, $b < -a$, Since $f(x)$ is an increasing function on $\mathbb{R}$, we have: $f(a) < f(-b)$, $f(b) < f(-a)$, $\therefore f(a)+f(b) < f(-a)+f(-b)$ …(10 points) This contradicts the given $f(a)+f(b) \geqslant f(-a)+f(-b)$ …(11 points) Therefore, the converse proposition is true.…(12 points) Thus, the final answer for the truth of the converse proposition is $\boxed{\text{True}}$.

:small_blue_diamond: Discussion Prompt

From a formal reasoning perspective, would you consider the reasoning trace above:

  1. Sufficiently precise and structured to be considered a valid abstraction of the formal proof, or 

  2. Too heuristic or shallow to capture what a human mathematician or Lean expert would consider genuine “proof planning”? 

  3. Looking at the 5-step framework (High-Level Proof Plan, Gap Analysis, etc.), does this seem like an effective prompting strategy to elicit this kind of structured reasoning?

More broadly, do you think such LLM-generated reasoning traces can serve as intermediate artifacts for bridging informal-to-formal translation in automated theorem proving?

Any technical or conceptual feedback is deeply appreciated — this will help refine our methodology for evaluating the interpretability and formal alignment of LLM reasoning outputs.


Last updated: Dec 20 2025 at 21:32 UTC