Zulip Chat Archive
Stream: Machine Learning for Theorem Proving
Topic: Claude Code
Geoffrey Irving (Feb 24 2025 at 20:29):
My sense is that Claude Code isn’t that useful yet in a Lean context because without custom engineering the agent doesn’t know the tactic state. Is that right? Curious if anyone tries it and has a positive experience.
https://www.anthropic.com/news/claude-3-7-sonnet
GasStationManager (Feb 25 2025 at 18:52):
I tried Claude Code briefly. After a bit of prompting it was able to invoke lake env lean <file>, and try to fix its own syntax errors, but ended up not able to fix its syntax and got stuck in a loop. (The task prompt was "prove that 3 is not even")
What worked better for me is when I wrote a simple wrapper of LeanTool to export its tool as a Model Context Protocol (MCP) server, which can then be added to Claude Code. This provides somewhat more information than just calling Lean, as it also prints goal states from sorrys (using Pantograph).
See here for an example set up.
GasStationManager (Feb 25 2025 at 19:01):
This does make me wonder, if Claude can make use of more interactive Lean features; e.g. if one makes a MCP server wrapper for @Oliver Dressler's leanclient, which exposes the Lean Language Server features. Which features would be useful to an LLM?
GasStationManager (Feb 25 2025 at 19:25):
Similarly for Pantograph and other repl-like interfaces. Would Claude be able to use these for proof search?
Geoffrey Irving (Feb 26 2025 at 11:01):
Recent LLMs are pretty good at tool use if prompted appropriately, so likely use.
Quinn (Feb 26 2025 at 15:59):
doing the fvapps baselines, the simplest scaffold (a loop) elicited nontrivial ability to learn syntax that the core llm was wrong about, just error message no LSP wire-up
Quinn (Feb 26 2025 at 16:26):
i'm actually a little skeptical that LSP / goal state beats error message by a lot overall, but one choice over another might save a lot of tokens/dollars
Eric Taucher (Oct 21 2025 at 12:20):
(deleted)
Gavin Zhao (Oct 22 2025 at 01:39):
You really need at least lean-lsp-mcp to make Claude Code useful, because then it can enter a self feedback loop with the help of compiler error messages.
Gavin Zhao (Oct 22 2025 at 01:40):
I haven't found Claude Code on its own to be very helpful in writing Lean code.
Gavin Zhao (Oct 22 2025 at 01:41):
For tactics specifically, you can prompt it to write its tactic step by step and tell it to hover before the first character and at the end of the line to see how the context changes, and based on that try to find the next best possible tactic to apply or revise the current tactic.
Kim Morrison (Oct 23 2025 at 23:55):
Gavin Zhao said:
You really need at least
lean-lsp-mcpto make Claude Code useful, because then it can enter a self feedback loop with the help of compiler error messages.
I'm confused by this claim. Without an MCP, Claude Code is perfectly capable of reading Lean error messages from the IDE, and reacting to them.
Kim Morrison (Oct 23 2025 at 23:56):
I think instructing it to hover and use the MCP to read the goal is way less robust (and less generalizable) than simply reading the unsolved goal error messages which are automatically in context, without an MCP!
Nehal Patel (Oct 24 2025 at 11:28):
After the Cambridge, MA Lean meetup I created a video showing what the workflow flow can look like using claude-code + lean skills (https://github.com/cameronfreer/lean4-skills) + lean_lsp_mcp.
Caveats/Suggestions:
- Watch the video at 2-3x
- In this instance the setup produces poor lean code -- I really should redo with a more compelling example but was too lazy; but hopefully this will give an indication of a possible workflow (which I do think will become quite powerful)
-
Cameron Freer (Oct 24 2025 at 14:36):
@Kim Morrison in my recent experience, the error messages are the main thing Claude uses until the file fully compiles (possibly with sorries), but the various LSP MCP features are helpful for speeding up the process of filling sorries and refactoring and golfing, etc. In particular, I've seen lean_multi_attempt used for quickly finding the simplest successful option among several tactics, once a proof is mostly in place.
Cameron Freer (Oct 24 2025 at 15:12):
For more details on patterns where I've seen Claude productively use the LSP, see
https://github.com/cameronfreer/lean4-skills/blob/main/lean4-theorem-proving%2Freferences%2Flean-lsp-server.md
(It's a rather opinionated description written by Claude based on what it found helpful over several hundred commits -- I don't know that its advice should be taken literally, but on the other hand it's not like Claude follows its own suggestions all that closely, even with the skill.)
Gavin Zhao (Oct 25 2025 at 01:16):
I think instructing it to hover and use the MCP to read the goal is way less robust (and less generalizable) than simply reading the unsolved goal error messages which are automatically in context, without an MCP!
This is useful in the case where Claude outputs a wrong proof that's a long list of tactics. Then, hovering becomes very useful because it can walk through the tactic application step by step (line by line) and see whether each application of the tactic changed the context as it expected. This allows it to do much more efficient/ganular changes rather than trying to reconcile the long list of tactics and the error message and figure out whether it's just a small mistake somewhere or the entire approach is wrong.
Kim Morrison (Oct 25 2025 at 06:57):
My experience has been that it is way more efficient to insist that it never outputs a long list of tactics, and to only allow it to write one tactic at a time. Writing long tactic scripts in one shot is nearly always a failure, and then extremely time/token expensive to repair.
Eric Taucher (Oct 25 2025 at 08:00):
(deleted)
Cameron Freer (Nov 03 2025 at 20:26):
Apart from fixing errors, I've also found that inspecting the goals over MCP (at multiple points in an already-compiling proof) can be useful for automatically discovering where to attempt to split the proof into smaller lemmas -- see, e.g., this workflow that has worked reasonably well for me:
https://github.com/cameronfreer/lean4-skills/blob/main/plugins/lean4-theorem-proving/skills/lean4-theorem-proving/references/proof-refactoring.md#step-1-survey-the-proof
Last updated: Dec 20 2025 at 21:32 UTC