Zulip Chat Archive

Stream: mathlib4

Topic: LLM does not have good syntax of lean to prove simple case


Linfeng Song (Oct 03 2025 at 23:11):

I create case: Lean 1 + 2 + 3 + ... +n = n(n+1) /2 prove to ChatGPT, DeepSeek V3.2, QWen 3 235B, none of them provide correct syntax to work lean version. I try put output lean prove statement to try v4.8.0-rc1, v4.23.0. Does anybody to check real LLM generated lean statement to claim it works for them

Yan Yablonovskiy 🇺🇦 (Oct 04 2025 at 04:48):

Linfeng Song said:

I create case: Lean 1 + 2 + 3 + ... +n = n(n+1) /2 prove to ChatGPT, DeepSeek V3.2, QWen 3 235B, none of them provide correct syntax to work lean version. I try put output lean prove statement to try v4.8.0-rc1, v4.23.0. Does anybody to check real LLM generated lean statement to claim it works for them

Try using cursor with GPT-5, with .gitignore deleted. It actually looks in .lake and mathlib code.

Ruben Van de Velde (Oct 04 2025 at 06:09):

It is correct that asking an llm to generate lean code is not reliable at this point

(deleted) (Oct 04 2025 at 08:32):

In my setup, I write the Lean statement myself and have the AI agent prove it. And yes it does work very well albeit slowly provided that it can look into the .lake folder and read mathlib code.

(deleted) (Oct 04 2025 at 08:32):

Before AI got any good at Lean I already was a proficient Lean user.

(deleted) (Oct 04 2025 at 08:33):

AI being very powerful is real. I'm not making this up. I'm happy to provide a chat transcript.

(deleted) (Oct 04 2025 at 08:34):

Not a single soul who claims AI is powerful has posted a full chat transcript here before. I can be the first.

(deleted) (Oct 04 2025 at 09:27):

If you want me to send the chat transcript quickly, please tell me how I can export a chat from Codex

(deleted) (Oct 04 2025 at 09:27):

Otherwise I'll need to figure that out on my own

(deleted) (Oct 04 2025 at 12:02):

rollout-2025-10-03T11-24-50-0199a9d1-4d25-72d3-a65f-0f9d0f9268e8.jsonl

(deleted) (Oct 04 2025 at 12:02):

Here it is

(deleted) (Oct 04 2025 at 12:04):

See for yourself the sheer power. I had the LLM generate even longer proofs but the transcripts are too long to share and contain secrets

(deleted) (Oct 04 2025 at 12:05):

The only issue is speed.


Last updated: Dec 20 2025 at 21:32 UTC