Zulip Chat Archive

Stream: Lean Together 2025

Topic: David Renshaw: tryAtEachStep


Bolton Bailey (Jan 14 2025 at 19:13):

You mentioned in the talk using an LLM based tool - Was this over an API or run locally?

Joachim Breitner (Jan 14 2025 at 19:16):

Very nice talk and tool! Are the slides linked somewhere

David Renshaw (Jan 14 2025 at 19:32):

TryAtEachStep Lean Together 2025.pdf

David Renshaw (Jan 14 2025 at 19:32):

Bolton Bailey said:

You mentioned in the talk using an LLM based tool - Was this over an API or run locally?

I tried LeanCopilot, which uses a smallish local model.

Jason Rute (Jan 14 2025 at 19:45):

I had to miss the talk (I'm on a plane). Nice shout out to the PACT paper.

Jason Rute (Jan 14 2025 at 19:46):

Which parts of your talk reference LeanCopilot? And if you did run LeanCopilot over much of mathlib, what is your impression of it as an automated proof solver?

David Renshaw (Jan 14 2025 at 19:51):

I hear that Kevin will have the talk uploaded to youtube shortly.

Bolton Bailey (Jan 14 2025 at 19:51):

Is there a way to get a full list of command line options? If I run it with --help I get

uncaught exception: usage: tryAtEachStepInDirectory [OPTIONS] TACTIC DIRECTORY

David Renshaw (Jan 14 2025 at 19:51):

I wanted to run LeanCopilot on Mathlib, but ran into ffi linker errors.

David Renshaw (Jan 14 2025 at 19:51):

so that's future work

David Renshaw (Jan 14 2025 at 19:53):

Bolton Bailey said:

Is there a way to get a full list of command line options? If I run it with --help I get

uncaught exception: usage: tryAtEachStepInDirectory [OPTIONS] TACTIC DIRECTORY

right now, reading the source code is your only option. :\

David Renshaw (Jan 14 2025 at 19:53):

expanding the help message and the README are TODOs

Pietro Monticone (Jan 14 2025 at 23:03):

:video_camera: Video recording: https://youtu.be/67Ea9WbeAPI

David Renshaw (Jan 15 2025 at 12:25):

RESULTS.json

^ results from an up-to-date mathlib, from running this command overnight

$ lake exe tryAtEachStepInDirectory "with_reducible exact?" .lake/packages/mathlib/Mathlib -j 31

David Renshaw (Jan 15 2025 at 12:25):

(someone requested that I share this data)

Michael Rothgang (Jan 15 2025 at 13:02):

Was it Kim Morrison? In any case, thanks for sharing!

Michael Rothgang (Jan 15 2025 at 13:03):

By the way: I realised that it should be easy to write a Lean executable which reads RESULTS.json and patches your mathlib source with the suggestions. I could try to find time for that early next week.

Michael Rothgang (Jan 15 2025 at 13:03):

By the way: which mathlib revision was this generated on?

Johan Commelin (Jan 15 2025 at 13:05):

@David Renshaw Have you considered integrating this with lake exe refactor (which is in an open PR, I think)?

David Renshaw (Jan 15 2025 at 13:07):

Michael Rothgang said:

By the way: which mathlib revision was this generated on?

https://github.com/leanprover-community/mathlib4/commit/decffa078a6c8823ac43c261a07a4f8b2d4fad5f

David Renshaw (Jan 15 2025 at 13:08):

Michael Rothgang said:

Was it Kim Morrison? In any case, thanks for sharing!

It was someone at gather.town after the talk. Kim was not there. (Asleep presumably.)

David Renshaw (Jan 15 2025 at 13:11):

Johan Commelin said:

David Renshaw Have you considered integrating this with lake exe refactor (which is in an open PR, I think)?

mathlib4#13973

David Renshaw (Jan 15 2025 at 13:12):

I was not aware of that! Thanks for the pointer.

David Renshaw (Jan 15 2025 at 15:18):

This morning, I managed to get tryAtEachStep working with LeanCopilot, using the following horrible invocation:

$ LD_LIBRARY_PATH=./.lake/packages/aesop/.lake/build/lib:./.lake/packages/batteries/.lake/build/lib:./.lake/packages/LeanCopilot/.lake/build/lib:./.lake/build/lib:/home/dwrensha/.elan/toolchains/leanprover--lean4---v4.15.0/lib/lean:/home/dwrensha/.elan/toolchains/leanprover--lean4---v4.15.0/lib/ \
    LD_PRELOAD="/home/dwrensha/.elan/toolchains/leanprover--lean4---v4.15.0/lib/lean/libInit_shared.so  /home/dwrensha/.elan/toolchains/leanprover--lean4---v4.15.0/lib/lean/libleanshared.so .lake/packages/LeanCopilot/.lake/build/lib/libLeanCopilot-1.so"  \
    time lake exe tryAtEachStepInDirectory "aesop? (config := { maxRuleApplications := 40, maxRuleApplicationDepth := 3 }) (add 100% LeanCopilot.tacGen)" Compfiles --imports LeanCopilot -j 1

David Renshaw (Jan 15 2025 at 15:19):

I would be very happy to learn about a way to avoid those LD_LIBRARY_PATH and LD_PRELOAD things.

David Renshaw (Jan 15 2025 at 20:36):

I've been letting LeanCopilot run on every proof step in Compfiles. It's going to take a long time for it to finish! Here's the best result it found so far:

index 98aa70d..9c5be72 100644
--- a/Compfiles/Poland1998P4.lean
+++ b/Compfiles/Poland1998P4.lean
@@ -241,17 +241,11 @@ lemma strengthen
   obtain ⟨N0, hn0⟩ := he
   intro N
   induction' N with pn hpn
-  · obtain (hlt : 0 < N0) | (hlte : N0 ≤ 0) := lt_or_ge 0 N0
-    · exact ⟨N0, hlt, hn0⟩
-    · have heq : N0 = 0 := eq_bot_iff.mpr hlte
-      rw [heq] at hn0
-      exact h 0 hn0
+  · obtain ⟨M, left, right⟩ := h N0 hn0
+    exact ⟨M, pos_of_gt left, right⟩
   · obtain ⟨m, hm, hmp⟩ := hpn
-    obtain (hlt : pn + 1 < m) |  (hlte : m ≤ pn + 1) := lt_or_ge (pn + 1) m
-    · exact ⟨m, hlt, hmp⟩
-    · have heq : m = pn + 1 := le_antisymm hlte hm
-      rw [heq] at hmp
-      exact h (pn.succ) hmp
+    obtain ⟨M, left, right⟩ := h m hmp
+    exact ⟨M, by omega, right⟩

 theorem poland1998_p4' : (∀ N : ℕ, ∃ M : ℕ, N < M ∧ 7 ∣ a M) := by
   have he : 7 ∣ a 5 := by simp [a', a]

(commit 1 , commit 2).

David Renshaw (Jan 15 2025 at 20:39):

And no, aesop alone does not find these improvements. I'm guessing the language model is needed to guess which arguments to pass to h.

David Renshaw (Jan 16 2025 at 13:44):

Bolton Bailey said:

Is there a way to get a full list of command line options? If I run it with --help I get

uncaught exception: usage: tryAtEachStepInDirectory [OPTIONS] TACTIC DIRECTORY

I added --help output: https://github.com/dwrensha/tryAtEachStep/commit/1338858a5dc5dc9faaba715e3f88514e912e087a


Last updated: May 02 2025 at 03:31 UTC