Zulip Chat Archive
Stream: lean4
Topic: Evaluating tactics on mathlib
Yicheng Qian (Jan 05 2025 at 09:14):
Hi! I'm trying to obtain the success rate of different tactics (e.g. simp, aesop, duper, auto) on Mathlib
, but I've met some difficulties.
The basic setup for the evaluation is as follows: for each theorem T
, we find the module M
where T
is declared, replay the header (import statements) of M
and all the commands before the declaration of T
in M
, then run the tactic on the type of T
. This replay is to exclude the effect of the command that declares T
and the commands after the declaration of T
(if they're not excluded, then e.g. T
may be tagged with @[simp]
, which will cause simp
to immediately succeed when we run simp
on the type of T
).
The difficulty I met is that there are cases where I cannot pose time limits on the execution of certain tactics (see #lean4 > aesop nontermination, there are several theorems in the same Mathlib file with the same issue). When I run certain tactics on certain constants, they may loop forever even with reasonable maxHeartbeats
. If I run my evaluation code on a Mathlib file which contains such constants, the evaluation code will be stuck.
I'm trying to figure out a workaround, but I feel like I need help from more experienced hands. An idea is to run evaluation on a file for multiple times to determine where the timeout occurs, but this might be unreasonably inefficient (considering that the file Mathlib.RingTheory.FiniteType
mentioned in #lean4 > aesop nontermination contains several consecutive theorems that causes nontermination for aesop
).
Another thing I noticed is that when editing in VSCode, it's very easy to terminate the execution of any command (e.g. by commenting it out). I tried looking into src/lean/Lean/Server
but I'm not getting a clear idea how that's done. Would this be helpful?
Anyways, if you have any ideas for a workaround, please let me know.
Eric Wieser (Jan 05 2025 at 11:32):
My understanding is that commenting out a tactic does not terminate its execution, it just lets it continue in a background thread that no one is waiting for any more
Sebastian Ullrich (Jan 05 2025 at 12:18):
No, it also triggers docs#Lean.Core.Context.cancelTk?. So if you see CPU use drop after inserting the comment, then passing and triggering such a token from a different thread after a timeout might be a better approach than heartbeats.
Eric Wieser (Jan 05 2025 at 12:23):
Ah indeed, I got confused there. What I should have said is that if the tactic is not responding to the heartbeats limit (due to being in a tight loop without a call to checkSystem
), it probably won't respond to cancellation tokens either
Yicheng Qian (Jan 05 2025 at 12:36):
I managed to come up with a more concrete problem which is essentially what I'm trying to solve. Suppose I'm replaying a Mathlib file. Before elaboration of the command cmd
the environment is env
. Elaborating cmd
creates a new theorem ci : ConstantInfo
. Suppose prover : Expr -> CoreM Bool
attempts to find a proof of the input expression. How do I run prover
on ci.type
efficiently and asynchronously? And how do I cancel it if it times out (e.g. after 30s)? (Note that prover
need to have access to env
)
Yicheng Qian (Jan 05 2025 at 12:54):
I think IO.Process.spawn
is impractical because that would require passing the env
between processes.
Yicheng Qian (Jan 05 2025 at 13:18):
Ok, the original mem_adjoin_support
issue in #lean4 > aesop nontermination is solved :)
Last updated: May 02 2025 at 03:31 UTC