Zulip Chat Archive

Stream: lean4

Topic: Evaluating tactics on mathlib

Yicheng Qian (Jan 05 2025 at 09:14):

Hi! I'm trying to obtain the success rate of different tactics (e.g. simp, aesop, duper, auto) on Mathlib, but I've met some difficulties.
The basic setup for the evaluation is as follows: for each theorem T, we find the module M where T is declared, replay the header (import statements) of M and all the commands before the declaration of T in M, then run the tactic on the type of T. This replay is to exclude the effect of the command that declares T and the commands after the declaration of T (if they're not excluded, then e.g. T may be tagged with @[simp], which will cause simp to immediately succeed when we run simp on the type of T).
The difficulty I met is that there are cases where I cannot pose time limits on the execution of certain tactics (see #lean4 > aesop nontermination, there are several theorems in the same Mathlib file with the same issue). When I run certain tactics on certain constants, they may loop forever even with reasonable maxHeartbeats. If I run my evaluation code on a Mathlib file which contains such constants, the evaluation code will be stuck.
I'm trying to figure out a workaround, but I feel like I need help from more experienced hands. An idea is to run evaluation on a file for multiple times to determine where the timeout occurs, but this might be unreasonably inefficient (considering that the file Mathlib.RingTheory.FiniteType mentioned in #lean4 > aesop nontermination contains several consecutive theorems that causes nontermination for aesop).
Another thing I noticed is that when editing in VSCode, it's very easy to terminate the execution of any command (e.g. by commenting it out). I tried looking into src/lean/Lean/Server but I'm not getting a clear idea how that's done. Would this be helpful?
Anyways, if you have any ideas for a workaround, please let me know.

Eric Wieser (Jan 05 2025 at 11:32):

My understanding is that commenting out a tactic does not terminate its execution, it just lets it continue in a background thread that no one is waiting for any more

Sebastian Ullrich (Jan 05 2025 at 12:18):

No, it also triggers docs#Lean.Core.Context.cancelTk?. So if you see CPU use drop after inserting the comment, then passing and triggering such a token from a different thread after a timeout might be a better approach than heartbeats.

Eric Wieser (Jan 05 2025 at 12:23):

Ah indeed, I got confused there. What I should have said is that if the tactic is not responding to the heartbeats limit (due to being in a tight loop without a call to checkSystem), it probably won't respond to cancellation tokens either

Yicheng Qian (Jan 05 2025 at 12:36):

I managed to come up with a more concrete problem which is essentially what I'm trying to solve. Suppose I'm replaying a Mathlib file. Before elaboration of the command cmd the environment is env. Elaborating cmd creates a new theorem ci : ConstantInfo. Suppose prover : Expr -> CoreM Bool attempts to find a proof of the input expression. How do I run prover on ci.type efficiently and asynchronously? And how do I cancel it if it times out (e.g. after 30s)? (Note that prover need to have access to env)

Yicheng Qian (Jan 05 2025 at 12:54):

I think IO.Process.spawn is impractical because that would require passing the env between processes.

Yicheng Qian (Jan 05 2025 at 13:18):

Ok, the original mem_adjoin_support issue in #lean4 > aesop nontermination is solved :)

Last updated: May 02 2025 at 03:31 UTC