Zulip Chat Archive
Stream: general
Topic: Parallel Lean Repl
Frederick Pu (Apr 08 2025 at 18:35):
I've written the following batch proof verifier for the Lean REPL:
-- Elaborate the header once and reuse the resulting command state for every proof.
unsafe def getHeaderEnv (header : String) : IO Command.State := do
  Lean.initSearchPath (← Lean.findSysroot)
  enableInitializersExecution
  let inputCtx := Parser.mkInputContext header "<input>"
  let (header, parserState, messages) ← Parser.parseHeader inputCtx
  let (env, _) ← processHeader header {} messages inputCtx
  let commandState := (Command.mkState env messages {})
  let s ← IO.processCommands inputCtx parserState commandState <&> Frontend.State.commandState
  pure s

-- Verify the proofs one after another against the given command state.
unsafe def batchVerifySequential (commandState : Command.State) (proofs : Array String) : IO (Array (CommandResponse ⊕ Error)) := do
  proofs.mapM (fun pf => do
    let inputCtx := Parser.mkInputContext pf "<input>"
    let parserState := { : Parser.ModuleParserState }
    let (_, msgs, _) ← Lean.Elab.IO.processCommandsWithInfoTrees inputCtx parserState commandState
    return .inl ({ env := 0,
                   messages := ← msgs.mapM fun m => Message.of m,
                   sorries := [],
                   tactics := [],
                   infotree := none })
  )

-- Naive parallel version: spawn one IO task per proof.
unsafe def batchVerifyParrallelNaive (header : String) (proofs : Array String) : IO (Array (CommandResponse ⊕ Error)) := do
  let commandState ← getHeaderEnv header
  let tasks : Array (Task (Except IO.Error CommandResponse)) ← (proofs.mapM <| fun proof => IO.asTask <| do
    let inputCtx := Parser.mkInputContext proof "<input>"
    let parserState := { : Parser.ModuleParserState }
    let (_, msgs, _) ← Lean.Elab.IO.processCommandsWithInfoTrees inputCtx parserState commandState
    return ({ env := 0,
              messages := ← msgs.mapM fun m => Message.of m,
              sorries := [],
              tactics := [],
              infotree := none })
  )
  let result := ← IO.wait <| ← IO.mapTasks (·.mapM (pure ·)) tasks.toList (prio := Task.Priority.max)
  match result with
  | .ok results =>
    return (results.map
      fun x =>
        match x with
        | .ok cmdres => .inl cmdres
        | .error e => .inr (Error.mk e.toString)
      ).toArray
  | .error _ => return #[]

-- Split the proofs into buckets and verify each bucket sequentially in its own task.
unsafe def batchVerifyParrallel (header : String) (proofs : Array String) (buckets : Option Nat) : IO (Array (CommandResponse ⊕ Error)) := do
  let buckets :=
    match buckets with
    | some x => x
    | none => max 50 proofs.size
  let commandState ← getHeaderEnv header
  let tasks ← (splitArray proofs buckets |>.mapM <| fun bucket => IO.asTask (batchVerifySequential commandState bucket))
  let result := ← IO.wait <| ← IO.mapTasks (·.mapM (pure ·)) tasks.toList (prio := Task.Priority.max)
  match result with
  | .ok results => {
      let womp : List (Array (CommandResponse ⊕ Error)) ← results.mapM (fun x => do
        match x with
        | .ok bucket => pure bucket
        | .error e =>
          IO.println e
          pure #[]
      )
      return womp.toArray.flatMap id
    }
  | .error _ => return #[]
However, the performance is significantly worse than running the Lean REPL from multiple Python processes. Can someone tell me what I'm doing wrong?
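(For reference, the result-collection step at the end of both parallel variants can also be written by waiting on each task directly rather than routing through IO.mapTasks; a minimal sketch, assuming the same CommandResponse and Error types as above:)

def collectResults (tasks : Array (Task (Except IO.Error CommandResponse))) : IO (Array (CommandResponse ⊕ Error)) :=
  -- Wait on each task in turn; the remaining tasks keep running concurrently
  -- while the earlier ones are being awaited.
  tasks.mapM fun t => do
    match ← IO.wait t with
    | .ok cmdres => pure (.inl cmdres)
    | .error e => pure (.inr (Error.mk e.toString))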
Frederick Pu (Apr 08 2025 at 18:36):
Here's the link to my full repo: https://github.com/leanprover-community/repl
Arthur Paulino (Apr 08 2025 at 18:47):
First thing that comes to my mind is: are you running compiled code or are you interpreting it?
Frederick Pu (Apr 08 2025 at 21:47):
I'm running lake build, and then lake exe repl.
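(For context, lake exe runs the compiled binary of the executable target, so the driver code above is compiled; only the submitted commands are elaborated at runtime. A sketch of the kind of lakefile.lean declaration involved, using Lake's lean_exe configuration fields; the names and root here are illustrative and the actual repl lakefile may differ:)

import Lake
open Lake DSL

package repl

@[default_target]
lean_exe repl where
  root := `REPL.Main
  -- exposes the executable's symbols so the interpreter can still run code at runtime
  supportInterpreter := true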
Frederick Pu (Apr 08 2025 at 23:33):
One thing I noticed is that when you run lake build or lake exe cache get, it usually spawns multiple processes. However, my Lean REPL instance never does this.
Frederick Pu (Apr 08 2025 at 23:40):
Hmm, it seems like lake still uses subprocess calls to actually build the modules: https://github.com/leanprover/lean4/blob/master/src/lake/Lake/Build/Actions.lean
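(That points at one way to get the same kind of parallelism here: run each proof, or bucket of proofs, in its own lean subprocess instead of an in-process task. A rough sketch, assuming lean is on the PATH and the search path for the header's imports is already set up, e.g. by running under lake env; the scratch-file name is just a placeholder:)

def verifyInSubprocess (header proof : String) : IO Bool := do
  -- Write the header plus the proof to a scratch file and elaborate it in a
  -- fresh `lean` process, the way lake builds each module in its own subprocess.
  let tmp := System.FilePath.mk "scratch_proof.lean"
  IO.FS.writeFile tmp (header ++ "\n\n" ++ proof)
  let out ← IO.Process.output { cmd := "lean", args := #[tmp.toString] }
  unless out.stderr.isEmpty do
    IO.eprintln out.stderr
  pure (out.exitCode == 0)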
Frederick Pu (Apr 08 2025 at 23:46):
Also, when running the naive parallel version you eventually create so many threads that you get the following error: libc++abi: terminating due to uncaught exception of type std::bad_alloc: std::bad_alloc. Is there any way to see when you are about to reach the thread limit?
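(One way to avoid hitting the limit, rather than detecting it: cap how many tasks are in flight by launching them in fixed-size waves. A sketch; maxWorkers is just a tuning parameter, not a value read from the runtime:)

def boundedParallel {α : Type} (maxWorkers : Nat) (jobs : Array (IO α)) : IO (Array (Except IO.Error α)) := do
  let mut results := #[]
  let mut i := 0
  while i < jobs.size do
    -- Launch at most maxWorkers tasks, wait for all of them, then start the next wave.
    let mut tasks := #[]
    for job in jobs.extract i (i + maxWorkers) do
      tasks := tasks.push (← IO.asTask job)
    for t in tasks do
      results := results.push (← IO.wait t)
    i := i + maxWorkers
  pure results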
Last updated: May 02 2025 at 03:31 UTC