Zulip Chat Archive

Stream: lean4

Topic: Deadlocking the task system


Henrik Böving (Jul 23 2023 at 23:49):

While experimenting with the Task system to get a feel for how one might use it for an async setup I noticed that one can rather easily provoke a deadlock in the scheduler like so:

def sleep (ms : UInt32) : IO Unit := do
  IO.println s!"sleeping for {ms}"
  let t  IO.asTask (IO.sleep ms)
  IO.println "Task started, waiting"
  let res := t.get
  IO.println "Task done"
  match res with
  | .ok _ => return ()
  | .error e => throw e

def t1 : IO Unit := do
  sleep 1000
  IO.println "t1 done"

def main : IO Unit := do
  -- set this to nproc
  let ts  List.range 8 |>.mapM (fun _ => IO.asTask t1)
  for t in ts do
    let r := t.get
    IO.println r
  IO.println "bye"

To my understanding this happens because Lean does spawn at max 8 (nproc on my machine) pthreads under the hood and it will forcibly run a Task to completion instead of switching to other tasks if possible (for example because the current one is waiting).

My first question is: Is this a bug? It certainly makes code behave rather differently on machines of different CPU count.

Secondly this makes me wonder how I should view tasks in terms of async abstractions. It is certainly not usable for computations that want to switch contexts because they are sleeping. Or at least that seems to be the case from my interpretation of the above situation. I did see that we have the Promise abstraction which is based on Tasks, I am not quite certain whether I can provoke a similar issue with just Promises but I feel like it should be possible given how similarly they work?

In general I would be curious if it is possible to get the above code working only without a deadlock somehow. I would also be interested in how people that understand the concurrency primitives better than me would imagine the design of a Lean async runtime on top of it. Having looked at it so far I can't quite see how to make it work with the runtime primitives yet.

Mac Malone (Jul 24 2023 at 00:43):

@Henrik Böving The main problem with your example is that you are waiting on a task from within another task, which worthlessly consumes a thread in the thread pool and is a known pitfall with tasks in Lean and something that one needs to carefully work around (i.e., with bindTask or the like).

Henrik Böving (Jul 24 2023 at 15:13):

Right I understand the issue at hand. I just dont exactly see how this fits into my greater understanding of runtime provided green threads. In languages that I know the runtime is usually smart enough to switch green threads when they block or alternatively it doesn't provide a native green thread API at all and let's the user handle that but Lean fits in neither category so I am rather confused as to how would implement something with at least the intended behavior of the code above.


Last updated: Dec 20 2023 at 11:08 UTC