Zulip Chat Archive
Stream: lean4
Topic: MLList crash
Jovan Gerbscheid (Nov 29 2023 at 01:41):
In my DiscrTree
project, I was told to use MLList
as a list monad transformer, but when I tried this, it turned out to be terribly slow. So I decided to do a little test of its speed. But the following simple program made my lean crash:
import Std
open Lean Meta
def g (n : Nat) : MLList MetaM Nat := do
for i in [:n] do
if true then
continue
return 4
#eval MLList.force $ (g 10000)
telling me that
Message: Server process for file:///(.......)/lake-packages/std/Std/Data/MLList/test.lean crashed, likely due to a stack overflow or a bug.
Code: -32902
What's going on here?
Scott Morrison (Nov 29 2023 at 01:49):
Yeah, force
is bad. asArray
works fine.
Jovan Gerbscheid (Nov 29 2023 at 01:50):
I replaced force
by asArray
and get the same error.
Scott Morrison (Nov 29 2023 at 01:53):
Strange. I get #[4]
.
Scott Morrison (Nov 29 2023 at 01:53):
Have you restarted the server?
Jovan Gerbscheid (Nov 29 2023 at 01:53):
for me it works with 2000 and not with 3000
Jovan Gerbscheid (Nov 29 2023 at 01:55):
I just reloaded my VSCode, and get the same result
Scott Morrison (Nov 29 2023 at 01:56):
Yeah, okay, me too!
Scott Morrison (Nov 29 2023 at 02:01):
(As in, I was being confused by the error messages, and indeed mine also crashes at 3000, for either force or toArray.)
Mario Carneiro (Nov 29 2023 at 02:16):
Looks like removing @[specialize]
from casesM
makes the issue go away
Jovan Gerbscheid (Nov 29 2023 at 02:18):
I did this and ran the same test in the MLList/Basic.lean file, but I still get the error
Scott Morrison (Nov 29 2023 at 02:19):
Yeah, removing @[specialize] is not working for me.
Mario Carneiro (Nov 29 2023 at 02:27):
here's a MWE:
import Lean
unsafe inductive MLList (m : Type u → Type u) (α : Type u) : Type u
| nil : MLList m α
| cons : α → MLList m α → MLList m α
| thunk : Thunk (MLList m α) → MLList m α
| squash : (Unit → m (MLList m α)) → MLList m α
namespace MLList
unsafe def uncons {m : Type u → Type u} [Monad m] :
MLList m α → m (Option (α × MLList m α))
| .nil => pure none
| .thunk t => uncons t.get
| .squash t => t () >>= uncons
| .cons x xs => return (x, xs)
@[specialize]
unsafe def casesM [Monad m] (xs : MLList m α)
(hnil : Unit → m (MLList m β)) (hcons : α → MLList m α → m (MLList m β)) : MLList m β :=
squash fun _ => do
match ← xs.uncons with
| none => hnil ()
| some (x, xs) => hcons x xs
unsafe def cases [Monad m] (xs : MLList m α)
(hnil : Unit → MLList m β) (hcons : α → MLList m α → MLList m β) : MLList m β :=
xs.casesM (fun _ => return hnil ()) (fun x xs => return hcons x xs)
unsafe def append [Monad m] (xs : MLList m α) (ys : Unit → MLList m α) : MLList m α :=
xs.cases ys fun x xs => cons x (append xs ys)
unsafe def bind [Monad m] (xs : MLList m α) (f : α → MLList m β) : MLList m β :=
xs.cases (fun _ => nil) fun x xs => append (f x) (fun _ => bind xs f)
unsafe instance [Monad m] : Monad (MLList m) where
pure a := cons a nil
bind := bind
open Lean Meta
set_option trace.compiler.ir.result true
unsafe def g (n : Nat) : MLList IO Nat := do
for i in [:n] do
if true then
continue
return 4
open Lean Meta
#eval show IO _ from do
let x : MLList IO Nat := g 10000
if let .squash m := x then
let x ← m ()
Mario Carneiro (Nov 29 2023 at 02:33):
the stack overflow occurs in MLList.casesM._elambda_1._at.MLList.g._spec_6
which looks like this:
def MLList.casesM._elambda_1._at.MLList.g._spec_6 (x_1 : obj) (x_2 : obj) (x_3 : obj) (x_4 : obj) (x_5 : @& obj) (x_6 : obj) : obj :=
let x_7 : obj := proj[1] x_3;
inc x_7;
let x_8 : obj := MLList.uncons._rarg x_3 x_4;
let x_9 : obj := pap MLList.casesM._elambda_1._at.MLList.g._spec_6._lambda_1 x_1 x_2;
let x_10 : obj := app x_7 ◾ ◾ x_8 x_9 x_6;
ret x_10
This doesn't look recursive but I believe the app
call at x_10
is loading a function which is itself set up to call MLList.casesM._elambda_1._at.MLList.g._spec_6
Mario Carneiro (Nov 29 2023 at 04:25):
minimized some more:
set_option trace.compiler.extract_closed false
set_option trace.compiler.ir true
set_option genSizeOfSpec false in
inductive MLList (α : Type u) : Type u
| cons : α → MLList α
| squash : (Unit → Option (MLList α)) → MLList α
deriving Inhabited
namespace MLList
partial def uncons : MLList α → Option α
| .squash t => t () >>= uncons
| .cons x => return x
def cases (xs : MLList α) (hcons : α → MLList β) : MLList β :=
squash fun _ => do
let x ← xs.uncons
return hcons x
def bind (xs : MLList α) (f : α → MLList β) : MLList β :=
xs.cases fun x => (f x).cases cons
def loop : Nat → MLList Nat
| 0 => cons 4
| n+1 => bind (cons ()) fun _ => loop n
#eval show Option _ from do
let x : MLList Nat := bind (loop 10000) cons
if let .squash m := x then
let x ← m ()
Mario Carneiro (Nov 29 2023 at 04:50):
I wish I knew a better way to do this other than by-hand evaluation, but we can work out the evaluation sequence for this definition:
· bind (loop 10000) cons
· bind (bind (cons ()) fun _ => loop 9999) cons
· bind ((cons ()).cases fun _ => (loop 9999).cases cons) cons
· bind (squash fun _ => do
let x ← uncons (cons ())
return (loop 9999).cases cons) cons
· cases
(squash fun _ => do
let x ← uncons (cons ())
return (loop 9999).cases cons)
(fun x => (cons x).cases cons)
· squash fun _ => do
let x ← uncons (squash fun _ => do
let x ← uncons (cons ())
return (loop 9999).cases cons)
return (cons x).cases cons
It stops here, but after pattern matching on .squash m
and evaluating m ()
it is run some more:
· uncons (squash fun _ =>
uncons (cons ())
>>= fun x => return (loop 9999).cases cons)
>>= fun x => return (cons x).cases cons
· uncons (cons ())
>>= fun x => return (loop 9999).cases cons
>>= uncons
>>= fun x => return (cons x).cases cons
· pure x
>>= fun x => return (loop 9999).cases cons
>>= uncons
>>= fun x => return (cons x).cases cons
· pure ((squash fun _ =>
uncons (cons ())
>>= fun x => return (loop 9998).cases cons).cases cons)
>>= uncons
>>= fun x => return (cons x).cases cons
· pure (squash fun _ =>
uncons (squash fun _ =>
uncons (cons ())
>>= fun x => return (loop 9998).cases cons)
>>= fun x => return cons x)
>>= uncons
>>= fun x => return (cons x).cases cons
· uncons (squash fun _ =>
uncons (squash fun _ =>
uncons (cons ())
>>= fun x => return (loop 9998).cases cons)
>>= fun x => return cons x)
>>= fun x => return (cons x).cases cons
· uncons (squash fun _ =>
uncons (cons ())
>>= fun x => return (loop 9998).cases cons)
>>= fun x => return cons x
>>= uncons
>>= fun x => return (cons x).cases cons
· uncons (cons ())
>>= fun x => return (loop 9998).cases cons
>>= uncons
>>= fun x => return cons x
>>= uncons
>>= fun x => return (cons x).cases cons
...
At this point the repeating pattern is clear, this will keep accumulating >>= uncons
lines 10000 deep and run out of stack space
Mario Carneiro (Nov 29 2023 at 05:24):
which can be summarized as:
· uncons ((loop (n+1)).cases cons)
===>
· uncons (uncons ((loop n).cases cons) >>= fun x => return cons x)
In other words, the MLList monad does not preserve tail-recursiveness
Scott Morrison (Nov 29 2023 at 06:12):
Where does this leave us?
- MLList is just intrinsically limited (and we should put a big warning in the module doc)?
- Do you think we can add an extra operation inside the implementation of
MLList
that "supports tail recursion"...? Not really sure what that would mean.
Mario Carneiro (Nov 29 2023 at 06:13):
I have a fix, at least it fixes the OP example
Mario Carneiro (Nov 29 2023 at 06:14):
I added a MLList.uncons?
operation which returns none
in the thunk
and squash
cases, and use it in cases
and bind
to avoid building up stacks of thunks if it's just bind (pure a) ...
Mario Carneiro (Nov 29 2023 at 06:17):
Jovan Gerbscheid (Nov 29 2023 at 14:17):
I played around a bit more and found that even on a normal list monad, this makes lean crash. I really don't understand why:
instance : Monad List where
pure := List.pure
bind := List.bind
def g (n : Nat) : List Nat := do
for _ in [:n] do
if true then
continue
return 4
#eval ((g 10000))
Jovan Gerbscheid (Nov 30 2023 at 01:29):
Here's a more minimal example:
def myBind : Option α → (α → Option β) → Option β := Option.bind
instance : Monad Option where
pure := some
bind := myBind
#eval Nat.foldM (fun _ _ => some ()) () 10000
The only difference between myBind
and the default Option.bind
is that I have removed the @[inline]
tag. If the @[inline]
is present, it runs fine, but if it's not, then Lean crashes.
My question is: is this stack overflow expected?
Mario Carneiro (Nov 30 2023 at 03:46):
yes, this is basically the same issue as for MLList
Mario Carneiro (Nov 30 2023 at 03:46):
The evaluation sequence I showed above also applies in the list case, it stacks up a bunch of .. |>.append [] |>.append []
operations and so the evaluation uses up all the stack
Jovan Gerbscheid (Dec 19 2023 at 10:59):
I found a way to solve this problem for List, by handling lists of length 1 separately:
@[inline]
def myListBind (a : List α) (b : α → List β) : List β :=
match a with
| [a] => b a
| _ => List.bind a b
instance : Monad List where
pure := List.pure
bind := myListBind
#eval Nat.foldM (fun _ _ => List.pure ()) () 1000000
Is such a fix something you would want to have in Lean?
Mario Carneiro (Dec 19 2023 at 11:30):
well, core lean doesn't even have a Monad List
instance, so there is nothing to fix
Jovan Gerbscheid (Dec 24 2023 at 00:43):
I was playing around more with MLList, seeing what would happen if I removed the lazyness, and then I encountered a kernel error message that I have never seen before. Here's the mwe:
import Lean
namespace MLList2
private structure Spec (m : Type u → Type u) where
listM : Type u → Type u
nil : listM α
cons : α → listM α → listM α
squash : (m (listM α)) → listM α
uncons : [Monad m] → listM α → m (Option (α × listM α))
instance : Nonempty (Spec m) := .intro
{ listM := fun _ => PUnit
nil := ⟨⟩
cons := fun _ _ => ⟨⟩
squash := fun _ => ⟨⟩
uncons := fun _ => pure none }
private unsafe inductive MLListImpl (m : Type u → Type u) (α : Type u) : Type u
| nil : MLListImpl m α
| cons : α → MLListImpl m α → MLListImpl m α
| squash : ( m (MLListImpl m α)) → MLListImpl m α
private unsafe def unconsImpl {m : Type u → Type u} [Monad m] :
MLListImpl m α → m (Option (α × MLListImpl m α))
| .nil => pure none
| .squash t => t >>= unconsImpl
| .cons x xs => return (x, xs)
@[inline] private unsafe def specImpl (m) : Spec m where
listM := MLListImpl m
nil := .nil
cons := .cons
squash := .squash
uncons := unconsImpl
@[implemented_by specImpl]
private opaque spec (m) : MLList2.Spec m
end MLList2
def MLList2 (m : Type u → Type u) (α : Type u) : Type u := (MLList2.spec m).listM α
namespace MLList2
@[inline] def nil : MLList2 m α := (MLList2.spec m).nil
@[inline] def cons : α → MLList2 m α → MLList2 m α := (MLList2.spec m).cons
def squash : ( m (MLList2 m α)) → MLList2 m α := (MLList2.spec m).squash
@[inline] def uncons [Monad m] : MLList2.{u} m α → m (Option (α × MLList2 m α)) :=
(MLList2.spec m).uncons
instance : Inhabited (MLList2 m α) := ⟨nil⟩
partial def mapNatTrans [Monad m] (f : ∀ {α}, m α → n α) (L : MLList2 m α) : MLList2 n α :=
squash $ f do
match ← L.uncons with
| none => pure nil
| some (x, xs) => return cons x (mapNatTrans f xs)
open Lean.Meta
def g : MLList2 MetaM Unit → MLList2 MetaM Unit := MLList2.mapNatTrans id
/-
(kernel) function expected
_x_163 a✝²² a✝²¹ a✝²⁰ a✝¹⁹ a✝¹⁶
-/
Is this a bug or is there something bad about non-lazy monad-controlled lists?
James Gallicchio (Dec 24 2023 at 06:56):
i think any time you see (kernel)
in the error it is a bug
Mario Carneiro (Dec 24 2023 at 07:08):
adding noncomputable
makes the error go away, it looks like a bug in the compiler
Jovan Gerbscheid (Jan 23 2024 at 17:30):
Although the original bug was fixed, I'm still not using MLList, because the same function still takes too much time:
def g (n : Nat) : MLList (ReaderT Unit IO) Nat := do
for _ in [:n] do
if true then
continue
return 4
#eval timeit "" $ MLList.force ((g 1000000)) ()
I get an average time of 1.08s, whereas my alternative list monad does this in 150ms. I noticed that if MLList uses thunk : (Unit → MLListImpl m α) → MLListImpl m α
instead of thunk : Thunk (MLListImpl m α) → MLListImpl m α
, then it takes 830ms, which is better, but still far off what it should be. When I replace it with just thunk : MLListImpl m α → MLListImpl m α
, I'd expect it to be at least as fast, but it gives a stack overflow, which I don't understand.
So I propose that either MLList is somehow made more efficient, or alternatively a strict version should be available, like what I'm using now.
Joe Hendrix (Jan 23 2024 at 22:30):
@Jovan Gerbscheid Do you have a pointer to your alternative and what you are using it for?
Jovan Gerbscheid (Jan 24 2024 at 01:45):
Here's how you can define the strict ListT
transformer:
def ListT (m : Type u → Type v) (α : Type u) : Type v :=
m (List α)
@[always_inline, inline]
def ListT.run {m : Type u → Type v} {α : Type u} (x : ListT m α) : m (List α) :=
x
namespace ListT
variable {m : Type u → Type v} [Monad m] {α β : Type u}
protected def mk (x : m (List α)) : ListT m α :=
x
@[always_inline, inline]
protected def pure (a : α) : ListT m α := ListT.mk do
pure [a]
@[always_inline, inline]
protected def bind (x : ListT m α) (f : α → ListT m β) : ListT m β := ListT.mk do
match (← x) with
| [] => return []
| [a] => f a
| x => x.foldrM (fun a bs => (· ++ bs) <$> f a) []
@[always_inline]
instance : Monad (ListT m) where
pure := ListT.pure
bind := ListT.bind
It is crucial to tag the definition of bind with @[inline], and to match on the length 1 list separately, otherwise my example gives a stack overflow.
Jovan Gerbscheid (Jan 24 2024 at 01:53):
I am using it to do a computation that gives multiple outcomes. I do this inside a MetaM
monad, and the problem with that is that ListT MetaM
is technically not a monad, because MetaM
is not commutative. However, I don't use any state-altering MetaM
functions, so this is fine. (Edit: technically inferType
is state-altering since it caches results, but this is still commutative)
In my case, the number of outcomes of the computation is usually 1, so it is important that bind is efficient when the lists have length 1, which is what I am testing with my example.
Jovan Gerbscheid (Feb 08 2024 at 21:11):
Here's a minor variation of my original code that crashed, but this still crashes, complaining about deep recursion:
import Std
def g (n : Nat) : MLList Id Nat := do
for _ in [:n] do
pure ()
return 4
#eval MLList.force (g 6601)
6601 seems to be the lowest value at which this crashes
Last updated: May 02 2025 at 03:31 UTC