Zulip Chat Archive
Stream: lean4
Topic: MLList crash
Jovan Gerbscheid (Nov 29 2023 at 01:41):
In my DiscrTree
project, I was told to use MLList
as a list monad transformer, but when I tried this, it turned out to be terribly slow. So I decided to do a little test of its speed. But the following simple program made my lean crash:
import Std
open Lean Meta
def g (n : Nat) : MLList MetaM Nat := do
for i in [:n] do
if true then
continue
return 4
#eval MLList.force $ (g 10000)
telling me that
Message: Server process for file:///(.......)/lake-packages/std/Std/Data/MLList/test.lean crashed, likely due to a stack overflow or a bug.
Code: -32902
What's going on here?
Scott Morrison (Nov 29 2023 at 01:49):
Yeah, force
is bad. asArray
works fine.
Jovan Gerbscheid (Nov 29 2023 at 01:50):
I replaced force
by asArray
and get the same error.
Scott Morrison (Nov 29 2023 at 01:53):
Strange. I get #[4]
.
Scott Morrison (Nov 29 2023 at 01:53):
Have you restarted the server?
Jovan Gerbscheid (Nov 29 2023 at 01:53):
for me it works with 2000 and not with 3000
Jovan Gerbscheid (Nov 29 2023 at 01:55):
I just reloaded my VSCode, and get the same result
Scott Morrison (Nov 29 2023 at 01:56):
Yeah, okay, me too!
Scott Morrison (Nov 29 2023 at 02:01):
(As in, I was being confused by the error messages, and indeed mine also crashes at 3000, for either force or toArray.)
Mario Carneiro (Nov 29 2023 at 02:16):
Looks like removing @[specialize]
from casesM
makes the issue go away
Jovan Gerbscheid (Nov 29 2023 at 02:18):
I did this and ran the same test in the MLList/Basic.lean file, but I still get the error
Scott Morrison (Nov 29 2023 at 02:19):
Yeah, removing @[specialize] is not working for me.
Mario Carneiro (Nov 29 2023 at 02:27):
here's a MWE:
import Lean
unsafe inductive MLList (m : Type u → Type u) (α : Type u) : Type u
| nil : MLList m α
| cons : α → MLList m α → MLList m α
| thunk : Thunk (MLList m α) → MLList m α
| squash : (Unit → m (MLList m α)) → MLList m α
namespace MLList
unsafe def uncons {m : Type u → Type u} [Monad m] :
MLList m α → m (Option (α × MLList m α))
| .nil => pure none
| .thunk t => uncons t.get
| .squash t => t () >>= uncons
| .cons x xs => return (x, xs)
@[specialize]
unsafe def casesM [Monad m] (xs : MLList m α)
(hnil : Unit → m (MLList m β)) (hcons : α → MLList m α → m (MLList m β)) : MLList m β :=
squash fun _ => do
match ← xs.uncons with
| none => hnil ()
| some (x, xs) => hcons x xs
unsafe def cases [Monad m] (xs : MLList m α)
(hnil : Unit → MLList m β) (hcons : α → MLList m α → MLList m β) : MLList m β :=
xs.casesM (fun _ => return hnil ()) (fun x xs => return hcons x xs)
unsafe def append [Monad m] (xs : MLList m α) (ys : Unit → MLList m α) : MLList m α :=
xs.cases ys fun x xs => cons x (append xs ys)
unsafe def bind [Monad m] (xs : MLList m α) (f : α → MLList m β) : MLList m β :=
xs.cases (fun _ => nil) fun x xs => append (f x) (fun _ => bind xs f)
unsafe instance [Monad m] : Monad (MLList m) where
pure a := cons a nil
bind := bind
open Lean Meta
set_option trace.compiler.ir.result true
unsafe def g (n : Nat) : MLList IO Nat := do
for i in [:n] do
if true then
continue
return 4
open Lean Meta
#eval show IO _ from do
let x : MLList IO Nat := g 10000
if let .squash m := x then
let x ← m ()
Mario Carneiro (Nov 29 2023 at 02:33):
the stack overflow occurs in MLList.casesM._elambda_1._at.MLList.g._spec_6
which looks like this:
def MLList.casesM._elambda_1._at.MLList.g._spec_6 (x_1 : obj) (x_2 : obj) (x_3 : obj) (x_4 : obj) (x_5 : @& obj) (x_6 : obj) : obj :=
let x_7 : obj := proj[1] x_3;
inc x_7;
let x_8 : obj := MLList.uncons._rarg x_3 x_4;
let x_9 : obj := pap MLList.casesM._elambda_1._at.MLList.g._spec_6._lambda_1 x_1 x_2;
let x_10 : obj := app x_7 ◾ ◾ x_8 x_9 x_6;
ret x_10
This doesn't look recursive but I believe the app
call at x_10
is loading a function which is itself set up to call MLList.casesM._elambda_1._at.MLList.g._spec_6
Mario Carneiro (Nov 29 2023 at 04:25):
minimized some more:
set_option trace.compiler.extract_closed false
set_option trace.compiler.ir true
set_option genSizeOfSpec false in
inductive MLList (α : Type u) : Type u
| cons : α → MLList α
| squash : (Unit → Option (MLList α)) → MLList α
deriving Inhabited
namespace MLList
partial def uncons : MLList α → Option α
| .squash t => t () >>= uncons
| .cons x => return x
def cases (xs : MLList α) (hcons : α → MLList β) : MLList β :=
squash fun _ => do
let x ← xs.uncons
return hcons x
def bind (xs : MLList α) (f : α → MLList β) : MLList β :=
xs.cases fun x => (f x).cases cons
def loop : Nat → MLList Nat
| 0 => cons 4
| n+1 => bind (cons ()) fun _ => loop n
#eval show Option _ from do
let x : MLList Nat := bind (loop 10000) cons
if let .squash m := x then
let x ← m ()
Mario Carneiro (Nov 29 2023 at 04:50):
I wish I knew a better way to do this other than by-hand evaluation, but we can work out the evaluation sequence for this definition:
· bind (loop 10000) cons
· bind (bind (cons ()) fun _ => loop 9999) cons
· bind ((cons ()).cases fun _ => (loop 9999).cases cons) cons
· bind (squash fun _ => do
let x ← uncons (cons ())
return (loop 9999).cases cons) cons
· cases
(squash fun _ => do
let x ← uncons (cons ())
return (loop 9999).cases cons)
(fun x => (cons x).cases cons)
· squash fun _ => do
let x ← uncons (squash fun _ => do
let x ← uncons (cons ())
return (loop 9999).cases cons)
return (cons x).cases cons
It stops here, but after pattern matching on .squash m
and evaluating m ()
it is run some more:
· uncons (squash fun _ =>
uncons (cons ())
>>= fun x => return (loop 9999).cases cons)
>>= fun x => return (cons x).cases cons
· uncons (cons ())
>>= fun x => return (loop 9999).cases cons
>>= uncons
>>= fun x => return (cons x).cases cons
· pure x
>>= fun x => return (loop 9999).cases cons
>>= uncons
>>= fun x => return (cons x).cases cons
· pure ((squash fun _ =>
uncons (cons ())
>>= fun x => return (loop 9998).cases cons).cases cons)
>>= uncons
>>= fun x => return (cons x).cases cons
· pure (squash fun _ =>
uncons (squash fun _ =>
uncons (cons ())
>>= fun x => return (loop 9998).cases cons)
>>= fun x => return cons x)
>>= uncons
>>= fun x => return (cons x).cases cons
· uncons (squash fun _ =>
uncons (squash fun _ =>
uncons (cons ())
>>= fun x => return (loop 9998).cases cons)
>>= fun x => return cons x)
>>= fun x => return (cons x).cases cons
· uncons (squash fun _ =>
uncons (cons ())
>>= fun x => return (loop 9998).cases cons)
>>= fun x => return cons x
>>= uncons
>>= fun x => return (cons x).cases cons
· uncons (cons ())
>>= fun x => return (loop 9998).cases cons
>>= uncons
>>= fun x => return cons x
>>= uncons
>>= fun x => return (cons x).cases cons
...
At this point the repeating pattern is clear, this will keep accumulating >>= uncons
lines 10000 deep and run out of stack space
Mario Carneiro (Nov 29 2023 at 05:24):
which can be summarized as:
· uncons ((loop (n+1)).cases cons)
===>
· uncons (uncons ((loop n).cases cons) >>= fun x => return cons x)
In other words, the MLList monad does not preserve tail-recursiveness
Scott Morrison (Nov 29 2023 at 06:12):
Where does this leave us?
- MLList is just intrinsically limited (and we should put a big warning in the module doc)?
- Do you think we can add an extra operation inside the implementation of
MLList
that "supports tail recursion"...? Not really sure what that would mean.
Mario Carneiro (Nov 29 2023 at 06:13):
I have a fix, at least it fixes the OP example
Mario Carneiro (Nov 29 2023 at 06:14):
I added a MLList.uncons?
operation which returns none
in the thunk
and squash
cases, and use it in cases
and bind
to avoid building up stacks of thunks if it's just bind (pure a) ...
Mario Carneiro (Nov 29 2023 at 06:17):
Jovan Gerbscheid (Nov 29 2023 at 14:17):
I played around a bit more and found that even on a normal list monad, this makes lean crash. I really don't understand why:
instance : Monad List where
pure := List.pure
bind := List.bind
def g (n : Nat) : List Nat := do
for _ in [:n] do
if true then
continue
return 4
#eval ((g 10000))
Jovan Gerbscheid (Nov 30 2023 at 01:29):
Here's a more minimal example:
def myBind : Option α → (α → Option β) → Option β := Option.bind
instance : Monad Option where
pure := some
bind := myBind
#eval Nat.foldM (fun _ _ => some ()) () 10000
The only difference between myBind
and the default Option.bind
is that I have removed the @[inline]
tag. If the @[inline]
is present, it runs fine, but if it's not, then Lean crashes.
My question is: is this stack overflow expected?
Mario Carneiro (Nov 30 2023 at 03:46):
yes, this is basically the same issue as for MLList
Mario Carneiro (Nov 30 2023 at 03:46):
The evaluation sequence I showed above also applies in the list case, it stacks up a bunch of .. |>.append [] |>.append []
operations and so the evaluation uses up all the stack
Jovan Gerbscheid (Dec 19 2023 at 10:59):
I found a way to solve this problem for List, by handling lists of length 1 separately:
@[inline]
def myListBind (a : List α) (b : α → List β) : List β :=
match a with
| [a] => b a
| _ => List.bind a b
instance : Monad List where
pure := List.pure
bind := myListBind
#eval Nat.foldM (fun _ _ => List.pure ()) () 1000000
Is such a fix something you would want to have in Lean?
Mario Carneiro (Dec 19 2023 at 11:30):
well, core lean doesn't even have a Monad List
instance, so there is nothing to fix
Last updated: Dec 20 2023 at 11:08 UTC