Zulip Chat Archive

Stream: lean4

Topic: Monadic stream class


Tage Johansson (Apr 11 2023 at 14:44):

Sorry if I am getting something wrong here. But I notice the definition of the Stream class:

class Stream (stream : Type u) (value : outParam (Type v)) : Type (max u v) where
  next? : stream  Option (value × stream)

Where it doesn't seem to be possible to have a stream where the next function returns a custom monad.

The ForM class does allow a custom monad, but it seems to me that the ForM class is less flexible. I can for example not think of any satisfying way of implementing zip with ForM, but please correct me if I'm wrong on this point.

I would like a stream class parameterized over any monad, like this:

class StreamM (m : Type (max u v)  Type w) (stream : Type u) (value : outParam (Type v)) where
  next? [Monad m] : stream  m (Option (value × stream))

Or perhaps with StateT:

class StreamM (m : Type u  Type v) (stream : Type u) (value : outParam (Type u)) where
  next? [Monad m] : StateT stream m (Option value)

Is there any good reason why such a class doesn't exist? Or does it already exist?

Thanks,
Tage

James Gallicchio (Apr 11 2023 at 17:27):

as I understand it, the current Stream class is designed for pure code. see https://leanprover.zulipchat.com/#narrow/stream/270676-lean4/topic/Core.20Stream.20class for some discussion

James Gallicchio (Apr 11 2023 at 17:31):

ForM is definitely limited to internal iteration, which for most cases is fine, but which is not fine for e.g. zipping, as you pointed out. It's also a bit of an awkward class to use, since it is largely intended as a notational typeclass.

James Gallicchio (Apr 11 2023 at 17:35):

YatimaStdLib, LeanColls, and SciLean all have their own iterator/iterable classes that are mostly equivalent. I don't think that any of them (?) let the next be in a monad, because you don't really need it for most applications. But I don't see any particular reason not to support it

Tage Johansson (Apr 11 2023 at 18:35):

There are many cases when it makes sence to put next inside a monad. For example a fallible iterator, or an an asynchronous stream, or both an fallible and asynchronous stream like IO.

James Gallicchio (Apr 11 2023 at 18:51):

Totally. The only related work I know so far is yatima's Straume library & typeclass, which is intended to model monadic and buffered streams

James Gallicchio (Apr 11 2023 at 18:51):

But it seems like that's also a WIP abstraction :joy:

Scott Morrison (Apr 11 2023 at 19:23):

There's ListM in mathlib4.

Tage Johansson (Apr 11 2023 at 20:10):

The straume library looks cool but I just want a simple stream that can be used in for loops.

ListM exists in mathlib, but I don't really like the definition of it, in particular because it is marked unsafe. I am not an expert in Lean, but it sounds scary.

I'll probably end up implementing my own StreamM class and create an instance of ForIn for it.

James Gallicchio (Apr 11 2023 at 20:11):

Would be good to experiment with it and see if any issues come up. Seems reasonable to include in Std.

François G. Dorais (Apr 11 2023 at 21:00):

In the earlier discussion mentioned by @James Gallicchio, I arrived at the conclusion that the simplest and most versatile stream class is

class Stream (σ : Type _) (m : Type _  Type _) (α : outParam (Type _)) where
  next : σ  m (α × σ)

Option is not necessary since that's the special case Stream σ (OptionT m) α. In particular, the core stream class is just Stream σ Option α.

Tage Johansson (Apr 11 2023 at 21:18):

I come up with the following. Feel free to comment on my code.

class StreamM (m : Type _  Type _) (σ : Type _) (α : outParam (Type _)) where
  next [Monad m] : σ  m (α × σ)

def StreamM.forIn [StreamM m σ α] [Monad m] (xs : σ) (b : β) (f : α  β  m (ForInStep β)) : m β := do
  let mut b := b
  let mut xs := xs
  repeat do
    let (x, xsNew)  StreamM.next xs
    match  f x b with
      | .yield bNew => do
        b := bNew
        xs := xsNew
      | .done bNew => do
        b := bNew
        break
  pure b
instance [StreamM m σ α] : ForIn m σ α where
  forIn := StreamM.forIn

First I tried to implement StreamM.forIn as a recursive function, but it didn't worked at all. Even when I marked it partial, the compiler failed with some strange error. Then I looked into the definition of the repeat statement. In particular Lean.Loop.forIn and noticed that it is a partial definition which is assigned to ForIn.forIn which is not marked as partial. Is it legal to just escape partiality in that way?

François G. Dorais (Apr 11 2023 at 21:43):

The issue is that, even though you have the (b : β) argument, Lean can't infer an instance of Inhabited β which is needed for partial. Another workaround is:

private partial def StreamM.forInAux [StreamM m σ α] [Monad m] [Inhabited β] (xs : σ) (b : β) (f : α  β  m (ForInStep β)) : m β := do
  let (x, xs)  StreamM.next xs
  match  f x b with
  | .yield b => forInAux xs b f
  | .done b => pure b

def StreamM.forIn [StreamM m σ α] [Monad m] (xs : σ) (b : β) (f : α  β  m (ForInStep β)) : m β := do
  let _ : Inhabited β := b
  forInAux xs b f

You could even use default instead of the b argument but that's just piling on silliness on top of silliness.

François G. Dorais (Apr 11 2023 at 21:46):

Except for the order of arguments, I've been using this class in personal work for a while with few issues. I think it's a worthy addition to Std4 but there's no rush since development of Std4 has a backlog right now.

Eric Wieser (Apr 11 2023 at 21:53):

Tage Johansson said:

ListM exists in mathlib, but I don't really like the definition of it, in particular because it is marked unsafe. I am not an expert in Lean, but it sounds scary.

docs4#ListM explains why this is:

The inductive construction is unsafe [because] we can build infinite objects

In practice, this means you can't prove anything about operations using ListM; but nor you can prove much about partial either.

François G. Dorais (Apr 11 2023 at 21:53):

(Actually, my version of StreamM doesn't have the [Monad m] argument for StreamM.next. I don't think it makes a big difference though.)

James Gallicchio (Apr 11 2023 at 21:57):

Yeah. All of this does not work in proof context :P I did a bunch of experimenting in LeanColls with well-founded external iteration, and it's definitely possible, but we need better tools for monadic reasoning first (which is why I stopped working on LeanColls...)

François G. Dorais (Apr 11 2023 at 21:59):

@Eric Wieser But isn't the issue just that pretty much all operations are inherently partial? Isn't unsafe a bit extreme then? At least, partial isn't as contagious.

James Gallicchio (Apr 11 2023 at 22:00):

note that lean doesn't let you state the inductive otherwise

François G. Dorais (Apr 11 2023 at 22:03):

Right! Because m (Option α × ListM m α)!

Mario Carneiro (Apr 11 2023 at 23:21):

Eric Wieser said:

Tage Johansson said:

ListM exists in mathlib, but I don't really like the definition of it, in particular because it is marked unsafe. I am not an expert in Lean, but it sounds scary.

docs4#ListM explains why this is:

The inductive construction is unsafe [because] we can build infinite objects

In practice, this means you can't prove anything about operations using ListM; but nor you can prove much about partial either.

It's worse than that, and you are right to be worried about the unsafe marking. partial means that you can't prove anything about the object, but unsafe means that the object's existence itself could potentially lead to a contradiction, and that is in fact the case here:

import Mathlib.Data.ListM

unsafe example : False :=
  let M (α : Type) : Type := PLift (Subsingleton α)
  have : Subsingleton (Option Empty) := fun | none, none => rfl
  have : ¬ Subsingleton (ListM M Empty) := fun h =>
    nomatch h ListM.nil (ListM.cons inferInstance⟩)
  have H : M (Option Empty × ListM M Empty)  False := fun ⟨⟨h⟩⟩ =>
    this fun a b => by cases h (none, a) (none, b); rfl
  this fun
    | ListM.nil, ListM.nil => rfl
    | ListM.cons h, _ | _, ListM.cons h => nomatch H h

Mario Carneiro (Apr 11 2023 at 23:23):

This is basically a variation on the proof that

inductive Foo (M : Prop -> Prop) : Prop
  | mk : M Foo -> Foo

is inconsistent by taking M := Not to get the liar paradox.

Eric Wieser (Apr 11 2023 at 23:48):

Isn't unsafe example : False trivial to prove another way anyway though?

Eric Wieser (Apr 11 2023 at 23:49):

IMO the takeaway is "never use unsafe on proofs".

James Gallicchio (Apr 12 2023 at 00:06):

yes, but if ListM were allowed by the kernel without unsafe, it would lead to contradiction outside of unsafe.

François G. Dorais (Apr 12 2023 at 00:12):

I guess the StreamM option is the best. I think this is an instance of: do you really need the monad m or do you just need the category of m-algebras?

Jannis Limperg (Apr 12 2023 at 15:34):

I would recommend that anyone interested in this look into Haskell's streaming libraries: pipes, conduit, streaming(?), streamly(?), etc. People have already explored a lot of this design space and there seem to be some non-obvious potential pitfalls.

Scott Morrison (Apr 24 2023 at 11:03):

@Tage Johansson, @François G. Dorais, you might be interested to see that ListM is now safe! :-) !4#3559.

Mac Malone (Apr 27 2023 at 00:56):

François G. Dorais said:

In the earlier discussion mentioned by James Gallicchio, I arrived at the conclusion that the simplest and most versatile stream class is

class Stream (σ : Type _) (m : Type _  Type _) (α : outParam (Type _)) where
  next : σ  m (α × σ)

What about next : StateT σ m α? After all, that next is just the long-form definition of StateT.

Mac Malone (Apr 27 2023 at 00:59):

What this highlights is that the most general monadic 'Stream' operation is actually just a monad equipped with a next or continue function that advances the internal iterator state one step (with operations like get and/or set to retrieve and/or manipulate said iterator).

Mac Malone (Apr 27 2023 at 01:01):

That is, it might be better to describe the class as MonadNext where next : m a or the like (with the StateT version being a specialization).


Last updated: Dec 20 2023 at 11:08 UTC