Zulip Chat Archive

List comprehension syntax doesn't really exist, but you can use do notation for it if you are downstream of the Monad List definition, or use filter and bind otherwise

Shreyas Srinivas (Apr 30 2024 at 21:02):

Asei Inoue said:

Does this mean that list comprehensions are not available in Lean 4?

Is that related to laziness. I guess you can have list comprehensions syntax by translating the comprehension into a def that constructs the list.

Mario Carneiro (Apr 30 2024 at 21:07):

do notation is not too far from list comprehension syntax, but it does require that the type have a Monad instance or something like it

Mario Carneiro (Apr 30 2024 at 21:08):

you can also add a monad instance in 3 lines if you want it for your project

Patrick Massot (Apr 30 2024 at 21:10):

Kyle Miller (Apr 30 2024 at 22:21):

Here's another version of this, which is a lot more flexible, collect_array ... yield ... expressions for Array. It's just defining a monad with Array state and using do notation with a different head keyword.

-- Array comprehension
#eval collect_array for i in [1:20] do for j in [1:20] do for k in [1:20] do
        if i^2 + j^2 = k^2 then yield (i,j,k)
/-
#[(3, 4, 5), (4, 3, 5), (5, 12, 13), (6, 8, 10), (8, 6, 10), (8, 15, 17), (9, 12, 15), (12, 5, 13), (12, 9, 15),
  (15, 8, 17)]
-/

-- Interleave IO actions
#eval do_array[IO]
  for i in [1:10] do
    if i % 2 = 0 then
      IO.println s!"yielding {i}"
      yield i
/-
yielding 2
yielding 4
yielding 6
yielding 8
#[2, 4, 6, 8]
-/

code

-- probably should have used WriterT, in Mathlib.Control.Monad.Writer
abbrev ArrayT (α : Type) := StateT (Array α)

abbrev ArrayM (α : Type) := ArrayT α Id

section
variable {α : Type} {m : Type → Type} [Monad m]

def ArrayT.yield (x : α) : ArrayT α m Unit := modify fun s => s.push x

def ArrayT.yield' (a : Array α) : ArrayT α m Unit := modify fun s => s ++ a

def ArrayT.run' (x : ArrayT α m Unit) : m (Array α) := Prod.snd <$> StateT.run x #[]

def ArrayM.run' (x : ArrayM α Unit) : Array α := Id.run <| Prod.snd <$> StateT.run x #[]

macro "yield " x:term : doElem => `(doElem| ArrayT.yield $x)
macro "yield' " x:term : doElem => `(doElem| ArrayT.yield' $x)

macro "do_array " s:doSeq : term => `(ArrayT.run' do $s)
macro "do_array" noWs "[" m:term "] " s:doSeq : term => `((ArrayT.run' do $s : $m _))

macro "collect_array " s:doSeq : term => `(ArrayM.run' do $s)

end

-- Array comprehension
#eval collect_array for i in [1:20] do for j in [1:20] do for k in [1:20] do
        if i^2 + j^2 = k^2 then yield (i,j,k)
/-
#[(3, 4, 5), (4, 3, 5), (5, 12, 13), (6, 8, 10), (8, 6, 10),
  (8, 15, 17), (9, 12, 15), (12, 5, 13), (12, 9, 15), (15, 8, 17)]
-/

-- Interleave IO actions
#eval do_array[IO]
  for i in [1:10] do
    if i % 2 = 0 then
      IO.println s!"yielding {i}"
      yield i
/-
yielding 2
yielding 4
yielding 6
yielding 8
#[2, 4, 6, 8]
-/

Jovan Gerbscheid (May 01 2024 at 00:29):

I've also had this problem of needing a nondeterminism monad. MLList was not useable in my use case due to performance, so I'm using a strict monad that is essentially the same as Monad List. It also has performance issues, but these are acceptable. But I just saw this message by @Mario Carneiro defining a nondeterminism monad in a lazy way using Array. I have extended this definition to be a monad transformer. It seems to me that this behaves identically to MLList for the purpose of being a nondeterminism monad, but it is very performant.

def ArrayT (m : Type u → Type v) (α : Type w) :=
  ∀ κ : Type u, (α → Array κ → m (Array κ)) → Array κ → m (Array κ)

variable [Monad m]

instance : Monad (ArrayT m) where
  pure a _ k := k a
  bind f g _ k := f _ (g · _ k)

instance : MonadLift List (ArrayT m) where
  monadLift l _ k := l.foldlM (fun r a => k a r)
instance : MonadLift Array (ArrayT m) where
  monadLift l _ k := l.foldlM (fun r a => k a r)

instance : MonadLift m (ArrayT m) where
  monadLift x _ k arr := do k (← x) arr

def ArrayT.run (x : ArrayT m α) : m (Array α) :=
  x _ (fun a r => return r.push a) #[]

instance : Alternative (ArrayT m) where
  failure _ _ := pure
  orElse a b _ k arr := a _ k arr >>= b () _ k


#eval ArrayT.run (m := Option) do
  let l ← some 100
  let a ← [1, 2, 3]
  let b ← #[1, 2, 3]
  return l + 10 * a + b
-- some #[111, 112, 113, 121, 122, 123, 131, 132, 133]

def foo (n : Nat) : ArrayT Id Nat := do
  for _ in [:n] do
    pure () <|> pure ()
  return 4

#eval (foo 20).run.size == 2^20 -- true

The function foo handles 2^20 case splits in less than a second. All the other nondeterminism monads I have seen so far in Lean crash when trying to do this many splits. Has this implementation not been considered before or am I misunderstanding something?

Jovan Gerbscheid (May 01 2024 at 01:02):

You can use it for comprehension as well:

def ArrayT.ofForIn [ForIn m ρ α] (as : ρ) : ArrayT m α := fun _ k arr => do
  let mut arr := arr
  for a in as do
    arr ← k a arr
  return arr

#eval ArrayT.run (m := Id) do
  let x ← ArrayT.ofForIn [1:20]
  let y ← ArrayT.ofForIn [1:20]
  let z ← ArrayT.ofForIn [1:20]
  guard (x^2 + y^2 = z^2)
  return (x,y,z)
/-
#[(3, 4, 5), (4, 3, 5), (5, 12, 13), (6, 8, 10), (8, 6, 10), (8, 15, 17), (9, 12, 15), (12, 5, 13), (12, 9, 15),
  (15, 8, 17)]
-/

Asei Inoue (May 01 2024 at 04:51):

Thank you all for your kindness!
I think it would be a good idea to add a note in the Lean Manual as to why developers deleted the List Monad instance.

Jovan Gerbscheid (May 01 2024 at 14:11):

I've tried to implement ArrayT into my code, but it is not able to handle deep recursion. The following crashes Lean:

def ArrayT (m : Type u → Type v) (α : Type w) :=
  ∀ κ : Type u, (α → Array κ → m (Array κ)) → Array κ → m (Array κ)

variable [Monad m]

instance : Monad (ArrayT m) where
  pure a _ k := k a
  bind f g _ k := f _ (g · _ k)
def ArrayT.run (x : ArrayT m α) : m (Array α) :=
  x _ (fun a r => return r.push a) #[]

partial def bar (n : Nat) : ArrayT Id Nat := do
  if n==0 then
    return 4
  else
    pure ()
  bar (n-1)


#eval ArrayT.run  (bar 5000)

On this example List also fails, but MLList works fine. My understanding of how return works, is that this function is the same as

partial def bar (n : Nat) : ArrayT Id Nat := do
  if n==0 then
    return 4
  else
    pure ()
    bar (n-1)

(with an extra let-expression abstracting the bar (n-1)). But somehow, this example does not crash. Is my understanding of do notation flawed?

There is also this example:

def foo (n : Nat) : MLList Id Nat := do
  for _ in [:n] do
    pure ()
  return 4

#eval MLList.force  (foo 6000)

This crashes Lean, while the same with ArrayT or List works fine. In my mind, all of these examples are just a sequence of n times pure () followed by pure 4. But they all give different results.

Jovan Gerbscheid (May 01 2024 at 15:10):

The do notation looks like it works how I thought. So, I can make two functions that give the exact same output when using #print, but one crashes and the other doesn't:

def ArrayT (m : Type u → Type v) (α : Type w) :=
  ∀ κ : Type u, (α → Array κ → m (Array κ)) → Array κ → m (Array κ)

variable [Monad m]

instance : Monad (ArrayT m) where
  pure a _ k := k a
  bind f g _ k := f _ (g · _ k)
def ArrayT.run (x : ArrayT m α) : m (Array α) :=
  x _ (fun a r => return r.push a) #[]

def bar (n : Nat) : ArrayT Id Nat := do
  match n with
  | 0 =>
    return 4
  | _+1 =>
    pure ()
  bar (n-1)
decreasing_by sorry

def bar' (n : Nat) : ArrayT Id Nat :=
    let __do_jp := fun y ↦
      bar' (n-1)
    match n with
    | 0 => pure 4
    | _+1 => do
      let y ← pure ()
      __do_jp y
decreasing_by sorry

#print bar
#print bar'
#eval ArrayT.run  (bar' 50000)
#eval ArrayT.run  (bar 50000)

Jovan Gerbscheid (May 01 2024 at 19:39):

I have now found an example where ArrayT crashes, but both List and MLList succeed:

def f (n : Nat) : ArrayT Id Nat := match n with
  | 0 => return 1
  | n+1 => do
    let x ← f n
    let y ← f n
    return x+y

#eval  (f 12).run

Because ArrayT doesn't store intermediate results, this example of adding 1 to itself 2^12 times creates a recursion depth of that size :pensive: . So it turns out that it really depends on the exact function that you're using which nondeterminism monad is the best. In my use case it is still the list-like one.

Last updated: May 02 2025 at 03:31 UTC