Zulip Chat Archive
Stream: lean4
Topic: Option do notation regression?
Zygimantas Straznickas (Mar 22 2021 at 23:40):
-- works
def test1 : IO Nat := do
let z ← IO.getNumHeartbeats
return z + 1
-- no longer works
def test2 : Option Nat := do
let x ← (some 5)
return x + 1 -- failed to synthesize instance Add (Option Nat)
Marc Huisinga (Mar 22 2021 at 23:44):
Option
is no longer a monad since https://github.com/leanprover/lean4/commit/9a5f2395134a2668ae778f32262b3105c277483a
Marc Huisinga (Mar 22 2021 at 23:45):
Code that used Option
as a monad before now seems to use OptionM.run
: https://github.com/leanprover/lean4/commit/04e3f21783e343a58eb245bcb9488dab59cb01a4
Mario Carneiro (Mar 23 2021 at 00:19):
Whaat? Why isn't Option
a monad? That's the classic example
Yakov Pechersky (Mar 23 2021 at 00:22):
Motivation:
Option
is data,OptionM
is control.
https://github.com/leanprover/lean4/commit/04e3f21783e343a58eb245bcb9488dab59cb01a4
Yakov Pechersky (Mar 23 2021 at 00:24):
I guess similar to how Haskell prefers to supply a StateT or state
over State?
Mario Carneiro (Mar 23 2021 at 00:25):
okay, but there are still uses of do
notation for constructing Option
data
Mario Carneiro (Mar 23 2021 at 00:26):
I mean, not all uses of do
notation are for control
Yakov Pechersky (Mar 23 2021 at 00:26):
:shrug:
Mario Carneiro (Mar 23 2021 at 00:27):
mathlib has plenty of other monads like the giry monad or filters or pfun and they can only very vaguely be interpreted as control structures
Yakov Pechersky (Mar 23 2021 at 00:30):
Perhaps this will make things like with_bot
or with_top
cleaner?
Mario Carneiro (Mar 23 2021 at 00:33):
I'm not sure what you mean. Those can be monads too
Joe Hendrix (Mar 23 2021 at 00:39):
Coming from Haskell, I also think it's reasonable for Option
to be a monad, and am not sure about the data Option
vs control OptionM
distinction. List
is perfectly reasonable a s a monad in Haskell.
I'd understand though if there is a compilation or performance reason for distinguishing the two, but don't see that referenced in the commit.
Yakov Pechersky (Mar 23 2021 at 00:43):
I mean that it might make with_bot T = OptionM ??? T
with something that isn't just Id for ??? ? Not sure.
Sebastian Ullrich (Mar 23 2021 at 07:58):
Joe Hendrix said:
List
is perfectly reasonable a s a monad in Haskell.
This is exactly the problem: Option
and List
are reasonable monads in Haskell, but in Lean are performance traps because of strictness. People automatically think of monads as computation and don't think twice about chaining them with *>
or <|>
, which is why it's a bad idea to make a strict data type a monad.
Gabriel Ebner (Mar 23 2021 at 09:58):
It's a surprising design choice. All strict languages that I can think of and which can express a monad type have a monad instance for the list type: e.g. Lean 3, scala (cats), or purescript off the top of my head.
Gabriel Ebner (Mar 23 2021 at 10:10):
While I can somewhat follow the argumentation of removing Monad List
(Lean does not implement stream fusion, so let's remove nice syntax instead..), I am completely baffled as to why Monad Option
is supposed to be problematic. All operations on Option are constant-time, and you don't need stream fusion to simplify f <$> g <$> x
to (f ∘ g) <$> x
(I would expect both expressions to be optimized to the same result after inlining <$>).
Sebastian Ullrich (Mar 23 2021 at 10:33):
It's not about the overhead of the operations themselves, but about that of eagerly evaluating an argument that shouldn't have been. If you do bigSearch1 <|> bigSearch2
on Option
and the optimizer decides it's over the inlining budget, you're screwed.
Sebastian Ullrich (Mar 23 2021 at 10:35):
A different, Lean 4-specific issue is
def f (xs : List Nat) : Option Nat := do
for x in xs do
if x > 10 then
return some x
return none
3:2:
type mismatch, 'for' has type
Option Nat : Type
but is expected to have type
Nat : Type
6:2:
application type mismatch
pure none
argument
none
has type
Option ?m.551 : Type ?u.550
but is expected to have type
Nat : Type
The user thought of Option
as data, but do
assumed it was computation
Gabriel Ebner (Mar 23 2021 at 10:38):
Sebastian Ullrich said:
It's not about the overhead of the operations themselves, but about that of eagerly evaluating an argument that shouldn't have been. If you do
bigSearch1 <|> bigSearch2
onOption
and the optimizer decides it's over the inlining budget, you're screwed.
This would be a reasonable argument, but the Alternative (Option α)
instance is still there. And then there's also fast && slow
, etc.
Gabriel Ebner (Mar 23 2021 at 10:39):
A different, Lean 4-specific issue is [...]
Oh, this is disappointing but also a good explanation. Thanks!
Sebastian Ullrich (Mar 23 2021 at 10:47):
Alternative Option
is gone, but the OrElse
instance is still there. I'm not completely sure what @Leonardo de Moura's reasoning there was, but at least it makes it harder to unknowingly use the operator on Option
in a combinator, since those usually use Alternative
I'd guess. If we wanted to be strict (heh) about it, we could rename the function Option.or
and add a separate Option.orElse
using a thunk, both without special syntax.
Leonardo de Moura (Mar 23 2021 at 14:42):
Sebastian Ullrich said:
Alternative Option
is gone, but theOrElse
instance is still there. I'm not completely sure what Leonardo de Moura's reasoning there was, but at least it makes it harder to unknowingly use the operator onOption
in a combinator, since those usually useAlternative
I'd guess. If we wanted to be strict (heh) about it, we could rename the functionOption.or
and add a separateOption.orElse
using a thunk, both without special syntax.
There is the effort of fixing stdlib, and finding a nice alternative for the fix.
Leonardo de Moura (Mar 23 2021 at 14:53):
Regarding the change, the goal is to improve the user experience. See do
example that Sebastian posted above.
We are exploring the design space and trying to figure out what works and what doesn't.
We didn't wake up one day and decide to remove the instance. @Sebastian Ullrich and I discussed the Monad Option
issue a few times.
There are many other changes we want to make. For example, we want to address x *> let a := <expensive>; b
Finally, if mathtlib users think Monad Option
and Monad List
are useful, they can define them in the Mathlib "core" files.
Gabriel Ebner (Mar 23 2021 at 15:13):
Yes, the do
example was convincing. Although it makes me slightly uncomfortable that adding a type class instance can break do
-notation that used to work before. That is, if we add instance : Monad FreeGroup
, it will break a function def foo (xs : List α) : FreeGroup α := do ...
.
There is often no way around Monad
. For example if you want to use lemmas that require LawfulMonad
, you need to provide a Monad
instance.
Gabriel Ebner (Mar 23 2021 at 15:13):
Leonardo de Moura said:
Finally, if mathtlib users think
Monad Option
andMonad List
are useful, they can define them in the Mathlib "core" files.
I don't think anybody will add these instances if they break do
-notation.
Sebastian Ullrich (Mar 23 2021 at 15:20):
I'm sure there will be users that would want do
to use Option
as the monad, so it might not be broken in their opinion. But prefixing the block with OptionM.run
seems like the best compromise to please both camps.
Gabriel Ebner (Mar 23 2021 at 15:23):
If you add the Monad Option
instance, is there a way to get back and tell Lean that def f (xs : List Nat) : Option Nat := do
should not use Option
as the functor for the do-notation?
Sebastian Ullrich (Mar 23 2021 at 15:26):
Yes, you have to wrap the type with Id
Andrew Kent (Mar 23 2021 at 15:42):
Sebastian Ullrich said:
It's not about the overhead of the operations themselves, but about that of eagerly evaluating an argument that shouldn't have been. If you do
bigSearch1 <|> bigSearch2
onOption
and the optimizer decides it's over the inlining budget, you're screwed.
There may be some obvious reasons I haven't thought of yet why this suggestion is a silly one... but this has me curious if <|>
wouldn't be better off as a macro that expands into control-flow primitives that only evaluate bigSearch2
if bigSearch1
"fails" (a la or
being a macro and not a function/primitive in Scheme et al).
Sebastian Ullrich (Mar 23 2021 at 15:44):
<|>
is from a typeclass, it cannot be a macro
Andrew Kent (Mar 23 2021 at 15:45):
Yes, I am aware it is currently a function from a typeclass, but that just reflects the current design, no? It's not a fundamental reason why <|>
as infix syntax must be a function and not a macro.
Andrew Kent (Mar 23 2021 at 15:47):
I'm not necessarily arguing it "should change", but just thinking out loud a bit. If <|>
has a strong association in enough peoples' minds with not always evaluating the right-hand side, maybe that's a reason to consider it a syntactic abstraction that does that instead of it being a function like it is in Haskell :shrug:
Sebastian Ullrich (Mar 23 2021 at 16:00):
<|>
is the control flow primitive. If you want something else to happen for all monads/alternatives, you'll need to find a new primitive monadic combinator and abstract over it in a typeclass. We considered making the RHS of <|>
a thunk, but that could be confusing in proofs/contexts where execution does not matter. If all but two interesting monads are already inherently lazy, the correct solution is to make those two lazy as well, not to change a fundamental combinator. Especially given that they are problematic in other ways as described above.
Andrew Kent (Mar 23 2021 at 16:04):
I don't necessarily disagree with you, although I think I was trying to talk more purely about the syntax <|>
and nothing more. I.e., what should the syntax <|>
mean when a user reaches for it and not arguing that the underlying tower of typeclasses should change.
Sebastian Ullrich (Mar 23 2021 at 16:06):
Then I do not understand. There is nothing the syntax can do here if the underlying combinator's signature doesn't change.
Andrew Kent (Mar 23 2021 at 16:20):
I think I'm imagining a more creative reinterpretation of the syntax <|>
than perhaps I'm letting on then which is focused on the control-flow aspects....? (And perhaps for this reason it would be too different and not a good idea, as it would not work in all the places <|>
is valid with its current meaning at present). In this alternative reality where <|>
as syntax means something very different , you might imagine a class of values where you morally get the following elaborations:
for Option
:
mA <|> mB
-- expands to
match mA with
| none => mB
| some a => pure a
for Except
:
mA <|> mB
-- expands to
match mA with
| Except.error _ => mB
| Except.ok a => pure a
etc
And this interpretation of <|>
may end up being very divorced from the current typeclasses it is associated with and work with only a subset or different set of values, etc, but you could still imagine there being such a syntactic abstraction that for some things it ends up computatinoally behaving more like <|>
does in a call-by-need language (in at least delaying the RHS).
At any rate... I worry this is a bit too speculative and distracting for continued debate in this thread xD
Sebastian Ullrich (Mar 23 2021 at 16:24):
If the notation was not based on a typeclass, you would not be able to use it in programs or combinators generic over the monad except by making the whole function a macro, which is not tenable.
Andrew Kent (Mar 23 2021 at 16:27):
Yes, if you think of this like a more generic "or", you would need a typeclass to tell you what values were "falsy" to know when to return the thing. And then of course you wouldn't be able to use it in all the places <|>
can be used now. It is quite a radical reimagining of the syntax =)
Joe Hendrix (Mar 23 2021 at 17:10):
@Sebastian Ullrich Thanks for explanation. In hindsight strictness seems like an obvious problem, and I think although bind would be ok, many of the the other operations like >>
and <|>
would not.
Joe Hendrix (Mar 23 2021 at 17:22):
@Andrew Kent @Sebastian Ullrich Another direction would be to make strategy annotations part of the type (ABI) of a function. Maude has had this for a while, but is a first order language. This would allow making operations like if then else
and short-circuit operators definable in the language. For Lean, you could just mark certain arguments as eager (call-by-value) and others as lazy (call-by-need).
I'm guessing this would be a distraction from current goals, but I thought I'd mention it as the approach is fairly well known in term rewriting.
Mario Carneiro (Mar 23 2021 at 22:47):
Sebastian Ullrich said:
Yes, you have to wrap the type with
Id
In your earlier example, with my lean-3-tinted eyes I don't see why the displayed behavior is regarded as unexpected. If you want to use return some x
and return none
then you have to use do
in a monad m (Option Nat)
; if the target type is Option Nat
then I expect that the monad is Option
and the inner type is Nat
. Indeed, if the target type is just A
without an obvious monad application, I think it is acceptable to error out rather than guessing the Id
monad. The concrete workaround for that example is to change the type to def f ... : id (Option Nat) := do ...
or def f ... : Option Nat := show id (Option Nat) from do ...
.
Mario Carneiro (Mar 23 2021 at 22:54):
Regarding eager function applications like o1 <|> o2
or fast && slow
, I think this has always been a problem and lean 3's solution of using @[inline]
is not good because it is sensitive to inlining decisions. Ideally the execution order should not be changed by inlining; even though lean is a pure language this is still a performance foot-gun. I agree with Joe Hendrix that a more robust solution is to explicitly indicate laziness in the types; some of this can be accomplished by using thunk A
but this is not the same type as A
so it can cause problems for mathematics and the kernel (e.g. if the type of Bool.and
was Bool -> (Unit -> Bool) -> Bool
that would be rather annoying to work with), so I think this should be either an identity wrapper / annotation or an attribute.
Gabriel Ebner (Mar 24 2021 at 08:35):
I think it is acceptable to error out rather than guessing the
Id
monad.
So this is what's happening here? I honestly did not make that connection at all.
Sebastian Ullrich (Mar 24 2021 at 09:09):
Mario Carneiro said:
with my lean-3-tinted eyes
Using do
blocks for pure code (without further annotations) has by now become second nature for us, it is super convenient. @Marc Huisinga told me he also ran into some Option
confusion issues before the change, maybe he can mention some practical examples from someone not involved in designing the feature :) .
Zygimantas Straznickas (Mar 24 2021 at 16:21):
Just wanted to mention I’ve also run into issues that Sebastian described. As someone who frequently uses Option
comprehension I dealt with them by sprinkling extra type annotations around in my for loops etc, but I understand that’s not good user experience for most users so the change seems like a net positive (as long as users can opt-in by providing their own Monad Option instance).
Mario Carneiro (Mar 24 2021 at 22:09):
What about a macro like pure_do
or similar to make it explicit that this is a do block in the identity monad? I think this is useful as documentation as well, because the version you posted reads weirdly otherwise
Mario Carneiro (Mar 24 2021 at 22:11):
this default fallback behavior seems difficult to control; there should be a simpler way to do it without cumbersome annotations if it is as useful as you say
Marius Melzer (Dec 19 2021 at 12:46):
Sebastian Ullrich said:
A different, Lean 4-specific issue is
def f (xs : List Nat) : Option Nat := do for x in xs do if x > 10 then return some x return none
3:2: type mismatch, 'for' has type Option Nat : Type but is expected to have type Nat : Type 6:2: application type mismatch pure none argument none has type Option ?m.551 : Type ?u.550 but is expected to have type Nat : Type
The user thought of
Option
as data, butdo
assumed it was computation
Might be a stupid question, but could somebody explain to me what the problem is here? And what's the difference between viewing a do block as data or as computation?
Horatiu Cheval (Dec 19 2021 at 12:57):
Have you seen the discussion here: https://leanprover.zulipchat.com/#narrow/stream/270676-lean4/topic/.E2.9C.94.20Has.20the.20do.20notation.20changed.20recently.3F ? I think it's related
Horatiu Cheval (Dec 19 2021 at 12:59):
What version are you on? On 2021-12-18
I get a nicer error message:
invalid 'do' notation, expected type is not a monad application
Option Nat
You can use the `do` notation in pure code by writing `Id.run do` instead of `do`, where `Id` is the identity monad.
Marc Huisinga (Dec 19 2021 at 13:07):
If Option
is a monad, then return
is analogous to pure
in the sense that it'll take a value of type Nat
and embed it in Option Nat
(i.e. return x
is some x
in this case, with it additionally affecting control flow in Lean 4). So in the code snippet above, if Option
is a monad, it'll error because it expects Nat
for the argument to return
, not Option Nat
. Instead, you'd have to write it like so:
def f (xs : List Nat) : Option Nat := OptionM.run do
for x in xs do
if x > 10 then
return x
none
This behaviour makes sense for most side effects. If you use IO
for instance, you think of the type Nat
in IO Nat
as the real return value, not IO Nat
. If you use return x
in IO
, you very likely want to pass an integer, not an IO
.
But Option
is somehow different. It's not always used for side effects, sometimes you just want to use it as a value. In this case, the default above can be unintuitive. So the choice made here was to have OptionM
be the monad, Option
be a value and OptionM.run
to go from the former to the latter, so that you can still use the monadic version of Option
in functions that return Option
if you want to.
Edward Ayers (Mar 18 2022 at 20:47):
What's a terse way of doing x >>= f
for x : Option α
, f : α → Option β
?
Edward Ayers (Mar 18 2022 at 20:52):
I guess Option.bind x f
Mario Carneiro (Mar 19 2022 at 00:29):
x.bind f
Last updated: Dec 20 2023 at 11:08 UTC