Zulip Chat Archive

Stream: lean4

Topic: for, unexpected need for type ascription


Kyle Miller (Jan 24 2022 at 00:45):

I was experimenting with using parsec and trying out do/mut/for, and I came across a type error that was surprising to me. In the code below, if you remove the type ascription at the -- ?, there is an error "failed to synthesize instance HAdd Nat Nat (Parsec Nat)". This behavior also appears even if the loop contains monadic actions (for example, adding a skip).

import Lean.Data.Parsec

open Lean Lean.Parsec

def num : Parsec Int := do
  let neg : Bool  (skipChar '-' *> pure true) <|> pure false
  let mut n : Nat := 0
  for c in  many1 (satisfy Char.isDigit) do
    n  (10 * n + (c.toNat - '0'.toNat) : Nat) -- ?
  pure $ if neg then -n else n

def parse (s : String) : Except String Int :=
  match (num <* eof) s.mkIterator with
  | ParseResult.success _ res => Except.ok res
  | ParseResult.error it err  => Except.error s!"offset {it.i.repr}: {err}"

#eval parse "22"
#eval parse "-222"

Arthur Paulino (Jan 24 2022 at 00:57):

@Kyle Miller for some reason it works if you do n := (10 * n + (c.toNat - '0'.toNat)) instead

Kyle Miller (Jan 24 2022 at 01:00):

Ah, thanks. I forgot that := was the actual notation for updates, and that <- ... was for := (<- ...)

Kyle Miller (Jan 24 2022 at 01:01):

Is it inferring the identity monad, then, for <- here?

Arthur Paulino (Jan 24 2022 at 01:04):

Now I'm a bit confused. I thought I had used <- for updates before :thinking:
Idk the answer to your question but I was trying to explicitly access the id monad

Henrik Böving (Jan 24 2022 at 01:06):

def foo : IO Nat := return 10

def bar : IO Nat := do
  let mut x := 12
  x  foo
  x

#eval bar

It does work out

Arthur Paulino (Jan 24 2022 at 01:10):

def bar : IO Nat := do
  let mut x := 12
  x  x + 1
  x

Doesn't tho

Mario Carneiro (Jan 24 2022 at 01:10):

Kyle Miller said:

Is it inferring the identity monad, then, for <- here?

It's inferring the Parsec monad because you are in a Parsec do block

Mario Carneiro (Jan 24 2022 at 01:10):

if you write n <- _ then the _ should be a Parsec Nat

Mario Carneiro (Jan 24 2022 at 01:11):

if you type ascribe it as a Nat then it inserts the pure coercion

Mario Carneiro (Jan 24 2022 at 01:11):

the correct thing to do here is use n := ...instead of n <- .... Note no let

Mario Carneiro (Jan 24 2022 at 01:12):

omitting the let is how you know it's an update of a mutable variable and not a shadowed variable

Kyle Miller (Jan 24 2022 at 01:12):

@Mario Carneiro I get the Parsec do block part, and I'd forgotten that updates were with :=.

Why is the ascription inserting a pure?

Mario Carneiro (Jan 24 2022 at 01:12):

the := vs <- determines whether the rhs is a monadic expression or a pure expression

Kyle Miller (Jan 24 2022 at 01:13):

What led me astray is that <- works at all, like in @Henrik Böving's example

Mario Carneiro (Jan 24 2022 at 01:13):

Kyle Miller said:

Why is the ascription inserting a pure?

because there is a coercion from A to m A where m is a monad

Mario Carneiro (Jan 24 2022 at 01:13):

(actually I'm not sure whether it's literally a coercion in the Coe sense or whether this is special magic specific to the do DSL)

Kyle Miller (Jan 24 2022 at 01:14):

Would this be a place where there ought to be an explicit Id.run?

Mario Carneiro (Jan 24 2022 at 01:14):

no, Id.run is going the other way: you want an A and you are providing an Id A

Kyle Miller (Jan 24 2022 at 01:14):

Like

def foo : IO Nat := return 10

def bar : IO Nat := do
  let mut x := 12
  x  Id.run foo
  x

#eval bar

Mario Carneiro (Jan 24 2022 at 01:15):

here you have an A and you want an m A

Mario Carneiro (Jan 24 2022 at 01:15):

that's what pure does

Henrik Böving (Jan 24 2022 at 01:15):

#synth Coe Nat (Id Nat)

Fails @Mario Carneiro so it appears there is no coercion here.

Kyle Miller (Jan 24 2022 at 01:15):

Oh, right, I did get that backwards with what Id.run is meant to do. Still, it seems like this is something where users should be encouraged to insert a pure themselves if they really mean to use <- rather than :=.

Kyle Miller (Jan 24 2022 at 01:17):

Mario Carneiro said:

if you type ascribe it as a Nat then it inserts the pure coercion

Just to mention where my followup question came from: I thought you were saying it would put the pure because of the ascription, not because the ascription causes the arithmetic to be a Nat, so then Lean inserts a pure outside the ascription, so to speak.

Mario Carneiro (Jan 24 2022 at 01:17):

Kyle Miller said:

Oh, right, I did get that backwards with what Id.run is meant to do. Still, it seems like this is something where users should be encouraged to insert a pure themselves if they really mean to use <- rather than :=.

Yes, that's exactly what I do in these situations

Mario Carneiro (Jan 24 2022 at 01:18):

I like to omit the pure when I can, but it doesn't work in a case like def foo : IO Nat := do 2 + 2

Kyle Miller (Jan 24 2022 at 01:18):

What I mean is, can Lean encourage this by making it be an error if there's no pure?

Mario Carneiro (Jan 24 2022 at 01:19):

it's convenient to have the pure auto-inserted, and it makes it feel more like imperative code

Kyle Miller (Jan 24 2022 at 01:19):

Specifically for this <- update, though.

Kyle Miller (Jan 24 2022 at 01:19):

Because then why have := too?

Mario Carneiro (Jan 24 2022 at 01:19):

??

Mario Carneiro (Jan 24 2022 at 01:19):

If you are using <- then you specifically want a monadic value on the right

Kyle Miller (Jan 24 2022 at 01:20):

Right, so why have it auto-insert pure if it's not a monadic value on the right?

Mario Carneiro (Jan 24 2022 at 01:20):

It didn't insert the pure until you type ascribed it

Mario Carneiro (Jan 24 2022 at 01:21):

with the type ascription it's clearly the right thing to do, and without it it's unclear what to do and it opts to avoid the auto-insertion

Kyle Miller (Jan 24 2022 at 01:21):

def bar : IO Nat := do
  let y : Nat := 5
  let mut x : Nat := 12
  x  y
  x

def bar' : IO Nat := do
  let y : Nat := 5
  let mut x : Nat := 12
  x := y
  x

Mario Carneiro (Jan 24 2022 at 01:22):

I presume those both work

Kyle Miller (Jan 24 2022 at 01:22):

They do both work, which is what I'm referring to.

Mario Carneiro (Jan 24 2022 at 01:22):

because the type of y is known

Mario Carneiro (Jan 24 2022 at 01:22):

this is the usual setting where a \u would be inserted

Mario Carneiro (Jan 24 2022 at 01:23):

you have type A and you want type B and lean knows they can't be the same

Kyle Miller (Jan 24 2022 at 01:24):

Ok, but should it work here? I get more-or-less what mechanisms make it work, but would it be possible to make it not work? Is there any reason that it should, given that there's also := syntax?

Mario Carneiro (Jan 24 2022 at 01:24):

I think it's a more orthogonal design if it does work

Mario Carneiro (Jan 24 2022 at 01:25):

who knows, maybe it is inconvenient to use := for whatever reason (e.g. in a macro)

Kyle Miller (Jan 24 2022 at 01:31):

Given that do notation's been changed to require Id.run for the id monad, I feel like it's rather surprising that <- works for non-monadic RHS's. (I just mean "surprising" from the point of view of someone learning the language.)

Mario Carneiro (Jan 24 2022 at 01:36):

generally you can use a non-monadic expression where a monadic expression is expected. This is an instance of coercion (although it seems to be implemented in a slightly different way than the generic Coe, likely because A -> m A is a really generic type)

Mario Carneiro (Jan 24 2022 at 01:37):

Certainly coercion is weird from a strict type theory perspective - it's literally a type error and the compiler accepts it anyway

Mario Carneiro (Jan 24 2022 at 01:38):

but I would rather not see it removed, since it is one of the things that makes lean do notation more pleasant to write than Haskell do notation

Reid Barton (Jan 24 2022 at 02:00):

I think the main concern would be, if you blur the distinction a bit between x : a and pure x : IO a, then you may also accidentally blur m : IO a and pure m : IO (IO a), and what you think is m >> n may actually be pure m >> n, which will also typecheck but be equivalent to just n

Mario Carneiro (Jan 24 2022 at 02:05):

Since this isn't a Coe I think that latter thing will actually not work

Mario Carneiro (Jan 24 2022 at 02:05):

this is magic that only triggers in do block positions like do let x <- _ or do bla (<- _) or do _; _

Reid Barton (Jan 24 2022 at 02:07):

Aren't there other forms for all of those (e.g. let x := _), other than the last one (last underscore)?

Reid Barton (Jan 24 2022 at 02:08):

GHCi has a similar sort of ambiguity to the last one, where if you type something of type IO _ in the interpreter, it will execute it, otherwise evaluate it

Reid Barton (Jan 24 2022 at 02:11):

But in the interpreter, the stakes are low as you will probably figure it out immediately if ghci misunderstood what you intended.

Reid Barton (Jan 24 2022 at 02:17):

You definitely don't want do x; y to mean do pure x; y.

Mario Carneiro (Jan 24 2022 at 02:21):

you might, if x actually has side effects

Mario Carneiro (Jan 24 2022 at 02:21):

like the equivalent of lean 3 trace

Reid Barton (Jan 24 2022 at 02:22):

But it's still better for it to be a compiler error surely?

Reid Barton (Jan 24 2022 at 02:22):

because you can always insert pure yourself

Reid Barton (Jan 24 2022 at 02:24):

It's much more likely you intended to write a monadic action but messed up somehow, and luckily because you are using a typed language the compiler caught your error, right...?

Mario Carneiro (Jan 24 2022 at 02:25):

pshh, obviously the point of type systems is so that the compiler can always guess what you mean and never give you any type errors

Mario Carneiro (Jan 24 2022 at 02:27):

I remember thinking about how coercions have lead to the return of the "garbage theorem" when we found out that
(n1 n2) : nat typechecks for n1, n2 : nat

Reid Barton (Jan 24 2022 at 02:34):

Anyways, I don't think this pure insertion existed the last time I used Lean 4, but if the only useful place for it is indeed in the last line of a do block, that seems like a simpler and more predictable rule

Mario Carneiro (Jan 24 2022 at 02:34):

I have been looking for the code that does it and I haven't found it yet

Mac (Jan 24 2022 at 06:40):

Personally, I agree with @Kyle Miller and @Reid Barton and think the auto pure is more trouble than it is worth. @Mario Carneiro did you find an examples for where it is particularly convenient (outside the last line of a do block)? I think the feature is used a lot less often than one might think and that adding the explicit pure would not really be much of a hassle.

Siddhartha Gadgil (Jan 24 2022 at 08:00):

Incidentally I normally use the alternative return.

Gabriel Ebner (Jan 24 2022 at 10:15):

think the auto pure is more trouble than it is worth.

I kind of agree. It's extremely neat when it works (and is unambiguous), but there's enough cases where the coercion doesn't trigger (numerals, arithmetical operations, anonymous constructors, structure instances, lambdas, etc.) that you need pure/return half of the time anyway, causing inconsistent coding style.

Marcus Rossel (Jan 24 2022 at 11:21):

And speaking from a beginner's perspective: learning monadic programming is more confusing with <- working for non-monadic values.

Sebastian Ullrich (Jan 24 2022 at 11:32):

I have ambivalent feelings as well whenever I use it, even if the result looks nice. Not however that changing this would require more specialization of do syntax to the Id special case, as I think I outlined in the Id.run GH issue, because there you certainly do not want to worry about monadic lifting.

Gabriel Ebner (Jan 24 2022 at 12:22):

Shouldn't that just work for Id? (since Id α and α are defeq)

Mario Carneiro (Jan 24 2022 at 12:29):

you still have to worry about e.g. HAdd Nat Nat (Id Nat) not existing

Sebastian Ullrich (Jan 24 2022 at 12:32):

Yes, that's what I had in mind I think. But I believe the "pure lift" doesn't really help with that, so the issues might be more orthogonal than I thought.

Mario Carneiro (Jan 24 2022 at 12:32):

Regarding pure vs return: I like that lean 4 makes them not be synonyms any more, I always had trouble with keeping consistent style before in lean 3. Now I make a note to always use pure in both lean 3 and lean 4, unless I specifically want an early return

Sebastian Ullrich (Jan 24 2022 at 12:33):

That was my initial style as well, but Leo at least uses return extensively and it does save parentheses...

Mario Carneiro (Jan 24 2022 at 12:33):

@Sebastian Ullrich could you point me to the code that does the auto-pure? What is the scope of it?

Mario Carneiro (Jan 24 2022 at 12:34):

I will note that pure $ ...and return ... have the same number of characters :P

Mario Carneiro (Jan 24 2022 at 12:36):

Here's a thought: what if the last element of a do block was always a pure value? You would write <- foo if you want to use a monadic value

Sebastian Ullrich (Jan 24 2022 at 12:38):

I believe we've discarded that option at some point. It sure looks inconvenient if all you want to do is run some m Units.

Mario Carneiro (Jan 24 2022 at 12:38):

if that was extended to internal ignores like do _; ..., then you would have the property that every monadic bind has a <- which seems good for readability

Gabriel Ebner (Jan 24 2022 at 12:39):

what if the last element of a do block was always a pure value? You would write <- foo if you want to use a monadic value

I often write do foo (← getBar) (← getBaz).b etc.

Mario Carneiro (Jan 24 2022 at 12:39):

I very much like that style in fact

Mario Carneiro (Jan 24 2022 at 12:40):

but it doesn't seem much worse to have do ← foo (← getBar) (← getBaz).b when foo is also monadic

Mario Carneiro (Jan 24 2022 at 12:40):

and when it's not monadic you don't have this auto pure ambiguity

Mario Carneiro (Jan 24 2022 at 12:40):

(foo is often not monadic in those kinds of expressions)

Sebastian Ullrich (Jan 24 2022 at 12:45):

Mario Carneiro said:

Sebastian Ullrich could you point me to the code that does the auto-pure? What is the scope of it?

I'm actually not sure. It's not do though.

Kyle Miller (Jan 24 2022 at 19:14):

Here's a fun gotcha:

def sumRange''' (n : Nat) : Nat := Id.run do
  let mut acc := 0
  for i in [0:n] do
    acc := acc + i
  acc

Can you guess the error?

(This is not precisely an auto-pure problem, but it does suggest depending on it can lead to unanticipated errors!)

Arthur Paulino (Jan 24 2022 at 19:20):

The error is that you didn't explicit the type of acc so Lean thinks its of Id Nat

Arthur Paulino (Jan 24 2022 at 19:24):

def sumRange''' (n : Nat) : Nat := Id.run do
  let mut acc := (0 : Nat)
  -- let mut acc : Nat := 0 -- this works too
  for i in [0:n] do
    acc := acc + i
  acc

Kyle Miller (Jan 24 2022 at 19:26):

I meant to ask "where the error occurs" but, yeah, that's a root cause. Another fix is to use pure/return for acc at the end.

Arthur Paulino (Jan 24 2022 at 19:29):

It's not clear to me why acc is being inferred as Id Nat since the type of sumRange''' is explicitly Nat → Nat

Kyle Miller (Jan 24 2022 at 19:30):

The Id.run is making the do block be of type Id Nat, and the last line is acc by itself, so it infers it's an Id Nat.


Last updated: Dec 20 2023 at 11:08 UTC