Zulip Chat Archive
Stream: lean4
Topic: Ensuring naive purity
Ioannis Konstantoulas (Oct 11 2023 at 11:36):
This question is out of pure curiosity.
One misunderstanding about "pure" functions is that if I have f : Nat -> Nat
and at a point evaluate f 1
to 1
, I guarantee f 1
will always evaluate to 1
, because "functions only depend on their arguments". Of course I can write a function of signature Nat -> Nat
with completely non-deterministic behavior by running IO
actions inside it.
Is there a way to ensure in Lean that I really have a deterministic function only through its signature or some attribute? E.g. can I forbid running IO
inside the function body somehow?
Utensil Song (Oct 11 2023 at 11:44):
Of course I can write a function of signature Nat -> Nat with completely non-deterministic behavior by running IO actions inside it.
This is not true in a functional programming language like Lean 4. You'll need a monad to run IO actions, if you try, you'll have errors indicating type mismatch, that's how it's ensured to be forbidden. Chapter Monads in #fpil may have answers you need.
Ioannis Konstantoulas (Oct 11 2023 at 11:47):
Utensil Song said:
Of course I can write a function of signature Nat -> Nat with completely non-deterministic behavior by running IO actions inside it.
This is not true in a functional programming language like Lean 4. You'll need a monad to run IO actions, if you try, you'll have errors indicating type mismatch, that's how it's ensured to be forbidden. Chapter Monads in #fpil may have answers you need.
I am confused; what about
def natFun : Nat → Nat
| k => match (IO.getRandomBytes 8 |>.run' ()) with
| some bytes => bytes.toUInt64BE!.toNat % 256
| none => k
#eval natFun 4
#eval natFun 4
#eval natFun 4
Tomas Skrivan (Oct 11 2023 at 12:01):
This code is abusing that IO
monad is effectively StateM IO.RealWorld
and IO.RealWorld
is just Unit
. In the most recent version of lean, IO.RealWorld
is turned into opaque and you will no longer be able to write ()
to create an element of IO.RealWorld
. Furthermore, soon IO
will be opaque too so you won't be able to write the |>.run'
Tomas Skrivan (Oct 11 2023 at 12:03):
You are effectively exploiting a consistency bug.
Ioannis Konstantoulas (Oct 11 2023 at 12:05):
Nice; that takes care of IO
; are there any other similar traps that we know of? For the record, I had no idea this was considered a bug; I just freely used it in real code that I am writing.
Eric Wieser (Oct 11 2023 at 12:06):
These all fall into the category of "inconsistent implemented_by
/ extern
"s, which are only relevant for execution and not proofs
Ioannis Konstantoulas (Oct 11 2023 at 12:07):
Eric Wieser said:
These all fall into the category of "inconsistent
implemented_by
/extern
"s, which are only relevant for execution and not proofs
Yes, I am not talking about proofs at all; I am porting some CLI utilities I wrote in sh
to Lean as programming practice, and have been writing this kind of code a lot.
Utensil Song (Oct 11 2023 at 12:08):
Do we have
main = do
-- Save the current state of the universe
world_as_we_know_it <- getWorld
-- Cause serious international side effects
launchMissiles
-- After realizing that was a terrible, terrible mistake, restore the
-- pre-war state of the universe.
putWorld world_as_we_know_it
in Lean 4? (ref)
Eric Wieser (Oct 11 2023 at 12:09):
I think you can write that but it doesn't work
Tomas Skrivan (Oct 11 2023 at 12:10):
I think you can prove that you didn't delete your hard drive but in fact you actually did.
Eric Wieser (Oct 11 2023 at 12:11):
def ohNo : IO Unit := do
let s ← EStateM.get
IO.println "Oh no"
EStateM.set s
#eval ohNo
Ioannis Konstantoulas (Oct 11 2023 at 12:11):
Tomas Skrivan said:
I think you can prove that you didn't delete your hard drive but in fact you actually did.
"Beware of bugs in the above code; I have only proved it correct, not tried it." -- Donald Knuth
Eric Wieser (Oct 11 2023 at 12:14):
EStateM.set
does work for the random number generator though:
def anyway : IO Unit := do
let s ← EStateM.get
let x := ← (·.toUInt64BE!.toNat % 256) <$> IO.getRandomBytes 8
EStateM.set s
let y := ← (·.toUInt64BE!.toNat % 256) <$> IO.getRandomBytes 8
IO.println f!"{x} {y}"
#eval anyway
Utensil Song (Oct 11 2023 at 12:15):
Actually, I mean is it possible to write it as a pure function like @Ioannis Konstantoulas tried:
Eric Wieser (Oct 11 2023 at 12:17):
You can't run IO
from the Id
monad without cheating
Utensil Song (Oct 11 2023 at 12:18):
the question is asked after witnessing cheating :joy:
Eric Wieser (Oct 11 2023 at 12:18):
ohNo.run' ()
lets you cheat
Utensil Song (Oct 11 2023 at 12:19):
Eric Wieser said:
EStateM.set
does work for the random number generator though:def anyway : IO Unit := do let s ← EStateM.get let x := ← (·.toUInt64BE!.toNat % 256) <$> IO.getRandomBytes 8 EStateM.set s let y := ← (·.toUInt64BE!.toNat % 256) <$> IO.getRandomBytes 8 IO.println f!"{x} {y}" #eval anyway
Nice, it now consistently generates the same "random" number.
Chris Wong (Oct 11 2023 at 12:19):
Does the Lean interpreter do CSE or something?
Ioannis Konstantoulas (Oct 11 2023 at 12:20):
The fact that two eval
s return different results is not a problem, right?
def anyway : IO Unit := do
let s ← EStateM.get
let x := ← (·.toUInt64BE!.toNat % 256) <$> IO.getRandomBytes 8
EStateM.set s
let y := ← (·.toUInt64BE!.toNat % 256) <$> IO.getRandomBytes 8
IO.println f!"{x} {y}"
#eval anyway
#eval anyway
Eric Wieser (Oct 11 2023 at 12:21):
Correct, because the real world changed between the two evals
Ioannis Konstantoulas (Oct 11 2023 at 12:22):
Chris Wong said:
Does the Lean interpreter do CSE or something?
What is CSE?
Ruben Van de Velde (Oct 11 2023 at 12:23):
Common subexpression elimination, maybe
Chris Wong (Oct 11 2023 at 12:23):
Ioannis Konstantoulas said:
Chris Wong said:
Does the Lean interpreter do CSE or something?
What is CSE?
Common subexpression elimination – after inlining, both getRandomBytes
calls refer to the same RealWorld
, so it would be a "valid" optimization to unify them
Chris Wong (Oct 11 2023 at 12:25):
Though I wonder why Lean is happy to turn two calls into one, but not one into zero :thinking:
Tomas Skrivan (Oct 11 2023 at 12:30):
Good catch, it is exactly what is happening:
def randNat' : IO Nat :=
dbg_trace "calling randNat"
(·.toUInt64BE!.toNat % 256) <$> IO.getRandomBytes 8
def anyway : IO Unit := do
let s ← EStateM.get
let x := ← randNat'
EStateM.set s
let y := ← randNat'
IO.println f!"{x} {y}"
#eval anyway
prints calling randNat
only once
James Gallicchio (Oct 11 2023 at 16:31):
once the linearity checker is implemented, the realworld parameter is gonna be marked linear, you won't be able to get
the state, and none of these (wrong) optimizations will occur :p
Sebastian Ullrich (Oct 11 2023 at 16:40):
There are some stark misconceptions in this thread. Deduplicating an IO
value does not affect the number of executions. dbg_trace
breaks precisely because it does not live in IO
, thus the dbg_
prefix. There are no issues with compiler optimizations and IO
.
Tomas Skrivan (Oct 11 2023 at 19:09):
Wait so why is it producing two same "random" numbers?
Mario Carneiro (Oct 11 2023 at 19:10):
because the random state was reset?
Sebastian Ullrich (Oct 11 2023 at 19:16):
Ah, by breaking open the IO abstraction, you allow for CSE, yes. But the compiler would be free to deduplicate the dbg_trace even without that.
Mario Carneiro (Oct 11 2023 at 19:16):
I tend to think that even exposing that IO A
is a function type is a bad idea. Because if it is, then it is equal to its eta expansion, and you can do funny things exploiting that to duplicate effects
Mario Carneiro (Oct 11 2023 at 19:22):
def hello : IO Unit := IO.println "hi"
@[noinline] def blackBox {α} (x : α) : α := x
def foo {α β γ} (x : α) (f : β → γ) (a : β) : γ :=
(blackBox (x, f a)).2
def dup (x : IO Unit) : IO Unit :=
fun w => foo (x w) x w
example (x : IO Unit) : dup x = x := rfl
#eval hello
-- hi
#eval dup hello
-- hi
-- hi
Mario Carneiro (Oct 11 2023 at 19:24):
I'm pretty sure I can cause another native_decide contradiction using this and ST
Mario Carneiro (Oct 11 2023 at 19:29):
@[noinline] def blackBox {α} (x : α) : α := x
def foo {α β γ} (x : α) (f : β → γ) (a : β) : γ :=
(blackBox (x, f a)).2
def dup {ω} (x : ST ω Unit) : ST ω Unit :=
fun w => foo (x w) x w
def flipBool {ω} (r : ST.Ref ω Bool) : ST ω Unit := do r.set !(← r.get)
def ex1 : Bool := runST fun _ => do
let r ← ST.mkRef false
flipBool r
r.get
def ex2 : Bool := runST fun _ => do
let r ← ST.mkRef false
dup (flipBool r)
r.get
example : False :=
have : ex1 ≠ ex2 := by native_decide
this rfl
Jireh Loreaux (Oct 11 2023 at 19:31):
I'm confused. I thought we weren't supposed to trust native_decide
anyway?
Mario Carneiro (Oct 11 2023 at 19:32):
it's not at the same level of correctness, but we still want the compiler not to be able to prove false things, because if it can then safe lean code can cause UB
Joachim Breitner (Oct 11 2023 at 19:33):
Trust is always a choice :-)
If you trust native_decide
, you trust much more code, and we know that this allows you to prove false. But that doesn’t mean we want that. I assume we don’t, and will fix these bugs as we find them.
Mario Carneiro (Oct 11 2023 at 19:33):
We do not want native_decide
to be inconsistent
Mario Carneiro (Oct 11 2023 at 19:34):
or at least, it should be the fault of the user (e.g. some implemented_by
) if something goes wrong
Mario Carneiro (Oct 11 2023 at 19:37):
def bla : Bool :=
have : ex1 ≠ ex2 := by native_decide
if h : ex1 = ex2 then blackBox (this h).elim else (h rfl).elim
#eval bla -- boom
Tomas Skrivan (Oct 12 2023 at 00:51):
Mario Carneiro said:
because the random state was reset?
Yes, ideally that would happen but the function lean_io_get_random_bytes
is not using the IO.RealWorld
at all. IO.RealWorld
is not holding any information, so you can't reset the state of the random generator.
Tomas Skrivan (Oct 12 2023 at 00:54):
(deleted)
Scott Morrison (Oct 12 2023 at 03:50):
I did make a start on making EIO
an opaque definition (i.e. so you can't see at all that it is really an EStateM
). However there is a fair bit of API to collect, and some design decisions about how to expose "enough" of the fact that it is a function type, so I didn't get there. I may try again later, but if someone would like to make an RFC or PR for this I would happily review it!
Last updated: Dec 20 2023 at 11:08 UTC