Zulip Chat Archive

Stream: lean4

Topic: Stating properties of the result of IO a

Geoffrey Irving (Feb 29 2024 at 10:21):

Say I have an IO value which (1) prints some stuff and (2) returns a deterministic value. How do I write a theorem about that returned value? Here is an example, where I don't know how to write the theorem statement (I expect the proof will be easy once I can do that):

import Mathlib

def loud_seven : IO ℕ := do
  IO.println "yo"
  pure 7

/-- How do we write the theorem that the output is always 7? -/
lemma loud_seven_eq : loud_seven = pure 7 := by
  sorry  -- This is false, presumably.  What should I say instead?

Geoffrey Irving (Feb 29 2024 at 10:22):

If IO a had an accessor that returned the value I could use that, but of course it doesn't have that. What I want is something like "for all possible values x that could be returned if the actual IO stuff is byzantine, some property of x holds".

Henrik Böving (Feb 29 2024 at 10:25):

You don't prove things about IO basically. What you can do though is write the functions that process whatever came from IO in a pure (or also non-IO-monadic) fashion and then write proofs about these non IO functions.

Geoffrey Irving (Feb 29 2024 at 10:27):

Hmm, if I unpack the definitions of things they all seem transparent, so I'm not sure I buy that as the answer. E.g., IO a is just a function that turns IO.RealWorld into IO.RealWorld * a.

Geoffrey Irving (Feb 29 2024 at 10:29):

E.g., this definition type checks:

def IO.range {α : Type} (i : IO α) : Set α :=
  {x : α | ∃ w w', i w = EStateM.Result.ok x w'}

Henrik Böving (Feb 29 2024 at 10:30):

First things first no its not, IO is also capable of throwing errors so the proper type is IO.RealWorld -> EStateM.Result IO.Error IO.RealWorld a. Which is also the reason that your theorem above is not provable, you cannot know if IO.println will throw an error or not (it's bound by FFI after all) so there is no way of knowing if youre pure 7 would ever be reached.

Geoffrey Irving (Feb 29 2024 at 10:31):

Yes, that theorem statement is nonsense, which is why I'm looking for the right statement. I'll try to write it down in terms of IO.range.

Geoffrey Irving (Feb 29 2024 at 10:32):

I think the right theorem statement (accounting for errors) is

lemma loud_seven_eq_seven : ∀ x ∈ loud_seven.range, x = 7 := by
  intro x ⟨y,w,h⟩
  sorry

Henrik Böving (Feb 29 2024 at 10:33):

Right but you can never find out if there even exists an x that is a member of that statement so what use is it?

Geoffrey Irving (Feb 29 2024 at 10:34):

I can run it and see if it completes empirically, and know that if it does the result is correct. This seems useful.

Geoffrey Irving (Feb 29 2024 at 10:35):

The application here is proving that a program that produces images never produces an incorrect image.

Geoffrey Irving (Feb 29 2024 at 10:37):

Another example is a proof-by-reflection routine that uses an external solver, but then runs a fast proof-by-reflection check of the results. Here the result is wildly dependent on the result of the external call, but we can still write down the theorem that the reflection tactic never makes a mistake during postprocessing.

Henrik Böving (Feb 29 2024 at 10:40):

Geoffrey Irving said:

Another example is a proof-by-reflection routine that uses an external solver, but then runs a fast proof-by-reflection check of the results. Here the result is wildly dependent on the result of the external call, but we can still write down the theorem that the reflection tactic never makes a mistake during postprocessing.

Right but that would still require you to then run IO in a pure context (in this case a proof) which you are not supposed to do. The proper way to go here is as we do it in https://github.com/leanprover/leansat. We have a pure function that verifies a SAT solver proof certificate with a soundness proof. Then we run the external prover during the tactic and produce a proof term that contains the pure version of its output. Then the certificate checker can be run on that output during typechecking without any need for IO interaction remaining.

Henrik Böving (Feb 29 2024 at 10:42):

Henrik Böving said:

You don't prove things about IO basically. What you can do though is write the functions that process whatever came from IO in a pure (or also non-IO-monadic) fashion and then write proofs about these non IO functions.

which is exactly the principle described here, also know as "functional core imperative (aka monadic) shell"

Geoffrey Irving (Feb 29 2024 at 10:42):

I'm writing a program which produces images, and they're expensive to produce so I want to print a progress meter while I'm doing this. I'd like to prove this program is correct, and given the above it seems those proofs are entirely within reach. So I don't see the sense in which I'm "not supposed to do this".

Geoffrey Irving (Feb 29 2024 at 10:43):

In particular, the parallel code that wants to spit out progress reports is stitching different bits of the image together, and I don't see why it would be invalid to prove this stitching correct even if it is interleaved with printing.

Geoffrey Irving (Feb 29 2024 at 10:49):

The difference in the SAT case is that we’re running an NP-like computation where the final check is fast. Here there is no available final check: it’s just an expensive computation that doesn’t produce a certificate but has a proof a proof of correctness. So the progress meter requires me to interleave IO with the trusted part.

Eric Wieser (Feb 29 2024 at 10:49):

One option is

import Mathlib

def loud_seven : IO {n : ℕ // n = 7} := do
  IO.println "yo"
  pure ⟨7, rfl⟩

Eric Wieser (Feb 29 2024 at 10:58):

docs#SatisfiesM is similar, though I think won't help you much here

Geoffrey Irving (Feb 29 2024 at 11:45):

SatisfiesM seems like exactly the right thing!

import Mathlib

def loud_seven : IO ℕ := do
  IO.println "yo"
  pure 7

lemma loud_seven_satisfies : SatisfiesM (fun x ↦ x = 7) loud_seven := by
  rw [loud_seven]
  apply SatisfiesM.bind_pre
  apply SatisfiesM.of_true
  intro _
  exact SatisfiesM.pure rfl

Kim Morrison (Mar 01 2024 at 01:50):

That seems complicated compared to just saying something about the pure part of the code... And it doesn't tell you anything additional (because there is nothing to tell).

Geoffrey Irving (Mar 01 2024 at 07:45):

In this case the pure and monadic parts are well separated. In the original situation they are interleaved, and also combined with parallelism. So it does give me some additional clarity.

Mario Carneiro (Mar 02 2024 at 06:25):

Another idea is to be generic over the monad, and use a pure-functional monad for proving and IO for printing. This is kind of similar to the algebraic effects approach, or free monads, where you separate the code itself from the interpretation of the "print" function. Then you can e.g. run it in the Id monad and just strip out all the prints that way

Mario Carneiro (Mar 02 2024 at 06:26):

You could even run it in a mock-print monad if you wanted to prove that specific values are printed

Geoffrey Irving (Mar 02 2024 at 08:51):

Ah, the generic monad is a great idea! I’ll go with that.

Mario Carneiro (Mar 02 2024 at 08:55):

Also, just in case you weren't aware and don't need super high quality printing, you can use dbg_trace even in pure code

Last updated: May 02 2025 at 03:31 UTC