Zulip Chat Archive

Stream: lean4

Topic: Reasoning through a monadic bind

jthulhu (Nov 06 2024 at 07:56):

I'm trying to perform a compilation step, by lifting an operation that translates expressions Expr1 into Expr2 into something that takes an Array Expr1 and returns an Array Expr2. Furthermore, I have two constraints:

it is important that the size of the array doesn't change, and that I am able to prove it, because that array represents a local context, so locally bound variables are actually indices of that array, which, to avoid having to resort to [·]!, I encode as Fin n rather than Nat
the translation of each expression individually is actually a monadic action Expr1 → m Expr2 for some monad m.

If the translation of each expression didn't happen under a monad, I could simply use Array.map, for which there is a lemma that states that it preserves the size, and all would be well. But since I do have a monadic action, my code looks like

do let vars1 : Array Expr1 := ...
   let vars2 : Array Expr2 ← vars1.mapM translation
   let all_is_well : vars1.size = vars2.size := ???

now, the problem is that, when I want to prove that vars1.size = vars2.size, in the local context, vars2 is not bound, ie I just have vars2 : Array Expr2, not vars2 : Array Expr2 := ..., so the goal becomes impossible to prove.

I've managed to make this work for a concrete monad Except ε, for which I can "peek into" without doing a monadic bind, and so I can extract some more informatio. However, in my use case, the monad I'd like to work in is CoreM (or MetaM), which is a stack of monads on top of IO, which makes this technique of "peeking in the monad" without binding impossible.

In fact, I wouldn't even know how to state a general theorem that states that mapM preserves the size of the array, since the array we get is hidden in the monad, so I don't really know how to proceed.

Kim Morrison (Nov 06 2024 at 09:27):

One solution is docs#SatisfiesM, see e.g. docs#Array.size_modifyM for a simple example. However it's difficult to make practical use of this.

Kim Morrison (Nov 06 2024 at 09:27):

On the other hand, it's somewhat unusual that you want to prove something about CoreM in the first place. Could you perhaps #xy a bit for us?

jthulhu (Nov 06 2024 at 09:31):

I'd like to work in CoreM (or with additional elaboration monads in the stack) because that makes building a better UX easier, ie. to have precise error reporting, to provide precise feedback as the cursor moves around in the expressions I elaborate, things like that. For this reason, my first compilation steps (those that can fail) produce monadic actions in CoreM, rather than just, say, Except ε. I hope this makes sense.

Kim Morrison (Nov 06 2024 at 09:39):

Oh yes --- but why are you writing proofs about the behaviour of CoreM?

jthulhu (Nov 06 2024 at 09:42):

I'm not specifically interested about writing proofs about the behaviour of CoreM, but I'm working with types that depend on the size of the array, so I need to ensure that the array doesn't change size during one translation step to typecheck. If I were to replace ← vars1.mapM translation with := vars1.map translation in the above code, it would be easy to see that vars1.size and vars2.size are equal. With some effort, I managed to show that it is possible if the underlying monad is Except ε. However, the technique that I used does not generalize to CoreM, and is also very verbose with respect to the non-monadic solution.

Kim Morrison (Nov 06 2024 at 10:56):

@jthulhu, one could also use

import Batteries.Data.Vector.Basic

namespace Batteries.Vector

/-- Map a monadic function over a vector. -/
def mapM [Monad m] (f : α → m β) (v : Vector α n) : m (Vector β n) := do
  go 0 (Nat.zero_le n) .empty
where
  go (i : Nat) (h : i ≤ n) (r : Vector β i) : m (Vector β n) := do
    if h' : i < n then
      go (i+1) (by omega) (r.push (← f v[i]))
    else
      return r.cast (by omega)

end Batteries.Vector

Kim Morrison (Nov 06 2024 at 10:59):

@Mario Carneiro, is it actually possible to define something with this signature "after the fact", i.e. just using docs#Array.mapM and docs#Array.size_mapM?

Kim Morrison (Nov 06 2024 at 11:02):

(But, @jthulhu, I think the useful answer here is still to ask you to #xy further, and tell us why you want the equality of the array sizes, and see if that can just be avoided mixed in with the monadic code.)

Mario Carneiro (Nov 06 2024 at 11:42):

Kim Morrison said:

Mario Carneiro, is it actually possible to define something with this signature "after the fact", i.e. just using docs#Array.mapM and docs#Array.size_mapM?

I don't think so. This is one of the biggest issues with SatisfiesM, but I don't think it's possible to do better on general monads. I haven't been able to come up with a counterexample to just taking it as an axiom that SatisfiesM implies a lift to a subtype. Possibly this operation could itself be a typeclass and proved by induction over the monad structure

jthulhu (Nov 06 2024 at 12:20):

Kim Morrison said:

(But, jthulhu, I think the useful answer here is still to ask you to #xy further, and tell us why you want the equality of the array sizes, and see if that can just be avoided mixed in with the monadic code.)

I am embedding a (core subset of) a synchronous reactive programming language, such as Lustre, to be able to reason about its semantics in Lean. It's a class of purely functional programming language, deterministic, parallel, and such that you can statically check that they react in bounded time and memory to an input (ie. imagine the software you'd run on an airplane, a rocket or a train: you want to be sure that, whenever a sensor receives some physical input, that input will be handled within X ms). Concretely, in the language that I'd like to support, a function looks like this

let f x y z = o where
  r = x + y * 2
  l = r * z + 23
  o = l + r

At the compilation step where I am, I have translated the syntactic objects produced by Lean's parser into a structure that remembers:

the name of the function (in my example, f)
an array of variables that appear (in my example, x, y, z, r, l and o)
the output variables (in my example, o)

For each variable, I know its name, its type (but we can forget about this for now), and its body definition, which is an Option Expr. If the variable is an input variable, it has no body definition, otherwise it has one. Alongside all of this information, I have a lot of references to make error reporting precise, but we can also forget about this for now.

I want to do a compilation pass that checks that every variable used is actually bound somewhere, fail if it is not (ie. report an error to the end user in the most graceful way possible), and otherwise replace the representation of variables, from Name to a Fin n where n is the number of variables. This is very akin to a de Bruijn translation, except that we assume there is a single big binder for all the variables rather than introducing each new variable in a let, and there is a good reason for that: we want to accept (and make sense of) programs which are causal, but not necessarily statically causal. This is a fancy way of saying that we want to accept a class of programs for which the bindings cannot be statically ordered into a sequence of lets. You could think of it as if the bindings where all mutually recursively defined, implicitly.

To do so, I start by collecting an environment that maps Names of the variables which are actually defined somewhere in the function, to their index in the array. Then, I have a function which, given an expression, and such a mapping, tries to recursively explore the expression and replace every named variable by an index variable. Crucially, this function can fail, and has to interact with the user in case it does, because its failure indicates that the user has used a variable which is not bound anywhere. This is where I use Array.mapM. Then, I know that the variables in the new array live in Fin n, where n is the size of the old array. To finish, I need to prove that the old array has the same size as the new array.

Jannis Limperg (Nov 06 2024 at 13:12):

If you only use the monad for error reporting, you should be able to write a pure function that performs the renaming and returns the new data structure or, on failure, returns enough information about the non-bound variable to construct an error message. You can then fairly easily prove that this pure function has the properties you want, and on failure you can construct and throw the error.

jthulhu (Nov 06 2024 at 13:31):

Jannis Limperg said:

If you only use the monad for error reporting, you should be able to write a pure function that performs the renaming and returns the new data structure or, on failure, returns enough information about the non-bound variable to construct an error message. You can then fairly easily prove that this pure function has the properties you want, and on failure you can construct and throw the error.

I've managed to do it this way, but it's still much more verbose than I would like it to be. I was wondering whether there was a more general way to make it work regardless of the monad, that I could just "plug in". But having read the documentation of SatifiesM, I am under the impression this is a non-trivial process for which there is no general framework ready to use in Lean.

Jannis Limperg (Nov 06 2024 at 13:47):

Yeah, reasoning in a monadic context is much harder (conceptually, not just in Lean) than reasoning about pure functions. So I would avoid it whenever possible.

Kim Morrison (Nov 06 2024 at 21:00):

@jthulhu, I'm curious if the Vector.mapM I defined above helps at all with your original question.

jthulhu (Nov 07 2024 at 11:54):

Kim Morrison said:

jthulhu, I'm curious if the Vector.mapM I defined above helps at all with your original question.

Yes, it works wonderfully! At first I didn't see the difference between that implementation and Array's mapM, but I guess that's just because Array and Vector in Lean are named exactly in the opposite way as in Rust, so my brain was confused.

Notification Bot (Nov 07 2024 at 11:54):

jthulhu has marked this topic as resolved.

Shreyas Srinivas (Nov 07 2024 at 20:45):

Btw this is exactly where batteries#925 is stuck

Kim Morrison (Nov 07 2024 at 21:23):

@Shreyas Srinivas, so you're unstuck now, using my code above, right? :-)

Shreyas Srinivas (Nov 07 2024 at 22:31):

Yes it does. Thanks. That Satisfies predicate was confusing back then

Kim Morrison (Nov 08 2024 at 06:12):

@Mario Carneiro, would you take a look at batteries#1029, which enables:

example (xs : Array Expr) : MetaM { ts : Array Expr // ts.size = xs.size } := do
  let r ← satisfying (xs.size_mapM inferType)
  return r

Kim Morrison (Nov 08 2024 at 06:13):

(I feel dirty constructing the LawfulMonad MetaM instance, but ... it is allowed?)

Eric Wieser (Nov 08 2024 at 07:04):

(we already have the slightly dirty docs#instLawfulMonadIO_mathlib, and everything atop that on the monad stack is kosher)

Kim Morrison (Nov 13 2024 at 00:12):

Okay, I think this might be ready to go:

I've got the Mathlib update ready to go
I don't want to pursue @John Ericson's suggestion at this point.

Last updated: May 02 2025 at 03:31 UTC