Zulip Chat Archive

Stream: general

Topic: An adhoc way for proving loop invariants

Frederick Pu (Sep 27 2024 at 19:30):

I want to make a custom notation (elaborator) that allows you to prove loop invariants internally within lean's do notation using a similar sort of notation that is used for termination_by.
Namely:

def f (n : Nat) : Nat := Id.run do
    let mut x := 0
    let mut z := 2
    for y in List.range n do
--     ...
    perserving h : x + z = 2 by `stx`
    w : x + z = 2 := h

gets elaborated into

def f (n : Nat) : Nat := Id.run do
  let mut x := 0
  let mut z := 2
  let mut h' : Subtype (fun (⟨x, z⟩ : Nat × Nat) => x + z = 2) := ⟨(x, z), rfl⟩
  for y in List.range n do
    let h := h'.2
    ⟨x, z⟩ := h'.1
    -- ...
    h' := ⟨(x, z), by `stx`⟩
  h := h'.property
  w : x + z = 2 := h

Note that the w : x + z = 2 := h is just to demonstrate that the loop invariant is accessible outside of the loop body.
Right now I've mainly been looking at Lean.Predefinition.TerminationArgument for inspiration on how to implement the elaborator. Does anyone have any tips on what the general approach or structure would be for implementing an elaborator like this? (also any feedback on the design of the dsl/notation would also be appreciated)

Asei Inoue (Sep 28 2024 at 13:35):

nice idea!

Eric Wieser (Sep 28 2024 at 16:49):

One thing that complicates this is that the ... could contain break or return or continue

Mario Carneiro (Sep 28 2024 at 16:57):

I think @Jeremy Avigad had ideas for such a syntax, we discussed this problem a few months ago

Jeremy Avigad (Sep 28 2024 at 21:08):

@Zach Battleman, Mario, and I did some experiments on this the summer before last. IIRC, our main conclusion was that a small variation on existing do notation would do the trick, namely, allowing mutable variables to have dependent types. Then you could write something like:

def f (n : Nat) : Nat := Id.run do
    let mut x := 0
    let mut z := 2
    let mut h : x + z = 2 := by ...
    for y in List.range n do
      x := ...
      z := ...
      h := ...

In other words, you update the proof of the the invariant the same way you update the variables. Mario had an experimental alternative implementation of Lean's do notation and I think we convinced ourselves that the modifications needed to handle the dependency would be reasonably straightforward, but we never had an implementation.

Another conclusion we came to is that in the definitions of x, z, and h it is useful to refer to the previous (shadowed) definitions of x and z. One can do this manually in the current framework by renaming them. We talked about having special notation like old x and old z to refer to the previous values. I wasn't crazy about that, but I don't have a better idea.

Finally, I think we decided that the most general thing we would want to handle is a while loop:

let mut x := 10
let mut y := 15
let mut inv : y = x + 5 := ...

while cond : x < 30 do
  x := x + 5 -- shadows old x
  y := y + 5 -- shadows old y
  inv := ...
  have : 100 - x < 100 - old x := ...
termination_by 100 - x

After the while loop, the context should have cond : ¬ x < 30. for and repeat ... until can be defined from that.

I'd love to see something like this implemented! We called the project "lafny," i.e. our take on a Lean version of the Dafny system.

Frederick Pu (Sep 29 2024 at 01:58):

Yeah I'd defintely be interested in implementing dependently typed mutable variables and also getting termination by support for while loops. Would it be possible to get some mentorship for you guys since I'm still pretty new to the Lean/Parser/Do and Lean/Elab/Do code files?

Kim Morrison (Sep 29 2024 at 04:01):

This would be really cool, and is a huge missing piece of the current story for do blocks. Seeing more prototypes in this direction would be nice. (Remember, of course, that actually getting a solution into Lean is going to be a process with many moving parts, and will require the FRO to have the bandwidth to study and invest in a particular solution; right now that isn't there, but we do really want to get to this!)

Frederick Pu (Sep 29 2024 at 04:38):

Is there any resource explaining how the current mutable var system works for Lean's do notation? I'm having trouble understanding the big picture when reading Lean/Elab/Do

Kim Morrison (Sep 29 2024 at 04:53):

Apologies if this is a dumb question, but have you read "Do unchained"?

Frederick Pu (Sep 29 2024 at 05:08):

Kim Morrison said:

Apologies if this is a dumb question, but have you read "Do unchained"?

Yeah I have, although maybe not as thoroughly as I should have. I'm but I don't think that paper explains how variables are stored in the Context type. Since it seems to mainly focus on the semantics and soundness properties of lean's do notation

Frederick Pu (Sep 29 2024 at 05:14):

oh do do you mean I should right a reference implementation in a form similar to do' as a demo first and then try integrate into the actual do notation?

Henrik Böving (Sep 29 2024 at 08:57):

Kim Morrison said:

Apologies if this is a dumb question, but have you read "Do unchained"?

Do unchained is also not how the implementation currently actually works. Its more of a suggestion on how one could make an academically pleasing implementation. The current do elaborator is very different

Sebastian Ullrich (Sep 29 2024 at 09:51):

Which does make it much easier to prototype extensions in

Patrick Massot (Sep 29 2024 at 10:12):

Frederick, I’m afraid Henrik is not explicit enough. Lean.Elab.Do alone is 1800 lines of delicate mission critical code. There is no doubt that people would love to have efficient ways to prove things about programs written using all the great do notation goodies. And it’s great to experiment with that. But pushing those experimenting to core Lean is a completely different story. It would most certainly require to first try your hand at a lot of easier tasks, building your skills and trust from the core team. I don’t want to cut your momentum, and I’m not part of the core team. I only want you to have realistic expectation to avoid frustration.

Zach Battleman (Sep 29 2024 at 13:57):

Oh wow! It's great to see that people are interested in this! @Jeremy Avigad basically covered the story up to now about what we worked on, and admittedly it's been some time since I took a look at this, but I'm really excited to hear that there is interest in this! I don't have the bandwidth right now to work on it, but I'd be very excited to see where this develops and I'd be eager to return to it in the future.

@Jeremy Avigad is the repo owner so once he makes it public he can post it here (discussed over email)

Frederick Pu (Sep 29 2024 at 16:28):

Jeremy Avigad said:

Zach Battleman, Mario, and I did some experiments on this the summer before last. IIRC, our main conclusion was that a small variation on existing do notation would do the trick, namely, allowing mutable variables to have dependent types. Then you could write something like:
def f (n : Nat) : Nat := Id.run do
    let mut x := 0
    let mut z := 2
    let mut h : x + z = 2 := by ...
    for y in List.range n do
      x := ...
      z := ...
      h := ...
In other words, you update the proof of the the invariant the same way you update the variables. Mario had an experimental alternative implementation of Lean's do notation and I think we convinced ourselves that the modifications needed to handle the dependency would be reasonably straightforward, but we never had an implementation.

Another conclusion we came to is that in the definitions of x, z, and h it is useful to refer to the previous (shadowed) definitions of x and z. One can do this manually in the current framework by renaming them. We talked about having special notation like old x and old z to refer to the previous values. I wasn't crazy about that, but I don't have a better idea.

Finally, I think we decided that the most general thing we would want to handle is a while loop:
let mut x := 10
let mut y := 15
let mut inv : y = x + 5 := ...

while cond : x < 30 do
  x := x + 5 -- shadows old x
  y := y + 5 -- shadows old y
  inv := ...
  have : 100 - x < 100 - old x := ...
termination_by 100 - x
After the while loop, the context should have cond : ¬ x < 30. for and repeat ... until can be defined from that.

I'd love to see something like this implemented! We called the project "lafny," i.e. our take on a Lean version of the Dafny system.

K, so I have a concern about the dependently typed mutable variables.
Since h has a type dependent on x and z, wouldnt that mean you would need to update h everytime either x or z is updated? What mechanism allows you to procrasinate the proof h until the end of the do block inside of the for body?

I was thinking that whenever you break or coninue or reach the end of the loop body you need to emit a proof that the loop invariant holds somehting like

let mut x := 0
let mut z := 0
for y in List.range n perserving x + z = 2 := (by ...) do
       x := ....
       z := .....
       if y % 2 == 0 then
              continue emit (`proof of invariant`)
       if y % 3 == 0 then
              break emit (`proof of invariant`)
       emit `proof of invariant`

Overall i feel like having dependently typed mutable vars might be too aggressive of an abstraction for being able to prove loop invariants since the loop invariant would need to updated everytime the anytime the mutables vars it's responsible for are updated.

Instead, you should only need to prove the invariant whenever you are completing the current iteration of the loop (ie continueing to the end of the loop body, breaking out of hte loop body, or have reached the end of the loop body)

Frederick Pu (Sep 29 2024 at 16:36):

Patrick Massot said:

Frederick, I’m afraid Henrik is not explicit enough. Lean.Elab.Do alone is 1800 lines of delicate mission critical code. There is no doubt that people would love to have efficient ways to prove things about programs written using all the great do notation goodies. And it’s great to experiment with that. But pushing those experimenting to core Lean is a completely different story. It would most certainly require to first try your hand at a lot of easier tasks, building your skills and trust from the core team. I don’t want to cut your momentum, and I’m not part of the core team. I only want you to have realistic expectation to avoid frustration.

yeah im not to worried about that since I a) havent fleshed out how I want to implement the loop invariant thing b) have not implemented it yet lol

Patrick Massot (Sep 29 2024 at 16:38):

That’s great, it means you know from the beginning what you can expect. And I hope good things will come out of this work, one way or another.

Mario Carneiro (Sep 29 2024 at 17:23):

Frederick Pu said:

Since h has a type dependent on x and z, wouldnt that mean you would need to update h everytime either x or z is updated? What mechanism allows you to procrasinate the proof h until the end of the do block inside of the for body?

If you don't update h in the block, you will get a type error because the desugared code will attempt to pass the original h to the next iteration, and it will have the wrong type for that

Mario Carneiro (Sep 29 2024 at 17:24):

Other than that, there isn't really anything that needs to be done implementation wise, except to remove the piece of code which lean uses to explicitly force the type not to change when you use h := ... syntax. It's actually quite easy to implement since it's negative code

Mario Carneiro (Sep 29 2024 at 17:27):

Frederick Pu said:

Overall i feel like having dependently typed mutable vars might be too aggressive of an abstraction for being able to prove loop invariants since the loop invariant would need to updated everytime the anytime the mutables vars it's responsible for are updated.

I think this is a misunderstanding, you don't need to update h immediately after x and z here, but if you don't you will see in the infoview that it has a type referring to the old shadowed values of x and z, and you won't be able to end the loop until you've updated h to have a type consistent with the values of x and z that will be used in the next iteration.

Frederick Pu (Sep 29 2024 at 17:34):

Mario Carneiro said:

I think this is a misunderstanding, you don't need to update h immediately after x and z here, but if you don't you will see in the infoview that it has a type referring to the old shadowed values of x and z, and you won't be able to end the loop until you've updated h to have a type consistent with the values of x and z that will be used in the next iteration.

ok so if I'm understanding correctly, you need give the elaborator enough information to prove loop invariant by the end of the loop or it'll just treat the x and z modifications you made as shadowed variables?
But you can't shadow a mutable variable so is there some weird trick to make that work?

Mario Carneiro (Sep 29 2024 at 17:35):

mutable variables are desugared to shadowing let bindings

Mario Carneiro (Sep 29 2024 at 17:36):

plus some extra sauce so that modifications inside control structures are propagated to the subsequent scope

Mario Carneiro (Sep 29 2024 at 17:39):

#check show Id Nat from do
  let mut x := 1
  have _h : x = x := rfl
  x := 2
  -- _h : x✝ = x✝
  pure x

-- let_fun this :=
--   let x := 1;
--   let_fun _h := ⋯;
--   let x := 2;
--   pure x;
-- this : Id Nat

Frederick Pu (Sep 29 2024 at 17:42):

so the no shadowing mutable variables is more like a linter error generated by do notation?

Frederick Pu (Sep 29 2024 at 17:48):

also what happens if multiple variables dependent on the same variable x . Then you would need to both them to be proven in order for x to be unshadowed no?

Frederick Pu (Sep 29 2024 at 17:56):

also i dont see the definition of for ... in for the do unchained reference implementation

Mario Carneiro (Sep 29 2024 at 18:05):

Only variables that are mutated in the loop need to have up to date types

Mario Carneiro (Sep 29 2024 at 18:06):

or rather, when a variable is mutated in the loop anything depending on it also needs to be updated when you go around the loop

Frederick Pu (Sep 29 2024 at 18:08):

Mario Carneiro said:

or rather, when a variable is mutated in the loop anything depending on it also needs to be updated when you go around the loop

so how do you know that all of it's corresponding types have been updated? that seems like you'd have to do some reference counting or smth.

Frederick Pu (Sep 29 2024 at 18:09):

cause i thought it was just like when you prove h you also unshadow x and y. But if you have both h1 and h2 depending on x, then that trick wont work.

Frederick Pu (Sep 29 2024 at 18:35):

im working through the do unchained reference implementation right now and I'm getting the following error:

import Lean

declare_syntax_cat stmt

syntax "do'" stmt : term
syntax "if" term "then" stmt "else" stmt:1 : stmt
macro "{" s:stmt "}" : stmt => `($s)
-- application type mismatch
--   s.raw
-- argument
--   s
-- has type
--   Lean.TSyntax `stmt : Type
-- but is expected to have type
--   Lean.TSyntax `term : Type

Jeremy Avigad (Sep 29 2024 at 18:55):

I am happy to see the activity here. I should have made it clear when I said that I would love to see an implementation that what I am hoping for is a prototype proof of concept and some examples of how it can be used. That would help show that Lean can be used to verify programs in the precondition/postcondition/assertion/invariant style of doing things. That could be helpful to the Lean development team down the line if / when they revisit do notation one day.

This was a undergraduate research project for @Zach Battleman the summer before last, between his second and third years at CMU. By the end of the summer, we had worked out the syntax described above on paper and the translation to ordinary functional programs in Lean. Zach also manually worked out some examples from the Dafny book to convince ourselves that our approach was robust enough to support them. The repository is here. It wasn't meant for public use, but perhaps it will be useful. The repository also contains some of Mario's experiments implementing the syntax.

Mario Carneiro (Sep 29 2024 at 19:37):

Frederick Pu said:

Mario Carneiro said:

or rather, when a variable is mutated in the loop anything depending on it also needs to be updated when you go around the loop

so how do you know that all of it's corresponding types have been updated? that seems like you'd have to do some reference counting or smth.

No, just typechecking. If you don't update these values, then the original variables will be passed, but the type of the continuation function to go around the loop expects the values to have compatible types so this is a type error.

Mario Carneiro (Sep 29 2024 at 19:39):

There is no "unshadowing" going on. When you update h it's now another variable shadowing the original h, and this one has a type mentioning the new x and not the old one. If you update h but not x or vice versa, then it won't be the case that <x, h> lives in the appropriate sigma type.

Frederick Pu (Sep 29 2024 at 20:59):

Mario Carneiro said:

Frederick Pu said:

Mario Carneiro said:

or rather, when a variable is mutated in the loop anything depending on it also needs to be updated when you go around the loop

so how do you know that all of it's corresponding types have been updated? that seems like you'd have to do some reference counting or smth.

No, just typechecking. If you don't update these values, then the original variables will be passed, but the type of the continuation function to go around the loop expects the values to have compatible types so this is a type error.

What would the type of the continuation function be if you have h1 h2 dependent on x?
Would it be like Subtype (fun x => h1[x/x] ^ h2[x/x]) -> M (Subtype (fun x => h1[x/x] ^ h2[x/x]))
?

Mario Carneiro (Sep 29 2024 at 21:04):

cont : (x : T) -> (h1 : foo x) -> (h2 : bar x) -> M result

Mario Carneiro (Sep 29 2024 at 21:06):

in the desugaring this function is called as cont x h1 h2 with whatever the "current values" of x,h1,h2 are (i.e. the result of the last mutation to the variables), so h1 needs to have the type foo x and if not (e.g. if it's foo x✝ instead) it's a type error

Eric Wieser (Sep 29 2024 at 21:26):

Do you declare it a type error, or do you just drop h1 from the continuation? I'm thinking of a while loop that preserves an invariant related to termination, but then uses break at a point where it is no longer preserved

Mario Carneiro (Sep 29 2024 at 21:32):

when it's not used, it's not passed so there is no problem

Mario Carneiro (Sep 29 2024 at 21:32):

this is the loop continue continuation we're talking about

Frederick Pu (Sep 29 2024 at 21:45):

so how is the continuation function used?

Frederick Pu (Sep 29 2024 at 21:46):

is it called within the for loop after desugaring or smth?

Mario Carneiro (Sep 29 2024 at 21:48):

yes, the body of the loop concludes by calling one of the available continuations corresponding to the kind of exit (break, continue, return, end of block = continue)

Frederick Pu (Sep 29 2024 at 21:49):

i guess the one way it would differ from a loop invariant is you dont need to prove the loop invariant if you're returning. but i guess x would be scoped anyways in that case

Mario Carneiro (Sep 29 2024 at 21:49):

This is implemented in the do implementation Jeremy linked, https://github.com/avigad/lafny-experiments/blob/main/Lafny/Do.lean

Frederick Pu (Sep 29 2024 at 21:53):

Frederick Pu said:

im working through the do unchained reference implementation right now and I'm getting the following error:

import Lean

declare_syntax_cat stmt

syntax "do'" stmt : term
syntax "if" term "then" stmt "else" stmt:1 : stmt
macro "{" s:stmt "}" : stmt => `($s)
-- application type mismatch
--   s.raw
-- argument
--   s
-- has type
--   Lean.TSyntax `stmt : Type
-- but is expected to have type
--   Lean.TSyntax `term : Type

im sorry to bring this up again but i think it might have gotten buried.

Mario Carneiro (Sep 29 2024 at 21:53):

macro "{" s:stmt "}" : stmt => `(stmt| $s)

Mario Carneiro (Sep 29 2024 at 21:54):

the way these things get parsed changed a bit since that was written

Frederick Pu (Sep 29 2024 at 22:02):

also is there a way to verify that a parser/elaborator is correct with respect to a reference implmentation using lean's hygenic macro system?

Mario Carneiro (Sep 29 2024 at 22:03):

no, proofs about tactics and other metaprogramming is more or less completely unsupported

Mario Carneiro (Sep 29 2024 at 22:04):

MetaCoq has an interesting system for verified metaprogramming called Template-Coq, so in theory Lean4Lean could do the same thing and build a verified metaprogramming system on top

Frederick Pu (Sep 29 2024 at 22:08):

what about just the elaborator, since i think this change probably doesn't change the parser at all

Frederick Pu (Sep 29 2024 at 22:09):

cause arent most of the monad calls used to get fresh metavariables and stuff like that?

Mario Carneiro (Sep 29 2024 at 22:10):

the elaborator is the most complicated part of lean, and it has access to a ton of state, including IO

Frederick Pu (Sep 29 2024 at 22:11):

so it's like impossible to seperate the semantics of elab/do.lean from any of the sideeffect calls it makes right?

Mario Carneiro (Sep 29 2024 at 22:12):

it's possible but not without rewriting parts of core

Mario Carneiro (Sep 29 2024 at 22:12):

and even if you did it would be really crazy complicated to reason about

Frederick Pu (Sep 29 2024 at 22:13):

maybe it'll be possible with lafney 2051

Mario Carneiro (Sep 29 2024 at 22:13):

you would be better off writing another tactic framework from scratch with verification in mind

Frederick Pu (Sep 30 2024 at 00:48):

now getting the following errors

import Lean


declare_syntax_cat stmt
syntax "do'" stmt : term
syntax "let" ident "←" stmt ";" stmt:1 : stmt
syntax "if" term "then" stmt "else" stmt:1 : stmt
macro "{" s:stmt "}" : stmt => `(stmt | $s)

syntax "d!" stmt : term

macro_rules | `(d! let $x ← $s; $s') => `((d! $s) >>= fun $x => (d! $s'))

declare_syntax_cat expander
syntax "expand!" expander "in" stmt : stmt
syntax "mut" ident : expander

macro_rules | `(stmt| expand! mut $y in $e:stmt) => `(stmt| StateT.lift $e)
-- unexpected identifier; expected stmtLean 4
-- (for StateT.lift)

syntax "return" : expander syntax "break" : expander syntax "lift" : expander

macro_rules
| `(stmt| expand! $exp in if $e then $s1 else $s2) =>
`(stmt| if $e then expand! $exp in $s1 else expand! $exp in $s2)

macro_rules
| `(stmt| expand! mut $y in let $x ← $s; $s') =>
if x == y then
throw $ Lean.Macro.Exception.error x s!"cannot shadow 'mut' variable '{x.getId}'"
else
`(stmt| let $x ← expand! mut $y in $s; let $y ← get; expand! mut $y in $s')
-- elaboration function for 'command.pseudo.antiquot' has not been implemented
-- (for `get` `y `and `s'`)

Frederick Pu (Sep 30 2024 at 00:49):

also im assuming that do unchained doesn't show a full reference implementation since the syntax for let <- is missing and so is that for for loops

Frederick Pu (Sep 30 2024 at 01:20):

also doesnt the mut in syntax declaration contradict the macro?

declare_syntax_cat expander
syntax "expand!" expander "in" stmt : stmt
syntax "mut" ident : expander

macro_rules | `(stmt| expand! mut $y in $e:term) => `(stmt| StateT.lift $e)

Frederick Pu (Sep 30 2024 at 01:26):

ig the problem goes away if you add syntax term : stmt but that's kinda sketch

Mario Carneiro (Sep 30 2024 at 06:22):

the real do notation does in fact have syntax term : stmt

Mario Carneiro (Sep 30 2024 at 06:23):

Frederick Pu said:

also doesnt the mut in syntax declaration contradict the macro?

I don't know what you mean by this

Frederick Pu (Sep 30 2024 at 19:28):

Mario Carneiro said:

Frederick Pu said:

also doesnt the mut in syntax declaration contradict the macro?

I don't know what you mean by this

I just mean that for the macro rule it says e is a term but according to the syntax definition it should be a stmt

Frederick Pu (Sep 30 2024 at 20:07):

also just to confirm, the do unchained paper doesn't describe a full reference implementation using hygenic macros just a rough sketch of what you would need to do?

Patrick Massot (Sep 30 2024 at 20:12):

Yes.

Patrick Massot (Sep 30 2024 at 20:12):

Everything in that paper is much simpler than the real thing (and also the real thing involved since that paper).

Frederick Pu (Oct 07 2024 at 18:30):

so what's the difference between using (registerBuiltinParser and the leading_parser macro) and registering a syntax category? Do builtin parsers actually get compiled into the lean source code or are they still part of the Init library?

Frederick Pu (Oct 07 2024 at 18:33):

also, if the latter is true, i wonder if there could be a way of testing builtinParsers in a sandboxed enviroment (so that they have behavior to syntax categories). That would probably provide a nice debugging tool and also allow for some level of (albeit very scuffed) formal verification

Last updated: May 02 2025 at 03:31 UTC