Zulip Chat Archive

Stream: general

Topic: linter for partial functions


Quentin Januel (Aug 30 2023 at 19:20):

Hello, I'm new to lean but one of the features I absolutely love about it is that each function is total by default unless marked partial. However, this doesn't quite work as I expected because you can create non terminating functions that are not marked partial just by calling partial functions, as opposed to, for instance, Rust in which a function must be unsafe to call another unsafe function (leaving apart unsafe blocks)
As I would like to be able to see if a function terminates just by looking at its signature, is there a linter rule I can activate or something of that nature?

Yaël Dillies (Aug 30 2023 at 19:45):

I'm surprised by your claim. Can you give an example?

Kyle Miller (Aug 30 2023 at 19:47):

@Yaël Dillies You can't tell that g doesn't terminate:

partial def f (n : Nat) : Nat := f n

def g (n : Nat) := f n

#print g

Kyle Miller (Aug 30 2023 at 19:48):

#print axioms g doesn't show any axioms either

Kyle Miller (Aug 30 2023 at 19:49):

This is "fine" because Nat is inhabited, and f is opaque, so you shouldn't be able to prove anything false.

Eric Wieser (Aug 30 2023 at 19:49):

Is there an argument here that proof of termination is uninteresting, since what actually matters is termination in a tractable amount of time?

Quentin Januel (Aug 30 2023 at 19:50):

@Yaël Dillies I was about to provide the exact same example of @Kyle Miller but he was faster. As another example though, since the while syntax internally uses partial functions, the following code works:

def f : IO Unit := do
  while true do
    IO.println "hello"

Quentin Januel (Aug 30 2023 at 19:53):

@Kyle Miller I understand I wouldn't be able to prove anything false, but I would not just want to use lean as a proof assistant. I also see it as a "stronger haskell" and having each function being total unless marked otherwise felt powerful to me. Even if this is not a requirement in lean, I believe it shouldn't be too hard to statically check this and have a linter require me to mark partial every function that ever "might" not terminate (maybe I'm wrong though)

Kyle Miller (Aug 30 2023 at 19:57):

Yeah, it seems like it should be possible to write a @[terminates] attribute that asserts that a given function doesn't transitively call any opaque definitions.

Yaël Dillies (Aug 30 2023 at 19:58):

Why is partial non-poisoning, actually?

Quentin Januel (Aug 30 2023 at 19:58):

So by your reply, I assume such a thing doesn't exist yet? Also, if possible, I would very much prefer the opposite: that is, mark it as non terminating when it's not

Quentin Januel (Aug 30 2023 at 20:01):

@Yaël Dillies I don't know either, I discovered that in the "functional programming in lean" book just here:
https://leanprover.github.io/functional_programming_in_lean/monad-transformers/conveniences.html?highlight=%22infec#infinite-loops

Kyle Miller (Aug 30 2023 at 20:06):

Yaël Dillies said:

Why is partial non-poisoning, actually?

It's just a mechanism to make an opaque constant with a particular run-time interpretation. opaque is like axiom but it requires that the type be Inhabited, so it's not admitting the fact that the type is nonempty.

If it were to be poisoning, then an obvious implementation would lead to any function that calls out to code written in C++ being marked too. Even IO.println for example, which uses the opaque constant getStdout (which calls lean_get_stdout in C++).

Yaël Dillies (Aug 30 2023 at 20:06):

Is it shocking to have IO.println be partial?

Kyle Miller (Aug 30 2023 at 20:08):

No, not really, I just mentioned it because it was mentioned.

Quentin Januel (Aug 30 2023 at 20:34):

I think it would be fine to just not poison when doing FFI with C++ since this is obviously code that could go wrong and that should be checked and marked manually

Jannis Limperg (Aug 30 2023 at 20:45):

I suppose this is very unlikely to change, so the discussion about alternative schemes is a bit moot. But we can discuss how to deal with this design choice.

  • For noncomputable mathematics, there's no effect since consistency is all that matters.
  • For verified software:
    • If you do a full correctness proof, you can't use partial (or, more generally, opaque) functions anyway since you can't prove anything about them. However, @[implementedBy] can still overwrite your reference implementations with arbitrary garbage, so you have to prevent or audit that.
    • If you do anything short of a full correctness proof, it's hard to be sure that your top-level function actually terminates.

Mac Malone (Aug 30 2023 at 22:22):

@Jannis Limperg There is also the additional detail that there is no guarantee the compiler actually produces code that semantically corresponds to code you are proving things about. This leads into a key detail about current design of Lean. Proofs about code are about the Lean-level abstraction, they are not about the compiled code (hence why opaque does not poison and implemented_by does not need to come with a proof of equivalence). This is a problem for many formal verification contexts. Hopefully, however, this will slowly change once the code generator is written in Lean and we can start proving things about it.

Henrik Böving (Aug 30 2023 at 22:30):

If you want formally verified implemented_by that can just be csimp no?

Regarding the code generator, it is right now definitely not at all written in a way that allows for verification and in fact filled with partial itself (for some passes I know for sure that this is correct because they rely on additional unformalized invariants to even terminate in the first place)

On top of that there is always a C compiler or LLVM after the code generator, both with a gigantic code base that you would have to trust in addition. So if you are a person that doesn't believe in the correctness of the comparatively small code generator that we have right now, why would you believe in the LLVM/C compiler. On top of that the generated code is largely just making calls to the C++ runtime and not really doing much stuff on its own apart from branching and calling functions (both Lean and runtime ones).

That doesn't mean I think a formalized code generator would be useless, I do find the idea quite interesting. It's just that I don't feel like a formalized code generator is the actual blocker if you wish to fully trust a compiled lean program.

Henrik Böving (Aug 30 2023 at 22:42):

Yaël Dillies said:

Why is partial non-poisoning, actually?

Yaël Dillies said:

Is it shocking to have IO.println be partial?

I have a counter question to this. docs#Nat.add is tagged with extern just like many other primitive Nat operations to, also some Array operations etc. all to connect them to external C functions and make them performant. Now of course they do have a Lean body as well that is supposed to faithfully model what happens in the C++ based on the lean interpretation of the datatype (for Nat that is peano nat instead of funky multi precision nat, for Array that is a List instead of...well an array)

As I understand there are two parts to the idea here:

  1. Every function that isn't what it pretends to be to the logic should be marked as partial, this means in particular extern and implemented_by ones. If you go through with this idea Nat.add and friends would be partial and you can't prove things about them anymore so you would have to add their definitional equations as axioms. This doesn't sound like something desirable to me? Of course we could go through the effort of making these functions magical and what not to avoid this but that feels like a weird 2 class system.
  2. Making partial viral. While this is a nice idea in principle it gives the classic "function color issue" (you basically split the eco system into partial def and def and this would be a real split as the mathematicians would only care about def) which has been discussed at length by e.g. the rust folk for their async and is a thing that people disagree on to this date. On top of that as I see the only thing you get is this: You now know your function will terminate. And for this you basically give up the ability to argue over functions completley that may just use a single piece of partial somewhere. Say for example you have
def foo (x : Nat) : Nat :=
  if x = 0 then
    partialFoo x
  else
    2 * x

if partial is viral i cannot prove that if x \neq 0 foo x = 2 * x since you block me entirely from arguing over the body of the function. This does also not seem desirable to me. (note that this is not unfixable, we could rewrite foo in CPS but restructuring your code just for a termination dogma seems a bit meh)

That said I do agree it would be nice to maybe have an annotation as proposed above that checks this. Making the virality check "opt in" in parts of code that you know will not be damaged from it so to speak.

Quentin Januel (Aug 30 2023 at 23:09):

Thank you all for the explanations. Again I am very new to lean and probably don't realise the implications of what I'm trying to get, but I would like to clarify a something:
I do agree sometimes it is needed to have a partial function that will not poison, and it is the developer's responsability to make sure there won't be any issue with it. This would be needed for ffi for instance.
I don't mind that, these are implementation details burried in libraries I choose to trust.
However, when writing my own code, I feel like it should immediately know when I'm calling partial functions I wrote, unless I took the decision to mark them non partial after carefully checking them. To me, this is the same kind of difference between having an unsafe function in Rust, and a safe function that internally contain an unsafe block: the latter means the function have been verified manually.
Also I believe this is not just about a termination dogma, but rather having more precise function signatures, letting you know what they can possibly do just by reading their type. Monads provide a similar feature by making the type explicit if a function may use IO or internal state e.g. Why should partial be treated any differently?

Mac Malone (Aug 31 2023 at 00:08):

@Quentin Januel If you are curious about the rational, I think I am be able to explain. The goal of the termination checker in Lean is not primarily to verify that a function terminates. Instead, it verifies termination insofar as it is a necessary requirement to keep the logical system of the theorem prover sound. Thus, definitions like partial, opaque, or implemented_by which may be non-terminating are ignored because they are still guaranteed to be sound for other reasons. On the hand, your desire is that if a definition passes the termination check it guaranteed to terminate as long as the trusted codebase (i.e., core and manually marked functions) terminate.

I think your original idea of a linter (or something in the vein of #print axioms like #print partial) would be the best way of handling this (likely in some userland package).

Adomas Baliuka (Aug 31 2023 at 01:33):

Is it possible to prove termination for a concrete function (say without things like IO for now)? What would that look like?

Also, someone said it's not possible to do partial def for an uninhabited type. Does "uninhabited" here mean you cannot create an instance or is it about the typeclass Inhabited? For example, you can make an axiom that allows a contradiction and then partial def an instance of False(I tried). It the claim still true? (I guess it doesn't matter in this case since you already have a contradiction, so that's not a good example)

Mac Malone (Aug 31 2023 at 04:13):

ab said:

Is it possible to prove termination for a concrete function (say without things like IO for now)? What would that look like?

If the function does not make use of partial or opaque definitions then it passing the termination checker is proof it terminates. If it uses partial or opaque definitions, then it is impossible to prove termination because the details of the function are opaque to the prover. One way of verifying termination would be to unfold every definition until the definition is purely composed of constructors (also erasing and prop or type terms since they are irrelevant). If that works, the function terminates. If not, it does not.

Mac Malone (Aug 31 2023 at 04:19):

ab said:

Does "uninhabited" here mean you cannot create an instance or is it about the typeclass Inhabited?

These notions are meant to be isomorphic (i.e., if can construct an instance of type, you can also construct an instance of Inhabited for the type). You can also manually provide an instance of the type in the opaque syntax (e.g., opaque foo : A := <some instance of A>). This is useful if it being inhabited is dependent on the parameters of the opaque definition. Furthermore, technically, the requirement is actually just that it has a Nonempty instance (which is propositional rather than constructive) -- an Inhabited instance implies a Nonempty one, but not vice versa (because Nonempty, since it is propositional, can use axioms / classical logic to generate instances as opposed to constructive logic) and an implementation (e.g. extern / implemented_by) or is marked noncomputable.

Adomas Baliuka (Aug 31 2023 at 05:59):

@Mac Malone thanks for the answer. What I meant by "proof" was actually being able to write (and prove) something like

def f (x : Nat) : Nat := <... definition depending on definitions of definitions etc.>

theorem terminates_f : Terminates f := ...

I.e., encoding the statement within Lean. Can one prove that "after expanding definitions, f is a composition of constructors"?

Mac Malone (Aug 31 2023 at 06:05):

@ab That seems difficult. For one, even formalizing Terminates f is hard. (And, of course, in the general case, is the halting problem.) I am not sure how would go or about this or if this is even possible within Lean's theory.

Adomas Baliuka (Aug 31 2023 at 06:10):

@Mac Malone maybe an easier question first: Is it possible to formalize and prove "f is a constructor for some inductive type"?

Mac Malone (Aug 31 2023 at 06:14):

@ab I would imagine probably not. By the substitution rule for equality anything you can prove about a construct you can prove about something equal to a construct (which, e.g., could be an opaque definition). The only way I can think of doing this at the moment would be proving it about Lean's metalanguage itself (i.e., about Expr and Environments and the such) and that is not very amendable to proofs (i.e., it uses at lot of partials and opaques).

Mac Malone (Aug 31 2023 at 06:16):

This is why I think a linter would be easier. That way you can just reason about definitions programmatically and check whether they can reduce to such a result. Its not quite the level the guarantee of a theorem, but it seems good enough for most use cases (as even the theorem as caveats like the compiler actually producing semantically correct code).

Quentin Januel (Aug 31 2023 at 06:19):

@Mac Malone Thank you, I understood the primary goal of the termination checker was about soundness of the logical system, but just as lean can be used both as a proof assistant and a programming language, the termination checker might just as well have several benefits. Also yes I completely agree: I didn't expect lean to directly provide such a solution, a linter would be the best in this context I guess. Sadly since such a thing does not exist yet I suppose I will just have to check every once in a while if someone ever makes that (as I don't see myself able to do that unfortunately)

Quentin Januel (Aug 31 2023 at 06:23):

Yeah about the halting problem, I felt like it was because everyone just didn't want to throw away turing completeness. I first thought that lean wasn't turing complete and was interested to see how powerful it could still be with this restriction, but I was just wrong

Damiano Testa (Aug 31 2023 at 06:24):

I think that the undecidability of the halting problem (or undecidability more generally) simply means that you cannot hope to prove every true statement. This is very far from being actually able to prove many true statements!

Mario Carneiro (Aug 31 2023 at 06:38):

Here's a basic implementation based on #print axioms:

import Lean

open Lean Elab Command

namespace CollectPartial

structure State where
  visited : NameSet := {}
  opaques : Array Name := #[]

abbrev M := ReaderT Environment $ StateT State MetaM

partial def collect (c : Name) : M Unit := do
  let collectExpr (e : Expr) : M Unit := e.getUsedConstants.forM collect
  let s  get
  unless s.visited.contains c do
    modify fun s => { s with visited := s.visited.insert c }
    let env  read
    match env.find? c with
    | some (ConstantInfo.ctorInfo _)
    | some (ConstantInfo.recInfo _)
    | some (ConstantInfo.inductInfo _)
    | some (ConstantInfo.quotInfo _)   =>
      pure ()
    | some (ConstantInfo.defnInfo v)
    | some (ConstantInfo.thmInfo v)    =>
      unless  Meta.isProp v.type do collectExpr v.value
    | some (ConstantInfo.axiomInfo v)
    | some (ConstantInfo.opaqueInfo v) =>
      unless  Meta.isProp v.type do
        modify fun s => { s with opaques := s.opaques.push c }
    | none                             =>
      modify fun s => { s with opaques := s.opaques.push c }

end CollectPartial

elab "#print" "partial" name:ident : command => do
  let constName  resolveGlobalConstNoOverloadWithInfo name
  let env  getEnv
  let (_, s)  liftTermElabM <| ((CollectPartial.collect constName).run env).run {}
  if s.opaques.isEmpty then
    logInfo m!"'{constName}' does not use any partial definitions"
  else
    logInfo m!"'{constName}' depends on opaques: {s.opaques.toList}"

partial def foo : Unit  Nat := foo
def bar : Unit  Nat := foo

#print partial bar

Damiano Testa (Aug 31 2023 at 06:53):

By the way, this catches sorry also:

def no : Unit  Nat := sorry

#print partial no  --  'no' depends on opaques: [sorryAx]

Quentin Januel (Aug 31 2023 at 10:14):

@Mario Carneiro Wow, this is wonderful thank you so much!
Is it easy to make it print the warning when the function is declared, except if the said function is also declared partial (similarly to the message "declaration uses sorry" which doesn't need to be checked manually afterwards)? I already changed it so that it prints a warning if there are opaques, or nothing otherwise, but I am having troubles with that since afaik "elab" is not documented yet

Mario Carneiro (Aug 31 2023 at 10:15):

elab is how you declare new syntax and give it an implementation, it's basically (one of) the entry points to lean metaprogramming

Mario Carneiro (Aug 31 2023 at 10:17):

You can move the check to the function using an attribute, e.g. you write @[terminating] def no : Unit -> Nat := sorry and it gives a warning

Mario Carneiro (Aug 31 2023 at 10:18):

You could also use a linter if you want to have no syntax at the point of use, although this is a bit hackish

Quentin Januel (Aug 31 2023 at 10:41):

So you're saying elab wouldn't work to override existing syntax (for instance to, say, replace any "<declaration"> by "#print partial ident; <declaration>"?

Well I don't want to use attributes because it kills the point of forcing me to declare my function as partials. Are there already existing linters?

Mario Carneiro (Aug 31 2023 at 10:42):

elab declares new syntax and attaches rules to that syntax. You could declare syntax exactly matching def and with a higher priority, and then any future defs would use your syntax and apply its rules

Mario Carneiro (Aug 31 2023 at 10:43):

elab_rules attaches rules to existing syntax. You could use this to override the behavior of def without changing the syntax

Quentin Januel (Aug 31 2023 at 10:43):

I don't mind about it being hackish. Also, talking about hackish, if anyone's interested I modified your solution so that it doesn't raise a warning for "IO"s since those are already visible in the function signature:

import Lean

open Lean Elab Command

namespace CollectPartial

structure State where
  visited : NameSet := {}
  opaques : Array Name := #[]

abbrev M := ReaderT Environment $ StateT State MetaM

-- feels very hackish but nvm
def isIO : String  Bool :=
  (. |>.takeWhile (. != ' ') |> (. == "BaseIO"))

partial def collect (c : Name) : M Unit := do
  let collectExpr (e : Expr) : M Unit := e.getUsedConstants.forM collect
  let s  get
  unless s.visited.contains c do
    modify fun s => { s with visited := s.visited.insert c }
    let env  read
    match env.find? c with
    | some (ConstantInfo.ctorInfo _)
    | some (ConstantInfo.recInfo _)
    | some (ConstantInfo.inductInfo _)
    | some (ConstantInfo.quotInfo _) =>
      pure ()
    | some (ConstantInfo.defnInfo v)
    | some (ConstantInfo.thmInfo v) =>
      unless  Meta.isProp v.type do
        collectExpr v.value
    | some (ConstantInfo.axiomInfo v)
    | some (ConstantInfo.opaqueInfo v) =>
      unless  Meta.isProp v.type do
        unless isIO (ToString.toString v.type) do
          modify fun s => { s with opaques := s.opaques.push c }
    | none =>
      modify fun s => { s with opaques := s.opaques.push c }

end CollectPartial

elab "#print" "partial" name:ident  : command => do
  let constName  resolveGlobalConstNoOverloadWithInfo name
  let env  getEnv
  let (_, s)  liftTermElabM <| ((CollectPartial.collect constName).run env).run {}
  unless s.opaques.isEmpty do
    logWarning m!"'{constName}' may not terminate due to:\n{s.opaques.toList}."

Mario Carneiro (Aug 31 2023 at 10:44):

The part that is hackish about using linters is that you have to re-parse the syntax to find out the declaration that was made, and it's easy to miss one if you don't understand the syntax

Quentin Januel (Aug 31 2023 at 10:44):

Mario Carneiro said:

elab_rules attaches rules to existing syntax. You could use this to override the behavior of def without changing the syntax

This one seems interesting

Mario Carneiro (Aug 31 2023 at 10:45):

You still have to handle all the behavior of def though with that option, it's somewhat heavyweight

Mario Carneiro (Aug 31 2023 at 10:45):

you would basically be adapting whatever code def uses to do its thing

Mario Carneiro (Aug 31 2023 at 10:47):

Also, talking about hackish, if anyone's interested I modified your solution so that it doesn't raise a warning for "IO"s since those are already visible in the function signature

I think you have made an unjustified assumption here that "IO" is visible in the function signature. The function originally selected might not say anything about IO

Mario Carneiro (Aug 31 2023 at 10:50):

def loop : IO Unit := IO.iterate () fun _ => pure (.inl ())

def evil (_ : Unit) : Bool :=
  loop () matches .ok ..

Quentin Januel (Aug 31 2023 at 10:53):

Hmm, but in this case that's not a problem since IO.iterate should be reported as it allows to make infinite loops
I just wanted to turn off the warnings for functions like stdOut and stdIn

Quentin Januel (Aug 31 2023 at 10:53):

Or did I miss your point?

Mario Carneiro (Aug 31 2023 at 10:53):

I think you should specifically exempt opaques that are known to be terminating

Mario Carneiro (Aug 31 2023 at 10:54):

e.g. using an attribute to tag them

Quentin Januel (Aug 31 2023 at 10:54):

Well that would surely be the best option but wouldn't this be way more complicated?

Mario Carneiro (Aug 31 2023 at 10:55):

I mean, each tag comes with some soundness argument so it seems good to have it explicit by default

Mario Carneiro (Aug 31 2023 at 10:55):

it's not particularly complicated to have a whitelist in that code

Quentin Januel (Aug 31 2023 at 10:55):

Oh I think I see

Quentin Januel (Aug 31 2023 at 10:57):

Well thank you for all the help, it's very good to know that lean is powerful enough to express all of this by itself. I'm really not sure I'll be able to come with a solution though, but if I do I will post it here

Mario Carneiro (Aug 31 2023 at 11:04):

Here's how you make an attribute whitelist:

import Lean

open Lean Elab Command

initialize terminatesExt : SimplePersistentEnvExtension Name NameSet 
  registerSimplePersistentEnvExtension {
    addEntryFn := (·.insert)
    addImportedFn := mkStateFromImportedEntries (·.insert) {}
  }

initialize registerBuiltinAttribute {
  name := `terminates
  descr := "Asserts that an opaque is terminating"
  add := fun decl _ kind  do
    unless kind == AttributeKind.global do
      throwError "invalid attribute 'terminates', must be global"
    modifyEnv (terminatesExt.addEntry · decl)
}

namespace CollectPartial

structure State where
  visited : NameSet := {}
  opaques : Array Name := #[]

abbrev M := StateT State MetaM

syntax foo := ws
partial def collect (c : Name) : M Unit := do
  let collectExpr (e : Expr) : M Unit := e.getUsedConstants.forM collect
  let s  get
  unless s.visited.contains c do
    modify fun s => { s with visited := s.visited.insert c }
    let env  getEnv
    match env.find? c with
    | some (ConstantInfo.ctorInfo _)
    | some (ConstantInfo.recInfo _)
    | some (ConstantInfo.inductInfo _)
    | some (ConstantInfo.quotInfo _)   =>
      pure ()
    | some (ConstantInfo.defnInfo v)
    | some (ConstantInfo.thmInfo v)    =>
      unless  Meta.isProp v.type do collectExpr v.value
    | some (ConstantInfo.axiomInfo v)
    | some (ConstantInfo.opaqueInfo v) =>
      unless  Meta.isProp v.type do
        unless (terminatesExt.getState env).contains c do
          modify fun s => { s with opaques := s.opaques.push c }
    | none                             =>
      modify fun s => { s with opaques := s.opaques.push c }

end CollectPartial

initialize registerBuiltinAttribute {
  name := `terminating
  descr := "Checks that a definition does not depend on any nonterminating functions"
  applicationTime := .afterCompilation
  add := fun decl _ kind  do
    unless kind == AttributeKind.global do
      throwError "invalid attribute 'terminates', must be global"
    let ((_, s), _)  Meta.MetaM.run <| (CollectPartial.collect decl).run {}
    unless s.opaques.isEmpty do
      logError m!"'{decl}' depends on opaques: {s.opaques.toList}"
}

and then the test code has to be in a separate file:

import MyLib.TerminatesAttr

partial def foo : Unit  Nat := foo
@[terminating] def bar : Unit  Nat := foo -- error

@[terminates] partial def safe : Unit  Nat := safe
@[terminating] def baz : Unit  Nat := safe -- ok

Mario Carneiro (Aug 31 2023 at 11:11):

and then you could have things like

attribute [terminates] IO.getStdout IO.getStderr

to whitelist IO functions you trust

Quentin Januel (Sep 01 2023 at 14:52):

@Mario Carneiro Sorry for the late reply but this is amazing, thank you so much for all the help!!


Last updated: Dec 20 2023 at 11:08 UTC