Zulip Chat Archive
Stream: general
Topic: linter for partial functions
Quentin Januel (Aug 30 2023 at 19:20):
Hello, I'm new to lean but one of the features I absolutely love about it is that each function is total by default unless marked partial. However, this doesn't quite work as I expected because you can create non terminating functions that are not marked partial just by calling partial functions, as opposed to, for instance, Rust in which a function must be unsafe to call another unsafe function (leaving apart unsafe blocks)
As I would like to be able to see if a function terminates just by looking at its signature, is there a linter rule I can activate or something of that nature?
Yaël Dillies (Aug 30 2023 at 19:45):
I'm surprised by your claim. Can you give an example?
Kyle Miller (Aug 30 2023 at 19:47):
@Yaël Dillies You can't tell that g
doesn't terminate:
partial def f (n : Nat) : Nat := f n
def g (n : Nat) := f n
#print g
Kyle Miller (Aug 30 2023 at 19:48):
#print axioms g
doesn't show any axioms either
Kyle Miller (Aug 30 2023 at 19:49):
This is "fine" because Nat
is inhabited, and f
is opaque, so you shouldn't be able to prove anything false.
Eric Wieser (Aug 30 2023 at 19:49):
Is there an argument here that proof of termination is uninteresting, since what actually matters is termination in a tractable amount of time?
Quentin Januel (Aug 30 2023 at 19:50):
@Yaël Dillies I was about to provide the exact same example of @Kyle Miller but he was faster. As another example though, since the while syntax internally uses partial functions, the following code works:
def f : IO Unit := do
while true do
IO.println "hello"
Quentin Januel (Aug 30 2023 at 19:53):
@Kyle Miller I understand I wouldn't be able to prove anything false, but I would not just want to use lean as a proof assistant. I also see it as a "stronger haskell" and having each function being total unless marked otherwise felt powerful to me. Even if this is not a requirement in lean, I believe it shouldn't be too hard to statically check this and have a linter require me to mark partial every function that ever "might" not terminate (maybe I'm wrong though)
Kyle Miller (Aug 30 2023 at 19:57):
Yeah, it seems like it should be possible to write a @[terminates]
attribute that asserts that a given function doesn't transitively call any opaque
definitions.
Yaël Dillies (Aug 30 2023 at 19:58):
Why is partial
non-poisoning, actually?
Quentin Januel (Aug 30 2023 at 19:58):
So by your reply, I assume such a thing doesn't exist yet? Also, if possible, I would very much prefer the opposite: that is, mark it as non terminating when it's not
Quentin Januel (Aug 30 2023 at 20:01):
@Yaël Dillies I don't know either, I discovered that in the "functional programming in lean" book just here:
https://leanprover.github.io/functional_programming_in_lean/monad-transformers/conveniences.html?highlight=%22infec#infinite-loops
Kyle Miller (Aug 30 2023 at 20:06):
Yaël Dillies said:
Why is
partial
non-poisoning, actually?
It's just a mechanism to make an opaque
constant with a particular run-time interpretation. opaque
is like axiom
but it requires that the type be Inhabited
, so it's not admitting the fact that the type is nonempty.
If it were to be poisoning, then an obvious implementation would lead to any function that calls out to code written in C++ being marked too. Even IO.println
for example, which uses the opaque
constant getStdout
(which calls lean_get_stdout
in C++).
Yaël Dillies (Aug 30 2023 at 20:06):
Is it shocking to have IO.println
be partial
?
Kyle Miller (Aug 30 2023 at 20:08):
No, not really, I just mentioned it because it was mentioned.
Quentin Januel (Aug 30 2023 at 20:34):
I think it would be fine to just not poison when doing FFI with C++ since this is obviously code that could go wrong and that should be checked and marked manually
Jannis Limperg (Aug 30 2023 at 20:45):
I suppose this is very unlikely to change, so the discussion about alternative schemes is a bit moot. But we can discuss how to deal with this design choice.
- For noncomputable mathematics, there's no effect since consistency is all that matters.
- For verified software:
- If you do a full correctness proof, you can't use
partial
(or, more generally,opaque
) functions anyway since you can't prove anything about them. However,@[implementedBy]
can still overwrite your reference implementations with arbitrary garbage, so you have to prevent or audit that. - If you do anything short of a full correctness proof, it's hard to be sure that your top-level function actually terminates.
- If you do a full correctness proof, you can't use
Mac Malone (Aug 30 2023 at 22:22):
@Jannis Limperg There is also the additional detail that there is no guarantee the compiler actually produces code that semantically corresponds to code you are proving things about. This leads into a key detail about current design of Lean. Proofs about code are about the Lean-level abstraction, they are not about the compiled code (hence why opaque
does not poison and implemented_by
does not need to come with a proof of equivalence). This is a problem for many formal verification contexts. Hopefully, however, this will slowly change once the code generator is written in Lean and we can start proving things about it.
Henrik Böving (Aug 30 2023 at 22:30):
If you want formally verified implemented_by that can just be csimp no?
Regarding the code generator, it is right now definitely not at all written in a way that allows for verification and in fact filled with partial itself (for some passes I know for sure that this is correct because they rely on additional unformalized invariants to even terminate in the first place)
On top of that there is always a C compiler or LLVM after the code generator, both with a gigantic code base that you would have to trust in addition. So if you are a person that doesn't believe in the correctness of the comparatively small code generator that we have right now, why would you believe in the LLVM/C compiler. On top of that the generated code is largely just making calls to the C++ runtime and not really doing much stuff on its own apart from branching and calling functions (both Lean and runtime ones).
That doesn't mean I think a formalized code generator would be useless, I do find the idea quite interesting. It's just that I don't feel like a formalized code generator is the actual blocker if you wish to fully trust a compiled lean program.
Henrik Böving (Aug 30 2023 at 22:42):
Yaël Dillies said:
Why is
partial
non-poisoning, actually?
Yaël Dillies said:
Is it shocking to have
IO.println
bepartial
?
I have a counter question to this. docs#Nat.add is tagged with extern
just like many other primitive Nat
operations to, also some Array
operations etc. all to connect them to external C functions and make them performant. Now of course they do have a Lean body as well that is supposed to faithfully model what happens in the C++ based on the lean interpretation of the datatype (for Nat that is peano nat instead of funky multi precision nat, for Array that is a List instead of...well an array)
As I understand there are two parts to the idea here:
- Every function that isn't what it pretends to be to the logic should be marked as partial, this means in particular
extern
andimplemented_by
ones. If you go through with this ideaNat.add
and friends would be partial and you can't prove things about them anymore so you would have to add their definitional equations as axioms. This doesn't sound like something desirable to me? Of course we could go through the effort of making these functions magical and what not to avoid this but that feels like a weird 2 class system. - Making partial viral. While this is a nice idea in principle it gives the classic "function color issue" (you basically split the eco system into
partial def
anddef
and this would be a real split as the mathematicians would only care aboutdef
) which has been discussed at length by e.g. the rust folk for their async and is a thing that people disagree on to this date. On top of that as I see the only thing you get is this: You now know your function will terminate. And for this you basically give up the ability to argue over functions completley that may just use a single piece of partial somewhere. Say for example you have
def foo (x : Nat) : Nat :=
if x = 0 then
partialFoo x
else
2 * x
if partial is viral i cannot prove that if x \neq 0
foo x = 2 * x
since you block me entirely from arguing over the body of the function. This does also not seem desirable to me. (note that this is not unfixable, we could rewrite foo in CPS but restructuring your code just for a termination dogma seems a bit meh)
That said I do agree it would be nice to maybe have an annotation as proposed above that checks this. Making the virality check "opt in" in parts of code that you know will not be damaged from it so to speak.
Quentin Januel (Aug 30 2023 at 23:09):
Thank you all for the explanations. Again I am very new to lean and probably don't realise the implications of what I'm trying to get, but I would like to clarify a something:
I do agree sometimes it is needed to have a partial function that will not poison, and it is the developer's responsability to make sure there won't be any issue with it. This would be needed for ffi for instance.
I don't mind that, these are implementation details burried in libraries I choose to trust.
However, when writing my own code, I feel like it should immediately know when I'm calling partial functions I wrote, unless I took the decision to mark them non partial after carefully checking them. To me, this is the same kind of difference between having an unsafe function in Rust, and a safe function that internally contain an unsafe block: the latter means the function have been verified manually.
Also I believe this is not just about a termination dogma, but rather having more precise function signatures, letting you know what they can possibly do just by reading their type. Monads provide a similar feature by making the type explicit if a function may use IO or internal state e.g. Why should partial be treated any differently?
Mac Malone (Aug 31 2023 at 00:08):
@Quentin Januel If you are curious about the rational, I think I am be able to explain. The goal of the termination checker in Lean is not primarily to verify that a function terminates. Instead, it verifies termination insofar as it is a necessary requirement to keep the logical system of the theorem prover sound. Thus, definitions like partial
, opaque
, or implemented_by
which may be non-terminating are ignored because they are still guaranteed to be sound for other reasons. On the hand, your desire is that if a definition passes the termination check it guaranteed to terminate as long as the trusted codebase (i.e., core and manually marked functions) terminate.
I think your original idea of a linter (or something in the vein of #print axioms
like #print partial
) would be the best way of handling this (likely in some userland package).
Adomas Baliuka (Aug 31 2023 at 01:33):
Is it possible to prove termination for a concrete function (say without things like IO
for now)? What would that look like?
Also, someone said it's not possible to do partial def
for an uninhabited type. Does "uninhabited" here mean you cannot create an instance or is it about the typeclass Inhabited
? For example, you can make an axiom that allows a contradiction and then partial def
an instance of False
(I tried). It the claim still true? (I guess it doesn't matter in this case since you already have a contradiction, so that's not a good example)
Mac Malone (Aug 31 2023 at 04:13):
ab said:
Is it possible to prove termination for a concrete function (say without things like
IO
for now)? What would that look like?
If the function does not make use of partial
or opaque
definitions then it passing the termination checker is proof it terminates. If it uses partial or opaque definitions, then it is impossible to prove termination because the details of the function are opaque to the prover. One way of verifying termination would be to unfold every definition until the definition is purely composed of constructors (also erasing and prop or type terms since they are irrelevant). If that works, the function terminates. If not, it does not.
Mac Malone (Aug 31 2023 at 04:19):
ab said:
Does "uninhabited" here mean you cannot create an instance or is it about the typeclass
Inhabited
?
These notions are meant to be isomorphic (i.e., if can construct an instance of type, you can also construct an instance of Inhabited
for the type). You can also manually provide an instance of the type in the opaque
syntax (e.g., opaque foo : A := <some instance of A>
). This is useful if it being inhabited is dependent on the parameters of the opaque
definition. Furthermore, technically, the requirement is actually just that it has a Nonempty
instance (which is propositional rather than constructive) -- an Inhabited
instance implies a Nonempty
one, but not vice versa (because Nonempty
, since it is propositional, can use axioms / classical logic to generate instances as opposed to constructive logic) and an implementation (e.g. extern
/ implemented_by
) or is marked noncomputable
.
Adomas Baliuka (Aug 31 2023 at 05:59):
@Mac Malone thanks for the answer. What I meant by "proof" was actually being able to write (and prove) something like
def f (x : Nat) : Nat := <... definition depending on definitions of definitions etc.>
theorem terminates_f : Terminates f := ...
I.e., encoding the statement within Lean. Can one prove that "after expanding definitions, f
is a composition of constructors"?
Mac Malone (Aug 31 2023 at 06:05):
@ab That seems difficult. For one, even formalizing Terminates f
is hard. (And, of course, in the general case, is the halting problem.) I am not sure how would go or about this or if this is even possible within Lean's theory.
Adomas Baliuka (Aug 31 2023 at 06:10):
@Mac Malone maybe an easier question first: Is it possible to formalize and prove "f
is a constructor for some inductive type"?
Mac Malone (Aug 31 2023 at 06:14):
@ab I would imagine probably not. By the substitution rule for equality anything you can prove about a construct you can prove about something equal to a construct (which, e.g., could be an opaque
definition). The only way I can think of doing this at the moment would be proving it about Lean's metalanguage itself (i.e., about Expr
and Environments and the such) and that is not very amendable to proofs (i.e., it uses at lot of partials and opaques).
Mac Malone (Aug 31 2023 at 06:16):
This is why I think a linter would be easier. That way you can just reason about definitions programmatically and check whether they can reduce to such a result. Its not quite the level the guarantee of a theorem
, but it seems good enough for most use cases (as even the theorem
as caveats like the compiler actually producing semantically correct code).
Quentin Januel (Aug 31 2023 at 06:19):
@Mac Malone Thank you, I understood the primary goal of the termination checker was about soundness of the logical system, but just as lean can be used both as a proof assistant and a programming language, the termination checker might just as well have several benefits. Also yes I completely agree: I didn't expect lean to directly provide such a solution, a linter would be the best in this context I guess. Sadly since such a thing does not exist yet I suppose I will just have to check every once in a while if someone ever makes that (as I don't see myself able to do that unfortunately)
Quentin Januel (Aug 31 2023 at 06:23):
Yeah about the halting problem, I felt like it was because everyone just didn't want to throw away turing completeness. I first thought that lean wasn't turing complete and was interested to see how powerful it could still be with this restriction, but I was just wrong
Damiano Testa (Aug 31 2023 at 06:24):
I think that the undecidability of the halting problem (or undecidability more generally) simply means that you cannot hope to prove every true statement. This is very far from being actually able to prove many true statements!
Mario Carneiro (Aug 31 2023 at 06:38):
Here's a basic implementation based on #print axioms
:
import Lean
open Lean Elab Command
namespace CollectPartial
structure State where
visited : NameSet := {}
opaques : Array Name := #[]
abbrev M := ReaderT Environment $ StateT State MetaM
partial def collect (c : Name) : M Unit := do
let collectExpr (e : Expr) : M Unit := e.getUsedConstants.forM collect
let s ← get
unless s.visited.contains c do
modify fun s => { s with visited := s.visited.insert c }
let env ← read
match env.find? c with
| some (ConstantInfo.ctorInfo _)
| some (ConstantInfo.recInfo _)
| some (ConstantInfo.inductInfo _)
| some (ConstantInfo.quotInfo _) =>
pure ()
| some (ConstantInfo.defnInfo v)
| some (ConstantInfo.thmInfo v) =>
unless ← Meta.isProp v.type do collectExpr v.value
| some (ConstantInfo.axiomInfo v)
| some (ConstantInfo.opaqueInfo v) =>
unless ← Meta.isProp v.type do
modify fun s => { s with opaques := s.opaques.push c }
| none =>
modify fun s => { s with opaques := s.opaques.push c }
end CollectPartial
elab "#print" "partial" name:ident : command => do
let constName ← resolveGlobalConstNoOverloadWithInfo name
let env ← getEnv
let (_, s) ← liftTermElabM <| ((CollectPartial.collect constName).run env).run {}
if s.opaques.isEmpty then
logInfo m!"'{constName}' does not use any partial definitions"
else
logInfo m!"'{constName}' depends on opaques: {s.opaques.toList}"
partial def foo : Unit → Nat := foo
def bar : Unit → Nat := foo
#print partial bar
Damiano Testa (Aug 31 2023 at 06:53):
By the way, this catches sorry
also:
def no : Unit → Nat := sorry
#print partial no -- 'no' depends on opaques: [sorryAx]
Quentin Januel (Aug 31 2023 at 10:14):
@Mario Carneiro Wow, this is wonderful thank you so much!
Is it easy to make it print the warning when the function is declared, except if the said function is also declared partial (similarly to the message "declaration uses sorry" which doesn't need to be checked manually afterwards)? I already changed it so that it prints a warning if there are opaques, or nothing otherwise, but I am having troubles with that since afaik "elab" is not documented yet
Mario Carneiro (Aug 31 2023 at 10:15):
elab
is how you declare new syntax and give it an implementation, it's basically (one of) the entry points to lean metaprogramming
Mario Carneiro (Aug 31 2023 at 10:17):
You can move the check to the function using an attribute, e.g. you write @[terminating] def no : Unit -> Nat := sorry
and it gives a warning
Mario Carneiro (Aug 31 2023 at 10:18):
You could also use a linter if you want to have no syntax at the point of use, although this is a bit hackish
Quentin Januel (Aug 31 2023 at 10:41):
So you're saying elab wouldn't work to override existing syntax (for instance to, say, replace any "<declaration"> by "#print partial ident; <declaration>"?
Well I don't want to use attributes because it kills the point of forcing me to declare my function as partials. Are there already existing linters?
Mario Carneiro (Aug 31 2023 at 10:42):
elab
declares new syntax and attaches rules to that syntax. You could declare syntax exactly matching def
and with a higher priority, and then any future defs would use your syntax and apply its rules
Mario Carneiro (Aug 31 2023 at 10:43):
elab_rules
attaches rules to existing syntax. You could use this to override the behavior of def
without changing the syntax
Quentin Januel (Aug 31 2023 at 10:43):
I don't mind about it being hackish. Also, talking about hackish, if anyone's interested I modified your solution so that it doesn't raise a warning for "IO"s since those are already visible in the function signature:
import Lean
open Lean Elab Command
namespace CollectPartial
structure State where
visited : NameSet := {}
opaques : Array Name := #[]
abbrev M := ReaderT Environment $ StateT State MetaM
-- feels very hackish but nvm
def isIO : String → Bool :=
(. |>.takeWhile (. != ' ') |> (. == "BaseIO"))
partial def collect (c : Name) : M Unit := do
let collectExpr (e : Expr) : M Unit := e.getUsedConstants.forM collect
let s ← get
unless s.visited.contains c do
modify fun s => { s with visited := s.visited.insert c }
let env ← read
match env.find? c with
| some (ConstantInfo.ctorInfo _)
| some (ConstantInfo.recInfo _)
| some (ConstantInfo.inductInfo _)
| some (ConstantInfo.quotInfo _) =>
pure ()
| some (ConstantInfo.defnInfo v)
| some (ConstantInfo.thmInfo v) =>
unless ← Meta.isProp v.type do
collectExpr v.value
| some (ConstantInfo.axiomInfo v)
| some (ConstantInfo.opaqueInfo v) =>
unless ← Meta.isProp v.type do
unless isIO (ToString.toString v.type) do
modify fun s => { s with opaques := s.opaques.push c }
| none =>
modify fun s => { s with opaques := s.opaques.push c }
end CollectPartial
elab "#print" "partial" name:ident : command => do
let constName ← resolveGlobalConstNoOverloadWithInfo name
let env ← getEnv
let (_, s) ← liftTermElabM <| ((CollectPartial.collect constName).run env).run {}
unless s.opaques.isEmpty do
logWarning m!"'{constName}' may not terminate due to:\n{s.opaques.toList}."
Mario Carneiro (Aug 31 2023 at 10:44):
The part that is hackish about using linters is that you have to re-parse the syntax to find out the declaration that was made, and it's easy to miss one if you don't understand the syntax
Quentin Januel (Aug 31 2023 at 10:44):
Mario Carneiro said:
elab_rules
attaches rules to existing syntax. You could use this to override the behavior ofdef
without changing the syntax
This one seems interesting
Mario Carneiro (Aug 31 2023 at 10:45):
You still have to handle all the behavior of def
though with that option, it's somewhat heavyweight
Mario Carneiro (Aug 31 2023 at 10:45):
you would basically be adapting whatever code def
uses to do its thing
Mario Carneiro (Aug 31 2023 at 10:47):
Also, talking about hackish, if anyone's interested I modified your solution so that it doesn't raise a warning for "IO"s since those are already visible in the function signature
I think you have made an unjustified assumption here that "IO" is visible in the function signature. The function originally selected might not say anything about IO
Mario Carneiro (Aug 31 2023 at 10:50):
def loop : IO Unit := IO.iterate () fun _ => pure (.inl ())
def evil (_ : Unit) : Bool :=
loop () matches .ok ..
Quentin Januel (Aug 31 2023 at 10:53):
Hmm, but in this case that's not a problem since IO.iterate should be reported as it allows to make infinite loops
I just wanted to turn off the warnings for functions like stdOut and stdIn
Quentin Januel (Aug 31 2023 at 10:53):
Or did I miss your point?
Mario Carneiro (Aug 31 2023 at 10:53):
I think you should specifically exempt opaques that are known to be terminating
Mario Carneiro (Aug 31 2023 at 10:54):
e.g. using an attribute to tag them
Quentin Januel (Aug 31 2023 at 10:54):
Well that would surely be the best option but wouldn't this be way more complicated?
Mario Carneiro (Aug 31 2023 at 10:55):
I mean, each tag comes with some soundness argument so it seems good to have it explicit by default
Mario Carneiro (Aug 31 2023 at 10:55):
it's not particularly complicated to have a whitelist in that code
Quentin Januel (Aug 31 2023 at 10:55):
Oh I think I see
Quentin Januel (Aug 31 2023 at 10:57):
Well thank you for all the help, it's very good to know that lean is powerful enough to express all of this by itself. I'm really not sure I'll be able to come with a solution though, but if I do I will post it here
Mario Carneiro (Aug 31 2023 at 11:04):
Here's how you make an attribute whitelist:
import Lean
open Lean Elab Command
initialize terminatesExt : SimplePersistentEnvExtension Name NameSet ←
registerSimplePersistentEnvExtension {
addEntryFn := (·.insert)
addImportedFn := mkStateFromImportedEntries (·.insert) {}
}
initialize registerBuiltinAttribute {
name := `terminates
descr := "Asserts that an opaque is terminating"
add := fun decl _ kind ↦ do
unless kind == AttributeKind.global do
throwError "invalid attribute 'terminates', must be global"
modifyEnv (terminatesExt.addEntry · decl)
}
namespace CollectPartial
structure State where
visited : NameSet := {}
opaques : Array Name := #[]
abbrev M := StateT State MetaM
syntax foo := ws
partial def collect (c : Name) : M Unit := do
let collectExpr (e : Expr) : M Unit := e.getUsedConstants.forM collect
let s ← get
unless s.visited.contains c do
modify fun s => { s with visited := s.visited.insert c }
let env ← getEnv
match env.find? c with
| some (ConstantInfo.ctorInfo _)
| some (ConstantInfo.recInfo _)
| some (ConstantInfo.inductInfo _)
| some (ConstantInfo.quotInfo _) =>
pure ()
| some (ConstantInfo.defnInfo v)
| some (ConstantInfo.thmInfo v) =>
unless ← Meta.isProp v.type do collectExpr v.value
| some (ConstantInfo.axiomInfo v)
| some (ConstantInfo.opaqueInfo v) =>
unless ← Meta.isProp v.type do
unless (terminatesExt.getState env).contains c do
modify fun s => { s with opaques := s.opaques.push c }
| none =>
modify fun s => { s with opaques := s.opaques.push c }
end CollectPartial
initialize registerBuiltinAttribute {
name := `terminating
descr := "Checks that a definition does not depend on any nonterminating functions"
applicationTime := .afterCompilation
add := fun decl _ kind ↦ do
unless kind == AttributeKind.global do
throwError "invalid attribute 'terminates', must be global"
let ((_, s), _) ← Meta.MetaM.run <| (CollectPartial.collect decl).run {}
unless s.opaques.isEmpty do
logError m!"'{decl}' depends on opaques: {s.opaques.toList}"
}
and then the test code has to be in a separate file:
import MyLib.TerminatesAttr
partial def foo : Unit → Nat := foo
@[terminating] def bar : Unit → Nat := foo -- error
@[terminates] partial def safe : Unit → Nat := safe
@[terminating] def baz : Unit → Nat := safe -- ok
Mario Carneiro (Aug 31 2023 at 11:11):
and then you could have things like
attribute [terminates] IO.getStdout IO.getStderr
to whitelist IO functions you trust
Quentin Januel (Sep 01 2023 at 14:52):
@Mario Carneiro Sorry for the late reply but this is amazing, thank you so much for all the help!!
Last updated: Dec 20 2023 at 11:08 UTC