Zulip Chat Archive
Stream: lean4
Topic: Escaping the IO monad
Andrej Bauer (Oct 09 2023 at 18:42):
Suppose I have getMagicNumber : IO Nat
. How do I use it to define cow : Nat
whose value is whatever getMagicNumber
returns? I am not asking how to do this inside main
, but rather what meta-level incantation is needed to add a new declaration in the kernel. I already know how to use Lean.addDecl
, but that one requires quoted expressions. Background: I defined functions that load Lean objects from JSON files (yes, I am using FromJson
) – but how do I load them from files?
Adam Topaz (Oct 09 2023 at 18:44):
You can define a term elaborator
Adam Topaz (Oct 09 2023 at 18:44):
It’s possible to lift from IO to termElabM
Andrej Bauer (Oct 09 2023 at 18:48):
Is that the elab
thing? I could do that, but once I've got the object loaded from a JSON file and loaded, how do I do the equivalent of addDecl
?
Mario Carneiro (Oct 09 2023 at 18:53):
You can use initialize
to initialize a global variable with the result of executing getMagicNumber
Mario Carneiro (Oct 09 2023 at 18:56):
using a term elaborator will resolve the value at compile time, the result will be put in the .olean file. Using initialize
will resolve the value in every downstream module as well as at runtime, effectively it is calculated on the spot at import time
Mario Carneiro (Oct 09 2023 at 18:57):
@Andrej Bauer From what I know about your application, I think the term elaborator is the way to go. In fact IIRC you already have one, and are using it for exactly this purpose already
Andrej Bauer (Oct 09 2023 at 19:36):
I can't seem to find any documentation about initalize
.
Mario Carneiro (Oct 09 2023 at 19:42):
there might not be any... the short version is that initialize foo : A <- getFoo
creates a new value foo : A
from getFoo : IO A
Andrej Bauer (Oct 09 2023 at 19:42):
Ok, looking at the source, it looks like it's going to be
initialize cow : GraphCertificate ← loadGraphCertificate "cert.json"
but this triggers failed to synthesize Inhabited GraphCertificate
.
Mario Carneiro (Oct 09 2023 at 19:43):
oh, I guess it uses opaque
under the hood so you will need an inhabited instance
Andrej Bauer (Oct 09 2023 at 19:43):
Does it want to be fail-safe by obtaining a default value?
Mario Carneiro (Oct 09 2023 at 19:43):
that should be easy for GraphCertificate
Mario Carneiro (Oct 09 2023 at 19:43):
it doesn't use the default value AFAIK, but it should not be allowed to break the logic by constructing False
Andrej Bauer (Oct 09 2023 at 19:44):
Yes, and it would be a crime if the default were anything other than Petersen graph. Just one more question while I have your attention: what if I'd like to initialize instances?
Mario Carneiro (Oct 09 2023 at 19:44):
what do you mean?
Andrej Bauer (Oct 09 2023 at 19:44):
I'd like to install instances of type classes obtained from loading JSON files.
Mario Carneiro (Oct 09 2023 at 19:45):
is GraphCertificate
a class?
Andrej Bauer (Oct 09 2023 at 19:46):
initialize cow : Nat ← pure 42
#eval cow
gives me "Server process for untitled:Untitled-1 crashed, likely due to a stack overflow or a bug.".
Mario Carneiro (Oct 09 2023 at 19:46):
Ah right I was going to mention that: you can't use initialize
values in the same file they were declared
Mario Carneiro (Oct 09 2023 at 19:47):
you need to make another file, import the first one and then try #eval
/ etc
Mario Carneiro (Oct 09 2023 at 19:47):
for... reasons
Andrej Bauer (Oct 09 2023 at 19:47):
GraphCertificate
is not a class, but it has stuff in it from which I'd like to construct some instances of some other classes.
Andrej Bauer (Oct 09 2023 at 19:48):
This whole thing is beginning to look like someone's PhD thesis.
Mario Carneiro (Oct 09 2023 at 19:48):
in that case I think the term elaborator or command elaborator is the best way to construct all those values
Mario Carneiro (Oct 09 2023 at 19:48):
again, last I checked you were doing exactly this
Mario Carneiro (Oct 09 2023 at 19:49):
it really isn't supposed to be PhD-level complicated but the documentation is sparse
Andrej Bauer (Oct 09 2023 at 19:49):
Ok, but last time around I had quoted stuff, and now I have directly constructed values. I failed to "quote" them, and did not find any way to stick them inside addDecl
.
Mario Carneiro (Oct 09 2023 at 19:49):
Ah, you have to quote them to put them in a declaration to be checked by the kernel
Mario Carneiro (Oct 09 2023 at 19:49):
or quote something derived from them
Andrej Bauer (Oct 09 2023 at 19:50):
(BTW, many many thanks, and also, I am asking this barrage of questions because I followed your advice and now I have funky fromJson
instances all around.)
Mario Carneiro (Oct 09 2023 at 19:50):
because the kernel typechecker only understands Expr
Mario Carneiro (Oct 09 2023 at 19:51):
when you use initialize
you directly get a value of the type, not an Expr
denoting it. The value is not really in the kernel / logical system, only an opaque
which stands in for it (whence the Inhabited
requirement)
Andrej Bauer (Oct 09 2023 at 19:51):
So let me see if I understand the process. Working in elab
:
- Load a
cow
from JSON file, using Mario-inspiredFromJson
instances, obtain a bona fide kernel-blessed object of typeCow
. - "Quote"
cow
to something likecowQ
(how?) - Submit the quoted thing to the kernel, just like before.
Andrej Bauer (Oct 09 2023 at 19:52):
Ah, I probably misunderstood step 1 then.
Mario Carneiro (Oct 09 2023 at 19:52):
obtain a bona fide kernel-blessed object of type Cow.
This object is not kernel-blessed
Mario Carneiro (Oct 09 2023 at 19:52):
it is a well typed lean object in the runtime but not a symbolic object the kernel understands
Andrej Bauer (Oct 09 2023 at 19:53):
So the question is, how do I "quote" it? It's not q($cow)
.
Mario Carneiro (Oct 09 2023 at 19:53):
You can provide a generic quotation instance to turn cow
into cowQ
Mario Carneiro (Oct 09 2023 at 19:54):
if you have such an instance I believe Qq will use it, so q($cow)
should just work
Mario Carneiro (Oct 09 2023 at 19:54):
if not you can still just call quote cow
Andrej Bauer (Oct 09 2023 at 19:54):
Aha! quote
is the magic word.
Mario Carneiro (Oct 09 2023 at 19:54):
The relevant typeclass is Quote
IIRC
Adam Topaz (Oct 09 2023 at 19:55):
docs#Lean.Quote to be precise
Mario Carneiro (Oct 09 2023 at 20:04):
Oh wait no that's the wrong typeclass, that one targets Syntax
not Expr
Mario Carneiro (Oct 09 2023 at 20:06):
you want docs#Lean.ToExpr
Eric Wieser (Oct 09 2023 at 20:15):
Quote
seems to do some pretty weird things
Eric Wieser (Oct 09 2023 at 20:16):
quote #[1, 2]
gives the syntax mkArray2 1 2
or similar
Mario Carneiro (Oct 09 2023 at 20:16):
I think that might be used by the compiler? I'm not sure what this class is used for
Mario Carneiro (Oct 09 2023 at 20:17):
well, I use it in mathport but besides that
Eric Wieser (Oct 09 2023 at 20:19):
It's used in polyrith (!)
Mario Carneiro (Oct 09 2023 at 20:19):
unfortunately it's hard to grep for uses of quote
by type
Mario Carneiro (Oct 09 2023 at 20:20):
there are a ton of uses of the instances for Name
, String
, Nat
, Bool
but I'm not sure about the other instances
Eric Wieser (Oct 09 2023 at 20:21):
Do you know how quote
compares with toExpr
followed by delaboration?
Mario Carneiro (Oct 09 2023 at 20:21):
it's not monadic for one
Mario Carneiro (Oct 09 2023 at 20:22):
Aha, Lean/Elab/Quotation.lean has src#Lean.Elab.Term.Quotation.ArrayStxBuilder.build which calls quote
on a Array Term
Mario Carneiro (Oct 09 2023 at 20:24):
there is also a Quote (List String)
used later down in the same file
Mario Carneiro (Oct 09 2023 at 20:25):
which is used in elaborating quotations: `(stuff)
is a macro so it needs to create syntax for all the types used inside Syntax
in order to create syntax to construct a Syntax
object
Mario Carneiro (Oct 09 2023 at 20:27):
so it makes sense that this would have a funny instance for creating Array
values, because we want to optimize the size of the generated term (both the syntax itself and also the term it elaborates to)
Mario Carneiro (Oct 09 2023 at 20:27):
syntax quotations (and syntax match) routinely generate some really massive expressions
Andrej Bauer (Oct 09 2023 at 21:00):
To summarize, should I really be trying to parse JSON to create Lean objects , use Quote
or some such, and then pass them to the kernel? Or is it better to parse JSON directly to quoted stuff that is passed to kernel?
Andrej Bauer (Oct 09 2023 at 21:04):
BTW, I am pretty certain that the objects created from JSON are blessed. It looks like initialize
might be the way to go (as clumsy as it is).
Mario Carneiro (Oct 09 2023 at 21:44):
I mean, the function that does the blessing is addDecl
Mario Carneiro (Oct 09 2023 at 21:45):
and initialize
only passes a stub to addDecl
Mario Carneiro (Oct 09 2023 at 21:45):
Andrej Bauer said:
To summarize, should I really be trying to parse JSON to create Lean objects , use
Quote
or some such, and then pass them to the kernel? Or is it better to parse JSON directly to quoted stuff that is passed to kernel?
I don't understand the distinction being made here
Mario Carneiro (Oct 09 2023 at 21:46):
either way you have JSON in and Expr
out, generally mediated by some meta lean type which may or may not coincide with the type you are constructing an Expr
for
Andrej Bauer (Oct 10 2023 at 06:27):
I have FromJson
instances which, in addition to converting JSON to "info", also verify that the info is valid, which requires running some decision procedures in the kernel, as validity is in general kernel-verified. For example:
structure GraphData : Type where
vertexSize : Nat
edgeTree : SetTree (Edge vertexSize)
deriving Lean.FromJson, Inhabited
structure Graph extends GraphData where
edgeCorrect : edgeTree.isCorrect
instance Graph.fromJson : Lean.FromJson Graph where
fromJson? := fun json => do
let graphData ← Lean.FromJson.fromJson? json (α := GraphData)
if h : graphData.edgeTree.isCorrect then
pure (Graph.mk graphData h)
else
throw "invalid edges in a graph"
This seems to be a general strategy: split the certificate into two parts, "data" and "validation". The data is in JSON, validation is done by the kernel (as above). I like this way of doing things, except that importing JSON from a file seems like such a monadic headache.
Mario Carneiro (Oct 10 2023 at 06:38):
the isCorrect
call is going to happen in untrusted lean here, unless you have the kernel running the FromJson
implementation
Mario Carneiro (Oct 10 2023 at 06:38):
which means you will be doing the check twice
Mario Carneiro (Oct 10 2023 at 06:39):
although you can probably assume the cost of running the check in the interpreter is negligible compared to the kernel check
Mario Carneiro (Oct 10 2023 at 06:40):
but the important part is that validating the data structure in untrusted lean doesn't help you to produce a proof, it only safeguards against producing a bad proof / can help improve the error message if GAP / your input source has a bug
Mario Carneiro (Oct 10 2023 at 06:46):
If you really wanted to have "kernel-verified validity" in this instance, it would need to look something like this:
structure GraphData : Type where
vertexSize : Nat
edgeTree : SetTree (Edge vertexSize)
deriving Lean.FromJson, Inhabited
structure Graph extends GraphData where
edgeCorrect : Q(($edgeTree).isCorrect)
instance Graph.fromJson : Lean.FromJson Graph where
fromJson? := fun json => do
let graphData ← Lean.FromJson.fromJson? json (α := GraphData)
let edgeTreeQ := toExpr graphData.edgeTree
let name <- mkFreshName
try
addDecl {
name
type := q(($edgeTreeQ).isCorrect)
value := q(by decide)
}
pure (Graph.mk graphData (.const name []))
catch _ =>
throw "invalid edges in a graph"
Mario Carneiro (Oct 10 2023 at 06:49):
but there are a few issues with this:
FromJson
instances are basically pure functions, butaddDecl
needs access to the environment- The type of
Graph
is now some hybrid of meta-level and object-level definitions, it's not just a subtype anymore (and indeed you can't even recover the subtype from this without actually doing the check again, because meta lean doesn't reflect the correctness of the kernel)
Andrej Bauer (Oct 10 2023 at 07:17):
I am confused by your saying I don't get a kernel-blessed graph. Wouldn't that (at the very least) depend how I run the code. For example, isn't bessie
kernel-blessed in the following example:
import Lean
structure CowData : Type where
(weight : Nat)
deriving Lean.FromJson
structure Cow extends CowData where
(heavy : weight > 42)
instance: Lean.FromJson Cow where
fromJson? := fun json => do
let data ← Lean.FromJson.fromJson? json (α := CowData)
if h : data.weight > 42 then
pure (Cow.mk data h)
else
throw "the cow is too lean"
def bessieJ : Lean.Json :=
match Lean.Json.parse "{\"weight\" : 666}" with
| .ok j => j
| .error _ => .null
def bessie : Cow :=
match Lean.FromJson.fromJson? bessieJ with
| .ok c => c
| .error _ => { weight := 666, heavy := by simp }
Andrej Bauer (Oct 10 2023 at 07:19):
I agree the hybrid thing is strange. Perhaps a good trade-off is to have all the data slurped in with FromJson
, but then to generate quoted stuff with bare hands and submitted for validation to the kernel (sort of like what we had in Dagstuhl).
Mario Carneiro (Oct 10 2023 at 07:22):
Yes, the kernel has blessed that bessie
is a valid object of type Cow
, but it has not checked that it is equal to anything in particular, and in particular it has not checked that the fromJson?
function is not in the error case
Mario Carneiro (Oct 10 2023 at 07:23):
it's just some symbolic object representing a computation but nothing has forced it to a normal form
Mario Carneiro (Oct 10 2023 at 07:23):
The trick with these deserialization methods is to get the kernel to bless some normal form object as having a property
Mario Carneiro (Oct 10 2023 at 07:27):
To force the object to a normal form we could do something like
example : bessie.weight = 666 := rfl
(although in this case this is not so contentful because the error case also happens to have the same value). But the main reason not to do this is that we don't want to run the JSON parser inside the kernel, it's really bad at string handling and I wouldn't be surprised if it just fails to terminate
Mario Carneiro (Oct 10 2023 at 07:33):
in fact, this doesn't work at all, because attempting to force bessie
to a value gets stuck on
Lean.Json.Parser.anyCore (Lean.Parsec.skipWs (String.mkIterator "{\"weight\" : 666}"))
because anyCore
is a partial def
, which means it is opaque to the kernel.
Andrej Bauer (Oct 10 2023 at 07:40):
I am still under the influence of call-by-value, having programmed in OCaml on the way back from Dagstuhl. Thanks again! I'll stick with the quoted stuff.
Mario Carneiro (Oct 10 2023 at 07:50):
lean is extra special because meta lean uses call-by-value evaluation while the kernel uses call-by-name :smile:
Last updated: Dec 20 2023 at 11:08 UTC