Zulip Chat Archive

Stream: lean4

Topic: Escaping the IO monad


Andrej Bauer (Oct 09 2023 at 18:42):

Suppose I have getMagicNumber : IO Nat. How do I use it to define cow : Nat whose value is whatever getMagicNumber returns? I am not asking how to do this inside main, but rather what meta-level incantation is needed to add a new declaration in the kernel. I already know how to use Lean.addDecl, but that one requires quoted expressions. Background: I defined functions that load Lean objects from JSON files (yes, I am using FromJson) – but how do I load them from files?

Adam Topaz (Oct 09 2023 at 18:44):

You can define a term elaborator

Adam Topaz (Oct 09 2023 at 18:44):

It’s possible to lift from IO to termElabM

Andrej Bauer (Oct 09 2023 at 18:48):

Is that the elab thing? I could do that, but once I've got the object loaded from a JSON file and loaded, how do I do the equivalent of addDecl?

Mario Carneiro (Oct 09 2023 at 18:53):

You can use initialize to initialize a global variable with the result of executing getMagicNumber

Mario Carneiro (Oct 09 2023 at 18:56):

using a term elaborator will resolve the value at compile time, the result will be put in the .olean file. Using initialize will resolve the value in every downstream module as well as at runtime, effectively it is calculated on the spot at import time

Mario Carneiro (Oct 09 2023 at 18:57):

@Andrej Bauer From what I know about your application, I think the term elaborator is the way to go. In fact IIRC you already have one, and are using it for exactly this purpose already

Andrej Bauer (Oct 09 2023 at 19:36):

I can't seem to find any documentation about initalize.

Mario Carneiro (Oct 09 2023 at 19:42):

there might not be any... the short version is that initialize foo : A <- getFoo creates a new value foo : A from getFoo : IO A

Andrej Bauer (Oct 09 2023 at 19:42):

Ok, looking at the source, it looks like it's going to be

initialize cow : GraphCertificate  loadGraphCertificate "cert.json"

but this triggers failed to synthesize Inhabited GraphCertificate.

Mario Carneiro (Oct 09 2023 at 19:43):

oh, I guess it uses opaque under the hood so you will need an inhabited instance

Andrej Bauer (Oct 09 2023 at 19:43):

Does it want to be fail-safe by obtaining a default value?

Mario Carneiro (Oct 09 2023 at 19:43):

that should be easy for GraphCertificate

Mario Carneiro (Oct 09 2023 at 19:43):

it doesn't use the default value AFAIK, but it should not be allowed to break the logic by constructing False

Andrej Bauer (Oct 09 2023 at 19:44):

Yes, and it would be a crime if the default were anything other than Petersen graph. Just one more question while I have your attention: what if I'd like to initialize instances?

Mario Carneiro (Oct 09 2023 at 19:44):

what do you mean?

Andrej Bauer (Oct 09 2023 at 19:44):

I'd like to install instances of type classes obtained from loading JSON files.

Mario Carneiro (Oct 09 2023 at 19:45):

is GraphCertificate a class?

Andrej Bauer (Oct 09 2023 at 19:46):

initialize cow : Nat  pure 42
#eval cow

gives me "Server process for untitled:Untitled-1 crashed, likely due to a stack overflow or a bug.".

Mario Carneiro (Oct 09 2023 at 19:46):

Ah right I was going to mention that: you can't use initialize values in the same file they were declared

Mario Carneiro (Oct 09 2023 at 19:47):

you need to make another file, import the first one and then try #eval / etc

Mario Carneiro (Oct 09 2023 at 19:47):

for... reasons

Andrej Bauer (Oct 09 2023 at 19:47):

GraphCertificate is not a class, but it has stuff in it from which I'd like to construct some instances of some other classes.

Andrej Bauer (Oct 09 2023 at 19:48):

This whole thing is beginning to look like someone's PhD thesis.

Mario Carneiro (Oct 09 2023 at 19:48):

in that case I think the term elaborator or command elaborator is the best way to construct all those values

Mario Carneiro (Oct 09 2023 at 19:48):

again, last I checked you were doing exactly this

Mario Carneiro (Oct 09 2023 at 19:49):

it really isn't supposed to be PhD-level complicated but the documentation is sparse

Andrej Bauer (Oct 09 2023 at 19:49):

Ok, but last time around I had quoted stuff, and now I have directly constructed values. I failed to "quote" them, and did not find any way to stick them inside addDecl.

Mario Carneiro (Oct 09 2023 at 19:49):

Ah, you have to quote them to put them in a declaration to be checked by the kernel

Mario Carneiro (Oct 09 2023 at 19:49):

or quote something derived from them

Andrej Bauer (Oct 09 2023 at 19:50):

(BTW, many many thanks, and also, I am asking this barrage of questions because I followed your advice and now I have funky fromJson instances all around.)

Mario Carneiro (Oct 09 2023 at 19:50):

because the kernel typechecker only understands Expr

Mario Carneiro (Oct 09 2023 at 19:51):

when you use initialize you directly get a value of the type, not an Expr denoting it. The value is not really in the kernel / logical system, only an opaque which stands in for it (whence the Inhabited requirement)

Andrej Bauer (Oct 09 2023 at 19:51):

So let me see if I understand the process. Working in elab:

  1. Load a cow from JSON file, using Mario-inspired FromJson instances, obtain a bona fide kernel-blessed object of type Cow.
  2. "Quote" cow to something like cowQ (how?)
  3. Submit the quoted thing to the kernel, just like before.

Andrej Bauer (Oct 09 2023 at 19:52):

Ah, I probably misunderstood step 1 then.

Mario Carneiro (Oct 09 2023 at 19:52):

obtain a bona fide kernel-blessed object of type Cow.

This object is not kernel-blessed

Mario Carneiro (Oct 09 2023 at 19:52):

it is a well typed lean object in the runtime but not a symbolic object the kernel understands

Andrej Bauer (Oct 09 2023 at 19:53):

So the question is, how do I "quote" it? It's not q($cow).

Mario Carneiro (Oct 09 2023 at 19:53):

You can provide a generic quotation instance to turn cow into cowQ

Mario Carneiro (Oct 09 2023 at 19:54):

if you have such an instance I believe Qq will use it, so q($cow) should just work

Mario Carneiro (Oct 09 2023 at 19:54):

if not you can still just call quote cow

Andrej Bauer (Oct 09 2023 at 19:54):

Aha! quote is the magic word.

Mario Carneiro (Oct 09 2023 at 19:54):

The relevant typeclass is Quote IIRC

Adam Topaz (Oct 09 2023 at 19:55):

docs#Lean.Quote to be precise

Mario Carneiro (Oct 09 2023 at 20:04):

Oh wait no that's the wrong typeclass, that one targets Syntax not Expr

Mario Carneiro (Oct 09 2023 at 20:06):

you want docs#Lean.ToExpr

Eric Wieser (Oct 09 2023 at 20:15):

Quote seems to do some pretty weird things

Eric Wieser (Oct 09 2023 at 20:16):

quote #[1, 2] gives the syntax mkArray2 1 2 or similar

Mario Carneiro (Oct 09 2023 at 20:16):

I think that might be used by the compiler? I'm not sure what this class is used for

Mario Carneiro (Oct 09 2023 at 20:17):

well, I use it in mathport but besides that

Eric Wieser (Oct 09 2023 at 20:19):

It's used in polyrith (!)

Mario Carneiro (Oct 09 2023 at 20:19):

unfortunately it's hard to grep for uses of quote by type

Mario Carneiro (Oct 09 2023 at 20:20):

there are a ton of uses of the instances for Name, String, Nat, Bool but I'm not sure about the other instances

Eric Wieser (Oct 09 2023 at 20:21):

Do you know how quote compares with toExpr followed by delaboration?

Mario Carneiro (Oct 09 2023 at 20:21):

it's not monadic for one

Mario Carneiro (Oct 09 2023 at 20:22):

Aha, Lean/Elab/Quotation.lean has src#Lean.Elab.Term.Quotation.ArrayStxBuilder.build which calls quote on a Array Term

Mario Carneiro (Oct 09 2023 at 20:24):

there is also a Quote (List String) used later down in the same file

Mario Carneiro (Oct 09 2023 at 20:25):

which is used in elaborating quotations: `(stuff) is a macro so it needs to create syntax for all the types used inside Syntax in order to create syntax to construct a Syntax object

Mario Carneiro (Oct 09 2023 at 20:27):

so it makes sense that this would have a funny instance for creating Array values, because we want to optimize the size of the generated term (both the syntax itself and also the term it elaborates to)

Mario Carneiro (Oct 09 2023 at 20:27):

syntax quotations (and syntax match) routinely generate some really massive expressions

Andrej Bauer (Oct 09 2023 at 21:00):

To summarize, should I really be trying to parse JSON to create Lean objects , use Quote or some such, and then pass them to the kernel? Or is it better to parse JSON directly to quoted stuff that is passed to kernel?

Andrej Bauer (Oct 09 2023 at 21:04):

BTW, I am pretty certain that the objects created from JSON are blessed. It looks like initialize might be the way to go (as clumsy as it is).

Mario Carneiro (Oct 09 2023 at 21:44):

I mean, the function that does the blessing is addDecl

Mario Carneiro (Oct 09 2023 at 21:45):

and initialize only passes a stub to addDecl

Mario Carneiro (Oct 09 2023 at 21:45):

Andrej Bauer said:

To summarize, should I really be trying to parse JSON to create Lean objects , use Quote or some such, and then pass them to the kernel? Or is it better to parse JSON directly to quoted stuff that is passed to kernel?

I don't understand the distinction being made here

Mario Carneiro (Oct 09 2023 at 21:46):

either way you have JSON in and Expr out, generally mediated by some meta lean type which may or may not coincide with the type you are constructing an Expr for

Andrej Bauer (Oct 10 2023 at 06:27):

I have FromJson instances which, in addition to converting JSON to "info", also verify that the info is valid, which requires running some decision procedures in the kernel, as validity is in general kernel-verified. For example:

structure GraphData : Type where
  vertexSize : Nat
  edgeTree : SetTree (Edge vertexSize)
deriving Lean.FromJson, Inhabited

structure Graph extends GraphData where
  edgeCorrect : edgeTree.isCorrect

instance Graph.fromJson : Lean.FromJson Graph where
  fromJson? := fun json => do
    let graphData  Lean.FromJson.fromJson? json (α := GraphData)
    if h : graphData.edgeTree.isCorrect then
      pure (Graph.mk graphData h)
    else
      throw "invalid edges in a graph"

This seems to be a general strategy: split the certificate into two parts, "data" and "validation". The data is in JSON, validation is done by the kernel (as above). I like this way of doing things, except that importing JSON from a file seems like such a monadic headache.

Mario Carneiro (Oct 10 2023 at 06:38):

the isCorrect call is going to happen in untrusted lean here, unless you have the kernel running the FromJson implementation

Mario Carneiro (Oct 10 2023 at 06:38):

which means you will be doing the check twice

Mario Carneiro (Oct 10 2023 at 06:39):

although you can probably assume the cost of running the check in the interpreter is negligible compared to the kernel check

Mario Carneiro (Oct 10 2023 at 06:40):

but the important part is that validating the data structure in untrusted lean doesn't help you to produce a proof, it only safeguards against producing a bad proof / can help improve the error message if GAP / your input source has a bug

Mario Carneiro (Oct 10 2023 at 06:46):

If you really wanted to have "kernel-verified validity" in this instance, it would need to look something like this:

structure GraphData : Type where
  vertexSize : Nat
  edgeTree : SetTree (Edge vertexSize)
deriving Lean.FromJson, Inhabited

structure Graph extends GraphData where
  edgeCorrect : Q(($edgeTree).isCorrect)

instance Graph.fromJson : Lean.FromJson Graph where
  fromJson? := fun json => do
    let graphData  Lean.FromJson.fromJson? json (α := GraphData)
    let edgeTreeQ := toExpr graphData.edgeTree
    let name <- mkFreshName
    try
      addDecl {
        name
        type := q(($edgeTreeQ).isCorrect)
        value := q(by decide)
      }
      pure (Graph.mk graphData (.const name []))
    catch _ =>
      throw "invalid edges in a graph"

Mario Carneiro (Oct 10 2023 at 06:49):

but there are a few issues with this:

  1. FromJson instances are basically pure functions, but addDecl needs access to the environment
  2. The type of Graph is now some hybrid of meta-level and object-level definitions, it's not just a subtype anymore (and indeed you can't even recover the subtype from this without actually doing the check again, because meta lean doesn't reflect the correctness of the kernel)

Andrej Bauer (Oct 10 2023 at 07:17):

I am confused by your saying I don't get a kernel-blessed graph. Wouldn't that (at the very least) depend how I run the code. For example, isn't bessie kernel-blessed in the following example:

import Lean

structure CowData : Type where
  (weight : Nat)
deriving Lean.FromJson

structure Cow extends CowData where
  (heavy : weight > 42)

instance: Lean.FromJson Cow where
  fromJson? := fun json => do
    let data  Lean.FromJson.fromJson? json (α := CowData)
    if h : data.weight > 42 then
      pure (Cow.mk data h)
    else
      throw "the cow is too lean"

def bessieJ : Lean.Json :=
  match Lean.Json.parse "{\"weight\" : 666}" with
  | .ok j => j
  | .error _ => .null

def bessie : Cow :=
  match Lean.FromJson.fromJson? bessieJ with
  | .ok c => c
  | .error _ => { weight := 666, heavy := by simp }

Andrej Bauer (Oct 10 2023 at 07:19):

I agree the hybrid thing is strange. Perhaps a good trade-off is to have all the data slurped in with FromJson, but then to generate quoted stuff with bare hands and submitted for validation to the kernel (sort of like what we had in Dagstuhl).

Mario Carneiro (Oct 10 2023 at 07:22):

Yes, the kernel has blessed that bessie is a valid object of type Cow, but it has not checked that it is equal to anything in particular, and in particular it has not checked that the fromJson? function is not in the error case

Mario Carneiro (Oct 10 2023 at 07:23):

it's just some symbolic object representing a computation but nothing has forced it to a normal form

Mario Carneiro (Oct 10 2023 at 07:23):

The trick with these deserialization methods is to get the kernel to bless some normal form object as having a property

Mario Carneiro (Oct 10 2023 at 07:27):

To force the object to a normal form we could do something like

example : bessie.weight = 666 := rfl

(although in this case this is not so contentful because the error case also happens to have the same value). But the main reason not to do this is that we don't want to run the JSON parser inside the kernel, it's really bad at string handling and I wouldn't be surprised if it just fails to terminate

Mario Carneiro (Oct 10 2023 at 07:33):

in fact, this doesn't work at all, because attempting to force bessie to a value gets stuck on

Lean.Json.Parser.anyCore (Lean.Parsec.skipWs (String.mkIterator "{\"weight\" : 666}"))

because anyCore is a partial def, which means it is opaque to the kernel.

Andrej Bauer (Oct 10 2023 at 07:40):

I am still under the influence of call-by-value, having programmed in OCaml on the way back from Dagstuhl. Thanks again! I'll stick with the quoted stuff.

Mario Carneiro (Oct 10 2023 at 07:50):

lean is extra special because meta lean uses call-by-value evaluation while the kernel uses call-by-name :smile:


Last updated: Dec 20 2023 at 11:08 UTC