Zulip Chat Archive

Stream: lean4

Topic: Style for using external code


Xubai Wang (Aug 02 2021 at 03:44):

I'm going to add a Lean wrapper for the C++ Poco Library. After exploration, I find there are 3 styles of using external code in Lean 4:

  1. @[extern] constant and register_external_class:
constant FS.Handle : Type := Unit

@[extern "lean_io_prim_handle_mk"]
constant Handle.mk (s : @& String) (mode : @& String) : IO Handle := arbitrary _
void initialize_io() {
    g_io_error_nullptr_read = mk_string("null reference read");
    mark_persistent(g_io_error_nullptr_read);
    g_io_handle_external_class = lean_register_external_class(io_handle_finalizer, io_handle_foreach);
}
extern "C" obj_res lean_io_prim_handle_mk(b_obj_arg filename, b_obj_arg modeStr, obj_arg /* w */) {
    FILE *fp = fopen(lean_string_cstr(filename), lean_string_cstr(modeStr));
    if (!fp) {
        return set_io_error(decode_io_error(errno, filename));
    } else {
        return set_io_result(io_wrap_handle(fp));
    }
}
@[extern "lean_string_hash"]
protected constant String.hash (s : @& String) : USize := arbitrary _
instance : Hashable String := String.hash
size_t lean_string_hash(b_lean_obj_arg);
  1. @[extern] def
@[extern "lean_nat_add"]
protected def Nat.add : (@& Nat)  (@& Nat)  Nat
| a, Nat.zero   => a
| a, Nat.succ b => Nat.succ (Nat.add a b)
static inline lean_obj_res lean_nat_add(b_lean_obj_arg a1, b_lean_obj_arg a2) {
    if (LEAN_LIKELY(lean_is_scalar(a1) && lean_is_scalar(a2)))
        return lean_usize_to_nat(lean_unbox(a1) + lean_unbox(a2));
    else
        return lean_nat_big_add(a1, a2);
}
  1. lean_ctor_get|set

This style is taken from @Mac 's lean4-papyrus

// Wrap a pointer in a linked pointer object, transfering ownership to of it to Lean.
template<typename T> lean_object* mkLinkedOwnedPtr(b_lean_obj_arg link, T* ptr) {
    lean_object* obj = lean_alloc_ctor(0, 2, 0);
    lean_ctor_set(obj, 0, link);
    lean_ctor_set(obj, 1, mkOwnedPtr<T>(ptr));
    return obj;
}

What's the actual difference of @[extern] constant and @[extern] def? And when should we use the lean_ctor_get|set style?

Xubai Wang (Aug 02 2021 at 04:03):

There is also confusion on the type of C++ return value.
It seems that C++ uint64_t can automatically convert to Lean's UInt64. And lean_box() can be used as many different types.
And in lean4-papyrus, the indutive can be used as uint8 after annotated with @attribute [unbox]

inductive EngineKind
| Either
| JIT
| Interpreter
deriving BEq, Rec

attribute [unbox] EngineKind

@[extern "papyrus_execution_engine_create_for_module"]
constant createForModule (mod : @& ModuleRef) (kind : @& EngineKind := EngineKind.Either)
  (march : @& String := "") (mcpu : @& String := "") (mattrs : @& Array String := #[])
  (optLevel : @& OptLevel := OptLevel.Default) (verifyModule := false)
  : IO ExecutionEngineRef
extern "C" obj_res papyrus_execution_engine_create_for_module
(b_obj_arg modObj, uint8 kindObj, b_obj_arg marchStr, b_obj_arg mcpuStr,
  b_obj_arg mattrsObj, uint8 optLevel, uint8 verifyModules, obj_arg /* w */)

Wojciech Nawrocki (Aug 02 2021 at 04:57):

What's the actual difference of @[extern foo] constant and @[extern foo] def?

There should be no difference in the generated C code -- both will be calls to foo. There is a difference, however, in the kernel declaration emitted. constants are opaque to the kernel, while defs can be unfolded, and you can generally prove things about them. Using @[extern] def has some pitfalls in that if your extern code does not match the def definition, the compiled code may not do what you think it should do.

Wojciech Nawrocki (Aug 02 2021 at 05:00):

And when should we use the lean_ctor_get|set style?

I wouldn't say this is a separate style from the other ones. The lean_ctor_* family allows you to construct non-primitive Lean objects. You could use it to construct an object returned from an @[extern] constant, but that object could well also store an external_class as one of its fields.

Wojciech Nawrocki (Aug 02 2021 at 05:06):

It seems that C++ uint64_t can automatically convert to Lean's UInt64. And lean_box() can be used as many different types.

Yes. UInt64 is a primitive type which means it compiles directly to C uint64_t. You can find a list of the compiled types here. As you can see, some types are represented directly, but most others (such as inductives) are lean_objects whose types are erased in the compiled code. Hence why lean_ctor_* and lean_box can construct many different types.

Wojciech Nawrocki (Aug 02 2021 at 05:13):

And in lean4-papyrus, the inductive can be used as uint8 after annotated with @attribute [unbox]

Oh, cool, I didn't know you could do that :)

Sebastian Ullrich (Aug 02 2021 at 08:25):

I'm pretty sure [unbox] currently is a no-op

Mac (Aug 02 2021 at 15:10):

@Sebastian Ullrich is right, the unbox is a no-op. However, I put that attribute on it to make it clear the inductive is, in fact, unboxed (and to keep it that way if something in the core ever changes).

Mac (Aug 02 2021 at 15:35):

Wojciech Nawrocki said:

And when should we use the lean_ctor_get|set style?

I wouldn't say this is a separate style from the other ones. The lean_ctor_* family allows you to construct non-primitive Lean objects. You could use it to construct an object returned from an @[extern] constant, but that object could well also store an external_class as one of its fields.

To add onto this, the advantage of non-external class objects is that they are memory managed by Lean. I take advantage of this in Papyrus's LinkedOwnedPtr, which @Xubai Wang mentioned.

In particular, LinkedOwnedPtr is a special case of LinkedPtr, which is just a LinkedPtrImpl -- a normal structure consisting of a link (a normal lean object) and a ExternalPtr (alean_external_object). I, however, hide this detail by making LinkedPtr a constant. This prevents users from unfolding a LinkedPtr into a LinkedPtrImpl -- doing so would allow them to gaining access to the raw ExternalPtr, which can cause memory unsafety. This is because the link of a LinkedPtr is not supposed to be garbage collected before the LinkedPtr as the ExternalPtr expects the link to remain alive in its C code. If a user were to access the ExternalPtr directly and keep a reference to it while dropping its reference to the LinkedPtr, the link would be garbage collected, breaking the C bindings for the ExternalPtr.

Alternatively, this could be solved by having the link be contained within the lean_external_object itself (which lean-llvm does in its FFI). However, I choose to avoid this approach because I feel keeping as little data external to Lean as possible is idea. In particular, this makes it so the external object's finalizer and foreach are very simple and don't have to worry about the reference count of the link.


Last updated: Dec 20 2023 at 11:08 UTC