Zulip Chat Archive

Stream: lean4

Topic: Structure in FFI

Marcus Rossel (Apr 08 2024 at 14:13):

How does one access the fields of a Lean structure in C? For example, I have the following setup:

structure A where
  b : Bool
  s : String
  deriving Inhabited

@[extern "test"]
opaque test : A → A

def main : IO Unit :=
  let a : A := { b := false, s := "" }
  let a' := test a
  IO.println s!"⟨{a'.b}, {a'.s}⟩"

#include <lean/lean.h>
#include <stdio.h>

lean_obj_res test(lean_obj_arg a) {
    if (!lean_is_ctor(a)) {
        printf("a is not a ctor\n");
        return a;
    }

    unsigned n = lean_ctor_num_objs(a);
    printf("num objs: %d\n", n);

    lean_obj_res field_0 = lean_ctor_get(a, 0);
    lean_obj_res field_1 = lean_ctor_get(a, 1); // fails

    return a;
}

Running this fails with the following output:

num objs: 1
LEAN ASSERTION VIOLATION
File: /Users/marcus/.elan/toolchains/stable/include/lean/lean.h
Line: 549
i < lean_ctor_num_objs(o)

So I guess lean_ctor_get isn't the correct function for accessing structures' fields.
How are structures/inductive types represented in the FFI then?

Mario Carneiro (Apr 08 2024 at 14:18):

inductives are represented as two parts:

the lean_object* fields
the scalar part (a byte buffer)

To offset into the first part, use lean_ctor_get. To get the second part use lean_ctor_set_uint32 or similar functions, depending on the scalar type

Mario Carneiro (Apr 08 2024 at 14:18):

offsets in the second part are represented in bytes, so if there are two uint32s then one will have offset 0 and the next offset 4

Marcus Rossel (Apr 08 2024 at 14:19):

Mario Carneiro said:

inductives are represented as two parts:

the lean_object* fields

the scalar part (a byte buffer)

Is that somehow related to this?

typedef struct {
    lean_object   m_header;
    lean_object * m_objs[0];
} lean_ctor_object;

Mario Carneiro (Apr 08 2024 at 14:19):

Your structure A has a single pointer field, namely b : String, and one byte of scalar data, namely a : Bool as a uint8 at offset 0

Mario Carneiro (Apr 08 2024 at 14:22):

The struct doesn't actually say too much about this. Both parts I mentioned are actually indexed from m_objs as far as the C code is concerned, but some metadata about the number of fields of each kind is given in m_header so that it can do the necessary indexing

Marcus Rossel (Apr 08 2024 at 14:28):

Hmm, so from your explanation I thought this would work:

lean_obj_res test(lean_obj_arg a) {
    if (!lean_is_ctor(a)) {
        printf("a is not a ctor\n");
        return a;
    }

    unsigned n = lean_ctor_num_objs(a);
    printf("num objs: %d\n", n);

    uint8_t b = lean_ctor_get_uint8(a, 0);
    printf("b: %d\n", b);

    lean_obj_res s = lean_ctor_get(a, 0);
    char* s_cstr = lean_string_cstr(s);
    printf("s: %s\n", s_cstr);

    return a;
}

But that still fails with:

num objs: 1
LEAN ASSERTION VIOLATION
File: /Users/marcus/.elan/toolchains/stable/include/lean/lean.h
Line: 576
offset >= lean_ctor_num_objs(o) * sizeof(void*)

Marcus Rossel (Apr 08 2024 at 14:30):

Ahhh, so the byte buffer starts after the pointer fields, so the offset always has to be desired_offset + lean_ctor_num_objs(a)?

Mario Carneiro (Apr 08 2024 at 14:30):

it does, but lean_ctor_get_uint8 should handle that

Mario Carneiro (Apr 08 2024 at 14:30):

oh no, you are right

Marcus Rossel (Apr 08 2024 at 14:31):

Hmm, can you think of cases where one would use that function without the + lean_ctor_num_objs(_)? Seems like something that could be built in to the function.

Mario Carneiro (Apr 08 2024 at 14:33):

lean code does it like this

lean_ctor_set_uint8(x_3, sizeof(void*)*1, x_2);

Mario Carneiro (Apr 08 2024 at 14:35):

I agree it's a bit messy to be letting this implementation detail leak. The intent is that this will be constant folded by the C compiler, but lean code generating the C is trying to be agnostic about pointer size

Marcus Rossel (Apr 08 2024 at 14:37):

Cool, thanks for the help! Do you think adding some doc comments on some of the lean.h functions would be welcome? I personally found it quite difficult to understand anything non-trivial in there so far.

Mario Carneiro (Apr 08 2024 at 14:37):

Docs on everything is missing and important and welcome

Mario Carneiro (Apr 08 2024 at 14:38):

Note also that USize fields are ordered at the beginning of the scalar section, so you also have to skip them in the sizeof(void*) part of the count

Mario Carneiro (Apr 08 2024 at 14:39):

besides that, I believe scalars are just placed in declaration order (with padding, I think?)

Marcus Rossel (Apr 08 2024 at 14:40):

My real-world application only has bools, so I'll stay blissfully ignorant about that for now :sweat_smile:

Marcus Rossel (Apr 08 2024 at 18:54):

Doc comment PR: https://github.com/leanprover/lean4/pull/3846

Tomas Skrivan (Apr 08 2024 at 22:03):

Just a tip, I usually write a Lean function that does the structure manipulation, like accessing fields, compile it and inspecting the generated C code to see how the access is done exactly.

Maybe you can also add(not tested as I'm on mobile)

attribute [export A_get_s] A.s
attribute [export A_get_b] A.b

Compile it and copy paste the generated functions A_get_s and A_get_b to your C code.

Tomas Skrivan (Apr 08 2024 at 22:04):

(deleted)

Tomas Skrivan (Apr 08 2024 at 22:06):

(deleted)

Eric Wieser (Apr 11 2024 at 15:05):

As far as I can tell, attribute [export foo_bar] Foo.bar doesn't actually do anything on projections

Tomas Skrivan (Apr 11 2024 at 15:07):

I see, then you can just define a new function that calls the projection and export that and inspect.

Eric Wieser (Apr 11 2024 at 15:07):

Probably it should raise an error since it doesn't work?

Mario Carneiro (Apr 11 2024 at 15:55):

export on structures is currently broken (lean4#2292), I don't recommend it

Eric Wieser (Apr 11 2024 at 16:16):

That's only for single-field structures, right?

Leni Aniva (Apr 16 2024 at 16:54):

Marcus Rossel said:

Cool, thanks for the help! Do you think adding some doc comments on some of the lean.h functions would be welcome? I personally found it quite difficult to understand anything non-trivial in there so far.

I think it helps to find the relevant symbol in the Lean source code (.lean) since these tend to be better documented

Last updated: May 02 2025 at 03:31 UTC