Zulip Chat Archive

Stream: lean4

Topic: C FFI for enums and structs


Anders Christiansen Sørby (Jan 20 2022 at 00:31):

So working on FFI bindings to various libraries I want to be able to bind with structs and enums more easily.
If I have a constant that is supposed to reference an element of an enum like this

@[extern "SDL_APP_TERMINATING"]
constant SDL_APP_TERMINATING : UInt32 -- OS is terminating the application

It will generate this c code

extern uint32_t SDL_APP_TERMINATING;
LEAN_EXPORT lean_object* initialize_SDL_Events(lean_object* w) {
lean_object * res;
if (_G_initialized) return lean_io_result_mk_ok(lean_box(0));
_G_initialized = true;
res = initialize_Init(lean_io_mk_world());
if (lean_io_result_is_error(res)) return res;
lean_dec_ref(res);

l_SDL_Event_Type_SDL__APP__TERMINATING = _init_l_SDL_Event_Type_SDL__APP__TERMINATING(); // <--- this is never generated!

return lean_io_result_mk_ok(lean_box(0));
}

Why does it not generate the initializer _init_l_SDL_Event_Type_SDL__APP__TERMINATING?
My suggestion would be to generate this

uint32_t _init_l_SDL_Event_Type_SDL__APP__TERMINATING() {
  return SDL_APP_TERMINATING;
}

Sebastian Ullrich (Jan 20 2022 at 08:17):

The standard extern is supposed to be generic FFI that binds to symbols, but SDL_APP_TERMINATING is probably a header constant that doesn't exist at runtime. You can tell Lean "yes, I know this will only work with the C backend" (which of course is the only backend right now) using c inline: https://github.com/leanprover/lean4/blob/bbec84bb18d7d978b0c4ea13841b5a4ee6f1c1b4/src/Init/Meta.lean#L29-L31

Anders Christiansen Sørby (Jan 20 2022 at 09:31):

It's the field of an enum. Cool that we have this option. This will probably not be used outside of C for a long time.

Anders Christiansen Sørby (Jan 20 2022 at 11:52):

I think that referencing global constant static variables is platform/backend independent. It is just about assigning a value from a constant symbol at initialization.
As for structs I would suggest a sort of @[bindgen ...] declaration. This could also maybe be used to mirror enums with inductives.

An example

typedef struct {
  int x;
  int y;
} Data;

And in Lean something like

@[bindgen struct "Data" ]
structure Data where
  x : UInt32
  y : UInt32

Which automatically generates a PointedType instance for Data
which would generate something like (and more stuff)

uint32_t _lean_Data_get_Data_x(b_lean_obj_arg d){
Data *data = lean_get_extern_data(d);
return data->x;
}
uint32_t _lean_Data_get_Data_y(b_lean_obj_arg d){
Data *data = lean_get_extern_data(d);
return data->y;
}

Which are linked to the accessors of that function, makes it possible to access fields of that struct without writing too much boiler plate. The lean compiler needs to trust that you have specified the fields correctly though.

Anders Christiansen Sørby (Jan 20 2022 at 12:19):

@Sebastian Ullrich Where is the value from c inline actually emitted? I can't see it anywhere in the generated source.

Sebastian Ullrich (Jan 20 2022 at 12:20):

At each use site

Anders Christiansen Sørby (Jan 20 2022 at 12:22):

But it produces this error:

/nix/store/js66s0xwjnzg0ggi2lq9bcvlk6x2za13-binutils-2.35.2/bin/ld: src.c:(.text+0x662): undefined reference to `_init_l_SDL_Event_Type_SDL__LASTEVENT'

why isn't the initialization functions generated?

Anders Christiansen Sørby (Jan 20 2022 at 12:30):

I suspect this is because it is declared as just a constant UInt32. But this seems to work for Nat and bool in Init.Meta.

Sebastian Ullrich (Jan 20 2022 at 12:32):

I can't help you if you don't even post your code, but there is a reason for the Unit argument in the linked code.

Anders Christiansen Sørby (Jan 20 2022 at 12:38):

My code is the one I posted above. I see that that code uses a form of thunks, but since this is an immutable value I don't see why it shouldn't be just initialized to that value right away.

Sebastian Ullrich (Jan 20 2022 at 12:41):

I assumed you were now using c inline, since you won't solve this any other way

Anders Christiansen Sørby (Jan 20 2022 at 12:44):

I am using c inline. So I just wanted to avoid the indirection of using a thunk. I can always use

@[extern c inline "SDL_APP_TERMINATING"]
private constant _SDL_APP_TERMINATING (u : Unit) : UInt32
def SDL_APP_TERMINATING: UInt32 := _SDL_APP_TERMINATING ()

I just though there would be a simpler way

Sebastian Ullrich (Jan 20 2022 at 12:47):

There is no thunk at runtime. And you'll want remove the second Unit argument.

Anders Christiansen Sørby (Jan 20 2022 at 12:47):

Oh, forgot that one.

Anders Christiansen Sørby (Jan 20 2022 at 12:52):

But empty constants which are not filled by foreign initialization should cause compiler error. This shouldn't be left for the c compiler to detect IMO.

Anders Christiansen Sørby (Jan 20 2022 at 12:57):

Hm, so this still causes an error since I haven't included the header file which contains this enum.
Is there some way I could include this header? #include "SDL.h" or reexport it somehow?

Anders Christiansen Sørby (Jan 20 2022 at 12:58):

Alternatively I can just write it in C.

Sebastian Ullrich (Jan 20 2022 at 12:59):

Oh, you are right. There is no way to do that, and we probably don't want to cement the C backend's default status even further.

Anders Christiansen Sørby (Jan 20 2022 at 13:01):

I suggest removing the c inline functionality altogether and just use standard FFI and C code for those cases. It probably does more harm that usefullness if it can't even solve such simple things.

Sebastian Ullrich (Jan 20 2022 at 13:35):

See https://github.com/leanprover/lean4/pull/694

Anders Christiansen Sørby (Jan 20 2022 at 14:29):

Is it impossible to use match on UInt32 (that are not known at compile time)?

  match s.type : UInt32 with
  | SDL_QUIT => Event.quitEvent
  | SDL_KEYDOWN => Event.keyboardEvent
  | _ => Event.userEvent

This fails with redundant alternative since there is no way of knowing what those are at compile time. So do I have to resort to using if then else for this case?

Sebastian Ullrich (Jan 20 2022 at 14:35):

Yes. I think this is the case in virtually every language.

Daniil Kisel (Feb 09 2023 at 11:01):

Can a definition of a C-like enum with values known at compile time be made in a way that match can be used with it to produce Types? Usual match with { t: UInt8 // ... }fails, iiuc because its property is in Prop and can't be matched to produce a Type. Making the property a Type is not ideal because I want to preserve the enum's runtime representation. The only solution I could come up with is to use a O(n) chain of dite, which is less convenient than match and perhaps can be ignored by the optimizer and remain linear in time.

Sebastian Ullrich (Feb 09 2023 at 11:30):

What exactly do you mean by C-like? An inductive without constructor parameters is unboxed at run time

Daniil Kisel (Feb 09 2023 at 11:32):

I want to specify the tags manually and they can be noncontinuous.

James Gallicchio (Feb 09 2023 at 15:29):

you probably could do some metaprogramming magic to get that to work, by autogenerating @[implementedBy]s for the constructors and match statements, but I'm not sure you could get equivalent performance to the compiler-generated declarations...

Mario Carneiro (Feb 10 2023 at 04:14):

Daniil Kisel said:

Usual match with { t: UInt8 // ... }fails, iiuc because its property is in Prop and can't be matched to produce a Type.

You can certainly make this scheme work. Do you have an MWE of the issue you hit?

Mario Carneiro (Feb 10 2023 at 04:15):

the basic idea would be to match on t, not the property, and prove False in the impossible case(s)

Daniil Kisel (Feb 10 2023 at 15:07):

To prove False I would need to get inequality proofs in the default branch, is it possible?
MWE:

inductive P : UInt8 -> Prop where
  | A : P 10
  | B : P 20

def T := Subtype P

def f (t : T) : Nat :=
  match t.val with
    | 10 => 1
    | 20 => 2
    | x => ?f

For reference, Agda counts two branches, without a default one, as exhaustive.

Sebastian Ullrich (Feb 10 2023 at 15:21):

Agda accepts the match after throwing away the bound at t.val? That seems unlikely.
It works if you match deep enough.

def f (t : T) : Nat :=
  match t with
  | ⟨⟨10, _⟩, _ => 1
  | ⟨⟨20, _⟩, _ => 2

Daniil Kisel (Feb 10 2023 at 16:20):

Thank you, I haven't tried matching on t, only on t.val, t.property which I thought was the same. And indeed in Agda I matched with mk 10 p, not 10, sorry. Also TIL what ⟨e, ...⟩ means, thanks again.


Last updated: Dec 20 2023 at 11:08 UTC