Zulip Chat Archive

Stream: lean4

Topic: strange C data order


Arthur Paulino (Feb 27 2025 at 10:44):

Hi all, I've been working with the FFI API lately and I noticed something strange. Consider the following Lean code:

inductive Foo
  | usize : USize  Foo
  | uint64 : UInt64  Foo
  | uint64_usize : UInt64  USize  Foo
  | usize_uint64 : USize  UInt64  Foo

@[extern "c_print_foo"]
opaque printFoo : @& Foo  UInt32

def main : IO UInt32 :=
  let a := printFoo $ .usize 42
  let b := printFoo $ .uint64 42
  let c := printFoo $ .uint64_usize 42 101
  let d := printFoo $ .usize_uint64 42 101
  return a + b + c + d

And the corresponding C code:

#include "lean/lean.h"
#include "stdio.h"

extern uint32_t c_print_foo(b_lean_obj_arg foo) {
    uint8_t tag = foo->m_tag;
    lean_object **objs = lean_to_ctor(foo)->m_objs;
    if (tag == 0) {
        // usize
        printf("%lu\n", lean_unbox_usize(foo));
    } else if (tag == 1) {
        // uint64
        printf("%lu\n", lean_unbox_uint64(foo));
    } else if (tag == 2) {
        // uint64_usize
        printf("%lu %lu\n", (uint64_t)objs[0], (size_t)objs[1]);
    } else if (tag == 3) {
        // usize_uint64
        printf("%lu %lu\n", (size_t)objs[0], (uint64_t)objs[1]);
    }
    return 0;
}

For some reason, when I run it it prints

42
42
101 42
42 101

I wouldn't expect the inversion on the line before the last one, printing 101 42 instead of 42 101.

Why is that so? When am I supposed to expect this behavior?

Thanks in advance!

Markus Himmel (Feb 27 2025 at 10:51):

As described in https://github.com/leanprover/lean4/blob/master/doc/dev/ffi.md, USize fields are always laid out before other scalar fields, so your printing code for uint64_usize actually reads the USize field thinking it's reading the UInt64 field, and vice versa.

By the way, you should probably be using lean_ctor_get_usize and lean_ctor_get_uint64 instead of accessing m_objs directly.

Arthur Paulino (Feb 27 2025 at 11:24):

Thanks! What are the i and offset parameters from lean_ctor_get_usize and lean_ctor_get_uint64 respectively?

Markus Himmel (Feb 27 2025 at 12:05):

From the linked document:

  • To access USize fields, use lean_ctor_get_usize(val, n+i) to get the ith usize field and n is the total number of fields of the first kind.
  • To access other scalar fields, use lean_ctor_get_uintN(val, off) or lean_ctor_get_usize(val, off) as appropriate. Here off is the byte offset of the field in the structure, starting at n*sizeof(void*) where n is the number of fields of the first two kinds.

Fields of the first kind are non-scalar fields (of which there are none in your example) and fields of the second kind are USize fields.

Arthur Paulino (Feb 27 2025 at 12:10):

Thanks a lot! :pray:


Last updated: May 02 2025 at 03:31 UTC