Zulip Chat Archive
Stream: lean4
Topic: strange C data order
Arthur Paulino (Feb 27 2025 at 10:44):
Hi all, I've been working with the FFI API lately and I noticed something strange. Consider the following Lean code:
inductive Foo
  | usize : USize → Foo
  | uint64 : UInt64 → Foo
  | uint64_usize : UInt64 → USize → Foo
  | usize_uint64 : USize → UInt64 → Foo
@[extern "c_print_foo"]
opaque printFoo : @& Foo → UInt32
def main : IO UInt32 :=
  let a := printFoo $ .usize 42
  let b := printFoo $ .uint64 42
  let c := printFoo $ .uint64_usize 42 101
  let d := printFoo $ .usize_uint64 42 101
  return a + b + c + d
And the corresponding C code:
#include "lean/lean.h"
#include "stdio.h"
extern uint32_t c_print_foo(b_lean_obj_arg foo) {
    uint8_t tag = foo->m_tag;
    lean_object **objs = lean_to_ctor(foo)->m_objs;
    if (tag == 0) {
        // usize
        printf("%lu\n", lean_unbox_usize(foo));
    } else if (tag == 1) {
        // uint64
        printf("%lu\n", lean_unbox_uint64(foo));
    } else if (tag == 2) {
        // uint64_usize
        printf("%lu %lu\n", (uint64_t)objs[0], (size_t)objs[1]);
    } else if (tag == 3) {
        // usize_uint64
        printf("%lu %lu\n", (size_t)objs[0], (uint64_t)objs[1]);
    }
    return 0;
}
For some reason, when I run it it prints
42
42
101 42
42 101
I wouldn't expect the inversion on the line before the last one, printing 101 42 instead of 42 101.
Why is that so? When am I supposed to expect this behavior?
Thanks in advance!
Markus Himmel (Feb 27 2025 at 10:51):
As described in https://github.com/leanprover/lean4/blob/master/doc/dev/ffi.md, USize fields are always laid out before other scalar fields, so your printing code for uint64_usize actually reads the USize field thinking it's reading the UInt64 field, and vice versa.
By the way, you should probably be using lean_ctor_get_usize and lean_ctor_get_uint64 instead of accessing m_objs directly.
Arthur Paulino (Feb 27 2025 at 11:24):
Thanks! What are the i and offset parameters from lean_ctor_get_usize and lean_ctor_get_uint64 respectively?
Markus Himmel (Feb 27 2025 at 12:05):
From the linked document:
- To access
USizefields, uselean_ctor_get_usize(val, n+i)to get theith usize field andnis the total number of fields of the first kind.- To access other scalar fields, use
lean_ctor_get_uintN(val, off)orlean_ctor_get_usize(val, off)as appropriate. Hereoffis the byte offset of the field in the structure, starting atn*sizeof(void*)wherenis the number of fields of the first two kinds.
Fields of the first kind are  non-scalar fields (of which there are none in your example) and fields of the second kind are USize fields.
Arthur Paulino (Feb 27 2025 at 12:10):
Thanks a lot! :pray:
Last updated: May 02 2025 at 03:31 UTC