Zulip Chat Archive

Stream: lean4

Topic: Understanding Lean's Foreign Function Interface (FFI)


Yuri (Jul 17 2024 at 14:59):

I am sharing here a document that started as a few notes, morphed into a full blown reference and ended somewhere in between as a guide to understand how Lean does FFI today. It tries to approach the topic from a beginners point of view, but can get down to details too.

I hope this is helpful to others here: Understanding Lean's Foreign Function Interface (FFI)

And if you see mistakes, omissions or any feedback, please, let me know.

Yuri (Jul 17 2024 at 15:04):

BTW: I still want to add content for how to deal with external objects and closures.

Henrik Böving (Jul 17 2024 at 15:15):

If you want to keep expanding on this, I really liked https://ocaml.org/manual/5.2/intfc.html and would love if we had a guide in this direction.

Yuri (Jul 17 2024 at 15:20):

Hi, @Henrik Böving! That’s a possibility, but I plan to spend some time gathering corrections and feedback before making any major changes.

One concern I have, based on my experience so far, is that the FFI could benefit from a significant redesign to enhance its usability as a general FFI. While it works well as a low-overhead interface between C/C++ and Lean, it falls short in other areas.

Therefore, I would think twice before investing much more energy into a comprehensive documentation revamp.

Yuri (Jul 17 2024 at 15:23):

Henrik Böving said:

If you want to keep expanding on this, I really liked https://ocaml.org/manual/5.2/intfc.html and would love if we had a guide in this direction.

Out of curiosity, what specifically attracted you to it aside from presentation?

Daniil Kisel (Jul 17 2024 at 16:06):

Proposing to change:

  • In "Declaring Opaque Types" Inhabited -> Nonempty (certain operations -> opaque functions' return types must be nonempty (and IIRC Inhabited if not marked extern?))
  • Constructors without fields are stored as lean_box(tag), the example c_fallible_operation allocates ctor obj for Option.none
  • Constructor layout mentions that Chars are stored with UInt32s, while they are stored with objects
  • In "Using C Structs Directly with Accessor Functions" Int32 and lean_box_int32 are used, but they are not defined
  • In "Error Handling in Lean Code" () is used as a type, and the wording in some places seems to advise using IO for fallible operations, regardless of purity
  • IO functions C implementations lack the "RealWorld" argument (UB?)

Something that can be added:

  • LEAN_CC environment variable
  • Marking objects as MT/persistent
  • Handling of constants (Unit -> T)
  • Thread initialization/finalization
  • BaseIO and EIO
  • @[extern c inline "..."]
  • Precompile option in lakefile
  • Example of using lean_alloc_external, initialize (builtin_initialize) external code (which can be used to initialize external object classes)
  • Trivial wrappers
  • Alloy library and lean_sys rust crate

Yuri (Jul 17 2024 at 17:31):

This is a great list, thanks!

Siddhartha Gadgil (Jul 18 2024 at 04:20):

@Yuri de Wit Thanks for this. One thing that confused me: "We pass UInt8UInt32, and Float as unboxed values" seems to contradict "UInt64USize, and Float are always passed as boxed values". Is Float boxed or an unboxed double in C?

Yuri (Jul 18 2024 at 11:57):

Siddhartha Gadgil said:

Yuri de Wit Thanks for this. One thing that confused me: "We pass UInt8UInt32, and Float as unboxed values" seems to contradict "UInt64USize, and Float are always passed as boxed values". Is Float boxed or an unboxed double in C?

Hi @Siddhartha Gadgil , thanks for the feedback. I should make that clearer. When the scalar crosses the FFI bounded through an @[extern ..] function declaration it is passed unboxed:

@[extern "c_process_scalars"]
opaque processScalars (a : UInt8) (b : UInt32) (c : Float) : IO (UInt64 × Float)

And the corresponding C function:

LEAN_EXPORT lean_obj_res c_process_scalars(uint8_t a, uint32_t b, double c) {
..

Note that uint32_t and double are passed unboxed and not as lean_obj_arg (boxed).

In addition, Lean constructors (structure/inductive) have a layout that also stores scalars unboxed after all the boxed fields.

One place where where I see the need to explicitly box and unbox scalar values in FFI is when dealing with closures in FFI, since the API expects always a lean_object* boxed parameter. But I guess there could be other places I am not aware. This makes me think that boxing and unboxing of scalars may have gotten more attention than it deserved.

Yuri (Jul 18 2024 at 12:05):

@Siddhartha Gadgil , sorry, but I think I missed your original question:

* UInt8UInt16, and Bool are always passed unboxed in function calls.
vs
* UInt64USize, and Float are always passed as boxed values.

What I meant to say here is that when you lean_box_uint64 a UInt64 you are effectively creating a constructor with one scalar field. The resulting lean_object* points to this constructor in the heap containing the scalar. Since UInt8, Uint16 and maybe (depending on the machine architecture) UInt32 can fit in a machine word, this constructor wrapping does not happen there and the scalar is stored in the actual pointer with the lowest bit set to 1.

I will fix the doc.

James Gallicchio (Jul 18 2024 at 12:43):

this is a great reference, thanks for making it!

James Gallicchio (Jul 18 2024 at 12:44):

Daniil Kisel said:

  • Constructor layout mentions that Chars are stored with UInt32s, while they are stored with objects

is this accurate? I thought Char is represented as uint32_t as a trivial wrapper struct but haven't tested anything. Just surprised because I think my code is working right now treating it as a uint32_t :big_smile:

James Gallicchio (Jul 18 2024 at 12:45):

oh, wait, trivial wrappers still change the layout when used in constructors... got it......


Last updated: May 02 2025 at 03:31 UTC