Zulip Chat Archive
Stream: lean4
Topic: Understanding Lean's Foreign Function Interface (FFI)
Yuri (Jul 17 2024 at 14:59):
I am sharing here a document that started as a few notes, morphed into a full blown reference and ended somewhere in between as a guide to understand how Lean does FFI today. It tries to approach the topic from a beginners point of view, but can get down to details too.
I hope this is helpful to others here: Understanding Lean's Foreign Function Interface (FFI)
And if you see mistakes, omissions or any feedback, please, let me know.
Yuri (Jul 17 2024 at 15:04):
BTW: I still want to add content for how to deal with external objects and closures.
Henrik Böving (Jul 17 2024 at 15:15):
If you want to keep expanding on this, I really liked https://ocaml.org/manual/5.2/intfc.html and would love if we had a guide in this direction.
Yuri (Jul 17 2024 at 15:20):
Hi, @Henrik Böving! That’s a possibility, but I plan to spend some time gathering corrections and feedback before making any major changes.
One concern I have, based on my experience so far, is that the FFI could benefit from a significant redesign to enhance its usability as a general FFI. While it works well as a low-overhead interface between C/C++ and Lean, it falls short in other areas.
Therefore, I would think twice before investing much more energy into a comprehensive documentation revamp.
Yuri (Jul 17 2024 at 15:23):
Henrik Böving said:
If you want to keep expanding on this, I really liked https://ocaml.org/manual/5.2/intfc.html and would love if we had a guide in this direction.
Out of curiosity, what specifically attracted you to it aside from presentation?
Daniil Kisel (Jul 17 2024 at 16:06):
Proposing to change:
- In "Declaring Opaque Types"
Inhabited
->Nonempty
(certain operations -> opaque functions' return types must be nonempty (and IIRC Inhabited if not marked extern?)) - Constructors without fields are stored as
lean_box(tag)
, the examplec_fallible_operation
allocates ctor obj forOption.none
- Constructor layout mentions that
Char
s are stored withUInt32
s, while they are stored with objects - In "Using C Structs Directly with Accessor Functions"
Int32
andlean_box_int32
are used, but they are not defined - In "Error Handling in Lean Code"
()
is used as a type, and the wording in some places seems to advise usingIO
for fallible operations, regardless of purity IO
functions C implementations lack the "RealWorld" argument (UB?)
Something that can be added:
- LEAN_CC environment variable
- Marking objects as MT/persistent
- Handling of constants (
Unit -> T
) - Thread initialization/finalization
BaseIO
andEIO
@[extern c inline "..."]
- Precompile option in lakefile
- Example of using
lean_alloc_external
,initialize
(builtin_initialize
) external code (which can be used to initialize external object classes) - Trivial wrappers
Alloy
library andlean_sys
rust crate
Yuri (Jul 17 2024 at 17:31):
This is a great list, thanks!
Siddhartha Gadgil (Jul 18 2024 at 04:20):
@Yuri de Wit Thanks for this. One thing that confused me: "We pass UInt8
, UInt32
, and Float
as unboxed values" seems to contradict "UInt64
, USize
, and Float
are always passed as boxed values". Is Float
boxed or an unboxed double
in C?
Yuri (Jul 18 2024 at 11:57):
Siddhartha Gadgil said:
Yuri de Wit Thanks for this. One thing that confused me: "We pass
UInt8
,UInt32
, andFloat
as unboxed values" seems to contradict "UInt64
,USize
, andFloat
are always passed as boxed values". IsFloat
boxed or an unboxeddouble
in C?
Hi @Siddhartha Gadgil , thanks for the feedback. I should make that clearer. When the scalar crosses the FFI bounded through an @[extern ..]
function declaration it is passed unboxed:
@[extern "c_process_scalars"]
opaque processScalars (a : UInt8) (b : UInt32) (c : Float) : IO (UInt64 × Float)
And the corresponding C function:
LEAN_EXPORT lean_obj_res c_process_scalars(uint8_t a, uint32_t b, double c) {
..
Note that uint32_t
and double
are passed unboxed and not as lean_obj_arg
(boxed).
In addition, Lean constructors (structure/inductive) have a layout that also stores scalars unboxed after all the boxed fields.
One place where where I see the need to explicitly box and unbox scalar values in FFI is when dealing with closures in FFI, since the API expects always a lean_object*
boxed parameter. But I guess there could be other places I am not aware. This makes me think that boxing and unboxing of scalars may have gotten more attention than it deserved.
Yuri (Jul 18 2024 at 12:05):
@Siddhartha Gadgil , sorry, but I think I missed your original question:
*
UInt8,
UInt16, and
Bool are always passed unboxed in function calls.
vs
*
UInt64,
USize, and
Float are always passed as boxed values.
What I meant to say here is that when you lean_box_uint64
a UInt64
you are effectively creating a constructor with one scalar field. The resulting lean_object*
points to this constructor in the heap containing the scalar. Since UInt8
, Uint16
and maybe (depending on the machine architecture) UInt32
can fit in a machine word, this constructor wrapping does not happen there and the scalar is stored in the actual pointer with the lowest bit set to 1
.
I will fix the doc.
James Gallicchio (Jul 18 2024 at 12:43):
this is a great reference, thanks for making it!
James Gallicchio (Jul 18 2024 at 12:44):
Daniil Kisel said:
- Constructor layout mentions that
Char
s are stored withUInt32
s, while they are stored with objects
is this accurate? I thought Char
is represented as uint32_t
as a trivial wrapper struct but haven't tested anything. Just surprised because I think my code is working right now treating it as a uint32_t
:big_smile:
James Gallicchio (Jul 18 2024 at 12:45):
oh, wait, trivial wrappers still change the layout when used in constructors... got it......
Last updated: May 02 2025 at 03:31 UTC