Zulip Chat Archive
Stream: lean4
Topic: C FFI export
Jad Ghalayini (Jul 24 2022 at 02:09):
Hello! I am implementing an LCF-style kernel for a proof checker, and came up with what I think is a somewhat interesting design:
- Implement the kernel in Lean 4, and prove soundness and other fun properties
- Export the syntax tree and validated constructors via C FFI; this entire API is simply typed, so it should be easy to use safely
- Wrap the C FFI in a nice clean Rust API, and publish the result as a Rust library; the rest of the prover will then use the kernel via the Rust API, which we know to be bug free due to the Lean formalization
Unfortunately, I have no idea how I would go about exposing the necessary functions and/or data types to C (I may actually want the term/derivation data structures to be opaque, owned by Lean and only accessed via exposed API functions). Is there a way, in particular, to take, say, a function myFunc : Term -> Term and expose it to C?
Jad Ghalayini (Jul 24 2022 at 02:14):
I'm guessing I need to use @[export] from https://leanprover.github.io/lean4/doc/dev/ffi.html, but I'm a bit confused about how to package/link things nicely with Rust
Mac (Jul 24 2022 at 02:15):
You can tag a definition @[export your_c_name] to give it said symbol name in the compiled object.
Mac (Jul 24 2022 at 02:15):
Oops beat me to it.
Mac (Jul 24 2022 at 02:19):
When you build a Lean package you get object files (and can build a static library for a library via lake build Lib:static). Just link the built static library / object file of the module containing the @[export] symbol with your Rust code to make the symbol available. Then, in Rust, declare it as you would any other foreign language symbol.
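A sketch of what the Cargo side of this linking step might look like, via a build script; the package layout, library name, and install paths below are assumptions that depend on the lakefile and Lean toolchain.
// build.rs (sketch): point Cargo at the Lean-built static library and at
// the Lean runtime. Every name and path here is an assumption.
fn main() {
    // Where `lake build Lib:static` is assumed to put libLib.a.
    println!("cargo:rustc-link-search=native=lean_pkg/build/lib");
    println!("cargo:rustc-link-lib=static=Lib");
    // The Lean runtime must be linked as well; this assumes the toolchain's
    // shared runtime library (libleanshared) at a hypothetical install path.
    println!("cargo:rustc-link-search=native=/path/to/lean4/lib/lean");
    println!("cargo:rustc-link-lib=dylib=leanshared");
}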
Mac (Jul 24 2022 at 02:20):
e.g., via extern { fn your_c_name ... }
Mac (Jul 24 2022 at 02:33):
You will also likely want to create an opaque type corresponding to Lean's lean_object type, e.g.:
#[repr(C)] pub struct LeanObject { _private: [u8; 0] }
Then you can do:
extern "C" {
    pub fn my_func(term: *mut LeanObject) -> *mut LeanObject;
}
To reference your Lean function defined like so:
@[export my_func] def myFunc: Term -> Term
Note: I do not know much Rust, but I believe this is how it works on that side.
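One piece not shown above: before any exported Lean function is called from Rust, the Lean runtime and the compiled module have to be initialized. A minimal sketch, assuming a hypothetical module named Kernel and the opaque LeanObject type from above; the generated initializer's exact name and signature depend on the module and Lean version.
#[repr(C)] pub struct LeanObject { _private: [u8; 0] }

#[allow(non_snake_case)]
extern "C" {
    // Exported by the Lean runtime.
    fn lean_initialize_runtime_module();
    fn lean_io_mark_end_initialization();
    // Generated for the (hypothetical) `Kernel` module; the real name comes
    // from the C code Lean emits for your package.
    fn initialize_Kernel(builtin: u8, world: *mut LeanObject) -> *mut LeanObject;
}

fn init_lean() {
    unsafe {
        lean_initialize_runtime_module();
        // The "world" token is just the boxed unit value, i.e. the tagged
        // pointer 0x1 that lean.h's inline lean_io_mk_world would produce.
        let world = 1usize as *mut LeanObject;
        let _res = initialize_Kernel(1, world);
        // A real wrapper should inspect `_res` for an IO error and release
        // it; the helpers for that are static inline in lean.h, so a Rust
        // binding has to reimplement them (omitted here).
        lean_io_mark_end_initialization();
    }
}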
Jad Ghalayini (Jul 24 2022 at 17:46):
@Mac sounds good! I was thinking of actually making a separate crate for re-usable Lean-Rust bindings too (basically, wrapping up lean.h in Rust), as that doesn't seem to exist yet and I'll have to write them anyways.
Jad Ghalayini (Jul 24 2022 at 17:46):
Any name suggestions?
Jad Ghalayini (Jul 24 2022 at 22:50):
Also question: what's the thread safety of the functions in lean.h?
Jad Ghalayini (Jul 24 2022 at 22:50):
Are all the compiled functions in a Lean program thread safe?
Mac (Jul 24 2022 at 23:45):
Jad Ghalayini said:
Are all the compiled functions in a Lean program thread safe?
No and yes. Lots of the Lean code itself is not thread safe. However, the basic representations used to build functions (e.g., the allocator) are. That is, if a user makes sure to write thread safe code (e.g., parallel Tasks don't perform IO actions that can deadlock -- like waiting on each other), then the under-the-hood parts of Lean (like memory management, etc.) are generally thread safe. One noteworthy exception I know of at the moment is #1282 -- IO.Process.spawn with different environments is not thread safe.
Jad Ghalayini (Jul 25 2022 at 01:46):
Wait, what do you mean by thread safe? Because deadlocking is thread safe, just bad. I meant undefined behavior
Mac (Jul 25 2022 at 03:49):
I would generally not consider deadlocking to be thread safe (though, looking further, whether or not thread safety includes or excludes deadlocks seems debatable). However, to clarify, there are many functions in the Lean core which can break if multiple threads run them simultaneously (like my IO.Process.spawn example above). One important example is setting up a Lean environment, which is very much not thread safe (as this comment highlights).
Gabriel Ebner (Jul 25 2022 at 14:40):
However, to clarify, there are many functions in the Lean core which can break if multiple threads run them simultaneously
Just to clarify, the only thing that's definitely not expected to be thread-safe is setting up an environment for imported modules. But you're not supposed to do that more than once in a single process anyhow (since it modifies global variables).
Accessing IO.Refs from multiple threads is merely a bad idea, but not really unsafe. Just be aware that IO.Ref uses busy-waiting if another thread calls modify, so keep those modify functions short. If you're only using get and set, they're just atomic references.
The process spawning issue is a bug. Feel free to submit a fix, you're one of the few people here with a Windows machine.
Sebastian Ullrich (Jul 25 2022 at 14:56):
Coming back to lean.h, the short answer is that all functions should be thread-safe as long as you give them Lean objects marked as multi-threaded (lean_mark_mt).
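A sketch of how a Rust wrapper might use this: mark an object multi-threaded once, before it is ever shared, and only then treat the handle as Send. lean_mark_mt is the runtime function Sebastian mentions; the wrapper type and its (incomplete) ownership handling are assumptions.
#[repr(C)] pub struct LeanObject { _private: [u8; 0] }

extern "C" {
    // From the Lean runtime: recursively marks an object graph as
    // multi-threaded, so reference counting on it uses atomic operations.
    fn lean_mark_mt(o: *mut LeanObject);
}

// Hypothetical owning handle around a Lean object.
pub struct SharedLeanObject(*mut LeanObject);

impl SharedLeanObject {
    // Mark the object MT before it can cross a thread boundary, so the
    // lean.h functions called on it from other threads stay safe.
    pub fn new(ptr: *mut LeanObject) -> SharedLeanObject {
        unsafe { lean_mark_mt(ptr) };
        SharedLeanObject(ptr)
    }
}

// Only sound because the object was marked MT first (and assuming the
// wrapper also manages the reference count correctly, which is omitted).
unsafe impl Send for SharedLeanObject {}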
Jad Ghalayini (Jul 25 2022 at 16:22):
@Mac I guess when I was saying unsafe I meant precisely the Rust semantics of the word, since I want to write a Rust wrapper
Jad Ghalayini (Jul 25 2022 at 16:22):
And in Rust deadlock is safe haha
Jad Ghalayini (Jul 25 2022 at 16:24):
@Sebastian Ullrich I'm about 70% of the way through translating lean.h to Rust, and on the way I came across some interesting design decisions
Jad Ghalayini (Jul 25 2022 at 16:24):
I also found that a lot of things that were not implemented with atomics should have been relaxed reads afaict
Jad Ghalayini (Jul 25 2022 at 16:25):
You know who I can ask about this?
Sebastian Ullrich (Jul 25 2022 at 16:25):
You can ask this stream :)
Locria Cyber (Jul 25 2022 at 22:06):
Is there a plan for proper FFI support (feature parity with libffi)?
Jad Ghalayini (Jul 26 2022 at 02:17):
@Locria Cyber could you elaborate?
Jad Ghalayini (Jul 26 2022 at 02:26):
Sebastian Ullrich said:
You can ask this stream :)
Well, for one, all the reference count operations on the hot path, e.g. checking if something is single-threaded, do not use relaxed atomics, but rather just directly test for equality. I recognize I'm being a typical pedantic Rust stickler but aren't potentially racy accesses like that UB?
Jad Ghalayini (Jul 26 2022 at 02:26):
In my Rust wrapper anyways I used relaxed atomic loads for the checks; I still don't understand why we even need a separate single-threaded case at all though
Jad Ghalayini (Jul 26 2022 at 02:28):
In particular, the array-tricks used with mutation of arrays having only one reference can be implemented in a multi-threaded context with only a bit (1 atomic operation) of overhead; this would then avoid having to ever mark things multi-threaded while probably giving comparable performance
Jad Ghalayini (Jul 26 2022 at 02:28):
Has benchmarking shown this is inadequate?
Jad Ghalayini (Jul 26 2022 at 02:28):
Or is there other functionality that depends on the ST/MT distinction?
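The uniqueness check Jad describes can be sketched with std's Arc, which does essentially this: an atomic inspection of the reference count decides between in-place mutation and copy-on-write. This is only an illustration of the technique in plain Rust, not Lean's actual runtime representation.
use std::sync::Arc;

// Mutate in place when we are the unique owner, otherwise copy first.
fn push_unique_or_copy(arr: &mut Arc<Vec<u8>>, byte: u8) {
    if let Some(unique) = Arc::get_mut(arr) {
        // Unique reference: safe to mutate the buffer in place.
        unique.push(byte);
    } else {
        // Shared: clone the data, mutate the fresh copy, swap it in.
        let mut copy = (**arr).clone();
        copy.push(byte);
        *arr = Arc::new(copy);
    }
}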
Sebastian Ullrich (Jul 26 2022 at 06:49):
The prime motivation of the MT bit is making single-threaded RC ops fast, see Section 7.2 of https://arxiv.org/pdf/1908.05647.pdf
Sebastian Ullrich (Jul 26 2022 at 06:50):
Locria Cyber said:
Is there a plan for proper FFI support (feature parity with libffi)?
No plans, only a wish
Jad Ghalayini (Jul 26 2022 at 07:08):
@Sebastian Ullrich that's very interesting; I'm pleasantly surprised it's actually faster. One thing I noticed however is that multithreaded values are never recognized as unique; I think it should be relatively simple to add support for this, though I'm not sure if any problems would come up in practice.
Locria Cyber (Jul 26 2022 at 18:44):
I tried to bind libffi with Lean. It seems feasible. I may need help with lean_object*.
Locria Cyber (Jul 26 2022 at 18:46):
Jad Ghalayini said:
Sebastian Ullrich I'm about 70% of the way through translating lean.h to Rust, and on the way I came across some interesting design decisions
I probably should use Rust to bind libffi. What do you think?
Locria Cyber (Jul 26 2022 at 18:48):
@Jad Ghalayini I'd like to see a link to that source code (Rust interface).
Jad Ghalayini (Jul 27 2022 at 02:34):
It's on crates.io!
Jad Ghalayini (Jul 27 2022 at 02:34):
lean-sys
Jad Ghalayini (Jul 27 2022 at 02:34):
Just a translation of lean.h to Rust
Jad Ghalayini (Jul 27 2022 at 02:35):
@Locria Cyber would be happy to try to extend it to add libffi support
Tom (Jul 27 2022 at 21:41):
@Jad Ghalayini What do you mean by
Well, for one, all the reference count operations on the hot path, e.g. checking if something is single-threaded, do not use relaxed atomics, but rather just directly test for equality. I recognize I'm being a typical pedantic Rust stickler but aren't potentially racy accesses like that UB?
C++ has a fairly well-defined memory model, and if not explicitly specified (e.g. as a relaxed read) the default behavior is to fall back to sequential consistency (which is obviously stronger), so I assume I'm misunderstanding your statement. How would the access be racy/lead to UB? Could you please point me at an example C++ file?
As for the hot path comment, what arch are you running on? Seems most of the time marking reads as relaxed hardly matters unless you're on some more exotic arch which doesn't implement MESI.
Jad Ghalayini (Jul 28 2022 at 02:20):
@Tom yes, an unmarked read of an atomic variable defaults to sequential consistency; but the reference count used is not marked atomic. It's just an i32
Jad Ghalayini (Jul 28 2022 at 02:20):
Also this is in lean.h so it's actually being called from C code, but the memory models don't disagree here
Sebastian Ullrich (Jul 28 2022 at 08:56):
Well, this is the core of a language runtime. So rules around UB might get a little... relaxed at times. Effectively, if it's impossible for the compiler or any reasonable hardware implementation to screw up, we might just have to live with it.
Sebastian Ullrich (Jul 28 2022 at 08:59):
Here the basic proposition not supported by the C/C++ memory models is that a nonatomic write (no larger than the word size, in particular) to a location where the old and new value agree on a particular bit cannot be observed concurrently to change that bit.
Mario Carneiro (Jul 28 2022 at 11:47):
I don't see why this can't be a relaxed atomic write. It should yield the same assembly
Mario Carneiro (Jul 28 2022 at 11:50):
Your basic proposition is certainly not valid in the C/C++ spec, so it can be miscompiled. Basically, the compiler can see a nonatomic write as evidence that it is the only writer to that location, which makes it legal to insert other writes to that location, inline the results of reads and writes, and so on, which is not what you want
Sebastian Ullrich (Jul 28 2022 at 12:11):
Mario Carneiro said:
I don't see why this can't be a relaxed atomic write. It should yield the same assembly
That's not true without further changes https://godbolt.org/z/fjK91aEYc
Mario Carneiro (Jul 28 2022 at 13:53):
Ah, I was assuming you didn't actually have data races, but this code is defending against exactly that. I assume lean_is_st means "is single-threaded"? So there isn't actually a race to change this m_rc variable because in multithreaded mode it's always 0?
Mario Carneiro (Jul 28 2022 at 14:03):
It's also a bit sketchy but this yields the equivalent assembly:
static inline bool lean_is_st2(lean_object2 * o) {
    return atomic_load_explicit(&o->m_rc, memory_order_relaxed) > 0;
}

void lean_inc_ref2(lean_object2 * o) {
    if (lean_is_st2(o)) {
        (*(int*)&o->m_rc)++;
    } else {}
}
In C++20 you can use atomic_ref to do non-atomic operations on atomic objects, but AFAIK this is the best you can do in C11
Mario Carneiro (Jul 28 2022 at 14:05):
in this particular case you can also just manually inline the function instead of relying on the compiler to do a sketchy optimization on your behalf
void lean_inc_ref2(lean_object2 * o) {
    int rc = atomic_load_explicit(&o->m_rc, memory_order_relaxed);
    if (rc > 0) {
        atomic_store_explicit(&o->m_rc, rc + 1, memory_order_relaxed);
    } else {}
}
Sebastian Ullrich (Jul 28 2022 at 14:39):
Yeah, either of these would probably be fine. It reminds me a bit of the non-transitivity of defeq issue - we know it's not optimal and there may even be a simple fix, but the practical impact is likely zero. So let's maybe wait a bit more.
Kevin Buzzard (Jul 28 2022 at 18:41):
I was really surprised that nontransitivity of defeq doesn't ruin the party more often (or indeed ever, as far as I can see). We've now done a substantial bunch of mathematics in Lean and I don't ever recall it getting in the way (perhaps because we often care less about defeq than about good simp lemmas).
Leonardo de Moura (Jul 28 2022 at 20:03):
Sebastian Ullrich said:
Yeah, either of these would probably be fine. It reminds me a bit of the non-transitivity of defeq issue - we know it's not optimal and there may even be a simple fix, but the practical impact is likely zero. So let's maybe wait a bit more.
I couldn't agree more. I agree that the impact is likely zero. We currently don't even have the resources to review a PR with this kind of change that affects a critical part of the system.
It is great to see all the enthusiasm. BTW, I'd like to remind everybody that we have a list of "help wanted" issues, they are great for first-time contributors.
https://github.com/leanprover/lean4/labels/help%20wanted
Jad Ghalayini (Jul 29 2022 at 04:10):
Btw @Mario Carneiro your code is basically what I put in my Rust lean-sys
Jad Ghalayini (Jul 29 2022 at 04:10):
What can I say, I'm very pedantic about UB hahaha
Locria Cyber (Jul 29 2022 at 11:39):
Jad Ghalayini said:
Locria Cyber would be happy to try to extend it to add libffi support
I mean to use lean-sys and bind ffi (the Rust crate) with Lean, so Lean can use ffi.
Locria Cyber (Jul 29 2022 at 11:46):
it's a pain that Lean doesn't expose Nat -> C integer conversion properly; all the nat_to_intxxxx functions are modulo
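For what it's worth, a binding can at least detect the truncating case before converting. A hedged sketch, assuming lean.h's representation where small Nats are scalars (tagged pointers with the low bit set and the value in the remaining bits) and larger Nats are heap-allocated big integers:
#[repr(C)] pub struct LeanObject { _private: [u8; 0] }

// Convert a Nat to usize only when it is a small scalar; refuse to wrap.
unsafe fn nat_to_usize_checked(n: *mut LeanObject) -> Option<usize> {
    let raw = n as usize;
    if raw & 1 == 1 {
        // Scalar: same computation as lean_unbox in lean.h.
        Some(raw >> 1)
    } else {
        // Heap-allocated big Nat: reject instead of reducing modulo 2^n.
        None
    }
}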
Locria Cyber (Jul 29 2022 at 12:02):
I found what may be a bug in tests/compiler/foreign/myfuns.cpp:
lean_object * lean_S_update_global(b_lean_obj_arg s, lean_object /* w */)
The world object is a lean_object*, not a plain lean_object.
Locria Cyber (Jul 29 2022 at 12:03):
I think the world object is borrowed?
Sebastian Ullrich (Jul 29 2022 at 12:07):
It's just the tagged pointer 0x1, so it's neither really. But that signature is incorrect, right.
Locria Cyber (Jul 29 2022 at 12:10):
also, Int64 isn't generated as int64_t in C
Sebastian Ullrich (Jul 29 2022 at 12:15):
There is no Int64 in the stdlib?
Locria Cyber (Jul 29 2022 at 12:18):
I mean FFI. It is generated as lean_object * with @[extern "xxx"]
Locria Cyber (Jul 29 2022 at 12:21):
How do I use a static library with Lake?
Locria Cyber (Jul 29 2022 at 12:21):
Lake's readme is vague
Locria Cyber (Jul 29 2022 at 13:38):
@Jad Ghalayini Your lean_alloc_ctor leaks memory
Locria Cyber (Jul 29 2022 at 13:47):
no, it's my fault for using a Rust &str as a C string
Locria Cyber (Jul 29 2022 at 14:01):
https://github.com/locriacyber/lean-sys-example
Here's an example I made
Mac (Jul 29 2022 at 16:20):
Locria Cyber said:
How do I use a static library with Lake?
Does the FFI example help?
Locria Cyber (Jul 29 2022 at 16:20):
I already figured it out
Locria Cyber (Aug 09 2022 at 18:48):
@Jad Ghalayini I made a pull request on GitLab on lean-sys for some fixes
Locria Cyber (Aug 09 2022 at 18:50):
Turns out there is no need for libffi if I can link the C/Rust library.
Locria Cyber (Aug 09 2022 at 18:50):
at runtime, libffi may be needed
Locria Cyber (Aug 09 2022 at 18:50):
although some glue code in C is enough
Jad Ghalayini (Aug 09 2022 at 18:53):
Merged!
Jad Ghalayini (Aug 09 2022 at 18:53):
@Locria Cyber would you like to be added as a maintainer
Jad Ghalayini (Aug 09 2022 at 18:53):
I plan to extend the library significantly with some higher-level safe wrappers later
Jad Ghalayini (Aug 09 2022 at 18:54):
Tell me if you're interested!
Jad Ghalayini (Aug 09 2022 at 18:54):
Also I'm curious what exactly you need libffi for
Jad Ghalayini (Aug 09 2022 at 18:54):
I plan to be making very heavy use of Lean FFI in my PhD as proposed (see gitlab.com/tekne/phd-proposal) and would be happy to collaborate!
Tomas Skrivan (Aug 09 2022 at 22:15):
Do you have a pdf of the proposal? I'm not very good at reading raw TeX.
Locria Cyber (Aug 16 2022 at 12:09):
Jad Ghalayini said:
Locria Cyber would you like to be added as a maintainer
ok
Locria Cyber (Aug 16 2022 at 12:10):
libffi is for runtime loading of dynamic libraries
Locria Cyber (Aug 16 2022 at 12:11):
at runtime:
- load a dynamic library
- construct the function's type/ABI
- call it
Locria Cyber (Aug 16 2022 at 12:12):
This is not really necessary, since in most FFI usage, you know the function types you want to call.
e.g. void puts(char*);
Locria Cyber (Aug 16 2022 at 12:16):
https://gitlab.com/tekne/phd-proposal/-/blob/main/figures/rvsdg.pdf
:rolling_on_the_floor_laughing: why does this RVSDG seem very similar to a language/notation I made?
Last updated: Dec 20 2023 at 11:08 UTC