Zulip Chat Archive

Stream: lean4

Topic: C FFI export


Jad Ghalayini (Jul 24 2022 at 02:09):

Hello! I am implementing an LCF-style kernel for a proof checker, and came up with what I think is a somewhat interesting design:

  • Implement the kernel in Lean 4, and prove soundness and other fun properties
  • Export the syntax tree and validated constructors via C FFI; this entire API is simply typed, so it should be easy to use safely
  • Wrap the C FFI in a nice clean Rust API, and publish the result as a Rust library; the rest of the prover will then use the kernel via the Rust API, which we know to be bug-free due to the Lean formalization

Unfortunately, I have no idea how I would go about exposing the necessary functions and/or data types to C (I may actually want the term/derivation data structures to be opaque, owned by Lean, and only accessed via exposed API functions). Is there a way, in particular, to take a function, say myFunc: Term -> Term, and expose it to C?

Jad Ghalayini (Jul 24 2022 at 02:14):

I'm guessing I need to use @[export] from https://leanprover.github.io/lean4/doc/dev/ffi.html, but I'm a bit confused about how to package/link things nicely with Rust

Mac (Jul 24 2022 at 02:15):

You can tag a definition @[export your_c_name] to give it said symbol name in the compiled object.

Mac (Jul 24 2022 at 02:15):

Oops beat me to it.

Mac (Jul 24 2022 at 02:19):

When you build a Lean package you get object files (and you can build a static library for a library target via lake build Lib:static). Just link the built static library / object file of the module containing the @[export] symbol with your Rust code to make the symbol available. Then, in Rust, declare it as you would any other foreign symbol.
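
For the linking step, a build.rs along these lines should work (a sketch only: the package name Kernel, the lake build Kernel:static output path, and the KERNEL_LIB_DIR / LEAN_LIB_DIR variables are assumptions, so adjust them to your setup):

// build.rs (sketch): tell Cargo where the Lean static library and runtime live.
use std::env;

fn main() {
    // Directory containing libKernel.a produced by `lake build Kernel:static` (assumed path).
    let kernel_lib_dir = env::var("KERNEL_LIB_DIR").unwrap_or_else(|_| "lean/build/lib".into());
    // The Lean toolchain's library directory, e.g. $LEAN_SYSROOT/lib/lean.
    let lean_lib_dir = env::var("LEAN_LIB_DIR").expect("set LEAN_LIB_DIR to <lean sysroot>/lib/lean");

    println!("cargo:rustc-link-search=native={kernel_lib_dir}");
    println!("cargo:rustc-link-search=native={lean_lib_dir}");
    // The compiled Lean code containing the @[export] symbols:
    println!("cargo:rustc-link-lib=static=Kernel");
    // The Lean runtime; linking the shared leanshared library is the simplest option:
    println!("cargo:rustc-link-lib=dylib=leanshared");
}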

Mac (Jul 24 2022 at 02:20):

e.g., via extern { fn your_c_name ... }

Mac (Jul 24 2022 at 02:33):

You will also likely want to create an opaque type corresponding to Lean's lean_object type, e.g.:

#[repr(C)] pub struct LeanObject { _private: [u8; 0] }

Then you can do:

extern "C" {
    pub fn my_func(term: *mut LeanObject) -> *mut LeanObject;
}

To reference your Lean function defined like so:

@[export my_func] def myFunc: Term -> Term

Note: I do not know much Rust, but I believe this is how it works on that side.
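
One more piece you will need on the Rust side before calling anything: the Lean runtime and the package's module have to be initialized first. A rough sketch following the pattern from the FFI docs (the module name Kernel, and therefore the generated initialize_Kernel symbol, is an assumption here; checking the IO result returned by the initializer is omitted):

#[repr(C)]
pub struct LeanObject { _private: [u8; 0] }

#[allow(non_snake_case)]
extern "C" {
    // Use lean_initialize() instead if the package imports the Lean package itself.
    fn lean_initialize_runtime_module();
    fn lean_io_mark_end_initialization();
    // Generated per-module initializer; the exact signature may vary by toolchain version.
    fn initialize_Kernel(builtin: u8, world: *mut LeanObject) -> *mut LeanObject;
    // The Lean definition tagged @[export my_func].
    fn my_func(term: *mut LeanObject) -> *mut LeanObject;
}

pub fn init_lean() {
    unsafe {
        lean_initialize_runtime_module();
        // The IO "world" token is just the tagged pointer 1 (lean_box(0)).
        let world = 1usize as *mut LeanObject;
        let _ = initialize_Kernel(1, world);
        lean_io_mark_end_initialization();
    }
}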

Jad Ghalayini (Jul 24 2022 at 17:46):

@Mac sounds good! I was thinking of actually making a separate crate for re-usable Lean-Rust bindings too (basically, wrapping up lean.h in Rust), as that doesn't seem to exist yet and I'll have to write them anyways.

Jad Ghalayini (Jul 24 2022 at 17:46):

Any name suggestions?

Jad Ghalayini (Jul 24 2022 at 22:50):

Also question: what's the thread safety of the functions in lean.h?

Jad Ghalayini (Jul 24 2022 at 22:50):

Are all the compiled functions in a Lean program thread safe?

Mac (Jul 24 2022 at 23:45):

Jad Ghalayini said:

Are all the compiled functions in a Lean program thread safe?

No and yes. Lots of the Lean code itself is not thread safe. However, the basic representations used to build functions (e.g., the allocator) are. That is, if a user makes sure to write thread-safe code (e.g., parallel Tasks don't perform IO actions that can deadlock, like waiting on each other), the under-the-hood parts of Lean (memory management, etc.) are generally thread safe. One noteworthy exception I know of at the moment is #1282 -- IO.Process.spawn with different environments is not thread safe.

Jad Ghalayini (Jul 25 2022 at 01:46):

Wait, what do you mean by thread safe? Because deadlocking is thread safe, just bad. I meant undefined behavior

Mac (Jul 25 2022 at 03:49):

I would generally not consider deadlocking to be thread safe (though, looking further, whether "thread safe" includes or excludes deadlocks seems debatable). However, to clarify, there are many functions in the Lean core which can break if multiple threads run them simultaneously (like my IO.Process.spawn example above). One important example is setting up a Lean environment, which is very much not thread safe (as this comment highlights).

Gabriel Ebner (Jul 25 2022 at 14:40):

However, to clarify, there are many functions in the Lean core which can break if multiple threads run them simultaneously

Just to clarify, the only thing that's definitely not expected to be thread-safe is setting up an environment for imported modules. But you're not supposed to do that more than once in a single process anyhow (since it modifies global variables).

Accessing IO.Refs from multiple threads is merely a bad idea, but not really unsafe. Just be aware that IO.Ref uses busy-waiting if another thread calls modify, so keep those modify functions short. If you're only using get and set, they're just atomic references.

The process spawning issue is a bug. Feel free to submit a fix, you're one of the few people here with a windows machine.

Sebastian Ullrich (Jul 25 2022 at 14:56):

Coming back to lean.h, the short answer is that all functions should be thread-safe as long as you give them Lean objects marked as multi-threaded (lean_mark_mt).
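
In a Rust wrapper that would amount to something like this before handing an object to another thread (a sketch: lean_mark_mt is the exported runtime function, and LeanObject is the opaque type from earlier in the thread):

#[repr(C)]
pub struct LeanObject { _private: [u8; 0] }

extern "C" {
    // Recursively switches an object over to the multi-threaded RC scheme.
    fn lean_mark_mt(o: *mut LeanObject);
}

// Mark `term` (and everything it references) as multi-threaded before sharing it,
// so that subsequent reference count updates on it are performed atomically.
pub unsafe fn prepare_for_sharing(term: *mut LeanObject) {
    lean_mark_mt(term);
}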

Jad Ghalayini (Jul 25 2022 at 16:22):

@Mac I guess when I was saying unsafe I meant precisely the Rust semantics of the word, since I want to write a Rust wrapper

Jad Ghalayini (Jul 25 2022 at 16:22):

And in Rust deadlock is safe haha

Jad Ghalayini (Jul 25 2022 at 16:24):

@Sebastian Ullrich I'm about 70% of the way through translating lean.h to Rust, and on the way I came across some interesting design decisions

Jad Ghalayini (Jul 25 2022 at 16:24):

I also found that a lot of things that were not implemented with atomics should have been relaxed reads, afaict

Jad Ghalayini (Jul 25 2022 at 16:25):

Do you know who I can ask about this?

Sebastian Ullrich (Jul 25 2022 at 16:25):

You can ask this stream :)

Locria Cyber (Jul 25 2022 at 22:06):

Is there a plan for proper FFI support (feature parity with libffi)?

Jad Ghalayini (Jul 26 2022 at 02:17):

@Locria Cyber could you elaborate?

Jad Ghalayini (Jul 26 2022 at 02:26):

Sebastian Ullrich said:

You can ask this stream :)

Well, for one, all the reference count operations on the hot path, e.g. checking whether something is single-threaded, do not use relaxed atomics, but rather just read and test the count non-atomically. I recognize I'm being a typical pedantic Rust stickler, but aren't potentially racy accesses like that UB?

Jad Ghalayini (Jul 26 2022 at 02:26):

In my Rust wrapper anyways I used relaxed atomic loads for the checks; I still don't understand why we even need a separate single-threaded case at all though
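
Roughly, the check becomes something like the following (a sketch; the two header fields shown mirror my reading of lean.h, and the exact layout is an assumption):

use std::sync::atomic::{AtomicI32, Ordering};

// Start of the lean_object header: m_rc > 0 means the object is reference
// counted in the non-atomic, single-threaded mode.
#[repr(C)]
pub struct LeanObjHeader {
    pub m_rc: AtomicI32,
    pub m_flags: u32, // packs lean.h's m_cs_sz / m_other / m_tag bitfields
}

pub unsafe fn lean_is_st(o: *const LeanObjHeader) -> bool {
    // A relaxed load compiles to a plain load on mainstream targets, but it
    // avoids a data race in the language-level memory model.
    (*o).m_rc.load(Ordering::Relaxed) > 0
}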

Jad Ghalayini (Jul 26 2022 at 02:28):

In particular, the trick of mutating an array in place when it has only one reference can be implemented in a multi-threaded context with only a little overhead (one atomic operation); this would then avoid ever having to mark things multi-threaded while probably giving comparable performance

Jad Ghalayini (Jul 26 2022 at 02:28):

Has benchmarking shown this is inadequate?

Jad Ghalayini (Jul 26 2022 at 02:28):

Or is there other functionality that depends on the ST/MT distinction?

Sebastian Ullrich (Jul 26 2022 at 06:49):

The prime motivation of the MT bit is making single-threaded RC ops fast, see Section 7.2 of https://arxiv.org/pdf/1908.05647.pdf

Sebastian Ullrich (Jul 26 2022 at 06:50):

Locria Cyber said:

Is there a plan for proper FFI support (feature parity with libffi)?

No plans, only a wish

Jad Ghalayini (Jul 26 2022 at 07:08):

@Sebastian Ullrich that's very interesting; I'm pleasantly surprised it's actually faster. One thing I noticed, however, is that multithreaded values are never recognized as unique; I think it should be relatively simple to add support for this, though I'm not sure whether any problems would come up in practice.

Locria Cyber (Jul 26 2022 at 18:44):

I tried to bind libffi with Lean. It seems feasible. I may need help with lean_object*.

Locria Cyber (Jul 26 2022 at 18:46):

Jad Ghalayini said:

Sebastian Ullrich I'm about 70% of the way through translating lean.h to Rust, and on the way I came across some interesting design decisions

I probably should use Rust to bind libffi. What do you think?

Locria Cyber (Jul 26 2022 at 18:48):

@Jad Ghalayini I'd like to see a link to that source code (the Rust interface).

Jad Ghalayini (Jul 27 2022 at 02:34):

It's on crates.io!

Jad Ghalayini (Jul 27 2022 at 02:34):

lean-sys

Jad Ghalayini (Jul 27 2022 at 02:34):

Just a translation of lean.h to Rust

Jad Ghalayini (Jul 27 2022 at 02:35):

@Locria Cyber would be happy to try to extend it to add libffi support

Tom (Jul 27 2022 at 21:41):

@Jad Ghalayini What do you mean by

Well, for one, all the reference count operations on the hot path, e.g. checking whether something is single-threaded, do not use relaxed atomics, but rather just read and test the count non-atomically. I recognize I'm being a typical pedantic Rust stickler, but aren't potentially racy accesses like that UB?

C++ has a fairly well-defined memory model, and if an ordering is not explicitly specified (e.g. as a relaxed read) the default behavior for atomic operations is to fall back to sequential consistency (which is obviously stronger), so I assume I'm misunderstanding your statement. How would the access be racy / lead to UB? Could you please point me at an example C++ file?

As for the hot path comment, what arch are you running on? It seems that most of the time marking reads as relaxed hardly matters unless you're on some more exotic arch that doesn't implement MESI.

Jad Ghalayini (Jul 28 2022 at 02:20):

@Tom yes, a read of an atomic variable defaults to sequential consistency when no ordering is specified; but the reference count used here is not marked atomic at all. It's just an i32

Jad Ghalayini (Jul 28 2022 at 02:20):

Also this is in lean.h so it's actually being called from C code, but the memory models don't disagree here

Sebastian Ullrich (Jul 28 2022 at 08:56):

Well, this is the core of a language runtime. So rules around UB might get a little... relaxed at times. Effectively, if it's impossible for the compiler or any reasonable hardware implementation to screw up, we might just have to live with it.

Sebastian Ullrich (Jul 28 2022 at 08:59):

Here the basic proposition not supported by the C/C++ memory models is that a nonatomic write (no larger than the word size, in particular) to a location where the old and new value agree on a particular bit cannot be observed concurrently to change that bit.

Mario Carneiro (Jul 28 2022 at 11:47):

I don't see why this can't be a relaxed atomic write. It should yield the same assembly

Mario Carneiro (Jul 28 2022 at 11:50):

Your basic proposition is certainly not valid in the C/C++ spec, so it can be miscompiled. Basically, the compiler can see a nonatomic write as evidence that it is the only writer to that location, which makes it legal to insert other writes to that location, inline the results of reads and writes, and so on, which is not what you want

Sebastian Ullrich (Jul 28 2022 at 12:11):

Mario Carneiro said:

I don't see why this can't be a relaxed atomic write. It should yield the same assembly

That's not true without further changes https://godbolt.org/z/fjK91aEYc

Mario Carneiro (Jul 28 2022 at 13:53):

Ah, I was assuming you didn't actually have data races, but this code is defending against exactly that. I assume lean_is_st means "is single-threaded"? So there isn't actually a race to change this m_rc variable because in multithreaded mode it's always 0?

Mario Carneiro (Jul 28 2022 at 14:03):

It's also a bit sketchy but this yields the equivalent assembly:

static inline bool lean_is_st2(lean_object2 * o) {
    return atomic_load_explicit(&o->m_rc, memory_order_relaxed) > 0;
}

void lean_inc_ref2(lean_object2 * o) {
    if (lean_is_st2(o)) {
        (*(int*)&o->m_rc)++;
    } else {}
}

In C++20 you can use atomic_ref to do atomic operations on an otherwise non-atomic object, but AFAIK this is the best you can do in C11

Mario Carneiro (Jul 28 2022 at 14:05):

in this particular case you can also just manually inline the function instead of relying on the compiler to do a sketchy optimization on your behalf

void lean_inc_ref2(lean_object2 * o) {
    int rc = atomic_load_explicit(&o->m_rc, memory_order_relaxed);
    if (rc > 0) {
        atomic_store_explicit(&o->m_rc, rc + 1, memory_order_relaxed);
    } else {}
}

Sebastian Ullrich (Jul 28 2022 at 14:39):

Yeah, either of these would probably be fine. It reminds me a bit of the non-transitivity of defeq issue - we know it's not optimal and there may even be a simple fix, but the practical impact is likely zero. So let's maybe wait a bit more.

Kevin Buzzard (Jul 28 2022 at 18:41):

I was really surprised that nontransitivity of defeq doesn't ruin the party more often (or indeed ever, as far as I can see). We've now done a substantial bunch of mathematics in Lean and I don't ever recall it getting in the way (perhaps because we often care less about defeq than about good simp lemmas).

Leonardo de Moura (Jul 28 2022 at 20:03):

Sebastian Ullrich said:

Yeah, either of these would probably fine. It reminds me a bit of the non-transitivity of defeq issue - we know it's not optimal and there may even be a simple fix, but the practical impact is likely zero. So let's maybe wait a bit more.

I couldn't agree more, and I agree that the impact is likely zero. We currently don't even have the resources to review a PR with this kind of change to a critical part of the system.
It is great to see all the enthusiasm. BTW, I'd like to remind everybody that we have a list of "help wanted" issues; they are great for first-time contributors.
https://github.com/leanprover/lean4/labels/help%20wanted

Jad Ghalayini (Jul 29 2022 at 04:10):

Btw @Mario Carneiro your code is basically what I put in my Rust lean-sys

Jad Ghalayini (Jul 29 2022 at 04:10):

What can I say, I'm very pedantic about UB hahaha

Locria Cyber (Jul 29 2022 at 11:39):

Jad Ghalayini said:

Locria Cyber would be happy to try to extend it to add libffi support

I mean to use lean-sys and bind the ffi Rust crate to Lean, so Lean can use libffi.

Locria Cyber (Jul 29 2022 at 11:46):

It's a pain that Lean doesn't expose Nat -> C directly;
all the nat_to_intxxxx conversions are modulo.

Locria Cyber (Jul 29 2022 at 12:02):

I found what may be a bug in tests/compiler/foreign/myfuns.cpp:

lean_object * lean_S_update_global(b_lean_obj_arg s, lean_object /* w */)

the world object is lean_object*

Locria Cyber (Jul 29 2022 at 12:03):

I think the world object is borrowed?

Sebastian Ullrich (Jul 29 2022 at 12:07):

It's just the tagged pointer 0x1, so it's neither really. But that signature is incorrect, right.

Locria Cyber (Jul 29 2022 at 12:10):

also, Int64 isn't generated as int64_t in C

Sebastian Ullrich (Jul 29 2022 at 12:15):

There is no Int64 in the stdlib?

Locria Cyber (Jul 29 2022 at 12:18):

I mean FFI. It is generated as lean_object * with @[extern "xxx"]

Locria Cyber (Jul 29 2022 at 12:21):

How do I use a static library with lake?

Locria Cyber (Jul 29 2022 at 12:21):

Lake's readme is vague

Locria Cyber (Jul 29 2022 at 13:38):

@Jad Ghalayini
Your lean_alloc_ctor leaks memory

Locria Cyber (Jul 29 2022 at 13:47):

no, it's my fault for using a Rust &str as a C string

Locria Cyber (Jul 29 2022 at 14:01):

https://github.com/locriacyber/lean-sys-example
Here's an example I made

Mac (Jul 29 2022 at 16:20):

Locria Cyber said:

How do I use a static library with lake?

Does the FFI example help?

Locria Cyber (Jul 29 2022 at 16:20):

I already figured it out

Locria Cyber (Aug 09 2022 at 18:48):

@Jad Ghalayini I made a pull request on GitLab against lean-sys with some fixes

Locria Cyber (Aug 09 2022 at 18:50):

Turns out there is no need for libffi if I can link the C/Rust library.

Locria Cyber (Aug 09 2022 at 18:50):

at runtime, libffi may be needed

Locria Cyber (Aug 09 2022 at 18:50):

although some glue code in C is enough

Jad Ghalayini (Aug 09 2022 at 18:53):

Merged!

Jad Ghalayini (Aug 09 2022 at 18:53):

@Locria Cyber would you like to be added as a maintainer?

Jad Ghalayini (Aug 09 2022 at 18:53):

I plan to extend the library significantly with some higher-level safe wrappers later

Jad Ghalayini (Aug 09 2022 at 18:54):

Tell me if you're interested!

Jad Ghalayini (Aug 09 2022 at 18:54):

Also I'm curious what exactly you need libffi for

Jad Ghalayini (Aug 09 2022 at 18:54):

I plan to be making very heavy use of the Lean FFI in my PhD as proposed (see gitlab.com/tekne/phd-proposal) and would be happy to collaborate!

Tomas Skrivan (Aug 09 2022 at 22:15):

Do you have a PDF of the proposal? I'm not very good at reading raw TeX.

Locria Cyber (Aug 16 2022 at 12:09):

Jad Ghalayini said:

Locria Cyber would you like to be added as a maintainer?

ok

Locria Cyber (Aug 16 2022 at 12:10):

libffi is for runtime loading of dynamic libraries

Locria Cyber (Aug 16 2022 at 12:11):

At runtime:

  • load a dynamic library
  • construct the function's type/ABI
  • call it

Locria Cyber (Aug 16 2022 at 12:12):

This is not really necessary, since in most FFI usage you know the function types you want to call, e.g. void puts(char*);
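
With a known signature like that, a plain extern declaration is all you need; libffi only matters when the call has to be constructed at runtime. For illustration, in Rust:

use std::ffi::CString;
use std::os::raw::{c_char, c_int};

extern "C" {
    // Signature known at compile time, so no libffi is involved.
    fn puts(s: *const c_char) -> c_int;
}

fn main() {
    let msg = CString::new("hello from a plain extern declaration").unwrap();
    unsafe { puts(msg.as_ptr()); }
}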

Locria Cyber (Aug 16 2022 at 12:16):

https://gitlab.com/tekne/phd-proposal/-/blob/main/figures/rvsdg.pdf
:rolling_on_the_floor_laughing: funny how this RVSDG seems very similar to a language/notation I made


Last updated: Dec 20 2023 at 11:08 UTC