Zulip Chat Archive

Stream: general

Topic: foreign function interface


James King (May 01 2019 at 02:01):

Hey -- I'm working out how to add a foreign function interface to Lean's VM so that Lean code can call into libraries written in C/C++. Ideas and feedback on how to achieve this are welcome. I've started a branch at https://github.com/agentultra/lean/tree/feature/vm-dynload for adding the low-level plumbing.

Mario Carneiro (May 01 2019 at 02:25):

You should coordinate with @Simon Hudon, who has also been working on FFI in the leanprover-community fork

Mario Carneiro (May 01 2019 at 02:26):

assuming you aren't already working on the same project as Simon's been discussing

James King (May 01 2019 at 02:26):

@Mario Carneiro I am working with him on it... he has convinced me to get involved with Lean. :)

Mario Carneiro (May 01 2019 at 02:28):

First, I would want to see what's the minimum implementation that is fully expressive

Simon Hudon (May 01 2019 at 02:31):

What I'm thinking is that you could load a .so file with

run_cmd load_external "file.so" "init_function" "finalize_function"

and then add bindings with:

@[ffi "file.so" "int c_add (int,int)"]
constant external_add_fn : c_int -> c_int -> c_int

Simon Hudon (May 01 2019 at 02:31):

And I'm thinking of accumulating .so objects in the `environment object

Mario Carneiro (May 01 2019 at 02:33):

how can you call a function using a string description like that?

Mario Carneiro (May 01 2019 at 02:33):

in the c++

James King (May 01 2019 at 02:35):

We may need some way of telling the Lean code how to call the function. I was thinking we'd try using the dlopen/dlsym API

Simon Hudon (May 01 2019 at 02:36):

How does it work?

Mario Carneiro (May 01 2019 at 02:36):

I think you just get a void* function pointer

James King (May 01 2019 at 02:37):

Well... it's all void * pointers

Mario Carneiro (May 01 2019 at 02:37):

I don't know if it's possible to dynamically call a function with a runtime known type

Simon Hudon (May 01 2019 at 02:38):

Right, as far as the .so file is concerned, only the function name matters. The rest of the string is to help us validate and marshal data

Mario Carneiro (May 01 2019 at 02:39):

but how can we adhere to C calling convention with a number of arguments only known at runtime

Simon Hudon (May 01 2019 at 02:39):

You load the so file, you get a void *, you cast it in the right type to do the call (or, in our case, into a type fn_name(...) function I assume) and then you call

Simon Hudon (May 01 2019 at 02:40):

We're going to need to manipulate the call stack to some extent

James King (May 01 2019 at 02:40):

I think so too.

Mario Carneiro (May 01 2019 at 02:41):

https://sourceware.org/libffi/ ?

James King (May 01 2019 at 02:42):

That would make things nice and easy.

Simon Hudon (May 01 2019 at 02:43):

Sounds like a nice tool

James King (May 01 2019 at 02:44):

Thanks @Mario Carneiro!

Mario Carneiro (May 01 2019 at 02:48):

See also the manual http://www.chiark.greenend.org.uk/doc/libffi-dev/html/, which was surprisingly hard to find

Patrick Massot (May 01 2019 at 12:58):

You guys are free to work on whatever sounds fun to you. But I cannot help wondering why you work on things that, as far as I can understand, will be completely wiped out by Lean 4. There are so many things that need expert work and will likely be portable to Lean 4. Why don't you work on transfer and parametricity for instance?

matt rice (May 01 2019 at 14:27):

I don't know if it's possible to dynamically call a function with a runtime known type

I know there are some experiments in this regard, e.g. https://github.com/stephenrkell/liballocs it essentially parses dwarf debug information and makes the necessary things available to the dynamic loader, but that entails custom compiler etc...

Simon Hudon (May 01 2019 at 14:33):

Have you had a look at http://www.chiark.greenend.org.uk/doc/libffi-dev/html/? This looks even better to me

matt rice (May 01 2019 at 14:45):

yeah, the thing liballocs gives on top of that is it gives one the ability to query the type of some memory allocation, and so the primitive types are safer i guess, where libffi is i guess an abstraction on top of void *, where the right ffi type corresponds to the c function (if you get it right), which is i believe Mario's complaint above.

matt rice (May 01 2019 at 14:46):

Anyhow, the thing i mentioned above is quite an endeavor to build, and rather experimental, but i wanted to point it out none the less.

Simon Hudon (May 01 2019 at 14:47):

Thanks!

James King (May 03 2019 at 02:07):

Looks like we're making good progress. We're able to load the shared object into memory and keep a handle; next we'll start adding libffi support and initializing ffi_cif from the Lean annotations

Mario Carneiro (May 03 2019 at 02:09):

I think we might be able to avoid the type annotations that Simon showed

Mario Carneiro (May 03 2019 at 02:09):

You should be able to derive it from the lean type, i.e. c_int -> c_int -> c_int

James King (May 03 2019 at 02:14):

I'll give that a try.

James King (May 08 2019 at 01:56):

@Mario Carneiro Do you have an idea for a Lean declaration that would work?

@[ffi "libfile.so"]
constant my_fun : c_int -> c_pointer -> c_void

Mario Carneiro (May 08 2019 at 01:57):

On the lean side, I think that would be io c_void but otherwise that seems fine

Mario Carneiro (May 08 2019 at 01:57):

Better yet, just use unit instead of defining c_void (and similarly for other types if they exist)

James King (May 08 2019 at 01:59):

@Mario Carneiro Good idea, thanks.

Mario Carneiro (May 08 2019 at 02:00):

@Simon Hudon Do you think c_pointer should be a constant or should it be defined to be u64 or whatever

Mario Carneiro (May 08 2019 at 02:01):

also I wonder whether we should allow people to define FFI functions that are not in io

Mario Carneiro (May 08 2019 at 02:03):

It doesn't really make sense for a unit-returning function to not be io but a regular function might be pure

James King (May 08 2019 at 02:04):

I thought io made sense myself but I think @Simon Hudon had a few insights.

Simon Hudon (May 08 2019 at 02:06):

I would make c_pointer into a constant and I would allow non-io-returning functions

Simon Hudon (May 08 2019 at 02:06):

We can't check whether io should be required so we just have to trust the writer of the bindings

Simon Hudon (May 08 2019 at 19:25):

@James King

Mario Carneiro Do you have an idea for a Lean declaration that would work?

@[ffi "libfile.so"]
constant my_fun : c_int -> c_pointer -> c_void

In passing, I was thinking that the ffi attribute could take a Lean name instead of a file name. That way, if there's any complexity around path resolution, it can be isolated into a single declaration

James King (May 10 2019 at 01:00):

So...

@[ffi 'main]
constant my_fun : c_int -> c_pointer -> c_void

Would be ideal?

Mario Carneiro (May 10 2019 at 01:10):

what does 'main mean?

Mario Carneiro (May 10 2019 at 01:10):

also that can probably be just main since we control the parser there

James King (May 10 2019 at 01:11):

We have added a function to the environment that loads the foreign object file and stores the handle in a map

James King (May 10 2019 at 01:11):

I was trying to follow along with @Simon Hudon 's suggestion.

Mario Carneiro (May 10 2019 at 01:11):

I see. What does that function look like?

Mario Carneiro (May 10 2019 at 01:12):

are these names associated to some definitions or are they in their own namespace?

James King (May 10 2019 at 01:15):

https://github.com/agentultra/lean/blob/feature/vm-dynload/src/library/vm/vm_dynload.h
https://github.com/agentultra/lean/blob/feature/vm-dynload/src/library/vm/vm_dynload.cpp#L33
https://github.com/agentultra/lean/blob/feature/vm-dynload/src/library/vm/vm_environment.cpp#L318

James King (May 10 2019 at 01:16):

The foreign symbol stuff will probably go, but we basically load those objects into a map so we can associate the handle with a "Lean name" as @Simon Hudon says

James King (May 10 2019 at 01:18):

I'm still pretty new to Lean so forgive my ignorance of the finer points of the language/syntax; it's pretty fun learning about it this way.

Mario Carneiro (May 10 2019 at 01:26):

What does the lean side of this look like?

Mario Carneiro (May 10 2019 at 01:27):

I think you're doing a great job btw, I'm just thinking about what the lean user side is

James King (May 10 2019 at 01:28):

see tests/lean/vm_dynload/ffi.lean

James King (May 10 2019 at 01:29):

I'm working on changing the bind function to take the argument specs to fill the ffi_cif struct

Mario Carneiro (May 10 2019 at 01:30):

Why is bind_foreign_symbol separate from the @[ffi] attribute?

Mario Carneiro (May 10 2019 at 01:30):

is there a reason they wouldn't be 1-1?

James King (May 10 2019 at 01:30):

I think that's where I'm going to change it so that it uses this attribute part, I'm just not sure where to start to hook into that.

Mario Carneiro (May 10 2019 at 01:35):

look for uses of register_system_attribute to add a new system attribute

James King (May 10 2019 at 01:35):

Will do, thanks.

Simon Hudon (May 10 2019 at 02:29):

This branch (the last 3 commits) has a lot in common the ffi feature: https://github.com/leanprover-community/lean/tree/vm_override

feel free to borrow from it. @Edward Ayers did most of it.


Last updated: Dec 20 2023 at 11:08 UTC