Zulip Chat Archive

Stream: lean4

Topic: Issue using ffi api


Patrick Ford (Nov 05 2023 at 22:20):

I'm trying to use ffi api but this simple function isn't working. (value I read is +/- 1 of the value I wrote) debugging seems to show that lean's dec_ref is editing the value stored at the memory address I allocated. Some insight on what's going wrong would be appreciated.

@[extern "mh_get_8"] opaque mh_get_8 (address : USize) : UInt8
@[extern "mh_set_8"] opaque mh_set_8 (address : USize) (value : UInt8) : Unit
@[extern "mh_alloc"] opaque mh_alloc (size : USize) : USize

def alloc (size : USize) : USize := mh_alloc size
def set8 (address : USize) (v : UInt8) : IO Unit := do return (mh_set_8 address v)
def get8 (address : USize) : IO UInt8 := do return (mh_get_8 address)

def main : IO Unit := do
  let address := mh_alloc 100
  set8 address 92
  IO.println s!"{←get8 address}"
#include <stdlib.h>
#include <stdint.h>
#include <malloc.h>

void* mh_alloc(size_t size) { return malloc(size); }
uint8_t mh_get_8(uint8_t* address) { return *address; }
void mh_set_8(uint8_t* address, uint8_t value) { *address = value; }
run 1: 93
run 2: 93
run 3: 91

Sebastian Ullrich (Nov 05 2023 at 23:21):

void -> Unit is not a correct translation according to https://lean-lang.org/lean4/doc/dev/ffi.html. You need to make this extern function IO and change the C function accordingly.

Alexander Fasching (Nov 06 2023 at 02:40):

As already mentioned, you use the wrong prototype for a function returning Unit. The correct return type is lean_object_res (or lean_object *)and the result should be lean_box(0) for the first and only constructor index in the Unit type.

lean_object * mh_set_8(uint8_t* address, uint8_t value) {
    *address = value;
    return lean_box(0);
}

You have another problem as well. All your functions have side effects, so they should be IO functions. You can see this with your allocation function when you call it multiple times with the same argument:

  let a1 := mh_alloc 100
  let a2 := mh_alloc 100
  IO.println s!"a1: {a1}"
  IO.println s!"a2: {a2}"

Outputs:

a1: 94095139304976
a2: 94095139304976

I'm not sure if wrapping the functions in the IO monad like you do it results in the correct behavior all the time. The external functions should be IO functions from the start:

@[extern "mh_get_8"] opaque get8 (address : USize) : IO UInt8
@[extern "mh_set_8"] opaque set8 (address : USize) (value : UInt8) : IO Unit
@[extern "mh_alloc"] opaque alloc (size : USize) : IO USize
lean_obj_res mh_alloc(size_t size, lean_object* unused) {
    void* address = malloc(size);
    return lean_io_result_mk_ok(lean_box_usize((size_t)address));
}
lean_obj_res mh_get_8(size_t address, lean_object* unused) {
    return lean_io_result_mk_ok(lean_box(*((uint8_t*)address)));
}
lean_obj_res mh_set_8(size_t address, uint8_t value, lean_object* unused) {
    *((uint8_t*)address) = value;
    return lean_io_result_mk_ok(lean_box(0));
}

Patrick Ford (Nov 06 2023 at 15:42):

Thanks for the help. That clears up the issue!

Mac Malone (Nov 06 2023 at 16:15):

An add-on: since these functions cannot produce Lean-level IO errors (i.e., they do not use lean_io_result_mk_error), they should be in BaseIO rather than IO to communicate this fact to Lean. That is:

@[extern "mh_get_8"] opaque get8 (address : USize) : BaseIO UInt8
@[extern "mh_set_8"] opaque set8 (address : USize) (value : UInt8) : BaseIO Unit
@[extern "mh_alloc"] opaque alloc (size : USize) : BaseIO USize

Mac Malone (Nov 06 2023 at 16:17):

(Otherwise, code which uses these functions will have to handle potential IO errors despite them not actually being possible.)

Timo Carlin-Burns (Nov 06 2023 at 18:47):

Doesn't it cause problems that alloc performs its side effect directly rather than encoding some kind of thunk in the BaseIO action? For example something like this might cause problems

def main : IO Unit := do
  let addr1 <- alloc 8
  let addr2 <- alloc 8
  IO.println (addr1 == addr2) -- should always be `false` because `malloc` doesn't give overlapping addresses

due to common subexpression elimination, this could get transformed into

def main : IO Unit := do
  let action := alloc 8
  let addr1 <- action
  let addr2 <- action
  IO.println (addr1 == addr2) -- will be `true` because calling `alloc` is what has the side effect, not binding the IO action

Henrik Böving (Nov 06 2023 at 19:05):

This is not how CSE in the BaseIO monad works. The entire point of the BaseIO monad is to prevent exactly this bug.

Timo Carlin-Burns (Nov 06 2023 at 19:10):

Ahh when we write

@[extern "mh_alloc"] opaque alloc (size : USize) : BaseIO USize

is Lean unfolding the definition of BaseIO USize to see that we're really saying

@[extern "mh_alloc"] opaque alloc (size : USize) : RealWorld -> EstateM Empty RealWorld USize

so that extra RealWorld argument is what that lean_object* unused was in the C code?

And then when we write alloc 8, we're not passing all the necessary arguments to the C function, so Lean automatically creates a partial application shim like fun w : RealWorld => alloc 8 w? So alloc 8 functions as a lazy IO action as desired..


Last updated: Dec 20 2023 at 11:08 UTC