Zulip Chat Archive
Stream: lean4
Topic: FFI for fixed-length arrays
James Gallicchio (Apr 05 2022 at 08:27):
Okay, I got something working:
Lean: https://github.com/JamesGallicchio/LeanColls/blob/main/LeanColls/Array.lean
C: https://github.com/JamesGallicchio/LeanColls/blob/main/bindings/leancolls_array.c
Still need to implement the FFI interface for a few things, but I'm pretty happy it works at all :rofl:
That said, not sure if what I'm doing is the right approach. If someone has a moment to take a quick look, I'd greatly appreciate it!
Mario Carneiro (Apr 05 2022 at 08:31):
I think it would be better to have this be a wrapper around Array
instead of List
Mario Carneiro (Apr 05 2022 at 08:32):
are there zero cost conversions to and from LeanColls.Array
?
James Gallicchio (Apr 05 2022 at 08:35):
Mario Carneiro said:
I think it would be better to have this be a wrapper around
Array
instead ofList
Part of me agrees, but part of me wants to completely replace the built-in array (see ArrayBuffer
for UBA built on top of this Array
), which I think means it is more reasonable to have a list representation. Unsure.
Mario Carneiro (Apr 05 2022 at 08:35):
That implementation seems like you would have no way to actually get the length
Mario Carneiro (Apr 05 2022 at 08:36):
I expect you will need the garbage collection functionalities of the original implementation even if it's basically a plain C array
James Gallicchio (Apr 05 2022 at 08:37):
Mario Carneiro said:
are there zero cost conversions to and from
LeanColls.Array
?
to/from LeanColls.Array
and List
? No, I'm not sure that's even possible (might be wrong?)
Mario Carneiro (Apr 05 2022 at 08:37):
but I assume that the existing Array implementation supports something that is essentially a single contiguous array, so you would just copy into a contiguous array when converting Array -> LeanColls.Array
and then both conversions become zero cost after that
Mario Carneiro (Apr 05 2022 at 08:37):
no, zero cost between Array
and LeanColls.Array
Mario Carneiro (Apr 05 2022 at 08:38):
that's why I suggested that LeanColls.Array
wraps Array
instead of List
James Gallicchio (Apr 05 2022 at 08:38):
Mario Carneiro said:
That implementation seems like you would have no way to actually get the length
Yeah, the length (where relevant) gets passed in to functions like it does in C world (because it has to be a parameter of the function). I'm still trying to figure out how to get Lean to erase it where irrelevant, though with aggressive enough inlining by the compiler it shouldn't matter
Mario Carneiro (Apr 05 2022 at 08:39):
wouldn't LC.Array
need to be a dependent type for that?
Mario Carneiro (Apr 05 2022 at 08:39):
you declared it as just LC.Array A
and not LC.Array A n
Mario Carneiro (Apr 05 2022 at 08:40):
oh whoops nvm
Mario Carneiro (Apr 05 2022 at 08:41):
you need a bounds check in leancolls_array_get
, it can safely be called out of bounds but that yields UB in your implementation
James Gallicchio (Apr 05 2022 at 08:41):
Mario Carneiro said:
but I assume that the existing Array implementation supports something that is essentially a single contiguous array, so you would just copy into a contiguous array when converting
Array -> LeanColls.Array
and then both conversions become zero cost after that
Ah, yeah, I could implement conversions between Array and this one. Lean's built-in seems to have specializations for small types and I'm not entirely sure how that works yet, which is what discourages me atm...
James Gallicchio (Apr 05 2022 at 08:43):
I think the only way to call leancolls_array_get
from Lean is by passing a Fin n
which should ensure it's always within bounds, no?
Mario Carneiro (Apr 05 2022 at 08:45):
oh I see, you have some misleadingly named functions
Mario Carneiro (Apr 05 2022 at 08:46):
why does getExternal
have a definition at all? It's a lie
Mario Carneiro (Apr 05 2022 at 08:46):
those functions shouldn't be private in any case
Mario Carneiro (Apr 05 2022 at 08:47):
you should just slap the extern
directly on get
James Gallicchio (Apr 05 2022 at 08:47):
One thing I'm particularly unsure about is this function for ensuring all mutations are safe to perform in place. It seems like there's kind of two cases where you aren't allowed to modify in place:
1) The object is shared. Here I want the implementation to just panic, because using these Arrays persistently should be eagerly caught.
2) The object is persistent. It seems like constants in a program just get compiled to always be marked as persistent even when they're only used once? Which I'm not sure how to avoid
Mario Carneiro (Apr 05 2022 at 08:47):
persistent objects can't be modified, because they are shared with the whole program
Mario Carneiro (Apr 05 2022 at 08:48):
it would not be sound to do so even if they are "unshared" because the next time you call the function you will get that constant again but its value is different
James Gallicchio (Apr 05 2022 at 08:50):
Mario Carneiro said:
you should just slap the
extern
directly onget
Okay, so that was my first thought as well -- but there's kinda two steps happening here?
First step is to erase index/length parameters from Nat
to USize
(which is safe because of memory constraints). Second step is to call the extern'd function.
I wanted the getExternal
to be a constant
rather than a definition, but it can't automatically generate an Inhabited
instance which made everything sad.
Mario Carneiro (Apr 05 2022 at 08:51):
you can just change the signature of the extern function to match get
James Gallicchio (Apr 05 2022 at 08:51):
Mario Carneiro said:
it would not be sound to do so even if they are "unshared" because the next time you call the function you will get that constant again but its value is different
What determines whether an expression is compiled to a persistent object? It seems kinda arbitrary
Mario Carneiro (Apr 05 2022 at 08:52):
Lean will automatically lift any subexpression with no free variables to a constant
Mario Carneiro (Apr 05 2022 at 08:53):
which basically means that Array.mk
will probably be a copy from a global instead of a constructor
James Gallicchio (Apr 05 2022 at 08:54):
Mario Carneiro said:
you can just change the signature of the extern function to match
get
Yeah. I'll try this out and see what it would look like.
James Gallicchio (Apr 05 2022 at 08:55):
Mario Carneiro said:
Lean will automatically lift any subexpression with no free variables to a constant
Huh, okay. What if I don't want it to do that..? I could see a use-case here where you don't want to have, say, an extra copy of a very large array floating around just because it's technically a closed expression...
Mario Carneiro (Apr 05 2022 at 08:56):
there is some trick here involving Unit -> A
functions that I forget
Mario Carneiro (Apr 05 2022 at 08:57):
@Leonardo de Moura will know if there is a way to disable constant lifting in a function
James Gallicchio (Apr 05 2022 at 08:58):
It just seems like not what I'd expect/want as a programmer. If I want an expression to not be recomputed every call, I'll lift it manually.
Mario Carneiro (Apr 05 2022 at 08:59):
It's one of the advantages of pure functional programming. lots of subexpressions can be cached or precomputed
Mario Carneiro (Apr 05 2022 at 09:00):
I don't think you would want to lift it manually considering that there are a huge number of such liftings in the average lean program
Mario Carneiro (Apr 05 2022 at 09:00):
Most imperative languages lift constants if they can as well (C / C++ / Rust all do this)
James Gallicchio (Apr 05 2022 at 09:02):
Fair enough! Maybe the solution is to just be less aggressive with marking things persistent, maybe by making sure that the things which are marked are actually used twice
James Gallicchio (Apr 05 2022 at 09:02):
Anyways -- I'll try to get benchmarks down soon to see how the performance looks, before I even try to optimize the emitted bytecode
Sebastian Ullrich (Apr 05 2022 at 09:03):
Mario Carneiro said:
Leonardo de Moura will know if there is a way to disable constant lifting in a function
That's set_option compiler.extract_closed false
Mario Carneiro (Apr 05 2022 at 09:04):
Have you considered using a variation on Fin n
that uses USize
instead of Nat
?
Mario Carneiro (Apr 05 2022 at 09:04):
that would avoid the boxing overhead
Mario Carneiro (Apr 05 2022 at 09:04):
and might actually be enough to get something that looks like a C loop over an array after enough inlining
Mario Carneiro (Apr 05 2022 at 09:05):
although I guess set
is still hard to optimize
Sebastian Ullrich (Apr 05 2022 at 09:06):
Mario Carneiro said:
I believe it should be possible to bind to those functions directly, but it doesn't seem to be working:
... -- could not find native implementation of external declaration 'Nat.toUSize!' (symbols 'l_Nat_toUSize_x21___boxed' or 'l_Nat_toUSize_x21')
The interpreter only speaks the always-boxed subset of the Lean ABI, so extern
s must always be compiled first before evaluation
Mario Carneiro (Apr 05 2022 at 09:07):
I tried putting it in a separate file but it didn't make a difference
Mario Carneiro (Apr 05 2022 at 09:07):
how does lean do it?
Mario Carneiro (Apr 05 2022 at 09:08):
maybe the interpreter is just ignoring those extern declarations on things like USize.add
Sebastian Ullrich (Apr 05 2022 at 09:08):
By compiling it first, into a binary named lean
:)
Sebastian Ullrich (Apr 05 2022 at 09:09):
That is, you would need https://github.com/leanprover/lake/pull/47 to make it work in a different module
Mario Carneiro (Apr 05 2022 at 09:10):
BTW @James Gallicchio I'm also wondering whether these functions can instead be exposed as regular functions on Array
instead of a new type
James Gallicchio (Apr 05 2022 at 15:41):
oh definitely -- the goal here was to sidestep Array
entirely, and reimplement dynamic arrays in Lean
James Gallicchio (Apr 05 2022 at 15:43):
Mario Carneiro said:
Have you considered using a variation on
Fin n
that usesUSize
instead ofNat
?
That's a good idea. {u : USize // u < n}
. I'll think about it.
James Gallicchio (Apr 05 2022 at 15:53):
James Gallicchio said:
I wanted the
getExternal
to be aconstant
rather than a definition, but it can't automatically generate anInhabited
instance which made everything sad.
Bumping this -- is there a way to give Lean an inhabitant without calling that inhabitant the definition?
Jannis Limperg (Apr 05 2022 at 16:12):
private unsafe def getExternal [Inhabited α] ... : α
If you have an arbitrary type α
, that's the only way to ensure inhabitation. But I think your primitive should rather take a proof that the index is in bounds and then you can give a reference implementation (in terms of lists) that doesn't need an Inhabited
constraint.
James Gallicchio (Apr 05 2022 at 16:27):
Yeah, that makes sense. I guess this is the tradeoff we make to avoid soundness issues :/
James Gallicchio (Apr 05 2022 at 16:29):
It just feels .. slightly not ideal that, in Lean world, it "looks" super cheap to go from the list representation to the array representation, when in reality that is a very expensive operation
Mario Carneiro (Apr 05 2022 at 16:29):
I thought you didn't need Inhabited
if the constant is unsafe
Mario Carneiro (Apr 05 2022 at 16:34):
go figure.
@[extern "foo"] unsafe constant something : α
-- failed to synthesize
-- Inhabited α
@[extern "foo"] unsafe constant loop : α := loop -- ok
Arthur Paulino (Apr 05 2022 at 17:15):
Mario Carneiro said:
@[extern "foo"] unsafe constant something : α -- failed to synthesize -- Inhabited α
I think that's a small bug
Leonardo de Moura (Apr 05 2022 at 18:29):
Yes, we can clearly skip the Inhabitant
synthesis code for unsafe
. The current implementation always executes the synthesis code.
Leonardo de Moura (Apr 05 2022 at 18:57):
@James Gallicchio Sorry for the delay. I had to take the day off yesterday.
Right now the behaviors just don't match up, which seems like a recipe for issues
The C and Lean behavior should match. Are you observing a mismatch in a particular example?
Maybe this does actually mod?
It should. We may have a bug in the current implementation, but our intention was to "mod" in C too.
James Gallicchio (Apr 05 2022 at 19:05):
Ah, no, I was reading the wrong thing there, sorry! It looks like it has the mod behavior in both C and Lean
Tomas Skrivan (Apr 05 2022 at 19:58):
What would be the best way to implement Fin n
with USize
? My main concern is: should n : USize
or n : Nat
or it does not matter?
My attempt:
structure UFin (n : USize) where
val : USize
property : val < n
@[export ufin_add]
def add {n} (x y : UFin n) : UFin n := ⟨(x.1 + y.1) % n, sorry⟩
@[export ufin_sub]
def sub {n} (x y : UFin n) : UFin n := ⟨(x.1 + (n - y.1)) % n, sorry⟩
@[export ufin_mul]
def mul {n} (x y : UFin n) : UFin n := ⟨(x.1 * y.1) % n, sorry⟩
@[export ufin_div]
def div {n} (x y : UFin n) : UFin n := ⟨(x.1 / y.1), sorry⟩
instance : Add (UFin n) := ⟨add⟩
instance : Sub (UFin n) := ⟨sub⟩
instance : Mul (UFin n) := ⟨mul⟩
instance : Div (UFin n) := ⟨div⟩
instance : ToString (UFin n) := ⟨λ n => toString n.1⟩
instance : OfNat (UFin m) n := ⟨Nat.toUSize n % m, sorry⟩
#eval (3 : UFin 5) - (4 : UFin 5)
#eval (3 : UFin 5) + (4 : UFin 5)
Example of the generated C code:
LEAN_EXPORT size_t ufin_sub(size_t x_1, size_t x_2, size_t x_3) {
_start:
{
size_t x_4; size_t x_5; size_t x_6;
x_4 = lean_usize_sub(x_1, x_3);
x_5 = lean_usize_add(x_2, x_4);
x_6 = lean_usize_mod(x_5, x_1);
return x_6;
}
}
With n : Nat
the generated code is:
LEAN_EXPORT size_t ufin_sub(lean_object* x_1, size_t x_2, size_t x_3) {
_start:
{
size_t x_4; size_t x_5; size_t x_6; size_t x_7;
x_4 = lean_usize_of_nat(x_1);
x_5 = lean_usize_sub(x_4, x_3);
x_6 = lean_usize_add(x_2, x_5);
x_7 = lean_usize_modn(x_6, x_1);
lean_dec(x_1);
return x_7;
}
}
You probably want to avoid the reference counting lean_dec(x_1)
and lean_usize_modn
that unboxes Nat
inside.
Tomas Skrivan (Apr 05 2022 at 19:59):
Also not sure if I got those operations right for n > sqrt(USize.size)
.
Sebastian Ullrich (Apr 05 2022 at 20:04):
If you care about performance, you definitely want to avoid that modulo (by directly proving val < n
for each index access, most probably)
Tomas Skrivan (Apr 05 2022 at 20:08):
Right, so having x y : UFin n
you would not write x + y
but ⟨x.1 + y.1, ...⟩
where you provide a custom proof.
Mario Carneiro (Apr 05 2022 at 20:19):
it doesn't make much sense to add two array indexes anyway
Mario Carneiro (Apr 05 2022 at 20:21):
I would imagine that in user code you wouldn't have a UFin n
in the first place if you are doing anything other than handling this index opaquely; you would just have a USize
and combine this with the proof of val < n
at the last minute when calling get
Tomas Skrivan (Apr 05 2022 at 20:33):
I'm quite often writing code like λ i => if (i=0) ∨ (i=(n-1)) then 0 else u[i+1] - u[i-1]
where the type is automatically deduced to Fin n
where n
is the size of array u
.
Mario Carneiro (Apr 05 2022 at 20:40):
there are redundant bounds checks in that expression though
James Gallicchio (Apr 06 2022 at 17:27):
I think that the built-in Array in Lean does the right thing here, where arrays use Nat
for indexing (because it's way easier that way), but behind the scenes we are making (technically unsafe) assumptions that everything fits in USize because otherwise you'd have already hit an out-of-memory error
Mario Carneiro (Apr 07 2022 at 00:53):
the problem is that it impacts the ABI of the function if you use Nat
- you don't get unboxed values anymore. This applies to both extern functions like get
and also lean functions that just call each other normally: in addition to the cost of boxing and unboxing this is an optimization barrier, and you need a whole lot of things to go right for lean loops to look like C loops and trigger LLVM loop optimizations
James Gallicchio (Apr 07 2022 at 18:14):
Yeah. Maybe instead I implement the interface directly on USize and then provide a separate interface for using Nat
s with it
James Gallicchio (Apr 08 2022 at 11:35):
So, I'm revamping my implementation to work around some boxing costs with ArrayBuffer as defined.
I need a type whose ABI representation could be any arbitrary value (not just leanbox(0)
). Does something like True
serve that purpose? Its value per the ABI would be leanbox(0)
, but does it have to be? or can it be any arbitrary value of the right width?
James Gallicchio (Apr 08 2022 at 11:38):
If not -- is there a way to mark a recursor noncomputable?
inductive Uninit (α)
| uninit : Uninit α
| init : (α) → Uninit α
attribute [noncomputable] Uninit.rec
This doesn't work because noncomputable is not an attribute. But it would definitely suffice my use-case, since it is not possible to distinguish the two constructors in any produced program.
James Gallicchio (Apr 08 2022 at 11:40):
I could also go the extern route, but if I conjecture an Uninit type by constant
then I can't conjecture values of that type since it can't prove the type is inhabited...
James Gallicchio (Apr 08 2022 at 11:47):
Having an Uninit type would be useful for other FFIs as well; it's nice to say at the type-level that some value is junk, instead of having to ensure everything is known everywhere
James Gallicchio (Apr 09 2022 at 13:09):
Also, should allocations for external objects go through lean_alloc
or directly call malloc
? And if I do use lean_alloc
I assume the finalize function should just calllean_dealloc
?
James Gallicchio (Apr 09 2022 at 13:25):
After poking around the codebase, I think my real question is why the small object allocator exists
Sebastian Ullrich (Apr 09 2022 at 13:27):
To make Lean go fast
James Gallicchio (Apr 09 2022 at 13:29):
I kinda assumed most malloc implementations already had paging for small allocations but that must be a wrong assumption
Sebastian Ullrich (Apr 09 2022 at 13:33):
mimalloc would not have been so successful then. The Lean allocator is very similar to it.
Sebastian Ullrich (Apr 09 2022 at 13:33):
So yes, Lean objects should always allocated with lean_alloc
so that Lean uses the correct deallocator on them.
James Gallicchio (Apr 09 2022 at 13:56):
Actually, I think I am not able to use the lean allocator right now because I need a realloc
. For now I'll just use malloc, might try my hand at implementing a realloc on Lean's allocator later
Sebastian Ullrich (Apr 09 2022 at 14:06):
I assume you are not talking about the lean_external_object
itself then, those should have a fixed size I believe
James Gallicchio (Apr 09 2022 at 15:09):
Oh, yeah -- I'm using lean_alloc_external
for the external object itself. But I wasn't sure if I should also use lean_alloc for the array allocation
Sebastian Ullrich (Apr 09 2022 at 16:56):
It's just a void *
to Lean, you are responsible for it
James Gallicchio (Apr 09 2022 at 18:52):
James Gallicchio said:
I need a type whose ABI representation could be any arbitrary value (not just
leanbox(0)
). Does something like()
serve that purpose? Its value per the ABI would beleanbox(0)
, but does it have to be?
RE: this, I think it just depends on whether Unit.rec actually checks its value, no? I know the compiler optimizes away matches with only one live branch, so I assume that's true here as well and Unit.rec doesn't check the value?
Last updated: Dec 20 2023 at 11:08 UTC