Zulip Chat Archive
Stream: lean4
Topic: a numerical lib for Lean 4
Arthur Paulino (Dec 08 2021 at 22:58):
Hi everyone! I'm building a lib for Lean 4 called NumLean
which allows low-level matrix operations. Right now, it's doing this:
import NumLean
def main : IO Unit := do
let id ← NLMatrix.id 5
-- let ones ← NLMatrix.new 5 5 1
let t : Tensor ← Tensor.new id ↠ plusF 4 ↠ plusF 6.0
let m' : NLMatrix ← t.compute
IO.println $ ← m'.toString
-- 11.0 10.0 10.0 10.0 10.0
-- 10.0 10.0 6.0 6.0 6.0 ← these 6's are bugs
-- 10.0 10.0 11.0 10.0 10.0
-- 10.0 10.0 10.0 11.0 10.0
-- 10.0 10.0 10.0 10.0 11.0
The problem is that some memory issue is happening and I can't spot it. The commented output shows some unintended 6's. Would appreciate some help on this one. This is the repo link for those who are interested: https://github.com/arthurpaulino/NumLean
Henrik Böving (Dec 08 2021 at 23:24):
Are you aware of the other efforts around tensor and matrix operations at the moment? @Arthur Paulino
Arthur Paulino (Dec 09 2021 at 00:05):
Henrik Böving said:
Are you aware of the other efforts around tensor and matrix operations at the moment? Arthur Paulino
Not quite. I'm treating this more as a practice exercise/experiment, as I'm really stretching my brain thinking about usage design in Lean. I'm also getting to know Lean's FFI better in the process, and this knowledge can be useful for integrating more powerful stuff like CUDA someday.
I know @Tomas Skrivan has been working on something more high level, but he said that he's not worrying about performance/backend at the moment.
Do you know other projects around this subject?
Arthur Paulino (Dec 09 2021 at 00:15):
If I search on GitHub for "Lean" and "matrix" then filter results related to the Lean language, it shows two results:
https://github.com/search?l=Lean&q=matrix+lean&type=Repositories
Henrik Böving (Dec 09 2021 at 00:17):
Ah, the container thing from @Tomas Skrivan was indeed what I had in mind, yes. It would be very cool if the more performant API ended up supporting their abstraction, though.
Arthur Paulino (Dec 09 2021 at 00:20):
It can definitely be done, but for now I'm struggling with this (supposedly) simple task because I don't seem to be able to add floats (doubles, actually) to a matrix properly :smiley:
Arthur Paulino (Dec 09 2021 at 02:33):
Okay this is really strange. Probably a bug.
def main : IO Unit := do
let id ← NLMatrix.id 5
IO.println $ ← id.toString
IO.println "-------------------------------"
-- let ones ← NLMatrix.new 5 5 1
let t : Tensor ← Tensor.new id ↠ plusF 4 ↠ plusF 6.0
let m' : NLMatrix ← t.compute
IO.println $ ← m'.toString
Produces:
1.0 0.0 0.0 0.0 0.0
0.0 1.0 0.0 0.0 0.0
0.0 0.0 1.0 0.0 0.0
0.0 0.0 0.0 1.0 0.0
0.0 0.0 0.0 0.0 1.0
-------------------------------
11.0 10.0 10.0 10.0 10.0
10.0 11.0 10.0 10.0 10.0
10.0 10.0 11.0 10.0 10.0
10.0 10.0 10.0 11.0 10.0
10.0 10.0 10.0 10.0 11.0
Which is correct. But the following code:
def main : IO Unit := do
let id ← NLMatrix.id 5
-- IO.println $ ← id.toString -- this commented line is the only difference
IO.println "-------------------------------"
-- let ones ← NLMatrix.new 5 5 1
let t : Tensor ← Tensor.new id ↠ plusF 4 ↠ plusF 6.0
let m' : NLMatrix ← t.compute
IO.println $ ← m'.toString
Produces:
-------------------------------
11.0 10.0 10.0 10.0 10.0
10.0 11.0 10.0 10.0 10.0
10.0 10.0 11.0 10.0 10.0
10.0 10.0 10.0 11.0 10.0
10.0 10.0 10.0 6.0 6.0
Which is incorrect. Any clue what might be happening here?
Mario Carneiro (Dec 09 2021 at 03:34):
Your pointer arithmetic is wrong here:
internal void set_val_(nl_matrix* m, uint32_t i, double v) {
*(m->data + i * sizeof(double)) = v; // this product may overflow
}
m->data is already a double* pointer, so advancing by i already skips i doubles. So advancing by i * sizeof(double) is double counting.
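A minimal sketch of the fixed setter, using plain indexing; the assert is only an illustrative bounds check, and the n_rows/n_cols field names are assumed from the surrounding discussion:

#include <assert.h>

internal void set_val_(nl_matrix* m, uint32_t i, double v) {
    /* m->data is a double*, so m->data[i] already advances by i doubles */
    assert(i < m->n_rows * m->n_cols); /* illustrative bounds check; field names assumed */
    m->data[i] = v;
}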
Mario Carneiro (Dec 09 2021 at 03:35):
protip: bounds check your pointer accesses
Arthur Paulino (Dec 09 2021 at 03:36):
Thank you so much! I'm gonna take a look at it tomorrow :pray:
Mario Carneiro (Dec 09 2021 at 03:37):
Is there a reason you aren't just writing m->data[i] = v?
Arthur Paulino (Dec 09 2021 at 03:40):
Not at all. Just rusty pointer arithmetic skills :D
Arthur Paulino (Dec 09 2021 at 03:40):
No idea why I thought that regular indexing wouldn't work
Tomas Skrivan (Dec 09 2021 at 07:27):
Great! I would be really happy if someone else gets a low-level API for matrices working, so I don't have to do it :)
I have also started thinking about this a little bit. My current plan on how to approach it:
- on CPU just use ByteArray, on GPU write a custom GPUByteArray (that is just a chunk of memory sitting in GPU memory)
- Define classes CReflected (X : Type) and GPUReflected (X : Type) that provide functions for how to read/write X from/to a ByteArray. These functions will be mainly used for consistency and not for performance.
- Define CArray (X : Type) and GPUArray (X : Type), just a wrapper around ByteArrays making sure the buffer size is a multiple of sizeof X.
- Implement something like CArray.mapIdx (kernel : CExpr (UInt64 -> X -> Y)) (arr : CArray X) : CArray Y, where CExpr T is a subtype of Lean.Expr that is compilable to the specific target, CPU or GPU. I'm still not sure how to do this; maybe you have to define a new mapIdx for every kernel: write a custom elaborator that takes the syntax, gets the expression, checks if it can be compiled to the specific target, generates code, compiles it, and then defines the appropriate version of mapIdx with a specific 'extern' attribute.
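A rough Lean sketch of what the classes in this plan might look like (illustrative only; the names and signatures are taken from the description above, not from any existing code):

-- Illustrative only: one possible shape of the classes described above.
class CReflected (X : Type) where
  byteSize : Nat
  read     : ByteArray → (offset : Nat) → X             -- decode an X from a buffer
  write    : ByteArray → (offset : Nat) → X → ByteArray -- encode an X into a buffer

-- A thin wrapper around ByteArray whose size is a multiple of `byteSize`.
structure CArray (X : Type) [CReflected X] where
  data    : ByteArray
  size_ok : data.size % CReflected.byteSize (X := X) = 0

-- The planned bulk operation, with `CExpr` the compilable subset of `Lean.Expr`:
-- CArray.mapIdx (kernel : CExpr (UInt64 → X → Y)) (arr : CArray X) : CArray Y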
Tomas Skrivan (Dec 09 2021 at 13:14):
For reference, in this thread I asked if the extern code can depend on function arguments. In this case the code would depend on kernel. It can't, and the conclusion is to write a custom elaborator that generates a new mapIdx for every kernel.
Tomas Skrivan (Dec 09 2021 at 13:25):
Also, I have some half-baked code that defines a structure Kernel specifying the basic elements of a target you want to compile down to: C, CUDA, OpenCL, ... These core elements of Kernel are:
- a ByteArray type
- internal types like float, double, int, ... and pointers to const data, float const*
- internal n-ary functions like float fadd(float a, float b), or 0-ary functions like nullptr
Tomas Skrivan (Dec 09 2021 at 13:25):
The definition of Kernel:
structure Kernel where
KByteArray : Type
byteArraySize : KByteArray → Nat
byteArrayRead : (arr : KByteArray) → (i : Fin (byteArraySize arr)) → UInt8
malloc : (size : Nat) → KByteArray
malloc_size : ∀ n, byteArraySize (malloc n) = n
KType : Type
void : KType
ptr : KType → KType -- internal way to talk about buffers
typeDec : DecidableEq KType
typeName : KType → String
typeSize : KType → Nat -- size in bytes
KFun : Type
funDec : DecidableEq KFun
funName : KFun → String
funArgTypes : KFun → Array KType
funOutType : KFun → KType
null : KFun
null_is_ptr : funOutType null = ptr void
null_is_const : (funArgTypes null).size = 0
-- Execute a function
-- Stack all inputs in the order into a byte array and produce output
-- This form is mainly used to prove stuff and provide basic runtime
-- However it is not designed for speed!
execute : KFun → KByteArray → KByteArray
-- Just to make sure the output has the right size
execute_output : ∀ f input, byteArraySize (execute f input) = typeSize (funOutType f)
Tomas Skrivan (Dec 09 2021 at 13:29):
Then an expression for a specific kernel, KExpr, is just a subset of Lean.Expr. It is assumed that the target is not a functional programming language, so there is no lambda abstraction. However, each expression can have bound variables, such that as a whole it can be treated as a function.
inductive KExpr (k : Kernel) where
| bvar : (i : Nat) → KExpr k
| const : (f : KFun k) → KExpr k
| app : (f : KExpr k) → (arg : KExpr k) → KExpr k
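As a tiny illustration (matching the notation above; fadd is an assumed binary function symbol of the kernel), the C expression fadd(x, y) over two bound variables would be represented as:

-- Illustrative only: `fadd` is assumed to be one of the kernel's function symbols.
example (k : Kernel) (fadd : KFun k) : KExpr k :=
  KExpr.app (KExpr.app (KExpr.const fadd) (KExpr.bvar 0)) (KExpr.bvar 1)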
Tomas Skrivan (Dec 09 2021 at 13:32):
To communicate between Lean types and KType there is a class ReflectedType:
class ReflectedType (k : Kernel) (α : Type) extends Inhabited α where
t : KType k
readBytes : (KBuffer k t.sizeof) → α
writeBytes : (KBuffer k t.sizeof) → α → KByteArray k
--- and some compatibility statements between read and write
that allows you to read and write Lean types from/to a KByteArray; and (KBuffer k t.sizeof) is just a KByteArray that has at least t.sizeof bytes.
Tomas Skrivan (Dec 09 2021 at 13:36):
Now I would be interested in how to write a custom elaborator that, for example, transforms fun x : Float => 3*x + sin(x) to the appropriate kernel : KExpr and generates code for CArray.map kernel.
Arthur Paulino (Dec 09 2021 at 14:27):
@Mario Carneiro You really spotted the root cause of the issues. It works nicely now! :tada:
Tomas Skrivan (Dec 09 2021 at 14:32):
Also certigrad did some bindings to Eigen. It is in Lean 3, but maybe some inspiration can be taken from there. I have never studied the source code though.
Arthur Paulino (Dec 09 2021 at 14:44):
Hm, let me explain what I'm trying to do and then you can tell me the flaws and holes in my approach. I think this will help me better understand what you're aiming for.
I've created a package that can perform matrix operations in C from Lean 4: matrix instantiation, transposing, addition/multiplication with scalars and other matrices. In this package, I also coded a Tensor, which is an abstraction for concatenating matrix operations. A tensor has a matrix at the head and a list of operations. Then when you call Tensor.compute it checks dimension consistency across operations and, if everything is fine, it performs the operations, returning an NLMatrix.
Arthur Paulino (Dec 09 2021 at 14:46):
It happens here
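A rough paraphrase of that design in Lean (not the repo's actual definitions; TensorOp stands in for whatever operation type the library uses):

-- Sketch only: a head matrix plus a queue of operations, applied on `compute`.
structure Tensor where
  head : NLMatrix      -- the matrix the pipeline starts from
  ops  : List TensorOp -- queued operations

-- compute : Tensor → IO NLMatrix
-- 1. check that dimensions are consistent across `ops`
-- 2. if so, run the operations in C and return the resulting NLMatrix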
Tomas Skrivan (Dec 09 2021 at 14:55):
I think your approach is fine, it just involves too much custom C code for my liking. I want to do as little work in C as possible. Also, I want to take the "expression template" approach to get as much speed as possible. This requires compiling a new C function for each expression, which is infeasible to do by hand, so some automation is required.
Arthur Paulino (Dec 09 2021 at 14:57):
Can you show me some exemplary use cases for the API you want to provide?
Tomas Skrivan (Dec 09 2021 at 15:08):
Maybe something like this:
def saxpy (s : Float) (x y : Vector) := Vector.map2Idx (compile (fun xi i y' => s*xi + y'[i])) x y
where Vector is just a wrapper around ByteArray on a specific target (e.g. CPU or GPU). The (compile expr) has to be a macro/custom elaborator that compiles an expression to some kernel function holding a pointer into a linked shared library. The function Vector.map2Idx has the signature Vector.map2Idx (k : CompiledKernel) (vectorToMutate : Vector) (vectorToRefer : Vector) : Vector.
Then I would also write a custom macro optimize that would turn optimize (s*x+y) into Vector.map2Idx (compile (fun xi i y' => s*xi + y'[i])) x y.
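For saxpy, the C kernel that such a compile step would emit might look roughly like this (purely illustrative; the real signature and calling convention are up to the elaborator):

/* Illustrative only: one possible kernel generated from
   `compile (fun xi i y' => s*xi + y'[i])` for the CPU target. */
void saxpy_kernel(double s, const double* x, double* y, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        y[i] = s * x[i] + y[i];
    }
}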
Tomas Skrivan (Dec 09 2021 at 15:15):
My high-level tensor interface already turns saxpy s x y into intro (λ i => s*x[i] + y[i]) (code). It is halfway there; I just do not know how to decide whether x or y should be mutated in place, i.e. whether to turn intro (λ i => s*x[i] + y[i]) into x.mapIdx (λ xi i => s*xi + y[i]) or into y.mapIdx (λ yi i => s*x[i] + yi). You don't really have access to the reference counts of x or y to make this decision automatically, so I'm not sure what to do.
Tomas Skrivan (Dec 09 2021 at 15:17):
@Arthur Paulino Why are your operations on matrices inside the IO monad?
Arthur Paulino (Dec 09 2021 at 15:21):
I've just updated the code, let's use this version for reference
Tomas Skrivan (Dec 09 2021 at 15:23):
Ohh, I think your Tensor is something like what I call an "expression template". It is just some representation of operations on matrices. I find the name Tensor super confusing.
Arthur Paulino (Dec 09 2021 at 15:25):
I put everything inside the IO monad because the code could break in C in previous versions. But now that I'm checking for consistency before running the computations, I guess I don't need to do that anymore.
Arthur Paulino (Dec 09 2021 at 15:25):
Because now I can assume that, for instance, matrices A and B have the same dimensions when I sum them in C
Tomas Skrivan (Dec 09 2021 at 15:27):
The whole beauty of Lean and dependent types is exactly that you can have dimensions of matrices in types, so you can't accidentally add or multiply matrices with incorrect dimensions.
Tomas Skrivan (Dec 09 2021 at 15:28):
Also, the type NLMatrix just can't form a vector space. You need dimensions in the type. And if NLMatrix can't be a vector space, then I don't see the point in using Lean.
Tomas Skrivan (Dec 09 2021 at 15:29):
But of course, you can just create a subtype of NLMatrix that has a proof about dimensions :)
Arthur Paulino (Dec 09 2021 at 15:32):
Right, my code is currently limited to R². First, I wanted to make sure I would be able to make the data pipeline work back and forth (Lean -> C -> Lean)
Arthur Paulino (Dec 09 2021 at 15:33):
And maaan, did I learn in the process :smiley:
Tomas Skrivan (Dec 09 2021 at 15:34):
And huge thanks for figuring it out! I was scared of that and didn't attempt doing so yet :grinning_face_with_smiling_eyes:
Tomas Skrivan (Dec 09 2021 at 15:34):
I will be stealing some parts of that setup for sure :)
Arthur Paulino (Dec 09 2021 at 15:37):
The way I set up my design, Lean has no idea about the dimensions of matrices and always has to ask C for them (although always in R²)
Arthur Paulino (Dec 09 2021 at 15:37):
So the low level implementation is the real owner of the data
Arthur Paulino (Dec 09 2021 at 15:39):
Your approach is better and more mature: Lean dictates the rules and then the low-level implementation has to accommodate what's being requested by the user
Arthur Paulino (Dec 09 2021 at 15:48):
BTW, your setup is pushing ~ files to GitHub, like this and this, just in case you haven't noticed
Tomas Skrivan (Dec 09 2021 at 15:49):
I feel like taking your approach can get really ugly really fast. Do you pick row major or column major matrices? How are you going to deal with symmetric, anti-symmetric, diagonal, block matrices? Do you want to write custom C code for every type? And custom code for every pair-wise interaction?
Tomas Skrivan (Dec 09 2021 at 15:50):
BTW, your setup is pushing ~ files to GitHub, like this and this, just in case you haven't noticed
Yeah, sometimes I accidentally do that :( I would add *~ to gitignore
Tomas Skrivan (Dec 09 2021 at 15:51):
fixed :wink:
Tomas Skrivan (Dec 09 2021 at 15:53):
Anyway, I do not want to completely deter you from your approach; I would actually be quite interested in how far you can get, what will work well and what won't.
Arthur Paulino (Dec 09 2021 at 15:58):
The raw data is just an array of doubles (in contrast to arrays of arrays of doubles). I started this simple because I thought it would be easier to plug in CUDA or some other lib
Tomas Skrivan (Dec 09 2021 at 16:00):
But this (m->data[j + i * m->n_cols]) decides it is a row-major matrix.
Arthur Paulino (Dec 09 2021 at 16:03):
Does it though? It's just a computation to get the real index. It doesn't impose serious restrictions like when you have to do indexing of indexing
Tomas Skrivan (Dec 09 2021 at 16:05):
Well, you are using it in matrix multiplication, and you will be using matrix multiplication in other functions, and that way you more or less commit to row-major matrices.
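The two layouts differ only in the index formula, but every kernel that touches raw memory bakes one of them in (n_rows is an assumed field name here):

/* Row-major: elements of a row are adjacent in memory. */
double get_rm(const nl_matrix* m, uint32_t i, uint32_t j) {
    return m->data[i * m->n_cols + j];
}

/* Column-major: elements of a column are adjacent in memory. */
double get_cm(const nl_matrix* m, uint32_t i, uint32_t j) {
    return m->data[j * m->n_rows + i];
}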
Arthur Paulino (Dec 09 2021 at 16:12):
Ah, I see where the row-major commitment is now. It's in the fact that the data is laid out in the array as the sequence of rows unfolds
Tomas Skrivan (Dec 09 2021 at 16:13):
Yup :)
Tomas Skrivan (Dec 09 2021 at 16:15):
That is why I want to externally implement only ByteArrays, in C/C++/CUDA/OpenCL, and do all of the other logic at the Lean level.
Arthur Paulino (Dec 09 2021 at 16:29):
I'm gonna read your full code (now that I know where it is :D) and try to understand your reasoning
Arthur Paulino (Dec 09 2021 at 18:03):
@Tomas Skrivan do you have this line of thought organized/documented somewhere?
Tomas Skrivan (Dec 09 2021 at 18:11):
Let me clean it up and I will set up a repo.
Arthur Paulino (Dec 09 2021 at 18:13):
I am very interested in it and I want to understand more details about it. For instance, what is the role of ByteArrays in your approach? Wouldn't it cause a translation overhead slowdown (as opposed to communicating with C directly via instances of Array Float)?
Arthur Paulino (Dec 09 2021 at 18:14):
However, I still haven't figured out how to translate an array of double in C to an instance of Array Float in Lean. But I am quite sure it can be done
Mario Carneiro (Dec 09 2021 at 18:17):
What you have done is basically reimplement ByteArray, or rather DoubleArray
Mario Carneiro (Dec 09 2021 at 18:17):
Lean has a few unboxed array types that look like this
Mario Carneiro (Dec 09 2021 at 18:18):
Array Float is a boxed array type; you won't get C-like performance with this data structure
Arthur Paulino (Dec 09 2021 at 18:20):
Doesn't this function return the pointer to the double array's head?
static inline double * lean_float_array_cptr(b_lean_obj_arg a)
Maybe that's why I am facing these issues?
Arthur Paulino (Dec 09 2021 at 18:35):
Eeh, never mind. I just tried to access data from an Array Float in C and it didn't work the way I expected, so ByteArrays do seem to be the way to go
Mac (Dec 09 2021 at 19:02):
@Arthur Paulino there is FloatArray, though, for unboxed C arrays of double.
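A minimal FFI sketch of consuming a FloatArray on the C side, using the lean_float_array_cptr accessor quoted above (the function names are made up for illustration):

#include <lean/lean.h>

/* Illustrative only: sum the elements of a Lean `FloatArray` passed by
   borrowed reference. `Float` results are returned unboxed as `double`. */
LEAN_EXPORT double numlean_sum_floats(b_lean_obj_arg arr) {
    double* data = lean_float_array_cptr(arr);
    size_t  n    = lean_float_array_size(arr);
    double  acc  = 0.0;
    for (size_t i = 0; i < n; ++i) acc += data[i];
    return acc;
}

and on the Lean side something like:

@[extern "numlean_sum_floats"]
opaque sumFloats (arr : @& FloatArray) : Float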
Arthur Paulino (Dec 09 2021 at 19:07):
Ah!! Thanks! I got confused and thought I'd have to use Float Array in Lean, but it's FloatArray!
Tomas Skrivan (Dec 09 2021 at 19:44):
OK, I have put my sketch of the code into a repo: https://github.com/lecopivo/lean4-karray
The idea is to provide a generic array of unboxed values of type α, called KArray k α, where k : Kernel specifies whether it lives in CPU/GPU or whatever memory.
Tomas Skrivan (Dec 09 2021 at 19:49):
I guess start by reading KArray.lean and look up definitions in Kernel.lean as they come up.
Arthur Paulino (Dec 09 2021 at 19:51):
Do you prefer PRs from a fork or from non-protected branches (if you invite me)?
Tomas Skrivan (Dec 09 2021 at 19:52):
I don't know :D I just invite you and give you full access.
Arthur Paulino (Dec 09 2021 at 19:53):
Sure, just add a license first and I will be more comfortable
Tomas Skrivan (Dec 09 2021 at 19:54):
License suggestion? MIT?
Arthur Paulino (Dec 09 2021 at 19:54):
MIT or Apache 2.0 should be fine
Tomas Skrivan (Dec 09 2021 at 19:59):
I'm a fan of beerware :D OK, jokes aside, MIT it is then.
Kevin Buzzard (Dec 09 2021 at 23:19):
I think that in general in the community we've licensed projects using the same license as Lean itself (i.e. Apache), but I don't know the legal reasons for this
Anne Baanen (Dec 10 2021 at 09:52):
Using the same license as Lean itself makes it easy to move code from/to core Lean. Big corporations with lots of lawyers (read: Microsoft) tend to prefer Apache over MIT since it's more precise, and addresses other forms of intellectual property including patents. So I would advocate Apache for your project if it's not too late :)
Tomas Skrivan (Dec 10 2021 at 10:06):
I'm OK with switching it to Apache. @Arthur Paulino, OK with you too?
Arthur Paulino (Dec 10 2021 at 11:52):
Sure
Anders Christiansen Sørby (Dec 10 2021 at 12:24):
Cool project. I added a nix flake build https://github.com/arthurpaulino/NumLean/pull/1
Anders Christiansen Sørby (Dec 10 2021 at 12:26):
I've wanted to do something like this too, but Lean is quite fast itself so much of the code can be written in Lean too.
Anne Baanen (Dec 10 2021 at 12:36):
Arthur Paulino said:
Sure
Thank you! :heart:
Anders Christiansen Sørby (Dec 10 2021 at 12:56):
@Tomas Skrivan I added one for KArray too if you want it :smile: https://github.com/lecopivo/lean4-karray/pull/2
Anders Christiansen Sørby (Dec 10 2021 at 12:58):
Having a GPU library for Lean would be gold :heart_of_gold:
Chris Lovett (Jan 07 2022 at 03:31):
Here's another simple Matrix C++ implementation - it also has vector and tensor types and supports different layouts. It would be interesting to compare. This implementation can also be optimized by an OpenBLAS back end and can be code-generated to optimized bitcode using LLVM. One could imagine a Lean implementation of such things that is compiled by an LLVM-based Lean compiler to achieve the same thing - but with all the proofs you want to ensure the code is correct.
Joe Hendrix (Sep 08 2022 at 17:15):
I was curious about the status of this (and any other linear algebra projects) in Lean. Are any numerical libraries still being worked on?
I have some primitives (BitVecs, ByteVec, polymorphic vectors and polymorphic matrices) in my crypto side project.
Mathlib seems to define matrices as functions. I think this is great for proving, but for computation we may want a variety of largely isomorphic representations of transforms.
I think it'd be great to support a variety of different approaches with different tradeoffs (in trust, performance, hardware requirements), but allow theorems to be soundly shared.
Any thoughts on whether convergence to a single representation is feasible, or on practices to support reuse across implementations?
Mario Carneiro (Sep 08 2022 at 17:46):
My general sense on this topic is that it's fairly difficult to construct a computational type without a specific application in mind. The mathlib definition should be viewed as the noncomputable "ideal" definition, while specific applications will have some other bespoke type with the desired properties, and then you can prove isomorphism to the ideal definition and transfer all the theorems across that isomorphism
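A hedged sketch of that pattern (type, field, and function names are invented; Matrix is mathlib's function-based definition):

import Mathlib.Data.Matrix.Basic

-- Sketch only: a bespoke computational type plus a map to the "ideal" definition.
structure DenseMatrix (n m : Nat) where
  data    : FloatArray
  size_eq : data.size = n * m

-- Interpret the packed, row-major data as mathlib's Matrix (a function of indices).
def DenseMatrix.toMatrix {n m : Nat} (A : DenseMatrix n m) :
    Matrix (Fin n) (Fin m) Float :=
  fun i j => A.data[i.val * m + j.val]!

-- Statements about the computational operations can then be transported, e.g. by
-- relating a computational `matMul A B` to `A.toMatrix * B.toMatrix` and reusing
-- the Matrix lemmas proved against the ideal definition.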
Tomas Skrivan (Sep 08 2022 at 17:50):
(deleted)
Mario Carneiro (Sep 08 2022 at 17:51):
We could have a few specific classes of computational matrices (e.g. a 2D Array A, a ByteArray encoding a bitvector, a FloatArray - but this one isn't a ring) in a general library, but it's easy to miss the mark if you don't know the characteristics of the application
Kevin Buzzard (Sep 08 2022 at 17:51):
For matrices doesn't one typically have several implementations for computing, depending for example on whether you know your matrix is sparse or not? I'm a bit unclear about how one is supposed to handle this kind of thing in Lean 4.
I'm also particularly interested in this kind of question at the minute, because in a few days I'll be (virtually) attending a conference at AIM on "Computational mathematics in computer assisted proofs" and someone pointed out to me that "Lean 4" might be a potential answer to a lot of questions that people have. Here's the participant list
http://admin.aimath.org/resources/compproofsv/participantlist/
and as you can see from the statements of interest, several people attending would probably want to hear what's going on in the Lean 4 community in this regard.
Tomas Skrivan (Sep 08 2022 at 18:09):
Yes, I'm working on this on and off. I went through multiple iterations, and currently the main interface is done through the relatively new GetElem typeclass.
Interface
For example, to get an n × m matrix I would provide the instance:
instance {n m} : GetElem {a : FloatArray // n * m = a.size} (Fin n × Fin m) Float (λ _ _ => True)
:= ⟨λ a ij _ => a.1[(Enumtype.toFin ij).1]!⟩
where Enumtype is a typeclass that provides an isomorphism between a finite type and Fin _. In this case, Fin n × Fin m ≈ Fin (n*m). The type Fin n × Fin m has lexicographical ordering, so you naturally get a row-major matrix.
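Illustrative usage, assuming that instance (and the Enumtype machinery) is in scope:

-- Index with a pair of Fin values; the row-major offset is computed by the instance.
example {n m : Nat} (a : {a : FloatArray // n * m = a.size})
    (i : Fin n) (j : Fin m) : Float :=
  a[(i, j)]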
I have a couple of other typeclasses for setting and modifying elements and for introducing a new container:
class SetElem (Cont : Type u) (Idx : Type v) (Elem : outParam (Type w)) where
setElem : (x : Cont) → (i : Idx) → (xi : Elem) → Cont
class ModifyElem (Cont : Type u) (Idx : Type v) (Elem : outParam (Type w)) where
modifyElem : (x : Cont) → (i : Idx) → (Elem → Elem) → Cont
class HasIntro {X Y} (T : Type) [FunType T X Y] where
intro : (X → Y) → T
toFun_intro : ∀ f x, (intro f)[x] = f x
Maybe SetElem should also have a (dom : outParam (cont → idx → Prop)) parameter. For example, you can modify only the diagonal elements of a diagonal matrix. I have to think about it a bit more, as it should also support sparse matrices.
You can see the code here.
To get notation like x[i] += ... in do blocks, you can do:
syntax atomic(Term.ident) noWs "[" term "]" " += " term : doElem
macro_rules
| `(doElem| $x:ident[ $i:term ] += $xi) => `(doElem| $x:ident := modifyElem ($x:ident) $i (λ val => val + $xi))
Carrier
What should the Cont type be? I mostly use FloatArray. I also have a wrapper for C++ Eigen matrices, but it is just a proof of concept and I haven't touched it in a long time, as I'm mostly focusing on symbolic computations.
Tomas Skrivan (Sep 08 2022 at 18:11):
My main worry with this approach is that the index type for matrices, Fin n × Fin m, uses a polymorphic product which boxes its elements, so the final code can be slow if the boxing is not eliminated.
Mario Carneiro (Sep 08 2022 at 18:17):
Tomas Skrivan said:
My main worry with this approach is that the index type for matrices, Fin n × Fin m, uses a polymorphic product which boxes its elements, so the final code can be slow if the boxing is not eliminated.
That shouldn't be a problem as long as everything in sight is marked @[inline]; Lean can remove stuff like projections on a pair, so as long as the user code passes arr[(i, j)] I would expect it to be untupled in the generated code.
Tomas Skrivan (Sep 08 2022 at 18:21):
Yes, that is my hope, that it can be eliminated with sufficient inlining. Also, there should be some kind of compile-time simplifier that might help with this too.
Tomas Skrivan (Sep 08 2022 at 19:39):
Joe Hendrix said:
Thoughts on whether convergence to a single representation is feasible or practices to support reuse across implementations?
We definitely need multiple representations. Just for matrices we need row/column-major, triangular, block, and diagonal matrices; also, is it for the CPU, the GPU, or distributed across machines?
James Gallicchio (Sep 08 2022 at 21:08):
Are there any well-established libraries for matrix operations that we can shamelessly grift an API from?
James Gallicchio (Sep 08 2022 at 21:10):
I don't know that we need to care about the efficiency of GetElem-esque operations, since performance-sensitive matrix code should be exclusively using large-step operations like matrix mult/add/sub, factorization, etc.
Joe Hendrix (Sep 08 2022 at 21:22):
@Tomas Skrivan I like the typeclasses. It seems like the ModifyElem/SetElem type classes would be worth moving into Lean or MathLib. For my current purposes, it'd also be great to have comprehensions (including parallel comprehensions) generalized to support other types such as vectors/matrices.
I think the approach I'll take is to show my implementation types and definitions are isomorphic to the Mathlib definitions.
Tomas Skrivan (Sep 08 2022 at 21:37):
James Gallicchio said:
Are there any well-established libraries for matrix operations that we can shamelessly grift an API from?
That is exactly what I want to do by wrapping the C++ Eigen library. It is a fairly popular linear algebra library in C++ with a nice collection of dense and sparse solvers. Someone just needs to sit down and write all the wrapper functions. So far I have written only a single wrapper, for LDLT, to demonstrate how to do it.
Tomas Skrivan (Sep 08 2022 at 21:40):
James Gallicchio said:
I don't know that we need to care about the efficiency of GetElem-esque operations, since performance-sensitive matrix code should be exclusively using large-step operations like matrix mult/add/sub, factorization, etc.
My hope is that if you effectively write C code with Lean then equivalent C code will be generated. Maybe with some special annotations to the compiler or something. But definitely not something I'm too worried about in the near future.
Last updated: Dec 20 2023 at 11:08 UTC