Zulip Chat Archive
Stream: general
Topic: SciLean: scientific computing with Lean
Tomas Skrivan (Jan 15 2022 at 17:10):
For quite some time I have been working on a library that would allow writing scientific computing code in Lean. As Lean can formalize mathematics, I think it is an ideal programming language for math-heavy code such as physics simulation or machine learning: either as much more powerful glue code, like Python; or leveraging its metaprogramming capabilities, like Julia; or, since it already understands lots of mathematics, why not turn Lean into a computer algebra system, like Mathematica?
I have finally arrived at a point where it makes sense to share my code.
Currently, I have only one working example worth mentioning, a simulation of the wave equation; full code:
The wave equation can be defined through its Hamiltonian:
def H (m k : ℝ) (x p : ℝ^n) :=
  let Δx := (1 : ℝ)/(n : ℝ)
  (Δx/(2*m)) * ∥p∥² + (Δx * k/2) * (∑ i, ∥x[i] - x[i-1]∥²)
(it is already discretized in space)
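For reference, simulating a Hamiltonian system means integrating Hamilton's equations (standard notation, not SciLean syntax), which is presumably what HamiltonianSystem (H n m k) below packages up as the right-hand side handed to the ODE solver:
ẋ = ∂H/∂p,   ṗ = −∂H/∂x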
The simulation of such a Hamiltonian system with Runge-Kutta 4 can be implemented with the following code:
def solver (m k : ℝ) (steps : Nat) : Impl (ode_solve (HamiltonianSystem (H n m k))) :=
by
  -- Unfold Hamiltonian definition and compute gradients
  simp[HamiltonianSystem, H]
  autograd
  autograd
  -- Apply RK4 method
  rw [ode_solve_fixed_dt runge_kutta4_step]
  lift_limit steps "Number of ODE solver steps."; admit; simp
  finish_impl
Roughly speaking:
- simp[HamiltonianSystem, H]; autograd; autograd expands definitions and performs symbolic differentiation
- rw [ode_solve_fixed_dt runge_kutta4_step] replaces the noncomputable function ode_solve with the Runge-Kutta 4 integration scheme (a sketch of a single RK4 step is given below)
- lift_limit steps "Number of ODE solver steps."; admit; simp says we are computing the specification only in an approximate sense and we choose steps integration steps
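For readers unfamiliar with the scheme, here is a minimal sketch of what a single RK4 step computes, written for a scalar state over plain Float; the name rk4Step is made up for illustration and this is not SciLean's runge_kutta4_step:
-- one classical Runge-Kutta 4 step for x' = f t x with step size dt (toy scalar version)
def rk4Step (f : Float → Float → Float) (t x dt : Float) : Float :=
  let k1 := f t x
  let k2 := f (t + dt/2) (x + dt/2 * k1)
  let k3 := f (t + dt/2) (x + dt/2 * k2)
  let k4 := f (t + dt)   (x + dt * k3)
  x + dt/6 * (k1 + 2*k2 + 2*k3 + k4)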
This then produces this animation
wave.gif
See the readme file for a more detailed explanation. (It assumes you do not know Lean, so some of it might sound silly.)
The current state of the library is that I have a solid base which is capable of automatically proving that a function is smooth, linear, or has an adjoint, and it can also automatically compute differentials, adjoints and gradients. To a very limited degree it can also automatically invert functions, but that is currently only used in computing adjoints of expressions involving sums, as you sometimes need to reindex sums.
There are tons of directions I would like to explore:
- Variational calculus: The above example shows the wave equation already discretized in space. I want to start with the continuous version of the Hamiltonian and choose the space discretization in a similar fashion to how I picked the time discretization with RK4. I have some rudimentary results with the functional derivative. For this purpose I have a smooth/linear lambda abstraction, i.e. you can write λ (x : X) (r : ℝ) ⊸ r*x, where the ⊸ indicates it is a linear function in every argument. Writing λ (r : ℝ) ⊸ r*r produces an error as the function is not linear. example code
- Numerical Algebra: I have already shared some of the results here; it mainly provides a high-level interface for array/tensor-like objects. Together with @Arthur Paulino we are working on a library that would allow somewhat automatically generating C code for unboxed arrays like FloatArray. Hopefully this will also extend to GPUs. Wrapping existing linear solvers is also highly desirable.
- Symbolic manipulation with polynomials: I want to do some computations with polynomials, differential forms and tensor products. One big result I want is a computational isomorphism between polynomials in n+1 variables and polynomials in one variable with values in polynomials in n variables, i.e. 𝓟[ℝⁿ⁺¹, ℝ] ≅ 𝓟[ℝ, 𝓟[ℝⁿ, ℝ]]. This way, I can easily get the Horner form of a polynomial (crucial for fast evaluation) in several variables just from the Horner form of a polynomial in one variable. Some incoherent experimentation in this direction is here.
- (semi)Prismatic sets: In physics simulation, meshes are everywhere, most commonly triangular or simplicial meshes. I want to have a unified treatment of these and want to include meshes with quads, hexes, prisms and pyramids. I call these prisms and they can be generated by a point and two operations: cone and prod. I have an inductive definition here; for example, a line segment is cone point and a square is the product of two segments, prod (cone point) (cone point). The inclusion map of one Prism into another is represented by the inductive type Prism.Face and composition is Face.ofFace. This forms a direct analog of the simplex category (containing only inclusions). So I would like to define (semi)prismatic sets as presheaves over the Prism category. This should be the core tool for dealing with meshes, on top of which I can start building finite element/difference/volume code. (A toy sketch of the inductive definition is given after this list.)
- Machine learning: I have defined the code for forward and reverse mode differential operators and their core simplification identities (forward and reverse). These identities express nothing more than the functoriality of the (co)tangent map between manifolds. The task is to generalize the identities I have for the normal differential to these two differential operators. Then we can hopefully unleash it on my mockup of VGG and hopefully get back-propagation.
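To make the prism idea above concrete, here is a toy inductive type written purely for illustration; it is not SciLean's actual Prism (which also carries Prism.Face, Face.ofFace, etc.):
-- toy version: prisms are generated from a point by coning and taking products
inductive Prism where
  | point : Prism
  | cone  : Prism → Prism
  | prod  : Prism → Prism → Prism

def segment  : Prism := .cone .point          -- a line segment is cone point
def triangle : Prism := .cone segment         -- a triangle is a cone over a segment
def square   : Prism := .prod segment segment -- a square is a product of two segments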
I'm happy to hear any comments or suggestions, and if anyone is interested in any of these topics I'm happy for any helping hand :)
However, the thing I should currently focus on is probably optimizing the order/priority of all the simp lemmas and type class instances, as timeouts are sometimes really bad and I'm constantly running into infinite loops.
Yaël Dillies (Jan 15 2022 at 17:15):
Wow, that's great! This is part of my Cambridge computing project, so I'll refrain from looking at the code :smile:
Eric Wieser (Jan 15 2022 at 17:30):
Is this the first "real" project depending on Mathlib4?
Arthur Paulino (Jan 15 2022 at 17:46):
That's some decent amount of work :open_mouth:
The usability of a machine learning lib is highly dependent on the existence of model deployment tools or at least a nice interactive REPL environment, so I think it would be a lower priority target at the moment
Tomas Skrivan (Jan 15 2022 at 20:34):
Another big direction I want to take is to use diffeological spaces to work with manifolds. Currently, I'm working only with convenient vector spaces and smooth maps between them. Therefore, I can't, for example, say what a smooth map between spaces of all invertible functions is. At some point I want to formulate fluid simulation as a geodesic path on the space of all volume-preserving maps.
(I'm not actually fully formalizing all the math, not even providing full definitions; I'm just peeking into books and making sure it can be fully formalized at some point. Right now, I'm focusing on the computational aspect of it.)
I have a very limited understanding of diffeology, so having someone who understands it would be nice. My current idea is maybe to work only with diffeological spaces that are quotients of vector spaces, and a smooth map would be defined as a quotient of a smooth map. I'm not sure if this is completely valid, but it should preserve the computational aspect reasonably well. It also naturally reflects how we usually work with manifolds computationally. For example, a point on a sphere is modeled as (x,y,z) with the condition |(x,y,z)| = 1. From time to time you have to project the point back onto the sphere. This projection exactly defines the relation with which you take the quotient. Computationally, this is way better than covering the sphere with some maps and working with those.
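To make the sphere example a bit more concrete (this is just my reading of the idea, not code from the library):
p(v) = v/|v| on ℝ³ \ {0},   v ~ w ⇔ p(v) = p(w),   S² ≅ (ℝ³ \ {0}) / ~
and a smooth map out of the sphere would then be represented by a smooth map on ℝ³ \ {0} that respects ~.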
Also, I know there are some problems with tangent spaces of general diffeological spaces that I do not fully understand. This probably gives some condition on the relation with which I would take the quotient. Maybe defining the relation based on a function satisfying the implicit function theorem?
Patrick Massot (Jan 15 2022 at 21:50):
I think it's really great that someone tries to do numerical analysis in Lean! I had no idea that diffeological spaces could be useful in numerical maths. I only very vaguely know them as an esoteric corner of differential geometry. I wasn't aware of any link with anything else than diffeological spaces applied to diffeological spaces.
Tomas Skrivan (Jan 15 2022 at 22:06):
For example this paper on differentiable programming: 𝜆ₛ: computable semantics for differentiable programming with higher-order functions and datatypes uses diffeological spaces as a tool for soundness of their method.
I'm mostly following the first chapter of The Convenient Setting of Global Analysis. I especially like the part of the introduction:
An eminent mathematician once said that for infinite dimensional calculus each serious application needs its own foundation. By a serious application one obviously means some application of a hard inverse function theorem. These theorems can be proved, if by assuming enough a priori estimates one creates enough Banach space situation for some modified iteration procedure to converge. Many authors try to build their platonic idea of an a priori estimate into their differential calculus. We think that this makes the calculus inapplicable and hides the origin of the a priori estimates. We believe that the calculus itself should be as easy to use as possible, and that all further assumptions (which most often come from ellipticity of some nonlinear partial differential equation of geometric origin) should be treated separately, in a setting depending on the specific problem.
This totally aligns with my experience from my studies, mostly analysis and PDEs, and with the total disconnect in how mathematicians and physicists treat similar problems.
All transformations in SciLean can be (hopefully will be, one day) formally argued using convenient calculus (or diffeological spaces). To actually prove any kind of convergence of the generated algorithms you then have to go and manually provide some a priori estimates, choose the correct Banach/Sobolev space, etc. However, this is not a task I'm particularly interested in, and it is super hard (yes, looking at you, Navier-Stokes).
Tomas Skrivan (Jan 15 2022 at 22:10):
My hope is to automatically prove consistency, i.e. if the algorithm converges then it actually solves the specified problem.
However, the overriding principle is to minimize the human time it takes to go from a formal specification of the problem to actually executable code.
Patrick Massot (Jan 16 2022 at 11:03):
This is really surprising to me. It would be really super nice if this were actually useful. But I fear you'll have a very hard time finding experts here. Maybe you should email the authors of this paper and try to get them interested?
Tomas Skrivan (Jan 16 2022 at 11:47):
I will try that, but maybe I should read that paper properly first :)
Joe Hendrix (Jan 16 2022 at 16:54):
@Tomas Skrivan I'll mention that Ben Sherman (the first author of the paper) knows Lean -- he was an intern at Galois when we were working on formalizing a distributed protocol.
Tomas Skrivan (Jan 16 2022 at 16:55):
Cool! Thanks a lot.
Tomas Skrivan (Jan 19 2022 at 20:49):
ASCII art is cool but fully raytraced animations are cooler. I wrote a plugin for Houdini, a visual effects software similar to Blender, and now I can use Lean as a scripting language in it. Visualization of the wave simulation: wave.mp4
Tomas Skrivan (May 08 2023 at 22:30):
I have a cool new example to show.
The task is to shoot a target with a cannon, i.e. find the initial velocity for ballistic motion such that it hits a specified target.
I wrote a widget that aims at your mouse :)
Peek-2023-05-08-17-59.gif
Here is a description of how it is done:
Ballistic motion is a solution to these differential equations:
ẋ = v,   v̇ = g − f(v)·v
where f is some function describing air drag (in the code below, f(v) = 5 + ‖v‖). In Lean this system is defined with the function balisticMotion:
def balisticMotion (x v : ℝ×ℝ) := (v, g - (5 + ‖v‖) • v)
The function aimToTarget approximately finds the initial velocity needed to hit target at time T. The specification of this function is:
approx aimToTarget (v₀ : ℝ×ℝ) (optimizationRate : ℝ) :=
  λ (T : ℝ) (target : ℝ×ℝ) =>
    let shoot := λ (t : ℝ) (v : ℝ×ℝ) =>
      odeSolve (t₀ := 0) (x₀ := ((0:ℝ×ℝ),v)) (t := t)
        (f := λ (t : ℝ) (x,v) => balisticMotion x v)
    (λ v => (shoot T v).1)⁻¹ target
Here we define a local function shoot that returns the position and velocity of the projectile at time t when shot with initial velocity v. So if we want to hit the target at time T, we are interested in inverting the function shoot, i.e. (λ v => (shoot T v).1)⁻¹ target.
The arguments v₀ and optimizationRate are later used as initial parameters for gradient descent.
This specification is then turned into executable code using tactics. The steps are roughly as follows (the first one is spelled out below):
- Reformulate the inverse as a minimization problem and apply gradient descent
- Run automatic differentiation - this will formulate the adjoint problem, because we are differentiating through the solution of ODEs
- Choose ODE discretizations for the forward and backward pass of the adjoint formulation
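To spell out the first step (a standard reformulation in my notation, not necessarily the exact form the tactics produce): the inverse in the specification is replaced by a least-squares problem which gradient descent then solves,
(λ v => (shoot T v).1)⁻¹ target   ⇝   argmin over v of ‖(shoot T v).1 − target‖²
vₖ₊₁ = vₖ − optimizationRate • ∇ᵥ ‖(shoot T vₖ).1 − target‖²,   starting from v₀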
Unfortunately, my automatic differentiation is still not working completely smoothly, so there is no neat tactic called autodiff but rather a bunch of calls to fun_trans, which is a generic tactic for doing function transformations.
If you want to see more details, here is the code.
Jeremy Avigad (May 09 2023 at 13:10):
Does this mean that we can now apply for DARPA grants for widgets?
Tomas Skrivan (Mar 01 2024 at 14:40):
I have a cool new picture to show :grinning_face_with_smiling_eyes: I can solve the Laplace equation using Lean. Here is the solution of the Laplace equation in a disk with a prescribed boundary condition.
harmonic.png
You might be wondering why it is so bumpy; the thing is that I'm using a probabilistic algorithm. Let me explain how it works.
The starting point is the mean value property of harmonic functions (functions that satisfy the Laplace equation). It says that the value at a point x is the average of the values over any sphere of radius r around x (here in 3D):
u(x) = (4π)⁻¹ ∫_{ω ∈ S²} u(x + r·ω) dω
Inspired by this property I define this recursive function:
noncomputable
def harmonicRec (n : ℕ) (φ : Vec3 → Float) (g : Vec3 → Y) (x : Vec3) : Y :=
  match n with
  | 0 => g x
  | m+1 => (4*π)⁻¹ • ∫ (x' : sphere 0 1), harmonicRec m φ g (x + φ x • x'.1)
This recursively computes averages over the sphere at the point x with radius φ x, where φ is the distance function to the boundary and g defines the boundary conditions.
The hope is that harmonicRec n φ g converges (as n → ∞) to the solution u of the Laplace equation on {x | 0 < φ x} with u x = g x for φ x = 0.
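In PDE terms, the target problem is the boundary value problem on Ω = {x | 0 < φ x}:
Δu = 0 on Ω,   u = g on ∂Ω = {x | φ x = 0}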
As you may have noticed, harmonicRec is a noncomputable function, so let's apply a Monte Carlo method to evaluate all those integrals. The resulting algorithm is called walk on spheres. Here is the code turning harmonicRec into a probabilistic program:
def walkOnSpheres (φ : Vec3 → Float) (g : Vec3 → Y) (n : ℕ) (x : Vec3) : Rand Y := do
  let f : Rand (Vec3 → Y) :=
    derive_random_approx
      (fun x => harmonicRec n φ g x)
    by
      induction n n' prev h
      . simp[harmonicRec]
      . simp[harmonicRec,h]
      simp only [smul_push,cintegral.arg_f.push_lambda]
      rw[integral_as_uniform_E Float]
      rw[pull_E_nat_recOn (x₀:=_) (r:=_) (hf:=by fun_prop)]
      simp (config:={zeta:=false})
  let f' ← f
  pure (f' x)
The idea is that rw[integral_as_uniform_E Float] rewrites those integrals into expectations over a uniform random variable on a sphere. Then rw[pull_E_nat_recOn ...] pulls all those expectations out of the recursor and you end up with an expectation of some recursively defined random variable.
The resulting probabilistic program is:
Nat.rec
  (pure fun (x : Vec3) => g x)
  (fun (n : ℕ) (u : Rand (Vec3 → Y)) => do
    let u' ← u
    let y' ← uniform ↑(sphere 0 1.0)
    pure (fun (x : Vec3) => u' (x + φ x • ↑y')))
  n
which recursively defines a random function u : Rand (Vec3 → Y). At each step it samples this function as u', samples a random point y' on the sphere, and then returns fun x => u' (x + φ x • y').
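Concretely, sampling this program N times and averaging gives the usual walk-on-spheres estimator (my summary of what the code above computes, not code from the repository): each sample performs n hops xₖ₊₁ = xₖ + φ(xₖ)·ωₖ starting from x₀ = x, with each ωₖ uniform on the unit sphere, and evaluates g at the final point, so
u(x) ≈ (1/N) ∑ᵢ g(xₙ⁽ⁱ⁾)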
Tomas Skrivan (Mar 01 2024 at 14:50):
Or here is a piggy with a very cold nose but very hot ears. We can check the heat distribution in its head :)
harmonicPig.png
Tomas Skrivan (Mar 04 2025 at 14:13):
After a long time and lots of mostly invisible work (making sure things scale) I have new exciting results. Sam Estep at CMU is working on a tool, GradBench, to compare different automatic differentiation tools, so I have preliminary results for SciLean.
The benchmark I implemented is the Gaussian Mixture Model (i.e. fitting data with a collection of Gaussians).
Evaluation times for the primal value, i.e. evaluation of the loss function:
Not very encouraging, SciLean is dead last here.
(The plot is a bit crowded; you can go here and click on gmm at the top left to see an interactive version of the plot.)
Evaluation times of the derivative:
Encouraging, not last anymore.
However, what measures the quality of automatic differentiation is the overhead introduced by computing the derivative. If we look at the ratio derivative/primal then SciLean does fairly well, for example beating Enzyme here.
One of the next steps I should probably focus on is learning how to use a profiler and investigating why the evaluation of the primal value is so slow. I suspect that adding a few @[inline] attributes and fixing places where I make accidental copies might take me quite far.
Tomas Skrivan (Mar 17 2025 at 14:15):
I have done some profiling and implemented a new benchmark.
The benchmark is k-means clustering, which groups a set of points into k clusters such that this function is minimized:
def objective {n k d : ℕ} (points : Float^[d]^[n]) (centroids : Float^[d]^[k]) :=
  ∑ᴵ i, minᴵ j, ∑ᴵ l, (points[i,l] - centroids[j,l])^2
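For the derivative comparison below, note that away from ties in the min the gradient of this objective with respect to the centroids has a simple closed form (standard calculus, not code from the repository). Writing c(i) for the index of the centroid closest to point i:
∂ objective / ∂ centroids[j,l] = ∑ over {i | c(i) = j} of 2·(centroids[j,l] − points[i,l])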
After some optimizations, this function compiles down to a program equivalent to the corresponding C code and is just as fast! This is a really nice assurance that it is in fact possible to set up all the abstractions such that they get compiled away.
The computation of the derivative is not as fast as a direct implementation in C, but it is getting closer. Here is a comparison to other tools:
Alex Meiburg (Mar 17 2025 at 14:30):
I would be very curious to know what kind of optimizations those are! :) Maybe even just dropping that in a blogpost or something, I'm sure it would be a useful guide for others too... as someone who has very little sense about when Lean is slow or fast, it would be super nice to learn about a real "in-practice" example like this!
Tomas Skrivan (Mar 17 2025 at 15:47):
Most of the magic is hidden in ∑ᴵ vs Mathlib's ∑. To get to C level I had to do the following things:
- custom Fintype: Mathlib's Fintype or FinEnum are really bad performance-wise. SciLean has IndexType, which is similar to FinEnum but with a careful implementation. Operations with the superscript ᴵ, like ∑ᴵ, are over IndexType rather than Fintype.
- Nat vs USize: When writing raw for loops, like summing all the numbers in an array, there is a noticeable performance hit for using Nat compared to fixed-size numbers like USize. Thus I have Idx n instead of Fin n, which is based on USize, and IndexType I n provides an equivalence I ≃ Idx n. I'm likely inconsistent, as I probably somewhere assume that Idx m × Idx n ≃ Idx (m*n). This can be solved by making Idx n opaque and stating that its semantics is that of Fin n. Lean is doing something similar already, try running Array.mkArray USize.size 0 :)
- external vs internal iterators: IndexType used to have internal iterators, i.e. it extended Stream, for writing loops over finite types. This is slow. Now I have external iterators, i.e. a class IndexType.Fold I providing a fold/forIn function over I. (A toy version of this pattern is sketched after this list.)
- inline "everything": I had to inline lots of functions. An example: for x : Float^[m,n] the expression x[i,j], which is syntax sugar for getElem x (i,j) (...), should translate to x.toByteArray.ugetFloat (i*n+j) (...). Also lots of the higher-order functions, like sum : (I → X) → X, should be marked with @[specialize]. I still do not know when exactly I should use @[inline] or @[specialize], so I end up adding both.
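Here is the toy version of the external-iterator pattern promised in the list above. The class Fold, the Fin n instance and sumOver are all made up for illustration; SciLean's actual IndexType.Fold will look different:
-- toy external iterator: a type I that knows how to fold over all of its elements
class Fold (I : Type) where
  fold : {β : Type} → (β → I → β) → β → β

-- naive instance for Fin n, good enough for illustration
instance (n : Nat) : Fold (Fin n) where
  fold f init :=
    (List.range n).foldl (fun acc i =>
      if h : i < n then f acc ⟨i, h⟩ else acc) init

-- summing a function over a finite index type via the fold
def sumOver (I : Type) [Fold I] (f : I → Float) : Float :=
  Fold.fold (I := I) (fun (acc : Float) i => acc + f i) 0.0

#eval sumOver (Fin 10) (fun i => Float.ofNat i.val)  -- 45.0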
I still need to do more profiling and optimization on the derivative computation; my best Lean implementation is still 75% slower than the identical C implementation. There will be some overhead due to the fact that Lean has to do a reference count check before each mutation, but I'm not getting any substantial speedup even if I use an unsafe mutation function that does not do that check. There is something I still do not understand.
Tomas Skrivan (Mar 17 2025 at 16:07):
I think I have also encountered a performance bug. I have these structures:
structure DataArray (X : Type*) [pd : PlainDataType X] where
  data : ByteArray
  h_size : data.size % pd.bytes = 0

structure DataArrayN (X I : Type*) {n} [PlainDataType X] [IndexType I n] where
  data : DataArray X
  h_size : data.size = n

structure Float64Array where
  data : ByteArray
  h_size : data.size % 8 = 0
and functions to convert between DataArrayN and Float64Array:
def toDataArrayN (x : Float64Array) : DataArrayN Float (Idx.size x) := ⟨⟨x.1, ...⟩, ...⟩
def toFloat64Array (x : DataArrayN Float I) : Float64Array := ⟨x.1.1, ...⟩
These should compile to the identity function but don't seem to. I didn't check the IR, but if I implement them with cast sorry x I get faster code.
Henrik Böving (Mar 17 2025 at 17:17):
(deleted)
Robert Joseph (Mar 18 2025 at 05:59):
This is so cool @Tomas Skrivan! Will you also be releasing some of these profiling/benchmark tools with SciLean, or is it just using GradBench?
Tomas Skrivan (Mar 18 2025 at 08:28):
You can find the implementation of the GradBench benchmarks in the Test/GradBench directory. Those tests are only there to ensure that I still compute the same derivative. You can't really run them easily, but you can have a look at the generated code for the derivative.
In this directory I have tests comparing different implementations of parts of those benchmarks. They produce output like this:
k-means profile test
k := 1000, n := 10000, d := 16
reference c impl time := 40.716340ms loss := 7202.578688
best lean impl time := 44.375879ms loss := 7202.578688
scilean no BLAS time := 79.685435ms loss := 7202.578688
target scilean impl time := 1280.020544ms loss := 7202.578688
Tomas Skrivan (Mar 18 2025 at 08:30):
(deleted)
Arthur Paulino (Mar 18 2025 at 11:06):
Is there a chance it's behind because of garbage collection?
Tomas Skrivan (Mar 18 2025 at 12:13):
I don't think so, as I would see something like lean_free_object with perf.
Last updated: May 02 2025 at 03:31 UTC