Zulip Chat Archive

Stream: Program verification

Topic: Algorithms in Lean

Ashvni Narayanan (Sep 16 2024 at 23:44):

This question is from somebody who is looking to work on the formalization of distributed algorithms:

What algorithms have been formalized (and proved to be accurate) in Lean? If only a handful, why is that?

I suppose this is the right stream for this question, pls let me know if not.
Any examples of algorithms that have been formalized would be appreciated, thank you!

Shreyas Srinivas (Sep 17 2024 at 00:16):

(deleted)

Shreyas Srinivas (Sep 17 2024 at 00:17):

What kind of algorithms are we talking about here?

Shreyas Srinivas (Sep 17 2024 at 00:17):

Can you give some examples?

Shreyas Srinivas (Sep 17 2024 at 00:19):

because depending on the answer, there are several different siloed fields with different notions of algorithms and models of distributed computing.

Shreyas Srinivas (Sep 17 2024 at 00:32):

Further there are different notions of formalising distributed algorithms/protocols. From raw formalisms to systematic verification + synthesis approaches using separation logics.

Shreyas Srinivas (Sep 17 2024 at 01:15):

If you mean synchronous distributed graph algorithms, the reason is basically that until very recently, almost no one from the community actively worked in this area (unless you count me). Another issue is the literature in this area plays fast and loose with definitions, so what looks like one model is actually several related models.

Shreyas Srinivas (Sep 17 2024 at 01:16):

If you are referring to asynchronous systems with algorithms like raft and paxos, the community that does verify them is active in other theorem provers.

Shreyas Srinivas (Sep 17 2024 at 01:17):

There are also shared memory models, and I believe there are some people from the PL community using Coq, and process calculi people use model checker like tools.

Shreyas Srinivas (Sep 17 2024 at 01:19):

some example works from the PL side : https://ilyasergey.net/assets/pdf/papers/dpc-jfp.pdf

Shreyas Srinivas (Sep 17 2024 at 01:20):

https://ilyasergey.net/assets/pdf/papers/scilla-oopsla19.pdf
https://www.cs.purdue.edu/homes/roopsha/papers/QuickSilver.OOPSLA21.pdf

Ashvni Narayanan (Sep 17 2024 at 02:47):

Shreyas Srinivas said:

What kind of algorithms are we talking about here?

Thank you for your response!

A few examples include Asynchronous Byzantine agreement protocols and DAG-based consensus protocols (and the theory leading up to them).

Ashvni Narayanan (Sep 17 2024 at 03:32):

In particular, interest is in verification of liveness and probabilistic guarantees of randomised asynchronous distributed algorithms.

Shreyas Srinivas (Sep 17 2024 at 08:21):

Then there are formalisations in Coq like the following: https://arxiv.org/abs/1907.05523

Karl Palmskog (Sep 17 2024 at 08:24):

people encode distributed systems in ITPs in three primary ways:

global transition system (this was done in https://arxiv.org/abs/1907.05523 and many others like Verdi Raft)
Lamport's happens-before relation (done for example in https://github.com/vrahli/Velisarios)
distributed separation logic (for example https://github.com/logsem/aneris)

Shreyas Srinivas (Sep 17 2024 at 08:45):

This still leaves room for a lot of other sorts of distributed systems which haven't been formalised or verified yet in ITPs

Shreyas Srinivas (Sep 17 2024 at 08:51):

specifically systems which aren't asynchronous

Sasha Rubin (Sep 17 2024 at 10:17):

Is there an advantage of formalising distributed algorithms in Lean instead of Coq?

Karl Palmskog (Sep 17 2024 at 10:29):

I don't check all papers in the field, but I didn't see any high-profile Lean-based paper on distributed systems verification. The main advantage is usually proposed to be access to Mathlib (but MathComp for Coq gives mostly similar math)

Karl Palmskog (Sep 17 2024 at 11:12):

Shreyas Srinivas said:

This still leaves room for a lot of other sorts of distributed systems which haven't been formalised or verified yet in ITPs

You can in principle do any specific distributed algorithm using any of the three approaches (happens-before, etc.), but the encoding specifics and proof reuse will be very different. And availability of maintained libraries varies, of course.

Shreyas Srinivas (Sep 17 2024 at 12:14):

Sasha Rubin said:

Is there an advantage of formalising distributed algorithms in Lean instead of Coq?

I am formalising distributed graph algorithms with some colleagues, and I see a few advantages in Lean over Coq:

I can teach lean math to mathy people in this area who will run in fear if the word "constructive" is used. Even though mathcomp is classical, Coq has an overall reputation of letting you pick your axioms, which means people need to know some foundations.

Shreyas Srinivas (Sep 17 2024 at 12:14):

UX UX and UX. Unparalleled editor support, easy use of gitpod etc. This ranks above any technical considerations, because one can learn to work around technical limitations for most popular ITPs. Good UX is a huge time and effort saver

Shreyas Srinivas (Sep 17 2024 at 12:14):

Mathlib. It is growing rather fast, so I am not worried about not having math that I might need.

Shreyas Srinivas (Sep 17 2024 at 12:16):

Lean's metaprogramming is a bit more unified. There is one book that I can run a seminar on, and bring everyone upto speed.

Shreyas Srinivas (Sep 17 2024 at 12:34):

Karl Palmskog said:

I don't check all papers in the field, but I didn't see any high-profile Lean-based paper on distributed systems verification. The main advantage is usually proposed to be access to Mathlib (but MathComp for Coq gives mostly similar math)

I have it on good authority that this situation might change in the next couple of years. Looking forward to it. It will be a nice comparison.

Karl Palmskog (Sep 17 2024 at 12:41):

constructivism is a non-issue to most Coq users, but the Coq community has traditionally argued for careful "axiom management", since certain combinations of axioms can lead to trouble. When we did distributed system implementation verification in HOL4+CakeML, the main difference from Coq was not having to juggle decidable equality explicitly (but instead it shows up during the CakeML synthesis phase)

Karl Palmskog (Sep 17 2024 at 12:49):

core MathComp packages are axiom-free, while certain non-core packages like MathComp Analysis use a variety of axioms related to classical logic

Shreyas Srinivas (Sep 17 2024 at 12:51):

Karl Palmskog said:

the Coq community has traditionally argued for careful "axiom management", since certain combinations of axioms can lead to trouble.

In mathy communities like the algorithms community, most people probably took that one logic course in undergrad. We are not experts in proving soundness and completeness theorems for various axiom choices. So this is risky business for us. Lean (as in the whole tactic infrastructure of lean + mathlib) picks them for us and tactics use them freely.

Shreyas Srinivas (Sep 17 2024 at 12:56):

Another thing is, a lot of the algorithms theory community's work is producing combinatorial lemmas and procedures. The computational bit is not as important for us as it is for systems folks.

Karl Palmskog (Sep 17 2024 at 13:21):

you can encode "axioms" in many different ways, e.g., as your assumptions about how the network behaves. In computer science oriented reviews (think CPP conference), these assumptions could well be a focus of PC discussions. I think formal (program) verification is quite different from math/algorithms in this way, e.g., the reason for adding some assumption or axiom or framework is more likely to be scrutinized

Shreyas Srinivas (Sep 17 2024 at 13:42):

Karl Palmskog said:

I think formal (program) verification is quite different from math/algorithms in this way, e.g., the reason for adding some assumption or axiom or framework is more likely to be scrutinized

This makes sense. In math, one can construct reasonable mathematical models and work on them. These models don't necessarily have to conform to a ground truth in the same way that assumptions about the behaviour of programs have to (the ground truth being that they are run or simulated on a computer).

Shreyas Srinivas (Sep 17 2024 at 13:43):

So in a formalisation project of math or algorithms, the only relevant question is whether the model corresponds in some reasonable way to what is already on paper.

Bas Spitters (Oct 14 2024 at 08:27):

There are quite a few formalizations of consensus in Coq already. E.g.
https://eprint.iacr.org/2020/917
https://verse-lab.github.io/papers/bythos-ccs24.pdf

Aside, the concise SSR style is available in Lean too, one "just" needs to rebuild the math-comp libraries...
https://arxiv.org/pdf/2403.12733

Shreyas Srinivas (Oct 14 2024 at 09:22):

I think we can mix and match SSR and normal lean tactics

Shreyas Srinivas (Oct 14 2024 at 09:25):

So mathlib + leanSSR should work as is

Ember Arlynx (Mar 03 2025 at 01:50):

@Ashvni Narayanan I am presently working on a formalization of the Morpheus consensus algorithm, but it's deterministic and semisynchronous, not asynchronous and probabilistic. but it would be interesting to model the system degradation under liveness violations or a message adversary.

Ember Arlynx (Mar 03 2025 at 01:56):

It's going OK. Next week I am working towards a refinement proof for the fastpath optimizations.

Last updated: May 02 2025 at 03:31 UTC