Zulip Chat Archive

Stream: lean4

Topic: verified compiler


Arthur Paulino (Feb 15 2022 at 20:32):

What would it mean to have a verified Lean 4 compiler? What is there to be proved in such context?

Eric Taucher (Feb 15 2022 at 20:39):

CompCert

Eric Taucher (Feb 15 2022 at 20:41):

You might also like the Rainbow Series I have not herd of them being referenced much any more but the basic ideas still hold true.

Arthur Paulino (Feb 15 2022 at 20:51):

Semantic preservation ‍ The formal verification of CompCert consists in proving the following theorem, which we take as the high-level specification of a correct compiler:

Semantic preservation theorem:
For all source programs ‍S and compiler-generated code ‍C,
if the compiler, applied to the source S, produces the code C,
without reporting a compile-time error,
then the observable behavior of C improves on one of the allowed observable behaviors of ‍S.

Then it says:

Second, the compiler is allowed to select one of the possible behaviors of the source program. The ‍C language has some nondeterminism in expression evaluation order; different orders can result in several different observable behaviors. By choosing an evaluation order of its liking, the compiler implements one of these valid observable behaviors.

So there must be a formal description of "observable behavior". What would be a practical example for a Lean 4 compiler?

Marc Huisinga (Feb 15 2022 at 20:56):

For Lean 4's IR, the appendix of Counting Immutable Beans may be of interest

Marc Huisinga (Feb 15 2022 at 20:57):

But note that this is really just one step in the compilation pipeline

Henrik Böving (Feb 15 2022 at 20:57):

I personally believe it would be quite hard to verify the entire lean compiler, you can certainly prove stuff about parts of it like e.g. the linked counting beans proof but due do its self extending nature anyone can basically add more to the lean semantics as they wish. SO even if we proved a certain compiler correct it would not be that useful in reality I think

Henrik Böving (Feb 15 2022 at 20:58):

But I'm certainly not a compiler verification person so maybe I'm totally wrong on this,

Marc Huisinga (Feb 15 2022 at 21:02):

... FWIW I spent more or less my entire BSc thesis formalizing only theorem 9 in that appendix, and I didn't quite finish the proof entirely :-)
Lean 3 has progressed quite a bit since then, though.

Henrik Böving (Feb 15 2022 at 21:02):

Huh I did conside donig that as an exercise but seems like it's much harder than expected, thanks for warning me :p

Marc Huisinga (Feb 15 2022 at 21:04):

I learned Lean with that project, so it's likely that you'll do much better than me.

Arthur Paulino (Feb 15 2022 at 21:08):

I see, thanks for the input everyone :pray:

Leonardo de Moura (Feb 15 2022 at 21:12):

@Henrik Böving Yes, it would be hard to verify the entire Lean compiler. BTW, for me "compiler" is referring to the Lean code generator that converts kernel terms/declarations into executable code.

its self extending nature anyone can basically add more to the lean semantics as they wish.

The kernel terms/declarations are fixed, we have no plans to extend them, and users cannot extend the kernel.
All user notations and extensions can be viewed as syntax sugar for the actual terms/declarations that are accepted by the kernel.

Chris B (Feb 15 2022 at 22:10):

The book "Program Logics for Certified Compilers" might also be of interest. There's a chapter called "how to specify a compiler", and a section on the operational semantics and memory model of compcert.

Chris B (Feb 15 2022 at 22:12):

I think it's kind of expensive, but if you have access to a university library system it's available online through the Cambridge University Press.

Andrés Goens (Feb 16 2022 at 08:14):

it would be pretty neat though, if Lean would be verified in itself it would be taking "bootstrapping" for compilers to a new level :wink:

Mario Carneiro (Feb 16 2022 at 08:19):

Maybe not lean itself, but perhaps a simpler language, designed for the purpose... :upside_down:

Eric Taucher (Feb 16 2022 at 09:38):

"Program Logics for Certified Compilers" (pdf) is free from Andrew Appel's Princeton U. web page.

Tim Carstens (Feb 16 2022 at 10:11):

On the topic of formally verified compilers for dependently typed languages, you might want to take a look at certicoq.org. That project seeks to provide verified extraction from Coq to C. It uses the compcert C semantics, which means you get an "end to end" correctness property when using CertiCoq to generate the C code and compcert to compile it.

Tim Carstens (Feb 16 2022 at 10:13):

With regards to bootstrapping, the current leader is probably CakeML, which provides a robust foundation for the Isabelle ecosystem

Tim Carstens (Feb 16 2022 at 10:27):

In terms of porting Appel's framework to Lean, this is something Patrick Stach and I were heavily investing in circa Lean 2. We were targeting C++, not C, but we did manage to get some nontrivial stuff done (full port of indirection theory, full representation of C++ AST, tooling for getting the AST out of clang and into Lean, etc)

One amusing anecdote: at the time, the checker for inductive type definitions did not scale extremely well. This made it hard for us to prototype our AST representation for C++. To make life easier, Stach wrote some tooling to generate the inductive type definitions based on a less verbose syntax. It made things go smoother so he called it "vaso-lean" Thankfully these perf issues are long since a thing of the past


Last updated: Dec 20 2023 at 11:08 UTC