Zulip Chat Archive
Stream: general
Topic: lean5 - rewrite everything on rust instead of c++?
Serhii Khoma (srghma) (Jan 25 2025 at 07:18):
- User Experience: will make language core more typesafe?
- Beneficiaries: core on rust will be easier to understand + generated code will be rust (type safe output) + linear types in lean language?
or maybe it is possible to write lean on lean without any c++?
just interesting what people think
Johan Commelin (Jan 25 2025 at 07:37):
Johan Commelin (Jan 25 2025 at 07:38):
As you can see, there isn't that much C++ in the repo. But there is a lot of Lean in the repo. Gradually the C++ is being rewritten in Lean.
Chris Wong (Jan 25 2025 at 09:23):
Usually programming language runtimes are small and have non-standard memory management, which makes them less viable to write in Rust. That's why Mozilla rewrote the style and rendering engines, not the scripting one.
Eric Wieser (Jan 25 2025 at 12:05):
There is of course the possibility of changing the bootstrapping process to transpile Lean code into Rust instead of C++, but it's not clear to me what benefit that would bring (and I think there's ongoing work to target LLVM IR instead)
Shreyas Srinivas (Jan 25 2025 at 12:09):
I think if the goal is RIR as a hobby, then writing an external checker is a nice place to begin. However I do think if understandability is the goal, then rewriting in lean is a better option.
Henrik Böving (Jan 25 2025 at 12:20):
Eric Wieser said:
There is of course the possibility of changing the bootstrapping process to transpile Lean code into Rust instead of C++, but it's not clear to me what benefit that would bring (and I think there's ongoing work to target LLVM IR instead)
LLVM already works, its just a few percentages slower than C for unknown reasons
Mauricio Collares (Jan 25 2025 at 14:12):
Rust is objectively a good pick when speed is the most important feature of a product that handles untrusted inputs. It's no surprise it was created by the great folks at Mozilla for a browser, as people judge browsers primarily by security, speed and memory footprint (assuming equal standards support). The advantages in other contexts are pretty much subjective. To state the obvious, the design space for progamming languages is huge and there are many trade-offs. Claims of the form "Rust is easier to understand than X" usually boil down to "I'm more familiar with Rust than language X" (especially when language X has orders of magnitude more users than Rust), and I think they're so prevalent because the "Rust evangelism strike force"/"rewrite it in Rust" memes just got out of hand.
As for the last point, using Rust wouldn't automatically endow Lean with linear types, and if Lean's type theory was augmented to be more like QTT (or your favorite dependent type theory with linearity), it's unlikely that it would match Rust's types so precisely as to be a comfortable target for lowering.
Jason Rute (Jan 25 2025 at 17:46):
As far as making the language core more typesafe, I assume you mean the C++ kernel (although maybe you mean the compiler/interpreter which others have addressed above). For the logical kernel, in my opinion Lean4Lean is the best option there. IIRC not only does it function as a pure Lean replacement for the C++ kernel (although a bit slower), it has the huge advantage that it can be proven correct in multiple ways:
- One can formalize Lean's type theory LeanTT
- One can formally prove the kernel faithfully implements LeanTT (i.e. there are no implementation bugs)
- One can formally prove LeanTT is consistent as well as sound with respect to the usual set-theoretic model of Lean (i.e. there are no logical bugs).
Even if it never becomes the official kernel, Lean4Lean already acts as an external type checker, and point (2) can be used to prototype optimizations to the kernel, proving that these optimizations are correct. Then the C++ kernel can also implement those optimizations and use the Lean4Lean implementation as a reference for testing.
Jason Rute (Jan 25 2025 at 18:03):
Also, see the discussion in #general > Lean4Lean: Lean 4 kernel in Lean.
Michael Rothgang (Jan 25 2025 at 18:55):
Mauricio Collares said:
Rust is objectively a good pick when speed is the most important feature of a product that handles untrusted inputs. It's no surprise it was created by the great folks at Mozilla for a browser, as people judge browsers primarily by security, speed and memory footprint (assuming equal standards support). The advantages in other contexts are pretty much subjective. To state the obvious, the design space for progamming languages is huge and there are many trade-offs. Claims of the form "Rust is easier to understand than X" usually boil down to "I'm more familiar with Rust than language X" (especially when language X has orders of magnitude more users than Rust), and I think they're so prevalent because the "Rust evangelism strike force"/"rewrite it in Rust" memes just got out of hand.
As for the last point, using Rust wouldn't automatically endow Lean with linear types, and if Lean's type theory was augmented to be more like QTT (or your favorite dependent type theory with linearity), it's unlikely that it would match Rust's types so precisely as to be a comfortable target for lowering.
For completeness, I'd like to say that "rewrite it in Rust" has actual technical merits. (Memory safety or thread safety in C++ are, I understand, extremely hard to impossible in practice. Rust makes this much more achievable. You can in principle write correct code in any language, but C++ makes some aspects much harder.)
A rewrite can introduce bugs itself (and takes lots of effort) --- but I really wonder: if Rust had been around 10 years earlier, would be Lean kernel have been written in Rust instead?
Nick Ward (Jan 25 2025 at 19:31):
Michael Rothgang said:
You can in principle write correct code in any language, but C++ makes some aspects much harder.
There's also an argument that the subset of bugs (safe) rust eliminates are otherwise some of the most difficult to reason about. For example, proving an optimization correct in lean before porting it to C++ cannot account for the possibility of memory safety bugs in the C++ kernel.
Jason Rute (Jan 25 2025 at 20:02):
Just to be clear there is an external checker for Lean 4 written in Rust (https://github.com/ammkrn/nanoda_lib). Any important Lean project should probably also be checked with one (or all) of these external checkers (for example during CI).
Nick Ward (Jan 25 2025 at 20:11):
@Jason Rute it seems to me that lean has notably more (maintained) external checkers than any other proof assistant. Assuming this is true, do you know why?
Nick Ward (Jan 25 2025 at 20:14):
It's also worth saying that ensuring correctness of any non-trivial piece of software ultimately comes down to defense in depth (of which external checkers are a great example).
Jason Rute (Jan 25 2025 at 20:16):
I’m not the one to ask. (I assume it is because Lean has a good proof exporter and that external checking is the best way to guarantee there is nothing funny going on. I think there are ways in Lean to hack the environment and bypass the kernel. External checking makes sure that doesn’t happen.)
Jason Rute (Jan 25 2025 at 20:18):
Also Metamath probably has more external checkers. :)
Nick Ward (Jan 25 2025 at 20:23):
Jason Rute said:
Also Metamath probably has more external checkers. :)
I expect this is probably true, but this could be explained by the relative simplicity of writing such a verifier, correct?
Mauricio Collares (Jan 25 2025 at 20:59):
Michael Rothgang said:
For completeness, I'd like to say that "rewrite it in Rust" has actual technical merits. (Memory safety or thread safety in C++ are, I understand, extremely hard to impossible in practice. Rust makes this much more achievable. You can in principle write correct code in any language, but C++ makes some aspects much harder.)
A rewrite can introduce bugs itself (and takes lots of effort) --- but I really wonder: if Rust had been around 10 years earlier, would be Lean kernel have been written in Rust instead?
The best bang for the buck is to write new code in Rust, if that's what you want. Battle-tested code is less likely to contain issues, and "old" code almost always contains hard-earned lessons. Interop between C++ and Rust makes it possible to implement new parts in Rust while interfacing with old parts in other languages (see the link). I was mentioning Rust's Mozilla origins before, and if there's one lesson Netscape/Mozilla learned is that companies should never rewrite large-scale programs unless they were meant to be an experiment in the first place (Things You Should Never Do, Part I is one of my favorite software essays). Software gets redesigned often enough, you can just do it in Rust the next time a component gets rearchitected to satisfy new requirements. On the other hand, rewriting particularly tricky libraries such as image/video decoders in Rust would be great, even if it still involved some amount of assembly or unsafe code.
I know it's funny to say "rewriting stuff is bad" in the Lean Zulip of all places, but the fact that Leo knew everything about Lean 0.1/2/3 when writing Lean 4 made sure all of the hard-earned lessons were preserved in the rewrites. Not every rewrite endeavor has this luxury. And, about your last thought, my completely unsubstantiated speculation is that if Rust was already stable in 2013 (Rust 0.7 was released around the time of the first Lean commit, but I agree that pre-1.0 doesn't count), Leo would still write the kernel in whatever low-level language he was most comfortable with in this alternate universe. Speaking of which, it's much easier to trust C and C++ codebases wrt memory management when all the code has been written by just a single experienced programmer; lots of trouble in bigger projects come from mismatched memory ownership expectations/conventions. Of course, this is not to say compiler help isn't welcome (I'm in the Lean Zulip after all!), but I just wanted to highlight another luxury of the Lean kernel when compared to most other C++ projects.
I swear this is my last message on this topic, since it's pretty much off-topic for the Lean Zulip.
Chris Bailey (Jan 25 2025 at 21:01):
Nick Ward said:
Jason Rute said:
Also Metamath probably has more external checkers. :)
I expect this is probably true, but this could be explained by the relative simplicity of writing such a verifier, correct?
This is almost certainly the case. Shameless plug for those up to the challenge: https://ammkrn.github.io/type_checking_in_lean4/
A simpler lean checker is possible by adding one more layer and having the "real" trusted checker only review forward reasoning chains, but that's not how the current ones work.
Jason Rute (Jan 25 2025 at 21:11):
@Chris Bailey How long do you think it would take someone to write a Lean type checker from scratch using your guide, that would be capable of checking all of (the export of) mathlib? A week? A month?
Nick Ward (Jan 25 2025 at 21:12):
@Jason Rute I like where this is going..
Chris Bailey (Jan 25 2025 at 23:30):
That's hard to say. Someone acquainted with capture avoiding substitution, debruijn indices, working with open terms, etc would be able to do it in much less time than someone who has to learn as they go. Someone who can work on it all day and keep that continuity of thought is in the same situation relative to someone who gets an hour a day to hack on it.
The fork of the exporter I used outputs the recursors and computation rules as well as the quotient terms. This was done for performance reasons in my case, but it does allow someone to implement just the core checker without having to immediately do the hardest/most time consuming parts at the expense of some assurance.
For someone on the happy path (familiar with lambda calculus internals, acquainted with lean, can work on it consistently, not reimplementing inductives themselves) using an agreeable language, not shy about asking for help, I think its like a two week thing?
Serhii Khoma (srghma) (Jan 26 2025 at 12:21):
lean on lean then (without any c++ at all, without transpilation, maybe only have .cbor files like purescript does for caching of already proved theorms or etc ) would be awesome
Eric Wieser (Jan 26 2025 at 13:20):
In what sense is lean not practically already what you describe? Transpilation to C vs compilation to LLVM IR seems kind of irrelevant beyond unnecessary C source parsing, and direct compilation to x86 assembly would be a massive and mostly-pointless project.
Eric Wieser (Jan 26 2025 at 13:21):
At some level you need some "unsafe" C code, because you have to call into the OS APIs with unsafe C semantics, and that is what a fair chunk of Lean's (non-transpiled) C++ code is doing.
Mario Carneiro (Jan 30 2025 at 14:51):
@Chris Bailey Could you elaborate on how you think the kernel could be improved beyond essentially the architecture of lean4lean (/ the official kernel)? I'm not sure I understand how you could shortcut some or all of that even if we use the export format instead of the oleans - you still have to infer all the intermediate types and do whnf in the kernel, unless you also log defeq chains
Chris Bailey (Jan 30 2025 at 17:39):
Mario Carneiro said:
Chris Bailey Could you elaborate on how you think the kernel could be improved beyond essentially the architecture of lean4lean (/ the official kernel)? I'm not sure I understand how you could shortcut some or all of that even if we use the export format instead of the oleans - you still have to infer all the intermediate types and do whnf in the kernel, unless you also log defeq chains
No shortcuts. Maybe it would have been more clear to draw a distinction between a verifier (needs to be spoon fed all of the intermediate steps) and a kernel (capable of checking the proof by itself). In that context I would rephrase it as "a simpler verifier can be implemented" which in a literal sense adds more software, but is effectively a reduction in size/complexity of the stuff in ring 0 since you would no longer have to trust the component that actually produces the reasoning chains. I'm thinking of something like lean.mm1.
Last updated: May 02 2025 at 03:31 UTC