Zulip Chat Archive

Stream: general

Topic: Trusted codebase size

Martin Dvořák (Feb 26 2025 at 13:08):

For years, I have been saying that Lean has a small core and the soundness of proofs in Lean depends on the core only (assuming hardware is flawless, and so on). Lately, however, people have been asking me for a concrete number, which I don't know.

How many lines of code is it that we really have to trust? Consider native_decide to be outside of the trusted codebase but the code that ensures Lean.ofReduceBool appears in #print axioms to be part of it. Consider the VS Code extension (and its backend support) to be outside of the trusted codebase, even though, had a dangerous bug been introduced into the VS Code extension, it might potentially be used to deceive people. Do we get to a reasonable number?

As far as I know, the size of the core increased from Lean 3 to Lean 4, and the trust in it has decreased accordingly. I understand the role of external checkers; you can perhaps also inform me what their code size is.

Floris van Doorn (Feb 26 2025 at 13:32):

I am not a great authority here, but this is what I've gathered. Some things might be wrong here.
I believe the kernel in the Lean codebase has not been intentionally designed to be as small as possible, and the boundaries between what is the kernel and what isn't is not an entirely hard line. I believe in the Lean code base it's about 20k lines of code(?)
There are external typecheckers (lean4lean, nanoda_lib) that work well in practice (i.e. they have the required performance optimizations to be able to check Mathlib), and they're about 6k-8k lines of code. This also includes things like a pretty printer, so a minimal checker can be smaller.

Last updated: May 02 2025 at 03:31 UTC