Zulip Chat Archive

Stream: general

Topic: PSA about trusting Lean proofs


Jannis Limperg (Sep 23 2023 at 11:30):

I showed the following example to @Floris van Doorn yesterday and he was surprised, so maybe this is not widely known: you can very easily trick Lean into accepting a proof of False.

import Lean

open Lean

elab "add_false" : command => do
  modifyEnv λ env =>
    let constants := env.constants.insert `falso $ ConstantInfo.thmInfo
      { name := `falso
        levelParams := []
        type := .const ``False []
        value := .const ``False []
      }
    { env with constants }

add_false

example : False :=
   falso

What happens here?

  • Using the metaprogramming APIs, we add a "proof" of False to the environment. The proof is bogus, of course, but we never send it to the kernel for checking, so nobody complains.
  • Lean assumes that things in the environment have been checked by the kernel and doesn't re-check them when we use the bogus proof.

In practice, this means that if someone you don't trust hands you a proof and Lean says the proof is okay, this assurance doesn't mean much. Even --trust=0 doesn't help. You can, however, use an external checker, which will detect nonsense like this. (I'm not sure whether we currently have an external checker than can check mathlib.)

Of course, we don't usually use Lean proofs in adversarial situations, so this is probably not a big issue. The most realistic exploit I could come up with is academic fraud: a scientist with more ambition than scruples could hide this trick in a superficially convincing 'automate all the things' tactic and use it to publish very impressive proofs. However, if anyone else tries to use the tactic, they may find that it proves false statements.

Kevin Buzzard (Sep 23 2023 at 11:41):

theorem foo : False :=
   falso

#print axioms foo -- 'foo' does not depend on any axioms

Yeah that is not ideal as far as I'm concerned. What if I finish my proof of Fermat's Last Theorem and someone accuses me of cheating like that?

Oliver Nash (Sep 23 2023 at 12:07):

I guess without some silliness like cryptographic signing there has to be trust somewhere (in this case that the actors mutating the environment are honest).

Removing comment that was too vague

Eric Wieser (Sep 23 2023 at 12:21):

Was this possible in Lean 3 too?

Eric Wieser (Sep 23 2023 at 12:22):

This seem pretty troublesome even for benevolent actors; things like @[mk_iff] or derive handlers if written incorrectly could be adding things to the environment without sending them to the kernel

Floris van Doorn (Sep 23 2023 at 12:23):

I hope there is a way to detect that the environment was changed to a new environment where not all declarations were sent to the kernel, without affecting performance too much. Hopefully we can get the kernel to check the added declaration, either the moment you change the environment, or maybe when you use it / try to write a olean file.

Adam Topaz (Sep 23 2023 at 12:38):

In the meantime, can we add a command (e.g. to mathlib) that verifies that all declarations in the current env have been checked by the kernel? At least that way Kevin can run the im_not_evil command after his proof of FLT to be more convincing?

Anatole Dedecker (Sep 23 2023 at 12:41):

Isn't there a way to run the kernel on the generated olean file, which would hopefully not work? I agree that's not ideal, but at least that's a check.

Arthur Adjedj (Sep 23 2023 at 12:47):

This bypasses kernel-checking on the falsodeclaration entirely, and directly adds the declaration to the environment without verifying it's well-formed. The fact that such low-level environment manipulation is exposed to the user is worrying IMO.

Arthur Adjedj (Sep 23 2023 at 12:57):

Ideally, the only "exposed" interface to generate/modify Environments should be Lean.Environment.addDecl, I don't believe any user should need/be allowed to remove/add declarations on the fly with no checks whatsoever. In a similar manner to the `falso trick, someone could easily remove some core constants from the environment, making most other declarations already present in the environment not well-formed anymore.

Adam Topaz (Sep 23 2023 at 13:00):

I don't agree with this. One thing that's so nice about lean4 is that you as a user have control over everything.

Adam Topaz (Sep 23 2023 at 13:01):

The one thing a user shouldn't have control over is adding a proof of false to the environment.

Arthur Adjedj (Sep 23 2023 at 13:01):

The environment is critical to soundness, and theoretically meaningful. For the same reason that users aren't allowed to make ill-typed terms, users shouldn't be allowed to make ill-typed contexts for other terms to be type-checked in.

Arthur Adjedj (Sep 23 2023 at 13:03):

I wonder what @Mario Carneiro 's opinion is on this.

Arthur Adjedj (Sep 23 2023 at 13:04):

Adam Topaz said:

The one thing a user shouldn't have control over is adding a proof of false to the environment.

The solution to that is exactly what I'm proposing: You should only be allowed to add declarations using Lean.Environment.addDecl, which makes the kernel type-check the declaration before adding it.

Sebastian Ullrich (Sep 23 2023 at 13:06):

Having metaprograms run in the same process as the system is inherently problematic for absolute soundness. As Jannis already mentioned, external checkers are the best defense in depth against such concerns. We are certainly interested in cultivating the same variety of checkers that we had in Lean 3. In fact, with the API Lean 4 provides, it should be quite easy to write a small Lean program that loads the .olean closure of a module and (re)sends all declarations through the kernel without ever running any metaprograms. Using the file format of https://github.com/Kha/lean4export would be the next step after that.

Eric Wieser (Sep 23 2023 at 13:25):

Adam Topaz said:

The one thing a user shouldn't have control over is adding a proof of false to the environment.

I think the argument for allowing this is that it permits the user to serialize and reload an environment without having to re-check all the declarations that were serialized

Geoffrey Irving (Sep 23 2023 at 13:38):

How long would you expect a minimal independent Rust checker to be? Is there one of those for Lean 3 to compare against?

David Renshaw (Sep 23 2023 at 13:45):

https://github.com/ammkrn/nanoda_lib

David Renshaw (Sep 23 2023 at 13:46):

https://leanprover.zulipchat.com/#narrow/stream/270676-lean4/topic/Export.20format/near/351096525

Shreyas Srinivas (Sep 23 2023 at 13:50):

Kevin Buzzard said:

theorem foo : False :=
   falso

#print axioms foo -- 'foo' does not depend on any axioms

Yeah that is not ideal as far as I'm concerned. What if I finish my proof of Fermat's Last Theorem and someone accuses me of cheating like that?

Could a #print axioms like command simply print all the terms of type False in the environment? Could it then filter down to those terms used in a proof?

Kevin Buzzard (Sep 23 2023 at 13:53):

No because I can just prove false statements instead, rather than False itself.

Eric Wieser (Sep 23 2023 at 14:08):

They don't even need to be false, just results that you can't be bothered to prove

Eric Wieser (Sep 23 2023 at 14:08):

Sounds like we're going to have h : FLT in Lean much sooner than expected!

Jason Rute (Sep 23 2023 at 14:21):

What are the barriers to converting nanoda or the other external checkers to Lean 4. The logic should be almost the same, right? Is it just no export format for Lean 4?

Kevin Buzzard (Sep 23 2023 at 14:23):

My understanding is that it will not be much more difficult to write a Lean 4 typechecker than a Lean 3 one, and several Lean 3 ones got written. I think it's just on nobody's priority list right now?

Anatole Dedecker (Sep 23 2023 at 14:32):

I think the only real new requirement is the existence of a trusted bignum library in the language you want to implement your typechecker in.

Leonardo de Moura (Sep 23 2023 at 14:35):

Hi everyone,

I'd like to echo Sebastian's message and provide a clearer outline of our plans regarding external checkers at the Lean FRO (http://lean-fro.org)

  • Short Term: We plan to introduce a new executable to serve as an .olean checker. This will be a compact Lean program designed to import .olean files and carry out type checks on its declarations. Given its simplicity, this checker doesn't have to be anchored in the core. Moreover, Lean users are free to craft their own versions as per their requirements. Note this kind of checker is sufficient for the issue raised in this thread.

  • Mid Term: We're working on developing a reader tailored for the Lean 4 export format in Lean described in Sebastian's message. The plan is to extend the checker mentioned in the previous step to support this format.

  • Long Term: We're committed to ensuring that Chris Bailey's Rust checker is seamlessly ported to Lean 4. Conversations with him are already underway to ascertain that this transition materializes. We hope many other checkers will be available in the future.

Thank you for your understanding and support as we work through these stages.

Patrick Massot (Sep 23 2023 at 14:36):

Anatole, I don't think this is requirement if you are ready to see your external checker to be much slower than the Lean kernel.

Anatole Dedecker (Sep 23 2023 at 14:44):

Well if external type checkers go back to unary representation of natural numbers, wouldn't it mean that we can't really use the "better nat reduction" feature of the mainstream kernel?

Martin Dvořák (Sep 23 2023 at 15:03):

Thank you @Jannis Limperg very much for announcing this issue publicly!
I can't stress enough how important it is for us all to be informed about Lean's soundness issues.

Patrick Massot (Sep 23 2023 at 15:07):

I wouldn't call that a Lean soundness issue.

Sky Wilshaw (Sep 23 2023 at 16:23):

Is there any way to tell (via something like #print axioms) that this "exploit" hasn't been secretly performed?

Sky Wilshaw (Sep 23 2023 at 16:24):

For example, could we force Lean to recompile all of mathlib to check that all definitions are correct (i.e. not trusting oleans)?

Sky Wilshaw (Sep 23 2023 at 16:25):

(not sure if this is really the correct phrasing of what I want to ask but hopefully it makes enough sense)

Patrick Massot (Sep 23 2023 at 16:26):

You need to read this thread more carefully, all these questions have been answered already.

Sky Wilshaw (Sep 23 2023 at 16:28):

I'm not sure I understand - we don't yet have an external typechecker for Lean 4, right?

Sky Wilshaw (Sep 23 2023 at 16:31):

and currently, people can add untrusted definitions to the Lean environment.

Patrick Massot (Sep 23 2023 at 16:38):

Read in particular https://leanprover.zulipchat.com/#narrow/stream/113488-general/topic/PSA.20about.20trusting.20Lean.20proofs/near/392705540

Leonardo de Moura (Sep 23 2023 at 16:39):

@Sky Wilshaw In the message above (https://leanprover.zulipchat.com/#narrow/stream/113488-general/topic/PSA.20about.20trusting.20Lean.20proofs/near/392705540), we outline our plans for external checkers at the Lean FRO. The first checker is straightforward and can also be implemented by users. It simply uses our public APIs to load .olean files and then employs the APIs for type-checking definitions. This program, written in Lean, is self-contained. While any user can create it, we'll be adding it to our release suite for added convenience.

Sky Wilshaw (Sep 23 2023 at 16:39):

Thanks for the clarification, this makes sense now.

Martin Dvořák (Sep 23 2023 at 16:54):

Sky Wilshaw said:

Is there any way to tell (via something like #print axioms) that this "exploit" hasn't been secretly performed?

The implied short answer is NOT TODAY.
I'll keep it here as TLDR.

Eric Wieser (Sep 23 2023 at 18:21):

This is almost certainly not robust, but here's proof that can you detect this today despite Martin's claim:

elab "check_decl" n:ident : command => do
  let e  getConstInfo (Elab.Tactic.getNameOfIdent' n)
  let .some v := e.value? | return
  Elab.Command.liftTermElabM do
    let m  Meta.mkFreshExprMVar e.type
    let .true  Meta.isDefEq m v | throwError "incorrect type"

check_decl falso  -- fails

Adam Topaz (Sep 23 2023 at 18:35):

Would this check whether the decl uses false?

Eric Wieser (Sep 23 2023 at 18:36):

No, but it would be straightforward to adapt it to just check every decl in the environment or be recursive

Eric Wieser (Sep 23 2023 at 18:36):

But I suspect my usage of isDefeq is incorrect in some way anyway, so it's probably better to wait for an officially sanctioned checker!

Adam Topaz (Sep 23 2023 at 18:37):

You can do other tricks like delete some structure from the env and introduce a bad one in place with the same name, in which case the expr ends up being the same

Joachim Breitner (Sep 23 2023 at 18:49):

Within a system where arbitrary code can do arbitrary IO (including, in theory, swapping out the proof kernel with a tampered one), protection against malicious proofs seems simply impossible.

Joachim Breitner (Sep 23 2023 at 18:55):

Trying to build a theorem prover that really prevents cheating may lead you to something like HOL Zero (I'm not 100% sure even that attempts to protect against that, the focus seems to be shenenigans with pretty-printing.)

Kyle Miller (Sep 23 2023 at 19:32):

@Eric Wieser I believe you need to use docs#Lean.Meta.check on the type and value first, since isDefEq assumes the terms are type correct. You can also avoid metavariables by doing isDefEq between the type and inferType of the value directly.

Leonardo de Moura (Sep 23 2023 at 19:45):

Joachim Breitner said:

Within a system where arbitrary code can do arbitrary IO (including, in theory, swapping out the proof kernel with a tampered one), protection against malicious proofs seems simply impossible.

This is why we promote external checkers in Lean. See our plan at https://leanprover.zulipchat.com/#narrow/stream/113488-general/topic/PSA.20about.20trusting.20Lean.20proofs/near/392705540

Leonardo de Moura (Sep 23 2023 at 19:46):

Joachim Breitner said:

Trying to build a theorem prover that really prevents cheating may lead you to something like HOL Zero (I'm not 100% sure even that attempts to protect against that, the focus seems to be shenenigans with pretty-printing.)

We can achieve the same level of trustworthiness with external checkers, if not more. Especially since users have the option to implement their own checkers.

Joachim Breitner (Sep 23 2023 at 20:03):

Absolutely, external checkers are the way to go here!

(Although one still has theaw issue with notation abuse. I assume that an external checkers sees only fully elaborated terms, is that right? So there is still a little gap between what the user believes the lean surface syntax type means, and what is checked by the external checker.)

Leonardo de Moura (Sep 23 2023 at 20:15):

Joachim Breitner said:

Absolutely, external checkers are the way to go here!

(Although one still has theaw issue with notation abuse. I assume that an external checkers sees only fully elaborated terms, is that right? So there is still a little gap between what the user believes the lean surface syntax type means, and what is checked by the external checker.)

@Gabriel Ebner's external checker for Lean 3 had its own pretty printer https://github.com/gebner/trepplein/blob/master/src/main/scala/trepplein/pretty.scala

Leonardo de Moura (Sep 23 2023 at 20:18):

I've emphasized this point about external checkers and pretty printers in several talks I've given about Lean, including my most recent ones at the Topos Institute and on the Type Theory Forall podcast.

Kevin Buzzard (Sep 23 2023 at 20:26):

I've never been too worried about notation abuse, this is much easier to check for; imagine a million lines of code claiming to prove FLT; the statement is short and it's very easy to check there is no notation nonsense in the statement -- and you don't need to check the proof. But checking for the issue which is the topic of this thread involves looking through a million lines of code (which of course a checker can do)

Eric Wieser (Sep 23 2023 at 20:41):

Kevin, unfortunately the problem is not just notation but instances; I can prove 2 + 2 = 5 if I add instance : Add Nat where add _ _ := 5 somewhere hidden within the other million lines

Eric Wieser (Sep 23 2023 at 20:43):

And while you can easily inspect the instances for your statement, when the instance trace is thousands of lines (as it frequently is!) it's rather harder

Geoffrey Irving (Sep 23 2023 at 21:13):

FLT is a particular easy case: you can make one file that builds up Nat, add, pow from first principles, using only the kernel, state FLT, and have the checker prove that the long version produces the same type.

Eric Wieser (Sep 23 2023 at 21:46):

If you build up Nat from first principles then the type check would fail, as types are nominal not structural and thus your.Nat = Nat is undecidable (and at any rate, certainly not definitionally true, which would be needed for it to type check)

Kyle Miller (Sep 23 2023 at 22:03):

@Eric Wieser I wrote the version of your code that does Lean.Meta.check, but it's still not robust -- you can circumvent the check still if the environment's constants don't form a DAG. Maybe this is useful as a test case for an external checker.

Kyle Miller (Sep 23 2023 at 22:06):

In case you want to see how to circumvent isDefEq by itself without environment loops:

Geoffrey Irving (Sep 23 2023 at 22:27):

@Eric Wieser It is not difficult to show that two versions of Nat are isomorphic. The long proof would include that isomorphism.

Martin Dvořák (Sep 23 2023 at 22:49):

Yeah. I would do something like the following reässurence:

import Mathlib.Tactic.Linarith

-- FLT imported from a suspicious source:
theorem FLT {a b c n : Nat} (ha : 0 < a) (hb : 0 < b) (hc : 0 < c) (hn : 2 < n) :
  a ^ n + b ^ n  c ^ n := sorry


-- From now only trusted code follows:

inductive MyNat
| zer : MyNat
| suc : MyNat  MyNat

def MyAdd (n : MyNat) : MyNat  MyNat
| MyNat.zer   => n
| MyNat.suc m => MyNat.suc (MyAdd n m)

def MyMul (n : MyNat) : MyNat  MyNat
| MyNat.zer   => MyNat.zer
| MyNat.suc m => MyAdd n (MyMul n m)

def MyPow (n : MyNat) : MyNat  MyNat
| MyNat.zer   => MyNat.suc MyNat.zer
| MyNat.suc m => MyMul n (MyPow n m)


def MyNat.toNat : MyNat  Nat
| MyNat.zer   => Nat.zero
| MyNat.suc n => Nat.succ n.toNat

lemma myAdd_toNat {a b c : MyNat} (hyp : MyAdd a b = c) :
  a.toNat + b.toNat = c.toNat :=
by
  sorry

lemma myMul_toNat {a b c : MyNat} (hyp : MyMul a b = c) :
  a.toNat * b.toNat = c.toNat :=
by
  sorry

lemma myPow_toNat {a b c : MyNat} (hyp : MyPow a b = c) :
  a.toNat ^ b.toNat = c.toNat :=
by
  sorry

theorem MyFLT (a b c n : MyNat) :
  MyAdd
    (MyPow a.suc n.suc.suc.suc)
    (MyPow b.suc n.suc.suc.suc)
   (MyPow c.suc n.suc.suc.suc) :=
by
  have ha : 0 < a.suc.toNat
  · simp [MyNat.toNat]
  have hb : 0 < b.suc.toNat
  · simp [MyNat.toNat]
  have hc : 0 < c.suc.toNat
  · simp [MyNat.toNat]
  have hn : 2 < n.suc.suc.suc.toNat
  · simp [MyNat.toNat]
    linarith
  by_contra hyp
  apply FLT ha hb hc hn
  convert myAdd_toNat hyp <;> apply myPow_toNat <;> rfl

Eric Wieser (Sep 23 2023 at 22:57):

Yes, I think that captures the gist; though I worry we're now further derailing this thread.

Arthur Paulino (Sep 24 2023 at 01:37):

In the same line of thought of "external checkers", it would be possible to write a Lean 4 typechecker in a language that can produce SNARKs. These can be expensive to be generated, but can be verified cheaply. Such artifacts would be virtually impossible to forge without having proper proofs for the theorems

Scott Morrison (Sep 24 2023 at 04:35):

Okay, I have a prototype verifier that "replays" all declarations into the environment.

It detects both of today's hacks:

% lake exe verify Examples.False
Kernel exception while processing:
---
add_false


---
Could not add declarations:
falso
(kernel) declaration type mismatch, 'falso' has type
  Prop
but it is expected to have type
  False

and

% lake exe verify Examples.False2
Kernel exception while processing:
---
add_super_false


---
Could not add declarations:
super_false
(kernel) unknown constant 'super_false'

Not quite ready for widespread use, but hopefully soon.

Mario Carneiro (Sep 24 2023 at 07:54):

Arthur Adjedj said:

I wonder what Mario Carneiro 's opinion is on this.

I've been aware of this exploit for some time, and kind of didn't want to make a big deal about it because it is somewhat inherent to the lean 4 architecture - there is nothing really preventing you from constructing Environment objects with whatever you want in them as they are a pure lean construction. This was not possible in lean 3, because Environment was an opaque object (defined in C++) which you could only interact with using the API. As Sebastian and Leo say, the main fix for this is to use an external checker. Lean 4 already makes it really easy to just override what def or #print axioms means, so you can't really trust anything you see in regular use in an adversarial scenario.

Mario Carneiro (Sep 24 2023 at 08:04):

I have long been of the opinion that we should view ITPs and proof assistants as "proof compilers": that is, they take as input high level code in some language (i.e. lean) and output proofs in a low level language (i.e. the export format). They are not responsible for checking the correctness of the output, although it is a quality of implementation issue that they should try to produce correct proofs. Actual proof checking is vested in a completely separate tool (a proof checker) which is small and not running in the same process as the prover. (You could maybe make an exception if the language is typesafe and has no escape hatch for UB, but it is in my mind extremely risky to put a large untrusted program and a small trusted program in the same address space. Neither C++, nor Lean, nor OCaml meet the bar I am describing here. Some flavors of ML may be able to meet this requirement, but usually not the ones used in practice.)

Mario Carneiro (Sep 24 2023 at 08:12):

And just to be clear, I do not consider this a soundness bug in lean. The logical system is not affected here, it is just sidestepping checking entirely. (BTW, --trust=0 does absolutely nothing in lean 4. In Lean 3 it did some stuff with unfolding macros but in lean 4 the trustLevel argument seems to be entirely vestigial.)

Shreyas Srinivas (Sep 24 2023 at 08:47):

Mario Carneiro said:

Actual proof checking is vested in a completely separate tool (a proof checker) which is small and not running in the same process as the prover. (You could maybe make an exception if the language is typesafe and has no escape hatch for UB, but it is in my mind extremely risky to put a large untrusted program and a small trusted program in the same address space.

I don't understand. Are you suggesting that the typechecker should run in a separate process and act like a server? What is the obstacle to make ITPs work like this?

Shreyas Srinivas (Sep 24 2023 at 08:49):

More specifically, why can't lean's kernel run in a separate process as a "verify" server and communicate with the rest of the system through some serialization of Expr.

Henrik Böving (Sep 24 2023 at 08:59):

Shreyas Srinivas said:

More specifically, why can't lean's kernel run in a separate process as a "verify" server and communicate with the rest of the system through some serialization of Expr.

I dont see how this is different from " the typechecker should run in a separate process and act like a server"

Kevin Buzzard (Sep 24 2023 at 09:05):

What this thread has taught me is that lean 4 is so flexible that the idea that "you can just trust the program" is simply out of date. This does not bother me at all. In CS people are always forced to think about malicious actors but when building mathematics like we're doing there is absolutely no motivation to cheat especially as we now know that you can be found out very easily (just not by lean itself). I think the scenario of a malicious maths actor claiming to prove RH (a) doesn't really exist and (b) can be protected against anyway, with an external checker. I do think that this is evidence that we'd better get an external checker sooner rather than later but Leo's post makes it clear that this is a priority that the FRO well aware of.

It will be interesting when the LLMs manage to discover that they can do this and start getting gold medals on IMOs because of it, but as I say it's clear that this can be mitigated against.

Kevin Buzzard (Sep 24 2023 at 09:20):

I think that the observation that opening lean code can wipe your hard drive is in some sense far more concerning but this has been known for years and again it's never been an issue for me in practice -- I mark students' work in a virtual machine but this is not difficult to set up.

Shreyas Srinivas (Sep 24 2023 at 09:27):

Henrik Böving said:

Shreyas Srinivas said:

More specifically, why can't lean's kernel run in a separate process as a "verify" server and communicate with the rest of the system through some serialization of Expr.

I dont see how this is different from " the typechecker should run in a separate process and act like a server"

I am asking why it isn't done like that currently. My best guess is latency. But, at least, it could be added as an option.

Shreyas Srinivas (Sep 24 2023 at 09:41):

@Kevin Buzzard : This could affect more than just research math. For example without extra checkers, how could an instructor be sure that their students submit correct proofs of textbook results. Especially if the instructor is not lean-savvy enough to catch it without the compiler telling them.

Scott Morrison (Sep 24 2023 at 09:55):

I think the people teaching using Lean so far are quite savvy enough. :-) And they are well aware of "hacking" possibilities like this.

Yaël Dillies (Sep 24 2023 at 09:56):

@Adam Topaz, would your autograder be tricked by Jannis' add_false? I believe not, but I haven't tried.

Geoffrey Irving (Sep 24 2023 at 10:00):

Alongside an ecosystem if checkers, it would be cool if there was

  1. A standard “lake check” command or similar that runs the check, or nice integration between lake and the checkers.
  2. An idiomatic way to have no-library Lean code to express theorems. Possibly this already exists, but for example @Martin Dvořák’s approach above is unsound as written: it starts by importing an untrusted file, and all is lost if you do this.

Kevin Buzzard (Sep 24 2023 at 10:09):

@Shreyas Srinivas I don't use any auto-grading in my course, the students write individual lean projects on topics of their choice.

Adam Topaz (Sep 24 2023 at 11:21):

Yaël Dillies said:

Adam Topaz, would your autograder be tricked by Jannis' add_false? I believe not, but I haven't tried.

I have already modified my auto-grader to do something similar to Kyle's command above, but as he mentioned there, there is still a workaround. But in any case, I can think of several ways for students to bypass this (or any other auto-grading idea I had), so I just started skimming through student code regardless.

Adam Topaz (Sep 24 2023 at 11:22):

Once @Scott Morrison releases the verify executable, I'll probably experiment with using this as part of the workflow.

Notification Bot (Sep 24 2023 at 17:35):

3 messages were moved from this topic to #general > Lean and the upcoming Cyber Resilience Act by Kyle Miller.

Scott Morrison (Sep 26 2023 at 01:52):

We have just released lean4checker, which reverifies .olean files using the Lean kernel.

(Thanks also to Mario for a significant performance improvement.)

This tool detects and rejects the "environment hacking" tricks described above in this thread.

It uses the Lean kernel, but none of the elaborator or other frontend. (Imports are just Lean.CoreM and Lean.Util.FoldConsts.)

PR #7368 proposes adding this as a regular CI step for Mathlib. It comes at a ~6 minute cost per CI run; perhaps we will switch to a cron job and/or only run lean4checker on modified files (and files downstream) if this feels too expensive. I would prefer that we merge it now as a regular job, however.

There's more coming: the FRO has plans (weeks-to-months timeline) to provide independent verifiers written in languages other than Lean, which will provide an alternative to lean4checker with no overlap with the Lean codebase. Of course if anyone would like to write such a verifier that would be very welcome (and a cool project).

Arthur Paulino (Sep 26 2023 at 03:21):

@Alexandre Rademaker :point_up:

MohanadAhmed (Sep 26 2023 at 07:06):

Scott Morrison said:

There's more coming: the FRO has plans (weeks-to-months timeline) to provide independent verifiers written in languages other than Lean, which will provide an alternative to lean4checker with no overlap with the Lean codebase. Of course if anyone would like to write such a verifier that would be very welcome (and a cool project).

  1. For people interested in external checkers, are there plans for the Olean files have an "official" specification?
  2. What background knowledge do you think is necessary to independently build such a verifier?

Henrik Böving (Sep 26 2023 at 07:29):

Just a wild idea, https://github.com/leanprover/lean4checker/blob/master/Main.lean#L86 could you inject ctor's or rec's into an Environment yourself and get proofs that way without the kernel noticing? -> is it okay to just leave these out?

Mario Carneiro (Sep 26 2023 at 07:34):

oh good point

Scott Morrison (Sep 26 2023 at 07:34):

Yes, I think there is a loophole here.

The current tool only checks each file individually. Thus if you put in an evil constructor in one file, then lean4checker will just drop it. Any subsequent results in the same file which tries to use it will fail. But when we run lean4checker on a downstream file, we are not replaying the environment from that earlier file, so the use of the evil constructor will still be allowed.

The same code as in lean4checker should (potential stack overflows notwithstanding?) be easy to adapt to rebuild the entire environment from scratch, rather than accepting the environment provided by imports.

Scott Morrison (Sep 26 2023 at 07:35):

Of course you would only run this on one file (e.g. Mathlib.lean).

Scott Morrison (Sep 26 2023 at 07:36):

I'm not sure what the longest path in the declaration dependency graph is. The current implementation may overflow, but it should be straightforward to rewrite iteratively.

Mario Carneiro (Sep 26 2023 at 07:38):

import Lean

open Lean in
elab "add_false" : command => do
  modifyEnv fun env =>
    let constants := env.constants.insert `false $ ConstantInfo.ctorInfo
      { name := `false, levelParams := [], type := .const ``False []
        induct := ``False, cidx := 0, numParams := 0, numFields := 0, isUnsafe := false }
    { env with constants }

add_false

example : False :=
   false

Mario Carneiro (Sep 26 2023 at 07:38):

passes the checker

Mario Carneiro (Sep 26 2023 at 07:39):

oh but you can't use it for real, theorem bla : False := false fails because false isn't in the environment

Mario Carneiro (Sep 26 2023 at 07:41):

I think there is a deeper issue here, which is that environments are not meant for replay

Mario Carneiro (Sep 26 2023 at 07:42):

we should really be storing the list of environment Declarations rather than reconstructing them

Mario Carneiro (Sep 26 2023 at 07:43):

there are actually loads of similar exploits involving all the fields in InductiveVal: those have a bunch of assumed invariants and you have to make sure you exactly replicate it

Mario Carneiro (Sep 26 2023 at 07:48):

In lean 3 there was this concept of "environment modifications" which compose the output format. Reconstructing this from the environment alone is a very lossy process

Mario Carneiro (Sep 26 2023 at 07:50):

QuotVal has a similar exploit, you can lie about the type of quotient decls

Scott Morrison (Sep 26 2023 at 08:22):

I've updated lean4checker so it can replay the entire environment, rather than just "the environment since the imports".

It catches the ctorInfo example above (rather: it doesn't catch the use in example, but if you use the evil constructor in a theorem in either the same file or a downstream file it catches you).

You don't get to take advantage of any parallelism however, so I'm not sure how long it will take. It is running on Mathlib now.

Scott Morrison (Sep 26 2023 at 08:25):

I guess rather than just dropping ctorInfos, a better approach would be to verify that they match identically with the ctorInfos generated when we replay the inductive.

Scott Morrison (Sep 26 2023 at 09:11):

Okay, I have updated lean4checker so that:

  • You can no longer smuggle an evil constructor or recursor into one file, and be able to use it later. It is rejected in the file you try to create it. We do this by collecting all constructors and recursors and, after all the inductives have been built, verify that the constructors and recursors were identical to the generated ones.
  • There is now a lean4checker --fresh Mathlib mode, which rebuilds the entire environment from scratch, rather than just replaying starting from the imports. Please brew coffee before attempting this!

Any reviews of the PR at https://github.com/leanprover/lean4checker/pull/2 much appreciated.

Mario Carneiro (Sep 26 2023 at 09:26):

I think it would be best to keep checking environments differentially as you were doing, but make sure that all the ConstantInfos added match what the kernel says

Scott Morrison (Sep 26 2023 at 09:27):

That's exactly what I've done, I think.

Scott Morrison (Sep 26 2023 at 09:27):

I still drop the QuotVals.

Scott Morrison (Sep 26 2023 at 09:28):

The --fresh mode is hidden behind a flag because it's not intended to be regularly used. It is really quite slow since we can't parallelise.

Mario Carneiro (Sep 26 2023 at 09:28):

we can, it's just more complicated to do so

Mario Carneiro (Sep 26 2023 at 09:28):

I mean, that's just the same kind of parallelism that lake is already doing

Mario Carneiro (Sep 26 2023 at 09:29):

I also have some partial work on a fix, I will try to merge it with leanprover/lean4checker#2

Scott Morrison (Sep 26 2023 at 09:29):

I think I might leave that to lake. :-)

Scott Morrison (Sep 26 2023 at 09:29):

Great. Please feel free to push to that branch.

Petur Vetle (Sep 27 2023 at 05:59):

I think a crucial aspect has been overlooked, but I don't have the knowledge/experience compared to the others here. So if I am missing something or if it has already been addressed, feel free to point it out.

The thing that concerns me is not an adversary, but rather that its non-trivial to use Lean to verify the correctness of a proof written in Lean. It should be as simple as #print axioms foo and compile. However, the comments in this thread and the source code of lean4checker indicate otherwise.

External checkers solve this problem but it misses the point. If one trusts Lean, then I think its reasonable that one should be able to trust a proof that Lean says is valid.

I have not used Lean's internal APIs, so I don't know how easy/difficult it is to "accidentally" add a False statement into the environment. But I do know that in complex systems (e.g. mathlib) it is easy for small and seemingly innocent changes to have unintended effects.

Petur Vetle (Sep 27 2023 at 06:57):

For me, the issue of this is better shown in a program that actually computes things. The fact that the following can compile is concerning from a program verification point of view:

import Lean

open Lean

elab "add_false" : command => do
  modifyEnv λ env =>
    let constants := env.constants.insert `falso $ ConstantInfo.thmInfo
      { name := `falso
        levelParams := []
        type := .const ``False []
        value := .const ``False []
      }
    { env with constants }

add_false

theorem can_index_anywhere (a : Array Nat) (i : Nat) : i < a.size :=
  False.elim falso

-- the user has no way to determine that this is not true without using an external checker?
#print axioms can_index_anywhere

def index_anywhere (a : Array Nat) (i : Nat) :=
    a[i]'(can_index_anywhere a i)

#eval index_anywhere #[1, 2, 3] 4

Mario Carneiro (Sep 27 2023 at 06:57):

If you are not in an adversarial mindset, none of these issues are a problem. It is basically impossible to do this kind of thing by accident: you either have to use a very low level API which is not ever used in practice, or break the type system in some way in order to cause problems.

Mario Carneiro (Sep 27 2023 at 06:58):

Note that you can already have issues with sorry or unsafe in #eval

Mario Carneiro (Sep 27 2023 at 07:00):

And you can always implement your own #eval that takes the guard rails off if you want

Mario Carneiro (Sep 27 2023 at 07:01):

add_false is basically similar: the metaprogramming API is very powerful and lets you direct lean to check or not check whatever you like

Mario Carneiro (Sep 27 2023 at 07:01):

But we are designing here from the perspective that we are trying to build a proof assistant, which assists the user in writing a proof. That means giving errors when it is useful

Mario Carneiro (Sep 27 2023 at 07:02):

A tactic that lies to the user and fails to give good error messages is not a good tactic

Scott Morrison (Sep 27 2023 at 07:07):

Perhaps it would be good to add a #replay command that can be used interactively.

Mario Carneiro (Sep 27 2023 at 07:09):

assuming it's not overridden...

Shreyas Srinivas (Sep 27 2023 at 08:12):

Mario Carneiro said:

But we are designing here from the perspective that we are trying to build a proof assistant, which assists the user in writing a proof. That means giving errors when it is useful

I don't understand this perspective. If A relies on B (some tool) to check A's software, then the proof assistant assists B. So B could end up using a tactic that (accidentally) got its environment manipulation incorrect. A should not be expected to trust B's wisdom any more than the A's own programming skills. And it should not be necessary for A or B to minutely examine the entire source they use. Petur is right. From a formal verification perspective, nobody would trust such a tool.

Chris Wong (Sep 27 2023 at 08:22):

I think what Mario is getting at is accidental vs malicious mistakes. Nobody would write add_false on purpose. And if the author is writing devious code on purpose, then you've already lost.

Kevin Buzzard (Sep 27 2023 at 08:23):

nobody would trust such a tool

I would trust that tool, because I know what I'm doing. I understand that Lean 4 is so flexible that you can now "cheat" and I understand that there is (or soon will be) a tool which detects that cheating. Thus your claim is false.

Scott Morrison (Sep 27 2023 at 08:24):

If you are concerned about these issues, you run lean4checker. If you're not, you either don't care, or are content that lean4checker (will soon) run in CI.

Chris Wong (Sep 27 2023 at 08:24):

This is similar to the unsafe guarantee in Rust. You can clobber arbitrary memory with no unsafe code by writing to /proc/self/mem. This is not considered a soundness hole.

Scott Morrison (Sep 27 2023 at 08:26):

It's pretty straightforward. Of course the Lean frontend and VSCode extension are not meant to be "trusted" in this sense. They are tools for writing proofs. The kernel checks the proofs, and the kernel has no soundness issues. If you decide to not send your proofs to the kernel, then of course it has no opportunity to check them.

Henrik Böving (Sep 27 2023 at 08:27):

the kernel has no soundness issues

That we know of :wink:

Scott Morrison (Sep 27 2023 at 08:28):

Lean is a general purpose programming language, and it is intended to be capable of doing metaprogramming, including manipulating Lean Environment objects. If you want something that is just a verifier, use a verifier.

Scott Morrison (Sep 27 2023 at 08:30):

Henrik Böving said:

That we know of :wink:

Sure, of course! Lean's track record is very solid here.

Anne Baanen (Sep 27 2023 at 08:38):

Even if you decide to trust the verifier output, beware that Lean files are totally capable of replacing lean4checker with a program that does the equivalent of echo "Looks good to me" :P

Scott Morrison (Sep 27 2023 at 08:57):

This is getting a little silly, no? A lean program is capable of that in the same sense that a C program is capable of replacing your C compiler with a nefarious one. I mean, that is a valid "adversary" to think about in some contexts, but ...

Henrik Böving (Sep 27 2023 at 09:18):

Scott Morrison said:

This is getting a little silly, no? A lean program is capable of that in the same sense that a C program is capable of replacing your C compiler with a nefarious one. I mean, that is a valid "adversary" to think about in some contexts, but ...

I mean you could set it up such that once the lean compiler starts it priv drops into a user that cannot touch the leanchecker binary and such but that seems overkill

Shreyas Srinivas (Sep 27 2023 at 09:18):

TL DR: I'd say an external checker as mentioned by Leo is the best long term solution.

Scott: It is a question of what the system guarantees are, and what assumptions are made. In principle one can also go to extremes and overwrite the lean kernel in-memory through some hardware side channel. Verification is always w.r.t a reasonable set of assumptions. The kernel has no soundness issues. But for use in formal verification, it is generally hoped that I don't have to worry about trivial side channels like skipping the checker through a bad tactic. Comparing lean to C is unfair. A C programmer knows (or at least ought to know), that they are getting very little by way of guarantees.

Shreyas Srinivas (Sep 27 2023 at 09:36):

Chris Wong said:

I think what Mario is getting at is accidental vs malicious mistakes. Nobody would write add_false on purpose. And if the author is writing devious code on purpose, then you've already lost.

It is not merely about one malicious programmer. It is also about trusting libraries. Currently we have a few centrally managed libraries so this is not an issue at all. But in the future, we need to have a reasonably easy way of checking that a library is not buggy internally (sure someone could intercept the download and add the bug, but that's beyond lean).

Mario Carneiro (Sep 27 2023 at 09:46):

Anne Baanen said:

Even if you decide to trust the verifier output, beware that Lean files are totally capable of replacing lean4checker with a program that does the equivalent of echo "Looks good to me" :P

I checked, there is no opportunity to run custom code during initialization to affect the relevant parts of the state of lean4checker. The only avenue for corruption is to cause UB and break lean that way.

Mario Carneiro (Sep 27 2023 at 09:49):

Shreyas Srinivas said:

Chris Wong said:

I think what Mario is getting at is accidental vs malicious mistakes. Nobody would write add_false on purpose. And if the author is writing devious code on purpose, then you've already lost.

It is not merely about one malicious programmer. It is also about trusting libraries. Currently we have a few centrally managed libraries so this is not an issue at all. But in the future, we need to have a reasonably easy way of checking that a library is not buggy internally (sure someone could intercept the download and add the bug, but that's beyond lean).

I don't see why this is necessary any more than the usual standards of correctness for a library. How is it that you come to trust any library at all, without reviewing carefully all the code? Lean is not a trusted component, so if there are bugs then it's not a big deal, you report/fix them and move on

Mario Carneiro (Sep 27 2023 at 09:51):

I'm curious how you expect to satisfy this criterion for, say, libraries written in Javascript, Rust, Ruby, etc.

Shreyas Srinivas (Sep 27 2023 at 09:52):

I don't expect them to satisfy this requirement. They are not proof checkers. Worst case, my application throws an exception and I have to use my debugger to isolate the cause.

Shreyas Srinivas (Sep 27 2023 at 09:53):

How do you debug a proof OTOH? It doesn't run. It doesn't get tested. It doesn't crash. And with a bug like this, it accepts incorrect theorems silently.

Mario Carneiro (Sep 27 2023 at 09:55):

Lean is not a proof checker, it is a proof assistant. There is a difference

Mario Carneiro (Sep 27 2023 at 09:56):

It checks proofs as a convenience for proof authors, so it can give nice error messages and the like

Mario Carneiro (Sep 27 2023 at 09:56):

Shreyas Srinivas said:

I don't expect them to satisfy this requirement. They are not proof checkers. Worst case, my application throws an exception and I have to use my debugger to isolate the cause.

the exact same thing applies to lean programs. Worst case, it throws an exception and you use a debugger

Eric Wieser (Sep 27 2023 at 09:56):

Mario Carneiro said:

I checked, there is no opportunity to run custom code during initialization to affect the relevant parts of the state of lean4checker. The only avenue for corruption is to cause UB and break lean that way.

I think Anne's point is that my lean code can contain the equivalent of f = open('leanchecker', 'w'); f.write("echo ok"), and so by the time you run the checker its not the binary you thought it was any more. This is obviously easy to protect against by having leanchecker be in a read-only filesystem

Shreyas Srinivas (Sep 27 2023 at 09:56):

Mario Carneiro said:

Shreyas Srinivas said:

I don't expect them to satisfy this requirement. They are not proof checkers. Worst case, my application throws an exception and I have to use my debugger to isolate the cause.

the exact same thing applies to lean programs. Worst case, it throws an exception and you use a debugger

Programs yes. Proofs?

Mario Carneiro (Sep 27 2023 at 09:58):

What about proofs? Lean is a tool to help you build proofs. If you break it by some exploit, you won't get a proof out, so you aren't really incentivized to do so

Shreyas Srinivas (Sep 27 2023 at 09:59):

Mario Carneiro said:

What about proofs? Lean is a tool to help you build proofs. If you break it by some exploit, you won't get a proof out, so you aren't really incentivized to do so

And if all I am doing is using a library which happens to contain this exploit that I can't find in an automated way?

Mario Carneiro (Sep 27 2023 at 09:59):

then the library has a bug and you should report it

Mario Carneiro (Sep 27 2023 at 09:59):

you will know because it's producing bad proofs

Anne Baanen (Sep 27 2023 at 10:00):

Eric Wieser said:

I think Anne's point is that my lean code can contain the equivalent of f = open('leanchecker', 'w'); f.write("echo ok"), and so by the time you run the checker its not the binary you thought it was any more

Or really more generally that silly attacks will always remain possible if your threat models are powerful and motivated enough.

Shreyas Srinivas (Sep 27 2023 at 10:00):

Mario: That's easy to say if one is doing simple proofs. What happens if there is a bug in some library and someone is doing a verification effort on the scale of some easycrypt project?

Mario Carneiro (Sep 27 2023 at 10:01):

bad proofs here as determined by a checker

Mario Carneiro (Sep 27 2023 at 10:01):

this is what checkers are for

Mario Carneiro (Sep 27 2023 at 10:02):

(also, I have to emphasize that this isn't just a bug, it's a deliberate exploit. There is really no reason you would run into this on accident)

Mario Carneiro (Sep 27 2023 at 10:03):

if there are exploits in your library you have bigger problems

Shreyas Srinivas (Sep 27 2023 at 10:06):

Mario Carneiro said:

bad proofs here as determined by a checker

Anne: What I mean is that it is very reasonable for a threat model for an ITP to exclude that kind of external malicious change. This exploit is happening within the system "lean" (assistant + kernel).

Mario : External checkers are needed. We agree on this. As for this being an exploit, in a large dependency chain, an exploit at one level can become a bug downstream.

Mario Carneiro (Sep 27 2023 at 10:08):

an exploit at any level is broken trust, you should not associate with that software vendor if they are doing misconduct

Mario Carneiro (Sep 27 2023 at 10:09):

it's like discovering malware in your dependencies, you can't just paper over it, get it out of the system posthaste

Shreyas Srinivas (Sep 27 2023 at 10:10):

Mario Carneiro said:

an exploit at any level is broken trust, you should not associate with that software vendor if they are doing misconduct

This is nice, but doesn't scale as dependency graphs get bigger. Do people check all their hundreds of npm dependencies and worse still the next layer of indirect dependencies.

Mario Carneiro (Sep 27 2023 at 10:11):

what I described is reactionary, it scales well enough

Mario Carneiro (Sep 27 2023 at 10:11):

catching things before they blow up is harder to scale

Mario Carneiro (Sep 27 2023 at 10:12):

but having a proof checker that can tell you if anyone in your stack is lying is a very scalable solution :)

Shreyas Srinivas (Sep 27 2023 at 10:13):

Mario Carneiro said:

it's like discovering malware in your dependencies, you can't just paper over it, get it out of the system posthaste

so suppose a third level indirect dependency is malicious and there are in total a few hundred of them. How would I go about finding the malicious dependency? Does lean have a proof debugger?

Mario Carneiro (Sep 27 2023 at 10:18):

If we're talking bad proofs, it's really easy: the bad proof will be located in some module, which is in some package

Mario Carneiro (Sep 27 2023 at 10:19):

although perhaps you want to blame the tactic that generated the proof rather than the proof itself

Mario Carneiro (Sep 27 2023 at 10:20):

This is basically a non-issue in practice, it's usually not that hard to minimize errors once they are detected

Shreyas Srinivas (Sep 27 2023 at 10:21):

How do we detect it quicky? Suppose we consider a large project with hundreds or thousands of dependencies.

Mario Carneiro (Sep 27 2023 at 10:21):

you mean like mathlib?

Shreyas Srinivas (Sep 27 2023 at 10:22):

Mario Carneiro said:

you mean like mathlib?

I am not sure mathlib is a representative example. It is maintained and built by a relatively small number of very good people.

Mario Carneiro (Sep 27 2023 at 10:23):

Evidence suggests that during minimization you can throw away huge quantities of code, I would estimate the work is logarithmic in the number and size of dependencies

Mario Carneiro (Sep 27 2023 at 10:23):

I suspect you are just going to no-true-scotsman any example I give here

Mario Carneiro (Sep 27 2023 at 10:24):

sure, minimization is some work, but it's not intractable, even on million line codebases

Shreyas Srinivas (Sep 27 2023 at 10:43):

Mario Carneiro said:

I suspect you are just going to no-true-scotsman any example I give here

Well here is one more question: How should reviewers of a paper submission review the proofs that have been submitted to them by anonymous authors, in lean, which might be large and have multiple dependencies?

Johan Commelin (Sep 27 2023 at 10:52):

Maybe they run the code in the appendix through an external checker that automatically also checks all dependencies? I don't understand the problem.

Shreyas Srinivas (Sep 27 2023 at 11:12):

Johan Commelin said:

Maybe they run the code in the appendix through an external checker that automatically also checks all dependencies? I don't understand the problem.

Agreed. If the external checker checks everything in one go, then that works very well.

What Petur initially pointed out is that this doesn't happen yet(see quote below). Further, the immediate response to Petur was that no reasonable person would do something that inserts a proof of False in the environment. So it is not a cause for worry. Unfortunately making such an assumption is a non-starter for trusting any non trivial formal verification effort for systems, because a malicious dependency at one level could trickle down into wrong proofs downstream, especially when powerful tactics are involved.

Petur Vetle said:

The thing that concerns me is not an adversary, but rather that its non-trivial to use Lean to verify the correctness of a proof written in Lean. It should be as simple as #print axioms foo and compile. However, the comments in this thread and the source code of lean4checker indicate otherwise.

Scott Morrison (Sep 27 2023 at 11:28):

Of course we would like multiple redundant versions of lean4checker, written in any other languages, and at least one is coming soon.

But to my knowledge lean4checker is perfectly capable of checking any current proof in Lean from top to bottom. And setting aside "but what if I changed the lean4checker binary" type objections, I don't know of any ways around it.

That said, it is 48 hours old, so presumably we will still think of further problems with it --- the 24 hour old version of it did have some gaps using constructors/recursors for inductives. It is only 200 lines long (+60 easy lines in another file), so adversarial reviews are both welcome and warranted! :-)

Scott Morrison (Sep 27 2023 at 11:28):

(Of course, here I mean 200 lines beyond the Lean kernel.)

Joachim Breitner (Sep 27 2023 at 11:41):

We could start a bounty program. Anyone who finds a way to trick the checker can get a PR of their choice fast-tracked :-)

Chris Wong (Sep 27 2023 at 11:55):

A Manifold market would have a similar effect :wink:

Arthur Paulino (Sep 27 2023 at 11:57):

Shreyas Srinivas said:

Mario Carneiro said:

I suspect you are just going to no-true-scotsman any example I give here

Well here is one more question: How should reviewers of a paper submission review the proofs that have been submitted to them by anonymous authors, in lean, which might be large and have multiple dependencies?

Sorry to be mentioning this so many times, but it's such a nice use case of SNARKs. People could run a SNARK verifier on their smartphones with a QR code and it would approve or reject proofs in milliseconds. This sounds like the "Lean 4 seal of approval" that I was looking for ~2 years ago when I joined this Zulip server

https://leanprover.zulipchat.com/#narrow/stream/113488-general/topic/.E2.9C.94.20Remotely.20trusted.20proofs/near/258633430

Shreyas Srinivas (Sep 27 2023 at 12:25):

@Scott Morrison : I am extremely grateful for the lean4checker. Naturally one can't expect it to be absolutely perfect in 48 hours. Of course an end to end external checker that checks everything including dependencies is going to be a good solution as Johann suggests.

My argument is more against the principle that there is no cause for worry about a malicious person inserting a proof of False in the environment, from inside the lean system, because then the entire proof is wrong anyway. At scale such malicious attempts can become almost undetectable and painful to find bugs. Not every project will be a nicely constructed and carefully maintained library like mathlib. Further as a lean expert, you and Kevin might not be worried too much about such stray attempts at cheating because you can see the mistake from a mile away. But in the long run, a project should strive to be something non-expert users who don't know metaprogramming internals can trust. Ultimately, formal verification's promise is for a user to be able to believe a system that says "trust me it is correct per your spec". I think that is also what Petur is trying to say.

Shreyas Srinivas said:

Further, the immediate response to Petur was that no reasonable person would do something that inserts a proof of False in the environment. So it is not a cause for worry.

Scott Morrison (Sep 27 2023 at 12:32):

I think we're in furious agreement here. Formal verification is meant to verify your proofs. That's what the Lean kernel does.

A programming language that allows metaprogramming and is highly capable as a proof assistant is of course much more complex than a verifier, and can do many things.

Henrik Böving (Sep 27 2023 at 12:32):

I still don't see how you think people are able to insert proofs of False into the system here. The lean checker makes sure that the environment matches the environment that the kernel would expect to build. Sure you can start attacking the lean checker and what not if you really want but that is not a Lean issue that's an IT security issue. If you are scared about this type of stuff just copy the oleans of the project into a VM and run the checker there in isolation or something like that.

So the only way remaining to inject a proof of false into the system would be via an axiom which you can also detect if you wish. In particular you could just limit the set of axioms that you allow to show up and integrate that check into the lean4checker

So what is your attack scenario here? How will a third party that you are importing code from compromise the trust you can have in the correctness of your proof?

Shreyas Srinivas (Sep 27 2023 at 12:33):

Just to clarify, I think the leanchecker tool is awesome already and a great step forward. I am not at all arguing against it.

Henrik Böving (Sep 27 2023 at 12:33):

I get that, I just dont understand what you want to have in addition

Scott Morrison (Sep 27 2023 at 12:33):

I think it's also important to realize that no piece of software in isolation should ever give a "non-expert user" good reason to trust a proof.

Scott Morrison (Sep 27 2023 at 12:34):

Even the smallest verifiers require expertise to understand and check yourself.

Scott Morrison (Sep 27 2023 at 12:37):

For a non-expert, you are choosing to trust the combined system of software and expert users. If you're at the point you can dispense with trusting the experts, you're already an expert ... :woman_shrugging:

Scott Morrison (Sep 27 2023 at 12:38):

@Henrik Böving, I mean, obviously what we want in addition is a freshly implemented verifier.

Chris Wong (Sep 27 2023 at 12:40):

To be clear. There aren't any known holes in the Lean kernel, right?

Henrik Böving (Sep 27 2023 at 12:48):

That is correct

Arthur Paulino (Sep 27 2023 at 12:53):

Can lean4checker be incorporated by Lake?

Matthew Ballard (Sep 27 2023 at 12:54):

One thing that is implicit in this thread and others (and that should not be taken for granted) is the openness of the whole community to present and honestly discuss possible “exploits”.

Scott Morrison (Sep 27 2023 at 12:55):

I think it makes sense for lean4checker to remain a separate repository.

Scott Morrison (Sep 27 2023 at 12:56):

You can run it in any Lean project with lake env path/to/lean4checker.

Scott Morrison (Sep 27 2023 at 12:57):

And if you add it as a dependency to your lakefile then you can just run lake exe lean4checker.

Scott Morrison (Sep 27 2023 at 12:57):

If there's a case for further integration I'd be interested to hear the details.

Arthur Paulino (Sep 27 2023 at 12:59):

Both options make sense, but if it becomes a straightforward Lake config (e.g. checkOleans := true), then the Lean 4 ecosystem issue would have a native solution instead of requiring users to deal with an extra moving piece.

Another advantage is dependency and toolchain version management. Does lean4checker compile on toolchain X? That's a problem that users can simply not have.

Shreyas Srinivas (Sep 27 2023 at 13:40):

Henrik Böving said:

I get that, I just dont understand what you want to have in addition

@Henrik Böving : What Scott says. Ultimately,

  1. I want to be able to use libraries of proofs without checking every single inherited dependency manually.
  2. I want to be able to submit lean proofs as artifacts to conferences which they can check and trust without too much effort.
  3. As an artifact reviewer I would like to be able to sign off on formalisation artifacts within reasonable time and confidence in its correctness (including dependencies).

Joachim Breitner (Sep 27 2023 at 16:24):

The lean4checker reads .olean files, which are directly memory-mapped, and not validated. So it might be attackable by “poisonous olean” files that contain invalid data structure or override parts of the memory space. That level of nefariousity will require going through an external file format (which is planned, of course, just stating this for the record.)

Mac Malone (Sep 27 2023 at 18:25):

Shreyas Srinivas said:

  1. I want to be able to use libraries of proofs without checking every single inherited dependency manually.

In the abstract, this is impossible. A proof is dependent on all its priors. To verify its soundness, you must necessarily verify its dependencies. This is true in general logic as well, if ZF is one day shown be inconsistent then everything borne from it becomes unsound. In practice, there can be a set of dependencies that is already considered "trusted" by the venue / community (e.g., because they have already gone through rigorous external checking) and a checker could be configured to skip checking those parts, saving time during a review.

Shreyas Srinivas (Sep 27 2023 at 18:27):

I don't want to check them _manually_

Shreyas Srinivas (Sep 27 2023 at 18:28):

I do want them checked automatically by the external checker as Johann suggested

Shreyas Srinivas (Sep 27 2023 at 18:28):

Johan Commelin said:

Maybe they run the code in the appendix through an external checker that automatically also checks all dependencies? I don't understand the problem.

Specifically what Johann said here.

Mac Malone (Sep 27 2023 at 18:29):

Shreyas Srinivas said:

I don't want to check them _manually_

I wasn't suggesting that? The (ideally, external) checker would be a program run by the reviewer on the untrusted parts of the codebase.

Shreyas Srinivas (Sep 27 2023 at 18:34):

I am referring to what I meant in point 1 in my earlier message, which you quoted. What I meant there is that I don't want to carefully read every file of every dependency to check for malicious lines. If I can run the external checker on an entire project + dependencies then I don't have to scrutinize it like that.

Jireh Loreaux (Sep 27 2023 at 18:35):

@Shreyas Srinivas let me offer a constructive suggestion: your desires and goals are laudible, and I think everyone probably agrees that having maximal trust in our system is essential. However, I think that, for now, the discussion in that direction has been mostly exhausted. What is needed now is action. There are two potential ways you could contribute here:

  1. implement your own external Lean 4 proof checker in a language of your choice. I think there is already an alternative export format that can be used (besides oleans) that Sebastian implemented. Then make it open to the community for use.
  2. give concrete examples (with code!) of exploits that demonstrate your concerns and show how the existing tools (currently only lean4checker) are insufficient for the task.

Jireh Loreaux (Sep 27 2023 at 18:35):

Note that this is how Jannis started this thread a few days ago and within about 60 hours there was already a new tool implemented to address his concern. Of course, I can't promise a 60-hour turnaround time on everything, but if you make the case with explicit examples of the problem, I'm sure the community will be open to addressing the issues.

Shreyas Srinivas (Sep 27 2023 at 18:36):

To clarify again,

  1. I am not asking for anything new. I am just arguing against a specific idea that was suggested here.
  2. I am perfectly happy with leanchecker and the proposed plan for external checkers.

Jireh Loreaux (Sep 27 2023 at 18:56):

I've taken the occasion to reread the entire thread. The issue is that you are arguing against a specific idea, but:

  1. This problem you are suggesting is, thus far, purely hypothetical and does not at this time exist in practice (e.g., there do not exists Lean projects with hundreds or thousands of dependencies, and we have no evidence of malicious actors trying to get us to accept proofs of False).
  2. You have not taken the opportunity to explicitly demonstrate the problem arising. For example, you have not said: "I have project X I am working on which contains Y and Z as dependencies. I would like to verify that my project and all its dependents are valid. Can I do this in a way that avoids manual effort?"
  3. You have not taken the opportunity to write the tool you want. If there's something that doesn't exist (a lean4checker that will check all dependencies also), and you think we should have it but no one else does, just write it yourself.

Note that (3) is exactly the story of mathlib-changelog, and afterwards the community recognized the value (it's a shame it hasn't yet been ported to Lean 4, but I'm just as guilty as anyone here since I haven't taken the steps to make it happen; for me, I didn't use it that much, but I know others did).

Shreyas Srinivas (Sep 27 2023 at 19:03):

All true. I am happy to try writing such a checker, if only as a fun exercise. But it is not fair to say "something hasn't happened in lean yet so it is pointless to argue that we should mitigate against it", especially since lean4 is fairly new. If every new language/project has to relearn each lesson from scratch, when we know these lessons from previous projects/languages, then no new problems would ever get solved. Supply chain attacks are well known and explored as a concept in other languages.

Siddhartha Gadgil (Sep 28 2023 at 06:00):

Jireh Loreaux said:

Shreyas Srinivas let me offer a constructive suggestion: your desires and goals are laudible, and I think everyone probably agrees that having maximal trust in our system is essential. However, I think that, for now, the discussion in that direction has been mostly exhausted. What is needed now is action. There are two potential ways you could contribute here:

  1. implement your own external Lean 4 proof checker in a language of your choice. I think there is already an alternative export format that can be used (besides oleans) that Sebastian implemented. Then make it open to the community for use.
  2. give concrete examples (with code!) of exploits that demonstrate your concerns and show how the existing tools (currently only lean4checker) are insufficient for the task.

Since the one language I know reasonably well besides Lean is scala, I can try to port Trepplein unless @Gabriel Ebner is already doing this.

Mario Carneiro (Sep 28 2023 at 06:02):

I think it would be good to have one written in not-lean though, I wouldn't want everyone to jump on the lean version

Mario Carneiro (Sep 28 2023 at 06:03):

so perhaps before porting it to lean you should update it to lean 4 while still in scala

Mario Carneiro (Sep 28 2023 at 06:04):

(for people suspicious of lean and looking for an external verifier, there are advantages to having a verifier written in not-lean)

Siddhartha Gadgil (Sep 28 2023 at 06:05):

Mario Carneiro said:

I think it would be good to have one written in not-lean though, I wouldn't want everyone to jump on the lean version

By "port" I meant to Lean 4 export format, still written in scala.

Peiran Wu (Sep 28 2023 at 16:42):

Shouldn't we be making more people aware of this exploit, the existence of lean4checker and plans for external checkers?

Mario Carneiro said:

Lean is not a proof checker, it is a proof assistant. There is a difference

I'm not sure many people realise this. I didn't until I read this thread.

Maybe this caveat should be added on the websites. I think Shreyas Srinivas made a good point in that people have different expectations between Lean and languages like C and Rust. If the consensus here is that some of these expectations are incorrect, then we should change those expectations.

Jireh Loreaux (Sep 28 2023 at 16:51):

To expand a bit on this, since it is clear it is a new idea to some:

Mario Carneiro said:

Lean is not a proof checker, it is a proof assistant. There is a difference. It checks proofs as a convenience for proof authors, so it can give nice error messages and the like

This is a key concept to understand, which I've had the good fortune to learn by Mario explaining it to me patiently. When you interact with Lean in VS Code (or emacs or neovim, etc.), it is acting as a proof assistant; it is providing you feedback on the current state of your proofs, and helping you fill them in using tactics. In order to be an effective proof assistant, Lean does not need to provide absolute certainty that there are no problems (e.g., you can have sorry in other places and that's still useful), it just needs to have a reasonable approximation.

After you complete the proof, you want the proof assistant to be able to generate a file in some export format (olean, or something else perhaps), which can then be verified by a proof checker. These tasks (assistant and checker) should be separated. Thus far in Lean 4, we have been using the assistant as the checker, which does a more than passable job, but it is vulnerable to malicious actors. This is not the fault of Lean, it's doing exactly what it's meant to: being a proof assistant and general purpose programming language. That's why Scott implemented lean4checker and why other external checkers are in progress currently.

Kevin Buzzard (Sep 28 2023 at 16:51):

The big difference between a proof assistant and a proof checker is that the proof assistant offers you a ton of tools which are super-useful for humans who want to make their own proofs (tactics), and also a ton of meta-tools which are super-useful for humans who want to make it easier for other humans to make their own proofs (the ability to write tactics). It seems that Lean 4 has gone so overboard with the meta-tools that you can now write meta-tools which let you manipulate things in ways which are not allowed in proofs and then skip the part where Lean actually checks that the manipulations are valid.

Kevin Buzzard (Sep 28 2023 at 16:53):

However this is just what any flexible programming language is like and this is the insight Leo had -- that Lean had to become a fully-fledged programming language.

Jireh Loreaux (Sep 28 2023 at 16:54):

Peiran, I don't really think the expectations are wrong. 99% of the time (or actually much much more), Lean-as-proof-assistant suffices for Lean-as-proof-checker. It's really only in the edge cases where there is a potential malicious actor that you need the extra check (or if you want an outside check on Lean, e.g., if you are worried about bugs in the kernel). These exploits are not the kind of thing you are going to accidentally add to your codebase.

Peiran Wu (Sep 28 2023 at 16:59):

I think once you say a proof checker is different from a proof assistant, the rest is easy to grasp. But some might not realise that they have to realise there is a difference. I understand this is an edge case. But since Lean is fully-fledged programming language now, people will be using it to do things where this edge case might be more important. And I think with the ambition of being fully-fledged comes the responsibility of better security announcements.

MohanadAhmed (Sep 28 2023 at 17:05):

Am I right in understanding that the Olean files where not designed for this purpose, i.e. of being a format for external checkers to verify proofs. The olean format is a memory "dump" of lean and will change between lean releases. ?

Peiran Wu (Sep 28 2023 at 17:07):

I think I must have heard from someone that "if it compiles, it's correct". They might have only intended to simplify things slightly. But when I start the Lean study group at my university in a few weeks, I will remember to tell everyone that there is a difference between "if your code compiles" and "if someone else's code you found on GitHub compiles".

Kevin Buzzard (Sep 28 2023 at 17:23):

You could say "if the checker says yes then it's correct" and "every PR to mathlib is checked by the checker" (which is not yet true but was true for Lean 3 and will become true in time) and "hence code that you write yourself which uses only mathlib and only does maths and not hackery and which compiles is correct" and finally "if you run arbitrary lean code which someone else has written then it can (a) delete all the files on your computer and (b) can try to trick the system; it is possible to trick the system because Lean is super-sophisticated now but it's not possible to trick the checker"

Kyle Miller (Sep 28 2023 at 17:35):

Peiran Wu said:

I will remember to tell everyone that there is a difference between "if your code compiles" and "if someone else's code you found on GitHub compiles".

I'll quote Ken Thompson from 1984 in Reflections on Trusting Trust:

The moral is obvious. You can't trust code that you did not totally create yourself. [...] No amount of source-level verification or strutiny will protect you from using untrusted code.

(He demonstrated how you can make a C compiler that injects a backdoor into programs in a way that is self-perpetuating and undetectable from the source code -- the C compiler executable itself inserts the backdoor inserter into the C compiler it compiles.)

I suppose this is a little bit orthogonal to Lean and proofs, since with Lean you might expect that the "math part" was more sandboxed from the "systems part" than it currently is -- you have access to the whole metaprogramming interface even in the middle of proofs. Maybe one day we'll find the need to have a mode that restricts a module to using only the "math part." Until then, I think I trust mathlib reviewers to reject PRs that do sketchy environment manipulations or try to access the filesystem.

Mario Carneiro (Sep 28 2023 at 17:36):

(that piece is especially relevant given that lean 4 is also a bootstrapping compiler)

Peiran Wu (Sep 28 2023 at 17:52):

Kyle Miller said:

Until then, I think I trust mathlib reviewers to reject PRs that do sketchy environment manipulations or try to access the filesystem.

I do as well. (I'm generally paranoid about code security and use Docker containers extensively, but I do trust mathlib reviewers.)

I just want to reiterate my point that I think more people should be made aware of the exploit and mitigations against it, in contexts outside of mathlib.


Last updated: Dec 20 2023 at 11:08 UTC