Zulip Chat Archive

Stream: lean4

Topic: SafeVerify-passing kernel soundness exploit


James E Hanson (Jan 04 2026 at 21:59):

import Mathlib.Data.Nat.Basic
import Mathlib.Tactic
import Mathlib.NumberTheory.FLT.Basic
import Lean

open Lean

unsafe def badNatUnsafe : ℕ := unsafeCast (-4294967296 : ℤ)

@[implemented_by badNatUnsafe] opaque badNatVal : ℕ

run_elab
  addDecl <| .defnDecl {
    name        := .str .anonymous "badNat"
    levelParams := []
    type        := .const ``Nat []
    value       := .lit <| .natVal badNatVal
    hints       := .opaque
    safety      := .safe
  }

theorem FLT : FermatLastTheorem := by
  have truly_marvelous_0 : ¬badNat ≤ 9223372036854775807 := by decide
  have truly_marvelous_1 : ¬9223372036854775807 < badNat := by decide
  simp_all only [not_le]

#print axioms FLT
-- 'FLT' depends on axioms: [propext]

Web version.

I don't know whether this counts as a 'soundness bug' since it uses environment hacking to intentionally send a malformed declaration to the kernel, but it does fool lean4checker and SafeVerify. As far as I can tell, this could have in principle been used to claim the bounties on TheoremMarketplace (so @GasStationManager and @Vadim Fomin should know about it). Because of this, I'm guessing it wasn't a known issue.

Interestingly, it doesn't fool Comparator, because Comparator exports the solution environment to a string first (using lean4export) and then parses that string. Since badNatVal gets printed as 4294967296, the resulting proof no longer typechecks. (By contrast, SafeVerify uses .olean files, which are able to represent the malformed expression 'correctly'.)

Jason Rute (Jan 04 2026 at 23:06):

For SafeVerify, what does the challenge file look like?

Jason Rute (Jan 04 2026 at 23:08):

So the exploit is changing the declaration, but in a way that SafeVerify can’t detect?

Jason Rute (Jan 04 2026 at 23:10):

Does lean4checker work because it is a valid proof, but just of a different declaration? Or is the proof invalid?

GasStationManager (Jan 04 2026 at 23:25):

Thanks @James E Hanson for finding this! It does appear to pass SafeVerify. I would like to understand this more, to see to what extent this could be fixed in SafeVerify or elsewhere.

To me the surprising thing about the example is that decide worked. From what I understand, decide here is unfolding opaque defs in a way that ultimately leads to an unsafe def. But the 'unsafe' is not propagated; the run_elab makes a declaration that is marked 'safe'...

If the run_elab is replaced with a normal def, then decide doesn't work.

Henrik Böving (Jan 04 2026 at 23:46):

What's happening is that the unsafeCast produces a Lean object at the C level that confuses the part of the kernel that is capable of handling natural numbers natively to the point where the logic implementing the Nat.lt and Nat.le operations for decide returns true in both cases. It's not really surprising that this only works with run_elab because otherwise decide will never get to see this malformed object.
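Henrik's point about the C-level representation can be illustrated with a small sketch (not from the thread; `bogus` is a made-up name, and the observed behavior is unspecified and platform-dependent):

```lean
-- Small Lean integers are unboxed scalars at the C level. `unsafeCast` simply
-- reinterprets the boxed representation, so a negative `Int` can masquerade as
-- a `Nat` with a bit pattern no genuine `Nat` ever has. Code that trusts the
-- representation (such as the kernel's native Nat fast path, once such a value
-- is smuggled into a `Nat` literal) can then return nonsense for `<` and `≤`.
unsafe def bogus : Nat := unsafeCast (-1 : Int)

-- Evaluating or comparing `bogus` yields unspecified results; don't rely on it.
#eval bogus == 0
```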

This is precisely the kind of exploit that I was hoping to prevent in Comparator by being more paranoid about the input format. The key design principle of Comparator is that the olean (and even the lean) file are to be considered malicious input. Mapping malicious input into our memory and blindly dereferencing pointers without any validation mechanism can and will go wrong and that is exactly what importing an olean file does. (Though I had expected it to be some cool trick with pointers and not this easy to be honest :sweat_smile:)

James E Hanson (Jan 05 2026 at 00:23):

Jason Rute said:

For safe verify, what does the challenge file look like?

import Mathlib.Data.Nat.Basic
import Mathlib.Tactic
import Mathlib.NumberTheory.FLT.Basic
import Lean

open Lean

theorem FLT : FermatLastTheorem := sorry

James E Hanson (Jan 05 2026 at 00:30):

Jason Rute said:

Does lean4checker work because it is a valid proof, but just of a different declaration? Or is the proof invalid?

It's a valid proof in the same sense that missingno is a real pokemon.

GasStationManager (Jan 05 2026 at 00:53):

@Henrik Böving good points, and I agree that what Comparator does is safer. But at the same time this also feels like a bug. If the kernel wants to be able to handle natural numbers natively, then it is worth either making that handling robust or leaving it to the compiler / native_decide.

James E Hanson (Jan 05 2026 at 00:55):

I'm not sure whether the devs are going to want to do something about this because it might be a performance issue to sanitize declarations that are passed to the kernel.

James E Hanson (Jan 05 2026 at 00:55):

I don't really know though, obviously.

Henrik Böving (Jan 05 2026 at 01:03):

The natural numbers are a red herring here. Exploits of the same kind are almost certainly doable by crafting olean files where expressions point to memory regions they aren't supposed to. I suspect the only reason this hasn't happened yet is that we do not have enough people with the necessary skillset and willingness in our community to figure out how to do it (look into e.g. ROP exploit chains on browsers; people go to extreme lengths of sophistication when sufficiently motivated). You cannot expect the kernel to somehow guard every pointer dereference it ever makes. Allowing an attacker to inject arbitrary things into your memory that you then start playing around with is half of the effort that goes into writing arbitrary-code-execution exploits against "normal" programs, and any verifier that imports oleans is giving this away for free by design. The only means of defending against this type of attack consistently is to not load olean files.

GasStationManager (Jan 05 2026 at 01:27):

@Henrik Böving I tend to agree with you there. The kernel is implicitly trusting that the environment does not contain malformed objects. (But how do you define well-formed vs. malformed? Is the official definition whatever remains valid after a round trip through lean4export?)

At the same time, this natural number example is perhaps making it too easy. native_decide would not have gotten confused by this, just sayin.

James E Hanson (Jan 05 2026 at 01:41):

GasStationManager said:

native_decide would not have gotten confused by this, just sayin.

import Mathlib.Data.Nat.Basic
import Mathlib.Tactic
import Mathlib.NumberTheory.FLT.Basic
import Lean

open Lean

unsafe def badNatUnsafe : ℕ := unsafeCast (-4294967296 : ℤ)

@[implemented_by badNatUnsafe] opaque badNatVal : ℕ

run_elab
  let name := .str .anonymous "badNat"
  addDecl <| .defnDecl {
    name        := name
    levelParams := []
    type        := .const ``Nat []
    value       := .lit <| .natVal badNatVal
    hints       := .opaque
    safety      := .safe
  }
  Lean.setImplementedBy name ``badNatUnsafe

theorem FLT : FermatLastTheorem := by
  have truly_marvelous_0 : badNat > 9223372036854775807 := by native_decide
  have truly_marvelous_1 : badNat < 9223372036854775808 := by native_decide
  trivial

#print axioms FLT
-- 'FLT' depends on axioms: [propext, Lean.ofReduceBool, Lean.trustCompiler]

Web version.

James E Hanson (Jan 05 2026 at 01:41):

I guess maybe this isn't the same 'this' because I'm adding in an implemented_by declaration.

James E Hanson (Jan 05 2026 at 01:43):

But really I worked out this example in part by thinking about easier examples of the same kind of thing with native_decide.

GasStationManager (Jan 05 2026 at 02:02):

@James E Hanson Very nice, I stand corrected. I guess once we see native_decide and implemented_by in the same file we should expect that anything can happen; but still, this is something I haven't seen before, very interesting.

James E Hanson (Jan 05 2026 at 02:06):

Yes, it's very easy with both of them:

import Mathlib

@[implemented_by false]
def tralse : Bool := true

example : False := by
  have : tralse = true := by rfl
  have : tralse = false := by native_decide
  trivial

Web version.

GasStationManager (Jan 05 2026 at 02:18):

And at least some of the SafeVerify-based proof checking backends do ban implemented_by, though at the source level. Example

GasStationManager (Jan 05 2026 at 03:29):

I have just pushed to SafeVerify a fix that covers this case.

In more detail: it has always been designed / intended to ban the use of unsafe and partial definitions, since they are known to cause soundness issues. But previously I wasn't checking them properly; the correct way is to check before passing to Environment.replay, but I was doing it after. Since Environment.replay was in fact skipping unsafe/partial constants, they weren't in the environment after the replay. In this example, the unsafe def badNatUnsafe ends up sneaking into the proof via implemented_by. After the fix, SafeVerify is able to detect the unsafe def and throws an error.
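A rough sketch of the kind of pre-replay check described above (illustrative only; `rejectUnsafe` is a hypothetical name, not SafeVerify's actual API):

```lean
import Lean

open Lean

-- Walk the constants of the *incoming* (untrusted) environment and fail on any
-- unsafe declaration before anything is handed to `Environment.replay`.
-- Checking afterwards is too late: replay skips unsafe/partial constants, so
-- they no longer appear in the replayed environment.
def rejectUnsafe (consts : List (Name × ConstantInfo)) : Except String Unit := do
  for (name, info) in consts do
    if info.isUnsafe then
      throw s!"unsafe constant not allowed: {name}"
```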

Jason Rute (Jan 05 2026 at 03:32):

Besides this, what soundness issues do unsafe and partial cause?

Jason Rute (Jan 05 2026 at 03:33):

I ask because someone may want to use partial to write a function which is used in a custom tactic.

Aaron Liu (Jan 05 2026 at 03:41):

partial doesn't cause any soundness issues

Aaron Liu (Jan 05 2026 at 03:42):

Jason Rute said:

I ask because someone may want to use partial to write a function which is used in a custom tactic.

you can use whatever functions you want in the implementation of a tactic (even unsafe functions!), and it will not cause any problems as long as the proof terms outputted by the tactic don't contain the bad functions
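As a sketch of this point (hypothetical example, not from the thread): a `partial` function can safely drive a tactic's search, since the kernel only ever checks the proof term the tactic ultimately produces.

```lean
-- A `partial` function used purely for search: it may in principle run forever,
-- but it never appears in any proof term, so the kernel never sees it.
partial def findWitness (p : Nat → Bool) (n : Nat := 0) : Nat :=
  if p n then n else findWitness p (n + 1)

-- A tactic built on this would use `findWitness` to pick a candidate and then
-- emit an ordinary, kernel-checkable proof (e.g. via `decide`) for that candidate.
#eval findWitness (fun n => decide (n * n > 50))  -- → 8
```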

Yaël Dillies (Jan 05 2026 at 08:37):

James E Hanson said:

Web version.

FYI you don't need to include this link. Hover over the top right of any code snippet to find it.

Vlad Tsyrklevich (Jan 05 2026 at 08:39):

Henrik Böving said:

I suspect the only reason this hasn't happened yet is that we do not have enough people with the necessary skillset and willingness in our community to figure out how to do it

The threat model here is not very clear to me. The AI companies care about some lean proof being gamed, but is it the case for anyone else? Why should I care?

My threat model w.r.t. Lean is more about the security of my own device, so I am much more concerned about the fact that to really read a proof you need to run it in interactive mode, where another person's code could backdoor you. The entire community is vulnerable to this, so a watering hole attack against e.g. a reviewer with commit access to mathlib, whose code I do run, means I am also vulnerable even if I don't run untrusted code myself. We could sandbox Lean to disallow access to network system calls, but this would probably break some functionality I haven't thought of. :shrug: (Sandboxes are also not perfect, but given how small our community is, I imagine this would be a sufficient step for quite a while.)

GasStationManager (Jan 05 2026 at 09:24):

Jason Rute said:

I ask because someone may want to use partial to write a function which is used in a custom tactic.

Looking back at earlier discussions, I think my issue with partial mainly stems from SafeVerify's original use case in checking executable code with correctness proofs, where partial functions are problematic at least due to their obvious potential for non-termination.

Since most subsequent use cases are for pure proofs, and partial does have legitimate use in writing tactics, I've pushed an update so that SafeVerify only disallows partial if an optional command-line flag (--disallow-partial) is set.

GasStationManager (Jan 05 2026 at 09:37):

Aaron Liu said:

you can use whatever functions you want in the implementation of a tactic (even unsafe functions!), and it will not cause any problems as long as the proof terms outputted by the tactic don't contain the bad functions

But if we allow unsafe, this allows exactly the kind of examples given by @James E Hanson, which can easily be packaged into a tactic. Then how can we tell if the resulting proof terms are not corrupted? Either (a) the kernel accepting such a proof is a bug in Lean, and needs to be fixed; or (b) this is beyond the kernel's responsibility.

Henrik Böving (Jan 05 2026 at 10:32):

But how do you define well-formed vs malformed?

I think that's quite easy: built from the constructors of the safe Lean language. That is essentially the invariant that round-tripping through lean4export via a safe Lean program ensures. After all, the kernel already assumes this anyway. For example, it expects that the hash code stored in an Expr node is in fact the real hash code for that node; the equality check makes judgements based on this value https://github.com/leanprover/lean4/blob/master/src/kernel/expr_eq_fn.cpp#L57. I am almost willing to bet that someone here can build a handcrafted Expr with a custom hash code (or other computed field) that does something funny.

The threat model here is not very clear to me. The AI companies care about some lean proof being gamed, but is it the case for anyone else? Why should I care?

This particular threat model is, for example, intended for people that run automated competitions etc. on a blockchain. More generally speaking, though, it's of course also a bit of a marketing thing in the end. Comparator is supposed to be the ultimate official judge :tm: that you cannot trick unless you actually find a bug in the kernel while its invariants are upheld. On the larger scale it is not feasible for us to go and correct people on the internet who do cheap tricks every time. So we built software to do that.

My threat model w.r.t lean is more about the security of my own device so I am much more concerned about the fact that to really read a proof you need to run it in interactive mode where another person's code could backdoor you. The entire community is vulnerable to this, meaning that watering hole attacks against e.g. reviewers with commit access to mathlib whose code I do run means I am also vulnerable to this even if I don't run untrusted code. We could sandbox lean to disallow access to network system calls but this would probably break some functionality I haven't thought of. :shrug: (sandboxes are also not perfect, but for how small our community is I imagine this would be a sufficient step for quite a while.)

Very fair! Fortunately Comparator also has you covered here, as it tries to limit the read and write access to the environment you have https://github.com/leanprover/comparator/blob/master/Main.lean#L44-L46. I will agree this is very coarse-grained, and if people are willing to figure out a better set of permissions I'd be more than happy to merge that. However, the current set of permissions does already block any kind of network access. So I guess the worst thing they can do is read from somewhere in your system and then write that somewhere? I'm a bit concerned about the fact that we currently allow read and write access to /dev, given the various kinds of things you can do in that file system by just writing. It might be worth figuring out a smaller set of permissions here.

This approach could of course also be adapted to your more general Lean processes if you are sufficiently scared and I think e.g. Joachim has done that or something like it on his machine?

More generally speaking, Comparator is designed in the style you see in daemons from the OpenBSD community, where you try to separate out processes and privdrop as much as possible in order to gain some defense in depth. If you have more ideas for what we could separate or privdrop, I'd be more than happy to hear them! One thing I had in mind is that I would actually like to fork lean4replay into a separate process as well, but unfortunately our way of ref counting makes this sort of thing kind of difficult. That said, e.g. the nanoda support that will eventually come in the future is going to run in a separate, even more harshly privdropped process.

Henrik Böving (Jan 05 2026 at 10:44):

But if we allow unsafe, this allows exactly the kind of examples given by @James E Hanson, which can easily be packaged into a tactic. Then how can we tell if the resulting proof terms are not corrupted? Either (a) the kernel accepting such a proof is a bug in Lean, and needs to be fixed; or (b) this is beyond the kernel's responsibility.

You should view the Lean program as you use it on a daily basis as essentially a fast path that assumes its user is kind and well-intentioned, and can thus cut lots of corners to make your experience fast. Which is to say, the kernel simply assumes its invariant of "everything is built from constructors", and as long as you uphold that invariant everything is nice and fast. Once you drop that assumption and are dealing with an adversarial user, we know that it has severe issues, and we know these issues are systemic and likely unfixable or very hard to fix. This is precisely where checkers come into play.

Note that this is almost certainly not unique to Lean; I for example suspect that with enough hackery on the object representation of things in the Rocq kernel you will be able to pull off similar exploits. Just like our kernel, the Rocq kernel does of course assume that its objects are built using the constructors (after all, it's an OCaml program, why would they not be!) and it can likely be tricked into doing interesting things by fiddling with that assumption.

Vlad Tsyrklevich (Jan 05 2026 at 11:21):

Henrik Böving said:

On the larger scale it is not feasible for us to go and correct people on the internet that do cheap tricks every time.

Yeah I think this model just doesn't speak to me (or I expect the average formalizer.) Blockchain-anything isn't happening and there are plenty of people having AI delusions that will not be convinced by anything anyway.

Henrik Böving said:

So I guess the worst thing they can do is read from somewhere in your system and then write that somewhere?

I mean, this can be bad (e.g. plenty of auto-scanning daemons exist, like anti-virus software, that are notoriously buggy and could be used to move laterally), but it is generally fine. The actual worst case is that they have a kernel 0day in the limited system functionality exposed to them, but I am very happy to reduce the problem of compromising me to having to exploit real bugs. That is anyway the case with my browser and all of my other network-connected software.

Henrik Böving said:

This approach could of course also be adapted to your more general Lean processes if you are sufficiently scared and I think e.g. Joachim has done that or something like it on his machine?

I could go out of my way to do this, but without security-by-default it's still problematic, because I am vulnerable to watering hole attacks on any number of other people in the ecosystem who are just doing the default (i.e. nothing).

Henrik Böving said:

If you have more ideas for what we could separate or privdrop I'd be more than happy to hear that!

I'm not familiar with landrun but assuming it is using kernel-level system call filtering then that is the gold standard for sandboxing. Now I'd just like the equivalent functionality to be integrated into VSCode.

Henrik Böving (Jan 05 2026 at 11:44):

Blockchain-anything isn't happening

Well, people have already done it with e.g. https://theorem-marketplace.com/ though with minuscule amounts of money, I agree.

there are plenty of people having AI delusions that will not be convinced by anything anyway.

That is true; still, when people post things like the above on Twitter and claim that Lean is deeply flawed or w/e, it is helpful to have a judge like this around that cannot be fooled. The marketing here is not directed towards the crazy people but meant to assure the larger audience that indeed, if you need to be sure, you can be sure. It should be possible to verify with the highest degree of certainty that a non-trivial result in Lean is indeed correct in the theoretical sense of correct.

I'm not familiar with landrun but assuming it is using kernel-level system call filtering then that is the gold standard for sandboxing.

yes

Now I'd just like the equivalent functionality to be integrated into VSCode.

I think this is something that might end up happening in the longer term future. Doing this for a full VSCode setup in a way that works cross platform is of course quite a lot more engineering effort compared to "we want to run these specific actions on a Linux machine"

Vadim Fomin (Jan 05 2026 at 11:48):

though with miniscule amounts of money I agree.

It's actually on play money right now (Sepolia ether), specifically because I wasn't confident it was robust, and thought the community might want to play around with it and see how that goes without touching actual finances that might be lost to vulnerabilities. (I considered moving it to actual money if I saw it was of use to anyone, but right now I think it was more of a toy.)

Martin Dvořák (Jan 05 2026 at 12:00):

James E Hanson said:

It's a valid proof in the same sense that missingno is a real pokemon.

At least in the case of Pokémon, a team containing MissingNo will be automatically rejected from an official tournament. Similarly, it would be nice to have all proofs containing the exploit automatically rejected by software, so that people don't have to check proofs for the exploit manually.

Sebastian Ullrich (Jan 05 2026 at 13:05):

Martin, that software already exists. It is being discussed in this very thread.


Last updated: Feb 28 2026 at 14:05 UTC