Zulip Chat Archive

Stream: general

Topic: Lazy proofs?

Thomas Murrills (Oct 30 2023 at 21:47):

I was wondering if there's any systematic way to write lazy proofs to calculate something by brute force. For example, let's say we're trying to prove that the Ramsey number $R(4,3) \leq 9$ by naive brute force. By this I mean simply looking at every dichromatic edge labeling of the complete graph on 9 vertices and finding either a red 4-clique or a blue 3-clique in each. (Pretend for now that the only way is the naive way, as I just want to use it as a toy example.)

There are over 68 billion labelings, so we don't want to have all the cliques we find present in memory as part of some term at once. We don't even want to wind up with a proof that involves a list with 68 billion small terms. So, I figure the search program ought to run in the kernel when we typecheck a small term, instead of being a program that unfolds to a sensible "constructive" term. But how do we actually write a "brute-force search" program that runs as part of the typechecking of a proof in the kernel? (Or is there another way to do it altogether?)

(Has this been discussed already? I searched Zulip but came up empty.)

Mario Carneiro (Oct 30 2023 at 21:53):

If you set things up to use kernel computation, it is possible, though very difficult, to get transient terms. I think the main challenge you will get with this method is that the caches will get filled up whether you want it or not

Mario Carneiro (Oct 30 2023 at 21:55):

Running programs in the kernel isn't that hard: just write a function which computes a Bool and assert it equals true by rfl

Mario Carneiro (Oct 30 2023 at 21:57):

(or by native_decide if you're feeling spicy. This will be much more likely to have transient terms)

Kyle Miller (Oct 30 2023 at 22:05):

@Scott Morrison and I were experimenting with taking advantage of the kernel to prove known mersenne primes up through ones found in the 1980s https://github.com/leanprover-community/mathlib4/blob/lucas_lehmer_tailrec/Archive/Examples/MersennePrimes.lean

Here's the program run by the kernel, and this bit of generated proof is what causes it to be computed.

Everything fits in memory, but it's not generating a giant proof term -- it's just reducing sMod'' to some natural number.

Thomas Murrills (Oct 30 2023 at 22:08):

Mario Carneiro said:

Running programs in the kernel isn't that hard: just write a function which computes a Bool and assert it equals true by rfl

Can you capture mathematical content with this strategy, though? I.e. how do you actually show you've e.g. exhausted all possibilities, and/or what's to prevent someone from just saying true in place of running the search program?

Kyle Miller (Oct 30 2023 at 22:12):

You have to have a theorem that this function indeed computes a mathematically meaningful value.

Kyle Miller (Oct 30 2023 at 22:12):

For the Lucas-Lehmer test, that theorem is here

Thomas Murrills (Oct 30 2023 at 22:15):

Ah, ok, neat! I get it now. So is the issue "just" (1) to ensure we have transient terms and (2) to somehow not fill up the cache?

Thomas Murrills (Oct 30 2023 at 22:24):

This is probably not something we can do yet, but just to ask...is there any possibility of using a fast extern program somehow which is verified in lean to do what we think it does? Do we have any means of using lean to verify non-lean programs, yet? (Or can you not even intervene with an extern program during typechecking in the kernel?)

Mario Carneiro (Oct 30 2023 at 22:25):

that's kind of what native_decide is for

Mario Carneiro (Oct 30 2023 at 22:26):

there are ideas for something intermediate where you use the lean compiler but turn off all the extra features that lead to a larger TCB like implemented_by, extern and opaque

Thomas Murrills (Oct 30 2023 at 22:30):

Hmmmm. The appeal for me with brute force checking is really being sure that the code is doing what it says, seeing as it's impossible to examine fully how it's working in these cases. so having the smallest TCB is pretty important

Thomas Murrills (Oct 30 2023 at 22:31):

Also with native_decide you can't create your own extern functions and then prove things about them, right? That's really what I'm wondering.

Mario Carneiro (Oct 30 2023 at 22:38):

I'm not sure what you mean, you can make any function you want, subject to the constraint that it is the lean compiler that is compiling it

Mario Carneiro (Oct 30 2023 at 22:38):

Thomas Murrills said:

Hmmmm. The appeal for me with brute force checking is really being sure that the code is doing what it says, seeing as it's impossible to examine fully how it's working in these cases. so having the smallest TCB is pretty important

Like Kyle said, if you have a Bool function for doing a brute force search, it's typically easy to prove a general theorem that says it does what it is supposed to do

Thomas Murrills (Oct 30 2023 at 22:39):

~~What about the issues with transient terms and the cache that you mentioned?~~ Oh, maybe I misread the scope of your message.

Mario Carneiro (Oct 30 2023 at 22:43):

import Mathlib.Data.Nat.Basic

def foo : Nat → Bool
  | 0 => true
  | n+1 => n * n < 100 && foo n

theorem foo_correct (n) (h : foo n = true) : ∀ i, i < n → i * i < 100 := by
  induction n with
  | zero => decide
  | succ n ih =>
    simp [foo] at h
    exact Nat.forall_lt_succ.2 ⟨ih h.2, h.1⟩

example : ∀ i, i < 10 → i * i < 100 := foo_correct _ rfl

Mario Carneiro (Oct 30 2023 at 22:44):

this is more or less how Decidable instances work

Mario Carneiro (Oct 30 2023 at 22:46):

If you use set_option trace.Meta.isDefEq true in on the example and scroll to the very last ✅ foo 10 = foo 10 =?= foo 10 = true subgoal you will see it evaluating the search one step at a time

Thomas Murrills (Oct 30 2023 at 22:47):

Oh, ok! Is that happening in the kernel or during elaboration? My expectation would be that that's an elaboration-time thing. (Is it both?)

Mario Carneiro (Oct 30 2023 at 22:47):

Both

Mario Carneiro (Oct 30 2023 at 22:47):

You can't trace the kernel evaluation (unless you stick a dbg_trace in lean4lean)

Thomas Murrills (Oct 30 2023 at 22:48):

That latter point was what I thought, yeah

Mario Carneiro (Oct 30 2023 at 22:48):

You can skip the elaborator evaluation if you write a suitable tactic

Thomas Murrills (Oct 30 2023 at 22:48):

Right, makes sense

Thomas Murrills (Oct 30 2023 at 22:49):

So I guess the issue to expect is mainly the cache issue, right? Is there any way around that (without trusting more code)?

Mario Carneiro (Oct 30 2023 at 22:49):

well, that and also the kernel WHNF is not very good

Mario Carneiro (Oct 30 2023 at 22:50):

but modulo constant factors maybe you can ignore that

Mario Carneiro (Oct 30 2023 at 22:50):

If you don't write the function in just the right way it will build up a thunk during the search instead of evaluating everything like you would expect

Mario Carneiro (Oct 30 2023 at 22:51):

I'm not even sure it is possible to force everything that needs to be forced and prevent a buildup of terms

Thomas Murrills (Oct 30 2023 at 22:54):

Mario Carneiro said:

I'm not sure what you mean, you can make any function you want, subject to the constraint that it is the lean compiler that is compiling it

In the message above this I was considering specifically the ability to not have the lean compiler compile it! (I'm exploring several possibilities at once here.) That is, you write your code in another (faster) language, and prove something about its behavior (or the behavior of its compiled version) in Lean—and then use it in Lean to do the search you need to do.

I'm not expecting this to exist yet, it would just be cool—although I imagine it'd require extensive modeling of e.g. the resulting machine code (or whatever level would be appropriate) and demand conditions on the machine you're using to run it. It could be worth it, though, if you need really fast code to do cutting-edge searches, and want to verify that it's correct. I imagine this is a thing that people might want to use Lean for—and it'd be great to be able to compose existing "fast external algorithms" in a modular fashion in lean (and compose the proofs about them) instead of writing new code and verifying it from scratch each time.

Is anyone working on e.g. verifying binaries in Lean (or anything similar)? I feel like I saw some mention of something like that on Zulip... :eyes:

Mario Carneiro (Oct 30 2023 at 23:05):

Thomas Murrills said:

In the message above this I was considering specifically the ability to not have the lean compiler compile it! (I'm exploring several possibilities at once here.) That is, you write your code in another (faster) language, and prove something about its behavior (or the behavior of its compiled version) in Lean—and then use it in Lean to do the search you need to do.

Obviously this proof is only as good as the assertion that the lean model of the function matches the external implementation, but sure you can do this. You would use native_decide and an extern function, and then have that extern function do FFI to your function defined in whatever language

Kyle Miller (Oct 30 2023 at 23:18):

Mario Carneiro said:

I'm not even sure it is possible to force everything that needs to be forced and prevent a buildup of terms

I'd found a way to at least force Nat expressions for the Lucas-Lehmer calculations:

/-- Ensure the `Nat` is reduced before evaluating `f` -/
def nat_seq (f : Nat → α) : Nat → α
  | 0 => f 0
  | n => f n

But that's just Nats, and it contributes to the stack depth limit iirc.

Thomas Murrills (Oct 30 2023 at 23:20):

Mario Carneiro said:

Thomas Murrills said:

In the message above this I was considering specifically the ability to not have the lean compiler compile it! (I'm exploring several possibilities at once here.) That is, you write your code in another (faster) language, and prove something about its behavior (or the behavior of its compiled version) in Lean—and then use it in Lean to do the search you need to do.

Obviously this proof is only as good as the assertion that the lean model of the function matches the external implementation, but sure you can do this. You would use native_decide and an extern function, and then have that extern function do FFI to your function defined in whatever language

I mean something a little deeper, I think: instead of having an external function with a lean model which we just assert is equivalent, we actually prove that the external function does what we want, in lean. This would mean being able to model the actual machine code in lean, with the idea being that we're trusting only very simple facts about how the machine code works. (This means, of course, that we'd also be using and trusting facts about what it's running on, which means a certain degree of platform-dependence, but that's inevitable.)

Lessness (Oct 30 2023 at 23:25):

Coq has Verified Software Toolchain, which does something similar.

It will take some time until Lean also makes something like this...

Mario Carneiro (Oct 30 2023 at 23:28):

I've also been working on a project to do verified programming for a while, but it will target lean via translation / code generation (it's not directly written in lean because I don't want lean in the TCB)

Last updated: May 02 2025 at 03:31 UTC