Zulip Chat Archive

Stream: general

Topic: Minimally exporting a theorem from a repository


Terence Tao (Sep 16 2025 at 19:32):

Suppose Github repository X manages to formalize a proof of Theorem T using Mathlib version V. Meanwhile, Github repository Y, using Mathlib version V', wants to use Theorem T to prove some other theorem T'. Right now, one has the option of copying the entire formalization of T and its proof from repository X to repository Y, with whatever updates are needed to make it compatible with Mathlib V' instead of Mathlib V, and then import that file as part of the proof of T'.

But a more lightweight option would be if there were an automated way to export the statement of T as an axiom in a file that uses as minimal a dependency on Mathlib as possible, so that the proofs in repository X give a formalized proof that the axiom in that exported file is a theorem. Then, one could import that file into repository Y; and if one is lucky, the minimal Mathlib dependency has not changed (up to definitional equivalence) from version V to version V', so the axiom in repository Y has exactly the same type as the one that is proved in repository X. Then, one could complete the formal proof of T' in repository Y using this axiom file as an import, and the combination of the formalizations in X and Y does give a completely Lean-certified proof of T', as basically the nearly disjoint union of two repositories (glued together by the minimal Mathlib import).

[EDIT: in the case where the minimal Mathlib import differs in some places from version V to version V', one could imagine a modified import in which any type that is defined the same in both versions is unchanged, but types that are defined differently would be introduced with a slightly different name (e.g., the imported version of a type SomeObject might be introduced as SomeObject'), and then one would have to introduce some helper lemmas creating suitable equivalences between SomeObject and SomeObject' in order to make the imported version of T match the version of T one would actually want to use.]

Is it feasible to automate this export and import process, and is the minimal core of Mathlib needed to do this stable enough that this would be a useful way for one repository to be able to formally cite another?

Ruben Van de Velde (Sep 16 2025 at 19:35):

I would personally be surprised (but maybe not negatively so) if this direction produced something fruitful

Terence Tao (Sep 16 2025 at 22:17):

I guess I am basically asking for four tools:

  1. A tool that, given a repository that proves a complicated theorem X:Prop from Mathlib (and possibly some other axioms), generates a minimal import file that defines the type X but does not provide a term of that type from the given axioms (supplying such a term is what the repository itself accomplishes).
  2. A tool that takes such a minimal import file defining a type X:Prop as input, defined using one version of Mathlib, and produces an analogous file, defined using a different version of Mathlib, that defines a type definitionally equal to X, using naming conventions as similar as possible to the original file (so that types defined in the first version of Mathlib are replaced with the corresponding types in the second version of Mathlib, whenever they are definitionally equal).
  3. A tool that, when given two separate files defining two separate types, can check whether these types are definitionally equal or not (I assume this capability already exists within Lean).
  4. A tool that, when given one repository that supplies a proof of X:Prop, and another repository that assumes as an axiom something definitionally equivalent to X to prove some other result Y:Prop, can combine the two to provide a larger but self-contained proof of Y.

3 and 4 seem quite doable to me; 1 requires some effort but should be feasible; and 2 is perhaps the one that is a bit tricky unless the core definitions of Mathlib are quite stable.
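
As a toy illustration of the kind of check tool 3 would perform (Lean's kernel already decides definitional equality when elaborating rfl; the statements X and X' below are placeholders, not real exported theorems):

def X  : Prop := ∀ n : Nat, n + 0 = n   -- statement as exported from one repository
def X' : Prop := ∀ n : Nat, n + 0 = n   -- statement as restated against another Mathlib version

example : X = X' := rfl   -- accepted exactly when the two definitions are definitionally equal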

Kim Morrison (Sep 16 2025 at 22:39):

I think generally here axiom is unhelpful: it doesn't really add anything.

Better is to just ask in 4. that one repository supplies x : X where X : Prop, and the other repository supplies h : X' → Y, where X and X' are definitionally equal (I think defeq is also a red herring, and we don't lose anything by just saying X' := X).

Then of course it is easy to produce a new repository with a proof of Y. You just depend on both repos and write theorem h' : Y := h x. (Lean just automatically checks the defeq between X and X' at this point.)
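
A minimal self-contained sketch of this glue step (the statements here are stand-ins; in practice X and Y would be the actual exported statements from the two repositories):

def X : Prop := ∀ n : Nat, n + 0 = n
def Y : Prop := ∀ n : Nat, 0 + n = n

theorem x : X := fun _ => rfl                     -- supplied by the first repository
theorem h : X → Y := fun _ n => Nat.zero_add n    -- supplied by the second repository (with X' := X)

theorem y : Y := h x   -- the glue repository; Lean checks the defeq of the statements here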

Kim Morrison (Sep 16 2025 at 22:40):

But what I'm missing is what the spec of step 2 is. Is it not just: "bump the dependencies, fix errors, and review to check that the meaning did not change"?

Terence Tao (Sep 16 2025 at 22:46):

The problem is that if the original repository proved X:Prop using an old version of Mathlib, and some key definitions changed between that old version and the current one, then after bumping, the type of X might not be definitionally equal to the one that the original repository was working with, but instead some slightly different X'. So in such cases one would need to define both X and X' in the new repository and have some lemma that proves that X ↔ X'.

But one would not want to import both the old version and the new version of Mathlib together to state and prove X ↔ X': one would need two versions of Nat, two versions of Real, and so on, and it would be a namespace mess. But presumably many of the types needed to define X (or X') would coincide: the old version of Nat should be defeq to the new version of Nat, etc. So to a large extent one could use most of the new version of Mathlib to define the old version of X; there may only be a small number of types for which one has to explicitly re-introduce an older definition (using a slightly different name) in order to define X in the new version of Mathlib. This should hopefully make the lemma X ↔ X' relatively easy to prove: one just needs to prove some equivalences for the few types whose definitions have changed from one version of Mathlib to the next.

Terence Tao (Sep 16 2025 at 22:59):

For instance, suppose one wants to cite a repository that proves a difficult result that all widgets are quasistable:

theorem TheoremA (W : Widget) : IsQuasiStable W

where Widget and IsQuasiStable are defined somewhere in Mathlib. The idea is that tool 1 would provide a file that says something like

import Mathlib.Widget
import Mathlib.QuasiStable

def TheoremA : Prop := ∀ W : Widget, IsQuasiStable W

that one could hopefully just drop into the new repository as an import and prove theorems like TheoremA → TheoremB as you say. But suppose that the definition of Widget has recently changed in Mathlib to something that is not definitionally equal to the old definition. Then one now also has to import the deprecated definition of Widget and check compatibility:

import Mathlib.QuasiStable

def Widget' := ... -- this is the deprecated definition of Widget.  But `IsQuasiStable` remains unchanged

def TheoremA' : Prop := ∀ W : Widget', IsQuasiStable W -- this was proven in the previous repository

theorem theoremA_bump : TheoremA' ↔ TheoremA := sorry -- a proof needs to be provided here

The purpose of tool 2 is to provide such a file (and ideally, try to fill in the sorry, though perhaps that still has to be done by hand).
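
Schematically, a downstream repository that depends on both could then derive the current form of the statement from the bumped one (here proofA' is a hypothetical name for the proof of TheoremA' certified by the previous repository):

theorem theoremA_current (proofA' : TheoremA') : TheoremA := theoremA_bump.mp proofA'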

Lawrence Wu (llllvvuu) (Sep 16 2025 at 23:29):

The algorithm that comes to mind here is content-addressing / hash-consing. I saw something about that from zero-knowledge folks a while back, I believe it was https://github.com/argumentcomputer/yatima authored by @Arthur Paulino

For example, you would put something like

def KnownFact : Prop := ...

#assert_hash_eq KnownFact 0123456789abcdef

This would protect against drift in the definition of KnownFact as desired, as long as the hashing scheme is stable.

I guess the zero knowledge idea has an even grander aim, which is to be able to do something like

def KnownFact : Prop := ...

#assert_hash_eq KnownFact 0123456789abcdef
#check_zk_proof 0123456789abcdef fedcba9876543210

which would also be stable

Terence Tao (Sep 17 2025 at 00:19):

I like this approach! If the hash is invariant with respect to minor updates, such as renaming terms in Mathlib while maintaining defeq, then this could be stable enough to use results as-is in many cases and only need minor bump lemmas in other cases.

Arthur Paulino (Sep 17 2025 at 00:20):

@John Burnham check this out

Terence Tao (Sep 17 2025 at 00:27):

It may be desirable to have a hash that can somehow ignore applications of propext. E.g., if some object is constructed using some proof of a side condition, and a Mathlib bump ends up changing the proof (but not the statement) of that side condition, it would be nice if this didn't affect the hash.

Eric Wieser (Sep 17 2025 at 00:33):

Having a useful hash that is stable across Lean + Mathlib updates sounds pretty unlikely to me; you run into immediate trouble with typeclass instances. Consider the + in the statement of FLT:

  • Does Nat.instAdd have the same hash as Nat.instSemiring.toAdd? How can we ensure this while banning a + that is secretly *?
  • If we add a new op_nsmul field to Semiring (something I've proposed), how can we preserve the hash of Nat.instSemiring when even the type has changed?

Lawrence Wu (llllvvuu) (Sep 17 2025 at 00:45):

Does Nat.instAdd have the same hash as Nat.instSemiring.toAdd?

IIUC the aspiration here is yes, provided they reduce to the same recursor, though I don’t have a strong grasp of the performance implications of the amount of unfolding, reduction, and normalization involved. One could even imagine a model where f and g are not defeq and have different hashes, but f x and g x are defeq and have the same hash.

Actually, I believe the instAddNat vs Nat.instSemiring.toAdd example does not need unfolding; indeed they already #reduce to the same expr. The example of adding a structure field is one that would require (while hashing downstream declarations) unfolding, to simplify the projection and use the unchanged hash of the existing field instead of the changed hash of the overall structure. But it is worth noting that even if this isn't done for computational reasons, it would only lead to spurious hash changes; the system would still be sound.
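
For concreteness, the experiment is just something like the following (the exact instance names above are shorthand, so this compares the core instance with one recovered through Mathlib's algebraic hierarchy; treat the names as assumptions about the current Mathlib):

import Mathlib

#reduce (instAddNat : Add Nat)                    -- the core Lean instance for + on Nat
#reduce (inferInstance : AddSemigroup Nat).toAdd  -- the same + recovered via the algebraic hierarchy
-- If both commands print the same expression, the two instance paths share a normal form,
-- so a reduction-based hash would assign them the same value.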

Terence Tao (Sep 17 2025 at 00:45):

The hash doesn't have to stay invariant across all Lean/Mathlib updates; as long as it can be stable under minor ones (or even major updates, if they don't affect the necessary definitions needed to state a result), I think it can be a viable tool. Suppose for instance that @Kevin Buzzard's FLT project is completed and posts a hash of the FLT statement, but then stops maintaining the repository as Mathlib continues to evolve. Some time later, a new project wants to use FLT, but when computing its own hash of the statement of FLT in the latest version of Mathlib, finds that they no longer agree. But one can do a bisection search in both repositories of intermediate definitions required to state FLT (the proof is irrelevant), and find the divergence - e.g., in some change to Nat.instAdd. Then one can (either automatically or manually) introduce the deprecated version of Nat.instAdd (or whatever) into the new repository and build all the way back to a legacy version of the statement of FLT that matches the hash of Kevin's version, but which can be elaborated in the latest version of Mathlib. Then it is just a matter of deriving the new version of FLT from the legacy one, which could be a tedious task, but probably significantly less time-consuming than bumping the entire proof of FLT to the most recent version of Mathlib.

John Burnham (Sep 17 2025 at 00:46):

We are developing a new version of the Yatima project called Ix https://github.com/argumentcomputer/ix, which I think should provide some of the desired properties.

At a high level, the goal of Ix is to be able to compress the typechecking of Lean proofs into succinct cryptographic proofs. The upshot is that (assuming the protocol is correct) one will be able to take a long, complex proof that may take hours to check (with the corresponding installation of the right toolchain, downloading of the right dependencies, etc.) and generate a small ~1kb proof artifact that can be verified in <100 ms. The fun meme example this enables is to embed a zk-proof of the Lean proof of FLT in a QR code that actually can fit in the margin :-)

It turns out that a very large part of the work to do this, though, is figuring out how to create meaningful cryptographic commitments to Lean4 kernel programs. In Ix the way we do this is with a new binary serialization format for Lean.Environment that turns the graph of Lean constants into a DAG and then hashes up from the leaves. Crucially, this DAG does not depend on Lean.Name, and creates alpha-invariant hashes for constants (with symbolic and other noncomputational metadata stored and hashed separately to allow for decompilation back into regular Lean types).

This means that if you have a proof and someone refactors the dependencies in a way that only affects naming, but not the actual structure, the Ix hash of the proof will be stable.
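
A toy illustration, in this spirit, of name-free bottom-up hashing of terms (Expr' and hashExpr are made-up stand-ins, not Ix's actual serialization format):

inductive Expr' where
  | bvar  : Nat → Expr'              -- a de Bruijn index: no binder names involved
  | const : UInt64 → Expr'           -- a reference to another constant, by its hash
  | app   : Expr' → Expr' → Expr'
  | lam   : Expr' → Expr' → Expr'    -- binder type and body; the binder name is simply omitted

def hashExpr : Expr' → UInt64
  | .bvar i   => mixHash 1 (UInt64.ofNat i)
  | .const h  => mixHash 2 h
  | .app f a  => mixHash 3 (mixHash (hashExpr f) (hashExpr a))
  | .lam t b  => mixHash 4 (mixHash (hashExpr t) (hashExpr b))

Renaming a constant or a bound variable never changes the input to hashExpr, so the hash is stable under such refactors.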

That said, this is for kernel terms after elaboration, so in practice I suspect that tactics and other metaprogramming will cause the hashes to change on many source changes, including imports, possibly renaming, or even entropy, since metaprograms can do IO. So I don't know how useful this will be for you. I do think we should be able, without too much extra work, to use something like Ix to generate a minimal import file for a particular constant and provide a stable hash across versions that do not change the kernel, which is equal for all definitionally equal constants.

John Burnham (Sep 17 2025 at 00:56):

From a programming languages perspective, our approach is essentially an adaptation of the Unison content-addressing model to Lean (https://www.unison-lang.org/docs/the-big-idea/), and when Ix is more mature/stable it should enable a lot of the same dependency management and even distributed computing applications that Unison does.

I do have to caveat, though, that Ix is not stable right now, and while it does currently work for small programs and for all of Prelude and Init, it's not very optimized, and a recent Lean version causes the Ix compiler to OOM (at least on my machine) when you try to compile Lean itself through Ix. But we have some fun toy examples, like a mini zero-knowledge voting app (https://github.com/argumentcomputer/ix/blob/main/Apps/ZKVoting/Prover.lean), an integration with the https://www.iroh.computer/ p2p network, and a very large amount of cryptographic machinery for our backend zk prover and DSL.

If anything in here sounds potentially useful to anyone though, I would absolutely love to talk about it!

Terence Tao (Sep 17 2025 at 01:12):

Once a stable hashing standard is established, one could imagine having some database of hashed theorems from various repositories bundled into Mathlib, an axiom that allows one to assume any theorem whose hash is in this database, and a tactic (analogous to native_decide) that allows one to close goals by hashing them and trying to apply the axiom. This could allow one to scale up the mathematical capability of Mathlib without exploding the size of the codebase or overwhelming the approval process for contributions.
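
A purely hypothetical sketch of what such a setup could look like (every name below is made up for illustration; the real mechanism would live in a trusted tactic and database rather than a user-visible axiom):

opaque statementHash : Prop → UInt64                     -- stand-in for a stable structural hash
def trustedHashes : List UInt64 := [0x0123456789abcdef]  -- bundled database of hashes of cited theorems

axiom ofTrustedHash (p : Prop) (h : statementHash p ∈ trustedHashes) : p
-- a `hash_cite`-style tactic would hash the current goal and close it by applying `ofTrustedHash`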

Bolton Bailey (Sep 17 2025 at 18:59):

one could imagine having some database of hashed theorems from various repositories bundled into Mathlib, an axiom that allows one to assume any theorem whose hash is in this database, and a tactic (analogous to native_decide) that allows one to close goals by hashing them and trying to apply the axiom

Not a very important point, but this is one of those funny things where there are two ways of looking at it, depending on what level of the stack the crypto is placed at. One way of implementing this is to maintain the whole database of theorems in your project and index into it whenever you want to apply one. But if the hash-addressing is tight enough, then you can just construct the conjunction of all the theorems in the database; the hash of the resulting theorem can be considered as a Merklized Abstract Syntax Tree, and the other theorems can be proved from it directly.
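
A toy version of that "one big conjunction" view (the statements are placeholders):

def Thm1 : Prop := ∀ n : Nat, n + 0 = n
def Thm2 : Prop := ∀ n : Nat, 0 + n = n

def Database : Prop := Thm1 ∧ Thm2   -- trust a single hash of this conjunction...

example (db : Database) : Thm2 := db.2   -- ...and recover individual theorems by projection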


Last updated: Dec 20 2025 at 21:32 UTC