Zulip Chat Archive

Stream: Equational

Topic: Checks for correctness

Jason Rute (May 06 2025 at 12:25):

I'm curious, does equational do any of the following pedantic checks (in addition to lean4checker)?

Check that all the proofs in the project avoid the sorryAx axiom (or other axioms besides the standard three)
Check that all 10,657 explicitly true implications in Lean and all 586,925 explicitly false implications in Lean are properly exported to lean4checker (so that you know lean4checker is actually checking their proof)?

I have no particular reason to think that either is an issue, but I'm trying to better understand how we can trust Lean proofs, especially automatically generated proofs, and what things lean4checker doesn't currently check.

David Renshaw (May 06 2025 at 12:36):

There may be other checks, but one that I know about is that all of the results tagged with the @[equational_result] attributed have their axioms checked via Lean.collectAxioms: https://github.com/teorth/equational_theories/blob/4fa4df863c4e79045d8f1d0234b01a5fc879f61f/equational_theories/EquationalResult.lean#L90-L95

Jason Rute (May 06 2025 at 12:42):

Yeah, that is pretty good.

Andrés Goens (May 06 2025 at 15:17):

Check that all 10,657 explicitly true implications in Lean and all 586,925 explicitly false implications in Lean are properly exported to lean4checker (so that you know lean4checker is actually checking their proof)?

Well, at this point we're not even checking explicitly that these are all the true implications, the computation of the transitive implications is currently part of the TCB (we discussed in #Equational > A final end-to-end theorem in Lean how to do that in a reasonable amount of time, but haven't done this). I'd say checking that would be a prerequisite to checking that they're also correctly exported to lean4checker (and arguably we would get that automatically if we do)

Shreyas Srinivas (May 06 2025 at 16:42):

Basically lean4checker does not accept native_decide

Shreyas Srinivas (May 06 2025 at 16:43):

In my draft I am going to make the case that trust is a funny thing. It comes in layers. I need to ask the deepseek people if I can mention their issue as well.

Shreyas Srinivas (May 06 2025 at 16:44):

Another example is the Veil paper.

Shreyas Srinivas (May 06 2025 at 16:44):

They trust SMT solvers as long as those return an answer

Shreyas Srinivas (May 06 2025 at 16:44):

It’s possible to turn on proof reconstruction and checking, but that’s the not the default

Shreyas Srinivas (May 06 2025 at 16:45):

The challenge is to make the case that despite all these issues, ITPs are a huge step up for large collaborations

Jason Rute (May 07 2025 at 02:38):

@Shreyas Srinivas Does the code use native_decide? Or are you saying if it did, then you would know because lean4checker would reject it?

Jason Rute (May 07 2025 at 02:39):

And I agree there are many layers to this.

Shreyas Srinivas (May 07 2025 at 07:58):

The latter. But now that I recall, the end to end theorem that was left unfinished used native_decide.

Last updated: Dec 20 2025 at 21:32 UTC