Zulip Chat Archive

Stream: Machine Learning for Theorem Proving

Topic: Autotranslation of Logical Foundations from Rocq to Lean

Jason Gross (Jan 16 2026 at 07:56):

Theorem is open-sourcing a (non-idiomatic) Lean translation of the 1276 theorem statements in Software Foundations Volume 1 here, along with proofs that the Lean translation is isomorphic to the original Rocq. This is the first project scaling AI software verification to handle codebases with dependencies. While more compact solutions are possible, our current approach produced ~60k LoC in Lean, proven isomorphic with ~720k LoC of Rocq.

We’re very excited by this result, and will be sharing more details soon. We hope it will be useful to students of PL, as well as AI researchers excited about software verification. Happy POPL!

Bas Spitters (Jan 16 2026 at 09:13):

This is great progress @Jason Gross !
Can you say more about the isomorphism proof? Do you have a model of Rocq in Lean and proof the isomorphism in Lean, or the other way around?

Could this also help with other porting efforts, such as iris, the computable reals or program logics for cryptography ?

Bas Spitters (Jan 16 2026 at 09:18):

Looking at this, it seems that you've made a model of Lean in Rocq. It would be interesting to know more about it.
https://theorem.dev/sfbench-pt1/?id=U_original__U2_lf_dot_U_basics__U2_lf__U_basics__U_natU_playground__nat__iso

Jason Gross (Jan 17 2026 at 05:09):

We built a module type which defines the notion of "correct LF translation" using what is roughly the relational lifting of isomorphism along type formers, and used rocq-lean-import* to extend this to a spec for a "correct Lean LF translation". Our AI system was able to both translate and prove equivalence for 97% (1237/1276) of the statements. The remaining 39 statements pivoted off 6 extra difficult dependencies, which required human intervention to solve; these statements are marked as "extreme" in the list view of our dataset explorer. When given these additional 6 solutions, we were able to 100% validate the entire benchmark.

With the (very big) caveat that the equivalence test is quite strict and thus tends to bias the LLM towards giving structurally identical translations at the cost of being less idiomatic, it should be extensible to porting other codebases. We're currently looking into translating Flocq --- what other projects do you have in mind for translation? (Iris? coqeal?)

More details on this model of Lean in Rocq that @Gaëtan Gilbert made here

Bas Spitters (Jan 18 2026 at 13:27):

There is an ongoing porting effort for iris, see #iris-lean
The porting of exact real numbers is more informal/manual, but ongoing I understand. It would be interesting to see what your tooling could do.

Last updated: Feb 28 2026 at 14:05 UTC