Zulip Chat Archive

Stream: Machine Learning for Theorem Proving

Topic: Building an Autoformalizer on Analysis

Blanca Luo (Jul 02 2025 at 18:33):

Hi! I'm interested in a short-term project that trains a model to autoformalize proofs in the field of analysis (real/complex analysis, measure theory, etc). I plan to get informal->formal paired data from Tao's ongoing Analysis I project and from mathlib, feed them to a base LLM, and then perform some fine-tuning and reinforcement learning. Since I'm still semi new to Lean and ML, I wonder if the method and scale of this project sounds valid, and if this kind of tool will be meaningful/helpful? Any thoughts and advice are welcomed!

Anthony Wang (Jul 05 2025 at 17:04):

I suspect your main difficulty would be getting enough training data to get useful results out of this. Some other people have been working on Lean autoformalizers as well, so I'd recommend checking out their papers for their methods and datasets that they used. Also, I think RL seems more useful for generating proofs rather than autoformalization so I'm curious as to how you plan to use it for your project.

Last updated: Feb 28 2026 at 14:05 UTC