Zulip Chat Archive

Stream: IMO-grand-challenge

Topic: Anyone tried Lean or Isabelle (like DTV) on AIMO?


Jason Rute (Nov 29 2024 at 16:25):

Does anyone know if any of the AIMO contestants tried using Lean or Isabelle similar to the paper Don’t Trust; Verify. That paper is similar to Draft, Sketch and Prove, but for natural language math problems where the ultimate goal is to return a number. The idea of that one first has informal reasoning via a language model, and then translates it into a formal proofs as a way to check/rank the numerical answers. In the DTV paper it improved the results on the informal benchmark MATH. I don’t know with modern models if DTV even helps anymore. (I have not seen any papers use it recently.) But if it does still help, AIMO would be an obvious application. Does anyone know of other papers or AIMO contestants using formal proof approaches like this? (And yes, using Sympy/Python is quite common, so formal approaches are still used, but what about theorem proving?)


Last updated: May 02 2025 at 03:31 UTC