Zulip Chat Archive
Stream: Machine Learning for Theorem Proving
Topic: Performance of models for formalisation
Archie Browne (Sep 02 2025 at 08:16):
Hi all, I am interested in hearing about your experiences in using various language models for formalisation (GPT-5, Opus 4.1, DeepSeek-V3, Kimina, Grok etc...). Which have you found most capable? Are they something you commonly use in your work with formalisation? In particular, if a model has shortcomings, what are they? Are there certain areas of Mathematics that are easier for specific models? Most of all it would be good to hear about their performance on research Mathematics. Any concrete examples would be fantastic. Thanks.
(deleted) (Sep 04 2025 at 03:18):
Rumor has it that GPT-5 with GitHub Copilot is very good
West (Sep 04 2025 at 03:52):
Cross posting my experience.
Last updated: Dec 20 2025 at 21:32 UTC