Zulip Chat Archive
Stream: Machine Learning for Theorem Proving
Topic: AlphaEvolve
Ivan Eric (May 14 2025 at 17:36):
https://deepmind.google/discover/blog/alphaevolve-a-gemini-powered-coding-agent-for-designing-advanced-algorithms/
New LLM-based coding agent for algorithm discovery. It uses evolutionary search with automated evaluation to generate and optimize code. They claim improvements to data center scheduling and TPU design, an improvement on Strassen's algorithm, and progress on other open math problems.
Junyan Xu (May 14 2025 at 18:12):
AlphaEvolve’s procedure found an algorithm to multiply 4x4 complex-valued matrices using 48 scalar multiplications, improving upon Strassen’s 1969 algorithm that was previously known as the best in this setting. This finding demonstrates a significant advance over our previous work, AlphaTensor, which specialized in matrix multiplication algorithms, and for 4x4 matrices, only found improvements for binary arithmetic.
To investigate AlphaEvolve’s breadth, we applied the system to over 50 open problems in mathematical analysis, geometry, combinatorics and number theory. The system’s flexibility enabled us to set up most experiments in a matter of hours. In roughly 75% of cases, it rediscovered state-of-the-art solutions, to the best of our knowledge.
And in 20% of cases, AlphaEvolve improved the previously best known solutions, making progress on the corresponding open problems. For example, it advanced the kissing number problem. This geometric challenge has fascinated mathematicians for over 300 years and concerns the maximum number of non-overlapping spheres that touch a common unit sphere. AlphaEvolve discovered a configuration of 593 outer spheres and established a new lower bound in 11 dimensions [previous record: 592].
From paper:
Notably, for multiplying two 4 × 4 matrices, applying the algorithm of Strassen [92] recursively results in an algorithm with 49 multiplications, which works over any field. For the very specific case of multiplying in the field with 2 elements, Fawzi et al. [25] (AlphaTensor) found an algorithm with 47 multiplications. For 56 years, designing an algorithm with fewer than 49 multiplications over any field with characteristic 0 was an open problem. AlphaEvolve is the first method to find an algorithm to multiply two 4 × 4 complex-valued matrices using 48 multiplications.
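The multiplication counts quoted above are easy to sanity-check: Strassen's 2×2 scheme uses 7 scalar multiplications instead of 8, and applying it recursively to 4×4 matrices (viewed as 2×2 blocks of 2×2) gives 7 × 7 = 49. A minimal sketch of that count (the function name and counter are mine, not from the paper):

```python
import numpy as np

MUL_COUNT = 0  # counts scalar multiplications performed

def strassen(A, B):
    """Multiply 2^k x 2^k matrices with the textbook Strassen recursion."""
    global MUL_COUNT
    n = A.shape[0]
    if n == 1:
        MUL_COUNT += 1
        return A * B
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    # Strassen's seven block products (7 recursive calls instead of 8)
    M1 = strassen(A11 + A22, B11 + B22)
    M2 = strassen(A21 + A22, B11)
    M3 = strassen(A11, B12 - B22)
    M4 = strassen(A22, B21 - B11)
    M5 = strassen(A11 + A12, B22)
    M6 = strassen(A21 - A11, B11 + B12)
    M7 = strassen(A12 - A22, B21 + B22)
    C = np.empty_like(A)
    C[:h, :h] = M1 + M4 - M5 + M7
    C[:h, h:] = M3 + M5
    C[h:, :h] = M2 + M4
    C[h:, h:] = M1 - M2 + M3 + M6
    return C

A = np.arange(16).reshape(4, 4)
B = np.arange(16, 32).reshape(4, 4)
C = strassen(A, B)
assert np.array_equal(C, A @ B)
print(MUL_COUNT)  # 49: seven 2x2 products, each costing 7 scalar mults
```

AlphaEvolve's 48-multiplication algorithm is not obtainable from this recursive scheme; it is a different rank-48 decomposition of the 4×4 matrix multiplication tensor.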
Junyan Xu (May 14 2025 at 18:34):
Three candidate number theory problems that come to my mind which could potentially be studied with AlphaEvolve:
- High-quality abc triples (there are already computer searches running for these)
- Mahler measure: apparently people are already using evolutionary approaches
  https://web.ma.utexas.edu/users/jmc5946/john_clark_thesis.pdf
- Optimization of trigonometric polynomials (I don't know if anyone keeps track of records for such things)
Kevin Buzzard (May 14 2025 at 21:05):
My instinct is that high-quality ABC triples would not be a good problem for this framework because of the arithmetic nature of the subtlety. The applications are mostly about shaving off decimal places, or algebra. For example, would one expect this kind of tool to be good at prime generation or factoring? ABC triples are kind of like that, and my instinct is that a different tool is required. I'd love to be proved wrong!
Junyan Xu (May 15 2025 at 19:24):
@Terence Tao was involved in the results involving harmonic analysis inequalities, additive combinatorics, and packing mentioned in the announcement: https://mathstodon.xyz/@tao/114508029896631083
Shreyas Srinivas (May 15 2025 at 21:32):
I am passing this thread on to some of my colleagues on the algorithms side. There are a lot of potential-function-based analysis problems which are considered hard. If they are accepting problem proposals somewhere, I might be able to forward such a link or form to people in the algorithms community as well.
Shreyas Srinivas (May 15 2025 at 21:44):
Nvm. I found the link
Ching-Tsun Chou (Nov 06 2025 at 07:00):
https://arxiv.org/abs/2511.02864
Jeremy Tan (Nov 06 2025 at 07:43):
This is probably DeepMind people bragging about their latest Alpha-tool, and only has minimal relationship to Lean (although the results are genuine)
Kim Morrison (Nov 06 2025 at 10:00):
AlphaEvolve is extremely awesome, but nothing to do with Lean.
Chris Henson (Nov 06 2025 at 10:06):
The paper linked above does mention Lean and even link to Zulip, so maybe at least tangentially related? Maybe a better fit for the ML channel.
Notification Bot (Nov 06 2025 at 10:16):
4 messages were moved here from #general > AlphaEvolve by Patrick Massot.
Kim Morrison (Nov 06 2025 at 10:34):
Interesting! It is indeed tangentially related.
I think the takeaway is "people are making conjectures based on AI-driven experiments/optimisations, and then successfully handing these conjectures off to neural provers writing Lean".
Eric Wieser (Nov 06 2025 at 10:59):
In particular, on page 4 of the paper it remarks that:
Even more strikingly, for the finite field Kakeya problem (cf. Problem 6.1), AlphaEvolve discovered an interesting general construction. When we fed this programmatic solution to the agent called Deep Think [148], it successfully derived a proof of its correctness and a closed-form formula for its size. This proof was then fully formalized in the Lean proof assistant using another AI tool, AlphaProof [147]. This workflow, combining pattern discovery (AlphaEvolve), symbolic proof generation (Deep Think), and formal verification (AlphaProof), serves as a concrete example of how specialized AI systems can be integrated. It suggests a future potential methodology where a combination of AI tools can assist in the process of moving from an empirically observed pattern (suggested by the model) to a formally verified mathematical result, fully automated or semi-automated.
where the Lean proof is here
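For context on the problem in the quote: a Kakeya set in F_q^n is a set containing a full affine line in every direction. Small instances can be checked by brute force; here is a hedged sketch of a naive checker over F_q^2 for prime q (my own illustration, not the construction AlphaEvolve found):

```python
import itertools

def is_kakeya(points, q):
    """Check whether `points` (a set of pairs mod prime q) contains a line
    {a + t*d : t in F_q} in every direction d of F_q^2 (naive brute force)."""
    pts = set(points)
    # One representative direction per line through the origin:
    # slopes m give (1, m); the vertical direction is (0, 1).
    directions = [(1, m) for m in range(q)] + [(0, 1)]
    for d in directions:
        # Does some translate a of the line in direction d lie inside pts?
        if not any(
            all(((a[0] + t * d[0]) % q, (a[1] + t * d[1]) % q) in pts
                for t in range(q))
            for a in itertools.product(range(q), repeat=2)
        ):
            return False
    return True

q = 5
full_plane = list(itertools.product(range(q), repeat=2))  # trivially Kakeya
one_line = [(x, 0) for x in range(q)]  # only covers direction (1, 0)
print(is_kakeya(full_plane, q), is_kakeya(one_line, q))  # True False
```

The interesting constructions are of course much smaller than the full plane; the checker above is only meant to make the definition concrete.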
Robin Carlier (Nov 06 2025 at 11:32):
It’s impressive how inconsistent at whitespacing this AI-generated code is.
Dmitry Rybin (Nov 06 2025 at 16:03):
It would be interesting if AlphaEvolve could discover an algorithm that is also a proof of some deep general construction (like the RSK algorithm). I am reading through this work https://arxiv.org/abs/2511.02864 by Prof. Tao and DeepMind, and haven't found such examples yet.
Kim Morrison (Nov 06 2025 at 23:31):
Having read the paper some more: this is wild.
Kim Morrison (Nov 06 2025 at 23:32):
I would love to know the "overhead factor", i.e. the compression factor between AlphaProof's proof and a human-written (or even just human-cleaned-up) one. I haven't actually loaded up the proof, but skimming it there are many redundant-looking steps.
Eric Wieser (Nov 07 2025 at 09:03):
You mean to say that induction π is not an essential step of the proof?
Eric Wieser (Nov 07 2025 at 09:05):
As I understand it, the goal here was more "the system produced a Lean proof to ensure the argument was correct" rather than "the system produced a Lean proof as an artifact to explain the proof to humans", so no explicit cleanup step was involved
Bas Spitters (Nov 07 2025 at 10:13):
There is some similarity with CryptOpt, which does give formal guarantees (in Rocq).
https://arxiv.org/abs/2211.10665
Wang Jingting (Nov 07 2025 at 13:01):
Kim Morrison said:
I would love to know the "overhead factor", i.e. the compression factor between AlphaProof's proof and a human-written (or even just human-cleaned-up) one. I haven't actually loaded up the proof, but skimming it there are many redundant-looking steps.
I remember seeing a model for golfing Lean proofs on Zulip some time earlier? Maybe that one could help make the code more organized and readable. (But it seems they haven't released the model to the public yet.)
Edit: the discussion thread for that model is #Machine Learning for Theorem Proving > ProofOptimizer
Junyan Xu (Nov 29 2025 at 21:49):
The new paper Finding Kissing Numbers with Game-theoretic Reinforcement Learning (submitted Nov 17) broke kissing number records in dimensions 25 to 31. It's interesting that the three recent breakthroughs improved records in disjoint dimensions: AlphaEvolve (May 2025) improved the 11-dimensional record from 592 to 593, Cohn and Li (Nov 2024) broke the records in dimensions 17 to 21, and now this.
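For anyone wanting to verify such records: a kissing configuration in dimension n is certified by a list of unit direction vectors with pairwise dot products ≤ 1/2 (the sphere centers sit at distance 2 from the origin and are pairwise ≥ 2 apart). A small sketch checking the classical 24-vector configuration in dimension 4, the minimal vectors of the D4 lattice (the 593-point and higher-dimensional record configurations from the papers are not reproduced here):

```python
import itertools
import numpy as np

def is_kissing_configuration(units, tol=1e-9):
    """Certify a kissing-number lower bound: unit vectors pointing at the
    touching spheres must have pairwise dot products <= 1/2."""
    units = np.asarray(units, dtype=float)
    assert np.allclose(np.linalg.norm(units, axis=1), 1.0, atol=tol)
    gram = units @ units.T
    off_diag = gram[~np.eye(len(units), dtype=bool)]
    return bool(np.all(off_diag <= 0.5 + tol))

# The 24 minimal vectors of D4: (+-1, +-1, 0, 0) in every coordinate pair,
# normalized to unit length; they realize the kissing number 24 in dim 4.
vecs = []
for i, j in itertools.combinations(range(4), 2):
    for si, sj in itertools.product([1, -1], repeat=2):
        v = np.zeros(4)
        v[i], v[j] = si, sj
        vecs.append(v / np.sqrt(2))

print(len(vecs), is_kissing_configuration(vecs))  # 24 True
```

The same check, applied to 593 vectors in dimension 11, would certify the AlphaEvolve record mentioned above.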
Last updated: Dec 20 2025 at 21:32 UTC