Zulip Chat Archive
Stream: Equational
Topic: An implication predictor
Terence Tao (Feb 16 2026 at 02:06):
Just for fun, I used AlphaEvolve to evolve a predictor for the ETP implication graph, which can be found at https://github.com/teorth/equational_theories/blob/main/scripts/predictor/predictor2.py . It takes the strings of two equational laws (using "*" as the magma operation) and gives a probabilistic prediction as to whether the first implies the other. It works very well on random implications, with an accuracy rate of about 98%:
Does x = y * ((z * (w * w)) * w) imply x = ((y * (z * y)) * w) * y? Predicted Probability: 0.9990, Actual: True, Log Loss: -0.0010
Does x = ((y * z) * x) * (w * u) imply x = ((y * z) * w) * (u * y)? Predicted Probability: 0.9990, Actual: True, Log Loss: -0.0010
Does x = (x * x) * x imply x * (x * y) = z * (z * y)? Predicted Probability: 0.0010, Actual: False, Log Loss: -0.0010
Does x = y * ((z * (w * u)) * w) imply x * y = (x * x) * (y * y)? Predicted Probability: 0.9990, Actual: True, Log Loss: -0.0010
Does x = ((y * z) * w) * (u * y) imply x = y * ((x * x) * x)? Predicted Probability: 0.9990, Actual: True, Log Loss: -0.0010
Does x * (y * z) = w * (z * y) imply x = (y * x) * (y * (z * w))? Predicted Probability: 0.0010, Actual: False, Log Loss: -0.0010
Does x = y * ((z * w) * (w * x)) imply x = (y * z) * ((w * w) * y)? Predicted Probability: 0.0010, Actual: False, Log Loss: -0.0010
Does x = ((y * z) * (x * w)) * x imply x = (y * y) * (z * (z * y))? Predicted Probability: 0.0010, Actual: False, Log Loss: -0.0010
Does x = y * ((y * z) * (z * z)) imply x = (y * (z * z)) * y? Predicted Probability: 0.9990, Actual: True, Log Loss: -0.0010
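The core trick behind the first predictor (as described below: testing against small finite magmas) can be sketched in a few lines of Python. This is an illustrative reconstruction, not the actual predictor2.py code; the helper names `satisfies` and `counterexample_exists` are my own, and the laws are hard-coded as Python lambdas rather than parsed from strings. The third example above (x = (x * x) * x does not imply x * (x * y) = z * (z * y)) is refuted by a magma of size 2:

```python
from itertools import product

def satisfies(op, n, law):
    """Check whether the magma ({0,..,n-1}, op) satisfies the law
    under every assignment of its variables."""
    nvars = law.__code__.co_argcount - 1  # first argument is op
    return all(law(op, *vals) for vals in product(range(n), repeat=nvars))

def counterexample_exists(hyp, concl, max_size=3):
    """Brute-force all multiplication tables up to max_size, looking for
    a magma satisfying hyp but violating concl.  Returns True if one is
    found, i.e. the implication provably fails."""
    for n in range(1, max_size + 1):
        # Each flat tuple of length n*n encodes one multiplication table.
        for flat in product(range(n), repeat=n * n):
            op = lambda a, b, f=flat, m=n: f[a * m + b]
            if satisfies(op, n, hyp) and not satisfies(op, n, concl):
                return True
    return False

# x = (x * x) * x  vs.  x * (x * y) = z * (z * y)
hyp = lambda op, x: x == op(op(x, x), x)
concl = lambda op, x, y, z: op(x, op(x, y)) == op(z, op(z, y))
print(counterexample_exists(hyp, concl))  # prints True: the implication fails
```

For instance, left projection (x * y = x) on a 2-element set satisfies the hypothesis but not the conclusion, which this search finds quickly. The actual predictor converts the absence of small counterexamples into a probability rather than a verdict.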
I evolved an initial predictor to do well at random implications (which it did mostly by testing against small magmas and trying easy substitutions, similar to how ETP first started). I then generated a set of "hard" implications and evolved a second predictor to do well on those as well, though now with a nonzero failure rate: it uses some additional magma models, such as linear models, as well as smarter syntactic matching. Here is how it does on some hard implications:
Does x = y * (y * ((x * y) * x)) imply x * y = ((z * x) * w) * y? Predicted Probability: 0.0010, Actual: False, Log Loss: -0.0010
Does x * (y * z) = (z * w) * w imply x * (y * x) = (x * y) * z? Predicted Probability: 0.6286, Actual: False, Log Loss: -0.9905
Does x = ((y * y) * z) * x imply x = x * (((y * x) * x) * x)? Predicted Probability: 0.8506, Actual: True, Log Loss: -0.1618
Does (x * y) * x = (x * z) * w imply x * (y * y) = x * (z * y)? Predicted Probability: 0.0010, Actual: False, Log Loss: -0.0010
Does x = ((y * z) * (z * x)) * y imply x * x = (y * (y * y)) * y? Predicted Probability: 0.8520, Actual: False, Log Loss: -1.9104
Does x * x = (x * y) * z imply x * x = (y * z) * w? Predicted Probability: 0.0010, Actual: False, Log Loss: -0.0010
Does x = (y * (y * (y * z))) * x imply x * y = z * (w * (w * y))? Predicted Probability: 0.9990, Actual: True, Log Loss: -0.0010
Does x = x * (y * ((y * z) * y)) imply x = x * ((x * (x * x)) * x)? Predicted Probability: 0.6263, Actual: False, Log Loss: -0.9844
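The linear magma models mentioned above can also be sketched concretely: interpret x * y as a*x + b*y (mod p), under which every term reduces to a coefficient vector over its variables, so checking a law needs no enumeration of assignments at all. This is again my own illustrative sketch (terms as nested tuples, hypothetical helper names), with affine constants omitted for simplicity; the example refuted is the one from the easy list above, x * x = (x * y) * z not implying x * x = (y * z) * w:

```python
def lin_coeffs(term, a, b, p):
    """Coefficient vector of a term under the linear model x*y = a*x + b*y (mod p).
    A term is either a variable name (str) or a pair (left, right)."""
    if isinstance(term, str):
        return {term: 1}
    left, right = (lin_coeffs(t, a, b, p) for t in term)
    out = {}
    for v, c in left.items():
        out[v] = (out.get(v, 0) + a * c) % p
    for v, c in right.items():
        out[v] = (out.get(v, 0) + b * c) % p
    return out

def law_holds(lhs, rhs, a, b, p):
    """A law holds in the linear model iff both sides have identical coefficients."""
    cl, cr = lin_coeffs(lhs, a, b, p), lin_coeffs(rhs, a, b, p)
    return all(cl.get(v, 0) == cr.get(v, 0) for v in set(cl) | set(cr))

def linear_counterexample(hyp, concl, primes=(2, 3, 5, 7)):
    """Search linear models over Z_p for one satisfying hyp but not concl.
    Returns a witness (p, a, b) if found, else None."""
    for p in primes:
        for a in range(p):
            for b in range(p):
                if law_holds(*hyp, a, b, p) and not law_holds(*concl, a, b, p):
                    return (p, a, b)
    return None

hyp = (('x', 'x'), (('x', 'y'), 'z'))    # x*x = (x*y)*z
concl = (('x', 'x'), (('y', 'z'), 'w'))  # x*x = (y*z)*w
print(linear_counterexample(hyp, concl))  # prints (2, 1, 0)
```

The witness (p, a, b) = (2, 1, 0) is the model x * y = x over Z_2, which satisfies the hypothesis but not the conclusion. Such models cheaply refute implications that small-magma search alone would miss, since the carrier can be any Z_p.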
I haven't tested it on out-of-distribution implications, i.e., implications involving more than four operations, nor have I checked how it competes against traditional ATPs such as Vampire. I like to think of this as a sort of "interpretable" machine learning experiment: no doubt a neural net, for instance, would get a much better fit to the data, but the scripts here are all human-readable Python code rather than a collection of neural net weights, so one still has some idea what the algorithm is actually doing.
Last updated: Feb 28 2026 at 14:05 UTC