Zulip Chat Archive

Stream: Equational

Topic: Ways to analyze a given equation

Terence Tao (Oct 15 2024 at 18:14):

One of the outcomes of this project is that we have started to assemble a nice suite of techniques to analyze any given equation, EquationX(and the laws implied by this equation), and I thought I might try to create a thread to collect them here as the project continues. Here are some to get started:

Test EquationX against linear magmas x ◇ y = ax+by in various fields, commutative rings, and/or non-commutative rings, and study the resulting constraints on a and b.
Test EquationX against translation invariant magmas x ◇ y = x + f(y-x) on an additive group G, and study the resulting functional equation on f. Non-abelian constructions are also possible.
Use SAT solvers (either fully automated, or human-guided) to generate medium-sized finite models for EquationX, which can hopefully be used to rule out many implications to other equations. One can also use brute force to exhaust all small finite models (size 4 or less). Random linear or quadratic models on cyclic groups have proven somewhat effective in practice.
Observe "invariants" of the left and right hand sides of EquationX. If they match, then any equation implied by EquationX must also have matching invariants. (One way to generate such invariants is by abstract nonsense, mapping both sides of the equation to elements of a free magma modulo some reference theory.)
Use an automated greedy algorithm (invoking SAT solvers if needed) to hopefully find a complete ruleset for greedy extension for EquationX; if they exist, this tends to settle all outstanding implications in practice.
Try to see if EquationX is confluent (do we have a concrete procedure for this?), which gives a tractable description of the free magma subject to this law, and thus allows one to resolve all implications of this law.
If confluence fails, a complete rewriting system for EquationX is still possible (how can one guess this is the case, and what should one do to find it?).
Start with some simple "base model" of EquationX, such as a translation-invariant model, and try to construct local modifications of it to disprove possible consequences of that law.
Use standard automated theorem provers to see if EquationX implies, or does not imply, another given law.
We have one isolated example of a graph-theoretic construction (based on an infinite regular tree). Possibly there are ways to generalize this type of construction more systematically to other equations? [EDIT: See this second example.]

Any corrections, additions, or clarifications would be welcome.

Terence Tao (Oct 15 2024 at 18:34):

A technical question: is it easy to feed EquationX into an ATP and have it spit out all the short laws that it can "easily" deduce from EquationX, without having to enumerate over all short laws and run the ATP separately for each potential implication?

Michael Bucko (Oct 15 2024 at 19:07):

That kind of result reminds me of ensemble approaches -- something that (these days) could be approached by an agent (computing in parallel, function calling / tools, CoT). With FFI and Lean (someone with more Lean experience - please correct me), one could integrate 3rd party tools (especially those production-ready, low-level, highly efficient, like top SAT solvers, Vampire, and so on) used in Lean (something like this below?)

structure AnalysisResult :=
(linear_magma_results : TestResult)
(translation_invariant_results : TestResult)
(finite_models : list Model)
(invariants : Invariants)
(greedy_ruleset : Ruleset)
(is_confluent : bool)
(rewriting_system : option RewritingSystem)
(local_modifications : list Modification)

let analysis_result : AnalysisResult := {
    linear_magma_results := linear_magma_results,
    translation_invariant_results := translation_invariant_results,
    finite_models := finite_models,
    invariants := invariants,
    greedy_ruleset := greedy_ruleset,
    is_confluent := is_confluent,
    rewriting_system := rewriting_system,
    local_modifications := local_modifications
}

Some other points:

I guess one could deploy Mathematica (or similar) whenever symbolic manipulation is needed,
Constructing the base model and invariant recommendations could potentially be recommended by an AI,
Perhaps greedy algorithms could be developed that could help break down problems (or turn them into some proxy problems), or approach problems using map-reduce-like paradigms? Perhaps there's a way for the algorithms to provide lazy evaluations (1) or imperfect (but not useless) insights?
Point 1 and 2 could be seen as some sort of algebraic modeling / constraint derivation, 3 and 5 as computational model generation / analysis. Invariant identification and property analysis seem like very powerful tools that could be done in a greedy way. 6 and 7 - confluence + rewriting.
The approach could be viewed as a universe (that is modeled by EquationX in specific structures) an agent "enters". That agent would then generate and analyze its own model of that structure (while exploring it; that includes invariant identification, confluence checks, rewriting systems, and identifying blocks (counterexamples)). I guess such an environment could created for projects like gymnasium.
As for automated law discovery in batch processing mode -- even if that's not possible atm, we could develop an agent that can do it. That'd involve candidate generation, batch processing, continuous learning. I believe in agents and curriculum learners (ie. those kind of lifelong learners that would generate their own knowledge bases and provide their insights via APIs).
When it comes to 10, I love the idea of generalizing those kinds of constructions. One could bootstrap the algorithm with the template, and let the agent work on the template. Certain types of trees could potentially "fit" certain types of equation characteristics.
I think compute is quite tricky when it comes to such projects. I tried to run DeepSeek Prover 1.5, but then one always needs enough space and compute. I think having access to a server with enough compute via SSH would help. One could run lots of experiments and test models.
New architectures like differential transformer (better attention while cancelling noise, https://arxiv.org/abs/2410.05258) could lead to the models getting better much faster (soon enough). It could be a good idea to build an agentic math framework that could eventually deal with other types of problems too.

Amir Livne Bar-on (Oct 15 2024 at 19:26):

Re 6, we originally took as confluent only equations without critical pairs. This had some false negatives, so we switched to a procedure that calculates necessary conditions, and had no false positives. I think it proves confluence exactly. The idea is: for every critical pair, write down the general term that encounters this critical pair, and fully simplify it in all possible orders. If we found a term with two simplified forms, the equation is not confluent. (I'm not familiar with this area, so I don't know the name for the general term; What I mean is, in the rule x(yy)->x there's a critical pair, and the most general term is found using the equation x(yy)=zz, yielding the term a(bb)(bb))

Amir Livne Bar-on (Oct 15 2024 at 19:29):

Re 7, it's not clear that a confluent system always exists, or at least it shouldn't be easy to find one in the general case. As it would give an algorithm for deciding implications.

Daniel Weber (Oct 16 2024 at 07:39):

I wonder if it might be effective to try and assume rules which aren't implied by the equation, but still don't (seem to) imply the target

Fan Zheng (Oct 16 2024 at 11:21):

@Terence Tao It would be good if a chapter in the final write-up is organized around this outline, as these ideas are quite useful even for equations not in our list.

Cody Roux (Oct 16 2024 at 16:26):

Amir Livne Bar-on said:

Re 6, we originally took as confluent only equations without critical pairs. This had some false negatives, so we switched to a procedure that calculates necessary conditions, and had no false positives. I think it proves confluence exactly. The idea is: for every critical pair, write down the general term that encounters this critical pair, and fully simplify it in all possible orders. If we found a term with two simplified forms, the equation is not confluent. (I'm not familiar with this area, so I don't know the name for the general term; What I mean is, in the rule x(yy)->x there's a critical pair, and the most general term is found using the equation x(yy)=zz, yielding the term a(bb)(bb))

What you're referring to is actually "the critical pair" (sometimes also called "the critical peak"). Though I'm a bit confused about the zs.

Edit: wait know I see what the zs are: you had to "freshen" the variables of the rule to compute the overlap, and the y was renamed to z.

Terence Tao (Oct 16 2024 at 16:27):

Recording here a variant of 5: it seems that sometimes when the standard greedy algorithm fails, a modified greedy algorithm in which one adds new elements to the table in clusters rather than individually can still work. This worked for 1692 !=> * in https://leanprover.zulipchat.com/#narrow/stream/458659-Equational/topic/Proposed.20new.20target.3A.2063.20and.201692.20.28.22Dupont.20and.20Dupond.22.29/near/477252513 , and also earlier for "Obelix" in #Equational > Oberlix: joining two approaches .

[Side note: how come Zulip sometimes can shorten stream links, and sometimes it can't?]

Cody Roux (Oct 16 2024 at 16:28):

Amir Livne Bar-on said:

Re 7, it's not clear that a confluent system always exists, or at least it shouldn't be easy to find one in the general case. As it would give an algorithm for deciding implications.

Both are false, the system doesn't always exist, and it may be undecidable to prove that a system is confluent.

However, if a system is terminating, then it is decidable! Just find all the critical pairs, and prove that they join (this can now be done in finite time). Joining critical pairs only implies ~~decidability~~ confluence for terminating systems though IIRC.

Cody Roux (Oct 17 2024 at 19:36):

Here's a proof that should be fun to formalize in Lean! (actually the statement is in several proof assistants already)

20241017_152933.jpg

Eric Taucher (Oct 17 2024 at 20:09):

Cody Roux said:

Here's a proof that should be fun to formalize in Lean! (actually the statement is in several proof assistants already)

20241017_152933.jpg

What book is that if you don't mind noting?

I checked "Term Rewriting and All That" by Franz Baader and Tobias Nipkow and it is not that. (WorldCat)

Cody Roux (Oct 17 2024 at 20:19):

It's the pithily named "Resolution of Equations in Algebraic Structures volume 2: Rewriting Techniques". Found it in a Bookstore close to Harvard.

It's just a collection of papers, this one is Dershowitz, "Completion and it's Applications".

Cody Roux (Oct 17 2024 at 20:20):

I think it's this: https://www.cs.tau.ac.il/~nachumd/papers/CompletionApplications.pdf

Jose Brox (Oct 17 2024 at 21:02):

These books (there are two volumes) are the proceedings of the Colloquium on Resolution of Equations in Algebraic Structures (Texas, 1987) and can be found online (anyone interested in getting them can write me to ambroxius at gmail).

Cody Roux (Oct 17 2024 at 21:55):

People loved rewriting in 1987! It's worth noting that the previous article in this collection ("Completion without Failure" by Bachmair, Dershowitz (again) and Plaisted) is how I think Vampire works, and why it's so good at its job in this case!

Bhavik Mehta (Oct 18 2024 at 16:08):

There's a statement of church-rosser's theorem on confluence in mathlib, and I have a PR to this project which proves newman's lemma too, I agree this one should be nice!

Cody Roux (Oct 18 2024 at 17:52):

Cool! I had a little development as well, but it was my early days with Lean 4 so some of the proofs are very ugly: https://github.com/codyroux/traat-lean

Cody Roux (Oct 18 2024 at 17:53):

generally ARSes are very fun to formalize

Cody Roux (Oct 18 2024 at 18:09):

Finally: almost all that is known in completion + confluence + termination was formalized in Isabelle, mostly by Aart and his colleagues in Austria: https://dblp.org/pid/m/AMiddeldorp.html

Cody Roux (Oct 18 2024 at 18:10):

Not FLT but still impressive

Terence Tao (Oct 23 2024 at 18:55):

Some new techniques that have emerged in recent days:

Take a simple base model of the equation (e.g., one with 2-4 elements), and try to "extend" it by considering magma models that have a surjective magma homomorphism to the base. (See this discussion.)
Derive facts about finite models (e.g., taking advantage of injectivity<->surjectivity, or quotient reductions), and add those as axioms to an ATP to help find counterexamples. Example here.

Terence Tao (Oct 28 2024 at 02:32):

Two more techniques that have emerged in recent days:

Study "relaxed" versions of magmas obeying a target law, in which the magma operation is allowed to be multivalued. This can serve as a "base" magma to construct extensions that are not relaxed, for instance by a greedy construction. This seems to be particularly useful for "central groupoid like" laws such as 1485, see this blueprint chapter.
Study the free magma relative to the target law by inspecting how its rewriting can affect a word. In equations of "absorptive" type, such as 854, one can gain some useful properties from this analysis, such as unique factorization. See this blueprint chapter.

Mario Carneiro (Oct 28 2024 at 03:58):

Is it possible to use SMT solvers to explore the known equalities and disequalities of the free magma modulo? Perhaps something interesting can be discovered by just looking at what is already known there

Mario Carneiro (Oct 28 2024 at 04:00):

in particular, this gives you a lot more equations to work with than simply looking at consequences of the law, since you can assume every known counterexample as an additional rule

Daniel Weber (Oct 28 2024 at 04:03):

I did that for Asterix and it wasn't too useful, but maybe it'll be more useful now

Terence Tao (Oct 31 2024 at 14:48):

@Jose Brox has proposed using Kisielewicz type models, in which the operation $x \diamond y$ on the natural numbers is created explicity in terms of how $x,y$ splits up into powers of small primes such as 2,3,5, and the output value consists of expressions such as $2^x 3^y 5^z$ , with branching based on various rules guided by the law in question.

Daniel Weber (Oct 31 2024 at 14:50):

These operations can also be thought of as pattern matching in trees with colored edges

Terence Tao (Oct 31 2024 at 14:54):

I think one can also view it as an operation on the free algebra (on infinitely many generators) generated by a ternary relation $R$ , which one is concretely representing as $R(x,y,z) = 2^x 3^y 5^z$ .

Michael Bucko (Oct 31 2024 at 15:14):

Daniel Weber schrieb:

These operations can also be thought of as pattern matching in trees with colored edges

Check this out: https://leanprover.zulipchat.com/#narrow/channel/458659-Equational/topic/Magma.20equations.20and.20fractal.20tree.20matching/near/479358570

Terence Tao (Nov 02 2024 at 14:40):

More approaches that have arisen in recent days:

In order to help the greedy algorithm close, add axioms that are not forced by the original law, but which you know or suspect to be true for the universal magma (e.g., "unique factorization"). This was successful for finishing off 854.
Generate a magma by positing some new axioms (such as special elements or symmetries), using ATPs to extend the multiplication table and test the viability of additional axioms, and make optimistic simplifying assumptions. This strategy was successful for finishing off 1485; it also gives an alternate proof for 1117.
See if the magma law has any symmetries with regards to replacing the operation $x \diamond y$ by a twisted operation $Tx \diamond Uy$ for some automorphisms $T,U$ of the magma that obey some algebraic relations. For most laws the semigroup of such twists is trivial, but for 1485 it is not, and it provides an alternate way to finish off that equation.

Michael Bucko (Nov 04 2024 at 20:19):

Made this for myself using O1, but perhaps others will find it interesting, too.

Methods.pdf

Terence Tao (Nov 05 2024 at 16:47):

For the version of the implication graph project restricted to finite magmas, @Vlad Tsyrklevich has managed to prove several hundred new implications just using the fact that on a finite set, injectivity implies surjectivity and vice versa (and related to this, $f \circ g = I$ is equivalent to $g \circ f = I$ for functions $f,g: X \to X$ on a finite set $X$ ). This project just got completed using a variant of this fact that for functions $f,g: X \to X$ on a finite set $X$ , $f \circ f \circ g = f$ or $g \circ f \circ f = f$ implies $f \circ g \circ f = f$ . [Update: actually we still have about 30 unresolved implications to go.]

Terence Tao (Nov 10 2024 at 00:39):

It's not clear yet to what extent the recent successful resolution of 1323 extends to a broader technique, but for this equation involving squares, it was possible (after making additional assumptions to simplify the algebra, such as assuming the magma was unital) to decouple the problem of constructing a core magma on the square elements, working out how the squares interacted with the rest of the magma, and then filling in the rest of the table (with the constraints at that point becoming so simple that the existing greedy algorithm method easily finished it off). In retrospect, as pointed out by @Bruno Le Floch , the construction is also a version of the "modified base magma" strategy, although the particular magma $(x,a) \diamond (y,b) = (x+y+ab, a+b)$ to modify was not an obvious choice initially.

Terence Tao (Nov 28 2024 at 17:03):

There is an emerging "magma extension" strategy that has resolved some finite implications, in which one starts with a base magma $M$ and studies extensions with carrier $G \times M$ for some abelian group or ring $G$ and operation $(x, a) \diamond (y,b) = (\alpha x + \beta y + f(a,b), a \diamond b)$ where $\alpha,\beta$ are coefficients of some linear magma model $x \diamond y = \alpha x + \beta y$ and $f: M \times M \to G$ is a "cocycle" obeying a certain "cocycle type equation". The space of such $f$ is a linear space and so methods from linear algebra can become fruitful here. (We may have in fact accidentally discovered a new type of cohomology - "magma cohomology".)

Kevin Buzzard (Nov 29 2024 at 06:18):

My vague memories of trying to work out what monoid cohomology should be when the computer scientists started pointing out that the descriptions of group cohomology in the literature didn't have any inverses in, was that there seemed to be several definitions one could use which gave different answers in general for monoids because the proof that they gave the same answer for groups used inverses. I had no particular applications for monoid cohomology (note that either or both the G and the M in $H^n(G,M)$ can be monoids) so I left it at that. But historically group cohomology was studied in low degrees $n=1,2$ first, where it was realised that these cohomology groups solved problems (and also encapsulated obstructions to solving problems, eg "X can be done if and only if a certain 2-cocycle is a 2-coboundary") in group theory. It would not surprise me if some of these ideas spread out to magmas. And of course if you run into a definition in group cohomology which bifurcates in this way (can be written in more than one way for magmas) then you just choose the way which is most useful for your application. I'd stick to low degree (n=1,2) for now, there are examples of theories (eg when M is nonabelian) where things get pretty hairy after that.

Cody Roux (Nov 29 2024 at 16:59):

Oof another reason to try to understand that beast

John Baez (Dec 01 2024 at 06:54):

The cohomology $H^n(G,A)$ of a group $G$ with coefficients in an abelian group $A$ on which $G$ acts generalizes using the exact same formulas in the case where $G$ is a monoid. After someone pointed me to the above conversation I checked to see if the usual definition of $H^2(G,A)$ generalizes to the case where $G$ is a monoid. You can see what happened here.

Summarizing, the upshot was this:

$H^2(G,A)$ , defined in the usual way, can be generalized to a magma $G$ if we restrict ourselves to 1-cochains that factor through the free monoid on $G$ , and $G$ acts on the coefficient group $A$ in such a way that $(g_1 g_2) a = g_1 (g_2 a)$ .

(I explain this more clearly at the link.)

Terence Tao (Dec 01 2024 at 20:01):

Thanks! It seems that our project has discovered a more general form of magma cohomology that relates to more general laws than the associative law. In the case of a law with $k$ variables, it creates a partial chain complex

$0 \to C^1(G,A) \stackrel{\partial}{\to} C^2(G,A) \stackrel{\partial}{\to} C^k(G,A),$

where $C^k(G,A)$ is the space of functions from $G$ to $A$ but then I don't know how to continue the chain complex further. In the case of the associative law $x \diamond (y \diamond z) = (x \diamond y) \diamond z$ , we have $k=3$ , and the usual group (or monoid) cohomology allows one to continue the chain.

In the case of the commutative law $x \diamond y= y \diamond x$ (so $k=2$ ), assuming that $G, A$ are abelian groups for simplicity (with the action of $G$ on $A$ being trivial, corresponding to "central extensions" in the group case), the first boundary operation $\partial: C^1(G,A) \to C^2(G,A)$ is the familiar
$\partial f(x,y) := f(x+y) - f(x) - f(y)$
and the second boundary operation $\partial: C^2(G,A) \to C^2(G,A)$ is
$\partial F(x,y) := F(x,y) - F(y,x)$
thus for instance the magma operation $(x,s) \diamond (y,t) = (x+y, s+t+F(x,y))$ on $M \times A$ is commutative iff $\partial F = 0$ , and is conjugate to the product operation $(x,s) \diamond (y,t) = (x+y, s+t)$ if $F = \partial f$ is a coboundary. That is to say, the partial chain complex for the commutative law takes the form
$0 \to C^1(G,A) \stackrel{\partial}{\to} C^2(G,A) \stackrel{\partial}{\to} C^2(G,A).$
I don't know if there is a natural continuation to the chain: one could propose a third boundary map $\partial G(x,y) = G(x,y) + G(y,x)$ to annihilate the second, but this was an ad hoc choice specific to the commutative law, and I don't have any theory to propose to do this more systematically.

Michael Bucko (Dec 01 2024 at 22:38):

(this could be an interesting ml project -- generating boundary maps in order to allow for natural continuations to those chains)

Last updated: May 02 2025 at 03:31 UTC