Zulip Chat Archive

Stream: Equational

Topic: Obelix: joining two approaches


Pace Nielsen (Oct 12 2024 at 21:52):

The Obelix equation 1491 is x=(y◇x)◇(y◇(y◇x)). The forcing-style arguments look to be quite promising in ultimately showing that Obelix does not imply Asterix. I also wanted to describe how this can be attacked by merging the "linear-invariant" approach with the "finite modification" technique.

Using the linear-invariant approach, we get a functional equation of the form f(f2(h)f(h))=f(h)f(f^2(h)-f(h))=-f(h). When trying to build ff from partial functions, this says is that once we know that b:=f(a)b:=f(a) and c:=f(b)=f2(a)c:=f(b)=f^2(a) are defined, then we must have f(cb)=bf(c-b)=-b.

What is quite interesting about that last equality is that it does not involve aa directly. In other words, if we were to (for the moment) assume that ff is surjective, then our generation rule just says that from (b,c)(b,c) we get (cb,b)(c-b,-b), which in turn yields (c,cb)(-c,-c-b), which then in turn yields (b,c)(b,c). We get a type of cycle of length 3.

Looking at that cycle, there is a clear duality with negatives. This can be "factored out" by simply working "mod 2" (i.e., in an F2\mathbb{F}_2-vector space). After doing this, we see that ff itself satisfies f3=1f^3=1. So it is a bijection, and now satisfies the corresponding Asterix functional equation.

So we can't simply assume surjectivity, which in turn complicates understanding ff. However, with just a finite bit of modification we can, and then ff is extremely simple to describe!

Let AA be a countable-dimensional F2\mathbb{F}_2 vector space. For sake of simplicity, take AA to be the set of N\mathbb{N}-indexed sequences of 0's and 1's, with finite support, with addition defined coordinate-wise. This is a countable set, so fix a well-ordering of order type ω\omega. Let eie_i be the standard basis vector with 1 in the ith coordinate, and 0's elsewhere. Start with the partial function {(0,0),(e1,e2),(e1+e2,e3),(e1+e3,e4)}\{(0,0),(e_1,e_2),(e_1+e_2,e_3),(e_1+e_3,e_4)\}. This partial function fails the Asterix functional equation. Also note that none of the first coordinates (except 0) appears as a second coordinate.

For each bA{0,e1,e1+e2,e1+e3}b\in A-\{0,e_1,e_1+e_2,e_1+e_3\} (as we go along the well-ordering) greedily adjoin the triples (b,ei),(b+ei,b),(ei,b+ei)(b,e_i),(b+e_i,b),(e_i,b+e_i), where ii is sufficiently large depending on bb. The resulting function will be what we need.

Terence Tao (Oct 12 2024 at 22:01):

That's a nice construction! You've basically set up some sort of "Hilbert's hotel" I think to greedily construct the graph of f, which should then lead to the anti-implication Oberlix !=> Asterix.

Terence Tao (Oct 12 2024 at 23:46):

I've opened equational#543 to formalize this argument.

Alex Meiburg (Oct 17 2024 at 05:22):

I started trying to code this up (sketching just chunks of the proofs in Lean) and quickly ran into an issue. I think you had a mistake in your algebra -- or maybe I'm not assuming the right form for the linear invariant? If I take x◇y = x + f(y-x) = x + f(h), then the functional equation simplifies to $f(f^2(h) - f(h)) = h - f(h)$, not just -f(h) on the right hand side as you wrote. If I instead take x◇y = y + f(y-x) = x + f(-h) then the functional equation simplifies to
$0 = f(h) + f(h+f(h)) + f(f(h+f(h)))$
which is also interesting, but, also not what was written. I tried double checking these calculations in Mathematica and got the same thing. I also tried a few other versions of ax+by+f(cx+dy) but didn't find anything.

Alex Meiburg (Oct 17 2024 at 05:33):

Going with x◇y = x + f(y-x) = x + f(h), and so the functional equation f(f^2(h) - f(h)) = h - f(h), we get that f(a) = b / f(b) = c implies that f(c-b) = a-b. ... but I don't see a good way to proceed from there? I hope I'm just missing something simple!

Alex Meiburg (Oct 17 2024 at 08:25):

Hmm yes been staring at this a while, I'm having trouble making this work, I've been looking for a fix but no luck yet

Pace Nielsen (Oct 17 2024 at 14:06):

@Alex Meiburg Yes, that was a calculational mistake. You are correct that the equation should be f(f2(h)f(h))=hf(h)f(f^2(h)-f(h))=h-f(h). I'll try to fix the argument.

Pace Nielsen (Oct 17 2024 at 14:25):

Okay, I believe that the "bifurcation tree" argument used to handle 1692 works here too. The big change is that the two children of a node (a,b)(a,b) are (in this new case) (b,c),(cb,ab)(b,c),(c-b,a-b), but essentially the same argument should apply.

Alex Meiburg (Oct 17 2024 at 15:04):

I see, that sounds reasonable. In that case I think both the proof and code for Dupont/Dupond will have a lot of overlap with
Obelix=!>Asterix.

Sadly that bifurcation tree is a lot messier (constr/ind)uction to formalize. :) If cooler!

Alex Meiburg (Oct 17 2024 at 17:24):

Alright, thinking through this a bit here out loud before I try to write it down very formally:

  • After adding one "full tree" rooted at (a,b) to the partial function, there's a few facts that are now true:
  • (1) for all x in the domain of f, we know that f(x) is in the domain as well.
  • (2) for all x in the domain of f, we know that f(f(x))-f(x) is in the domain as well.
  • (3) for all x in the domain of f, it holds that f(f^2(h) - f(h)) = h - f(h). The definedness of that expression comes from (1) and (2) above.
  • (4) the provided "a" is in the domain of f.
    The first three are invariants we want the partial function to always have. The fourth is the fact that will assure us that eventually all values are in the domain of f. So it's easy (~ a few lines) that if we can always extend a partial function with a tree like this for any "a" not in the domain, then we can build a total function that still obeys equation (3), and so get an Obelix magma.

The tree could be defined with some kind of tree-shaped data structure / induction, but it's simple enough that I think we can just directly index the nodes and give the spec. We can define it as a Nat-indexed sequence seq of pairs (x,f(x)) that we add to the partial function. It's given by the following formula:

  • seq 0 = (a, <anything linearly independent of all previous elements>)
  • seq (2n+1) = ( (seq n).snd, <anything linearly independent of all previous elements>)
  • seq (2n+2) = ( (seq (2n)).snd - (seq n).snd, (seq n).fst - (seq n).snd )
    Then the mappings n -> 2n+1 and n -> 2n+2 give the witnesses we need to prove the invariants (1) and (2) above, and (3) is verified by plugging in the appropriate values.

That leaves two things to show: (5) we need to show that the inductive definition of seq never produces a (x,y) where x was already defined earlier, and (6) the <anything linearly independent of all previous elements> needs showing that such an element exists.

(5) is done by contradiction, saying that we have two values in the domain that were equal. One case is that something in seq collided with something in the domain _before_ we started the tree, and then we can refute that by the fact it was either the fresh element "a" or something linearly independent. The other case is that is that both were from seq. In that case, one of them refers to a (seq x).snd that is greater than the other, so they're linearly independent.

I initially thought that (6) would just be a simple "oh well the partial function is always finite, and we can take any infinitely generated group" -- because that's what I was going to do before. But that doesn't work here, because after one "full tree", our partial function actually has infinite support. You have to prove that you don't exhaust the full set of linearly independent generators in the first tree (and indeed, the naive approach would do exactly that -- there would be nothing linearly independent after the first extension of the function). So you have to do a little dance where you index the generators by Nat x Nat and just take generators from one row, when building one tree.

For that reason, it seems easiest to actually take the group to be (say) functions (Nat x Nat) -> G, where G could be Fin 2 or Rat or Int or whatever, instead of just Nat -> G, so that the indexing is easier.

Alex Meiburg (Oct 19 2024 at 04:39):

The Lean formalization is making some progress but there seem to be further glitches in the proof. I have to think about how to best fix this. So, the group element "0" has to occur in our bifurcation tree at some point, and I realized that pretty quickly there's an issue where the tree tries to assign two different values to the same input.

Here's an abbreviated version of the tree:

Root: (0, f(0)=a)
|--Child 1: (a, f(a)=b)
     |--Child 3: (b, f(b)=c)
     |--Child 4: ...
|--Child 2: (b-a, f(b-a)=0-a=-a)
     |--Child 5: (-a, f(-a)=d)
     |--Child 6: (d-(-a)=d+a, f(d+a)=(b-a)-(-a)=b)
          |--Child 13: (b, f(b)=e)
          |--Child 14: ...

It seems like a problem unique to 0. If we look at the same shape of tree rooted at a different value x, then the same corresponding children have b and b-x, so they're distinct.

But it still gets to the problem, that we don't have a _proof_ of the fact that we never accidentally end up with the same value being forced to a different value twice. It felt "obvious" since the arguments (the left half of each tuple) was being made from the 'fresh' right values from immediately preceding pairs. But, obviously, that doesn't always hold true.

I'm trying to think about a way to repair this.

Alex Meiburg (Oct 19 2024 at 04:56):

The construction might work (although not sure how to prove it) if we set f(0)=0 and then start the first tree at 1.

Pace Nielsen (Oct 19 2024 at 16:20):

@Alex Meiburg Yes, start with the pair (0,0) to avoid having repetitions in all the (other) bifurcation trees. (Sorry I didn't mention/catch that side case earlier.) This won't cause problems with contradicting Asterix because it's functional equation doesn't involve f(0).

I did double-check that the bifurcation tree with a\neq 1 at the start does avoid any repetition.

Pace Nielsen (Oct 19 2024 at 16:22):

I may be able to make this argument not need the infinite bifurcation tree. I'll give it some thought right now and get back to you soon.

Pace Nielsen (Oct 19 2024 at 17:02):

@Alex Meiburg Finding that repetition in the binary table starting at 0 actually helped explain why all the other tables don't have repetitions when generated freely. Thank you for working that out! See the attached file, which now doesn't need the infinite binary table.

Obelix.pdf
Obelix.tex

Terence Tao (Oct 19 2024 at 18:36):

This is beginning to resemble the automated approach we can adopt for the non-translation-invariant greedy setting in which we can invoke SAT solvers to close the greedy construction. For the translation-invariant setting I guess we need an SMT solver or something? I don't know if this is as easily automatable. On the other hand, there are only a dozen equations left, of which maybe half are amenable to this method, so it may be easier to just keep repeating this sort of analysis by hand.

Alex Meiburg (Oct 19 2024 at 19:09):

I think doing this by hand will be easier than automating. A tougher question is whether there should be any "common theory" here in Lean, where we try to factor out very similar code and definitions (even if we're otherwise forgoing automation); or if we should just copy/paste and modify as appropriate. ~10 copies of specializations to different magmas feels lame but might be the least effort.

Alex Meiburg (Oct 19 2024 at 19:10):

I'll take a swing at fixing this up later today. :smile:

Pace Nielsen (Oct 19 2024 at 20:24):

@Alex Meiburg There was a minor error in the case analysis, which has now been fixed. See the updated files below.

Obelix.pdf
Obelix.tex

Alex Meiburg (Oct 20 2024 at 03:33):

This is much easier to implement, since E being finite means everything is computable and generally easier to just blindly reduce down. :) And there's only "one" construction, instead of "two" (adding each needed element in the outer loop, and building the tree in the inner loop). Should be quick now, hopefully

Daniel Weber (Oct 20 2024 at 03:42):

Alex Meiburg said:

I think doing this by hand will be easier than automating. A tougher question is whether there should be any "common theory" here in Lean, where we try to factor out very similar code and definitions (even if we're otherwise forgoing automation); or if we should just copy/paste and modify as appropriate. ~10 copies of specializations to different magmas feels lame but might be the least effort.

Since the automation for the non-translation-invariant greedy setting is almost done, it might not be hard to adapt it for this (although I don't truly know how different this is)

Terence Tao (Oct 20 2024 at 03:51):

The main thing is that it isn't pure SAT any more; there is also an ambient group involved, so it is more like SMT than SAT. There's now a binary relation EE on a group GG instead of a ternary relation RR, and the axioms look typically like this:

E(a,b)E(b,c)    E(cb1,ab1) E(a,b) \wedge E(b,c) \implies E(cb^{-1}, ab^{-1})

E(a,b)E(c,d)ab1=cd1    a=c E(a,b) \wedge E(c,d) \wedge ab^{-1} = cd^{-1} \implies a = c

(taken from the above Obelix example). Not sure if the ATP side of your automation can handle these sorts of things.

Daniel Weber (Oct 20 2024 at 03:55):

You can give it the group axioms and it should be able to handle it, although I don't know if that works well (and I guess the meaning of "a fresh element" also has to be different)

Terence Tao (Oct 20 2024 at 04:00):

Yeah, fresh element is a bit trickier. One can assume that the new element c is not any finite word of previous symbols (in practice I don't think we've seen words of length more than 4 appear). Most of the constructions take place in an abelian group so the number of words that show up is not too massive.

Terence Tao (Oct 20 2024 at 04:02):

In any event, the human implementation of the translation-invariant greedy algorithm is going well. In the last 24 hours, of the 11 remaining two-variable equations which could in principle have been attacked by this method, one has been solved, one has been partially solved, and three other equations (and the remaining part of the partially solved equation) are now known to be immune to this method. So only six equations remain to be tested against this method.

Alex Meiburg (Oct 21 2024 at 04:30):

The Lean for 1491 =!=> 65 is mostly there in equational#624. All the defs are there (so the sorry's are just for proofs, not the actual construction) and the 'main' facts are proven. Now it's just the fiddly bits of case work about verifying injectivity in the right places etc. etc. :)

Michael Bucko (Oct 24 2024 at 17:53):

I'm reading about Asterix & Obelix. Sharing this -- perhaps someone else likes to learn this way.

equation65-proof.mermaid

Bildschirmfoto 2024-10-24 um 19.43.51.png

The sketch given by O1 is:

**Proof Sketch:**

1. **Instantiation:** Uses N as the type G and constructs the Magma using the `closure` of the `initial` PartialSolution.
2. **Simplification:** Applies simplification rules to break down the goal using `Equation65` and `closure_prop`.
3. **Splitting Goals:** Uses `split_ands` to handle multiple conjuncts separately.
4. **Iterative Use:** Repeatedly uses `closure_eq_of_mem_e1` to establish the required equalities for each fact, solving them using the `decide` tactic.

Last updated: May 02 2025 at 03:31 UTC