Zulip Chat Archive

Stream: Machine Learning for Theorem Proving

Topic: Linear logic and game semantics


Deming Xu (Aug 03 2024 at 13:56):

I'd like to mention something I came across almost a year ago: linear logic and game semantics. They give a way to rigorously interpret logic as a game, in the same family as Go or Tic-Tac-Toe, which should be exactly what reinforcement learning is good at. Logic is read as a debate (or dialogue) between a proponent and an opponent (prover/falsifier, programmer/user, function callee/caller).

(Specifically, this refers to a two-player zero-sum game: there can't be three or more players, and when one wins, the other loses. I also believe the players move in turn in the theory, though I'm not sure.)

Here are some relevant links I found:
https://plato.stanford.edu/entries/logic-games/
https://en.wikipedia.org/wiki/Linear_logic
https://ncatlab.org/nlab/show/linear+logic
https://www.youtube.com/watch?v=t-5SzAAh6oY&pp=ygUMbGluZWFyIGxvZ2lj

Someone on Zulip mentioned game semantics 5 years ago...
https://leanprover.zulipchat.com/#narrow/stream/113488-general/topic/A.20newcomer's.20opinion/near/164799600

I spent some time trying to understand this, and I think I understand the equivalence, but I still lack some prerequisite knowledge...

Game semantics gives you two similar but distinct concepts.

  1. "You have a winning strategy for the game attached to a statement" is equivalent to being able to prove that statement.
  2. "You happened to win one particular play under the rules" is equivalent to proving a special case of the statement, or to surviving one particular challenge to it.

For example, take ∀ n : ℕ, ∃ m : ℕ, m > n (for every natural number there is a larger one). The rule for this proposition is as follows: your opponent first names an n for which it claims ∃ m : ℕ, m > n fails (trying to give a counterexample), then you must produce m, and finally you must prove that, for the n and m the two of you chose, m > n.

Below are two example plays of the game for ∀ n : ℕ, ∃ m : ℕ, m > n:
First:
Opponent: I think this is not true for n = 5
Me: I found m = 6
Me: 6 > 5 is obvious (in Lean you could use norm_num to unfold the arithmetic on natural numbers and prove this...)
(Now the opponent can't refute it. I won the game.)

Second:
Opponent: I think this is not true for n = 5
Me: I found m = 4
(Now I can't prove 4 > 5, I failed.)

In fact, ∀ n : ℕ, ∃ m : ℕ, m > n is a true proposition. But you can see that even when the game corresponds to a true proposition, the prover (me in the example) still loses if they play badly and deviate from the winning strategy. The two plays above do not correspond to a "proof"; they only explore special cases.
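
In Lean, the first play above is literally a proof of the special case. A quick sketch (using Mathlib's norm_num):

import Mathlib.Tactic

-- One play with the opponent's choice n = 5 fixed:
-- I answer m = 6 and discharge 6 > 5 with norm_num.
example : ∃ m : ℕ, m > 5 := ⟨6, by norm_num⟩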

The following is the actual proof of ∀ n : ℕ, ∃ m : ℕ, m > n, i.e. the winning strategy:
Opponent: I think this is not true for some n
Me: I found m = n + 1
Me: m = n + 1 > n is obvious
(Now the opponent can't refute it. I won the game.)

I won the debate without knowing the specific n chosen by the opponent and without using any information about it. This play can now be frozen into a strategy that wins every new ∀ n : ℕ, ∃ m : ℕ, m > n game. A neural network should be rewarded more for finding a complete proof than for just an example.
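
That winning strategy is exactly a Lean proof term. A minimal sketch (Nat.lt_succ_self is the core lemma n < n + 1):

-- The strategy "answer m = n + 1", uniform in n:
-- it never inspects the opponent's concrete choice.
example : ∀ n : ℕ, ∃ m : ℕ, m > n := fun n => ⟨n + 1, Nat.lt_succ_self n⟩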

Also, you can swap the ∀ n, ∃ m in the original proposition ∀ n : ℕ, ∃ m : ℕ, m > n to get ∃ m : ℕ, ∀ n : ℕ, m > n (there is a natural number larger than every natural number). This game is much harder for me to win. Originally, by the time I chose m the opponent had already committed to n, so I knew what n was, and choosing m = n + 1 (defining m in terms of the variable n) was valid. In the new game I must choose m before the opponent chooses n, so I don't know what n will be: my m has to work for every n the opponent could possibly pick. In fact, the new proposition is false, and it is now my opponent who has a winning strategy.
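
The opponent's winning strategy is concrete too: whatever m I name, they answer n = m, and m > m is absurd. A sketch (Nat.lt_irrefl is the core irreflexivity lemma):

-- The opponent's counter-strategy: given my m, reply with n = m;
-- then m > m contradicts irreflexivity of <.
example : ¬ ∃ m : ℕ, ∀ n : ℕ, m > n := fun ⟨m, h⟩ => Nat.lt_irrefl m (h m)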

Game semantics seems particularly helpful for implementing trial and error correctly in theorem provers: just as AlphaGo tries out possible Go positions without having to prove that its method is absolutely correct and beats any opponent (in fact, it cannot prove that), a prover can explore plays of the logic game without yet having the full winning strategy.

Neural networks should first explore and understand crucial special cases of a theorem, and learn the complete proof after that. The training goal of these neural provers should not be to convince Lean directly, but to convince an identical neural falsifier first. Maybe this gives a more graded loss signal (determined by whether you have a complete proof, or have only proved a specific instance of the theorem, or have been refuted by a counterexample; even things like finding a counterexample to a specific instance of the theorem. These can nest).

If you simply train a neural network to do tree search, using whether the proof succeeds as your loss, then while the model is still weak the only feedback you get is that the proof failed most of the time. The model may not learn much, because the loss is almost always the same constant, maybe -1, and a constant penalty does not tell the model how to improve. However, if you make the model debate itself about a theorem, you can check whether it wins the debate and use that as your loss. The game is now more symmetric, because the opponent is the model itself rather than the Lean checker, just like self-play in AlphaGo, and beating the opponent model should be much easier than completing a successful tree search. The loss is no longer a constant -1 but a more varied value. Even if the model is not yet smart enough to prove the theorem, it can still learn to improve a specific debate, and gradually assemble the complete winning strategy, i.e. a real proof. Given a proposition, you can also count the winning and losing debates and take the ratio, both to guess whether the proposition is probably true and to measure how well the model is learning.

Translating And, Or, True, and False into games works like ∀ and ∃.
A and B is equivalent to this game: your opponent first chooses whichever of A and B it thinks you cannot defend (challenging you), and then you must prove the goal the opponent chose. A or B is equivalent to this game: you have the right (and the obligation) to choose one of A and B to prove. False corresponds to a game you always lose. True corresponds to a game you always win.
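
These rules read off directly as Lean terms. A small sketch (again assuming Mathlib):

-- And: be ready for whichever side the opponent challenges,
-- so the proof must carry both components.
example : 1 < 2 ∧ 2 < 3 := ⟨by norm_num, by norm_num⟩

-- Or: move first and commit to the side you can defend.
example : 1 = 2 ∨ 2 = 2 := Or.inr rfl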

The epsilon-delta definition of continuity can also be understood rigorously as a game:

import Mathlib.Data.Real.Basic
import Mathlib.Tactic

def MyContinuous (f : ℝ → ℝ) :=
  ∀ x₀, ∀ ε > 0, ∃ δ > 0, ∀ x, |x - x₀| < δ → |f x - f x₀| < ε

For any epsilon (the maximum allowed error in y) demanded by your opponent, you can answer a delta (the maximum error in x) such that any x within delta of x₀ keeps f(x) within epsilon of f(x₀).
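
As a warm-up, the identity function wins its game by echoing the opponent's epsilon back as delta. A sketch against the MyContinuous definition above:

example : MyContinuous (fun x => x) := by
  unfold MyContinuous
  intro x₀ ε hε      -- the opponent demands output error ε > 0 around x₀
  refine ⟨ε, hε, ?_⟩ -- we answer δ = ε
  intro x hx         -- the opponent picks x with |x - x₀| < δ
  simpa using hx     -- the hypothesis is exactly the required bound
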
For another example, the composition of continuous functions is continuous: someone challenges you on whether g ∘ f is continuous, while you in turn may challenge hf and hg on whether f and g are continuous.

theorem comp_continuous {f g : ℝ → ℝ} (hf : MyContinuous f) (hg : MyContinuous g) : MyContinuous (fun x => g (f x)) := by
  unfold MyContinuous at *
  intro x₀ -- we are given x₀ anywhere on the real line
  intro dz hdz -- we are given dz > 0, the wanted error of z, i.e. g(f(x))
  obtain ⟨dy, hdy, hg⟩ := hg (f x₀) dz hdz -- with dz, `MyContinuous g` gives dy > 0, the allowed error of y, i.e. f(x)
  obtain ⟨dx, hdx, hf⟩ := hf x₀ dy hdy -- with dy, `MyContinuous f` gives dx > 0, the allowed error of x
  use dx, hdx -- we answer dx > 0 as the allowed error of x
  intro x hx -- given x with |x - x₀| < dx
  -- hx: x is within the error hf wants ==> f(x) is within the error hg wants ==> g(f(x)) is within the error we want
  dsimp
  exact hg (f x) (hf x hx) -- hover on the parentheses to see the statements

Notification Bot (Aug 05 2024 at 00:35):

A message was moved here from #Machine Learning for Theorem Proving > Reinforcement learning for solving a single theorem by Kim Morrison.

Jason Gross (Aug 06 2024 at 07:56):

There's an even neater connection here: if you train neural nets to play the game semantics of a particular statement, the way to turn a successful neural net into a proof is to do (mechanistic) interpretation of the learned algorithm to prove that it is a winning strategy.

Qian Hong (Sep 03 2024 at 07:33):

This is probably very off-topic, but I'd like to share something interesting: There's an online platform called Kialo.com that aims to provide a tool for rigorous debate. It organizes different sides of an argument recursively as a tree, allowing any sub-argument to be further expanded with additional support or counterarguments. This structure is somewhat similar to game semantics, where two players engage in a back-and-forth, attempting to identify flaws in each other's reasoning.

Random example from Kialo: https://www.kialo.com/the-institution-of-american-football-is-socially-harmful-10143.5?path=10143.0~10143.1_10143.5


Last updated: May 02 2025 at 03:31 UTC