Zulip Chat Archive

Stream: maths

Topic: Turing machines & gigantic numbers

DayDun (Nov 16 2021 at 00:53):

Hello :wave:, I'm working on some personal proofs for busy beavers. I'm fairly new to lean so I'm asking for a bit of advice on how to best represent this in lean.

As a simplified rundown, the busy beaver game asks, given all halting turing machines with precisely n states, what is the largest number of ones we can print on the tape, or alternatively the maximum number of steps possibly taken before halting.

There doesn't appear to exist a classic turing machine in mathlib, only a slightly more general one under computability.turing_machine.turing.TM0 which is a bit inconvenient to use for the proofs I'm constructing. Should I just implement my own type from scratch instead?

And since I'm going to be dealing with discrete instances of turing machines most of the time, should I stick to always representing symbols, states and possible turing machines as finsets? Using something like finset.max' on the set of possible turing machines to prove which machines maximizes certain properties.

Also these numbers grow incredibly large, reaching > 10^36534, even > 10^10^10^10^18705353 and beyond fairly quickly. So I suspect the default nat Peano numbers certainly won't suffice. What's the best way to deal with these kinds of numbers? Should I use num or some custom more compact representation of the numbers?

Reid Barton (Nov 16 2021 at 01:14):

nat goes well past 10^10^10^10^18705353, so you should be fine. Just don't try to compute anything!

Mario Carneiro (Nov 16 2021 at 01:15):

I would suggest you stick with using TM0, and specialize the types to meet your requirements. For the busy beaver problem, I think the usual idea is to take the alphabet to be bool and set of states to be fin k, and then the result of evaluation can be straightforwardly interpreted; the resulting constraint on k yields the busy beaver number.
Regarding the numbers growing quickly, this is not really a problem because nat is fundamentally an analysis tool. It doesn't really matter that the numbers get big unless you want to compute a particular busy beaver number, which is almost certainly impractical for n > 3.

Mario Carneiro (Nov 16 2021 at 01:16):

actually even n = 3 is a stretch, that's a lot of TM proofs. n = 2 can probably be done

Mario Carneiro (Nov 16 2021 at 01:19):

Could you be more specific about "which is a bit inconvenient to use for the proofs I'm constructing"? It is designed to be used, after all, so I'm interested in what you need from it / what you would change

DayDun (Nov 16 2021 at 01:34):

I of course don't plan on actually running a turing machine for 10^10^10^10^18705353 steps, but that doesn't mean we can't prove the value of these numbers. Clearly other people have been able to find machines that output numbers in these ranges, and prove the value or at least range of the number, I want to do the same, just in lean. Proving the value of n = 5, and maybe even 6, while it might require a significant amount of effort, should not be that unreasonable.

The reason TM0 seems a bit inconvenient is that it is quite a bit more general than I need. I would have to add a bunch of constraints such that it only identically simulates classical turing machines, so it feels like I might as well spend that effort towards implementing my own simpler representation.

Mario Carneiro (Nov 16 2021 at 01:38):

You can write the number 10^10^10^10^18705353 in lean no problem. Especially if you are using some technique to not actually simulate the TM that long it's quite reasonable to work with it

Mario Carneiro (Nov 16 2021 at 01:38):

What constraints are you thinking of?

DayDun (Nov 16 2021 at 01:44):

Well TM0 splits each state in two, allowing two different kinds of steps, one move and one write. I'd have to double the number of states, make sure the step function correctly alternates between moves and writes, and make sure when counting the number of steps taken to like divide by two or something.

Mario Carneiro (Nov 16 2021 at 01:46):

The model being used is pretty standard, the Post-Turing machine, where at each state you either do a move or a write (Turing's original formulation did both in one step, so this is a more restrictive model)

Mario Carneiro (Nov 16 2021 at 01:48):

Here's a simple definition of the busy beaver number for TM0:

import computability.turing_machine tactic.derive_fintype

open_locale classical
namespace turing

def eval_with_steps {σ} (f : σ → option σ) (s : σ) : part (ℕ × σ) :=
eval (λ a, (f a.2).map (λ b, (a.1 + 1, b))) (0, s)

namespace TM0
variables {Γ Λ : Type*} [inhabited Γ] [inhabited Λ]

def halting_time (M : machine Γ Λ) (l : list Γ) : part ℕ :=
(eval_with_steps (step M) (init l)).map (λ n, n.1)

attribute [derive fintype] turing.dir stmt
noncomputable instance machine.fintype [fintype Γ] [fintype Λ] :
  fintype (machine Γ Λ) := by unfold machine; apply_instance

noncomputable def busy_beaver
  (Γ : Type*) [inhabited Γ] [fintype Γ]
  (Λ : Type*) [inhabited Λ] [fintype Λ] : ℕ :=
finset.univ.sup $ λ M : machine Γ Λ, (halting_time M []).to_option.get_or_else 0

end TM0

-- vvv  for once it actually says noncomputable for a good reason
noncomputable def busy_beaver (n : ℕ) : ℕ :=
if n = 0 then 0 else TM0.busy_beaver bool (fin (n-1+1))

end turing

DayDun (Nov 16 2021 at 01:49):

has there been work on this already in lean?

Mario Carneiro (Nov 16 2021 at 01:50):

no, I just wrote this

Mario Carneiro (Nov 16 2021 at 01:51):

there isn't too much around TMs that has been PR'd to mathlib beyond what you can see in the turing_machine.lean file and a few other files in the directory

Mario Carneiro (Nov 16 2021 at 01:57):

And if you want a model that does both kinds of steps at once, it's a different computational model but not a particularly hard one to write:

namespace turing
namespace TM0'

section
parameters (Γ : Type*) [inhabited Γ] -- type of tape symbols
parameters (Λ : Type*) [inhabited Λ] -- type of "labels" or TM states

def stmt := Γ × dir

instance stmt.inhabited : inhabited stmt := ⟨(default _, default _)⟩

def machine := Λ → Γ → Γ × dir × option Λ

instance machine.inhabited : inhabited machine := by unfold machine; apply_instance

structure cfg :=
(q : option Λ)
(tape : tape Γ)

instance cfg.inhabited : inhabited cfg := ⟨⟨default _, default _⟩⟩

parameters {Γ Λ}
/-- Execution semantics of the Turing machine. -/
def step (M : machine) : cfg → option cfg
| ⟨none, T⟩ := none
| ⟨some q, T⟩ := let (a, d, q') := M q T.1 in some ⟨q', (T.write a).move d⟩

def init (l : list Γ) : cfg :=
⟨some (default Λ), tape.mk₁ l⟩

def eval (M : machine) (l : list Γ) : part (list_blank Γ) :=
(eval (step M) (init l)).map (λ c, c.tape.right₀)

end

end TM0'
end turing

Mario Carneiro (Nov 16 2021 at 01:59):

I would suggest something like this if you are hoping to exactly replicate some busy beaver number from the textbook (because the exact numerical value of BB(n) depends on the precise model of computation, even though its broad behavior like growing faster than any computable function does not)

DayDun (Nov 16 2021 at 02:05):

Alright, that looks like a pretty decent starting point. Thank you very much

Mario Carneiro (Nov 16 2021 at 02:18):

[Updated the post above to match the computational model defined at https://en.wikipedia.org/wiki/Busy_beaver ]

DayDun (Nov 16 2021 at 02:23):

Mario Carneiro said:

Here's a simple definition of the busy beaver number for TM0:

import computability.turing_machine tactic.derive_fintype

open_locale classical
namespace turing

def eval_with_steps {σ} (f : σ → option σ) (s : σ) : part (ℕ × σ) :=
eval (λ a, (f a.2).map (λ b, (a.1 + 1, b))) (0, s)

namespace TM0
variables {Γ Λ : Type*} [inhabited Γ] [inhabited Λ]

def halting_time (M : machine Γ Λ) (l : list Γ) : part ℕ :=
(eval_with_steps (step M) (init l)).map (λ n, n.1)

attribute [derive fintype] turing.dir stmt
noncomputable instance machine.fintype [fintype Γ] [fintype Λ] :
  fintype (machine Γ Λ) := by unfold machine; apply_instance

noncomputable def busy_beaver
  (Γ : Type*) [inhabited Γ] [fintype Γ]
  (Λ : Type*) [inhabited Λ] [fintype Λ] : ℕ :=
finset.univ.sup $ λ M : machine Γ Λ, (halting_time M []).to_option.get_or_else 0

end TM0

-- vvv  for once it actually says noncomputable for a good reason
noncomputable def busy_beaver (n : ℕ) : ℕ :=
if n = 0 then 0 else TM0.busy_beaver bool (fin (n-1+1))

end turing

is the symbols and states mixed up here? machine type is Λ → Γ → option (Λ × stmt)

Mario Carneiro (Nov 16 2021 at 02:25):

busy_beaver takes Γ first (the alphabet aka number of symbols on the tape) and then Λ (the label space aka number of states)

DayDun (Nov 16 2021 at 02:27):

yes but isn't the arguments in the wrong order when passed to machine? It should take the label space first should it not?

Mario Carneiro (Nov 16 2021 at 02:27):

all the functions take arguments in that order, including machine Γ Λ

Mario Carneiro (Nov 16 2021 at 02:28):

what you are pointing out is that machine Γ Λ is defined to be a function which takes lambda first, but we aren't applying that function here

Mario Carneiro (Nov 16 2021 at 02:29):

This function is applied inside TM0.step

DayDun (Nov 16 2021 at 02:29):

oh right, implicit arguments. Didn't think of that

Mario Carneiro (Nov 16 2021 at 02:30):

the whole section uses parameter to hide the type arguments inside the section

DayDun (Nov 16 2021 at 02:31):

yea

Yaël Dillies (Nov 16 2021 at 08:57):

For reference, we already used much bigger numbers in a proof. Szemerédi's regularity lemma involves a bound of the form 4^4^...^4.

Daniel Briggs (Dec 19 2021 at 15:26):

Hi! I'm interested in using Lean or Coq to help finish the Busy Beaver of 5. I'm using the quintuple formulation, the original formulation of Turing machines, used in Radó's original paper about the Busy Beaver and in Lin and Radó. I was coding up formal proofs in Java until they got too unwieldy at https://github.com/danbriggs/Turing and I wrote up the current state of the problem at https://github.com/danbriggs/Turing/blob/master/paper/HNRs.pdf

I just finished the Lean tutorial 00 through 09. Please message me or message here if you're interested in working on it!

Junyan Xu (Jul 02 2024 at 17:58):

BB(5) = 47,176,870 has been verified in Coq!
https://www.quantamagazine.org/amateur-mathematicians-find-fifth-busy-beaver-turing-machine-20240702/

Yaël Dillies (Jul 02 2024 at 18:24):

Congratulations, @Maja Kądziołka!

Floris van Doorn (Jul 02 2024 at 19:00):

Oh, that is really cool!

Geoffrey Irving (Jul 02 2024 at 19:33):

Presumably the last one!

Ruben Van de Velde (Jul 02 2024 at 20:02):

Next one should be in lean, clearly :light_bulb:

Kim Morrison (Jul 02 2024 at 23:47):

Maybe "BB(745) is independent of ZFC" could be in Lean!

Geoffrey Irving (Jul 06 2024 at 12:00):

Is this the biggest mathematical result yet where the proof has been announced together with the formalization?

Kevin Buzzard (Jul 06 2024 at 12:13):

Is "this" Busy Beaver 5? Definitely not! Much of the recent work of Marijn Heule has been on formal verification of incomprehensibly huge proofs, see e.g. https://www.nature.com/articles/nature.2016.19990 or https://www.quantamagazine.org/computer-search-settles-90-year-old-math-problem-20200819/

Mario Carneiro (Jul 06 2024 at 19:15):

those weren't "formalizations" though in the ITP sense

Mario Carneiro (Jul 06 2024 at 19:17):

Some of them have been subsequently formalized, like our paper on the Empty Hexagon Number to appear at ITP this year, but the formal proof of the whole result came after the SAT proof of the "hard part". This is the first time I have seen the proof itself being conducted in real time with Coq forming the collaboration focal point

Mario Carneiro (Jul 06 2024 at 19:19):

that is, it wasn't just "the proof was announced together with the formalization", the proof was actually done formally first

Last updated: May 02 2025 at 03:31 UTC