Zulip Chat Archive

Stream: CSLib

Topic: Formalization of 1D Cellular Automata Theory

Henning Dieterichs (Feb 22 2026 at 22:55):

I'm very happy to hear about the new project CSLib!

I just finished the formalization and verification of various results regarding 1D cellular automata as formal language recognizers. The project is on GitHub (Lean 4 + Mathlib4, ~9,300 lines of definitions and proofs, 12 main theorems, all sorry-free and axiom-verified).

Some results from the literature that I formalized and verified:

One-way ↔ regular 1D Cellular Automata (CA) equivalence (modulo speedup factor of 2)
Quiescent and dead border constructions
Constant additive speedup and linear speedup by compression
Real-time (RT) recognition of exponential word lengths via signal bouncing

On top, I explored what power CAs gain with access to an advice (a length-preserving annotation of the input). I developed a theory of "two-stage advice" (CA transducer + right-to-left FST) and proved that RT transducers and two-stage advices are closed under composition, that two-stage advices don't extend RT-CA power, and that causal RT-closed advice is exactly CA-RT-transducer advice (more details here).

These advices simplify constructions as they allow for decomposition and could significantly simplify some proofs from the literature (especially the landmark result that L(RT-CA) = L(LT-CA) <=> L(RT-CA) = L(RT-CA-Rev), which I did not formalize).

One challenge was finding the right definitions: I iterated many times on the formalization of cellular automata until arriving at a definition that is more flexible than the standard one in the literature (e.g., no a priori constraints on the border state), but provably equivalent - the formalization includes constructions showing that quiescent or dead borders can always be imposed without changing the recognized language.

This falls under "Programming Languages, Models of Computation and Interaction → Automata" in the CSLib contributing guide. I'd be interested in discussing whether (parts of) this could fit into CSLib - particularly the base definitions and classical constructions. Happy to adapt style/structure to CSLib conventions if there is interest.

Note: while the constructions and proof strategies are my own, the Lean proof engineering was heavily AI-assisted.

Shreyas Srinivas (Feb 22 2026 at 23:22):

I think integrating this will take some time. Firstly the definitions might have to be generalized to use cslib's definition of LTSes. Secondly it is 9300 lines. Thirdly, there is a lot of golfing and simplication that's bound to be necessary with any AI assistted development

Henning Dieterichs (Feb 22 2026 at 23:36):

Thirdly, there is a lot of golfing and simplication that's bound to be necessary with any AI assistted development

That is very true, and I already did some code golfing (partially AI assisted). There are some proofs though which are just grind (that Lean's grind cannot handle), where the literature just leaves the details to the reader (for example in theorem 3.2 here - which is not part of this project). I believe that without AI assistance it might be very difficult to formalize such proofs.

Shreyas Srinivas (Feb 23 2026 at 00:06):

Depends on the time frame. Given that that an AI can generate (or assist with generating) these proofs in 9000 lines, I would put this in the realm of 2 months with minimal to no AI use.

Shreyas Srinivas (Feb 23 2026 at 00:07):

The key thing is finding good definitions that integrate well with CSLib. that will take more time.

Henning Dieterichs (Feb 23 2026 at 00:09):

I would put this in the realm of 2 months with minimal to no AI use.

do you mean full time? :)

Henning Dieterichs (Feb 23 2026 at 00:24):

The key thing is finding good definitions that integrate well with CSLib. that will take more time.

I'm wondering, isn't that also a compromise on how these definitions are used in the literature? For cellular automata there is research for all kinds of variations (different neighborhoods, 1D vs 2D, different acceptance times and positions). It is very difficult and maybe not even practical to have an abstraction that can deal with all these variations and sometimes it might be simpler to have copies and some bridges to transfer results.

Shreyas Srinivas (Feb 23 2026 at 01:13):

The literature can afford to play fast and loose with results. We need to build a library. That means building something whose parts fit together

Shreyas Srinivas (Feb 24 2026 at 13:31):

Btw, I heard about your thesis talk. All the best

Last updated: Feb 28 2026 at 14:05 UTC