Zulip Chat Archive
Stream: CSLib
Topic: Question Lean structure and extends
Ching-Tsun Chou (Oct 29 2025 at 22:29):
In the following code, is it possible to define DFA.prod as an extension of DA.prod but still make it to have the type DFA (State1 × State2) Symbol?
structure DA (State : Type _) (Symbol : Type _) where
/-- The initial state of the automaton. -/
start : State
/-- The transition function of the automaton. -/
tr : State → Symbol → State
def DA.prod (da1 : DA State1 Symbol) (da2 : DA State2 Symbol) : DA (State1 × State2) Symbol where
start := (da1.start, da2.start)
tr := fun (s1, s2) x ↦ (da1.tr s1 x, da2.tr s2 x)
structure DFA (State : Type _) (Symbol : Type _) extends DA State Symbol where
/-- The set of accepting states of the automaton. -/
accept : Set State
/-- The automaton is finite-state. -/
finite_state : Finite State
/-- The type of symbols (also called 'alphabet') is finite. -/
finite_symbol : Finite Symbol
def DFA.prod (dfa1 : DFA State1 Symbol) (dfa2 : DFA State2 Symbol) (acc : Set (State1 × State2))
: DFA (State1 × State2) Symbol where
start := (DA.prod dfa1.toDA dfa2.toDA).start
tr := (DA.prod dfa1.toDA dfa2.toDA).tr
accept := acc
finite_state := by
have := dfa1.finite_state
have := dfa2.finite_state
infer_instance
finite_symbol := dfa1.finite_symbol
Chris Henson (Oct 29 2025 at 23:41):
I don't think I understand what you mean by "define DFA.prod as an extension of DA.prod". You could write:
def DFA.prod (dfa1 : DFA State1 Symbol) (dfa2 : DFA State2 Symbol) (acc : Set (State1 × State2))
: DFA (State1 × State2) Symbol :=
DFA.mk (DA.prod dfa1.toDA dfa2.toDA) acc (@Finite.instProd _ _ dfa1.finite_state dfa2.finite_state) (dfa1.finite_symbol)
As a sidenote, those finite instances not being inferred without have still annoys me, does anyone know if there is a way to add an instance for DFA.finite_state and DFA.finite_symbol?
Chris Henson (Oct 30 2025 at 01:27):
Can you remind me why we bundle the instances? I know a lot of thought went into the definitions, I just can't recall.
Ching-Tsun Chou (Oct 30 2025 at 02:28):
Because the "F" in DFA means "finite". Actually I'm going to advocate getting rid of finite_symbol, because it is not used in any proof and causes cslib's notion of regular languages differ from mathlib's. In the following theorem:
theorem regular_iff_cslib_dfa [Finite Symbol] {l : Language Symbol} :
l.IsRegular ↔ ∃ State : Type, ∃ dfa : Cslib.DFA State Symbol, dfa.language = l := by ...
the [Finite Symbol] assumption is there only because mathlib's Language.IsRegular does not assume that the alphabet is finite.
Chris Henson (Oct 30 2025 at 02:31):
I'm not sure I made myself clear. I meant why is Finite a field of the structure versus being an argument to it?
Ching-Tsun Chou (Oct 30 2025 at 02:38):
If the finite-ness is an argument, then the above theorem would look something like:
l.IsRegular ↔ ∃ State : Type, ∃ h : Finite State, ∃ dfa : Cslib.DFA State Symbol, dfa.language = l
Right? That's actually how mathlib's DFA works:
def Language.IsRegular {T : Type u} (L : Language T) : Prop :=
∃ σ : Type, ∃ _ : Fintype σ, ∃ M : DFA T σ, M.accepts = L
I'm not sure that is better. Personally, now that I know the "type universe hell" I encountered in my earlier automata theory project was not caused by the state type being bundled, I would prefer bundle the state type into DA and DFA, so that we can prove something like:
l.IsRegular ↔ ∃ dfa : Cslib.DFA State Symbol, dfa.language = l
which seems more elegant.
Chris Henson (Oct 30 2025 at 02:50):
It is generally recommended in Mathlib (since the move to Lean 4 at least) to unbundle. It would simplify the statement of definitions like DFA.prod and there are performance implications of bundling. See section 5.1 of this paper for some discussion.
Ching-Tsun Chou (Oct 30 2025 at 02:56):
I don't have a strong view about this. I can go either way. The unbundled design does have the advantage of making clear which finiteness assumption is used when. (In particular, we don't need to drag finite_symbol around for no real purpose.). On the other hand, the list of assumptions or quantified variables of some theorems will become longer. @Fabrizio Montesi , what do you think?
Chris Henson (Oct 30 2025 at 03:05):
I strongly advocate for unbundling. As someone who has done some category theory (one of the situations where bundling is required) I can attest to the pain it causes. The performance and potential awkwardness of definitions is not worth it IMO.
Ching-Tsun Chou (Oct 30 2025 at 04:03):
BTW, if we go for unbundling, we don't need DFA and NFA anymore. We will use DA and NA instead and the notion of the language of an automaton will be applied to DA and NA and take the accepting condition as an argument. That's actually the design I adopted in my earlier automata theory project. Similarly, instead of EpsilonNFA, we will have EpsilonNA.
Fabrizio Montesi (Oct 30 2025 at 09:31):
I'm open to discussing alternative designs for sure. My original comment was that it is very weird to call something without a finiteness assumption DFA/NFA/etc. instead of the more general DA/NA/etc.
From all the discussions we've had and my own knowledge of the literature, what is much more important is categorising the acceptance conditions. These cannot really live hardcoded in DA/NA/etc.
Chris Henson (Oct 30 2025 at 09:37):
If as the message previous suggests unbundling would preclude having seperate defs for DFA/NFA at all, I think the odd naming would be avoided?
Fabrizio Montesi (Oct 30 2025 at 09:40):
Getting rid of DFA/NFA/etc. is easy, yes. The problem we had is that then we gotta figure out where to put and what to call the places defining their acceptance conditions. That's what I need a proposal for.
Fabrizio Montesi (Oct 30 2025 at 09:41):
Take, for example, NFA and EpsilonNFA. They're essentially the same, but for the acceptance condition (which is the one given in the literature).
Chris Henson (Oct 30 2025 at 09:45):
Is this inherently linked to the other typeclass fields being (un)bundled?
Fabrizio Montesi (Oct 30 2025 at 09:50):
You mean the structures?
Fabrizio Montesi (Oct 30 2025 at 09:50):
I don't think so
Fabrizio Montesi (Oct 30 2025 at 09:51):
It's just that we need to create new terminology and abandon some well-known one from the literature, so we gotta be careful.
I'll exemplify:
Fabrizio Montesi (Oct 30 2025 at 09:52):
Currently, 'NFA' is 'NA with (1) finiteness assumptions and (2) the expected acceptance condition for NFA'.
Dropping (1) is no problem, but then it doesn't make sense to call that thing NFA (the 'F' is not there). Agreed? What should we call it then?
Fabrizio Montesi (Oct 30 2025 at 09:52):
What we care about is putting the (2) somewhere, not the (1). The latter is very easy to unbundle and put it wherever as theorem assumptions.
Chris Henson (Oct 30 2025 at 10:06):
Bear with me a bit as I am not so familiar with this area. I understand there are more conditions on accept between each structure, but to be concrete compare the following:
import Mathlib
namespace Cslib
structure DA (State Symbol: Type*) where
start : State
tr : State → Symbol → State
-- what we currently have
structure DFA {State Symbol} extends DA State Symbol where
accept : Set State
finite_state : Finite State
finite_symbol : Finite Symbol
-- an unbundled version
structure DFA' {State Symbol} [Finite State] [Finite Symbol] extends DA State Symbol where
accept : Set State
end Cslib
I am asking if the last definition is reasonable as a starting point.
Fabrizio Montesi (Oct 30 2025 at 10:08):
Yes, it'd be an improvement, imo.
Fabrizio Montesi (Oct 30 2025 at 10:17):
Re (2), continuing the example:
Currently, this sits under NFA:
/-- An NFA accepts a string if there is a multi-step accepting derivative with that trace from
the start state. -/
@[scoped grind]
def Accepts (nfa : NFA State Symbol) (xs : List Symbol) :=
∃ s ∈ nfa.start, ∃ s' ∈ nfa.accept, nfa.MTr s xs s'
but it doesn't need any of the finiteness assumptions. However, I can't put it under NA either, because EpsilonNFA is an NA as well, and it has a different definition of Accepts.
A direct approach could be to move NFA.Accepts under NA and call it the 'canonical'/'simple' acceptance condition for NAs. Then rename EpsilonNFA to EpsilonNA, get rid of εNFA entirely and just rename εNFA.Accepts into NA.εAccepts.
Fabrizio Montesi (Oct 30 2025 at 10:17):
This would cause a bit of proliferation though, because then we get εLanguage in addition to language, etc.
Fabrizio Montesi (Oct 30 2025 at 10:18):
and then more (buchiAccepts, buchiLanguage, ...)
Fabrizio Montesi (Oct 30 2025 at 10:20):
I think it'd be much better to:
- get rid of all the finiteness assumptions in
DFA,NFAandεNFA, etc. - rename them to something that is not DA/NA.
Fabrizio Montesi (Oct 30 2025 at 10:20):
εNFA could just be εNA, that's easy.
DFA could become ... ?
NFA could become ... ?
Fabrizio Montesi (Oct 30 2025 at 10:22):
Another alternative, which I still need to figure out (any ideas? :-)), would be to make manifest the concept of 'acceptance condition' and define all of them in separate files. Then, ideally, once given a notion of acceptance condition, its associate notion of 'language' is derived automatically.
Fabrizio Montesi (Oct 30 2025 at 10:22):
End of stream of consciousness about the design of automata theory... ;-)
Chris Henson (Oct 30 2025 at 10:30):
Is making Accept a typeclass helpful at all in this? Browsing the current code, it seems we have some repeated definitions like .language and .accepts_mem_language that would be expressible in terms of this, but I'm not familiar enough to know if this is always possible.
Fabrizio Montesi (Oct 30 2025 at 11:03):
We'd have multiple instantiations of that typeclass for NA, wouldn't that be a problem?
Chris Henson (Oct 30 2025 at 11:06):
Would you need more than one instance at a time? They could be scoped if that makes sense.
Fabrizio Montesi (Oct 30 2025 at 11:10):
Probably? Like for the encoding from εNFA to NFA.
Fabrizio Montesi (Oct 30 2025 at 11:20):
How would that look like?
Chris Henson (Oct 30 2025 at 12:09):
This is rough and I left out some LTS bits, but this is the shape I had in mind:
import Mathlib
namespace Cslib
structure LTS (State Label : Type*) where
Tr : State → Label → State → Prop
inductive LTS.MTr (lts : LTS State Label) : State → List Label → State → Prop where
| refl {s : State} : lts.MTr s [] s
| stepL {s1 : State} {μ : Label} {s2 : State} {μs : List Label} {s3 : State} :
lts.Tr s1 μ s2 → lts.MTr s2 μs s3 →
lts.MTr s1 (μ :: μs) s3
structure NA (State Symbol) extends LTS State Symbol where
start : Set State
-- this namepsacing is bad, but you get the idea...
class A (α Symbol: Type*) where
Accepts : α → List Symbol → Prop
open A
def A.language {α Symbol} [A α Symbol] (a : α) : Set (List Symbol) := Accepts a
def A.accepts_mem_language {α Symbol} [A α Symbol] (a : α) (xs : List Symbol) :
Accepts a xs ↔ xs ∈ language a := by rfl
structure NFA (State Symbol) [Finite State] [Finite Symbol] extends NA State Symbol where
accept : Set State
instance {State Symbol} [Finite State] [Finite Symbol] : A (NFA State Symbol) Symbol where
Accepts nfa xs := ∃ s ∈ nfa.start, ∃ s' ∈ nfa.accept, nfa.MTr s xs s'
structure εNFA (State Symbol) [Finite State] [Finite Symbol] extends NA State (Option Symbol) where
accept : Set State
-- not writing this one out because it relies on more of LTS...
instance {State Symbol} [Finite State] [Finite Symbol] : A (εNFA State Symbol) Symbol where
Accepts enfa xs := sorry
def εNFA.toNFA {State Symbol} [Finite State] [Finite Symbol] (enfa : εNFA State Symbol) : NFA State Symbol := sorry
-- just stating and not proving:
theorem toNFA_language_eq {State Symbol} [Finite State] [Finite Symbol] {enfa : εNFA State Symbol} :
language (Symbol := Symbol) enfa.toNFA = language enfa := sorry
Fabrizio Montesi (Oct 30 2025 at 12:12):
i like it, it's very similar to what I was once thinking of defining as the notion of acceptor. I'd actually call A Acceptor.
Fabrizio Montesi (Oct 30 2025 at 12:13):
We'd have an equivalent typeclass for Omega-stuff of course. (OmegaAcceptor, I guess.)
Chris Henson (Oct 30 2025 at 12:15):
For Epsilon Symbol varies to Option Symbol. Are there variants where State similarly changes?
Fabrizio Montesi (Oct 30 2025 at 12:16):
Don't think so, outside of the definitions for the subset construction, e.g.,
def NFA.toDFA (nfa : NFA State Symbol) : DFA (Set State) Symbol
Chris Henson (Oct 30 2025 at 12:19):
Hmm. I was trying to determine if that first State type being shared meant we could share some infrastructure that is independent of the Symbol.
Ching-Tsun Chou (Oct 30 2025 at 22:09):
In an unbundled design, we would have the following signatures, where S and A are types of states and alphabets respectively:
DA.language (da : DA S A) (acc : Set S) : Language A
NA.language (na : NA S A) (acc : Set S) : Language A
εNA.language (na : εNA S A) (acc : Set S) : Language A
NA.buchiLanguage (na : NA S A) (acc : Set S) : ωLanguage A
DA.mullerLanguage (da : DA S A) (accSet : Set (Set S)) : ωLanguage A
and definitions and theorems (which will be nameless for now) like:
theorem (na : NA S A) (acc : Set S) : na.toDA.language {ss | (ss ∩ acc).Nonempty} = NA.language acc
theorem {l : Language A) : l.IsRegular ↔ ∃ S : Type, ∃ da : DA S A, ∃ acc, Finite S ∧ da.language acc = l
def ωLanguage.IsRegular (p : ωLanguage) := ∃ S : Type, ∃ na : NA S A, ∃ acc, Finite S ∧ na.buchiLanguage acc = p
theorem {p : ωLanguage A) (h :p.IsRegular) : (pᶜ).IsRegular -- this is the main result of Büchi's paper
theorem {p : ωLanguage A) : p.IsRegular ↔ ∃ S : Type, ∃ da : DA S A, ∃ accSet, Finite S ∧ da.mullerLanguage accSet = p -- this is McNaughton's theorem
Note that:
- The definitions of various languages will not involve any finite-ness assumption.
- The subset construction embodied by
NA.toDAalso does not depend on any finite-ness assumption. - Finite-state assumptions come in with the notions of regular and ω-regular languages.
- All the closure results about regular languages actually do not need the finite-state assumption. Many , though not all. of the closure results about ω-regular languages do not need the finite-state assumption, either.
- Nowhere above is the finite-alphabet assumption needed. I think we need it only when proving the equivalence between regular languages and regular expressions, because you need a finite alphabet to make regular expressions finite.
Ching-Tsun Chou (Oct 30 2025 at 22:14):
I haven't thought about εNA carefully, but I believe that the εNA.toNA construction also does not depend on any finite-state assumption. BTW, it would be nice to have a NA.toEpsilonNA construction as well. I played with it a little bit and found that it is not trivial.
Ching-Tsun Chou (Oct 30 2025 at 22:17):
It is actually possible to prove all the closure properties of regular and ω-regular languages without using EpsilonNA. I did that in my earlier automata theory project.
Fabrizio Montesi (Oct 31 2025 at 11:25):
What about this?
- We get rid of
NFAandDFA, nobody wants them for now anyway (me included, they were there only because we couldn't figure out good names for the acceptors yet). - We create
NA.ExactAcceptor(instantiatesAcceptor, theAwritten by @Chris Henson),DA.ExactAcceptor, andNA.εAcceptor, which do what the currentNFA,DFA, andεNAdo but without including the finiteness assumptions. Here I had to be creative: 'exact' comes from the fact that a string is accepted if it matches exactly the sequence of encountered transition symbols. - We put all of this stuff under the
Cslib.Automatanamespace, including theAcceptortypeclass. (As @Ching-Tsun Chou suggested once.)
This design gets rid of the F in DFA and NFA, which is wrong to use since we don't need the finiteness assumptions.
For omega stuff we'll do the same: ωAcceptor, instantiated by NA.BuchiAcceptor, etc.
This scheme clarifies that the underlying structures are all the same, it's only the acceptance strategy that differs.
I'm happy to make a PR for this restructuring, but we should discuss it here first.
Chris Henson (Oct 31 2025 at 11:28):
These various Exact* are typeclass instances of Acceptor?
Fabrizio Montesi (Oct 31 2025 at 11:31):
Yes, well, structures accompanied by an instance, I guess. Like this:
structure NA.Acceptor (State : Type _) (Symbol : Type _) extends NA State Symbol where
accept : Set State
instance : Acceptor (NA.Acceptor State Symbol) Symbol where
Accepts na xs := ...
Chris Henson (Oct 31 2025 at 12:03):
This is a relevant aside that I feel we need to discuss regarding convention. You have expressed discomfort about structures named for finite automatons that do not enforce a finiteness condition. My view is that this a Lean formalization convention. Typically when defining a def or strcuture, we do not place typeclass or other Prop requirements on it, postponing these until the theorems where they are used. So yes, there is a mismatch of sorts in the name, but the theorems where this data are used all enforce these conditions. In this sense, I do not find the definitions/names of Mathlib's docs#NFA and docs#DFA so unusual.
And in fact, modulo some LTS involvement, the NA.Acceptor you have defined above seems to converge on this, but with a different name, yes?
Fabrizio Montesi (Oct 31 2025 at 12:04):
Can you point me to an example that is not NFA or DFA?
Fabrizio Montesi (Oct 31 2025 at 12:06):
(Aside: I wouldn't necessarily have that Acceptor in the same file as NA, it could be a separate file.)
Fabrizio Montesi (Oct 31 2025 at 12:12):
Also, what you wrote is exactly how we've been doing things. Perhaps I'm missing something?
What I'm contesting here is that if a structure implicitly declares that it'll have a property in its name, it should always have it, not just in some cases.
Chris Henson (Oct 31 2025 at 12:13):
I'd have to think for a sec to find a good Mathlib example, but consider the [HasFresh Var] [DecidableEq Var] condition we have in all the lambda calculus files. Terms definitely don't make sense without them, but none of the inductive types we define include these assumptions. You can construct such a term without an error, but find out when you try to use any of the theorems that there is a problem. We just don't normally think about it because it's not in the name.
Fabrizio Montesi (Oct 31 2025 at 12:27):
Term or Var (etc.) say nothing about HasFresh or DecidableEq in their names. So what you're describing is what I wanna do with NA/DA/etc.
The example you gotta give me to support your argument is one where somebody wrote something called like VarWithFresh, it contains nothing about freshness whatsoever, and then there's a bunch of theorems that have [HasFresh Var] as parameter. What I'm arguing for is that such a thing should just be called Var.
Chris Henson (Oct 31 2025 at 12:33):
I see if I can find another example, but I can understand your argument regardless.
Chris Henson (Oct 31 2025 at 12:33):
What is the advantage of putting accept : Set State in a child structure verus including it in DA and NA? Then this really is the same definition as Mathlib, but without the undesirable name. (And as was suggested here at least for DA it could just be an abbrev)
Fabrizio Montesi (Oct 31 2025 at 12:36):
For some omega-stuff and things that are not acceptors, there is no such field IIRC, but other things instead.
Chris Henson (Oct 31 2025 at 12:41):
I see, this is reasonable then. Another consideration. Instead of having a child structure like NA.Acceptor that carries the accept : Set State, this could be in the Acceptor class. I see having one fewer structure definition as an advantage, and it groups together the data that Acceptor uses. What do you think?
Fabrizio Montesi (Oct 31 2025 at 12:49):
We thought about this once and my recollection is that:
This might be reasonable, but I'm not 100% sure because I'm not sure all acceptors have exactly that set of states. Isn't it also a bit weird to 'leak' internal details of how the acceptor works?
Fabrizio Montesi (Oct 31 2025 at 12:50):
At least in omega-things, this is gonna be hard to replicate because of the different ways acceptance is modelled.
Chris Henson (Oct 31 2025 at 12:57):
That's all I could think of, your proposal sounds reasonable to me.
Ching-Tsun Chou (Oct 31 2025 at 20:59):
Let me produce a PR of an unbundled design on top of cslib#141.
Eric Wieser (Oct 31 2025 at 21:59):
Fabrizio Montesi said:
Can you point me to an example that is not NFA or DFA?
Some mathlib examples are:
- docs#MonoidHom: not actually between monoids
- docs#Module: actually a semimodule without other assumptions
- docs#RingEquiv: not actually between rings
- docs#RingCon: not actually about a ring
- docs#InnerProductSpace: actually a pre-inner product space
Eric Wieser (Oct 31 2025 at 22:02):
Fabrizio Montesi said:
What I'm contesting here is that if a structure implicitly declares that it'll have a property in its name, it should always have it, not just in some cases.
Either you over promise in the name and deliver something more general than expected, or you use a less discoverable and more verbose name and leave the user wondering why they can't find the less general thing that is wanted 90% of the time
Eric Wieser (Oct 31 2025 at 22:03):
Of course if the name implies a property that the caller must assume separately to actually get, the docstring ought to make this clear.
Ching-Tsun Chou (Oct 31 2025 at 22:20):
Well, one way to interpret the "F" in "NFA" and "DFA" is to say that it refers to the finiteness of the words being accepted, not necessarily to the finiteness of the state space. :wink:
Fabrizio Montesi (Nov 01 2025 at 08:10):
Eric Wieser said:
Fabrizio Montesi said:
What I'm contesting here is that if a structure implicitly declares that it'll have a property in its name, it should always have it, not just in some cases.
Either you over promise in the name and deliver something more general than expected, or you use a less discoverable and more verbose name and leave the user wondering why they can't find the less general thing that is wanted 90% of the time
Can't one always have also the less general thing as a bundle (like NFA), with a docstring that tells people the story? The less general thing could be useful for people who want to use the theory for applications.
Fabrizio Montesi (Nov 01 2025 at 08:11):
Ching-Tsun Chou said:
Well, one way to interpret the "F" in "NFA" and "DFA" is to say that it refers to the finiteness of the words being accepted, not necessarily to the finiteness of the state space. :wink:
Hehe, that's actually pretty good. But it also tells more about the acceptor/acceptance condition than the machine itself.. :-)
Ching-Tsun Chou (Nov 01 2025 at 08:13):
You can find in the following commit what I have in mind for an unbundled design of automata theory:
https://github.com/ctchou/cslib/commit/b9fd281d0148d2979bff97cb2a249534319a7308
The key idea is that the notion of an accepting run of an automaton (which I guess is what Chris and Fabrizio referred to as an "acceptor" above) is abstracted out and put into the file Accept.lean. Three different acceptance conditions are formalized: that for finite words, the Buchi and Muller acceptance conditions for infinite words. Those acceptance conditions can be combined with DA and NA in any way you like. Some example combinations and illustrative theorem statements are given. Finite-state assumptions will be made only when necessary. For example, they are made in the definition/theorem about regular and ω-regular languages. But they will not be made in (for example) the proof of the equivalence of NA and DA in terms of the (finite-word) languages they can accept.
I'll port the rest of cslib#141 to this design, but that will take me several days.
Fabrizio Montesi (Nov 01 2025 at 08:25):
Yes, you're basically defining acceptors. But I'd like to try out the typeclass approach (because, for example, it'll give us the resulting languages for free).
I'd like to merge the pending PRs first though, so not to impede progress (this is 'just' a restructuring). (Currently I just need a couple of merge conflicts to be fixed in cslib#141 and we're good to go).
Fabrizio Montesi (Nov 01 2025 at 13:47):
@Ching-Tsun Chou: I've merged cslib#135, but I see cslib#141 would require quite a few changes to be adapted to the new design. To minimise work, I'm opening a branch automata so that we can discuss and agree on the code before we commit to the design and do the work.
Fabrizio Montesi (Nov 01 2025 at 13:49):
I've ended up simplifying and refining things a bit, you can see my code here: https://github.com/leanprover/cslib/tree/automata/Cslib/Computability/Automata
See in particular Acceptor and the Deterministic and Nondeterministic directories. I guess this is also closer to what you had in mind, @Ching-Tsun Chou.
I still have to deal with the transformations between automata.
An example of how a bundled version and its docstring will look like is given in DFA (I still have to update the others).
Fabrizio Montesi (Nov 01 2025 at 13:58):
In OOP I'd just define an Acceptor interface and implement it from DFA. But there's no way to do this cleanly in Lean AFAIK.
Ching-Tsun Chou (Nov 01 2025 at 16:11):
After cherry-picking on top of the current main, cslib#141 should be fixed.
Chris Henson (Nov 01 2025 at 22:18):
@Fabrizio Montesi I think Acceptor looks good at your link above. Regarding it being a typeclass or structure, I am not sure which is best. There are places in Mathlib where both are defined and proven to be related, for instance the structure docs#CategoryTheory.Iso and Prop typeclass docs#CategoryTheory.IsIso. If there are different use cases here that make having both worthwhile, I think that's fine. (I didn't quite understand your OOP analogy though)
However, I continue my objection to bundling typeclasses. Eric has shown examples of the Mathlib convention, and for functions like DFA.prod (what @Ching-Tsun Chou was writing at the very beginning of this thread) it means that you have to manually work with instances in defining functions rather than them being inferred, and as cited above, we have seen already from the Mathlib port the performance and organizational benefits of unbundling. If a different name is desired that is fine, but I strongly believe that this bundling will become technical debt.
Eric Wieser (Nov 01 2025 at 22:34):
The inferring issue goes away if you declare the field with [theField : FieldType] and attribute [instance] TheType.theField (which if you insist on bundling you should definitely do), but I agree that bundling typeclasses about the parameters inside structures is a bad idea
Ching-Tsun Chou (Nov 01 2025 at 22:37):
@Chris Henson Could you elaborate more on what you mean by "bundling typeclasses"? I don't understand it yet. Thanks!
Ching-Tsun Chou (Nov 01 2025 at 22:41):
As to @Fabrizio Montesi's design, I think it is basically the same as mine, except that I have a single "acceptor" that is applicable to both NA and DA. (The other two acceptors in my Accept.lean are for the Buchi and Muller acceptance conditions of automata on infinite words.).
Chris Henson (Nov 01 2025 at 22:43):
DFA for instance is "bundled" because it has typeclass instances finite_state and finite_symbol that appear as fields.
Chris Henson (Nov 01 2025 at 22:44):
Eric Wieser said:
The inferring issue goes away if you declare the field with
[theField : FieldType]andattribute [instance] TheType.theField(which if you insist on bundling you should definitely do), but I agree that bundling typeclasses about the parameters inside structures is a bad idea
I tried this at some point, but was having some problems with typeclass inference that I didn't have time to debug.
Ching-Tsun Chou (Nov 01 2025 at 22:45):
@Chris Henson I see. Could you take a look at my design? It doesn't have any finite-ness assumption bundled.
Ching-Tsun Chou (Nov 01 2025 at 22:46):
Indeed, in my design, we won't have DFA or NFA at all.
Chris Henson (Nov 01 2025 at 22:55):
There is a "This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository." warning so I can't view the full diff. Is it in a branch somewhere?
Chris Henson (Nov 01 2025 at 23:03):
I'm not sure about the difference between yours and Fabrizio's design of Acceptor, I'll leave that for you since you both understand what we'll want for later theory better. But it seems your bundling is in line with what I was thinking. You unbundled the finiteness assumptions, then rather have a structure that just provides accept : Set State , you move this to the Acceptor structure. This seems fine to me.
Ching-Tsun Chou (Nov 01 2025 at 23:06):
As I said above, Fabrizio's "acceptor" is basically the same as mine, except that he has one for DA and another one for NA while I have a single one for both. I think we cam state and prove all the results without introducing DFA or NFA.
Ching-Tsun Chou (Nov 02 2025 at 01:53):
Here's the branch I'm working on:
https://github.com/ctchou/cslib/tree/unbundle-automata-work
Take a look at DAtoNA.lean and NAtoDA.lean, which are the unbundled counterparts of DFAToNFA.lean and NFAToDFA.lean, respectively. As you can see, nothing in the proof depends on any finite-ness assumption. Now there should be enough results to prove the closure results in RegularLanguage.lean, which I will do next. Then I'll port EpsilonNA related files and get rid of DFA, NFA, and EpsilonNFA altogether.
Chris Henson (Nov 02 2025 at 02:10):
Yes, this is the direction I would go.
One small style point about using Type*. The point is that it assigns fresh universes for each variable at once, so {State : Type*} {Symbol : Type*} is redundant and should be {State Symbol : Type*} instead. (And it is not required that you use Type*, Fabrizio for instance has expressed a preference for writing Type _ and I don't feel we need to enforce one way or the other)
Fabrizio Montesi (Nov 02 2025 at 16:52):
Ah, of course, I shouldn't have bundled the finiteness assumptions, I didn't consider they were typeclasses while making the refactoring. Does this look ok? Basically DFA bundles only the accept state set (but constructing it requires having the finiteness assumptions).
https://github.com/leanprover/cslib/blob/automata/Cslib/Computability/Automata/DFA.lean
Fabrizio Montesi (Nov 02 2025 at 16:57):
At this point, I agree that DFA might just be useless... mmmh.
Chris Henson (Nov 02 2025 at 16:58):
Yeah, this is fine, but you could go one step further and move the accept state set into the Acceptor structure and eliminate DFA.
Chris Henson (Nov 02 2025 at 16:58):
(which is what @Ching-Tsun Chou describes above)
Fabrizio Montesi (Nov 02 2025 at 16:59):
Why would acceptor contain the accept state set?
Chris Henson (Nov 02 2025 at 17:00):
Where would you place it if getting rid of DFA?
Fabrizio Montesi (Nov 02 2025 at 17:02):
Nowhere? It's just a parameter for the constructor, e.g., https://github.com/leanprover/cslib/blob/automata/Cslib/Computability/Automata/Deterministic/ExactAcceptor.lean
Fabrizio Montesi (Nov 02 2025 at 17:02):
(DA.exactAcceptor)
Chris Henson (Nov 02 2025 at 17:04):
Ah I see, yes that's fine.
Fabrizio Montesi (Nov 02 2025 at 17:34):
A hypothetical discussion:
What concerns me is that, following the same line of thought, shouldn't we unbundle start : State as well from DA and NA, because they're needed only to produce acceptors?
That is, this
def exactAcceptor (da : DA State Symbol) (acc : Set State) : Acceptor Symbol where
Accepts (xs : List Symbol) := da.mtr da.start xs ∈ acc
could become
def exactAcceptor (da : DA State Symbol) (start : State) (acc : Set State) : Acceptor Symbol where
Accepts (xs : List Symbol) := da.mtr start xs ∈ acc
If we do this, hypothetically, NA would just literally be an LTS, and all results on NA would become 'unbundled statements' on LTS. Where does this end? :o)
Fabrizio Montesi (Nov 02 2025 at 17:35):
I'm concerned because composition operations on acceptors need to know about these elements (start and accept), so statements might get pretty long.. is that a price we wanna pay?
Fabrizio Montesi (Nov 02 2025 at 17:37):
even correctness theorems get to be like this:
theorem toNA_exactAcceptor_language_eq {da : DA State Symbol} {start : State} {acc : Set State} :
(da.exactAcceptor start acc).language = (da.toNA.exactAcceptor start acc).language := by [...]
Chris Henson (Nov 02 2025 at 17:45):
Tl;dr no, don't unbundle everything
The line that is usually drawn is that we bundle some data into a definition when we want to build API around some concept. Mathlib doesn't have DA, and draws this line as docs#DFA bundling the step (our LTS.Tr), start and accept. We certainly can choose to have the corresponding structure by extending DA with just accept if we prefer to give a name to it.
Fabrizio Montesi (Nov 02 2025 at 17:51):
Let me try again with both start and accept states bundled in DA and NA.
Ching-Tsun Chou (Nov 02 2025 at 19:02):
@Fabrizio Montesi You may want to take a look at what I have done:
https://github.com/ctchou/cslib/tree/unbundle-automata-work
I think you are going down the same path as me, just with slightly different name and namespace hierarchies.
Ching-Tsun Chou (Nov 02 2025 at 19:20):
Your ExactAcceptor.Accepts is a combination of my Accept.Run and Accept.language. I store the set of accepting states as a field xxAccept.acc* in the xxxAccept structure, where the xxx corresponds to the different types of acceptance conditions . My xxxAccept.Run captures the notion of a "run" of an automaton, which is:
(1) a finite word over symbols)plus a final state in the case of finite runs, or
(2) an infinite word over symbols plus an infinite sequence of states in the case of infinite runs.
My xxxAccept.language defines what it means for an automaton to accept a run under the xxx acceptance condition. The above is separate from and orthogonal to the notions of DA and NA and the various constructions on DA and NA (such as product, subset construction, and the yet-to-be-defined sum, concatenation, and looping constructions).
Ching-Tsun Chou (Nov 03 2025 at 03:53):
I have ported all of automata theory up to cslib#141 to the new unbundled design, which can now be reviewed as cslib#142.
Ching-Tsun Chou (Nov 03 2025 at 03:57):
It shows that all finite-ness assumptions can be safely dropped, except those embedded in the definition of regular languages. In particular, Fabrizio's proofs in DFAToNFA.lean, NFAToDFA.lean, and EpsilonNFAToNFA.lean can all be carried over to DAtoNA.lean, NAtoDA.lean, and EpsilonNAtoNA.lean with very few changes.
Fabrizio Montesi (Nov 03 2025 at 07:52):
I did look at it; I think we're all perfectly in line regarding dropping the assumptions (and we have been for a long time), I just wanted to be really sure about the architecture because there are interesting lessons here.
Chris Henson (Nov 03 2025 at 09:11):
I left just a few style comments, nothing about the design.
Fabrizio Montesi (Nov 03 2025 at 09:21):
Please wait a bit more before working too much on PRs, I'm still working on the updated proposal for restructuring. ()
Fabrizio Montesi (Nov 03 2025 at 13:36):
I'm trying an idea to solve the 'missing' level of abstraction between deterministic transition systems and automata (which is present instead for NA, because it extends LTS), and see where it goes.
In the automata branch I've introduced a 'Functional LTS' structure,
/--
A Functional Labelled Transition System (`FLTS`) for a type of states (`State`) and a type of
transition labels (`Label`) consists of a labelled transition function (`tr`).
-/
structure FLTS (State : Type _) (Label : Type _) where
/-- The transition function. -/
tr : State → Label → State
plus the expected translations back and forth between LTS and FLTS and the accompanying theorems that relate transitions in LTS to those in FLTS and vice versa.
DA now extends FLTS, and adds the start and accept states. NA now does the same, but extending LTS.
Translating between them now invokes the translations between LTS and FLTS and then simply adds the translations for the start and accept state (sets).
Working like this things are pretty modular and acceptors can just be returned with no parameters, e.g.,
def DA.acceptor (da : DA State Symbol) : Acceptor Symbol where
Accepts (xs : List Symbol) := da.mtr da.start xs ∈ da.accept
and then theorems are started like this:
theorem toDA_language_eq {na : NA State Symbol} :
na.toDA.acceptor.language = na.acceptor.language := by ...
This means that composition of automata should be easy as well, because they bundle the important bits (instead of taking them as parameters to acceptor).
Acceptor is still:
structure Acceptor (Symbol : Type _) where
Accepts (xs : List Symbol) : Prop
I've pushed to automata but it's still a work in progress, I still have to remove some old stuff and gotta try epsilon automata (but they should work).
Thoughts welcome. I tried hard to find a name in the literature for FLTS (which I'm pretty familiar with), but failed: it appears one has to be created.
Chris Henson (Nov 03 2025 at 14:28):
Just to be clear, the problem this solves is that it makes NA and DA both extend analogous structures? When finished, would this remove DFA and NFA in the same way as the above PR?
If this is a common enough concept in the literature to deserve its own definition, I don't have any objection. I see from the code that this is considered a special case of LTS, so we don't need to replicate the more complex machinery like the notation attribute I am guessing?
Fabrizio Montesi (Nov 03 2025 at 14:59):
Yes. More specifically, it solves the problem that we didn't have a good place for the concept of 'transition system given by a function', which led to asymmetric hierarchies and some confusion (theorems were organised differently across DA vs NA+LTS). This allows for proving theorems where more appropriate (plenty can be stated across FLTS and LTS, without talking about any start or accept states).
This in turn leaves us free to use DA and NA for, respectively, FLTS and LTS equipped with start and accept states, and then prove there the theorems that require talking about such states.
Re the concept: there's no real established name for this concept that I could find, as I wrote, so it's a design choice. Calling 'transition system given by a function' LTS-something instead of Automaton-something seems more consistent to me and seems to give us a better hierarchy.
Ching-Tsun Chou (Nov 03 2025 at 18:10):
@Fabrizio Montesi Attaching the set of accepting states to DA or NA is a bad idea, because automata on infinite words have several types of acceptance conditions not all of which can be specified by a set of states:
- The Muller acceptance condition is a set of sets of states.
- The Rabin and Streett acceptance conditions are a set of pairs of sets of states.
-
The Buchi acceptance condition is a set of states.
Here's a good reference: https://www.imsc.res.in/~madhavan/papers/pdf/tcs-96-2.pdf
This is why I keep insisting that DA and NA should NOT carry accepting states and moved the accepting states (or sets of sets of states, etc) to Accept.lean, which is now completely orthogonal to DA/NA and can be combined with them in any manner you like. Currently cslib#142 already illustrates several combinations: -
Finite acceptance condition with DA, NA, and EpsilonNA.
- Buchi acceptance condition with NA, which is used to define the notion of ω-regular languages.
- Muller acceptance condition with DA, which is used to state McNaughton's theorem.
- More combinations will be added. For example, a relatively simple result is that DA with Buchi acceptance condition is strictly weaker than Buchi with NA and hence does not capture ω-regular languages. (This is a major difference between regular and ω-regular languages. The former is insensitive to the DA vs NA distinction, while the latter requires one to go from the simple Buchi condition to the more complex Muller or Rabin or Streett condition when moving from NA to DA.)
Ching-Tsun Chou (Nov 03 2025 at 18:13):
As to having a notion of FLTS, I think it is pretty orthogonal to the design of automata theory. We can easily switch to it just like we can easily switch to a different LTS implementation as long as the same API is kept.
Fabrizio Montesi (Nov 04 2025 at 16:06):
More progress and a reply to your last comment @Ching-Tsun Chou .
The idea is that now that we have FLTS, it takes very little effort to define different automata and we can have a bit of a flat hierarchy. Each automaton would define both its elements on top of the transition function/relation (which is now contained in FLTS/LTS) and the acceptance condition. Then we can state things very simply as in here by using Acceptor as a typeclass: https://github.com/leanprover/cslib/blob/automata/Cslib/Computability/Automata/NAToDA.lean
So now we could very easily define MullerAutomaton, BuchiAutomaton, etc. They wouldn't be extensions of NA or DA, just of FLTS or LTS.
But if you think there's value in sharing the start field in the type hierarchy, we can of course discuss that. I'm basically trying to hit the sweet spot for the hierarchy here.
What do you think?
Fabrizio Montesi (Nov 04 2025 at 16:06):
Note that the proofs NAToDA and DAToNA now look really simple, because most details could be found by grind from the theory about FLTS/LTS. (I still have to do the others.)
Ching-Tsun Chou (Nov 04 2025 at 17:57):
@Fabrizio Montesi In my previous automata theory project, I found it very natural to package the initial state/states into DA/NA in almost all the automata constructions. There were only a couple of occasions where I wanted to change the initial state/states of a DA/NA. So I think DA/NA should contain the initial state/states, so that we don't have to mention them explicitly all the time in definitions and theorems. I also think that the code you showed above demonstrates that moving more work into FLTS/LTS is really an issue orthogonal to the design of DA/NA. For example, I don't really care about where the heavy-lifting in the argument for the subset construction happens. Whether it happens in LTS or in NA is a matter of proof organization, not in the design of DA and NA.
But I do want to insist that DA and NA should NOT contain the acceptance condition, for the simple reason that there are many different types of acceptance conditions in the automata theory on infinite words. So acceptance conditions should be made orthogonal to DA/NA.
Fabrizio Montesi (Nov 04 2025 at 19:44):
So you're saying that instead of my current
structure DA (State : Type _) (Symbol : Type _) extends FLTS State Symbol where
/-- The initial state of the automaton. -/
start : State
/-- The accept states of the automaton. -/
accept : Set State
instance : Acceptor (DA State Symbol) Symbol ...
we should have
structure DA (State : Type _) (Symbol : Type _) extends FLTS State Symbol where
/-- The initial state of the automaton. -/
start : State
and then further extend this (in this example to have something like DFA; I'm making a name up here, DAExact is just for the sake of developing this argument)
structure DAExact (State : Type _) (Symbol : Type _) extends DA State Symbol where
/-- The accept states of the automaton. -/
accept : Set State
instance : Acceptor (DAExact State Symbol) Symbol ...
Right? What are you trying to accomplish by sharing the definition of 'start' across different automata definitions through this extra layer? Having a common definition of run or something similar?
Ching-Tsun Chou (Nov 04 2025 at 19:58):
@Fabrizio Montesi Please take a look at what I did in cslib#142, in particular DA.lean and NA.lean.
Ching-Tsun Chou (Nov 04 2025 at 20:02):
I just add more attributes to DA and NA like DA.accept, DA.language, DA.mullerAccept, DA.mullerLanguage, NA.buchiAccept, NA.buchiLanguage, etc.
Ching-Tsun Chou (Nov 04 2025 at 20:05):
Note that any combinations of {DA, NA} and {language, buchiLanguage, mullerLanguage} are allowed and can be instantiated when needed. (For example, I'm going to add DA.buchiLanguage and show that it is strictly less expressive than NA.buchiLanguage.)
Ching-Tsun Chou (Nov 04 2025 at 20:07):
Also take a look at Prod.lean, where I define the product of two DA's, and how this product construction is used in RegularLanguage.lean, where the closure of regular languages under union and intersection are proved using the same product construction but two different accepting conditions.
Ching-Tsun Chou (Nov 04 2025 at 20:12):
Also take a look at OmegaRegularLanguage.lean, where I define ω-regular languages using NA.buchiLanguage and state McNaughton's theorem using DA.mullerLanguage. (To be sure, it will take 1000s lines of code to prove that theorem.)
Ching-Tsun Chou (Nov 04 2025 at 20:17):
Now, if I don't have the initial state/states in DA/NA (or whatever we choose to call them), then they will have to be supplied as explicit parameters in all the above definitions and theorems like how the set of accepting state (acc) and the set of set of accepting states (accSet) are explicitly specified. From my experience, I think that is one unbundling step too far. In my view, bundling the initial state/states but not the acceptance condition strikes the right balance.
Fabrizio Montesi (Nov 05 2025 at 10:46):
In my proposal (which is to have FLTS/LTS with only the transition function/relation + automata structures that extend those with bundled start and accept states), you'd get DA, NA, MullerAutomaton, BuchiAutomaton, etc., each instantiating either Acceptor or ωAcceptor (now added to the branch; I don't wanna add the rest for now because I'd like to import your commits there and merge somehow, to keep the right history). You have the same proliferation with the *Accept structures.
Some statements would become more ergonomic, because you wouldn't have to bring around the acceptance states as parameter.
For example IsRegular and McNaughton's would become:
def IsRegular (p : ωLanguage Symbol) :=
∃ State : Type, ∃ a : BuchiAutomaton State Symbol, language a = p
proof_wanted IsRegular.iff_muller_lang {p : ωLanguage Symbol} :
p.IsRegular ↔ ∃ State : Type, ∃ a : MullerAutomaton State Symbol, language a = p
I see what you're doing with prod in the proofs for RegularLanguages (clever!). But these probably merit to be actual operations defined for DAs: intersection and union. Their definitions could reuse a shared base by defining FLTS.prod, where the heavy lifting would be done (the product of the transition functions).
So all in all, the complexity is very similar but we get rid of having the accept parameter all over the place in places where it might be weird, like those above or
theorem toDA_language_eq {na : NA State Symbol} {acc : Set State} :
na.toDA.language { S | ∃ s ∈ S, s ∈ acc } = na.language acc
which in my branch is just
theorem toDA_language_eq {na : NA State Symbol} :
Acceptor.language na.toDA = Acceptor.language na
The only thing gained from an extra layer of indirection (as presented in my previous message) that I can see is having a modular notion of 'run' independent from accept states/conditions. But I don't see you using it without these conditions anywhere, so we could just use Accepts or MTr/mtr there.
Hence my suggestion of my current design in the automata branch, with the opening of extending it further with another layer of abstraction in the future if we see a good opportunity.
(Btw, I'm happy to try and merge your results and development in this branch before I propose this for main, so that it all goes in at the same time. I just wanna agree here before we do too much work on porting to this or that other design...)
Ching-Tsun Chou (Nov 05 2025 at 19:42):
@Fabrizio Montesi The product construction on DA is not the only occasion when you don't want the automaton to carry the acceptance condition. There are quite a few other occasions:
The same automata construction on NA can be used to show that given regular languages L and L' and ω-regular language P, L ∗ L' is regular and L ∗ P is ω-regular.:
https://github.com/ctchou/AutomataTheory/blob/main/AutomataTheory/Automata/Concat.lean
If NA has to carry the acceptance condition, I need to define two different automata concatenation operations, one concatenating two NFAs and a second one concatenating an NFA and a BuchiNA.
The same automata construction on NA can be used to show that given a regular language L, L∗ is regular and L^ω is ω-regular:
https://github.com/ctchou/AutomataTheory/blob/main/AutomataTheory/Automata/Loop.lean
If NA has to carry the acceptance condition, I need to define two different automata looping operations, one producing an NFA and a second one producing a BuchiNA.
To show that the intersection of two ω-languages is still ω-regular, I first construct the product of the two BuchiNA and then add "history variables" to the product:
https://github.com/ctchou/AutomataTheory/blob/main/AutomataTheory/Automata/Prod.lean
https://github.com/ctchou/AutomataTheory/blob/main/AutomataTheory/Automata/Hist.lean
https://github.com/ctchou/AutomataTheory/blob/main/AutomataTheory/Automata/OI2.lean
If I have to work on BuchiNA directly, the two steps need to be collapsed together because the intermediate product automaton result does not have meaningful liveness behavior to speak of (in other words, it is naturally modeled as a NA without an acceptance condition).
The act of adding history variables mentioned above is an operation that has meaningful safety properties, but no meaningful liveness properties:
https://github.com/ctchou/AutomataTheory/blob/main/AutomataTheory/Automata/Hist.lean
If I'm forced to work on BuchiNA or MullerDA, I'll have to define automata constructions where the acceptance conditions are carried around but not really used. Note that adding history variables is not used in just the above result, it is also used in proving other results:
https://github.com/ctchou/AutomataTheory/blob/main/AutomataTheory/Automata/Pair.lean
So, by separating out the acceptance condition from the automaton proper, we get more modular and reusable proofs and reduce the number of distinct automata constructions we need to deal with. If the appearance of acc and accSet really bothers you, you can always define (say) BuchiNA which has the accepting states bundled in and state the final theorems in terms of the bundled entities. But I still want to do my proofs in terms of the unbundled NA and DA, because I've learned from experience that doing otherwise doesn't make any sense.
Ching-Tsun Chou (Nov 05 2025 at 20:26):
BTW, the fact that union and intersection can be implemented using the product construction on DA is not just a happy coincidence. It reflects a fundamental truth about deterministic machines. Imagine that we form the product of any number of DAs. Then any boolean operation on the accepted languages of those DAs can be implemented by the corresponding operation on the accepting sets. This even works for ω-automata with basically the same proof, though you'll need to use the Muller acceptance condition.
Fabrizio Montesi (Nov 06 2025 at 13:42):
Ching-Tsun Chou said:
BTW, the fact that union and intersection can be implemented using the product construction on DA is not just a happy coincidence. It reflects a fundamental truth about deterministic machines. Imagine that we form the product of any number of
DAs. Then any boolean operation on the accepted languages of thoseDAs can be implemented by the corresponding operation on the accepting sets. This even works for ω-automata with basically the same proof, though you'll need to use the Muller acceptance condition.
Of course, I agree. My suggestion was to generalise this to (F)LTSs.
Fabrizio Montesi (Nov 06 2025 at 14:01):
Ching-Tsun Chou said:
Fabrizio Montesi The product construction on DA is not the only occasion when you don't want the automaton to carry the acceptance condition. There are quite a few other occasions:
The same automata construction on NA can be used to show that given regular languages L and L' and ω-regular language P, L ∗ L' is regular and L ∗ P is ω-regular.:
https://github.com/ctchou/AutomataTheory/blob/main/AutomataTheory/Automata/Concat.lean
If NA has to carry the acceptance condition, I need to define two different automata concatenation operations, one concatenating two NFAs and a second one concatenating an NFA and a BuchiNA.The same automata construction on NA can be used to show that given a regular language L, L∗ is regular and L^ω is ω-regular:
https://github.com/ctchou/AutomataTheory/blob/main/AutomataTheory/Automata/Loop.lean
If NA has to carry the acceptance condition, I need to define two different automata looping operations, one producing an NFA and a second one producing a BuchiNA.To show that the intersection of two ω-languages is still ω-regular, I first construct the product of the two BuchiNA and then add "history variables" to the product:
https://github.com/ctchou/AutomataTheory/blob/main/AutomataTheory/Automata/Prod.lean
https://github.com/ctchou/AutomataTheory/blob/main/AutomataTheory/Automata/Hist.lean
https://github.com/ctchou/AutomataTheory/blob/main/AutomataTheory/Automata/OI2.lean
If I have to work on BuchiNA directly, the two steps need to be collapsed together because the intermediate product automaton result does not have meaningful liveness behavior to speak of (in other words, it is naturally modeled as a NA without an acceptance condition).The act of adding history variables mentioned above is an operation that has meaningful safety properties, but no meaningful liveness properties:
https://github.com/ctchou/AutomataTheory/blob/main/AutomataTheory/Automata/Hist.lean
If I'm forced to work on BuchiNA or MullerDA, I'll have to define automata constructions where the acceptance conditions are carried around but not really used. Note that adding history variables is not used in just the above result, it is also used in proving other results:
https://github.com/ctchou/AutomataTheory/blob/main/AutomataTheory/Automata/Pair.leanSo, by separating out the acceptance condition from the automaton proper, we get more modular and reusable proofs and reduce the number of distinct automata constructions we need to deal with. If the appearance of
accandaccSetreally bothers you, you can always define (say) BuchiNA which has the accepting states bundled in and state the final theorems in terms of the bundled entities. But I still want to do my proofs in terms of the unbundled NA and DA, because I've learned from experience that doing otherwise doesn't make any sense.
I'm a bit confused because you keep referring to not bundling the acceptance condition ('when you don't want the automaton to carry the acceptance condition', etc.), so let me try to clarify: I'm doing that too, I'm just doing it in another place to make things homogeneous (FLTS/LTS).
That said, your links and arguments are still very useful, because the operators are important.
Nevertheless, the prod and concat operators would be defined for (F)LTS and then automata would reuse it (they'd even have the same signature as far as I can see at a glance). Similarly for the other operations. One of my aims is exactly to give you (even better) modularity, and obtain a symmetric type hierarchy for deterministic and nondeterministic automata.
The key point is whether it's useful enough to have an 'intermediate' structure that extends FLTS with just a start state, and another that extends LTS with a set of start states. So, for example, FLTS (transition function) -> PreDA (start state) -> DA (accept states) vs FLTS (transition function -> DA (start + accept states).
Bundled designs could then extend those intermediate structures, instead of FLTS/LTS directly. I'd like to understand why these intermediate structures are so useful, compared to just formulating operators on FLTS/LTS and extend those directly.
(Sidenote: the names of the structures can be changed, now or later, I'm more interested in the abstract hierarchy here.)
Ching-Tsun Chou (Nov 06 2025 at 17:14):
@Fabrizio Montesi I think I understand what you want to do. How about I elaborate cslib#142 with what I think you want and then you can review it?
Fabrizio Montesi (Nov 06 2025 at 17:24):
The structure I propose (without the intermediate structure) is already in the automata branch. It needs some fixes to EpsilonNA and to integrate your two PRs. We should:
- Decide if we want the intermediate layer or not (I'd first try without, and then if we see a problem with some part becoming inelegant we add it).
- Be sure we agree on the way I'm using typeclasses (Acceptor and OmegaAcceptor).
Then we could switch to working on the same codebase so that we start building a linear history and ensure the APIs are aligned: merge the code from your PRs into the automata branch (accepting that stuff will be broken) and deal together with all the breakages (I could deal with porting some of your defs to FLTS/LTS and you with the omega-related stuff).
When we're satisfied, we just get a simple review on the proofs and merge the automata branch into main.
Fabrizio Montesi (Nov 06 2025 at 17:31):
Or you can just shift your PRs to the automata branch, I review it there and then we merge, I guess? Basically I'm trying to figure out what's the quickest path to get the structure down, before we do too much work adjusting code (would like to avoid doing it too many times..!).
Ching-Tsun Chou (Nov 06 2025 at 17:47):
cslib#142 has more stuff to exercise the hierarchy on.
Ching-Tsun Chou (Nov 06 2025 at 17:51):
It has some actual proofs about regular languages and some definitions and theorem statements about omega automata. We need to exercise the hierarchy especially on the latter to see that it actually works.
Ching-Tsun Chou (Nov 07 2025 at 02:13):
@Fabrizio Montesi I just pushed a new commit to cslib#142. Please take a look.
There are now 6 structures extending DA and NA, {DA,NA}.{FinAcc,Buchi,Muller}, for the finite, Buchi, and Muller acceptance conditions under DA and NA. Each structure bundles in the set of accepting states (for FinAcc and Buchi) or the set of set of accepting states (for Muller). Each of the 6 combinations also defines a namespace under which are the Accept and language definitions and the mem_language theorem. The file Accept.lean is no longer needed and hence deleted. See also RegularLanguage.lean and OmegaRegularLanguage.lean for how the above can be used. Note that the automata constructions and their associated proofs are basically unchanged. I also added an FLTS structure (which DA extends) that should be moved to its own file (which can be done with this or a different PR).
Ching-Tsun Chou (Nov 07 2025 at 02:18):
There is also a εNA.FinAcc. Buchi and Muller acceptance conditions are not defined for εNA because εNA is useless in the automata theory on infinite words.
Fabrizio Montesi (Nov 07 2025 at 10:16):
Ching-Tsun Chou said:
Fabrizio Montesi I just pushed a new commit to cslib#142. Please take a look.
There are now 6 structures extending DA and NA, {DA,NA}.{FinAcc,Buchi,Muller}, for the finite, Buchi, and Muller acceptance conditions under DA and NA. Each structure bundles in the set of accepting states (for FinAcc and Buchi) or the set of set of accepting states (for Muller). Each of the 6 combinations also defines a namespace under which are the
Acceptandlanguagedefinitions and themem_languagetheorem. The file Accept.lean is no longer needed and hence deleted. See also RegularLanguage.lean and OmegaRegularLanguage.lean for how the above can be used. Note that the automata constructions and their associated proofs are basically unchanged. I also added an FLTS structure (which DA extends) that should be moved to its own file (which can be done with this or a different PR).
Thank you, great effort!
Looking at your code, I think we're in line on the hierarchy now. I also like that you've namespaced the different automata under DA.
There's quite a bit of work I'd done in the automata branch that I don't wanna redo/review from scratch or risk forgetting -- FLTS and its theory, the Acceptor typeclass (which is different from Accept), plenty of documentation, and more -- so I'm gonna try to merge the two developments into a single branch that we can review together. I think now we're near enough that it's worth the effort -- it's likely gonna be immediately right or almost there.
Chris Henson (Nov 07 2025 at 10:28):
Just curious, have either of you run into more grind issues like cslib#88 during this work? I feel it has something to do with how grind handles extended structures, but I wasn't able to figure exactly what the issue was.
Fabrizio Montesi (Nov 07 2025 at 10:28):
A bit randomly, I still have to understand these issues better.. in my last design I didn't have any such issues. I hope this still holds when I'm finished merging @Ching-Tsun Chou's work (WIP).
Fabrizio Montesi (Nov 07 2025 at 10:29):
(If it all works out as I think, I might have at least something interesting to say about it though. :-) )
Fabrizio Montesi (Nov 07 2025 at 12:02):
It did all work out, and more: it's a pretty pleasant development.
See cslib#144, which merges the works that @Ching-Tsun Chou and I have been doing in the automata branch and his PRs. It should all be in there.
Fabrizio Montesi (Nov 07 2025 at 12:03):
@Chris Henson I managed to get far with grind by defining transformations between automata based on transformations defined for the underlying structures (e.g., FLTS to LTS and vice versa), and using record update notation.
Ching-Tsun Chou (Nov 07 2025 at 17:51):
I created a PR cslib#145 to collect my modifications to cslib#144.
Last updated: Dec 20 2025 at 21:32 UTC