Zulip Chat Archive
Stream: mathlib4
Topic: Regular languages: the review queue
Maja Kądziołka (Apr 27 2025 at 18:08):
As detailed in #24205, there's currently quite a mess of many PRs in the regular languages area of Computability, with some of the PRs overlapping each other. As @Yury G. Kudryashov suggests, it would be good to discuss how to resolve this.
The main goal these all center around is the equivalence between regular expressions and finite automata, as well as the related area of constructing automata corresponding to various operators on languages.
Here's a list of PRs that somehow overlap one another:
- a series of PRs by @Tom Kranz , that proves equivalence between REs and NFAs, avoiding the epsilon-NFA step:
- a PR by @Anthony DeRossi which does RE -> epsilon-NFA conversion #20648
- a PR by @Rudy Peterson which implements NFA union, intersection and reversal #22361, which overlaps:
One key issue is the question: do we want to have e.g. intersection on both DFAs and NFAs? They show the same underlying facts about the class of regular languages, but they could have utility should one desire to put bounds on the number of states of an automaton...
I'd be interested to hear your ideas on how to resolve this going forward!
Yury G. Kudryashov (Apr 27 2025 at 18:49):
I think that it makes sense to have operations on different types of automata, even if they have equal power in terms of languages. It would be nice if the authors of the competing PRs
- review each other's PRs;
- resolve differences in design choices, if any; feel free to tag me here with questions;
- reach the state where PRs are compatible with each other;
- tag me here with the list of relevant PRs in the order they should be reviewed. I'll try to find time to review them quickly.
Maja Kądziołka (Apr 28 2025 at 21:27):
Okay. Then another issue is whether we want both the direct RE -> NFA and the RE -> eps-NFA constructions.
Shreyas Srinivas (Apr 28 2025 at 21:59):
I think there is a reasonable conceptual distinction between the different constructions. Technically what one is proving is “given any RE, applying <insert method name> transformation, one gets an equivalent automaton”
Shreyas Srinivas (Apr 28 2025 at 22:03):
The equivalence claim can be separately expressed as “there exists a transformation such that for any RE …”
Yury G. Kudryashov (Apr 28 2025 at 22:34):
Having different constructions that lead to the same statement about languages is fine with me, as long as the constructions reference each other in their docs and explain the differences there.
Yury G. Kudryashov (Apr 28 2025 at 22:35):
E.g., one can use them to transform specific regular expressions to automata.
Chris Wong (Apr 30 2025 at 15:25):
Regarding NFA reversal.
- I'm unfortunately really busy lately so don't have time to have a close look – it would be great if someone else could help with the merge conflict.
- From a cursory look at Rudy's approach, I prefer my one. The key differences are (a) the individual parts of the reversed NFA are defined separately, and (b) using Exists + Mem instead of Disjoint, and I don't consider those changes to be an improvement.
Yury G. Kudryashov (May 16 2025 at 16:02):
Are there any updates here? E.g., can you agree on some part that is ready for review and chery-pick it to a PR (if it's not a whole existing PR)?
Chris Wong (May 19 2025 at 02:14):
I've resolved the conflicts on #21573 – given the lack of response from the others, how about we get this PR landed first?
Yury G. Kudryashov (May 19 2025 at 02:19):
I left 2 minor comments. Please address them, then we can merge it.
Chris Wong (May 19 2025 at 14:42):
Thanks Yury. I've responded to the comments :smile:
Rudy Peterson (Oct 10 2025 at 00:12):
My apologies for missing the discussion, I will remove the NFA reversal from my PR for sure.
I can also remove the NFA union from my PR, and if we still would like to have NFA intersection I can leave that in.
If this sounds good I can migrate my PR to my fork and make these changes.
Thanks! (and sorry again about the late reply)
Rudy Peterson (Oct 25 2025 at 00:30):
I've opened a new PR for NFA concatenation: https://github.com/leanprover-community/mathlib4/pull/30872
Rudy Peterson (Oct 25 2025 at 00:32):
My apologies if I missed a proof or effort somewhere else for closure of regular languages under concatenation, but if not this PR includes this as well.
Rudy Peterson (Oct 25 2025 at 00:33):
My main goal with #30872 is to construct NFA concatenation without epsilon transitions.
Rudy Peterson (Oct 29 2025 at 00:36):
So I have split up my PR #30872 about NFA concatenation into one just about NFA.acceptsFrom and helper lemmas in #31038. I've found it generally useful to have a NFA.acceptsFrom analogous to DFA.acceptsFrom.
Ching-Tsun Chou (Nov 02 2025 at 21:10):
Just wonder whether the closure of regular languages under concatenation and Kleene star will get into mathlib soon. I'm working on automata theory on infinite words in cslib and really would like to have those results without having to prove them myself.
Rudy Peterson (Nov 02 2025 at 21:19):
I'm happy to start working on kleene star closure for regular languages via epsilon NFAs
Ching-Tsun Chou (Nov 02 2025 at 21:31):
You actually don't need EpsilonNFA to prove the closure under Kleene star. You can do it directly on NFA:
https://github.com/ctchou/AutomataTheory/blob/main/AutomataTheory/Automata/Loop.lean
In fact, you can also prove the closure under concatenation on NFA directly as well:
https://github.com/ctchou/AutomataTheory/blob/main/AutomataTheory/Automata/Concat.lean
though proof is perhaps not as pretty as the one using EpsilonNFA. Unfortunately, the automata theory I showed above is not formulated in the same way as the one in mathlib, so you can't reuse the code directly.
Rudy Peterson (Nov 02 2025 at 21:41):
Yeah #30872 does concatenation closure with non-epsilon NFAs
Ching-Tsun Chou (Nov 02 2025 at 21:43):
Any idea about the time line of these results getting into mathlib?
Kevin Buzzard (Nov 02 2025 at 21:45):
The bottleneck for PRs in this area is typically finding reviewers with expertise both in Lean and in the area.
Ching-Tsun Chou (Nov 02 2025 at 21:47):
OK. Perhaps I should review them.
Chris Henson (Nov 02 2025 at 22:36):
@Ching-Tsun Chou Please feel free to use issue cslib#132 to more broadly track these Mathlib PR dependencies for automata in CSlib. As we did in that case, we have some flexibility to temporarily include code that is currently under review in Mathlib.
Chris Wong (Nov 03 2025 at 01:43):
Rudy Peterson said:
Yeah #30872 does concatenation closure with non-epsilon NFAs
Nice!
I'm away from my laptop, so can't do a full review. But I would look into using the suffices tactic. It seems like we're using simp only calls to avoid non-terminal simps, which is ok but quite verbose here.
Rudy Peterson (Nov 03 2025 at 22:50):
Chris Wong said:
Rudy Peterson said:
Yeah #30872 does concatenation closure with non-epsilon NFAs
Nice!
I'm away from my laptop, so can't do a full review. But I would look into using the
sufficestactic. It seems like we're usingsimp onlycalls to avoid non-terminal simps, which is ok but quite verbose here.
Interesting, thanks for the tip! I tried suffices in one place, and it definitely helped. I could use it more in #30872 but it seems that would involve a lot more manual effort to type out the intended subgoal than it's worth potentially...?
Ching-Tsun Chou (Nov 03 2025 at 23:51):
I just provided some feedbacks on mathlib#30872. The PR looks fine to me. If there is an approve button I can push, I would push it.
Aaron Liu (Nov 04 2025 at 00:26):
You can leave an approving review
Chris Wong (Nov 04 2025 at 00:46):
@Ching-Tsun Chou I don't see any comments from you there. Are they still pending?
Chris Wong (Nov 04 2025 at 00:52):
Rudy Peterson said:
Chris Wong said:
Rudy Peterson said:
Yeah #30872 does concatenation closure with non-epsilon NFAs
Nice!
I'm away from my laptop, so can't do a full review. But I would look into using the
sufficestactic. It seems like we're usingsimp onlycalls to avoid non-terminal simps, which is ok but quite verbose here.Interesting, thanks for the tip! I tried
sufficesin one place, and it definitely helped. I could use it more in #30872 but it seems that would involve a lot more manual effort to type out the intended subgoal than it's worth potentially...?
I think my main concern is that the proofs seem a lot more complicated than I thought they'd be. Especially given that most of it is straightforward logical manipulations. I wonder if there's a way to simplify it, whether through pulling out intermediate lemmas, tweaking definitions, or something else.
Ching-Tsun Chou (Nov 04 2025 at 02:27):
Chris Wong said:
Ching-Tsun Chou I don't see any comments from you there. Are they still pending?
Sorry. I forgot to submit my review. I just submitted it.
Ching-Tsun Chou (Nov 04 2025 at 02:53):
Barring a radically new idea, I doubt there would be a much simpler proof than the one in mathlib#30872. I used the same automata construction to prove closure under concatenation in my own automata theory project and my proof is about the same length. More importantly, I think it is way past due that we have the basic closure results for regular languages in mathlib, so that they can be used to prove other things. If a shorter proof is found later, a new PR can always be created to replace the long proof with the shorter one.
Ching-Tsun Chou (Nov 04 2025 at 03:21):
I also provided some feedbacks on mathlib#22361. There I do think the proofs of the closure of regular languages under union and intersection can be greatly simplified. The trick is to work with DFA rather than NFA. A simple product construction on DFA can be used to prove both.
Ching-Tsun Chou (Nov 04 2025 at 03:27):
Of course, there could utility in providing union and intersection constructions for NFA. But it seems to me that a higher priority should be given to proving the closure properties of regular languages by whatever means.
Rudy Peterson (Nov 04 2025 at 03:33):
Ching-Tsun Chou said:
I also provided some feedbacks on mathlib#22361. There I do think the proofs of the closure of regular languages under union and intersection can be greatly simplified. The trick is to work with DFA rather than NFA. A simple product construction on DFA can be used to prove both.
so there's also #20238, which does these for DFA
Rudy Peterson (Nov 04 2025 at 03:34):
admittedly, for #22361, i'm not sure it will be merged (at least in this form), since several other PRs already do these constructions
Rudy Peterson (Nov 04 2025 at 03:35):
My #30872 sadly also conflicts with #15651, and #15651 is more complete (and with #15649 and #15654 also does the full RE <-> NFA proofs, so my #30872 may not be needed
Rudy Peterson (Nov 04 2025 at 03:37):
if #15651 is merged then we get a lot of the closure properties for regular languages
Rudy Peterson (Nov 04 2025 at 03:38):
Ching-Tsun Chou said:
Of course, there could utility in providing union and intersection constructions for NFA. But it seems to me that a higher priority should be given to proving the closure properties of regular languages by whatever means.
yeah, my main goal was definitely to directly do these for NFA, and I agree that likely the overal regular langauges closure results are more important
Ching-Tsun Chou (Nov 04 2025 at 03:39):
The owner of mathlib#15641 etc seems to have been inactive since Feb 11. Can he be relied on to shepherd his PRs to completion?
Ching-Tsun Chou (Nov 04 2025 at 03:49):
mathlib#20238 seems to have been similarly dormant for a long time.
Rudy Peterson (Nov 04 2025 at 08:46):
In the meantime, I have created a new PR for DFA closure properties used to prove closure properties for regular languages in #31247
Rudy Peterson (Nov 04 2025 at 08:47):
I don't mean to overwhelm the maintainers, but I am happy to help get #31247 merged if the authors for the other PRs are unavilable
Rudy Peterson (Nov 04 2025 at 08:49):
@Ching-Tsun Chou what do you think of #31247?
Ching-Tsun Chou (Nov 04 2025 at 18:35):
@Rudy Peterson I provided some feedbacks on mathlib#31247.
Rudy Peterson (Nov 04 2025 at 19:30):
Ching-Tsun Chou said:
Rudy Peterson I provided some feedbacks on mathlib#31247.
Thanks!
Ching-Tsun Chou (Nov 05 2025 at 03:49):
I have approved both mathlib#30872 and mathlib#31247.
Rudy Peterson (Nov 14 2025 at 00:27):
@Ching-Tsun Chou sorry for the delay, I have a PR for kleene star closure via an NFA construction at mathlib#31610.
Admittedly, my mathlib#31610 is heavily inspired by @Tom Kranz's mathlib#15651, so I have credited this work in my comments (and if there's more I should do I am happy to oblige).
Ching-Tsun Chou (Nov 14 2025 at 02:04):
@Rudy Peterson I left some comments in mathlib#31610. I wonder if a slightly different automata construction may simplify the proof.
Last updated: Dec 20 2025 at 21:32 UTC