Zulip Chat Archive
Stream: new members
Topic: Adding DFA regular expression equivalence to mathlib
Russell Emerine (May 01 2022 at 22:15):
Hello y'all, I'm Russell, an undergrad at CMU. I have a working proof of the equivalence of regular expressions to DFAs. I'd like to add my proof to the lean 3 mathlib. How should I start a PR? I'd also like some opinions on how to formulate the key theorems and organize parts of the proofs, more details if anyone asks. Thanks to everyone here for running such an awesome project!
Yaël Dillies (May 01 2022 at 22:33):
Hey! @Fox Thomson and I had work towards it, so please show us so we can compare.
Yaël Dillies (May 01 2022 at 22:33):
Once the @maintainers have given you write access, you will be able to open a PR from the mathlib repo (not a fork).
Eric Wieser (May 01 2022 at 22:36):
(we can't give write access without a github username!)
Bryan Gin-ge Chen (May 01 2022 at 22:52):
Welcome! In case you haven't already seen it, we've got a page with some tips for new contributors: https://leanprover-community.github.io/contribute/index.html
Russell Emerine (May 01 2022 at 23:09):
Sorry, my github is just RussellEmerine
Bryan Gin-ge Chen (May 01 2022 at 23:50):
Russell Emerine said:
Sorry, my github is just RussellEmerine
Invite sent! https://github.com/leanprover-community/mathlib/invitations
Russell Emerine (May 04 2022 at 21:30):
i promise i'm working on this lol, there'll be new stuff in a new branch soon
Russell Emerine (May 05 2022 at 02:18):
Ok, now what I had already is on the branch RussellEmerine/DFA_equivalent_regular_expression
. I'll be making some of the more obvious changes soon, please let me know if there's anything in particular that needs to be fixed or reorganized.
Tom Kranz (Apr 24 2024 at 16:31):
Hi there, after seeing the TODO in the Regular Expression documentation, I've been playing around with this recently, as well – in Lean4, though. Is this still being worked on? There doesn't seem to be a mathlib4 branch for it and the mathlib branch has been dormant for about two years now. In my own implementation, I only got as far as proving ripping to preserve the language and I started out with an unnecessarily complicated language definition, so I'd be starting mostly from scratch anyway. I'd be happy to work on whatever implementation's already there or make my own if the porting effort would not be worth it.
Tom Kranz (Apr 24 2024 at 16:46):
There's also this (which solves the mystery of what were supposed to be the “Regex-like operations”!), but mathlib4 doesn't seem to have received the branches mentioned.
https://leanprover.zulipchat.com/#narrow/stream/217875-Is-there-code-for-X.3F/topic/.E2.9C.94.20Regular.20expression.20reversal/near/283293183
Yaël Dillies (Apr 24 2024 at 18:11):
No, nobody is working on it. Go wild :smile:
Paige Thomas (Apr 25 2024 at 04:26):
I did a little bit of work outside of mathlib on automatons and regular expressions here, but got stuck on trying to define the language associated with closure (the sorry on line 204). I took the definitions I found in mathlib as inspiration, but I didn't make any plans to push it to mathlib. I recently started going through a textbook on LR parsing, and am playing a bit with formalizing that as I go, which is what is at the bottom of that file.
Martin Dvořák (Apr 25 2024 at 08:31):
Are you sure you don't want to build on top of Mathlib definitions?
Keegan Perry (Apr 26 2024 at 10:23):
Hello, I've also been working a bit on this recently. I've mainly been working on the translation from a regex to an NFA by building on and translating what was done on branch#regex_nfa. So, except for the concatenation and Kleene star operators (I'm halfway done with the proof for concatenation), I've added the proofs for the correctness of the other "Regex-like operations" (that is, regex operations for NFAs) . I'm wondering if it might be worth to add the definitions for these operations into Mathlib so long?
Lukas Gerlach (Apr 26 2024 at 11:02):
Related to this, we are formalizing (part of) an undergraduate lecture at TU Dresden that covers grammars and automata. This is work in progress and mostly advances through student projects. So far, we decided to redefine DFAs (not using the mathlib definition). We established a correspondence between regular grammars and automata.
https://github.com/knowsys/Formale-Systeme-in-LEAN/tree/main/FormalSystems/Chomsky/Regular
We did not formalize regular expressions though.
Tom Kranz (Apr 26 2024 at 15:56):
@Lukas Gerlach Cool! Do you use the project in your teaching?
Lukas Gerlach (Apr 26 2024 at 16:08):
We have student projects about the formalization but we do not use this in the lecture so far.
We plan on doing so once a more substantial amount of content from the lecture has been formalized.
Also, we dream of maybe having a game similar to the ones here but this is still a very long way :D
Martin Dvořák (Apr 29 2024 at 08:19):
Lukas Gerlach said:
Related to this, we are formalizing (part of) an undergraduate lecture at TU Dresden that covers grammars and automata. This is work in progress and mostly advances through student projects. So far, we decided to redefine DFAs (not using the mathlib definition). We established a correspondence between regular grammars and automata.
https://github.com/knowsys/Formale-Systeme-in-LEAN/tree/main/FormalSystems/Chomsky/Regular
We did not formalize regular expressions though.
There seems to be an overlap with my project:
https://github.com/madvorak/chomsky
We should perhaps collaborate!
Paige Thomas (Apr 30 2024 at 03:26):
Martin Dvořák said:
Are you sure you don't want to build on top of Mathlib definitions?
I think the issue that I ran into was that the use of the instances of Add
, Mul
, One
, Zero
and Pow
for regular expressions became too abstract for me to follow.
Tom Kranz (Apr 30 2024 at 14:30):
Yaël Dillies schrieb:
No, nobody is working on it. Go wild :smile:
Since I found @Russell Emerine's implementation quite elegant, I ported it to Lean4 and now have it at a state where it typechecks and has computational content where possible. Would anybody like to take a look and give me some pointers for moving it towards inclusion in the mathlib? I've deliberately left most of the structure intact (from mathport's output) because I don't want to introduce my own biases before getting some feedback… except for simplifying inductions over the reverse of lists by using the appropriate induction principle.
https://github.com/TpmKranz/mathlib4/tree/rescueRegularExpressionNFAEquivalence
Yaël Dillies (Apr 30 2024 at 16:08):
The proofs are longish, but nothing that can't be fixed by a PR review!
Tom Kranz (Apr 30 2024 at 16:13):
Definitely; I intend to overhaul them as soon as I get some pointers for what the result should look like. So I should just make a branch on the official repo and offer the current state as a PR to open it to wider feedback?
Lukas Gerlach (May 01 2024 at 08:45):
Martin Dvořák said:
There seems to be an overlap with my project:
https://github.com/madvorak/chomskyWe should perhaps collaborate!
Indeed, we came across your project recently and we would be happy to collaborate :)
I'll DM you!
Paige Thomas (May 04 2024 at 05:51):
What is the algorithm being used for the translation of regular expressions to a DFA/NFA? I started trying to implement Thompson's algorithm, (work here), but I don't know if I am doing it correctly and what it is going to be like trying to prove that it is correct.
Tom Kranz (May 04 2024 at 09:07):
I've set up the PR now; will I have to request a review or will the appropriate people just start on it?
Yaël Dillies (May 04 2024 at 09:10):
Hopefully the latter, but feel free to request reviews if nothing happens within three days
Tom Kranz (May 04 2024 at 09:10):
@Keegan Perry the work I chose to inherit already implemented and proved correct the regular operations on (non-ε) NFA's – just in the context of Thompson's construction – but it should probably be extracted to avoid code duplication.
Paige Thomas (May 06 2024 at 01:50):
(deleted)
Tom Kranz (May 14 2024 at 18:47):
I have extracted the regular operations into NFAs now. Where would I ask for a review – in the mathlib4 stream?
Yaël Dillies (May 14 2024 at 18:47):
Ask in #PR reviews or through the github interface
Tom Kranz (May 14 2024 at 18:50):
Thanks! I swear I did not see that stream when I was looking for it. :face_with_peeking_eye:
Last updated: May 02 2025 at 03:31 UTC