Zulip Chat Archive

Stream: general

Topic: Regular languages in Lean

Noam Atar (Mar 05 2021 at 21:35):

Hey! I formulated regular languages and proved the Myhill-Nerode theorem in Lean. The code is right here:
https://github.com/atarnoam/lean-automata

If you want to read it and have suggestions, I'd love to hear them!
I'd also like to upload this to mathlib. Can I have permissions to do pull requests to mathlib? (My github username is atarnoam)

Bryan Gin-ge Chen (Mar 05 2021 at 21:50):

@Noam Atar Invite sent! https://github.com/leanprover-community/mathlib/invitations

I haven't looked at your code yet, but there's a PR in progress on regular languages by @Fox Thomson at #6225.

Noam Atar (Mar 05 2021 at 22:03):

Oh, interesting! I see he decided to wait for the definition of regularity. I'll wait too then

Bryan Gin-ge Chen (Mar 05 2021 at 22:10):

I don't know of anyone other than Fox that is working on regular automata and languages for mathlib right now (please correct me if I'm wrong!), so it would probably make sense for you two to coordinate your efforts.

Fox Thomson (Mar 06 2021 at 11:29):

Yes sorry I forgot to close that PR, I think it makes more sence to state regular languages in terms of regular expressions, once we have this then we can state things like M-N and the pumping lemma with it but for now I'd say to get it into mathlib without the is_regular

Alexandre Rademaker (Mar 06 2021 at 15:07):

Hi @Noam Atar, nice work, do you have any documentation or paper explaining details? I would be very interested in the learnings during the experiment. For instance, when I see https://github.com/atarnoam/lean-automata/blob/main/src/regular_languages.lean#L124-L140 I am puzzled if that was the only way to make it. Yes, I know https://en.wikipedia.org/wiki/Curry–Howard_correspondence, but yet, some decisions probably have reasons behind them.

Noam Atar (Mar 06 2021 at 15:17):

@Fox Thomson What part do you mean?

Noam Atar (Mar 06 2021 at 15:22):

@Alexandre Rademaker Hey, I just wrote it for fun and to learn lean. What you asked about is the standard way to build a DFA out of a regular language, as in the proof of M.N. You can read about it here: https://en.wikipedia.org/wiki/Myhill%E2%80%93Nerode_theorem

Fox Thomson (Mar 06 2021 at 16:50):

@Noam Atar Stating things like instance fintype_language_space_of_regular (l : language α) (M : DFA α σ) [fintype σ] (h : M.accepts = l) : fintype (space l) which should be easily refactorable when we have a flexible is_regular

Alexandre Rademaker (Mar 10 2021 at 13:09):

Hi @Noam Atar, what is unusual for me is the definition with a tactic mode proof.

Noam Atar (Mar 12 2021 at 08:09):

@Alexandre Rademaker tactic mode is just a nice way to work to get a term of some type. It doesn't care if that term is a definition or a proof!

Kevin Buzzard (Mar 12 2021 at 09:07):

It matters if you make a definition using a complicated tactic and it turns out to be a completely unwieldy term that you can't reason about easily

Noam Atar (Mar 13 2021 at 15:39):

Of course

Bjørn Kjos-Hanssen (Jul 08 2022 at 21:09):

It's a bit awkward how lists are (by their inductive definition) extended on the left and consequently DFAs work that way too. So a word like lean is processed by the DFA in the order n,a,e,l. Any easy way around that?

Mario Carneiro (Jul 10 2022 at 06:40):

@Bjørn Kjos-Hanssen you could redefine what "process" means to process the characters from the front and work your way down the list; this makes more sense to me anyway

Mario Carneiro (Jul 10 2022 at 06:40):

i.e. a DFA processes a :: l if s steps via a to s' and s' processes l

Bjørn Kjos-Hanssen (Jul 10 2022 at 18:51):

@Mario Carneiro well, it's just the opposite convention from most treatments of automata. But yeah, in a way it doesn't matter.

Mario Carneiro (Jul 11 2022 at 02:14):

@Bjørn Kjos-Hanssen I think you misunderstood me. I mean that "lean" should be written as [l, e, a, n] and it should be processed in the order l,e,a,n

Mario Carneiro (Jul 11 2022 at 02:15):

and this should be ensured by writing the "DFA processes l" definition so that elements are pulled from the front of the list instead of the back

Matt Diamond (Jul 11 2022 at 02:45):

since DFA.eval_from uses foldl, isn't it already pulling elements from the "front"?

Matt Diamond (Jul 11 2022 at 02:46):

oh I guess you mean that the input list has to be constructed via prepends... but the folding itself works left-to-right

Bjørn Kjos-Hanssen (Jul 13 2022 at 17:12):

@Matt Diamond Anyway there is eval_from_append_singleton which lets you work in the expected direction

Chris Wong (Feb 24 2024 at 10:21):

I wonder, would it be better to define IsRegular in terms of Nerode congruences instead? Since they capture the states of a DFA, without specifying an explicit type for the states. In particular, a regular expression doesn't easily provide a complement operation, whereas a DFA / Nerode congruence does.

Chris Wong (Feb 24 2024 at 14:05):

I may have spoken too early. Proving the complement is doable, but union/intersection escapes me...
https://gist.github.com/lambda-fairy/bd5f7206a8a41e77b4be6fee66c94cd2

Martin Dvořák (Feb 26 2024 at 09:08):

Chris Wong said:

I wonder, would it be better to define IsRegular in terms of Nerode congruences instead?

I don't think it is correct to say "instead". We don't have a predicate IsRegular saying "this Language is regular". Note that docs#IsRegular points elsewhere. I would definitely like to see such a predicate added to the Computability directory. See my suggestion* here:
https://leanprover.zulipchat.com/#narrow/stream/287929-mathlib4/topic/Rename.20DFA.3F/near/417447876
Cf. IsContextFree in my PR:
https://github.com/leanprover-community/mathlib4/blob/51e0dbce7d041f4b12e20f94ea2713ddb55fbd66/Mathlib/Computability/ContextFreeGrammar.lean#L195

(*) I am not arguing about which definition of regular languages should be used (whether it should be by a finite docs#DFA or by docs#RegularExpression or by Nerode congruences). I am just saying I would like to see it as a predicate on languages.

Chris Wong (Feb 27 2024 at 05:49):

Agreed, I'm just responding to the discussion on #6225, in particular the last comment on whether we should use IsRegular = ∃ RegularExpression. Apologies for the lack of context (pun intended :wink: )

Last updated: May 02 2025 at 03:31 UTC