Zulip Chat Archive

Stream: mathlib4

Topic: Naming clarification


Richard Osborn (Nov 27 2022 at 16:38):

I have been trying to familiarize myself with the naming conventions for mathlib4, and I have been having some difficulty wrapping my head around it. I have noticed a number of discrepancies in how things are named vs how the wiki prescribes which makes me believe others are also confused and/or disagree with how the naming scheme is currently defined. I am hoping that my questions can clear things up for myself and others.
First of all, it's my understanding that theorems are all snake_case and are the only examples of the snake_case naming convention. With that in mind, I am still confused by the following.

UpperCamelCase vs lowerCamelCase
Currently, the wiki states that inductive datatypes (of which structure and class are special cases) should be UpperCamelCase and everything else be lowerCamelCase. There are a significant number of defs which are UpperCamelCase in mathlib4, and even ENNReal is defined using a def. It seems as though the inductive datatype vs def split isn't in line with current expectations. From what I can tell, anything in mathlib4 that returns a Sort is capitalized as UpperCamelCase. Is this what people expect? Should there be a Prop vs Type split?
Appealing to Core and Std doesn't seem to help much as their naming convention seems inconsistent. List.isPrefix,List.Disjoint, and outParam seem arbitrarily capitalized.

Prefixes
The wiki also establishes that classes should have their Is and Has prefixes removed. In certain cases, this has led to name clashes. See IsEmpty class vs Empty datatype. Is this a case where the 'is' prefix is warranted? Should either of them be renamed? Are there other similar situations that should be considered? See also IsDomain, IsLeftCancelMul, IsSymmOp, etc...
For datatypes that are not classes, is there a convention for when someone should include a verb in a name or is it best to leave it up to author's discretion? See IsUnit, IsTop, IsPartialInv.
Finally, should all the classes with the Has prefix be renamed? We currently define HasCompl, HasSup, HasInf, HasUncurry, HasInvolutiveNeg, HasDistribNeg, HasQuotient, and a few others.

Sorry for the long post, if this needs to get split up into multiple discussions, I'd be happy to oblige.

Arien Malec (Nov 27 2022 at 17:44):

Prop-typed defs are UpperCamelCase

Arien Malec (Nov 27 2022 at 17:45):

On typeclasses, I think there’s a tension between following Lean 4 conventions & reducing renaming

Arien Malec (Nov 27 2022 at 17:45):

That section may require revisions

Arien Malec (Nov 27 2022 at 17:45):

I based it on Lean 4 conventions.

Andrew Yang (Nov 27 2022 at 18:11):

The Prop-Type split means that def bar_baz : foo for foo : Prop, and def barBaz : foo for foo : Type _. In the special case of foo = Prop, since Prop : Type 1, it is the latter.
Edit: but Sort* are exeptions and def BarBaz : Sort*.

Arien Malec (Nov 27 2022 at 18:14):

That’s worth clarifying in the wiki.

Richard Osborn (Nov 27 2022 at 18:17):

Ok, so when defining terms, it is split between snake_case and lowerCamelCase.

Richard Osborn (Nov 27 2022 at 18:22):

and def barBaz : Foo for Foo : Type would be the correct capitalization?

Andrew Yang (Nov 27 2022 at 18:27):

Yes. String.isPrefix should be String.IsPrefix. List.Disjoint is correct as is. outParam should be OutParam but it is to be in core so yeah.

Andrew Yang (Nov 27 2022 at 18:27):

Or maybe outParam is morally a term of the type Type -> Type so it is not UpperCamelCase.

Richard Osborn (Nov 27 2022 at 18:44):

Would this mean with_top should be translated as withTop?

Andrew Yang (Nov 27 2022 at 18:56):

Is it so? Then docs4#Not should probably be named not since it is morally Prop -> Prop.

Jireh Loreaux (Nov 27 2022 at 19:04):

Actually I take it back, I'm not sure in this case.

Richard Osborn (Nov 27 2022 at 19:04):

Perhaps what is confusing me is that withTop in mathlib4 would be defined as

def withTop (α : Type) := Option α

but it could just have easily been defined as

inductive WithTop (α : Type) where
  | top   : WithTop α
  | some  : α  WithTop α

Should these have different capitalizations?

Andrew Yang (Nov 27 2022 at 19:11):

Either case the result type signature is Type and it is WithTop.

Jireh Loreaux (Nov 27 2022 at 19:11):

This should still be WithTop because Option : Type u → Type u and hence Option α : Type u

Thomas Murrills (Nov 27 2022 at 19:11):

I think we still have Option α : Type, so even in the def, it should still be WithTop since the def entails WithTop α : Type, and I think type-valued terms are UpperCamelCase.

I know And : Prop → Prop → Prop is given as an example of the use of UpperCamelCase for Prop-valued terms. Perhaps the wiki should be updated to explicitly include “Prop- and Type-valued terms” if not already updated?

Thomas Murrills (Nov 27 2022 at 19:12):

Oops, I guess we were all writing the same message at the same time :upside_down:

Jireh Loreaux (Nov 27 2022 at 19:12):

Or just Sort-valued terms

Andrew Yang (Nov 27 2022 at 19:13):

The wiki already mentions this. But I think the wiki mentioning outParam as an example would cause confusion though.

Richard Osborn (Nov 27 2022 at 19:13):

Jireh Loreaux said:

Or just Sort-valued terms

This agrees with what I've seen people doing so far when porting over files.

Thomas Murrills (Nov 27 2022 at 19:15):

The wiki doesn’t actually mention this—it just says “Props and Types”, which doesn’t cover e.g. things like WithTop of type Type → Type and apparently has caused some confusion.

Andrew Yang (Nov 27 2022 at 19:19):

It is def WithTop (α) : Type := sorry, so it stricly falls in the "Props and Types" category. But maybe wiki can somehow make it clearer.

Richard Osborn (Nov 27 2022 at 19:20):

So if I am understanding everything correctly, terms of Prop are snake_case, terms of Type are lowerCamelCase, any Props or Types should be UpperCamelCase. Functions should be capitalized based on their return type.

Richard Osborn (Nov 27 2022 at 19:24):

Should I split off my questions about naming prefixes into a new discussion to avoid cluttering this one?

Thomas Murrills (Nov 27 2022 at 19:27):

Hmm…strictly speaking, it’s only WithTop : Type → Type that gets named with UpperCamelCase, not WithTop α, and so I don’t think the wiki covers it when interpreted precisely…i think it might be clearest to explicitly include type and prop families in the wiki, to head off precise/literal interpretations of the naming conventions

Andrew Yang (Nov 27 2022 at 19:29):

The sentence "Functions should be capitalized based on their return type" should probably be added to the wiki.

Andrew Yang (Nov 27 2022 at 19:31):

Along with the removal of outParam in the example (or explicitly point out that this is an exception).

Thomas Murrills (Dec 13 2022 at 21:40):

I'm finally editing the wiki to reflect this! :)

Thomas Murrills (Dec 13 2022 at 21:44):

One thing I'd like to add: something that reflects the fact that words which are abbreviated to strings greater than one character (e.g. Mul or Pow) are treated the same as unabbreviated words. This is true, right?

Kevin Buzzard (Dec 13 2022 at 21:46):

you mean "is the fact that Add is a word whereas Mul is short for Multiplication irrelevant to the naming convention"? If so then yes.

Thomas Murrills (Dec 13 2022 at 21:47):

I'm porting order.circular (just to try porting something), and it provides an interesting case. Lean 3 uses btw : α → α → α → Prop and sbtw : α → α → α → Prop for "between" and "strictly between". My feeling is that these should be named Btw (Prop-valued so UpperCamelCase, then in parallel to Mul) and SBtw ("strictly" abbreviated to one character, then Btw—same reasoning).

Yaël Dillies (Dec 13 2022 at 21:49):

I would go for Sbtw, with the idea that we shouldn't put two caps one after the other.

Thomas Murrills (Dec 13 2022 at 21:50):

But Btw stands in for a single word, right? So shouldn't it be capitalized as a word within an UpperCamelCase name? (And don't we put two capitals after each other in acronyms, e.g. LT)? Or should this be an exception?

Yaël Dillies (Dec 13 2022 at 21:54):

I think the acronym convention is unfollowable and we already have a zoo of them: Eq, Ne, LE, LT.

Winston Yin (Dec 13 2022 at 21:54):

Given that Lean core uses HPow, HAppend, HAnd, where H is short for the modifier heterogeneous, I think Btw and SBtw is closer to this convention

Thomas Murrills (Dec 13 2022 at 21:56):

Well, Eq is an abbreviation, not an acronym! So I feel like it should be capitalized like Mul. (And Ne is one of the few exceptions to the naming conventions just to parallel Eq, as I understand it? But it could be considered as an abbreviation of "nonequal", and then it wouldn't be an exception...except for the fact that no rules explicitly mention abbreviations yet)

Yaël Dillies (Dec 13 2022 at 21:56):

Ne is an acronym :upside_down:

Thomas Murrills (Dec 13 2022 at 21:57):

Yes, if it's taken to stand for "not equal"! But if it's taken to stand for "nonequal", then it's an abbreviation... :)

Winston Yin (Dec 13 2022 at 21:58):

There was some discussion some days ago about how to name ennreal, and I believe it came down to ENNReal to be clear that E,N,N are all modifiers to Real

Thomas Murrills (Dec 13 2022 at 21:58):

What does ENNReal stand for, again?

Winston Yin (Dec 13 2022 at 21:58):

Extended, non-negative real

Yaël Dillies (Dec 13 2022 at 21:59):

I see no point having it be ENNReal over Ennreal or EnnReal. We're just making it harder for ourselves.

Thomas Murrills (Dec 13 2022 at 21:59):

Ok! ENNReal fits nicely into the acronym convention, and suggests SBtw...

Winston Yin (Dec 13 2022 at 22:02):

Because EReal, NNReal, ENNReal all are modified Real. It makes the capitalisation consistent so there are no surprises, and it makes it easier to change Real -> NNReal, say, when we refactor some theorems

Thomas Murrills (Dec 13 2022 at 22:03):

I see no point having it be ENNReal over Ennreal or EnnReal. We're just making it harder for ourselves.

I do think ENNReal is more communicative...it tells me ENN is an acronym and not an abbreviation for a word that starts with Enn. Likewise SBtw tells me it's S Btw, not an abbreviation for some word starting with S and involving the letters b, t, w...

Winston Yin (Dec 13 2022 at 22:05):

Now, an argument could be made that if people start saying "ereal" as though it were its own word, similar to "poset", then the capitalisation can change to reflect that. But none of the things in this thread are that established

Thomas Murrills (Dec 13 2022 at 22:05):

(Also ENNReal is totally consistent with UpperCamelCase naming conventions, except each "word" is abbreviated to one character long...I think that's why it "feels right" to me, at least)

Thomas Murrills (Dec 13 2022 at 22:06):

Yeah, it definitely makes sense to account for any lexicalization that's happened irl when deciding what counts as a "word"! :)

Winston Yin (Dec 13 2022 at 22:06):

Don't take this example too seriously, but we went from E-Mail to Email once the word became established...

Winston Yin (Dec 13 2022 at 22:11):

Earlier convo for reference

Yaël Dillies (Dec 13 2022 at 22:11):

Rust has a convention that makes acronyms be considered as a single word for capitalisation. So it would be EnnReal or Ennreal.

Winston Yin (Dec 13 2022 at 22:13):

I guess we'll have to name it EnnReal or Ennreal when we formalise them in Rust :big_smile:

Thomas Murrills (Dec 13 2022 at 22:15):

Hmm, it does seem that Lean's conventions differs in that respect (except for Ne, and as far as I can tell, only Ne?)...but tbh I think Lean's version makes more sense, at least by my personal preferences :)

Winston Yin (Dec 13 2022 at 22:15):

I got so used to smul in mathlib3 that I just say IPA/smʌl/ nowadays, but (alas?) the convention has been set to call it SMul in mathlib4

Thomas Murrills (Dec 13 2022 at 22:16):

Hmm, considering all that...I think for now I'm going to go with SBtw given that almost everything follows the naming convention (looking at you, Ne and _Params...), and e.g. HAdd, SMul shows that it's followed even in cases like this.

Yaël Dillies (Dec 13 2022 at 22:17):

Winston Yin said:

I got so used to smul in mathlib3 that I just say IPA/smʌl/ nowadays, but (alas?) the convention has been set to call it SMul in mathlib4

We're a Ctrl+F away from fixing it :wink:

Thomas Murrills (Dec 13 2022 at 22:18):

Also, my apologies Yaël—I see that I'm making this choice in a port of a file you authored! 😅

Jireh Loreaux (Dec 13 2022 at 23:01):

@Winston Yin , you're not aligning declarations correctly (in your head). has_smul went to SMul, but smul (or rather, has_smul.smul) went to SMul.smul.

Winston Yin (Dec 13 2022 at 23:03):

So this is now messing with some other convention in Lean core: HPow.hPow but SMul.smul?

Winston Yin (Dec 13 2022 at 23:04):

(Don't hate me) but the latter should probably be SMul.sMul for consistency

Jireh Loreaux (Dec 13 2022 at 23:06):

Yes, I know, but we discussed it (somewhere probably a week ago, I don't recall where), and it was decided we will keep smul instead of sMul.

Winston Yin (Dec 13 2022 at 23:08):

It's here

Thomas Murrills (Dec 13 2022 at 23:13):

Will this be an exception just for smul given how common it is? Should the general rule for everything else be to write analogous things like hMul?

Mario Carneiro (Dec 13 2022 at 23:14):

note that core lean does not completely adhere to the naming conventions. There is a PR open to fix it, but this generated an understandable amount of internal discussion and it needs to be split up into less-contentious and more-contentious parts and then possibly make changes to the style guide. Hopefully I can get Leo in the mathlib4 porting meeting and we can all hash this out

Mario Carneiro (Dec 13 2022 at 23:15):

The discussion re: hMul was actually to rename it to HMul.mul which sidesteps most of the problems

Mario Carneiro (Dec 13 2022 at 23:16):

smul is not an exception, the rule is that clusters of capitals at the start of a type name are lowercased as a group

Thomas Murrills (Dec 13 2022 at 23:19):

Isn't the rule actually that acronyms are lowercased as a group? E.g. ennReal instead of ennreal (if it had to be lowerCamelCased)

Thomas Murrills (Dec 13 2022 at 23:27):

At least, that's what the wiki says as written right now. (And it kind of makes sense to me: only the first "thing" in lowerCamelCase should be lowercased, not the second, and the acronym is the first thing (e.g. Real is the second). It's just that acronyms upper/lowercase as a group, in xkcd/XKCD style.) If that's not actually the rule, though, then the wiki should be updated

Mario Carneiro (Dec 13 2022 at 23:47):

no, the initial letter of the non-acronym part is also lowercased

Alex Keizer (Dec 14 2022 at 14:47):

What do we think about MvFunctor/MVFunctor for "multivariate functor"? I'm inclined towards the first, since Mv is an abbreviation, not an acronym, but it also feels that a naming convention where we have to think "Is this leading part an abbreviation or an acronym" is maybe not ideal.

Mario Carneiro (Dec 14 2022 at 14:51):

I could go either way on that one. I think we can still leave some stuff up to the author's discretion

Mario Carneiro (Dec 14 2022 at 14:52):

I think it's closer to an acronym than an abbreviation though, especially if you spell it multi-variate functor

Alex Keizer (Dec 14 2022 at 14:56):

Yeah, I guess it's the same discussion as NE/Ne. It could be either an abbreviation or an acronym, depending on how you spell it. All the more reason we maybe shouldn't distinguish between them? Going with MV would lead to MVQPF for "multivariate quotient of polynomial functor", which feels less readable than MvQPF to me, though.

Alex Keizer (Dec 14 2022 at 14:57):

Yaël Dillies said:

Rust has a convention that makes acronyms be considered as a single word for capitalisation. So it would be EnnReal or Ennreal.

Essentially, I would be with Yael on this one, in favor of adopting Rust's convention on this

Jireh Loreaux (Dec 14 2022 at 15:11):

ENNReal has much more semantic content. I remember being totally confused when reading mathlib for the first time and seeing ennreal.

Also, I could be wrong, but I don't think such a change is really on the table for discussion. I don't mean you can't voice your opinion, only that it may be futile.

Michael Stoll (Dec 14 2022 at 15:23):

My futile opinion is that ENnReal is better :smile:; I think of "nonnegative" as one entity, not two...

Thomas Murrills (Dec 14 2022 at 18:27):

If we're expressing futile preferences, I agree with Jireh! this is also why I personally like the rule that gets us ennReal for the lowerCamelCased version instead of ennreal—otherwise who knows when the first thing ends and the next one begins? (I mean, you don't know if enn is an abbreviation or an acronym, but it helps distinguish things at least a bit.)

Thomas Murrills (Dec 14 2022 at 21:09):

Ok, interesting new naming case: def cIcc (a b : α) : Set α. Stands for "circular interval, closed-closed". (There's an analogous cIoo for "open-open".)

Current naming convention suggests it should be cicc instead based on decapitalizing all initial clusters of capitals. cIcc emphasizes the interval nicely but wrongly suggests Icc is an abbreviation for something. ciCC, for example, separates the "units of thought" nicely (decapitalized first acronym, capitalized second acronym). Thoughts?

Yaël Dillies (Dec 14 2022 at 21:11):

This will have to match docs#set.Icc at any rate.

Yaël Dillies (Dec 14 2022 at 21:11):

Your interpretation is slightly off as well. cIcc is more directly "circular Icc".

Thomas Murrills (Dec 14 2022 at 21:12):

ohh, I see!

Thomas Murrills (Dec 14 2022 at 21:15):

So I guess this is part of a more general decision for naming intervals of (return) type Set : α.

In that case choices include Icc, icc, ICC (← probably the most wrong?) and iCC.

Thomas Murrills (Dec 14 2022 at 21:16):

I want to note that intervals can be infinite (i) and I'm not sure we'd want e.g. iio.

Reid Barton (Dec 14 2022 at 21:18):

I would be inclined towards leaving these as-is based on the fact that the lean 3 names were already unusual for being capitalized

Thomas Murrills (Dec 14 2022 at 21:20):

It does feel like it warrants an exception one way or another (so that we can use Icc or, I guess, iCC if we really want), since it's more of a schema for notating intervals than an acronym per se.

Jireh Loreaux (Dec 16 2022 at 20:12):

How do we naming things involving Nat.cast? just cast?

Floris van Doorn (Dec 16 2022 at 22:58):

I've been using natCast if the Lean 3 name was nat_cast and keep it as cast otherwise.

Anatole Dedecker (Dec 19 2022 at 14:52):

What is the capitalization policy for the Co prefix? E.g Frame/Coframe or Frame/CoFrame?

Sky Wilshaw (Dec 19 2022 at 16:06):

Jireh Loreaux said:

ENNReal has much more semantic content. I remember being totally confused when reading mathlib for the first time and seeing ennreal.

In my opinion this is a really important point - are the naming conventions designed to make experienced lean users more comfortable, or to more easily onboard new ones? I prefer ENNReal (and SMul, HEq) because it would have made the learning experience more comfortable. By the way, the Rust convention would give EnnReal not Ennreal (e.g. GzDecoder from a gzip library). I'm not a particular fan of Rust's convention because it still leaves ambiguity as to whether Enn (for example) is itself an abbreviation of a longer word.

Jireh Loreaux (Dec 19 2022 at 16:08):

My expectation would be Coframe because it's one word not two.


Last updated: Dec 20 2023 at 11:08 UTC