Zulip Chat Archive
Stream: mathlib4
Topic: Naming clarification
Richard Osborn (Nov 27 2022 at 16:38):
I have been trying to familiarize myself with the naming conventions for mathlib4, and I have been having some difficulty wrapping my head around it. I have noticed a number of discrepancies in how things are named vs how the wiki prescribes which makes me believe others are also confused and/or disagree with how the naming scheme is currently defined. I am hoping that my questions can clear things up for myself and others.
First of all, it's my understanding that theorems are all snake_case
and are the only examples of the snake_case
naming convention. With that in mind, I am still confused by the following.
UpperCamelCase vs lowerCamelCase
Currently, the wiki states that inductive datatypes (of which structure and class are special cases) should be UpperCamelCase
and everything else be lowerCamelCase
. There are a significant number of def
s which are UpperCamelCase
in mathlib4, and even ENNReal
is defined using a def
. It seems as though the inductive datatype vs def split isn't in line with current expectations. From what I can tell, anything in mathlib4 that returns a Sort
is capitalized as UpperCamelCase
. Is this what people expect? Should there be a Prop
vs Type
split?
Appealing to Core
and Std
doesn't seem to help much as their naming convention seems inconsistent. List.isPrefix
,List.Disjoint
, and outParam
seem arbitrarily capitalized.
Prefixes
The wiki also establishes that classes should have their Is
and Has
prefixes removed. In certain cases, this has led to name clashes. See IsEmpty
class vs Empty
datatype. Is this a case where the 'is' prefix is warranted? Should either of them be renamed? Are there other similar situations that should be considered? See also IsDomain
, IsLeftCancelMul
, IsSymmOp
, etc...
For datatypes that are not classes, is there a convention for when someone should include a verb in a name or is it best to leave it up to author's discretion? See IsUnit
, IsTop
, IsPartialInv
.
Finally, should all the classes with the Has
prefix be renamed? We currently define HasCompl
, HasSup
, HasInf
, HasUncurry
, HasInvolutiveNeg
, HasDistribNeg
, HasQuotient
, and a few others.
Sorry for the long post, if this needs to get split up into multiple discussions, I'd be happy to oblige.
Arien Malec (Nov 27 2022 at 17:44):
Prop-typed defs are UpperCamelCase
Arien Malec (Nov 27 2022 at 17:45):
On typeclasses, I think there’s a tension between following Lean 4 conventions & reducing renaming
Arien Malec (Nov 27 2022 at 17:45):
That section may require revisions
Arien Malec (Nov 27 2022 at 17:45):
I based it on Lean 4 conventions.
Andrew Yang (Nov 27 2022 at 18:11):
The Prop
-Type
split means that def bar_baz : foo
for foo : Prop
, and def barBaz : foo
for foo : Type _
. In the special case of foo = Prop
, since Prop : Type 1
, it is the latter.
Edit: but Sort*
are exeptions and def BarBaz : Sort*
.
Arien Malec (Nov 27 2022 at 18:14):
That’s worth clarifying in the wiki.
Richard Osborn (Nov 27 2022 at 18:17):
Ok, so when defining terms, it is split between snake_case
and lowerCamelCase
.
Richard Osborn (Nov 27 2022 at 18:22):
and def barBaz : Foo
for Foo : Type
would be the correct capitalization?
Andrew Yang (Nov 27 2022 at 18:27):
Yes. String.isPrefix
should be String.IsPrefix
. List.Disjoint
is correct as is. outParam
should be OutParam
but it is to be in core so yeah.
Andrew Yang (Nov 27 2022 at 18:27):
Or maybe outParam
is morally a term of the type Type -> Type
so it is not UpperCamelCase
.
Richard Osborn (Nov 27 2022 at 18:44):
Would this mean with_top
should be translated as withTop
?
Andrew Yang (Nov 27 2022 at 18:56):
Is it so? Then docs4#Not should probably be named not
since it is morally Prop -> Prop
.
Jireh Loreaux (Nov 27 2022 at 19:04):
Actually I take it back, I'm not sure in this case.
Richard Osborn (Nov 27 2022 at 19:04):
Perhaps what is confusing me is that withTop
in mathlib4 would be defined as
def withTop (α : Type) := Option α
but it could just have easily been defined as
inductive WithTop (α : Type) where
| top : WithTop α
| some : α → WithTop α
Should these have different capitalizations?
Andrew Yang (Nov 27 2022 at 19:11):
Either case the result type signature is Type
and it is WithTop
.
Jireh Loreaux (Nov 27 2022 at 19:11):
This should still be WithTop
because Option : Type u → Type u
and hence Option α : Type u
Thomas Murrills (Nov 27 2022 at 19:11):
I think we still have Option α : Type
, so even in the def
, it should still be WithTop
since the def
entails WithTop α : Type
, and I think type-valued terms are UpperCamelCase.
I know And : Prop → Prop → Prop
is given as an example of the use of UpperCamelCase for Prop
-valued terms. Perhaps the wiki should be updated to explicitly include “Prop
- and Type
-valued terms” if not already updated?
Thomas Murrills (Nov 27 2022 at 19:12):
Oops, I guess we were all writing the same message at the same time :upside_down:
Jireh Loreaux (Nov 27 2022 at 19:12):
Or just Sort
-valued terms
Andrew Yang (Nov 27 2022 at 19:13):
The wiki already mentions this. But I think the wiki mentioning outParam
as an example would cause confusion though.
Richard Osborn (Nov 27 2022 at 19:13):
Jireh Loreaux said:
Or just
Sort
-valued terms
This agrees with what I've seen people doing so far when porting over files.
Thomas Murrills (Nov 27 2022 at 19:15):
The wiki doesn’t actually mention this—it just says “Prop
s and Type
s”, which doesn’t cover e.g. things like WithTop
of type Type → Type
and apparently has caused some confusion.
Andrew Yang (Nov 27 2022 at 19:19):
It is def WithTop (α) : Type := sorry
, so it stricly falls in the "Prop
s and Type
s" category. But maybe wiki can somehow make it clearer.
Richard Osborn (Nov 27 2022 at 19:20):
So if I am understanding everything correctly, terms of Prop
are snake_case
, terms of Type
are lowerCamelCase
, any Prop
s or Type
s should be UpperCamelCase
. Functions should be capitalized based on their return type.
Richard Osborn (Nov 27 2022 at 19:24):
Should I split off my questions about naming prefixes into a new discussion to avoid cluttering this one?
Thomas Murrills (Nov 27 2022 at 19:27):
Hmm…strictly speaking, it’s only WithTop : Type → Type
that gets named with UpperCamelCase, not WithTop α
, and so I don’t think the wiki covers it when interpreted precisely…i think it might be clearest to explicitly include type and prop families in the wiki, to head off precise/literal interpretations of the naming conventions
Andrew Yang (Nov 27 2022 at 19:29):
The sentence "Functions should be capitalized based on their return type" should probably be added to the wiki.
Andrew Yang (Nov 27 2022 at 19:31):
Along with the removal of outParam
in the example (or explicitly point out that this is an exception).
Thomas Murrills (Dec 13 2022 at 21:40):
I'm finally editing the wiki to reflect this! :)
Thomas Murrills (Dec 13 2022 at 21:44):
One thing I'd like to add: something that reflects the fact that words which are abbreviated to strings greater than one character (e.g. Mul
or Pow
) are treated the same as unabbreviated words. This is true, right?
Kevin Buzzard (Dec 13 2022 at 21:46):
you mean "is the fact that Add
is a word whereas Mul
is short for Multiplication
irrelevant to the naming convention"? If so then yes.
Thomas Murrills (Dec 13 2022 at 21:47):
I'm porting order.circular
(just to try porting something), and it provides an interesting case. Lean 3 uses btw : α → α → α → Prop
and sbtw : α → α → α → Prop
for "between" and "strictly between". My feeling is that these should be named Btw
(Prop
-valued so UpperCamelCase
, then in parallel to Mul
) and SBtw
("strictly" abbreviated to one character, then Btw
—same reasoning).
Yaël Dillies (Dec 13 2022 at 21:49):
I would go for Sbtw
, with the idea that we shouldn't put two caps one after the other.
Thomas Murrills (Dec 13 2022 at 21:50):
But Btw
stands in for a single word, right? So shouldn't it be capitalized as a word within an UpperCamelCase
name? (And don't we put two capitals after each other in acronyms, e.g. LT
)? Or should this be an exception?
Yaël Dillies (Dec 13 2022 at 21:54):
I think the acronym convention is unfollowable and we already have a zoo of them: Eq
, Ne
, LE
, LT
.
Winston Yin (Dec 13 2022 at 21:54):
Given that Lean core uses HPow
, HAppend
, HAnd
, where H
is short for the modifier heterogeneous
, I think Btw
and SBtw
is closer to this convention
Thomas Murrills (Dec 13 2022 at 21:56):
Well, Eq
is an abbreviation, not an acronym! So I feel like it should be capitalized like Mul
. (And Ne
is one of the few exceptions to the naming conventions just to parallel Eq
, as I understand it? But it could be considered as an abbreviation of "nonequal", and then it wouldn't be an exception...except for the fact that no rules explicitly mention abbreviations yet)
Yaël Dillies (Dec 13 2022 at 21:56):
Ne
is an acronym :upside_down:
Thomas Murrills (Dec 13 2022 at 21:57):
Yes, if it's taken to stand for "not equal"! But if it's taken to stand for "nonequal", then it's an abbreviation... :)
Winston Yin (Dec 13 2022 at 21:58):
There was some discussion some days ago about how to name ennreal
, and I believe it came down to ENNReal
to be clear that E,N,N
are all modifiers to Real
Thomas Murrills (Dec 13 2022 at 21:58):
What does ENNReal
stand for, again?
Winston Yin (Dec 13 2022 at 21:58):
Extended, non-negative real
Yaël Dillies (Dec 13 2022 at 21:59):
I see no point having it be ENNReal
over Ennreal
or EnnReal
. We're just making it harder for ourselves.
Thomas Murrills (Dec 13 2022 at 21:59):
Ok! ENNReal
fits nicely into the acronym convention, and suggests SBtw
...
Winston Yin (Dec 13 2022 at 22:02):
Because EReal, NNReal, ENNReal
all are modified Real
. It makes the capitalisation consistent so there are no surprises, and it makes it easier to change Real -> NNReal
, say, when we refactor some theorems
Thomas Murrills (Dec 13 2022 at 22:03):
I see no point having it be ENNReal over Ennreal or EnnReal. We're just making it harder for ourselves.
I do think ENNReal
is more communicative...it tells me ENN
is an acronym and not an abbreviation for a word that starts with Enn
. Likewise SBtw
tells me it's S
Btw
, not an abbreviation for some word starting with S
and involving the letters b
, t
, w
...
Winston Yin (Dec 13 2022 at 22:05):
Now, an argument could be made that if people start saying "ereal" as though it were its own word, similar to "poset", then the capitalisation can change to reflect that. But none of the things in this thread are that established
Thomas Murrills (Dec 13 2022 at 22:05):
(Also ENNReal
is totally consistent with UpperCamelCase
naming conventions, except each "word" is abbreviated to one character long...I think that's why it "feels right" to me, at least)
Thomas Murrills (Dec 13 2022 at 22:06):
Yeah, it definitely makes sense to account for any lexicalization that's happened irl when deciding what counts as a "word"! :)
Winston Yin (Dec 13 2022 at 22:06):
Don't take this example too seriously, but we went from E-Mail
to Email
once the word became established...
Winston Yin (Dec 13 2022 at 22:11):
Yaël Dillies (Dec 13 2022 at 22:11):
Rust has a convention that makes acronyms be considered as a single word for capitalisation. So it would be EnnReal
or Ennreal
.
Winston Yin (Dec 13 2022 at 22:13):
I guess we'll have to name it EnnReal
or Ennreal
when we formalise them in Rust :big_smile:
Thomas Murrills (Dec 13 2022 at 22:15):
Hmm, it does seem that Lean's conventions differs in that respect (except for Ne
, and as far as I can tell, only Ne
?)...but tbh I think Lean's version makes more sense, at least by my personal preferences :)
Winston Yin (Dec 13 2022 at 22:15):
I got so used to smul
in mathlib3 that I just say IPA/smʌl/
nowadays, but (alas?) the convention has been set to call it SMul
in mathlib4
Thomas Murrills (Dec 13 2022 at 22:16):
Hmm, considering all that...I think for now I'm going to go with SBtw
given that almost everything follows the naming convention (looking at you, Ne
and _Param
s...), and e.g. HAdd
, SMul
shows that it's followed even in cases like this.
Yaël Dillies (Dec 13 2022 at 22:17):
Winston Yin said:
I got so used to
smul
in mathlib3 that I just sayIPA/smʌl/
nowadays, but (alas?) the convention has been set to call itSMul
in mathlib4
We're a Ctrl+F away from fixing it :wink:
Thomas Murrills (Dec 13 2022 at 22:18):
Also, my apologies Yaël—I see that I'm making this choice in a port of a file you authored! 😅
Jireh Loreaux (Dec 13 2022 at 23:01):
@Winston Yin , you're not aligning declarations correctly (in your head). has_smul
went to SMul
, but smul
(or rather, has_smul.smul
) went to SMul.smul
.
Winston Yin (Dec 13 2022 at 23:03):
So this is now messing with some other convention in Lean core: HPow.hPow
but SMul.smul
?
Winston Yin (Dec 13 2022 at 23:04):
(Don't hate me) but the latter should probably be SMul.sMul
for consistency
Jireh Loreaux (Dec 13 2022 at 23:06):
Yes, I know, but we discussed it (somewhere probably a week ago, I don't recall where), and it was decided we will keep smul
instead of sMul
.
Winston Yin (Dec 13 2022 at 23:08):
It's here
Thomas Murrills (Dec 13 2022 at 23:13):
Will this be an exception just for smul
given how common it is? Should the general rule for everything else be to write analogous things like hMul
?
Mario Carneiro (Dec 13 2022 at 23:14):
note that core lean does not completely adhere to the naming conventions. There is a PR open to fix it, but this generated an understandable amount of internal discussion and it needs to be split up into less-contentious and more-contentious parts and then possibly make changes to the style guide. Hopefully I can get Leo in the mathlib4 porting meeting and we can all hash this out
Mario Carneiro (Dec 13 2022 at 23:15):
The discussion re: hMul
was actually to rename it to HMul.mul
which sidesteps most of the problems
Mario Carneiro (Dec 13 2022 at 23:16):
smul
is not an exception, the rule is that clusters of capitals at the start of a type name are lowercased as a group
Thomas Murrills (Dec 13 2022 at 23:19):
Isn't the rule actually that acronyms are lowercased as a group? E.g. ennReal
instead of ennreal
(if it had to be lowerCamelCased)
Thomas Murrills (Dec 13 2022 at 23:27):
At least, that's what the wiki says as written right now. (And it kind of makes sense to me: only the first "thing" in lowerCamelCase should be lowercased, not the second, and the acronym is the first thing (e.g. Real
is the second). It's just that acronyms upper/lowercase as a group, in xkcd/XKCD style.) If that's not actually the rule, though, then the wiki should be updated
Mario Carneiro (Dec 13 2022 at 23:47):
no, the initial letter of the non-acronym part is also lowercased
Alex Keizer (Dec 14 2022 at 14:47):
What do we think about MvFunctor
/MVFunctor
for "multivariate functor"? I'm inclined towards the first, since Mv
is an abbreviation, not an acronym, but it also feels that a naming convention where we have to think "Is this leading part an abbreviation or an acronym" is maybe not ideal.
Mario Carneiro (Dec 14 2022 at 14:51):
I could go either way on that one. I think we can still leave some stuff up to the author's discretion
Mario Carneiro (Dec 14 2022 at 14:52):
I think it's closer to an acronym than an abbreviation though, especially if you spell it multi-variate functor
Alex Keizer (Dec 14 2022 at 14:56):
Yeah, I guess it's the same discussion as NE
/Ne
. It could be either an abbreviation or an acronym, depending on how you spell it. All the more reason we maybe shouldn't distinguish between them? Going with MV
would lead to MVQPF
for "multivariate quotient of polynomial functor", which feels less readable than MvQPF
to me, though.
Alex Keizer (Dec 14 2022 at 14:57):
Yaël Dillies said:
Rust has a convention that makes acronyms be considered as a single word for capitalisation. So it would be
EnnReal
orEnnreal
.
Essentially, I would be with Yael on this one, in favor of adopting Rust's convention on this
Jireh Loreaux (Dec 14 2022 at 15:11):
ENNReal
has much more semantic content. I remember being totally confused when reading mathlib
for the first time and seeing ennreal
.
Also, I could be wrong, but I don't think such a change is really on the table for discussion. I don't mean you can't voice your opinion, only that it may be futile.
Michael Stoll (Dec 14 2022 at 15:23):
My futile opinion is that ENnReal
is better :smile:; I think of "nonnegative" as one entity, not two...
Thomas Murrills (Dec 14 2022 at 18:27):
If we're expressing futile preferences, I agree with Jireh! this is also why I personally like the rule that gets us ennReal
for the lowerCamelCased version instead of ennreal
—otherwise who knows when the first thing ends and the next one begins? (I mean, you don't know if enn
is an abbreviation or an acronym, but it helps distinguish things at least a bit.)
Thomas Murrills (Dec 14 2022 at 21:09):
Ok, interesting new naming case: def cIcc (a b : α) : Set α
. Stands for "circular interval, closed-closed". (There's an analogous cIoo
for "open-open".)
Current naming convention suggests it should be cicc
instead based on decapitalizing all initial clusters of capitals. cIcc
emphasizes the interval nicely but wrongly suggests Icc
is an abbreviation for something. ciCC
, for example, separates the "units of thought" nicely (decapitalized first acronym, capitalized second acronym). Thoughts?
Yaël Dillies (Dec 14 2022 at 21:11):
This will have to match docs#set.Icc at any rate.
Yaël Dillies (Dec 14 2022 at 21:11):
Your interpretation is slightly off as well. cIcc
is more directly "circular Icc
".
Thomas Murrills (Dec 14 2022 at 21:12):
ohh, I see!
Thomas Murrills (Dec 14 2022 at 21:15):
So I guess this is part of a more general decision for naming intervals of (return) type Set : α
.
In that case choices include Icc
, icc
, ICC
(← probably the most wrong?) and iCC
.
Thomas Murrills (Dec 14 2022 at 21:16):
I want to note that intervals can be infinite (i
) and I'm not sure we'd want e.g. iio
.
Reid Barton (Dec 14 2022 at 21:18):
I would be inclined towards leaving these as-is based on the fact that the lean 3 names were already unusual for being capitalized
Thomas Murrills (Dec 14 2022 at 21:20):
It does feel like it warrants an exception one way or another (so that we can use Icc
or, I guess, iCC
if we really want), since it's more of a schema for notating intervals than an acronym per se.
Jireh Loreaux (Dec 16 2022 at 20:12):
How do we naming things involving Nat.cast
? just cast
?
Floris van Doorn (Dec 16 2022 at 22:58):
I've been using natCast
if the Lean 3 name was nat_cast
and keep it as cast
otherwise.
Anatole Dedecker (Dec 19 2022 at 14:52):
What is the capitalization policy for the Co
prefix? E.g Frame/Coframe
or Frame/CoFrame
?
Sky Wilshaw (Dec 19 2022 at 16:06):
Jireh Loreaux said:
ENNReal
has much more semantic content. I remember being totally confused when readingmathlib
for the first time and seeingennreal
.
In my opinion this is a really important point - are the naming conventions designed to make experienced lean users more comfortable, or to more easily onboard new ones? I prefer ENNReal
(and SMul
, HEq
) because it would have made the learning experience more comfortable. By the way, the Rust convention would give EnnReal
not Ennreal
(e.g. GzDecoder
from a gzip library). I'm not a particular fan of Rust's convention because it still leaves ambiguity as to whether Enn
(for example) is itself an abbreviation of a longer word.
Jireh Loreaux (Dec 19 2022 at 16:08):
My expectation would be Coframe because it's one word not two.
Last updated: Dec 20 2023 at 11:08 UTC