Zulip Chat Archive

Stream: mathlib4

Topic: naming conventions: natCast vs nat_cast


Thomas Murrills (Mar 17 2023 at 10:26):

Ooo, there are a lot of nat_casts which should be natCasts spread around mathlib4. :eyes: (A fair number of int_casts, and some rat_casts.)

Eric Wieser (Mar 17 2023 at 10:30):

In lemma names shouldn't they be nat_cast?

Eric Wieser (Mar 17 2023 at 10:31):

This is confusing because we have both natCast and Nat.cast; but the latter is the canonical spelling

Thomas Murrills (Mar 17 2023 at 10:33):

Hmm, I guess the question is: how does . factor into a lemma name? Should the two things either side be viewed as part of the same whole being referenced, and therefore the whole thing is a single unit in lowerCamelCase? Or should they be viewed as separate identifiers?

Eric Wieser (Mar 17 2023 at 10:41):

I think this belongs in the naming thread now

Notification Bot (Mar 17 2023 at 10:43):

5 messages were moved here from #mathlib4 > Complex.Basic by Thomas Murrills.

Eric Wieser (Mar 17 2023 at 10:44):

As a data point, docs4#map_multiset_sum

Thomas Murrills (Mar 17 2023 at 10:48):

Hmm, good data point. If we're considering making a conscious choice, though, we might ask: would what that lemma name referred to be epsilon clearer if it were map_multisetSum?

Eric Wieser (Mar 17 2023 at 10:50):

I think it's confusing for a lowercase name (sum) to become uppercase when included in a lemma name when usually the rule is the reverse

Eric Wieser (Mar 17 2023 at 10:50):

But I agree it's not clear a conscious choice was made

Thomas Murrills (Mar 17 2023 at 10:55):

True, it might be less clear than otherwise!

Pros to lowerCamelCasing so far: keeps it as a distinctive unit, _-separation is a bit more ambiguous in terms of what each "chunk" of the name is.

Cons: a lowercase name is now uppercase, so you now might not even recognize the chunk in the first place! And there's still ambiguity about where the . should be.

Thomas Murrills (Mar 17 2023 at 10:57):

Maybe I'm just suggesting this because I'm insanely sleep deprived right now, but:

it would be unambiguous to use ' as an intraname translation for . in lemmas. E.g. map_multiset'sum.

Thomas Murrills (Mar 17 2023 at 10:59):

' is generally only used at the end of names, and dots are generally only in the middle, meaning that iff ' was part of the original thing the lemma was about, it would be at an underscore boundary and thus distinguishable from a ' which used to be a ..

Floris van Doorn (Mar 17 2023 at 11:10):

Eric Wieser said:

As a data point, docs4#map_multiset_sum

Personally, I like the name Multiset.map_sum better, especially since there is an explicit Multiset argument.

Floris van Doorn (Mar 17 2023 at 11:12):

But if we stick we the current name, I probably prefer map_multisetSum, since Multiset.sum is a single operation.

Eric Wieser (Mar 17 2023 at 11:52):

Floris van Doorn said:

Personally, I like the name Multiset.map_sum better, especially since there is an explicit Multiset argument.

I don't think Multiset.map_sum is great here because it sounds like map refers to Multiset.map, and all the other map_* lemmas are in either the root or morphism namespaces

Eric Wieser (Mar 17 2023 at 11:55):

Thomas Murrills said:

Maybe I'm just suggesting this because I'm insanely sleep deprived right now, but:

it would be unambiguous to use ' as an intraname translation for . in lemmas. E.g. map_multiset'sum.

If we want to go down that route, map_«Multiset.sum» seems less bad...

Eric Wieser (Mar 17 2023 at 11:56):

Oh, that's not legal any more in Lean4; map_«Multiset.sum» is now parsed as map_ «Multiset.sum» instead of «map_Multiset.sum»

Eric Wieser (Mar 17 2023 at 12:24):

This is really a mess; we have 336 matches for _(nat|int|rat)Cast and 404 matches for _(nat|int|rat)_cast (on some random branch I'm on)

Eric Wieser (Mar 17 2023 at 12:24):

(multisetSum has 0 matches to multiset_sums 161)

Eric Wieser (Mar 17 2023 at 12:25):

theorem.*_(nat|int|rat)_cast has 85 matches, theorem.*_(nat|int|rat)Cast has 91

Jireh Loreaux (Mar 17 2023 at 13:04):

We actually did talk about this at some point and I think the _ separator was determined. I believe the conversation was even about Nat.cast specifically.

Eric Wieser (Mar 17 2023 at 13:08):

There's a single message here:

Floris van Doorn said:

I've been using natCast if the Lean 3 name was nat_cast and keep it as cast otherwise.

Jireh Loreaux (Mar 17 2023 at 13:08):

I just found that too. I guess I fabricated that memory?

Jireh Loreaux (Mar 17 2023 at 13:09):

Sorry

Eric Wieser (Mar 17 2023 at 13:10):

A similar data point is the lemmas about docs4#Int.ceil etc, such as docs#Nat.cast_ceil_eq_int_ceil

Floris van Doorn (Mar 17 2023 at 13:20):

These data points are mostly an indication of what mathport is automatically generating as lemma names (i.e. what it was in Lean 3), not necessarily what we want.

Eric Wieser (Mar 17 2023 at 13:51):

Sure; though having an example of lemmas that would be renamed helps us decide which convention to choose

Thomas Murrills (Mar 18 2023 at 22:20):

Just as one final avenue: are there any other potential solutions of the form "translate the dot explicitly"? This would eliminate all ambiguity and make lemmas easy to find and recognize as the one you're looking for (once you know about it), but might be more verbose/less pretty. So it might come down to what our naming convention prioritizes.

' is the only one I could think of which is 1) easy to type/ASCII 2) brief 3) admissible in an identifier, but it does look weird as of now. frobenius_nat'cast (n : ℕ). map_multiset'sum. hmm.

Another option in this vein is translating the dot to a word, like _foo_, but I'm not sure what word we'd use.

__ is technically a possibility, but makes the two parts of the name look less associated with each other, so imo that's not great.

Any other "unambiguous translation" possibilities anyone can think of (or other ideas/comments) before we make a poll?

Notification Bot (Apr 02 2023 at 09:26):

29 messages were moved here from #mathlib4 > naming conventions by Eric Wieser.

Eric Wieser (Apr 02 2023 at 09:28):

/poll What should lemma about Foo.bar (Nat.cast x) be called?

  • Foo.bar_nat_cast
  • Foo.bar_natCast
  • fooBar_natCast
  • foo_bar_nat_cast

Eric Wieser (Apr 02 2023 at 09:29):

/poll What should lemma about Nat.cast (Bar.baz x) be called?

  • Nat.cast_bar_baz
  • Nat.cast_barBaz
  • natCast_barBaz
  • nat_cast_bar_baz

Eric Wieser (Apr 02 2023 at 09:30):

(if adding a new option, please add it to both polls in a way that is consistent, even if you don't vote for both)

Eric Wieser (Apr 02 2023 at 09:31):

To some extent the question boils down to whether .s in namespaced names collapse into _s or single tokens

Eric Wieser (Apr 02 2023 at 09:32):

Declaring natCast/intCast/ratCast as exceptions to the rules is probably also not unreasonable

Reid Barton (Apr 02 2023 at 11:44):

What are the Lean 3 names of these hypothetical lemmas?

Eric Wieser (Apr 02 2023 at 12:34):

Usually foo.bar_nat_cast and nat.cast_bar_baz

Eric Wieser (Apr 03 2023 at 23:45):

:ping_pong:, to get some more opinions on the poll. I forgot to bring this up in the porting meeting.

Eric Wieser (Sep 14 2023 at 12:38):

:ping_pong:, since as far as I can tell this isn't covered yet by #naming


Last updated: Dec 20 2023 at 11:08 UTC