Zulip Chat Archive

Stream: general

Topic: Hamming naming


Wrenna Robson (Jul 08 2022 at 11:32):

Thanks to the help of @Eric Wieser, #14739, which adds the Hamming distance and Hamming norm (in a computable form) as well as a type synonym in order to equip a pi type with it, is nearly ready.

There are two naming questions that I'd like feedback on.

Firstly, currently the namespace is hamm and the distance/norm functions are called hamm_dist and hamm_norm. The question is: should this be hamming, hamming_dist and hamming_norm instead?

Secondly, the current location of hamming.lean is a new directory, coding_theory, so that this PR goes to coding_theory/hamming. Is this acceptable? If not, where should it go? The context is that I am working on other aspects of coding theory which, I hope, will eventually find their place here.

Finally, in the event that we do have a new directory or subdirectory, should it be called coding_theory, or another name?

Eric Wieser (Jul 08 2022 at 11:33):

An alternative to coding_theory would be information_theory since that's how wikipedia introduces the hamming distance:

In information theory, the Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols are different.
[...]
A major application is in coding theory, more specifically to block codes, in which the equal-length strings are vectors over a finite field.

Yaël Dillies (Jul 08 2022 at 11:34):

Or just information because the _theory is not very... informative.

Wrenna Robson (Jul 08 2022 at 11:34):

bits are cheap, names are good

Eric Wieser (Jul 08 2022 at 11:34):

Yaël Dillies said:

Or just information because the _theory is not very... informative.

Sure, let's remove the theory suffix from https://en.wikipedia.org/wiki/Information_theory first though ;)

Wrenna Robson (Jul 08 2022 at 11:35):

But I think information_theory is a good shout. Conceptually I feel like this is a neighbour of the contents of the computability folder: it might be useful to have some of these "discrete mathematics/informatics/computer science-adjacent" things in the same place.

Wrenna Robson (Jul 08 2022 at 11:35):

informatics is probably the most general term

Eric Rodriguez (Jul 08 2022 at 11:39):

Wrenna Robson said:

bits are cheap, names are good

until your PR title is 80% filename

Eric Rodriguez (Jul 08 2022 at 11:40):

I have typed ring_theory/polynomial/cyclotomic/eval too many times

Wrenna Robson (Jul 08 2022 at 11:40):

certainly i don't want to work on representation_theory any time soon but I'm not sure theory is the problem there

Wrenna Robson (Jul 08 2022 at 11:42):

I think informatics is probably a better shorter name than information for information_theory

Wrenna Robson (Jul 08 2022 at 11:43):

But coding_theory is a sub-theory of information_theory so this may not solve the problem. Though the Hamming distance should be in information_theory (or whatever) as it makes sense outside the coding_theory context.

Kevin Buzzard (Jul 08 2022 at 11:45):

Who cares about PR titles? This issue has come up before when I talk to people who are not directly involved with the project. Yes of course we want add and not addition in theorem names. But beyond some point we're going to end up writing our own "code" which is not at all ideal; the Coq naming convention was super-compact and it makes it super-hard for me to read. As Wrenna points out, bytes are cheap. We also tend to love golfing proofs because it's fun; it might also be a step in the wrong direction.

Wrenna Robson (Jul 08 2022 at 11:46):

Ideally we'd have a subject matter expert weigh in on this as I would imagine there will be more and more "computer science-adjacent" stuff in mathlib over time, things that are still proper for mathlib (i.e. mathematics) but "about" computing.

Wrenna Robson (Jul 08 2022 at 11:46):

https://en.wikipedia.org/wiki/Theoretical_computer_science - something like the list of subjects here.

Wrenna Robson (Jul 08 2022 at 11:47):

Nobody on this good and green earth would sanely do computation in Lean 3, but that won't be true of Lean 4 - I dream of a world where all this stuff is nicely enmeshed and architected (and we are all watched over by machines of loving grace, yada yada).

But right now I'm trying to solve a concrete problem of a handful of names.

Eric Wieser (Jul 08 2022 at 11:52):

/poll What should the folder name be

Eric Wieser (Jul 08 2022 at 11:54):

/poll What should the definitions be called

  • hamm_dist, hamm_norm, hamm, to_hamm, of_hamm
  • hamming_dist, hamming_norm, hamming, to_hamming, of_hamming

Wrenna Robson (Jul 08 2022 at 11:59):

(As noted I think we may want a coding_theory subfolder at some point but as ever let's wait until n > 1)

Wrenna Robson (Jul 08 2022 at 12:09):

A consensus emerges! What do we consider quorate?

Eric Wieser (Jul 08 2022 at 12:10):

I recon give it 24 hours to account for timezones; feel free to update the PR based on your predicted result earlier than that

Eric Wieser (Jul 08 2022 at 12:11):

Renaming later is cheap, but we may as well wait for people who have strong opinions but are asleep

Wrenna Robson (Jul 08 2022 at 12:23):

Done :)

Wrenna Robson (Jul 09 2022 at 09:12):

I think we can call this, @Eric Wieser. I already did the rename.


Last updated: Dec 20 2023 at 11:08 UTC