Zulip Chat Archive
Stream: general
Topic: Assorted renames in Probability/Distributions/
Josha Dekker (Jan 11 2024 at 10:35):
Recently (https://leanprover.zulipchat.com/#narrow/stream/217875-Is-there-code-for-X.3F/topic/Geometric.20distribution/near/411902825) it came up that some of the naming with acronyms such as pmf
, pdf
, cgf
, cdf
and friends inProbability/Distributions/
may not be fully consistent with the https://github.com/leanprover-community/mathlib4/wiki/Porting-wiki#naming-convention (rule 6). I have one PR pending that goes in this folder, then I'd like to do some renaming in the folder to make sure that everything is up to convention. I'm opening up this thread to make it easier to have a discussion on the names, before making the changes and opening a PR.
Josha Dekker (Jan 11 2024 at 10:37):
Acronyms should be lower/upper-case together, so, using dist
as a placeholder name for a distribution, we should e.g. replace distPdfReal
and distPdf
by distPDFReal
and distPDF
(see e.g. Probability/Distributions/Gaussian.lean
). Is this right?
Josha Dekker (Jan 11 2024 at 10:37):
/poll For this, we should use
distPDFReal
distPdfReal
distPDF_Real
other
Josha Dekker (Jan 11 2024 at 10:39):
Of course, a similar thing should be done for pmf
, cgf
and cdf
!
Josha Dekker (Jan 11 2024 at 10:39):
Also, it is a convention in (some areas of) probability theory to write pdf
and CDF
rather than pdf
and cdf
, but I think that the LEAN way of doing things would be to keep it all lowercase, right?
Josha Dekker (Jan 11 2024 at 10:42):
It may be the case that changing these names break dependencies outside of Matlib, but the number of changes is large (most lemma's/theorems in these files contain one of these names), so I'm not sure whether we should make aliases for everything or just accept that things may break (but are easily fixed) in projects that depend on Matlib?
Josha Dekker (Jan 11 2024 at 10:43):
/poll Use aliases?
Yes
No
Josha Dekker (Jan 11 2024 at 10:44):
Also, once I start on this crusade, I may also change these same naming issues in some other places within Probability/
, if that is okay? (Only if we agree that the names should be changed in the first place, of course!)
Yaël Dillies (Jan 11 2024 at 10:44):
Josha Dekker said:
Also, it is a convention in (some areas of) probability theory to write
CDF
rather thancdf
, but I think that the LEAN way of doing things would be to keep it all lowercase, right?
Can you show examples?
Josha Dekker (Jan 11 2024 at 10:45):
Yaël Dillies said:
Josha Dekker said:
Also, it is a convention in (some areas of) probability theory to write
CDF
rather thancdf
, but I think that the LEAN way of doing things would be to keep it all lowercase, right?Can you show examples?
In probability theory or in lean?
Yaël Dillies (Jan 11 2024 at 10:47):
In mathlib
David Loeffler (Jan 11 2024 at 10:48):
I'd be interested to see examples in the informal literature as well – I am surprised by the claim that it is is standard to use different case conventions for these two related concepts.
Josha Dekker (Jan 11 2024 at 10:50):
Yaël Dillies said:
In mathlib
My claim was that some areas of the informal literature write pdf
vs CDF
, but I'm not advocating for that!
In Mathlib, Probability/cdf
uses all lowercase cdf
, which is fine I think!
Yaël Dillies (Jan 11 2024 at 10:52):
I see that cdf
is camel-cased as Cdf
. It should be camel-cased as CDF
. Apart from that, the naming convention seems respected.
Josha Dekker (Jan 11 2024 at 10:54):
David Loeffler said:
I'd be interested to see examples in the informal literature as well – I am surprised by the claim that it is is standard to use different case conventions for these two related concepts.
So it is typical to write for a CDF, and for a PDF; I have seen authors write pdf
rather than PDF
to stress that. I suspect this to be mainly the case in some undergrad books/readers that I've used in the past, but I'm not in my office so can't check that now unfortunately. But for mathlib this shouldn't matter, as acronyms are all lower/all uppercase depending on their usage anyway! Also, I think that the preferable thing to do is PDF and CDF anyway, so I probably shouldn't have added it to the thread!
Josha Dekker (Jan 11 2024 at 10:55):
Yaël Dillies said:
I see that
cdf
is camel-cased asCdf
. It should be camel-cased asCDF
. Apart from that, the naming convention seems respected.
Yes, it is fine in Mathlib, I just thought I'd briefly mention that I'd seen such naming conventions in the past, but I should probably have realised that that was beside the point of this thread.
Josha Dekker (Jan 11 2024 at 11:04):
David Loeffler said:
I'd be interested to see examples in the informal literature as well – I am surprised by the claim that it is is standard to use different case conventions for these two related concepts.
Screenshot-2024-01-11-at-12.02.07.png
Screenshot-2024-01-11-at-12.01.55.png
These are taken from a reader which is based on Bain & Engelhardt, Introduction to Probability and Mathematical Statistics.
Josha Dekker (Jan 12 2024 at 08:42):
Okay I just started working on a PR with all the name-changes. In Probability/Kernel/CondCDF
, we have a lot of Cdf
terms that should be CDF
based on our earlier considerations, shall I change them as well?
Josha Dekker (Jan 12 2024 at 08:52):
Similar for: Probability/...
Kernel/Distintegration.lean
Cdf.lean
: we define a defcdf
, should this beCDF
? Also, the occurrences ofcondCdf
should becondCDF
whenProbability/Kernel/CondCDF
gets its renames.
Josha Dekker (Jan 12 2024 at 08:59):
This is now #9681. First commit only changes the files in Probability/Distributions
, subsequent commits also change the things in my previous two messages, with the exception that it doesn't rename cdf
to CDF
yet (as I'm not 100% sure about that one)
Josha Dekker (Jan 12 2024 at 09:04):
We now have a definition cdf
and a definition PMF
(in Probability/Cdf.lean
and Probability/ProbabilityMassFunction/Basic.lean
respectively), but it feels like their capitalisation should be treated equally?
Last updated: May 02 2025 at 03:31 UTC