Zulip Chat Archive

Stream: general

Topic: Assorted renames in Probability/Distributions/


Josha Dekker (Jan 11 2024 at 10:35):

Recently (https://leanprover.zulipchat.com/#narrow/stream/217875-Is-there-code-for-X.3F/topic/Geometric.20distribution/near/411902825) it came up that some of the naming with acronyms such as pmf, pdf, cgf, cdf and friends inProbability/Distributions/may not be fully consistent with the https://github.com/leanprover-community/mathlib4/wiki/Porting-wiki#naming-convention (rule 6). I have one PR pending that goes in this folder, then I'd like to do some renaming in the folder to make sure that everything is up to convention. I'm opening up this thread to make it easier to have a discussion on the names, before making the changes and opening a PR.

Josha Dekker (Jan 11 2024 at 10:37):

Acronyms should be lower/upper-case together, so, using dist as a placeholder name for a distribution, we should e.g. replace distPdfReal and distPdf by distPDFReal and distPDF (see e.g. Probability/Distributions/Gaussian.lean). Is this right?

Josha Dekker (Jan 11 2024 at 10:37):

/poll For this, we should use
distPDFReal
distPdfReal
distPDF_Real
other

Josha Dekker (Jan 11 2024 at 10:39):

Of course, a similar thing should be done for pmf, cgf and cdf!

Josha Dekker (Jan 11 2024 at 10:39):

Also, it is a convention in (some areas of) probability theory to write pdf and CDF rather than pdf and cdf, but I think that the LEAN way of doing things would be to keep it all lowercase, right?

Josha Dekker (Jan 11 2024 at 10:42):

It may be the case that changing these names break dependencies outside of Matlib, but the number of changes is large (most lemma's/theorems in these files contain one of these names), so I'm not sure whether we should make aliases for everything or just accept that things may break (but are easily fixed) in projects that depend on Matlib?

Josha Dekker (Jan 11 2024 at 10:43):

/poll Use aliases?
Yes
No

Josha Dekker (Jan 11 2024 at 10:44):

Also, once I start on this crusade, I may also change these same naming issues in some other places within Probability/, if that is okay? (Only if we agree that the names should be changed in the first place, of course!)

Yaël Dillies (Jan 11 2024 at 10:44):

Josha Dekker said:

Also, it is a convention in (some areas of) probability theory to write pdf and CDF rather than pdf and cdf, but I think that the LEAN way of doing things would be to keep it all lowercase, right?

Can you show examples?

Josha Dekker (Jan 11 2024 at 10:45):

Yaël Dillies said:

Josha Dekker said:

Also, it is a convention in (some areas of) probability theory to write pdf and CDF rather than pdf and cdf, but I think that the LEAN way of doing things would be to keep it all lowercase, right?

Can you show examples?

In probability theory or in lean?

Yaël Dillies (Jan 11 2024 at 10:47):

In mathlib

David Loeffler (Jan 11 2024 at 10:48):

I'd be interested to see examples in the informal literature as well – I am surprised by the claim that it is is standard to use different case conventions for these two related concepts.

Josha Dekker (Jan 11 2024 at 10:50):

Yaël Dillies said:

In mathlib

My claim was that some areas of the informal literature write pdf vs CDF, but I'm not advocating for that!
In Mathlib, Probability/cdf uses all lowercase cdf, which is fine I think!

Yaël Dillies (Jan 11 2024 at 10:52):

I see that cdf is camel-cased as Cdf. It should be camel-cased as CDF. Apart from that, the naming convention seems respected.

Josha Dekker (Jan 11 2024 at 10:54):

David Loeffler said:

I'd be interested to see examples in the informal literature as well – I am surprised by the claim that it is is standard to use different case conventions for these two related concepts.

So it is typical to write F(x)F(x) for a CDF, and f(x)f(x) for a PDF; I have seen authors write pdf rather than PDF to stress that. I suspect this to be mainly the case in some undergrad books/readers that I've used in the past, but I'm not in my office so can't check that now unfortunately. But for mathlib this shouldn't matter, as acronyms are all lower/all uppercase depending on their usage anyway! Also, I think that the preferable thing to do is PDF and CDF anyway, so I probably shouldn't have added it to the thread!

Josha Dekker (Jan 11 2024 at 10:55):

Yaël Dillies said:

I see that cdf is camel-cased as Cdf. It should be camel-cased as CDF. Apart from that, the naming convention seems respected.

Yes, it is fine in Mathlib, I just thought I'd briefly mention that I'd seen such naming conventions in the past, but I should probably have realised that that was beside the point of this thread.

Josha Dekker (Jan 11 2024 at 11:04):

David Loeffler said:

I'd be interested to see examples in the informal literature as well – I am surprised by the claim that it is is standard to use different case conventions for these two related concepts.

Screenshot-2024-01-11-at-12.02.07.png
Screenshot-2024-01-11-at-12.01.55.png
These are taken from a reader which is based on Bain & Engelhardt, Introduction to Probability and Mathematical Statistics.

Josha Dekker (Jan 12 2024 at 08:42):

Okay I just started working on a PR with all the name-changes. In Probability/Kernel/CondCDF, we have a lot of Cdf terms that should be CDF based on our earlier considerations, shall I change them as well?

Josha Dekker (Jan 12 2024 at 08:52):

Similar for: Probability/...

  • Kernel/Distintegration.lean
  • Cdf.lean: we define a def cdf, should this be CDF? Also, the occurrences of condCdf should be condCDF when Probability/Kernel/CondCDF gets its renames.

Josha Dekker (Jan 12 2024 at 08:59):

This is now #9681. First commit only changes the files in Probability/Distributions, subsequent commits also change the things in my previous two messages, with the exception that it doesn't rename cdf to CDF yet (as I'm not 100% sure about that one)

Josha Dekker (Jan 12 2024 at 09:04):

We now have a definition cdf and a definition PMF (in Probability/Cdf.lean and Probability/ProbabilityMassFunction/Basic.lean respectively), but it feels like their capitalisation should be treated equally?


Last updated: May 02 2025 at 03:31 UTC