Zulip Chat Archive

Stream: mathlib4

Topic: naming convention linter

Jireh Loreaux (Dec 18 2025 at 21:38):

For those of you (maybe @Damiano Testa, @Thomas Murrills or @Michael Rothgang ?) who have experience writing syntax linters, a trivial one to write would be a linter to help us adhere to the naming conventions. Of course, we could have something sophisticated, but the most basic version would just do this:

Is this an inductive, structure or class (yes, I realize the latter two are just special cases of the former, so potentially you only need to deal with inductive, I don't know)? Does it start with an upper case letter and contain no _? If not, warn the user:

The naming conventions dictate that types (i.e., terms of Sort _, Type _ or Prop) are named using UpperCamelCase, so the name should start with an upper case letter and contain no underscores.
Is this a def? If so, check whether it is a type or not. If it is, the proceed as in (1). If not, check whether the name contains any underscores and starts with a lowercase letter. If it doesn't, warn the user:

The naming conventions dictate that definitions of things that are not themselves types are named using lowerCamelCase, so the name should start with a lower case letter and contain no underscores.

Bryan Gin-ge Chen (Dec 18 2025 at 21:48):

(For something sophisticated, I'm still hoping someone will pick up the work started here: #lean4 > automatic spelling generation & comparison @ 💬)

Jireh Loreaux (Dec 18 2025 at 22:03):

The biggest problem with that I think is that we still haven't even decided (at least, not in a way that is reflected in the naming conventions, but I think more than that) upon the naming convention regarding dot notation, scoped infix notation, etc. I received literally no feedback (in the form of :thumbs_up: or :thumbs_down:) on the proposal I had in this regard several weeks ago. There is some concern that my proposal would cause a great many deprecations within CategoryTheory, so perhaps it's not the right thing, but no one else has proposed a concrete alternative (to my knowledge)

Bryan Gin-ge Chen (Dec 18 2025 at 22:23):

I haven't looked too closely at the code there, but certainly an important step before turning that into a linter will be building some basic tools in Lean to compare types to names. I hope we'd then be able to easily answer questions like "with this proposed rule, how many violations are there in core / batteries / mathlib?"

Michael Rothgang (Dec 18 2025 at 22:29):

Even the basic linter suggested above sounds useful and not too difficult to write. I may give this a try, but no guarantees when I'll find the time. If I haven't made a comment to that effect, feel free to go ahead yourself!

Jireh Loreaux (Dec 18 2025 at 22:38):

Michael Rothgang said:

Even the basic linter suggested above sounds useful and not too difficult to write.

Hence the reason I suggested it. :slight_smile:

Violeta Hernández (Dec 18 2025 at 22:52):

Oh, I wasn't aware we were supposed to give feedback. Just gave your proposal a thumbs up.

Thomas Murrills (Dec 18 2025 at 22:54):

I've been thinking lately about what it would look like to have a "declaration manipulation" infrastructure, because parts of this on the implementation side are currently a bit inconvenient (or subtly tricky) for only mundane reasons (and would be trivial with the right infrastructure). It would be nice to have a "declaration linter" setup which would subsume linters like unusedDecidableInType, this linter, and other linters, and provide them with the information necessary to act on all user-written decls in performant ways.

(That last bit about performance is part of the motivation here, too; robustly detecting and associating new declarations with elaboration info can be a bit expensive.)

I understand that this proposal is intended specifically to be doable immediately without deeper considerations like this, but doing it immediately (in an inevitably somewhat bespoke, nonreusable manner) is another small bit of technical debt. Maybe we want to take on that technical debt if the value is great enough! :) But either way, I think that for long-term maintainability (especially since this will not be the only linter of this sort) it would be great to head in the infrastructure direction, even if we start off minimally. :)

Jireh Loreaux (Dec 18 2025 at 23:02):

Thomas Murrills said:

I understand that this proposal is intended specifically to be doable immediately without deeper considerations like this, but doing it immediately (in an inevitably somewhat bespoke, nonreusable manner) is another small bit of technical debt.

Technical debt? I'm not convinced. Not a good use of time if something is in the pipeline? Maybe.

It's not technical debt because nothing depends on it. You just rip it out when you have something better with which to replace it or when it doesn't perform the way you want.

Indeed, the point of my suggestion was that I thought it could probably be implemented in an afternoon by someone with the requisite knowledge, and we could reap the benefits immediately at low cost (i.e., no CI performance hits, etc.). This occurred to me because I was reviewing a PR in which I had to specify this to the author, whereas this tool could have done it for me! :smiley:

Thomas Murrills (Dec 18 2025 at 23:09):

Jireh Loreaux said:

It's not technical debt because nothing depends on it. You just rip it out when you have something better with which to replace it or when it doesn't perform the way you want.

I don't think this is an exhaustive definition of technical debt. Giving ourselves another bespoke metaprogram to maintain is part of the form that this particular form of technical debt takes. :) Also, that "just" in "You just rip it out" is load-bearing--refactoring something in terms of new infrastructure when it becomes available can indeed sometimes be trivial, but can often (ime, especially in metaprogramming) be plagued by incompatible design decisions making it nontrivial to sort out what parts remain and what stay. So this form of technical debt is the thing that is paid in the future refactoring effort.

Jireh Loreaux said:

we could reap the benefits immediately at low [immediate] cost (i.e., no CI performance hits, etc.)

The net value here might be a reason to take on that debt! :)

Thomas Murrills (Dec 18 2025 at 23:10):

(I am experimenting right now to see if the costs are not too bad in this case. :grinning_face_with_smiling_eyes:)

Jireh Loreaux (Dec 19 2025 at 00:15):

Just to advocate my point (because I do note that you're already attempting this):

Maintenance burden on a small metaprogram is generally minimal.
Rip it out means you don't refactor it or try to reuse the ideas, because something more sophisticated would do it completely differently. In this way, you pay a one-time development cost, but don't pay more in the future (modulo the already-stated minor maintenance cost). This is why I claim it is not technical debt.

If I'm wrong about how small the metaprogram is, then there's a different story here.

Thomas Murrills (Dec 19 2025 at 01:28):

Yes, but it does quickly add up. :) This isn't an argument for somehow pretending the cost is bigger than it is or anything; it's just an argument for being aware of the global situation! That is, it is indeed fine to add a little maintenance cost globally, as long as you don't add a ton of little maintenance costs locally and therefore a big one globally. So that's what's informing my general inclination here, but it's certainly not a black-and-white rule of any sort.
Okay, but how do you make sure that that's what actually winds up happening? :) One way is simply to plan, and then document the plan; that would probably be enough. A failure mode if we didn't do so could look like: we add a small metaprogram, with vague undocumented intent to rip it out later; we don't get around to the more general thing, and now with the base version in place, it's only a small cost to extend it further, so we decide there's no harm in that; perhaps this repeats; now it is not so clear that the handwaved infrastructure is more sophisticated once it starts to be developed, and perhaps the scope of the new thing has crept, so it does things slightly differently than the new infrastructure would. So, now we have to assess that switching to the new infrastructure preserves behavior, or make decisions about the behavior changing... Again, this is just a worst-case scenario to ward off, not something we should imagine always happening! But my thinking is we may as well try to avoid it if it's reasonable to do so--i.e. as long as doing so doesn't come with its own untenable costs (such as indefinite delay). :)

Thomas Murrills (Dec 19 2025 at 01:33):

And re: how small the metaprogram actually is: dealing with the many variations of declaration syntax can be pretty thorny, and in general, associating elaboration info with the declaration syntax and actual resulting decls can be tricky (and sometimes cursed). But I'm wondering if the limited scope here means we can get away with pretty much just handling the syntax, which might be briefly gnarly, but after that, possibly not too bad...

Kim Morrison (Dec 19 2025 at 07:59):

I am pretty scared about the creeping maintenance cost of linters. Despite generally being a "perfect is the enemy of good" sort of person, I want to back up Thomas' concerns here. The meta code for linters brings a real cost.

Damiano Testa (Dec 19 2025 at 15:57):

By the way, Mathlib already contains over 400 definitions that contain an underscore in their name, regardless of whether they are correctly capitalised for being or not a type.

import Mathlib

open Lean Elab Command

run_meta
  let env ← getEnv
  let csts := env.constants.map₁
  let mut nms := #[]
  for (nm, cinfo) in csts do
    -- skip non-definitions
    unless cinfo.isDefinition do continue
    ---- skip internal details
    if nm.isInternalDetail then continue
    -- skip instances
    if ← Meta.isInstance nm then continue
    -- skip parsers and delaborators
    if let some val := cinfo.value? then
      let comps := val.getAppFnArgs.1.components
      if comps.contains `ParserDescr || comps.contains `Delaborator then continue
    if nm.components.getLast!.toString.contains '_' then
      -- skip declarations that are not in mathlib
      let mod ← findModuleOf? nm
      let some `Mathlib := mod.map (·.getRoot) | continue
      nms := nms.binInsert (·.toString < ·.toString) nm
  logInfo <| m!"\n".joinSep <|
    m!"{nms.size} definitions contain an underscore in their last segment" :: "" :: (nms.map (m!"{.ofConstName ·}")).toList

Bhavik Mehta (Dec 19 2025 at 18:03):

Jireh Loreaux said:

but no one else has proposed a concrete alternative (to my knowledge)

The alternative I proposed in the other thread, and a couple of others have done before me (for example here), was for the naming convention to match what the pretty printer does. I won't repeat all the arguments in detail in favour of it here, but it avoids many of the issues with your proposal, eg fst becoming prefix, CategoryTheory being entirely deprecated, numerous exceptions being introduced to the convention, and ContinuousLinearMap.comp lemmas no longer matching their LinearMap counterparts.

Michael Rothgang (Dec 19 2025 at 19:19):

#33103 renames five misnamed definitions

Michael Rothgang (Dec 19 2025 at 19:19):

And https://github.com/leanprover-community/mathlib4/pull/33103/commits/b03e1094c7e3e3111c585a56ff4fba5f4ea40d99 exposes an interesting bug in alias' interaction with noncomputable: I get an error

failed to compile definition, compiler IR check failed at `fintype_of_finiteDimensional`. Error: depends on declaration 'IsLocalFrameOn.fintypeOfFiniteDimensional', which has no executable code; consider marking definition as 'noncomputable'

Michael Rothgang (Dec 19 2025 at 19:21):

Marking the definiting preceding it with noncomputable explicitly solves it (which is otherwise not necessary, as we're in a noncomputable section).

Michael Rothgang (Dec 19 2025 at 19:25):

The list generated by the above contains 34 definitions Foo.Simps.symm_apply, which are tagged with the "custom simps projection" library note. I consider these false positives.

Jireh Loreaux (Dec 19 2025 at 19:30):

@Bhavik Mehta, sorry, I meant a more official proposal, officially branded as such. I should have been more clear. I would like to see it as a more complete proposal. For example, how does it deal with scoped notations? What happens if we change the way something pretty prints (e.g., with pp_nodot or whatever it is now)?

Damiano Testa (Dec 19 2025 at 19:52):

Michael Rothgang said:

Marking the definiting preceding it with noncomputable explicitly solves it (which is otherwise not necessary, as we're in a noncomputable section).

This shows that there is an issue...

import Mathlib

-- adds to `noncomputableExt`, among others
whatsnew in
noncomputable def X' := (· / · : ℝ → ℝ → ℝ)

noncomputable section

-- only adds to the `declRangeExt`
whatsnew in
def X := (· / · : ℝ → ℝ → ℝ)

Michael Rothgang (Jan 26 2026 at 13:35):

#34440 adds a very basic check: it complains about declarations (in the source) which have an uppercase name component. Surely, some of these are false positives, but a number of them looks quite suspicious. For instance:

Help with fixing them is very welcome!

Michael Rothgang (Jan 26 2026 at 13:36):

This CI run will have more results.

Michael Rothgang (Jan 26 2026 at 13:38):

@Jeremy Tan @Harald Husum If you're looking for a task to help at, taking a declaration from this list (or several in the same file) and proposing a rename could be helpful.

Violeta Hernández (Jan 26 2026 at 14:19):

#34452 fixes the ordinal one

Floris van Doorn (Jan 26 2026 at 14:34):

I don't think any of these are false positives.

Violeta Hernández (Jan 26 2026 at 14:38):

And #34454 fixes the Mahler measure one

Michael Rothgang (Jan 26 2026 at 14:47):

The CI log has some warnings which arguably are linter false positives. And a lot of true positives.

Michael Rothgang (Jan 27 2026 at 12:41):

#34483 fixes three names in Moments/SubGaussian.lean

Michael Rothgang (Jan 28 2026 at 14:43):

#34440 should now be ready for review: I converted it to an environment linter and added all current violations as exceptions. There are about 660 of them; most appear to be true positives. (If you spot a false pattern that should be allowed as an exception, let me know; this is easy to add.)

Michael Rothgang (Jan 28 2026 at 14:50):

#34519 is a new linter for definitions whose names contain underscores: at the moment, there are 1000 exceptions (added to nolints.json). PRs fixing violations are very welcome; hints for avoiding some more false positives also.

Vlad Tsyrklevich (Jan 28 2026 at 15:50):

(deleted) I see this is already discussed in the PR and the nolints.json does not yet reflect the new rules.

Vlad Tsyrklevich (Jan 28 2026 at 15:52):

(deleted)

Vlad Tsyrklevich (Jan 28 2026 at 15:54):

(deleted)

Michael Rothgang (Jan 28 2026 at 17:15):

There are new 550 exceptions for the linter in #34519 - much better!

Michael Rothgang (Feb 09 2026 at 21:36):

#35055 fixes another misnamed declaration

David Loeffler (Feb 14 2026 at 10:59):

It would make me rather sad if the Gamma function, or the congruence subgroups ( $\Gamma_0(N)$ etc), were to be lowercased. We should not prioritize "consistency with rules mathlib has made for itself" over "consistency with widespread standard conventions in real-world mathematics".

Ruben Van de Velde (Feb 14 2026 at 11:02):

But we're not using $\Gamma$ in either case

Jon Eugster (Feb 14 2026 at 11:24):

I'm sure the naming convention can always be extended with exceptions which can be reflected by any linter, if there is the desire for one.

I think it's reasonable that docs#Real.Gamma falls into the same exception as e.g. Icc :+1: (or would at least require a larger Zulip discussion if there would be any intend to lowercase it)

Michael Rothgang (Feb 14 2026 at 12:32):

I'm fine with making an explicit exception, if there is consensus!

Last updated: Feb 28 2026 at 14:05 UTC