Zulip Chat Archive

Stream: general

Topic: Minimal cardinality of generating set


Andrew Yang (Oct 16 2022 at 10:41):

What would be a good name for the minimal cardinality of a generating set of a submodule? I couldn't find one in the literature except "minimal number of elements generating M". Currently, I am using submodule.min_generator_card, but it is quite lengthy. I wonder if something more radical like submodule.size would be better.

Johan Commelin (Oct 16 2022 at 10:43):

grank? = "generalized rank"

Johan Commelin (Oct 16 2022 at 10:44):

What do you need it for, btw?

Andrew Yang (Oct 16 2022 at 11:02):

It would be useful when stating theorems like Krull's height theorem (the statement would be p ∈ I.minimal_primes → p.height ≤ I.size) and also in the proofs where we would need to perform induction on it while generalizing I.

Andrew Yang (Oct 16 2022 at 11:05):

grank sounds good! I would need a nat-valued version and a enat valued version though. Should it be fin_grank/ grank or grank / egrank?

Kevin Buzzard (Oct 16 2022 at 11:10):

grank is one of those things which is great because it's short but will be hard for those not in the know to understand. I guess we have to ask how much we want to cater for these people. Bytes are cheap, I think we are too quick to forget this here. But all the more descriptive choices seem much longer :-( min_gen_size? :-(

Johan Commelin (Oct 16 2022 at 11:39):

If we go with g_rank and eg_rank, then everyone will understand it is somehow rank-like.

Andrew Yang (Oct 16 2022 at 13:01):

I agree that bytes are cheap, but when they appear multiple times in a statement, it makes it clunky and hard to read.
For example for the presentation RmRnMR^m \to R^n \to M for the finite-presented module MM, the first map would have a type (fin (module.finite.rep M).ker.min_generator_card → R) →ₗ[R] (fin (module.min_generator_card M) → R)
If there are two modules and we need to reason about the homomorphism between the presentations, things would deteriorate quite quickly.

Kyle Miller (Oct 16 2022 at 13:20):

My understanding of this answer is that "rank" of a module corresponds to the cardinality of a maximum independent set (for a module over a domain).

How about gen_card? Taking the cardinality of a minimal generating set is implicit, since you can always add more generators.

Johan Commelin (Oct 16 2022 at 13:24):

I agree that minimal is implicit. But gen_ is ambiguous. The first thing that came to my mind was "generalized [foo]".

Kyle Miller (Oct 16 2022 at 13:31):

I wonder if lin_card ("linearized cardinality") would make sense, thinking of modules as being linearized sets. That's a bit out there though.

Jireh Loreaux (Oct 16 2022 at 14:52):

You could just go with the longer name and introduce local notation for the file where you use it.

Johan Commelin (Oct 16 2022 at 18:53):

But that doesn't help with keeping lemma names short

Johan Commelin (Oct 16 2022 at 18:53):

As Andrew points out, the length of the definition influences the length of lemma names.

Kevin Buzzard (Oct 16 2022 at 19:53):

But do we really care whether lemma names are five characters shorter? Or would we rather they were readable?

Yaël Dillies (Oct 16 2022 at 19:54):

In my experience, longer names tend to be less readable, actually. We humans are pretty good at filling holes mentally.

Johan Commelin (Oct 16 2022 at 19:58):

I think it's a balancing act. Too short is obviously not good either. But too long means that you need a lot more line breaks.

Yaël Dillies (Oct 16 2022 at 20:00):

It also becomes harder to look for lemmas in the docs, because you need to type much more before you filter out the unrelated declarations.

Yaël Dillies (Oct 16 2022 at 20:02):

This in fact goes both ways. Long names poison all searches, because they contain many substrings. Maybe the search ranking could be tweaked to improve that? for example by weighing matches depending on how many letters match in the n-th part of the name (namespaces + fully unqualified name).

Yaël Dillies (Oct 16 2022 at 20:03):

Anyway, my point is that it's reductive to say "bytes are cheap". Short names are not always worse, long names are not always better. It's complex.

Andrew Yang (Oct 16 2022 at 20:29):

I don't actually care if the lemma names are shorter. The lemmas usually only appears in proofs, and we in general doesn't really care that much about the readability of the proof. But this definition will frequently appear in type signatures, and I do care if the statements of lemmas and the info view in tactic states are readable or not.
I agree that there is a cost on using shorter but non-standard/non-descriptive names. But I'd argue that in this particular case it is worth it.

Andrew Yang (Oct 16 2022 at 20:30):

For example, set.Iio is unintelligible for people outside mathlib. But imagine a mathlib where set.Iio were called set.left_infinite_right_open_interval instead.

Junyan Xu (Oct 16 2022 at 20:40):

Kyle Miller said:

My understanding of this answer is that "rank" of a module corresponds to the cardinality of a maximum independent set (for a module over a domain).

How about gen_card? Taking the cardinality of a minimal generating set is implicit, since you can always add more generators.

Notice that we already use docs#group.rank to mean the minimal number of generators (rather than the maximal cardinality of independent sets as in docs#module.rank), but only defined for finitely generated groups; probably docs#group.rank should be renamed when we decide what to call this notion for modules.

And there is precedent in using docs#gen_loop to mean "generalized loop". But we might reverse the order to card_gen to minimize ambiguity, and mathlib do have other ambiguities like ball in docs#ball_congr and in docs#ball_pi.

Other possible options:

Min-gen is sort of a dual notion of max-indep, but corank has other meanings, and dual_rank seems obscure. In graph theory, the dual of max-indep is called minimum (vertex) cover, and they talk about the vertex cover number.

dimension seems underutilized in mathlib, but I think it's still usually a maximum (like Krull dimension).

In any case, I think this notion should be defined for module, and if we want to apply it to a submodule then we just coerce it to a module,

Eric Rodriguez (Oct 16 2022 at 21:45):

Yaël Dillies said:

Anyway, my point is that it's reductive to say "bytes are cheap". Short names are not always worse, long names are not always better. It's complex.

for an extreme example of this: i'll be impressed if someone ever finds the cosine rule without knowing what it's called already

Junyan Xu (Oct 16 2022 at 22:29):

I didn't know; I searched "sub two cos" and docs#euclidean_geometry.law_cos pops out (among 5 results).

Jireh Loreaux (Oct 16 2022 at 22:45):

I think the point is that's just an alias for the "actual" result: docs#euclidean_geometry.dist_sq_eq_dist_sq_add_dist_sq_sub_two_mul_dist_mul_dist_mul_cos_angle

Kyle Miller (Oct 16 2022 at 22:55):

Junyan Xu said:

Notice that we already use docs#group.rank to mean the minimal number of generators (rather than the maximal cardinality of independent sets as in docs#module.rank), but only defined for finitely generated groups; probably docs#group.rank should be renamed when we decide what to call this notion for modules.

Just beware that the rank of a group is already an established concept in mathematics for the minimal number of generators of a group.

Maybe another avenue for names is to consider that we're wanting the minimum rank of a free module F such that there is a surjection F -> M. Would proj_rank, surj_rank, or cov_rank (for "covering rank") be sensible? (You wouldn't want to confuse proj_rank with the projective dimension.) Then there is also be the maximum rank of a free module F such that there is an injection F -> M; could that be the inj_rank?

Junyan Xu (Oct 16 2022 at 23:12):

Just beware that the rank of a group is already an established concept in mathematics for the minimal number of generators of a group.

Okay, so it shouldn't be renamed.

Andrew Yang (Oct 17 2022 at 03:21):

Would covrank/fin_covrank be better than grank/fin_grank (where the g stands for both generator and generalized)?

Junyan Xu (Oct 17 2022 at 14:42):

I think span_rank and span_finrank are better, since you're gonna use docs#submodule.span to define the quantity. Pete Clark uses mg in this answer and his commutative algebra notes (in which he points out that the quantity behaves badly) though.

I've also thought about calling span_rank "upper rank" and the original rank "lower rank", because rank ≤ span_rank always holds over nonzero commutative rings. However, the inequality may reverse even for IBN rings, and I think including the keyword span is clearer. Moreover, over commutative rings, although the rank is always the minimum of the local ranks if the module is locally of finite type, the span_rank can be larger than the maximum.


Last updated: Dec 20 2023 at 11:08 UTC