Zulip Chat Archive
Stream: mathlib4
Topic: Topic label for "data"?
Yury G. Kudryashov (Aug 03 2024 at 01:45):
Should we add t-data
label for List
/Multiset
/Finset
/...?
Yaël Dillies (Aug 03 2024 at 05:37):
I have thought about this before and it is reasonable, but I wonder whether it should include things like #11243 should be included?
Jon Eugster (Aug 14 2024 at 07:53):
Coming back to this, I'd be in favour of a topic label t-data
or t-basic
. I tried to label made a push on all unlabelled PRs and right now there are 18 left. Most of them could fall into such a label's domain. (some I labelled t-number-theory
when they barely belong there, e.g. a PR about ZMod
could probably be t-basic
, too)
At least for me, I tend to miss unlabbelled PRs as I'm filtering the #queue by labels of interest.
Would it be okay if I created such a topic label? (@reviewers)
Damiano Testa (Aug 14 2024 at 07:54):
Maybe there should be a bot that adds t-
-labels automatically (and there should be a good discussion about what heuristics the bot should use!).
Damiano Testa (Aug 14 2024 at 07:55):
(Note that this was brought up, certainly by me and likely by others as well in other threads.)
Yaël Dillies (Aug 14 2024 at 08:00):
Jon Eugster said:
I made a push on all unlabelled PRs and right now there are 18 left
Because it took me a good minute to parse, let me rephrase this sentence for others: "I tried labelling all unlabelled PRs and I managed to do so except for those 18 PRs"
Johan Commelin (Aug 14 2024 at 08:10):
@Jon Eugster Please go ahead. Since you are most up to date on which labels are useful and applicable, I think you are also in the best position to decide whether t-data
or t-basic
makes the most sense. If you opt for t-basic
, maybe mention the Data/
folder in its description. Of course, we can also have both, if you think that's useful.
Yaël Dillies (Aug 14 2024 at 08:11):
Yes, I think there are some computer-sciency PRs that are not really "basic" but are really "data"
Damiano Testa (Aug 14 2024 at 08:13):
t-cs
could be a label as well.
Damiano Testa (Aug 14 2024 at 08:13):
Is there a reason to not allow many labels?
Yaël Dillies (Aug 14 2024 at 08:14):
It makes it difficult to search. But I would think we are still pretty far from that point
Damiano Testa (Aug 14 2024 at 08:14):
I would think that there may be a small number of labels that are possibly managed by a bot and then "human" labels that can be generally informative.
Damiano Testa (Aug 14 2024 at 08:15):
For instance, maybe every PR should get labeled by t-xxx
where xxx
ranges over all root folders in mathlib that the PR touches.
Damiano Testa (Aug 14 2024 at 08:16):
That would give an easy way of filtering. On top of that, any author can add more labels, if they think that it would be useful.
Michael Rothgang (Aug 14 2024 at 08:21):
Damiano Testa said:
That would give an easy way of filtering. On top of that, any author can add more labels, if they think that it would be useful.
Or perhaps remove labels when these are "clearly not relevant".
Michael Rothgang (Aug 14 2024 at 08:24):
To add to this: I usually tag any PR "split large file", "trim import bush" or "remove any nolints" with tech debt
. Perhaps this should not be automated yet, but it could.
Michael Rothgang (Aug 14 2024 at 08:29):
@Jon Eugster Thanks for this labelling push, this is great! There might be a follow-up: are there PRs without any topic
label that should get one? (For instance, a few PRs are only labelled with tech debt
. Or, are they feat
ure PRs without such a label?
Michael Rothgang (Aug 14 2024 at 08:33):
I just checked: there are currently 166 feature PRs without a t-something
label.
Jon Eugster (Aug 14 2024 at 09:16):
In that case, I think I'll add t-data
for now.
And if somebody (looking at @Damiano Testa :wink: ) wants to write a CI doing some sort of automatic label assignment, my impression from this conversation is that this would be pretty much welcomed by everybody, regardless of the exact implementation details.
Damiano Testa (Aug 14 2024 at 11:15):
I'll take a look at this: in the first instance, I will try to see if I can add a t-xxx
for each root dir containing modified files (maybe not labeling anything if there are more than 3 root dirs involved).
Damiano Testa (Aug 14 2024 at 11:16):
Once/if that works, we can then implement better heuristics.
Damiano Testa (Aug 14 2024 at 11:16):
However, from working with CI, I know that the algorithm is not usually the hard part: getting CI to remotely do what you want is!
Michael Rothgang (Aug 15 2024 at 11:18):
Michael Rothgang said:
I just checked: there are currently 166 feature PRs without a
t-something
label.
I just made a push over all remaining PRs: there are 17 remaining PRs without a t-something label: three of them are squarely CI (hence, should be excluded from the github search also), for the others it's less clear how to label them.
Michael Rothgang (Aug 15 2024 at 11:19):
One strong take-away: not every PR can be nicely labelled; having an "I don't know" state seems useful.
Last updated: May 02 2025 at 03:31 UTC