Zulip Chat Archive
Stream: general
Topic: Queue histograms
Johan Commelin (Jul 19 2022 at 14:46):
Can we generate histograms about how long PRs are ope?. And/or how long PRs have to wait for a review after being labeled awaiting-review
? Because those are metrics we care about. The length of #queue is just a proxy. Any GitHub API experts around who know how to do this?
Yakov Pechersky (Jul 19 2022 at 16:18):
Working on it
Yakov Pechersky (Jul 19 2022 at 16:30):
Yakov Pechersky (Jul 19 2022 at 16:30):
I don't know about how to track label-change events, so I don't know how to answer your second question
Yakov Pechersky (Jul 19 2022 at 16:31):
gh pr list --state all --limit 5000 --json number,createdAt,closedAt,labels > prs.json
python pres.py prs.json prs.csv
Yakov Pechersky (Jul 19 2022 at 16:31):
import sys
import csv
import json
entries = []
labels = set()
for entry in json.load(open(sys.argv[1])):
entry["labels"] = [label["name"] for label in entry["labels"]]
labels.update(entry["labels"])
entries.append(entry)
for entry in entries:
for label in labels:
entry[label] = label in entry["labels"]
del entry["labels"]
writer = csv.DictWriter(open(sys.argv[2], "w"), fieldnames=["number", "closedAt", "createdAt"] + sorted(labels))
writer.writeheader()
writer.writerows(entries)
Yakov Pechersky (Jul 19 2022 at 16:31):
Excuse the sloppy python code, just trying to crank something out fast
Yakov Pechersky (Jul 19 2022 at 16:32):
This plot indicates that our PR creation flux has been pretty constant over the past 5K PRs
image.png
Yakov Pechersky (Jul 19 2022 at 16:33):
Yakov Pechersky (Jul 19 2022 at 17:02):
Here's the code to get the label changes -- but I have now hit my GH API limit, and am barred from making more requests for idk how long:
import sys
import csv
import json
import requests
from requests.adapters import HTTPAdapter, Retry
events = []
writer = csv.DictWriter(open(sys.argv[1], "w"),
fieldnames=["action", "created_at", "number", "label"])
writer.writeheader()
session = requests.Session()
retries = Retry(total=5, backoff_factor=1, status_forcelist=[500, 403])
session.mount("https://", HTTPAdapter(max_retries=retries))
per_page = 100
pages = 100 // per_page
for page in range(pages):
resp = session.get(
"https://api.github.com/repos/leanprover-community/mathlib/issues/events",
headers={"Accept": "application/vnd.github+json"},
params={"per_page": per_page, "page": page + 1} # github starts at 1
)
print(resp)
for event in resp.json():
if event["event"] not in ["unlabeled", "labeled"]:
continue
mini_event = {}
mini_event["action"] = event["event"]
mini_event["created_at"] = event["created_at"]
mini_event["number"] = event["issue"]["number"]
mini_event["label"] = event["label"]["name"]
events.append(mini_event)
writer.writerow(mini_event)
Yakov Pechersky (Jul 19 2022 at 17:02):
That's based on https://docs.github.com/en/rest/issues/events
Johan Commelin (Jul 19 2022 at 17:02):
@Yakov Pechersky Thanks! What is the y
-axis in that diagram 2 posts up?
Yakov Pechersky (Jul 19 2022 at 17:03):
the tiny bar chart?
Yakov Pechersky (Jul 19 2022 at 17:03):
Just the count of PRs for each bin
Johan Commelin (Jul 19 2022 at 17:04):
Aha. So it doesn't yet tell us how long PRs are open on average.
Yakov Pechersky (Jul 19 2022 at 17:04):
That was the other bar chart
Yakov Pechersky (Jul 19 2022 at 17:04):
Johan Commelin (Jul 19 2022 at 17:05):
Ooh, I see. And that is "number of days" along the y-axis?
Yakov Pechersky (Jul 19 2022 at 17:06):
The y axis is binning -- there is a hypothesis that the closedAt - createdAt
delta has changed over time
Yakov Pechersky (Jul 19 2022 at 17:06):
The y axis is, for each bin, the median of that delta over that bin
Yakov Pechersky (Jul 19 2022 at 17:07):
Separating out PRs labeled with easy
(at time of close)
Yakov Pechersky (Jul 19 2022 at 17:11):
Here it is more clearly:
image.png
Yakov Pechersky (Jul 19 2022 at 17:11):
I've hidden the outliers on the top
Johan Commelin (Jul 19 2022 at 17:18):
So what does median
mean exactly? Does it mean that in February the median was 81 days for a PR to be merged?
Yakov Pechersky (Jul 19 2022 at 17:21):
Sorry, the most recent plot was in hours
Yakov Pechersky (Jul 19 2022 at 17:22):
That in february, the median time for a PR to go from created to closed was 81 hrs if it is not labeled as easy
, and 6 hrs if easy
Yakov Pechersky (Jul 19 2022 at 17:23):
I haven't differentiated "closed due to bors merge" and "closed due to user closing it without merge". My expectation is that the latter are way fewer than former. But that is why I am using median and not mean
Yakov Pechersky (Jul 19 2022 at 17:23):
Because I assume the latter are not distributed in the way the former are
Eric Wieser (Jul 19 2022 at 17:32):
I guess you didn't auth with github in your example? That probably leads to the rate limit being hit sooner
Yakov Pechersky (Jul 19 2022 at 17:33):
I was hitting 200 for a while, then 500 (set up retry), but then started to hit 403. I guess I am not passing the token that is in my shell, you're right
Johan Commelin (Jul 19 2022 at 17:37):
Aah, 81 hrs makes a lot more sense.
Johan Commelin (Jul 19 2022 at 17:39):
I think this is a very useful chart! Thanks for building it!
Yakov Pechersky (Jul 19 2022 at 18:27):
Thanks to Eric's comment, I got the labelling events out. An example of a viz:
image.png
Yakov Pechersky (Jul 19 2022 at 18:28):
Yakov Pechersky (Jul 19 2022 at 18:28):
from datetime import datetime
import sys
import csv
import json
import requests
from requests.adapters import HTTPAdapter, Retry
writer = csv.DictWriter(open(sys.argv[1], "w"),
fieldnames=["action", "created_at", "number", "label"])
writer.writeheader()
session = requests.Session()
retries = Retry(total=5, backoff_factor=1, status_forcelist=[500])
session.mount("https://", HTTPAdapter(max_retries=retries))
per_page = 100
page = 0
in_range = True
while in_range:
page += 1
resp = session.get(
"https://api.github.com/repos/leanprover-community/mathlib/issues/events",
headers={"Accept": "application/vnd.github+json", "Authorization": f"token {token}"},
params={"per_page": per_page, "page": page + 1} # github starts at 1
)
for event in resp.json():
print(resp)
if event["event"] not in ["unlabeled", "labeled"]:
continue
mini_event = {}
mini_event["action"] = event["event"]
mini_event["created_at"] = event["created_at"]
mini_event["number"] = event["issue"]["number"]
mini_event["label"] = event["label"]["name"]
writer.writerow(mini_event)
if datetime.fromisoformat(mini_event["created_at"][:-1]) < datetime(2022, 1, 1):
in_range = False
Yakov Pechersky (Jul 19 2022 at 18:44):
Johan, are you sure you want the "when's the earliest awaiting-review
"? It's like minutes between PR creating to labeling as awaiting-review
Yakov Pechersky (Jul 19 2022 at 18:48):
As viz'd here, separate bins per month and whether (earliest-awaiting-review-label - PR-creation) < 5
image.png
Kyle Miller (Jul 19 2022 at 19:07):
If you haven't done it already, something that would be nice to see is time between awaiting-review
and any change in status (label change or closure). Individual PRs might have multiple of these intervals.
Yakov Pechersky (Jul 19 2022 at 19:08):
Earliest awaiting-review
or the latest?
Kyle Miller (Jul 19 2022 at 19:09):
for every awaiting-review
, what's the time to the first change in status?
Yakov Pechersky (Jul 19 2022 at 19:18):
Unfortunately, that's all the time I had today for these analyses. Given the csvs I attached earlier, I hope that they're enough to answer
Johan Commelin (Jul 19 2022 at 19:36):
Thanks! That's already very useful!
Kyle Miller (Jul 19 2022 at 21:37):
Here's an attempt at visualizing lengths of review cycles, which is an interval of time when 'awaiting-review' is present, starting after the last 'awaiting-author', 'blocked-by-other-PR', 'awaiting-CI', 'CI', 'merge-conflict', 'help wanted', or 'WIP' was removed, ending when either 'awaiting-review' is removed or 'ready-to-merge' or 'ready-for-bors' are added.
To get a sense of distributions, for each month in Yakov's data (labels.csv
) I have the review cycles in sorted order, and the y-axis is in days. (So the x-axis is just the index of a given review cycle in sorted order.) A review cycle is recorded in the month it ends.
Kyle Miller (Jul 19 2022 at 21:38):
code
Eric Wieser (Jul 19 2022 at 21:43):
Can you replot with the same y axis on each graph, and maybe a log y scale?
Kyle Miller (Jul 19 2022 at 21:46):
I just used LibreOffice Calc -- I tried to figure out how to do any of that with it, but didn't find it with some clicking around.
Here's the CSV in a silly format: review_times.csv
Each month starts with "date, 2022, 2" (for example), then there are idx/days pairs, and then a blank line.
Kyle Miller (Jul 19 2022 at 21:50):
Here's a less silly format if you want to do some processing of your own. It has year/month/day of the end of a review cycle and its duration in days. review_times_full.csv
Johan Commelin (Jul 20 2022 at 05:34):
@Kyle Miller Thanks! That clearly shows that review times have been increasing!
I guess the y-axis is still in hours? And has some outliers chopped off?
Julian Berman (Jul 20 2022 at 05:36):
The ratio of PR submitters to number of distinct PR reviewers (and/or reviews per reviewer) is also possibly something interesting to look at over time. Lots of larger projects I think run into cases where the rate of growth of the former outpaces the latter either because submitting is more fun than reviewing or because it takes active encouragement to get people to think they're good enough to review stuff.
Johan Commelin (Jul 20 2022 at 05:37):
Nice idea. If we can autogenerate a bunch of such plots, I think that would be really useful.
Kyle Miller (Jul 20 2022 at 05:57):
@Johan Commelin The y-axes are in days, and no outliers have been removed. What I'm seeing is that reviewing times have gotten longer, but there has simultaneously been fewer review cycles overall. These are only a few months though so it's hard to say what's going on.
Johan Commelin (Jul 20 2022 at 06:04):
Kyle Miller said:
there has simultaneously been fewer review cycles overall
That's an interesting point! Could that be correlated with there being a lot more PRs that are 20 - 50 lines long? Or even < 20 lines, adding just 1 lemma?
Yaël Dillies (Jul 20 2022 at 09:43):
I suspected it was more of a cultural thing. Given that there are more PRs, you cannot spend as much time reviewing each.
Last updated: Dec 20 2023 at 11:08 UTC