Zulip Chat Archive

Stream: general

Topic: Plots of mathlib4 and lean4 lines of code over time


Joscha Mennicken (Aug 02 2024 at 23:58):

Hi, I recently wrote a small tool to plot lines of code based on git blame and thought mathlib4 would be a nice benchmark. There are two (interactive) graphs (that don't work well on mobile, sorry):

The commits are roughly ordered by their committer date, but inside each month they are ordered topologically and spread out to cover the entire month evenly. This is also the reason why the graph seems to flatten out around the beginning of the current month - there's just not many commits yet.

Controls: Click and drag or scroll to zoom. Double-click to fully zoom back out. Click a commit to pin its info to the right, for copy-pasting. Click on the legend to hide/show individual years or authors. Ctrl-click to show only a specific year or author.

image.png
image.png

Edward van de Meent (Aug 03 2024 at 09:12):

big thanks to our greatest contributor Misc. Authors for making the project possible :grinning_face_with_smiling_eyes:

Edward van de Meent (Aug 03 2024 at 09:18):

i'm curious, is there a significant change in policy/workflow that caused the slowing of growth around the start of july 2023?

Yaël Dillies (Aug 03 2024 at 09:18):

The port ended!

Edward van de Meent (Aug 03 2024 at 09:18):

ahhh

Edward van de Meent (Aug 03 2024 at 09:21):

it looks like the sudden drop is the removing of #align-like lines, for those wondering.

Joscha Mennicken (Aug 03 2024 at 11:54):

I've produced the same set of graphs for the lean4 repository (excluding stage0 as best as I could), and they look a lot more exciting :P. I've also updated the mathlib4 author graph to use the author names instead of email addresses, and to count the amount of hidden misc authors.

If you're wondering about the vertical jumps from mid-2021 to mid-2023, those are due to lake and lean4 being developed in parallel but separate repos/branches, and they go away when lake is merged into lean.

image.png
image.png

Ruben Van de Velde (Aug 03 2024 at 11:56):

Perhaps for mathlib it would be more interesting to have plots ignoring everything before the end of the port

Joscha Mennicken (Aug 03 2024 at 12:21):

According to https://leanprover-community.github.io/mathlib-port-status/, the port was declared complete on 2023-07-16, so here are the mathlib4 graphs but every commit before 2023-07-16 is ignored:

By the way, authors are sorted roughly by their area below the graph.

image.png
image.png

Yaël Dillies (Aug 03 2024 at 12:29):

Yet I don't beat Misc. :pensive:

Scott Carnahan (Aug 03 2024 at 15:16):

These were very interesting charts! I got credit (or blame) for more lines than I had expected. Looking a little more closely, it turns out that about half of them came from splitting up big files, i.e., other people's hard work.

Joscha Mennicken (Aug 03 2024 at 15:32):

I find it interesting how mathlib seems to be growing at a fairly consistent rate compared to other repos. The size and complexity doesn't seem to be slowing it down.

Patrick Massot (Aug 03 2024 at 19:51):

Scott, that kind of phenomenon is part of why we stopped having such statistics on the community website. And of course the port completely changed those statistics.

Joscha Mennicken (Jan 11 2025 at 22:10):

I've updated the graph of lines of mathlib4 code over time, split up by the year they were written and created a version without the refactor jump at 2024-07-19 (the lines before the refactor jump are scaled down linearly). See the original message for why the last month may look too flat.

image.png
image.png

Eric Wieser (Jan 11 2025 at 23:55):

I would guess the refactor jump can be addressed by including mathlib dependencies (following the shas in the lake-manifest.json)

Eric Wieser (Jan 11 2025 at 23:55):

That is, it represents code being upstreamed

Kim Morrison (Jan 12 2025 at 01:01):

Oh, I assumed it was removing #align.

Joscha Mennicken (Jan 12 2025 at 01:03):

Yes, that's the cause of the jump (commit 465f26fde59aa31d051ef0b6db1fb34476802b64) as you can see if you zoom into the graph enough.


Last updated: May 02 2025 at 03:31 UTC