Zulip Chat Archive
Stream: condensed mathematics
Topic: Project statistics
David Michael Roberts (Mar 15 2021 at 06:51):
I thought it might be interesting to keep track of how the LTE is going as far as length of code produced. I appreciate it may not be easy with material going into mathlib, so maybe even just tracking the main self-contained parts is reasonable. Is there any automatic way to do this? Or are the parts all moving too fast at present?
Johan Commelin (Mar 15 2021 at 07:12):
@David Michael Roberts Does https://github.com/leanprover-community/lean-liquid/pulse give you some of the stats you want?
Johan Commelin (Mar 15 2021 at 07:13):
You can also run git ls-files | xargs wc
to get very rough line-count statistics
David Michael Roberts (Mar 15 2021 at 08:46):
I'd checked that, but it only goes back so far. I found this online tool that seems to a reasonable job, giving a current snapshot only: https://codetabs.com/count-loc/count-loc-online.html (so no historical added/removed)
Johan Commelin (May 26 2021 at 07:27):
I'm giving a talk about LTE later today, so I decided to gather a bit more statistics.
- Scholze's proof:
- Blueprint:
- Lean code:
strip-lean.tar.gz
(nofor_mathlib/
)strip-blue.tar.gz
- De Bruijn factor
Scott Morrison (May 26 2021 at 07:29):
Which you interpret to mean ...? :-)
Johan Commelin (May 26 2021 at 07:31):
I don't really interpret. That's the job of my audience...
Johan Commelin (May 26 2021 at 07:32):
It seems that the blueprint compresses better than the lean code. So I guess there's more redundancy in the blueprint? :stuck_out_tongue_wink:
Johan Commelin (May 26 2021 at 07:34):
(Edit: I forgot to count 8.17 and 8.19, so the proof in Analytic.pdf is more like 12p.)
Riccardo Brasca (May 26 2021 at 07:38):
Is the talk public?
Johan Commelin (May 26 2021 at 07:39):
http://individual.utoronto.ca/groechenig/K.html
Scott Morrison (May 26 2021 at 07:46):
Johan is on at . Too late for me. :-)
Johan Commelin (May 26 2021 at 07:47):
Sorry :shrug: I'm also not sure it's recorded. But ey, you guys won't here anything new anyways.
Sebastien Gouezel (May 26 2021 at 07:52):
How come the Lean text only compresses by a factor of two? It's just text, right, with a lot of words coming back, so it should be much better. Or has it something to do with unicode? I guess a more efficient compression engine would give different results.
Johan Commelin (May 26 2021 at 08:03):
I'm not a compression expert...
Johan Commelin (May 26 2021 at 08:09):
-rw-rw-r-- 1 jmc jmc 21764 May 26 08:08 strip-blue.tar.xz
-rw-rw-r-- 1 jmc jmc 468512 May 26 08:08 strip-lean.tar.xz
Johan Commelin (May 26 2021 at 08:13):
lz4
only makes things worse
-rw-rw-r-- 1 jmc jmc 37870 May 26 08:12 strip-blue.tar.lz4
-rw-rw-r-- 1 jmc jmc 619758 May 26 08:12 strip-lean.tar.lz4
Last updated: Dec 20 2023 at 11:08 UTC