Topic: de Bruijn factor of Lean?
Kevin Buzzard (Apr 21 2020 at 14:37):
Freek Wiedijk writes about the de Bruijn factor of a formalisation; he gives a reasonable-looking definition but one could also try things such as "number of lines of code in textbook v number of lines of code in formalisation". The problem with measuring it, though, is that a formalisation of something a la mathlib might contain a whole bunch of solid API stuff which is not mentioned in the books, or a formalisation might borrow from several print sources. Note also that "Lean's de Bruijn factor" may not even make sense; it might well be different for different branches of mathematics.
I'm writing an article and I'd like to mention this concept, but currently I'm at a bit of a loss as to what I can say about it; is there any way that some kind of approximation of a de Bruijn factor for Lean can be measured?
Mario Carneiro (Apr 21 2020 at 14:53):
You should generally not count API stuff as part of the de Bruijn factor unless it is also part of the informal text
Mario Carneiro (Apr 21 2020 at 14:54):
It's usually rare to have a direct comparison just happen by accident. You can easily make synthetic benchmarks by proving self contained theorems though
Mario Carneiro (Apr 21 2020 at 14:56):
and yes, significant variance is expected. This number depends on a myriad of factors
Bryan Gin-ge Chen (Apr 21 2020 at 15:05):
Out of curiosity I just did it for the sensitivity proof.
proof.tex.gz (see here) is 1159 bytes and
sensitivity.lean.gz (see here) is 5931 bytes for a de Bruijn factor of 5.12. If I strip out the comments from
sensitivity.lean (in a not too careful fashion) then
sensitivity_stripped.lean.gz is 3960 bytes for a de Bruijn factor of 3.42.
Johan Commelin (Apr 21 2020 at 15:21):
Ha, that's a nice test case.
Last updated: May 13 2021 at 17:42 UTC