Zulip Chat Archive

Stream: general

Topic: de Bruijn factor of Lean?

Kevin Buzzard (Apr 21 2020 at 14:37):

Freek Wiedijk writes about the de Bruijn factor of a formalisation; he gives a reasonable-looking definition but one could also try things such as "number of lines of code in textbook v number of lines of code in formalisation". The problem with measuring it, though, is that a formalisation of something a la mathlib might contain a whole bunch of solid API stuff which is not mentioned in the books, or a formalisation might borrow from several print sources. Note also that "Lean's de Bruijn factor" may not even make sense; it might well be different for different branches of mathematics.

I'm writing an article and I'd like to mention this concept, but currently I'm at a bit of a loss as to what I can say about it; is there any way that some kind of approximation of a de Bruijn factor for Lean can be measured?

Mario Carneiro (Apr 21 2020 at 14:53):

You should generally not count API stuff as part of the de Bruijn factor unless it is also part of the informal text

Mario Carneiro (Apr 21 2020 at 14:54):

It's usually rare to have a direct comparison just happen by accident. You can easily make synthetic benchmarks by proving self contained theorems though

Mario Carneiro (Apr 21 2020 at 14:56):

and yes, significant variance is expected. This number depends on a myriad of factors

Bryan Gin-ge Chen (Apr 21 2020 at 15:05):

Out of curiosity I just did it for the sensitivity proof. proof.tex.gz (see here) is 1159 bytes and sensitivity.lean.gz (see here) is 5931 bytes for a de Bruijn factor of 5.12. If I strip out the comments from sensitivity.lean (in a not too careful fashion) then sensitivity_stripped.lean.gz is 3960 bytes for a de Bruijn factor of 3.42.

Johan Commelin (Apr 21 2020 at 15:21):

Ha, that's a nice test case.

Last updated: May 02 2025 at 03:31 UTC