Zulip Chat Archive

Stream: general

Topic: How do you do fast integer arithmetic in a Lean-like kernel?


Daniel Donnelly (Dec 03 2021 at 19:23):

Computer-implementable general math kernels (say of a proof assistant) are usually implemented in a one-to-one correspondence (in data structures) with a certain formalized theory of types.

But they usually define the naturals, and then then integers, in such a way that doing things 100% one-to-one would result in an extremely inefficient algorithm for say adding two arbitrary integers.

Since most programming language standard libraries have a built-in BigInt (D does, Python's is part of the language), and the most basic datatype the computer works with is the integral bit, and stringing these together leads to quotient rings $\Bbb{Z}/{2^k}$, in particular of the integers, it seems like BigInt is a good candidate for the first kind of built-in I would try to add to a theory.

So my question is, how do you perform this seamless integration of a high-level built-in type $\Bbb{Z}$. That is, not all of its operations are formalized / verified at least at runtime. This is sort of like having "managed" and "unmanaged" portions of code in the same C# project.

So a better question would be what does Lean do to deal with this problem? Say if I were searching for a proof, I would not want to "verify everything always", I would probably want to treat certain things atomically.

Eric Wieser (Dec 03 2021 at 19:34):

It's hard for me to tell what you're asking for, but as a guess maybe the lean answer is "generalize away all the implementation details:

example {Z : Type*} [ring Z] : (3 * 7 : Z) = 21 := by norm_num
-- I don't know if this proof works!

Eric Wieser (Dec 03 2021 at 19:35):

Note that here 3 is syntactically (bit1 (bit1 0)), and lean just prints it nicely

Daniel Donnelly (Dec 03 2021 at 19:43):

Since Lean is also a programming language, surely there is some use of it that would require fast BigInt operations, so I'm wondering how does Lean perform the operations quickly enough to be comparable with a C++ BigInt implementation? But only do so where it absolutely needs to and should, perhaps the user could tell it when to...

Mac (Dec 03 2021 at 19:46):

@Daniel Donnelly are you interested particularly in Lean3 or is Lean 4 also on the table? As it stands, Lean 4 actually does use a C++ implementation of big numbers to do numeric operations (either GMP or a custom version).

Kyle Miller (Dec 03 2021 at 19:47):

There are a couple levels to Lean, which makes answering this not so straightforward. There's the kernel, which is what does the final check that a proof is correct (and its version of the naturals is the inductive type), there's the evaluator, which is what things like tactic scripts run in (and as I understand it, it uses an efficient big natural number library), and for Lean 4 there's compiled code, which uses GMP for natural numbers (Mac beat me to mentioning it while I was writing).

Daniel Donnelly (Dec 03 2021 at 19:48):

@Mac Yes, lean4, okay so Lean does it properly by now. Just wondering! I'm messing around with some code in D right now. :)

Mac (Dec 03 2021 at 19:48):

@Kyle Miller the Lean 4 kernel also uses C++ big numbers as well though (it has special added support for this).

Daniel Donnelly (Dec 03 2021 at 19:50):

@Mac I guess it's up to the Lean designer's discretion when determining which method to use.

Kyle Miller (Dec 03 2021 at 19:50):

@Mac Does the kernel also do this for other types with an extern implementation? like arrays?

Mac (Dec 03 2021 at 19:52):

@Kyle Miller no, the kernel just has special support for natural numbers.

Daniel Donnelly (Dec 03 2021 at 19:55):

Is there any way to base a type theory off of integers so that you don't have to verify that the arithmetic is correct, but you assume it is?

Mac (Dec 03 2021 at 19:56):

Daniel Donnelly said:

Is there any way to base a type theory off of integers so that you don't have to verify that the arithmetic is correct, but you assume it is?

I would not imagine so, no. I don't see how exactly type theory would ever get you arithmetic for free.

Mac (Dec 03 2021 at 19:56):

But I am not really an expert on such formal methods, so don't quote me on that.

Daniel Donnelly (Dec 03 2021 at 20:00):

@Mac Not just any arithmetic, only arithmetic of the integers is built-in. It seems appropriate since it is a most-basic type in a lot of senses and the computer's natural datatype.

Kyle Miller (Dec 03 2021 at 20:15):

I'm not sure what you're looking for here; maybe one answer is that you define anything with axioms. However, you lose niceties that come with an inductive type definition that are helpful for proving things.

With Lean 4 and natural numbers in the kernel, I think an idea is that with inductive types, the underlying implementation can be anything that satisfies the axioms, so non-negative GMP natural numbers can be used perfectly well in place of the generic implementation, supposing you trust they are implemented correctly. (A complication is that you would probably also want to also replace basic arithmetic operations with their GMP counterparts.) I don't see why you can't do the same for integers and GMP integers.

Mario Carneiro (Dec 03 2021 at 22:06):

The current implementation of lean 4 puts GMP in the trusted computing base, which not everyone is comfortable with. Moving bignum arithmetic to the kernel also has consequences for proof export, since anyone consuming the proof also has to support the same operations.

Mario Carneiro (Dec 03 2021 at 22:09):

Note that you can get within a constant multiplicative factor of optimal even without any fancy bignum support; this is what lean 3 norm_num does, it implements everything with bit0 and bit1 functions. These are simple and portable, but the constant cost is large, probably on the order of 100 to 1000, which is good enough for numbers with, say, 50 digits but not if you want to deal with 100000 digit numbers

Mario Carneiro (Dec 03 2021 at 22:10):

I think there are very few problems in verification which need such large numbers, and norm_num is generally not the bottleneck of mathlib

Mario Carneiro (Dec 03 2021 at 22:11):

But certainly there are certain domains where fast large number computation is important and having bignum arithmetic support opens those areas

Mario Carneiro (Dec 03 2021 at 22:13):

Personally, I think the sweet spot is built in kernel computations on fixed size words, say 64 bit numbers, and make the user write anything on top of that themselves (i.e. they have to write the bignum package). It's usually not that hard to justify operations on integers which are part of the underlying memory model


Last updated: Dec 20 2023 at 11:08 UTC