Zulip Chat Archive

Stream: Is there code for X?

Topic: Why isn't there code for `BitVec.foldr`?

Fady Adal (May 07 2025 at 18:44):

I am working on a quantum library that works with dot products between two bitvectors, and I believe definitions like below:

def foldr (x : BitVec w) (f : Bool → α → α) (init : α) : α :=
  w.fold (fun i h acc => f x[i] acc) init

def weight (x : BitVec w) : Nat :=
  x.foldr (·.toNat + ·) 0

def dot (x y : BitVec w) : Nat :=
  (x &&& y).weight

would be useful to have in the standard library. Is that true?

Eric Wieser (May 07 2025 at 18:47):

I think the argument is that fold is more general than foldr, so the latter isn't needed

Eric Wieser (May 07 2025 at 18:48):

weight might be better called popcount to match gmp_popcount; which is also to say that it could have a native-ish implementation

Fady Adal (May 07 2025 at 18:50):

Hi @Eric Wieser . Thanks for the reply. Where is BitVec.fold defined?

Eric Wieser (May 07 2025 at 18:52):

docs#BitVec.fold

Eric Wieser (May 07 2025 at 18:52):

Oh, I thought you were saying it already existed

Eric Wieser (May 07 2025 at 18:52):

Did an LLM hallucinate it above?

Fady Adal (May 07 2025 at 18:57):

Above I'm folding on the w : Nat that is the width of the BitVec. The only fold-related operation that I can find on BitVecs is BitVec.iunfoldr, which doesn't really accomplish the intention here. Would this be a useful construct to add?

Eric Wieser (May 07 2025 at 19:00):

Sorry for my garbage comment above, your suggestion makes sense now!

Fady Adal (May 07 2025 at 19:02):

No worries! Sorry I wasn't very clear with my suggestion

Eric Wieser (May 07 2025 at 19:11):

On a more helpful note, you can write dot as

import Mathlib

abbrev dot (x y : BitVec w) : Nat :=
  hammingDist x.getLsb' y.getLsb'

which comes with some lemmas

Fady Adal (May 08 2025 at 19:59):

I think this is more like (x ^^^ y).weight, but I didn't know about that function! I think hammingDist (x &&& y).getLsb' 0 should do the job. Thanks for pointing it out.

Kim Morrison (May 09 2025 at 12:41):

So far we have very little theory about BitVec as a container type, rather than BitVec as an algebraic/logical type.

The fact we are missing fold is a consequence of this.

We're generally open to such additions in the Lean4 standard library, although I'd like to see any additional functions coming equipped with fairly complete theory as well, so we don't get further behind.

To start treating BitVec as a container we need a lot of basic infrastructure, e.g. ofFnLE ofVector, toVector, ...

Eric Wieser (May 09 2025 at 12:43):

toVector is Vector.mk bv.getLsb', right?

Kim Morrison (May 09 2025 at 12:46):

Yes. I think it nevertheless deserves a name.

Kim Morrison (May 09 2025 at 12:46):

(despite my usual aversion to increasing API surface area --- basic conversion functions are needed for sanity)

Eric Wieser (May 09 2025 at 12:48):

I guess my argument is that conversion to vector is expensive, and actually working directly with the tuple returned by getLsb' is likely to lead to more efficient code

Eric Wieser (May 09 2025 at 12:48):

But perhaps that's just what the docstring for toVector should say

Kim Morrison (May 09 2025 at 12:49):

I'd be happy to mark toVector noncomputable, and explain that it is for theory only.

Eric Wieser (May 09 2025 at 13:21):

That also sounds like a bad idea, there are situation where it really is what you want

Fady Adal (Nov 23 2025 at 22:00):

I have opened a draft PR with container operations for BitVec: https://github.com/leanprover/lean4/pull/11329. Looking forward to your feedback!

Wrenna Robson (Nov 24 2025 at 17:02):

Kim Morrison said:

I'd be happy to mark toVector noncomputable, and explain that it is for theory only.

I can think of very real applications where one wants a vector of bits. Though you're more likely of course to want a bytearray in practice...

Wrenna Robson (Nov 24 2025 at 17:11):

Being able to unpack/pack a bytearray to a bitvec would be quite useful for some things.

Fady Adal (Nov 27 2025 at 09:37):

Wrenna Robson said:

Being able to unpack/pack a bytearray to a bitvec would be quite useful for some things.

I would be interested in adding that as well! It would be a nice feature. We'd need to decide on conventions for endianness and how to handle packing/unpacking bitvectors whose length isn't a multiple of 8. I'm open to exploring a C++ vs pure Lean implementation for it, too.

Wrenna Robson (Nov 27 2025 at 11:46):

I think BitVec itself takes a big-endian convention so I would suggest doing the same here!

Kim Morrison (Nov 29 2025 at 01:06):

BitVec actually aspires to mostly allow both endianness conventions. It's only getElem that is biased.

Eric Wieser (Nov 29 2025 at 01:10):

I don't agree with that; when working with big endian bit vectors in verilog (eg wire x [7:0]), x[0] is still the LSB.

Eric Wieser (Nov 29 2025 at 01:11):

The bias arguably resides in ++, or perhaps toList

Fady Adal (Nov 29 2025 at 01:26):

In the PR, I added both toList and toListBE to complement ofBoolListLE and ofBoolListBE. I left toList without a suffix so that bv[i] and bv.toList[i] would align, but I’m open to adding a suffix if the goal is to keep the API fully endianness-agnostic. I also only added toArray and toVector for the same reason, but I can include the big-endian variants if that would be more consistent.

Fady Adal (Dec 10 2025 at 07:04):

I've implemented the ByteArray conversions we discussed. I added both Little Endian and Big Endian variants to support different use cases:

To ByteArray: toBytesLE and toBytesBE convert a BitVec w to a ByteArray of size (w + 7) / 8.
- Padding: Since BitVec width might not be a multiple of 8, I handled padding as follows:
  - LE: Zero-padding goes in the high bits of the last byte.
  - BE: Zero-padding goes in the high bits of the first byte.
From ByteArray: ofBytesLE and ofBytesBE convert a ByteArray back to a BitVec.
- The resulting BitVec always has width bytes.size * 8.
Round-trips:
- toBytes (ofBytes bytes) is an exact round-trip.
- ofBytes (toBytes bv) results in bv.zeroExtend if the original width wasn't a multiple of 8 (since the byte array forgets the exact original bit width).

I also included proofs of the round-trip properties and injectivity. The next steps will likely involve extending the theory to cover these conversions when used alongside BitVec algebraic operations such as &&&, shifts, and related constructs. Let me know what you think!

Yuyang Zhao (Dec 11 2025 at 04:06):

Maybe there should be a function to get the nth limb in the GMP implementation. Then we can implement BitVec.toBytes, Nat.popcount, Nat.testBit, etc. efficiently directly in Lean.

Last updated: Feb 28 2026 at 14:05 UTC