bitvec
Contentsbitvec
is a Rust implementation of the bit-packing data structure available in C++ as std::vector<bool>
and std::bitset
. In features, API completeness, and expressiveness, it surpasses every other bit-packing library both in Rust and in any other language.
It is, to my knowledge, the best-in-class library for working with memory at single-bit precision. It compiles to the shift-and-mask instructions you would ordinarily write, while eliminating the need to use anything other than semantic addresses.
If your work involves bit-level manipulation of memory registers, partial-register data compaction, bitfields, or large collections of bool
s, then you want this library.
Sample
use bitvec::prelude::*;
static BITS: &'static BitSlice<Msb0, u16> = bits![static u8, Msb0;
0, 1, 1, 0, 0, 0, 1, 0,
0, 1, 1, 0, 1, 0, 0, 1,
0, 1, 1, 1, 0, 1, 0, 0,
0, 1, 1, 1, 0, 1, 1, 0,
0, 1, 1, 0, 0, 1, 0, 1,
0, 1, 1, 0, 0, 0, 1, 1,
];
#[repr(C})]
pub struct Magic {
inner: BitArr!(for 128, in Lsb0, u32),
}
impl Magic {
fn bitfield(&self) -> &BitSlice<Lsb0, u32> {
&self.inner[5 .. 30]
}
fn bitfield_mut(&mut self) -> &mut BitSlice<Lsb0, u32> {
&mut self.inner[5 .. 30]
}
}
fn write(place: &mut Magic, val: u32) {
place.bitfield_mut().store_le(val);
}
fn read(place: &Magic) -> u32 {
place.bitfield().load_le()
}
History
I began writing bitvec
on 2018 Jun 28. I had at the time been working on an implementation of the Ball Aerospace COSMOS message definition language. This language defines bitfields, which in C++ can be written as native struct or class fields. Rust has no such language element, so I began implementing it as a library.
Because the COSMOS language allows users to specify both the endianness and register type of fields, I designed bitvec
from day one to take both of these as type parameters. Of the major bitfield implementations, only the C/C++ structural definition is parametric over the register type, and no other implementation is parametric over bit ordering.
My initial drafts establishing the library foundations were hampered by the universal flaw common to all bitstream implementations (except Erlang’s): either the handle is large and slower than the equivalent [bool]
handle, or it cannot track the start bit’s position in the head element.
In December, I came up with a solution. Rust is unique among systems languages in that pointers to sequences have a well-defined representation of carrying both the address of the first element, and the length of the sequence, as a pair of processor registers. This corresponds to the C idiom of carrying void*
and size_t
separately. Because Rust guarantees two registers of space for these pointers, rather than one, and I only need three bits to encode the location of the start bit in the first byte, I could pack the start bit into the pointer’s representation, and use *const BitSlice
rather than struct BitSlicePtr
to describe any region of memory as a BitSlice
.
Once I established an encoding of element address, starting bit, and bit count into a standard Rust slice pointer, I was able to use the full idiomatic API and expectations of the Rust language without limitation. This encoding is unique in Rust crates to bitvec
, and by enabling the construction of &BitSlice
references, it unlocks full, seamless, access to APIs that demand reference types rather than mere borrowing handles. BitSlice
is indexable with the []
operator; no other Rust library can provide this.
Since 2018 Dec, I have continued development by reïmplementing the full standard library APIs for [bool]
regions, [bool; N]
arrays, Box<[N]>
owned slices, and Vec<bool>
vectors. I have also implemented bitfield register storage in the library, enabling users to write bits[span].load_or_store()
to move register values into and out of a BitSlice
region, equivalent to C member bitfields.
bitvec
is now feature-complete. I published version 1.0.0 on 2022 January 11, and will now provide only security and correctness maintenance until the Rust language advances enough to enable major changes.
Type Parameters
In addition to the pointer encoding scheme, bitvec
is unique in its use of the Rust trait system to allow users to select the register element type and the translation of semantic indices to electrical positions. In fact, bitvec
permits users to supply their own ordering implementations, and the rest of the library will seamlessly work with them.
The BitStore
trait defines bitvec
’s interface to memory. It is implemented on plain register integers, their Cell<>
wrappers, and their Atomic
variants. It is the latter two implementations that allow bitvec
to safely express any view of memory, including partitioning a single element across multiple &mut BitSlice
s, without violating any of Rust’s rules about data races.
The BitOrder
trait translates the semantic indices available to users to the electrical positions used to drive memory. This allows users to only track abstract numbers, and let their choice of implementation handle producing the shift/mask or single-bit instructions at the processor level. bitvec
provides two out of the box: Lsb0
, which places 0 at the least significant bit of a register and counts to the left, and Msb0
, which places 0
at the most significant bit and counts to the right.
Conveniences
bitvec
provides constructor macros, similar to the standard library’s vec!
, to build up BitSlice
buffers at compile time. The bits!
macro accepts any number of compile- or run- time expressions and uses them to construct the appropriate buffer. When invoked with literals, const
s, or values that LLVM can const-fold, it runs entirely at compile-time and produces a memory buffer loaded into your .rodata
section that can be read directly at runtime without any additional computation.
bits!
is written entirely by Nika Layzell, whose advice and expertise on alias safety and macro_rules!
has been invaluable in making bitvec
what it is today.
The bitarr!
, bitbox!
, and bitvec!
macros build atop bits!
to create the desired data structure with the given contents.
Performance
Rust is still developing its type-level-integer (const
-generics) capability, so currently, bitvec
is not able to use const
functionality at all. Rust does not allow generics in const contexts, nor does it allow computation at the type level. These restrictions prevent BitArray
from having the same signature and precision as C++’s std::bitset<N>
, and the rest of the crate from being usable in static
or const
contexts.
However, because BitOrder
is strictly numeric, and the rest of the crate is built atop it, LLVM is able to perform aggressive const-folding during compilation. One of my difficulties in observing bitvec
’s codegen or in benchmarking its performance has been obfuscating enough to prevent LLVM from solving it entirely during compilation.
In general, indexing with literal numbers should be completely free, and indexing with runtime values should be equivalent to the shift/mask instructions you would be writing without this library. I have not observed the fact that &BitSlice
can have unrestricted start positions to have any performance penalty; accessing the offset is typically only two instructions.
Additionally, thanks to a great deal of work in early 2020, bitvec
is able to balance perfect thread-safety with atomic bypass to maintain fast access to memory that it can observe is not aliased.
API Completeness
bitvec
implements every part of the standard library APIs for pointers, references, slices, arrays, boxed slices, and vectors of [bool]
. The only items missing at the type level are implementations of IndexMut<Target = bool>
, because this trait requires producing &mut bool
references (which must be to a full byte, uniquely accessed) rather than a proxy structure like C++’s std::bitset::reference
.
bitvec
provides a BitPtr
structure that acts like a *bool
, except that it points into a BitSlice
region.
As the standard library grows and evolves, bitvec
will mirror it.