127: Add scalar optimizations from CRoaring / arXiv:1709.07821 section 3 r=Kerollmops a=saik0
### Purpose
This PR adds some optimizations from CRoaring as outlined in arXiv:1709.07821 section 3
### Overview
* All inserts and removes are now branchless (not in arXiv:1709.07821, but in CRoaring)
* Section 3.1 was already implemented, except for `BitmapIter`. This is covered in #125
* Implement Array-Bitset aggregates as outlined in section 3.2
  * Also branchless 😎
* Tracks bitmap cardinality while performing bitmap-bitmap ops
  * This is a deviation from CRoaring, and will need to be benchmarked further before this Draft PR is ready
  * Curious to hear what you think about this `@lemire`
* In order to track bitmap cardinality, the len field had to be moved into `Store::Bitmap`
  * This is unfortunately a cross-cutting change
* `Store` was quite large (LoC) and had many responsibilities. The largest change in this draft is decomposing `Store` so that its variants wrap two new container types, each responsible for maintaining its own invariants and implementing `ops`
  * `Bitmap8K` keeps track of its cardinality
  * `SortedU16Vec` maintains its sorting
  * `Store` now only delegates to these containers
  * My hope is that this will be useful when implementing run containers. 🤞
* Unfortunately, so much code was moved that this PR is _HUGE_
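
For readers who have not seen the pattern, here is a minimal sketch of a branchless insert/remove that also maintains cardinality, plus a bitmap-bitmap union that recounts cardinality while it walks the words. It is not the code in this PR: only the name `Bitmap8K` comes from the description above, while the field names, method names and signatures are assumptions made for illustration.

```rust
/// Hypothetical stand-in for `Bitmap8K`: 1024 x 64 bits = 8 KiB of bitmap
/// plus a running cardinality. Fields and methods here are assumed names.
pub struct Bitmap8K {
    bits: Box<[u64; 1024]>,
    len: u64,
}

impl Bitmap8K {
    pub fn new() -> Self {
        Bitmap8K { bits: Box::new([0u64; 1024]), len: 0 }
    }

    pub fn len(&self) -> u64 {
        self.len
    }

    pub fn contains(&self, value: u16) -> bool {
        (self.bits[usize::from(value >> 6)] >> u32::from(value & 63)) & 1 == 1
    }

    /// Insert without a data-dependent branch on "was the bit already set?".
    /// Returns true if `value` was newly added.
    pub fn insert(&mut self, value: u16) -> bool {
        let (word, bit) = (usize::from(value >> 6), u32::from(value & 63));
        let old = self.bits[word];
        let new = old | (1u64 << bit);
        // (old ^ new) >> bit is 1 if the bit flipped, 0 otherwise.
        let added = (old ^ new) >> bit;
        self.bits[word] = new;
        self.len += added;
        added != 0
    }

    /// Remove, using the same flip-detection trick as `insert`.
    /// Returns true if `value` was present.
    pub fn remove(&mut self, value: u16) -> bool {
        let (word, bit) = (usize::from(value >> 6), u32::from(value & 63));
        let old = self.bits[word];
        let new = old & !(1u64 << bit);
        let removed = (old ^ new) >> bit;
        self.bits[word] = new;
        self.len -= removed;
        removed != 0
    }

    /// Bitmap-bitmap union that recomputes cardinality in the same pass,
    /// using one popcount per output word.
    pub fn union_with(&mut self, other: &Bitmap8K) {
        let mut len = 0u64;
        for (a, b) in self.bits.iter_mut().zip(other.bits.iter()) {
            *a |= *b;
            len += u64::from(a.count_ones());
        }
        self.len = len;
    }
}
```

Whether the extra popcount per word pays for itself is exactly the open benchmarking question raised above; the sketch only shows where the counting would happen.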
### Out of scope
* Inline ASM for Array-Bitset aggregates
* Section 4 (explicit SIMD). As noted by the paper authors, the compiler does a decent job of autovectorization, though not as good as hand-tuned code
### Notes
* I attempted to emulate the inline ASM Array-Bitset aggregates by using a mix of unsafe ptr arithmetic and x86-64 intrinsics, hoping to compile to the same instructions. I was unable to get it under 13 instructions per iteration (compared to the paper's 5). While it was an improvement, I abandoned the effort in favor of waiting for the `asm!` macro to stabilize. rust-lang/rust#72016
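
For context, a portable safe-Rust sketch of what the Array-Bitset aggregates compute is given below. It is neither the intrinsics attempt described in the note nor the paper's inline ASM; the function names and signatures are hypothetical, and the instruction count the compiler produces for it is not claimed here.

```rust
/// Union of a `u16` array into a bitset: sets every value of `array` in
/// `words` and returns the updated cardinality. No branch depends on
/// whether a bit was already set.
pub fn set_list_with_card(words: &mut [u64; 1024], mut card: u64, array: &[u16]) -> u64 {
    for &v in array {
        let (i, b) = (usize::from(v >> 6), u32::from(v & 63));
        let old = words[i];
        let new = old | (1u64 << b);
        // Adds 1 only when the bit changed from 0 to 1.
        card += (old ^ new) >> b;
        words[i] = new;
    }
    card
}

/// Intersection of a `u16` array with a bitset: every candidate is written
/// to the output, and the cursor advances by the tested bit (0 or 1), so
/// values that are absent from the bitset get overwritten.
pub fn intersect_array_bitset(words: &[u64; 1024], array: &[u16]) -> Vec<u16> {
    let mut out = vec![0u16; array.len()];
    let mut pos = 0usize;
    for &v in array {
        let bit = (words[usize::from(v >> 6)] >> u32::from(v & 63)) & 1;
        out[pos] = v;
        pos += bit as usize;
    }
    out.truncate(pos);
    out
}
```

For example, setting `[1, 64, 64, 65535]` into an empty bitset yields a cardinality of 3, since the duplicate 64 contributes nothing.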
Co-authored-by: saik0 <[email protected]>
Co-authored-by: Joel Pedraza <[email protected]>