This repository was archived by the owner on Sep 13, 2023. It is now read-only.

Commit ce46f48

Patrick Mézard authored and pquentin committed
array-aho: implement memory mapped trie
FrozenTrie already implemented a read-only in-memory structure based on vectors of simple structs. Implementing a mmapped version of that is not too hard: serialize these arrays in such a way that they can be accessed directly after being mapped into memory.

This version makes no effort to ensure compatibility between architectures: mapped data must be written and read on machines sharing the same architecture. Data is not really serialized, only reinterpret_cast'ed and written as such. In theory, there might be arguments about why it may fail. In practice it works just fine.

That said, to avoid playing data alignment games, Node fields are denormalized into as many packed arrays of similar size. The upside is that it helps unify parsing/loading them in memory; the drawback is that scanning nodes touches more memory pages. Performance seems fine.

The change is roughly split into:

- Implement serialization code in FrozenTrie. The basic structure is a primitive type array prefixed by the number of elements.
- Implement MappedTrie, which mmaps the input file and wraps the arrays in MappedArray objects to minimize pointer arithmetic.
- Introduce AbstractTrie to expose basic accessors on nodes, indices and payloads and rewrite find_anchored() using it. Share find_anchored() implementation between FrozenTrie and MappedTrie.
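The count-prefixed array layout described above can be illustrated with a minimal sketch. This is not the code from this commit: `write_array`, `read_array`, the 64-bit count prefix and the interface of this `MappedArray` are assumptions made for illustration. The sketch also reads elements with `memcpy` instead of `reinterpret_cast`, which sidesteps alignment questions at the cost of a copy per access, and it works on a plain byte buffer where the real code would use the mmapped file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <type_traits>
#include <vector>

// Append one array to `out`: a 64-bit element count, then the raw element bytes.
// The bytes are written in host order, so the data is only readable on the
// same architecture (the same restriction the commit message states).
template <typename T>
void write_array(std::vector<char>& out, const std::vector<T>& values) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw byte dump requires a trivially copyable type");
    const std::uint64_t count = values.size();
    const char* header = reinterpret_cast<const char*>(&count);
    out.insert(out.end(), header, header + sizeof(count));
    const char* data = reinterpret_cast<const char*>(values.data());
    out.insert(out.end(), data, data + values.size() * sizeof(T));
}

// Hypothetical read-only view over elements stored in a byte buffer.
// It does not own the memory; the buffer must outlive the view.
template <typename T>
class MappedArray {
public:
    MappedArray(const char* base, std::size_t size) : base_(base), size_(size) {}

    std::size_t size() const { return size_; }

    // Copy element i out of the buffer; memcpy avoids unaligned loads.
    T operator[](std::size_t i) const {
        assert(i < size_);
        T value;
        std::memcpy(&value, base_ + i * sizeof(T), sizeof(T));
        return value;
    }

private:
    const char* base_;
    std::size_t size_;
};

// Parse one count-prefixed array starting at `offset` and advance `offset`
// past it, so that consecutive arrays can be read one after another.
template <typename T>
MappedArray<T> read_array(const char* buf, std::size_t len, std::size_t& offset) {
    std::uint64_t count = 0;
    if (offset > len || len - offset < sizeof(count)) {
        throw std::runtime_error("truncated array header");
    }
    std::memcpy(&count, buf + offset, sizeof(count));
    offset += sizeof(count);
    if (count > (len - offset) / sizeof(T)) {
        throw std::runtime_error("truncated array body");
    }
    MappedArray<T> view(buf + offset, static_cast<std::size_t>(count));
    offset += static_cast<std::size_t>(count) * sizeof(T);
    return view;
}
```

With this layout, each denormalized Node field would be written as one such array, and a loader would call `read_array` once per field on the mapped region, keeping a single running offset instead of scattered pointer arithmetic.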
1 parent 945c20d commit ce46f48


5 files changed

+3568
-751
lines changed
