-
Notifications
You must be signed in to change notification settings - Fork 273
Small map #2087
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Small map #2087
Conversation
/// Map from small integers to values | ||
/// | ||
/// A data structure that maps small integers (in {0, ..., Num-1}) to values. | ||
/// It is designed to be more memory-efficient than std::map for this use case. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How much "more" memory-efficient is it? And what values of Num
are typically to be used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're using the small map to represent the children of a node in a tree data structure. A node can have up to Num
children, with indices 0...Num-1
, but not all of them are usually present.
Typical values for Num
are say 8 or 16 (we're using 8 at the moment). For a map that stores pointers (as in our case) it uses about 1/5 of the memory compared to std::map
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, cool, it would be nice to have as concrete as possible data points in the comments. Statements like "for Num=8, std::map requires X bytes on a 64-bit system when using GNU STL while this map only requires Y bytes".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I've added the data about the memory consumption to the class description.
4942ec3
to
3959ca2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
src/util/small_map.h
Outdated
|
||
T *allocate(std::size_t n) const | ||
{ | ||
return (T *)malloc(sizeof(T) * n); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One little thing to be careful of, is that the usage of malloc
is introducing a very subtle failure case that I think is not being properly handled. If new
fails, it throw
s. If malloc
fails, it returns a null pointer. This can lead to very subtle bugs in copy
if a machine is stressed in memory. Do you think this is something that might be a good idea to be a PRECONDITION
, or is it just a waste?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I'm not sure what exactly should happen in this case. I'm gonna check what the STL containers do and then maybe implement it similarly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the STL typically defines an implementation type and then just calls new
on that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though really all you need here is return new T[n];
paired with a delete[]
in the deallocation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the STL containers use allocators (which in turn use malloc/free) and then use placement new to create the actual objects.
We don't use new T[n]
here as we want to be able to grow (if possible) and shrink the allocated memory block with realloc()
. Curiously, C++ allocators don't have an equivalent of realloc()
. That's why e.g. the capacity of a std::vector
does not decrease even when erasing elements. C++11 added shrink_to_fit()
, but that copies the elements to a newly allocated memory segment. Also new T[n]
would call the default constructor which would be unnecessary when copying objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So maybe just do T *ptr = malloc(sizeof(T) * n); if(!ptr) throw std::bad_alloc(); return ptr;
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's the right thing to do.
02f1202
to
12dcd27
Compare
Just a note if anyone is reviewing this right now: I've had an idea about how to simplify the implementation and will change/refactor the code in the next few days. It's thus better to hold on the reviewing until then. |
Ok, I realised that the simpler version would require moving elements on |
src/util/small_map.h
Outdated
T *allocate(T *ptr, std::size_t n) | ||
#ifndef _SMALL_MAP_REALLOC_STATS | ||
const | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems perfectly ok to mark realloc_success
and realloc_failure
mutable so that you don't have to do this preprocessor-conditional const
.
src/util/small_map.h
Outdated
{ | ||
realloc_failure++; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about if(ptr == mem) ++realloc_success; else if(ptr != nullptr) ++realloc_failure;
to avoid one level of indentation?
src/util/small_map.h
Outdated
@@ -552,6 +571,11 @@ class small_mapt | |||
return !ind; | |||
} | |||
|
|||
#ifdef _SMALL_MAP_REALLOC_STATS | |||
unsigned realloc_failure = 0; | |||
unsigned realloc_success = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd use std::size_t
- unsigned
should really only be used where storage size matters.
src/util/small_map.h
Outdated
#include "invariant.h" | ||
|
||
// When we don't have constexpr | ||
#if !(defined(__GNUC__) || _MSC_VER >= 1900) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May I suggest #if !defined(__GNUC__) && _MSC_VER < 1900
to make the negation more local?
struct num_bitst<1> | ||
{ | ||
static const std::size_t value = 1; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about num_bitst<0>
? I believe that ought to be the base case rather than num_bitst<1>
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was not quite sure whether num_bitst<0>
should return 0 or 1. The latter seems a bit more logical to me. I'll add another specialization num_bits<0>
which returns 1.
src/util/small_map.h
Outdated
|
||
// Memory | ||
|
||
void copy(T *dest, const T *src, std::size_t n) const |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this really be public? If so, documentation is required. Notably, it doesn't do any kind of sanity checks (in particular on n
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, that should be private.
src/util/small_map.h
Outdated
return; | ||
} | ||
|
||
std::size_t n = m.size(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const?
src/util/small_map.h
Outdated
static_assert(NUM >= 2, ""); | ||
|
||
// When we don't have constexpr | ||
#if !(defined(__GNUC__) || _MSC_VER >= 1900) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest propagating the negation as above. And wouldn't it make sense to move the above "no-constexpr" template instantiations down here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately it can't be moved down as then it would be contained within another template. And then we could only specialize the template if we specialize the outer template as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I should have thought of this. Maybe add a comment that this is used within the class? Reading the code top-to-bottom this piece of code comes a bit as a surprise.
|
||
static_assert(std::numeric_limits<unsigned>::digits >= BITS, ""); | ||
|
||
// Internal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
private
or protected
?
unit/util/small_map.cpp
Outdated
|
||
#include <cassert> | ||
#include <iostream> | ||
#include <map> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are those three required?
d29ca32
to
a2ad909
Compare
31ef182 Merge pull request diffblue#2208 from diffblue/cleanout-goto-programs 0e84d3e Merge pull request diffblue#2195 from smowton/smowton/feature/smarter-reachability-slicer c801126 Merge pull request diffblue#2209 from diffblue/travis-no-osx-g++ 3487bf7 Reachability slicer: mark reachable code more precisely 85e8451 Merge pull request diffblue#2066 from tautschnig/bv-endianness 5e5eafb Merge pull request diffblue#2191 from smowton/smowton/feature/java-fatal-assertions 8a68ab8 remove g++ build on OS X; this is identical to the clang++ build 48e5b25 Amend faulty long-string test 3f718ba Java: make null-pointer and similar checks fatal 821eb1d remove basic_blocks and goto_program_irep d87c2db Merge pull request diffblue#2203 from diffblue/typed-lookup d77dea3 strongly type the lookup() methods in namespacet 2382f27 Merge pull request diffblue#2193 from martin-cs/feature/correct-instruction-documentation c314272 Merge pull request diffblue#2181 from diffblue/format-expr-constants 514a0a5 Merge pull request diffblue#2167 from diffblue/parse_unwindset_options 0102452 move escape(s) to string_utils f1b6174 use unwindsett in goto-instrument dcc907a introduce unwindsett for unwinding limits 10aeae8 format_expr now does index, c_bool constants, string constants 92b92d6 Correct the documentation of ASSERT : it does not alter control flow. a11add8 Expand the documentation of the goto-program instructions. a06503b Merge pull request diffblue#2197 from tautschnig/fix-help 05e4bc3 Remove stray whitespace previously demanded by clang-format 3261f4d Fix help output of --generate-function-body-options 7c67b23 Merge pull request diffblue#2110 from tautschnig/type-mismatch-exception 18fb262 Merge pull request diffblue#2025 from tautschnig/stack-depth-fix 9191b6a Merge pull request diffblue#2199 from tautschnig/simplifier-typing f99e631 Merge pull request diffblue#2198 from tautschnig/fix-test-desc 1a79a11 stack depth instrumentation: __CPROVER_initialize may be empty a7690ba Merge pull request diffblue#2185 from smowton/smowton/fix/tolerate-ts18661-types fc02e8f Restore returns before writing the simplified binary 49333eb Make restore_returns handle simplified programs 46f7987 Fix perl regular expressioons in regression test descriptions 9fe432b Merge pull request diffblue#1899 from danpoe/sharing-map-catch-unit-tests 9cc6aae Merge pull request diffblue#2081 from hannes-steffenhagen-diffblue/floating_point_simplificiation da19abe Tolerate TS 18661 (_Float32 et al) types a055454 Try all rounding modes when domain is unknown 5f7bc15 Add float support to constant propagator domain 3dc2244 Move adjust_float_expressions to goto-programs 06d8bea Merge pull request diffblue#2187 from diffblue/move-convert-nondet 6d8c3fa Merge pull request diffblue#2189 from thk123/bugfix/json-parser-restart 2ad157f Merge pull request diffblue#2186 from smowton/smowton/fix/call-graph-uninit-field cd54ad7 Corrected state persistence in the json parser 4447be0 Fix uninitialised collect_callsites field in call_grapht ead0aa3 Merge pull request diffblue#2188 from smowton/smowton/fix/init-string-types-without-refine-strings 57988fc Fix String type initialisation when --refine-strings is not active 6a76aff Move convert_nondet to java_bytecode ac6eb21 Merge pull request diffblue#1858 from danpoe/dependence-graph-fix 10b8b09 Merge pull request diffblue#2011 from thomasspriggs/final_classes a154593 Merge pull request diffblue#2087 from danpoe/feature/small-map 7002909 More dependence graph tests 66263ea Make dependence graph merge() return true when abstract state changes 3aa6dca Control dependency computation fix a408423 Simplified state merging in the dependence graph 0315816 Merge pull request diffblue#2180 from thomasspriggs/json_id2string 8941567 Merge pull request diffblue#2124 from smowton/smowton/feature/fallback-function-provider cd04e70 JBMC: use simple method stubbing pass a6b2cda Add Java simple method stubbing pass ec1127c Lazy goto model: hook for driver program to replace or stub functions b6a4382 Remove `id2string` from inside calls to the `json_stringt` constructor. b673ebb Add storage of final modifier status of java classes in `java_class_typet`. a2ad909 Small map 808dade Provide suitable diagnostics for equality-without-matching-types 89cf6fe Throw appropriate exceptions when converting constraints 2ae66c2 Produce a proper error message when catching a std::runtime_error at top level e7b3ade Workaround for Visual Studio loss of CV qualifier bug 1f661f5 Move sharing map and sharing node unit tests to util 47463a3 Use std::size_t instead of unsigned in the sharing map 3e22217 Sharing map documentation e54f740 Fix sharing map compiler warnings bcc489c Move sharing map unit tests to catch 34114b8 Use a specialised endianness map for flattening git-subtree-dir: cbmc git-subtree-split: 31ef182
This is a feature that is used in a future refactoring of
sharing_mapt
.A map that maps small integers to values. It is designed to be more memory-efficient than
std::map
for this use case.