-
Notifications
You must be signed in to change notification settings - Fork 273
Sharing map refactoring to reduce memory consumption #2126
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sharing map refactoring to reduce memory consumption #2126
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just a cursory review for now, we'll have a detailed look tomorrow. So far we've found a couple of places that could use more detailed explanation, and especially rationale.
unit/Makefile
Outdated
@@ -28,6 +28,7 @@ SRC += unit_tests.cpp \ | |||
java_bytecode/inherited_static_fields/inherited_static_fields.cpp \ | |||
pointer-analysis/custom_value_set_analysis.cpp \ | |||
sharing_node.cpp \ | |||
sharing_map.cpp \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This commit seems to be shared with #1899?
src/util/sharing_map.h
Outdated
@@ -450,7 +450,7 @@ SHARING_MAPT2(, size_type)::erase( | |||
return 0; | |||
|
|||
node_type *del=nullptr; | |||
unsigned del_bit; | |||
unsigned del_bit = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And this one too? (#1899)
src/util/small_map.h
Outdated
@@ -0,0 +1,560 @@ | |||
/*******************************************************************\ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does "small map" mean? Can you elaborate on this in the commit message itself?
src/util/small_map.h
Outdated
/// Map from small integers to values | ||
/// | ||
/// A data structure that maps small integers (in {0, ..., Num-1}) to values. | ||
/// It is designed to be more memory-efficient than std::map for this use case. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this actually more memory efficient than arrays? The benchmark given in this comment for an empty small_mapt<uint64t, uint32_t, 8>
- 32 bytes - is exactly as big as an std::array<uint32_t, 4>
, with a full small_mapt
being much worse. And arrays are also indexed by integers. Is the relation more favourable towards small_mapt
with larger Num
? If so, can you mention this here?
src/util/small_map.h
Outdated
if(n == 0) | ||
return nullptr; | ||
|
||
T *mem = (T *)malloc(sizeof(T) * n); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why all this code with malloc/realloc instead of just using std::vector
(which already does all of these things for you)?
Also, this leaves the result of the allocation uninitialised which is probably ok for POD types, but non-POD types need to be initialised with placement-new.
src/util/sharing_node.h
Outdated
@@ -12,340 +12,499 @@ Author: Daniel Poetzl | |||
#ifndef CPROVER_UTIL_SHARING_NODE_H | |||
#define CPROVER_UTIL_SHARING_NODE_H | |||
|
|||
#include <list> | |||
#define DEBUG |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is debug code this shouldn't be enabled by default. Also, "DEBUG" is too generic a name to have it defined and then include a bunch of headers. If you're going to keep this, you should give it a less generic name and immediately undef it after the ifdef check.
src/util/sharing_node.h
Outdated
//#define _sn_assert(b) | ||
|
||
#define _sn_type_par_decl \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do all these things start with an underscore? I don't think this matches our internal naming convention.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same questions about _sn_assert_use
as in src/util/sharing_map.h
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've now renamed all the macros to be all uppercase and have dropped the leading underscore.
src/util/sharing_node.h
Outdated
bool no_sharing=false> | ||
class sharing_nodet | ||
template <typename baseT, typename dataT> | ||
struct delete_through_base_ptrt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a highly unusual thing. Please document why you need this.
src/util/sharing_node.h
Outdated
} | ||
else if(data.use_count() > 1) | ||
{ | ||
auto p = std::static_pointer_cast<dataT>(data); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You assume that data
is pointing to an instance of dataT
here. Can you add an assertion for that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, please leave a comment explaining why you're using static rather than dynamic cast for this.
src/util/sharing_node.h
Outdated
|
||
typedef const std::pair<const self_type &, const bool> const_find_type; | ||
typedef const std::pair<self_type &, const bool> find_type; | ||
_sn_type_par_decl class d_to_internalt : public d_baset |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's d
? And to internal what?
Now that the pre-dependencies are merged, could this please be rebased before doing any reviews? |
a3f3f3a
to
d76cd52
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still have some questions, otherwise looks OK to me.
src/util/sharing_map.h
Outdated
|
||
#ifdef SHARING_MAP_INTERNAL_CHECKS | ||
#define _sm_assert(b) INVARIANT(b, "Sharing map internal invariant") | ||
#define _sm_assert_use(v, b) _sm_assert(b) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's this for? Is this supposed to be a sort of [[maybe_unused]]
for v
? If so, why is it not used if SHARING_MAP_INTERNAL_CHECKS
is on, and why not just (void)v
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, this is to avoid a compiler warning when the internal assertions are disabled (if for example r
is only used in assert(r)
). Yes, (void)v
would also work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But ... is this actually used anywhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, it's not used anymore. (a similar define _sn_assert_use
also exists in sharing_node.cpp
where it is used)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can they be removed in that case? @danpoe
src/util/sharing_node.h
Outdated
//#define _sn_assert(b) | ||
|
||
#define _sn_type_par_decl \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same questions about _sn_assert_use
as in src/util/sharing_map.h
unit/util/sharing_map.cpp
Outdated
@@ -43,13 +49,13 @@ void sharing_map_interface_test() | |||
{ | |||
smt sm; | |||
|
|||
smt::const_find_type r1 = sm.insert(std::make_pair("i", "0")); | |||
smt::const_find_type r1=sm.insert(std::make_pair("i", "0")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are all these formatting changes? They seem to do the opposite of what clang-format wants?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, not sure why or how I did this. I've now run clang-format
over all the commits.
unit/util/sharing_map.cpp
Outdated
//n = 5000000; | ||
|
||
//sharing_mapt<int, int> m; | ||
std::map<int, int> m; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this called sharing_map_memory_test
when it inserts things into std::map
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test was obsolete and I've now removed it.
unit/util/sharing_map.cpp
Outdated
@@ -345,3 +402,8 @@ TEST_CASE("Sharing map views") | |||
{ | |||
sharing_map_view_test(); | |||
} | |||
|
|||
TEST_CASE("Sharing map memory") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test case doesn't have any assertions in it? Should this really be a unit test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed
src/util/sharing_node.h
Outdated
// Internal nodes | ||
|
||
class d_baset | ||
class d_baset : public small_shared_two_way_pointeet<unsigned> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's "d"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This somewhat mimics the naming scheme in irept
, where the shared pointer contained in an irept
is named dt
(I believe it stands for something like "data type").
3a08546
to
4662bdd
Compare
src/util/sharing_node.h
Outdated
@@ -12,339 +12,455 @@ Author: Daniel Poetzl | |||
#ifndef CPROVER_UTIL_SHARING_NODE_H | |||
#define CPROVER_UTIL_SHARING_NODE_H | |||
|
|||
#include <list> | |||
//#define DEBUG |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be removed instead of commented out? I think it can be passed as a CLI compiler supplied symbol during compilation with -DDEBUG
. Other than that the name is too generic - is the symbol DEBUG
reserved across the whole of CBMC for debugging support like this, or is this code debugging support just specific to this code? If so, can it be renamed to something else if it can't be deleted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll rename it to SN_DEBUG
(the other macros in the sharing node file also start with SN
).
src/util/sharing_node.h
Outdated
#include "small_shared_ptr.h" | ||
#include "small_shared_two_way_ptr.h" | ||
|
||
//#define SN_INTERNAL_CHECKS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Part of the above comment also applies to this as well? Can this be documented somewhere that it can be passed during compilation with -DSN_INTERNAL_CHECKS
and removed from here? Otherwise it's confusing to a new reader, I believe.
|
||
typedef sharing_nodet<key_type, mapped_type, key_equal> self_type; | ||
SN_TYPE_PAR_DECL class d_internalt : public d_baset |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is a question regarding the naming of this and similar classes. What does the d_
prefix symbolise? Is it just data
? Or ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw now, that these are supposed to mimic the naming scheme of irept
, with d
symbolising data
or data type
or something along this lines. Understood, but I think this should be documented somewhere more prominently than an easily forgotten comment in a PR.
src/util/sharing_node.h
Outdated
|
||
#ifdef SN_INTERNAL_CHECKS | ||
#define SN_ASSERT(b) INVARIANT(b, "Sharing node internal invariant") | ||
#define SN_ASSERT_USE(v, b) SN_ASSERT(b) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In a previous comment it has been asserted that these utilities are not being used. Do you think they might be useful in the future, and if not, can they be removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There were two similar macros, SM_ASSERT_USE
and SN_ASSERT_USE
(SM
for sharing map and SN
for sharing node). The former was unused and has now been removed, but the latter is still used.
src/util/sharing_map.h
Outdated
|
||
#ifdef SHARING_MAP_INTERNAL_CHECKS | ||
#define _sm_assert(b) INVARIANT(b, "Sharing map internal invariant") | ||
#define _sm_assert_use(v, b) _sm_assert(b) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can they be removed in that case? @danpoe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor, non-blocking comments.
src/util/sharing_map.h
Outdated
#define SHARING_MAPT(R) \ | ||
template <class keyT, class valueT, class hashT, class predT> \ | ||
R sharing_mapt<keyT, valueT, hashT, predT> | ||
#define SHARING_MAPT(R) \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use // clang-format off
to avoid the ugly reformatting
src/util/sharing_map.h
Outdated
return nullptr; | ||
|
||
p=p->find_leaf(k); | ||
const leaft *lp; | ||
lp = cp->find_leaf(k); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure that separating declaration and initialisation improves readability here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I'll pull this method up into the class.
src/util/sharing_map.h
Outdated
std::size_t key = hash()(k); | ||
innert *ip = ↦ | ||
|
||
std::size_t bit; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this declaration not inside the loop?
src/util/sharing_map.h
Outdated
#include <stack> | ||
#include <vector> | ||
#include <stdexcept> | ||
//#define SM_DEBUG |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer removing commented code
src/util/sharing_map.h
Outdated
|
||
#include "irep.h" | ||
#include "sharing_node.h" | ||
#include "threeval.h" | ||
|
||
#define _sm_assert(b) assert(b) | ||
//#define _sm_assert(b) | ||
//#define SM_INTERNAL_CHECKS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer removing commented code
{ | ||
view.push_back(view_itemt(child.get_key(), child.get_value())); | ||
const innert *i = &item.second; | ||
stack.push({depth + 1, i}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You both push and pop on the stack within the do-while loop. Maybe comment on the property that ensures that the stack eventually becomes empty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're essentially iterating over all the nodes in a tree with DFS. I'll add a comment to say that.
d2ed841
to
af2defd
Compare
779fa71 Merge pull request diffblue#2253 from peterschrammel/documentation/override2 40ecff8 Merge pull request diffblue#2250 from tautschnig/expr-iterator-deque 050b344 Re-enable enforcement of override without virtual b5dec9c Get legalistic about use of override without virtual b51e2a8 Merge pull request diffblue#2196 from peterschrammel/check-module-includes d5eabdf expr_iterator: use a std::deque to implement the stack fada0af Add module dependency definition files a90ea44 Add module dependency check to CPP-LINT 88f8cfc Remove unnecessary includes d6986d8 Fix relative include paths 0f9c202 Merge pull request diffblue#2242 from diffblue/section-name-warning b7f5886 Merge pull request diffblue#2241 from diffblue/ld_mode 5c11eb7 Merge pull request diffblue#2245 from mgudemann/fix/warning/clang_self_assign f7e5fb5 Merge pull request diffblue#2229 from diffblue/ssize_t 1a504c9 Merge pull request diffblue#2244 from diffblue/solver-Makefile-fix 4bb1bf0 Fix clang++'s warning about self-assign 9a0aa9c Merge pull request diffblue#2235 from thomasspriggs/test-pl-colour 4c2cb3a remove linker mode from gcc_mode 303908f add separate path for ld 524091f factor out creation of hybrid binaries b9127f3 linker_script_merget now takes exactly one ELF + goto binary cd967db update year + add Michael 0d95cc5 missing const method qualifiers 6f04d98 fix ordering problem in solvers/Makefile 8f6bae0 remove a warning about section names 8befd02 Merge pull request diffblue#2238 from owen-jones-diffblue/owen-jones-diffblue/doc/irep_id 34b0ac6 Merge pull request diffblue#2236 from diffblue/show-class-hierarchy 8e8e450 Merge pull request diffblue#2232 from owen-jones-diffblue/owen-jones-diffblue/generic-bounded-types 01dc76b Add section on irep_idt and dstringt 2f4c6ad Add and unify --show-class-hierarchy command line option 56256f1 Minor typos in irept documentation 3cf4e3a Merge pull request diffblue#2178 from thomasspriggs/remove_java_bytecode_parse_treet_swap 1a7235d use __CPROVER_size_t and __CPROVER_ssize_t for __CPROVER_POINTER_OBJECT/OFFSET a018e2f Add JSON output for class hierarchy 68c45ed Improve class hierarchy output eeb732f Switch `push_back` to `emplace_back` when constructing `parse_trees`. f154840 Delete copy constructor of `class java_bytecode_parse_treet`. c5cbcec Fix instances where copying was being used instead of moving. 52a669f Remove `java_bytecode::swap` and return using `optionalt` instead. fabbd04 Give up parsing generic method signature with bound 77f8162 Colour code tests passing vs failing. e5e0897 Merge pull request diffblue#2126 from danpoe/refactor/sharing-map-small-nodes f55bd96 Merge pull request diffblue#2231 from smowton/smowton/fix/jbmc-tests af02973 Merge pull request diffblue#2202 from smowton/smowton/fix/java-lang-class-fields 42a78af JBMC tests: suffix logfiles when using symex-driven loading af2defd Removed obsolete sharing map unit test 1d7fbd3 Refactor sharing map nodes to reduce memory consumption 5235938 Restore testing of jbmc 8a59f6f Object factory: permit null pointers within java.lang.Class 8412eb0 Merge pull request diffblue#2228 from peterschrammel/move-remaining-java-tests 369577a Move remaining java tests to jbmc/regression/ bfe3d3d Merge pull request diffblue#2226 from tautschnig/inline-get-str-cont 2b00973 Merge pull request diffblue#2227 from tautschnig/fptr-removal 3f7685f Merge pull request diffblue#2223 from diffblue/fp-builtins 3b3dc71 Distinct names of return-value symbols 4f7fade Cleanup: use symbolt::symbol_expr 8372862 function-pointer removal: Set the mode of a return symbol 272cde0 Inline get_string_container 72a0379 test __builtin_isinf, __builtin_isinf_sign, __builtin_isnormal f156ef0 Merge pull request diffblue#2222 from tautschnig/attributes a69c603 add __builtin_isnormal 83aeddd added __builtin_isinf_sign 87d467e fix return types of various __builtin_is* functions 61af061 added typecast_exprt::conditional_cast e1b906a Support GCC's fallthrough attribute d6d0a49 C front-end: support alias attributes on variables 376beab Merge pull request diffblue#2218 from diffblue/builtin_fpclassify c3603e3 added a test for raw __builtin_fpclassify 52595bd add support for __builtin_fpclassify 50d1c79 Merge pull request diffblue#2214 from tautschnig/tg-only 3c59312 Remove unused substitute.{h,cpp} d3e131c Revert "Set memory limit utility" a4389fe Merge pull request diffblue#2210 from tautschnig/verbosity-cleanup c250880 Merge pull request diffblue#2211 from tautschnig/travis-osx-cleanup c8597a4 Merge pull request diffblue#2174 from romainbrenguier/bugfix/not_contains#TG-3150 b08ef94 Merge pull request diffblue#2216 from peterschrammel/update-codeowners 471ab0f Merge pull request diffblue#2207 from diffblue/remove-solvers-cvc 215cd69 Use enum entries instead of numeric values when comparing verbosity 6344b4f Warn if user-specified verbosity is out of range bf04bcb Use a single implementation of eval_verbosity b4731eb Do not redundantly set the message handler 42ec63a Clean up .gitignore 19200bf Update CODEOWNERS for /jbmc 0487376 Merge pull request diffblue#2173 from svorenova/gs_tg1121 6af2270 Update regression test that no longer throws an exception bc17328 Enable previously failing regression tests 146bb29 Adding debug information to dereference type comparison 7b9a20a Allow pointers to be dereferenced to void types 108129c Merge pull request diffblue#2118 from diffblue/remove-jbmc 11411e4 Travis/OSX follow-up cleanup: remove unnecessary environment variables 386faa8 Test for String.contains and very large strings 9e73699 Refactor negation of not contains constraints 29a8818 Build jbmc on CI f196e74 Update compilation instructions 1b7c84a Add JBMC README 03d6f5b Shorten goto-analyzer-taint-ansi-c tests to goto-analyzer-taint 8dc0d74 Remove obsolete jbmc-cover tests f36da08 Move Java regression tests b6742ca Move Java unit tests e247458 Add JANALYZER tool 4588753 Add JDIFF tool a20f2c1 Move java_bytecode, jbmc and miniz to jbmc/src 987106f Make unit test independent of java_bytecode d945452 Adapt cpplint header guard check 28907b2 remove (pre-SMT-LIB) CVC interface git-subtree-dir: cbmc git-subtree-split: 779fa71
This changes the representation of nodes in
sharing_mapt
to reduce memory consumption.