Skip to content

Small map #2087

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 15, 2018
Merged

Small map #2087

merged 1 commit into from
May 15, 2018

Conversation

danpoe
Copy link
Contributor

@danpoe danpoe commented Apr 18, 2018

This is a feature that is used in a future refactoring of sharing_mapt.

A map that maps small integers to values. It is designed to be more memory-efficient than std::map for this use case.

/// Map from small integers to values
///
/// A data structure that maps small integers (in {0, ..., Num-1}) to values.
/// It is designed to be more memory-efficient than std::map for this use case.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How much "more" memory-efficient is it? And what values of Num are typically to be used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're using the small map to represent the children of a node in a tree data structure. A node can have up to Num children, with indices 0...Num-1, but not all of them are usually present.

Typical values for Num are say 8 or 16 (we're using 8 at the moment). For a map that stores pointers (as in our case) it uses about 1/5 of the memory compared to std::map.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, cool, it would be nice to have as concrete as possible data points in the comments. Statements like "for Num=8, std::map requires X bytes on a 64-bit system when using GNU STL while this map only requires Y bytes".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I've added the data about the memory consumption to the class description.

@danpoe danpoe force-pushed the feature/small-map branch from 4942ec3 to 3959ca2 Compare April 19, 2018 15:24
Copy link
Contributor

@NlightNFotis NlightNFotis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM


T *allocate(std::size_t n) const
{
return (T *)malloc(sizeof(T) * n);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One little thing to be careful of, is that the usage of malloc is introducing a very subtle failure case that I think is not being properly handled. If new fails, it throws. If malloc fails, it returns a null pointer. This can lead to very subtle bugs in copy if a machine is stressed in memory. Do you think this is something that might be a good idea to be a PRECONDITION, or is it just a waste?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I'm not sure what exactly should happen in this case. I'm gonna check what the STL containers do and then maybe implement it similarly.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the STL typically defines an implementation type and then just calls new on that.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though really all you need here is return new T[n]; paired with a delete[] in the deallocation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the STL containers use allocators (which in turn use malloc/free) and then use placement new to create the actual objects.

We don't use new T[n] here as we want to be able to grow (if possible) and shrink the allocated memory block with realloc(). Curiously, C++ allocators don't have an equivalent of realloc(). That's why e.g. the capacity of a std::vector does not decrease even when erasing elements. C++11 added shrink_to_fit(), but that copies the elements to a newly allocated memory segment. Also new T[n] would call the default constructor which would be unnecessary when copying objects.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So maybe just do T *ptr = malloc(sizeof(T) * n); if(!ptr) throw std::bad_alloc(); return ptr; ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's the right thing to do.

@danpoe danpoe force-pushed the feature/small-map branch 5 times, most recently from 02f1202 to 12dcd27 Compare April 25, 2018 11:03
@danpoe
Copy link
Contributor Author

danpoe commented Apr 26, 2018

Just a note if anyone is reviewing this right now: I've had an idea about how to simplify the implementation and will change/refactor the code in the next few days. It's thus better to hold on the reviewing until then.

@danpoe
Copy link
Contributor Author

danpoe commented Apr 27, 2018

Ok, I realised that the simpler version would require moving elements on insert() even when realloc() succeeds, which might be slightly less efficient. So I'm leaving the implementation as it is now.

T *allocate(T *ptr, std::size_t n)
#ifndef _SMALL_MAP_REALLOC_STATS
const
#endif
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems perfectly ok to mark realloc_success and realloc_failure mutable so that you don't have to do this preprocessor-conditional const.

{
realloc_failure++;
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about if(ptr == mem) ++realloc_success; else if(ptr != nullptr) ++realloc_failure; to avoid one level of indentation?

@@ -552,6 +571,11 @@ class small_mapt
return !ind;
}

#ifdef _SMALL_MAP_REALLOC_STATS
unsigned realloc_failure = 0;
unsigned realloc_success = 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd use std::size_t - unsigned should really only be used where storage size matters.

#include "invariant.h"

// When we don't have constexpr
#if !(defined(__GNUC__) || _MSC_VER >= 1900)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May I suggest #if !defined(__GNUC__) && _MSC_VER < 1900 to make the negation more local?

struct num_bitst<1>
{
static const std::size_t value = 1;
};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about num_bitst<0>? I believe that ought to be the base case rather than num_bitst<1>.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was not quite sure whether num_bitst<0> should return 0 or 1. The latter seems a bit more logical to me. I'll add another specialization num_bits<0> which returns 1.


// Memory

void copy(T *dest, const T *src, std::size_t n) const
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this really be public? If so, documentation is required. Notably, it doesn't do any kind of sanity checks (in particular on n).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, that should be private.

return;
}

std::size_t n = m.size();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const?

static_assert(NUM >= 2, "");

// When we don't have constexpr
#if !(defined(__GNUC__) || _MSC_VER >= 1900)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest propagating the negation as above. And wouldn't it make sense to move the above "no-constexpr" template instantiations down here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately it can't be moved down as then it would be contained within another template. And then we could only specialize the template if we specialize the outer template as well.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I should have thought of this. Maybe add a comment that this is used within the class? Reading the code top-to-bottom this piece of code comes a bit as a surprise.


static_assert(std::numeric_limits<unsigned>::digits >= BITS, "");

// Internal
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

private or protected?


#include <cassert>
#include <iostream>
#include <map>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are those three required?

@danpoe danpoe force-pushed the feature/small-map branch 3 times, most recently from d29ca32 to a2ad909 Compare May 3, 2018 16:29
@tautschnig tautschnig merged commit a154593 into diffblue:develop May 15, 2018
NathanJPhillips pushed a commit to NathanJPhillips/cbmc that referenced this pull request Aug 22, 2018
31ef182 Merge pull request diffblue#2208 from diffblue/cleanout-goto-programs
0e84d3e Merge pull request diffblue#2195 from smowton/smowton/feature/smarter-reachability-slicer
c801126 Merge pull request diffblue#2209 from diffblue/travis-no-osx-g++
3487bf7 Reachability slicer: mark reachable code more precisely
85e8451 Merge pull request diffblue#2066 from tautschnig/bv-endianness
5e5eafb Merge pull request diffblue#2191 from smowton/smowton/feature/java-fatal-assertions
8a68ab8 remove g++ build on OS X; this is identical to the clang++ build
48e5b25 Amend faulty long-string test
3f718ba Java: make null-pointer and similar checks fatal
821eb1d remove basic_blocks and goto_program_irep
d87c2db Merge pull request diffblue#2203 from diffblue/typed-lookup
d77dea3 strongly type the lookup() methods in namespacet
2382f27 Merge pull request diffblue#2193 from martin-cs/feature/correct-instruction-documentation
c314272 Merge pull request diffblue#2181 from diffblue/format-expr-constants
514a0a5 Merge pull request diffblue#2167 from diffblue/parse_unwindset_options
0102452 move escape(s) to string_utils
f1b6174 use unwindsett in goto-instrument
dcc907a introduce unwindsett for unwinding limits
10aeae8 format_expr now does index, c_bool constants, string constants
92b92d6 Correct the documentation of ASSERT : it does not alter control flow.
a11add8 Expand the documentation of the goto-program instructions.
a06503b Merge pull request diffblue#2197 from tautschnig/fix-help
05e4bc3 Remove stray whitespace previously demanded by clang-format
3261f4d Fix help output of --generate-function-body-options
7c67b23 Merge pull request diffblue#2110 from tautschnig/type-mismatch-exception
18fb262 Merge pull request diffblue#2025 from tautschnig/stack-depth-fix
9191b6a Merge pull request diffblue#2199 from tautschnig/simplifier-typing
f99e631 Merge pull request diffblue#2198 from tautschnig/fix-test-desc
1a79a11 stack depth instrumentation: __CPROVER_initialize may be empty
a7690ba Merge pull request diffblue#2185 from smowton/smowton/fix/tolerate-ts18661-types
fc02e8f Restore returns before writing the simplified binary
49333eb Make restore_returns handle simplified programs
46f7987 Fix perl regular expressioons in regression test descriptions
9fe432b Merge pull request diffblue#1899 from danpoe/sharing-map-catch-unit-tests
9cc6aae Merge pull request diffblue#2081 from hannes-steffenhagen-diffblue/floating_point_simplificiation
da19abe Tolerate TS 18661 (_Float32 et al) types
a055454 Try all rounding modes when domain is unknown
5f7bc15 Add float support to constant propagator domain
3dc2244 Move adjust_float_expressions to goto-programs
06d8bea Merge pull request diffblue#2187 from diffblue/move-convert-nondet
6d8c3fa Merge pull request diffblue#2189 from thk123/bugfix/json-parser-restart
2ad157f Merge pull request diffblue#2186 from smowton/smowton/fix/call-graph-uninit-field
cd54ad7 Corrected state persistence in the json parser
4447be0 Fix uninitialised collect_callsites field in call_grapht
ead0aa3 Merge pull request diffblue#2188 from smowton/smowton/fix/init-string-types-without-refine-strings
57988fc Fix String type initialisation when --refine-strings is not active
6a76aff Move convert_nondet to java_bytecode
ac6eb21 Merge pull request diffblue#1858 from danpoe/dependence-graph-fix
10b8b09 Merge pull request diffblue#2011 from thomasspriggs/final_classes
a154593 Merge pull request diffblue#2087 from danpoe/feature/small-map
7002909 More dependence graph tests
66263ea Make dependence graph merge() return true when abstract state changes
3aa6dca Control dependency computation fix
a408423 Simplified state merging in the dependence graph
0315816 Merge pull request diffblue#2180 from thomasspriggs/json_id2string
8941567 Merge pull request diffblue#2124 from smowton/smowton/feature/fallback-function-provider
cd04e70 JBMC: use simple method stubbing pass
a6b2cda Add Java simple method stubbing pass
ec1127c Lazy goto model: hook for driver program to replace or stub functions
b6a4382 Remove `id2string` from inside calls to the `json_stringt` constructor.
b673ebb Add storage of final modifier status of java classes in `java_class_typet`.
a2ad909 Small map
808dade Provide suitable diagnostics for equality-without-matching-types
89cf6fe Throw appropriate exceptions when converting constraints
2ae66c2 Produce a proper error message when catching a std::runtime_error at top level
e7b3ade Workaround for Visual Studio loss of CV qualifier bug
1f661f5 Move sharing map and sharing node unit tests to util
47463a3 Use std::size_t instead of unsigned in the sharing map
3e22217 Sharing map documentation
e54f740 Fix sharing map compiler warnings
bcc489c Move sharing map unit tests to catch
34114b8 Use a specialised endianness map for flattening

git-subtree-dir: cbmc
git-subtree-split: 31ef182
@danpoe danpoe deleted the feature/small-map branch June 2, 2020 17:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants