-
Notifications
You must be signed in to change notification settings - Fork 273
Dynamic memory snapshot #4578
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dynamic memory snapshot #4578
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks ok to me (with the disclaimer that I don't really understand this)
bool gdb_value_extractort::memory_scopet::contains( | ||
const memory_addresst &point) const | ||
{ | ||
size_t begin_int = std::strtoul(begin.address_string.c_str(), NULL, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a unchecked conversion effectively (size_t
is not necessarily the same size as unsigned long
, even though it's often the case). I'd recommend using safe_string2size_t
from string2int.h
(same with all the other string to int conversions here).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't sure if the safe
-conversion would work on hex-numbers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They should, they use stoll
which basically works the same way as the str*
functions except it's easier to detect failures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done and refactored so that the conversion is computed once in case of distance
. See: 3812c28.
src/memory-analyzer/analyze_symbol.h
Outdated
@@ -91,6 +91,70 @@ class gdb_value_extractort | |||
/// value of `symbol`. | |||
std::map<memory_addresst, exprt> values; | |||
|
|||
struct memory_scopet | |||
{ | |||
memory_addresst begin; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not private
, if these are not arbitrary (as the deleted default constructor implies)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, also added getters when necessary.
optionalt<mp_integer> get_malloc_size(irep_idt name); | ||
|
||
/// Build the pointee string for address \p point assuming it points to a | ||
/// dynamic allocation of `n' elements each of size \p member_size. E.g.: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's n
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
T* point = (T*) malloc(sizeof(T) * n);
.. that n
.
@@ -230,8 +242,11 @@ code_blockt memory_snapshot_harness_generatort::add_assignments_to_globals( | |||
ordered_snapshot_symbols.begin(), | |||
ordered_snapshot_symbols.end(), | |||
[this](const snapshot_pairt &left, const snapshot_pairt &right) { | |||
return pointer_depth(left.second.symbol_expr().type()) < | |||
pointer_depth(right.second.symbol_expr().type()); | |||
if(refers_to(right.second.value, left.first)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain what this is for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example, say int* p = malloc(8); int *q = p+1;
is the source code for which we want to take the memory snapshot. Then the snapshot will contain: int tmp[2]; int* p = tmp[0]; int* q = tmp[1];
And both p
and q
refer_to
tmp
. I sort the symbols based on this notion of reference so that when building the harness entry function, tmp
is assigned before the two pointers.
const std::string pointee; | ||
const std::string character; | ||
const optionalt<std::string> string; | ||
memory_addresst address; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These fields are all related, so I don't think they should be both be constant and public at the same time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well they are not const
in the final version.
{ | ||
if(pointer_value.pointee.find("+") != std::string::npos) | ||
return true; | ||
|
||
if(pointer_value.pointee.empty()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this only done if the pointee
is empty? What does that mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well pointer_value
is the result of the gdb query get_memory
. When the pointee
is empty it means that gdb did not know what structure the pointer points to. This can happen if the pointer points to inside a dynamically allocated memory.
bool gdb_value_extractort::memory_scopet::contains( | ||
const memory_addresst &point) const | ||
{ | ||
size_t begin_int = std::strtoul(begin.address_string.c_str(), NULL, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
src/memory-analyzer/analyze_symbol.h
Outdated
memory_scopet() = delete; | ||
memory_scopet( | ||
const memory_addresst &begin, | ||
mp_integer byte_size, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const mp_integer &byte_size
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/memory-analyzer/analyze_symbol.h
Outdated
memory_scopet( | ||
const memory_addresst &begin, | ||
mp_integer byte_size, | ||
irep_idt name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const irep_idt &name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
CHECK_RETURN(contains(point)); | ||
size_t begin_int = std::strtoul(begin.address_string.c_str(), NULL, 0); | ||
size_t point_int = std::strtoul(point.address_string.c_str(), NULL, 0); | ||
return (point_int - begin_int) / member_size; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't you first do CHECK_RETURN(point_int >= begin_int);
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was implied by contains
. But I refactored this section a bit to avoid duplicate computation of the address-to-string conversion. See: 3812c28.
src/memory-analyzer/analyze_symbol.h
Outdated
mp_integer byte_size; | ||
irep_idt name; | ||
|
||
memory_scopet() = delete; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that's necessary if you explicitly define some other constructor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. Removed.
} | ||
|
||
const auto it = values.find(memory_location); | ||
// if the structure we are pointing to does not exists we need to build a | ||
// temporary object for it: get the type from symbol table, query gdb for | ||
// value, allocate new object for it and then store into assignments | ||
if(it == values.end()) | ||
if(struct_symbol->symbol_expr().type().id() == ID_pointer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks very weird?!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about struct_symbol->type.id()
? Changed.
@@ -303,6 +303,7 @@ exprt gdb_value_extractort::get_pointer_to_member_value( | |||
} | |||
|
|||
if(!found) | |||
if(struct_symbol->symbol_expr().type().id() == ID_array) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks weird!?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above.
@@ -291,6 +291,8 @@ exprt gdb_value_extractort::get_pointer_to_member_value( | |||
bool found = false; | |||
CHECK_RETURN(maybe_struct_size.has_value()); | |||
for(const auto &value_pair : values) | |||
|
|||
if(memory_map.count(struct_name) == 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks weird.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wrapped in a helper function has_known_memory_location
.
@@ -0,0 +1,13 @@ | |||
FUTURE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please explain (after another --
at the end of the file) what isn't supported here so far.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.
tmp = \{ \.next=\(\(struct S \*\)0\) \}; | ||
p = \&tmp; | ||
st = \{ .next=\&st \}; | ||
p = \&st; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please squash this into the commit that made this possible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was this one: 33d0a79. I will squash them together.
d803707
to
3812c28
Compare
3812c28
to
71ee2f1
Compare
785cf3a
to
410fd84
Compare
@xbauch My apologies for taking this long! You might want to squash some commits and then just merge. Thank you for all the work!!! |
89f8805
to
5453ba7
Compare
In memory-snapshot-harness: so that pointer to objects are initialised after those objects.
no functional change
So that we can assume the implicit assignment operator.
One for value, one for register-value.
by breaking the execution at malloc entry and extracting the byte size and memory address from registers.
by returning the exact value rather than an estimate collected during preceding execution.
memory_scopet keeps the starting point and allocated size for a malloc(ed) site. We also include helper methods to query about dynamic allocation.
To account for the possibility of the pointee to be dynamically allocated.
Via dynamically_allocated and memory_map.
into a member method.
Unnecessary now that we enforce analysing those objects.
1: If pointee is known, return it's value instead. 2: malloc-size is precise and type-size has a wrapper 3: do not cast values from dynamically allocated pointers: we would loose the information about them being arrays.
1: Use memory map to get pointer-values (but double-check if gdb previously claimed that their pointees are themselves) 2: add more checks before casting.
instead of a vector to prevent duplications.
1: Casts are a part of the output 2: We can use some of the assertions previously left out (now that we get the dynamic allocation right).
Floating point numbers cannot be extracted for now.
rather than as a part of regression testing and only under non-MSVC environment.
2ab16c4
to
ba8beb5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✔️
Passed Diffblue compatibility checks (cbmc commit: ba8beb5).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/113721650
Enable tracking information about dynamically allocated memory from
memory-analyzer
across togoto-harness
via memory snapshots. The knowledge about what addresses have been allocated dynamically (and the scope of these allocation in byte size) is collected by breaking the execution atmalloc
call-sites and inspecting the registers of these function calls.