Skip to content

User controlled havoc(ing) for memory-snapshot harness #4482

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
May 20, 2019

Conversation

xbauch
Copy link
Contributor

@xbauch xbauch commented Apr 4, 2019

In #4438 we build support for recovering structs and pointers in goto-harness from a memory snapshot. The havocing there was by default: the user couldn't change how deeply will the structs and array-pointers initialised. This PR unifies the control over non-deterministic initialisation between function-call and memory-snapshot harnesses, consequently allowing the user to control the initialisation in memory-snapshot harness as well.

Pointers to arrays are initialised in a similar fashion to function-call harness, i.e. the user has to declare which variable contains the size of the array.

Also dynamically allocated pointers in the source-program and changed into statically allocated arrays in the memory snapshot, so that it would be legal to query about their content in the harness-program.

  • Each commit message has a non-empty body, explaining why the change was made.
  • Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • White-space or formatting changes outside the feature-related changed lines are in commits of their own.

@xbauch xbauch force-pushed the feature/user-control-havoc branch 2 times, most recently from 1ab1e67 to 5b4f4ec Compare April 7, 2019 14:19
sonodtt
sonodtt previously requested changes Apr 8, 2019
@@ -35,6 +38,10 @@ struct recursive_initialization_configt
std::set<irep_idt> pointers_to_treat_as_cstrings;

std::string to_string() const; // for debugging purposes

bool handle_option(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Descriptive comment please.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -415,3 +416,56 @@ recursive_initializationt::cstring_member_initialization()
}
};
}

bool recursive_initialization_configt::handle_option(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be moved to top of file? (preserve expected ordering)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

auto &require_exactly_one_value =
harness_options_parsert::require_exactly_one_value;

if(p_impl->recursive_initialization_config.handle_option(option, values))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

empty branch - remind us of the why and what is going on here?
People will typically be viewing the code in a hurry, and not in an ordered PR as here. This may be where they jump in.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -222,7 +226,7 @@ code_blockt memory_snapshot_harness_generatort::add_assignments_to_globals(
goto_modelt &goto_model) const
{
recursive_initializationt recursive_initialization{
recursive_initialization_configt{}, goto_model};
recursive_initialization_config, goto_model};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extraneous removal of brace initialization that clarifies this is a temporary object constructed in place (and initialised via value initialization).
Not sure you gain additional clarity by removing it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a temporary object anymore.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! Good.

if(option == "memory-snapshot")
auto &require_exactly_one_value =
harness_options_parsert::require_exactly_one_value;
if(recursive_initialization_config.handle_option(option, values))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

empty branch - remind us of the why and what is going on here?
People will typically be viewing the code in a hurry, and not in an ordered PR as here. This may be where they jump in. (same remark in function_call_harness_generator.cpp)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

checkpoint();

assert(p != NULL);
for(int index = 0; index < 3; index++)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment. Yes - its clear, but for this may be where future someone starts in figuring out the code.

This principle holds elsewhere in the tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe done.

assert(p->array[index]->to_st2->to_st1->to_st2 == NULL);
}

assert(p->array[0] != p->array[1]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely some comments here. Clarify why the != holds

This principle holds elsewhere in the tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe done.


st1_t dummy1;
st2_t dummy2;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. The dummys are not needed in this test? Remove anything extraneous or we'll be wondering if we've overlooked something.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

@@ -190,7 +190,8 @@ exprt symbol_analyzert::get_char_pointer_value(
}
else
{
return it->second;
return symbol_exprt{to_symbol_expr(it->second).get_identifier(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment? if you had to understand something - please tell us what and why.
This should greatly help in reducing review time and increase accuracy of reviews.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, or maybe removed.

"pointers-to-char shouldn't point to members");
const auto result_pointee_expr =
get_char_pointer_value(expr, memory_location, location);
const auto maybe_cast_result = typecast_exprt::conditional_cast(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm. My guess is you had to understand something here about memory analyser.
Any clarification would greatly help reviewers. This code was inherited, but as we understand it (to extend it) - we should clarify, so as to not duplicate effort when this code is passed on to others.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, or maybe rewritten.

@xbauch xbauch force-pushed the feature/user-control-havoc branch 3 times, most recently from b025b32 to 4f334f8 Compare April 10, 2019 14:07
@xbauch xbauch marked this pull request as ready for review April 11, 2019 07:45
@xbauch xbauch force-pushed the feature/user-control-havoc branch from 4f334f8 to 9c59bfb Compare April 23, 2019 11:43
@xbauch xbauch requested review from forejtv and thk123 as code owners April 23, 2019 11:43
@xbauch xbauch force-pushed the feature/user-control-havoc branch 2 times, most recently from 484b84a to 197f23d Compare April 23, 2019 12:00
Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️
This PR failed Diffblue compatibility checks (cbmc commit: 197f23d).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/109281843
Status will be re-evaluated on next push.
Common spurious failures include: the cbmc commit has disappeared in the mean time (e.g. in a force-push); the author is not in the list of contributors (e.g. first-time contributors); compatibility was already broken by an earlier merge.

xbauch pushed a commit to xbauch/cbmc that referenced this pull request Apr 27, 2019
xbauch pushed a commit to xbauch/cbmc that referenced this pull request Apr 27, 2019
@xbauch xbauch mentioned this pull request Apr 27, 2019
7 tasks
@@ -16,6 +16,27 @@ Author: Diffblue Ltd.

class goto_modelt;

class harness_options_parsert
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, so this really is using a class as a namespace, just because our coding guidelines say so?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was not the intention, but I guess that is the only reason.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then maybe we should actually use a namespace?

Copy link
Contributor Author

@xbauch xbauch May 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here: 6981150. I'm just not sure about the naming convention: should a namespace end with t or not?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also the linter doesn't like it very much.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you fixed the latter? Looks good to me. But I'm not sure what others think about it :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was suggested that rather than rewriting the linter we should just wrap the namespace in // NOLINT comments: b9fd0af.

@@ -16,9 +16,63 @@ Author: Diffblue Ltd.
#include <util/optional_utils.h>
#include <util/std_code.h>
#include <util/std_expr.h>
#include <util/string2int.h>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The commit message sounds cool, but really doesn't give my any information I'm afraid. Please clarify.

Copy link
Contributor Author

@xbauch xbauch May 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry about the sloppy description, expanded here: d96545d.

@@ -83,6 +83,12 @@ class gdb_apit
const optionalt<std::string> string;
};

/// Get the allocated size estimate for a pointer
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain why this is an estimate? And is it always either an under- or an overestimate?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added explanation: ef45775.

return get_char_pointer_value(expr, memory_location, location);
const auto target_expr =
get_pointer_to_member_value(expr, value, location);
CHECK_RETURN(target_expr.id() != ID_nil);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer target_expr.is_not_nil()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

: get_non_char_pointer_value(expr, memory_location, location);

// postpone if we cannot resolve now
if(target_expr.id() == ID_nil)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if(target_expr.is_nil())

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

const auto val = get_expr_value(symbol_expr, *zero, location);

symbol_exprt dummy(
pointer_typet(symbol_expr.type(), config.ansi_c.pointer_width));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use pointer_type from util/c_types.{h,cpp}, and add a name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


// allocate a new symbol for the temporary static array
symbol_exprt array_dummy(
pointer_typet(target_array_type, config.ansi_c.pointer_width));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use pointer_type, add a name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: b30bb65).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/112204059

Copy link
Contributor

@NlightNFotis NlightNFotis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

Copy link
Contributor

@chrisr-diffblue chrisr-diffblue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly just nits/queries - but I think the unconditional use of malloc_useable_size needs addressing for future sanity.


assert(first == second);
assert(second[array_size - 1] == 'a');
// assert(second[10]=='0');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason this assert is still commented out? If so, maybe worth including a comment explaining why.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is redundant. Removed.

checkpoint();

assert(first == second);
/* assert(array_size >= prefix_size); */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto with these asserts - worth explaining why they don't currently hold/when they might be able to be enabled.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one was postponed until the improved support for dynamic allocation will be implemented in #4578. I've added a comment.

@@ -56,6 +57,39 @@ gdb_apit::~gdb_apit()
wait(NULL);
}

size_t gdb_apit::query_malloc_size(const std::string &pointer_expr)
{
write_to_gdb("-var-create tmp * malloc_usable_size(" + pointer_expr + ")");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe malloc_usable_size is only available on Linux. On OSX I think the equivalent function on Mac OS X is malloc_size - so we probably need to do some conditional compilation guarding here...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the machine interface worked so much differently under OSX that we only wanted to compile there but not actually run the memory-analyzer. Anyway what do you think about 57f08e1?

@xbauch xbauch force-pushed the feature/user-control-havoc branch 2 times, most recently from 91b15c4 to 57f08e1 Compare May 17, 2019 19:45
Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️
This PR failed Diffblue compatibility checks (cbmc commit: 47138b7).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/112282432
Status will be re-evaluated on next push.
Common spurious failures include: the cbmc commit has disappeared in the mean time (e.g. in a force-push); the author is not in the list of contributors (e.g. first-time contributors); compatibility was already broken by an earlier merge.

Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: 57f08e1).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/112284250

Copy link
Contributor

@chrisr-diffblue chrisr-diffblue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me now, thanks!

petr-bauch added 10 commits May 20, 2019 11:00
Returning 1 if variable was not malloc(ed). The estimation is described in the
documentation.
Not all queried values are sensible if we ask past allocated
memory (malloc_usable_size doesn't return exact values).
To prevent assigning uninitialized values. Also build the
recursive_initialization with the member config.
1: Harness option parser is a namespace not a class: its methods are not static
   and we do not inherit from it.
2: Rename mis-labelled option: from common-harness..-nondet-globals to
   function-harness-..-nondet-globals.
Also pointers to members can be chars.
Unify handling pointers-to-members/to-char/to-non-char and add comments.
Using the type from symbol table and value from gdb.
By pretending they were allocated statically. We get the estimated allocated
size and build a static object of that size.
Testing that the user can set how deeply a struct/pointer is
non-deterministically initialized as well as some cases regarding pointers,
strings, dynamic allocation, etc.
@xbauch xbauch force-pushed the feature/user-control-havoc branch from 57f08e1 to e0d3e5e Compare May 20, 2019 10:12
@xbauch xbauch merged commit 3d9a71b into diffblue:develop May 20, 2019
Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️
This PR failed Diffblue compatibility checks (cbmc commit: e0d3e5e).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/112430842
Status will be re-evaluated on next push.
Common spurious failures include: the cbmc commit has disappeared in the mean time (e.g. in a force-push); the author is not in the list of contributors (e.g. first-time contributors); compatibility was already broken by an earlier merge.

@xbauch xbauch deleted the feature/user-control-havoc branch June 10, 2019 12:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants