Skip to content

CBMC additional profiling info: Pointer dereference, points-to set metric #5475

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Nov 19, 2020

Conversation

natasha-jeppu
Copy link

Parse option to track size and contents of points-to sets during pointer dereferencing.

  • Each commit message has a non-empty body, explaining why the change was made.
  • Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • White-space or formatting changes outside the feature-related changed lines are in commits of their own.

ss << format(value);
json_result["Value"] = json_stringt(ss.str());

std::cout << ",\n" << json_result;
Copy link
Member

@peterschrammel peterschrammel Aug 26, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚫 This is not allowed. Pass a message handler to the class instead, as done everywhere else.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed this.

Copy link
Collaborator

@martin-cs martin-cs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. There are already a number of ways of showing the points-to sets in general.
  2. If it must specifically be done during symbolic execution, why not just add it as a call to the right level of logging. I don't see why we need extra options.

FYI : there has been a long-standing goal of minimising the number of options that CPROVER tools have or at least preventing the proliferation of them. We haven't done so well of late but it is still a worthwhile goal to pursue.

@natasha-jeppu
Copy link
Author

  1. There are already a number of ways of showing the points-to sets in general.

Are you referring to the --show-value-sets option? As I understand, the CBMC call just looks at the points-to set information which is recorded during symex and this is what we are interested in.

@danielsn danielsn added the aws Bugs or features of importance to AWS CBMC users label Aug 26, 2020
@danielsn
Copy link
Contributor

  1. There are already a number of ways of showing the points-to sets in general.
  2. If it must specifically be done during symbolic execution, why not just add it as a call to the right level of logging. I don't see why we need extra options.

FYI : there has been a long-standing goal of minimising the number of options that CPROVER tools have or at least preventing the proliferation of them. We haven't done so well of late but it is still a worthwhile goal to pursue.

I wonder if it would make sense to have a mechanism for defining what statistics to collect / what output to generate, that could go under a single commandline flag.
Ideas are:

  1. A json config file
  2. A --show-metric-in-log metric flag that takes a metric, or list of metrics.

@codecov
Copy link

codecov bot commented Sep 3, 2020

Codecov Report

Merging #5475 (f179946) into develop (c907b2b) will increase coverage by 0.01%.
The diff coverage is 100.00%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #5475      +/-   ##
===========================================
+ Coverage    69.32%   69.34%   +0.01%     
===========================================
  Files         1241     1241              
  Lines       100443   100474      +31     
===========================================
+ Hits         69636    69674      +38     
+ Misses       30807    30800       -7     
Flag Coverage Δ
cproversmt2 43.12% <100.00%> (+0.03%) ⬆️
regression 66.24% <100.00%> (+0.01%) ⬆️
unit 32.26% <4.76%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/goto-symex/symex_config.h 100.00% <ø> (ø)
src/cbmc/cbmc_parse_options.cpp 77.32% <100.00%> (+0.31%) ⬆️
src/goto-symex/symex_clean_expr.cpp 95.83% <100.00%> (ø)
src/goto-symex/symex_dereference.cpp 89.91% <100.00%> (ø)
src/goto-symex/symex_main.cpp 87.78% <100.00%> (+0.03%) ⬆️
src/pointer-analysis/goto_program_dereference.h 66.66% <100.00%> (ø)
src/pointer-analysis/value_set_dereference.cpp 94.97% <100.00%> (+0.65%) ⬆️
src/pointer-analysis/value_set_dereference.h 100.00% <100.00%> (ø)
src/pointer-analysis/value_set.cpp 78.35% <0.00%> (+0.69%) ⬆️
src/pointer-analysis/goto_program_dereference.cpp 26.92% <0.00%> (+0.70%) ⬆️
... and 1 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c907b2b...f179946. Read the comment docs.

@peterschrammel
Copy link
Member

If it's just about showing metrics that are cheap to collect (e.g. points to set size) then I would just always output them - no flag needed. Such metrics are indeed very useful.
If you are also going to show the points to sets themselves then I'd prefer to put them behind a flag, as this produces massive output... and I'm not sure what it is good for except for very special debug situations.

@natasha-jeppu
Copy link
Author

If it's just about showing metrics that are cheap to collect (e.g. points to set size) then I would just always output them - no flag needed. Such metrics are indeed very useful.
If you are also going to show the points to sets themselves then I'd prefer to put them behind a flag, as this produces massive output... and I'm not sure what it is good for except for very special debug situations.

For now both are behind a flag. We prefer to keep this behind a flag because even with just the points-to set size, we can get a really big output (multiple pointer dereferences and number of pointers may be large). But if you prefer to have it as part of the standard output we could make the change.

@@ -11,10 +11,9 @@ Author: Daniel Kroening, [email protected]

#include "value_set_dereference.h"

#ifdef DEBUG
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❗Please don't remove the ifdef here. iostream must not be included outside DEBUG here.

You can use iosfwd outside DEBUG - that's what you need here, as far as I can see.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@@ -36,7 +36,7 @@ class goto_program_dereferencet:protected dereference_callbackt
: options(_options),
ns(_ns),
value_sets(_value_sets),
dereference(_ns, _new_symbol_table, *this, ID_nil, false)
dereference(_ns, _new_symbol_table, *this, ID_nil, false, messaget())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❗ Passing a default messaget instance doesn't look right here. The instance should come from the caller.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a valid construct? The points-to sets are displayed during symex only. exprt dereference(const exprt &pointer, bool display_points_to_sets = false); sets display_points_to_set to false by default so any calls to value_set_dereferencet::dereference() from the goto program conversion phase does not display the points-to sets. Similarly, is it okay to set the messaget here to a default message instance for goto_program_dereferencet?

goto_program_dereferencet(
    const namespacet &_ns,
    symbol_tablet &_new_symbol_table,
    const optionst &_options,
    value_setst &_value_sets,
    const messaget &_log = messaget())
    : options(_options),
      ns(_ns),
      value_sets(_value_sets),
      dereference(_ns, _new_symbol_table, *this, ID_nil, false, _log)
  {
  }

Passing a valid messaget object heregoto_program_dereferencet will need a lot of additional refactoring across the goto program conversion code modules.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, thanks for the explanation. Please put a code comment to clarify why a default messaget instance is passed in this place. The next person stumbling over that code should not have to redo your reasoning. Many thanks!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I retain the current implementation i.e dereference(_ns, _new_symbol_table, *this, ID_nil, false, messaget()) or replace it with the construct above?
I will add in the comment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The one above is probably slightly cleaner as it pushes the messaget further up. But, still, please put a comment in goto_program_dereferencet then.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Comment added.

@peterschrammel
Copy link
Member

peterschrammel commented Sep 24, 2020

Approved assuming that the last two comments will be fixed before merging.

Natasha Yogananda Jeppu added 11 commits November 19, 2020 10:39
…eference

The points to set information provides some intuition about the complexity
of the resulting case split expression in case of a pointer dereference.
This commit adds parse option --show-points-to-sets to track the size and
contents of the points-to set for pointers.
…e option holds

This commit propagates parse option information to the symex stage. It is then used to
display the points-to sets.
…ng during symex

This commit displays the points-to set information in json format. Support for other
formats are yet to be added.
…ng during symex

This commit displays the points-to set information in json format. Support for other
formats are yet to be added.

clang format value_set_dereference.cpp symex_dereference.cpp
clang format value_set_derefernce.h
…eference

The points to set information provides some intuition about the complexity
of the resulting case split expression in case of a pointer dereference.
This commit adds parse option --show-points-to-sets to track the size and
contents of the points-to set for pointers.

cpp lint bmc_util.h
Add regression test for xml and plain output
…eference

The points to set information provides some intuition about the complexity
of the resulting case split expression in case of a pointer dereference.
This commit adds parse option --show-points-to-sets to track the size and
contents of the points-to set for pointers.

cpp lint bmc_util.h

Add error message for plain and xml format output
Natasha Yogananda Jeppu added 4 commits November 19, 2020 10:39
…eference

The points to set information provides some intuition about the complexity
of the resulting case split expression in case of a pointer dereference.
This commit adds parse option --show-points-to-sets to track the size and
contents of the points-to set for pointers.

cpp lint bmc_util.h

Add error message for plain and xml format output
@danielsn danielsn merged commit 639b9e9 into diffblue:develop Nov 19, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
aws Bugs or features of importance to AWS CBMC users
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants