CBMC additional profiling info: Pointer dereference, points-to set metric #5475

natasha-jeppu · 2020-08-26T16:15:08Z

Adds a parse option, --show-points-to-sets, to track the size and contents of points-to sets during pointer dereferencing.

Each commit message has a non-empty body, explaining why the change was made.
Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
My commit message includes data points confirming performance improvements (if claimed).
My PR is restricted to a single feature or bugfix.
White-space or formatting changes outside the feature-related changed lines are in commits of their own.

peterschrammel · 2020-08-26T17:09:11Z

src/pointer-analysis/value_set_dereference.cpp

+    ss << format(value);
+    json_result["Value"] = json_stringt(ss.str());
+
+    std::cout << ",\n" << json_result;


🚫 This is not allowed. Pass a message handler to the class instead, as done everywhere else.

Fixed this.
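A minimal sketch of the pattern the reviewer asks for, written against stand-in types rather than CBMC's real headers. The names `message_sinkt`, `string_sinkt` and `points_to_reportert` are hypothetical and invented for this illustration, and the JSON-like string is assembled by hand instead of with `json_stringt`. The only point is that the reporting class receives its output channel from the caller instead of writing to `std::cout` itself.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a message handler; this is not CBMC's messaget API.
class message_sinkt
{
public:
  virtual ~message_sinkt() = default;
  virtual void print(const std::string &text) = 0;
};

// Collects messages in memory, which makes the output easy to inspect in tests.
class string_sinkt : public message_sinkt
{
public:
  void print(const std::string &text) override
  {
    lines.push_back(text);
  }

  std::vector<std::string> lines;
};

// Hypothetical reporter: the sink is injected by the caller, so the class
// never touches std::cout directly.
class points_to_reportert
{
public:
  explicit points_to_reportert(message_sinkt &_sink) : sink(_sink)
  {
  }

  // Emits one hand-built JSON-like object for the given value.
  void report_value(const std::string &value)
  {
    sink.print("{\"Value\": \"" + value + "\"}");
  }

private:
  message_sinkt &sink;
};
```

With this shape the caller decides where the text ends up (console, log file, or an in-memory buffer), which is presumably why direct use of `std::cout` is rejected in review.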

martin-cs

There are already a number of ways of showing the points-to sets in general.
If it must specifically be done during symbolic execution, why not just add it as a call to the right level of logging. I don't see why we need extra options.

FYI : there has been a long-standing goal of minimising the number of options that CPROVER tools have or at least preventing the proliferation of them. We haven't done so well of late but it is still a worthwhile goal to pursue.

natasha-jeppu · 2020-08-26T18:56:11Z

> There are already a number of ways of showing the points-to sets in general.

Are you referring to the --show-value-sets option? As I understand it, that CBMC call only looks at the points-to set information recorded during symex, and that information is what we are interested in.

danielsn · 2020-08-27T18:01:08Z

> There are already a number of ways of showing the points-to sets in general.

> If it must specifically be done during symbolic execution, why not just add it as a call to the right level of logging. I don't see why we need extra options.

> FYI : there has been a long-standing goal of minimising the number of options that CPROVER tools have or at least preventing the proliferation of them. We haven't done so well of late but it is still a worthwhile goal to pursue.

I wonder if it would make sense to have a mechanism for defining what statistics to collect / what output to generate, that could go under a single commandline flag.
Ideas are:

- A json config file
- A --show-metric-in-log metric flag that takes a metric, or list of metrics.
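To make the second idea concrete, here is a rough sketch of how a single flag taking a comma-separated metric list could be parsed. Nothing here exists in CBMC: the functions `parse_metric_list` and `metric_enabled` and the metric names in the usage note are assumptions made up for illustration.

```cpp
#include <set>
#include <sstream>
#include <string>

// Split a comma-separated option value such as "a,b" into a set of metric
// names. Empty entries (for example from "a,,b") are skipped.
std::set<std::string> parse_metric_list(const std::string &option_value)
{
  std::set<std::string> metrics;
  std::istringstream stream(option_value);
  std::string name;
  while(std::getline(stream, name, ','))
  {
    if(!name.empty())
      metrics.insert(name);
  }
  return metrics;
}

// True if the user asked for the given metric.
bool metric_enabled(
  const std::set<std::string> &metrics,
  const std::string &name)
{
  return metrics.count(name) != 0;
}
```

A call such as `parse_metric_list("points-to-set-size,points-to-set-contents")` would yield a two-element set, and each collection site would then check `metric_enabled` before emitting anything, so new metrics would not need new command-line options.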

codecov · 2020-09-03T00:41:16Z

Codecov Report

Merging #5475 (f179946) into develop (c907b2b) will increase coverage by 0.01%.
The diff coverage is 100.00%.

@@             Coverage Diff             @@
##           develop    #5475      +/-   ##
===========================================
+ Coverage    69.32%   69.34%   +0.01%     
===========================================
  Files         1241     1241              
  Lines       100443   100474      +31     
===========================================
+ Hits         69636    69674      +38     
+ Misses       30807    30800       -7

Flag	Coverage Δ
cproversmt2	`43.12% <100.00%> (+0.03%)`	⬆️
regression	`66.24% <100.00%> (+0.01%)`	⬆️
unit	`32.26% <4.76%> (-0.01%)`	⬇️


Impacted Files	Coverage Δ
src/goto-symex/symex_config.h	`100.00% <ø> (ø)`
src/cbmc/cbmc_parse_options.cpp	`77.32% <100.00%> (+0.31%)`	⬆️
src/goto-symex/symex_clean_expr.cpp	`95.83% <100.00%> (ø)`
src/goto-symex/symex_dereference.cpp	`89.91% <100.00%> (ø)`
src/goto-symex/symex_main.cpp	`87.78% <100.00%> (+0.03%)`	⬆️
src/pointer-analysis/goto_program_dereference.h	`66.66% <100.00%> (ø)`
src/pointer-analysis/value_set_dereference.cpp	`94.97% <100.00%> (+0.65%)`	⬆️
src/pointer-analysis/value_set_dereference.h	`100.00% <100.00%> (ø)`
src/pointer-analysis/value_set.cpp	`78.35% <0.00%> (+0.69%)`	⬆️
src/pointer-analysis/goto_program_dereference.cpp	`26.92% <0.00%> (+0.70%)`	⬆️
... and 1 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

peterschrammel · 2020-09-03T15:53:48Z

If it's just about showing metrics that are cheap to collect (e.g. points to set size) then I would just always output them - no flag needed. Such metrics are indeed very useful.
If you are also going to show the points to sets themselves then I'd prefer to put them behind a flag, as this produces massive output... and I'm not sure what it is good for except for very special debug situations.

natasha-jeppu · 2020-09-23T14:25:54Z

> If it's just about showing metrics that are cheap to collect (e.g. points to set size) then I would just always output them - no flag needed. Such metrics are indeed very useful.
> If you are also going to show the points to sets themselves then I'd prefer to put them behind a flag, as this produces massive output... and I'm not sure what it is good for except for very special debug situations.

For now both are behind a flag. We prefer to keep this behind a flag because even with just the points-to set size, we can get a really big output (multiple pointer dereferences and number of pointers may be large). But if you prefer to have it as part of the standard output we could make the change.
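The split under discussion (cheap size metric versus potentially huge set contents) can be sketched as follows. This is an illustration only: `describe_points_to_set` is a hypothetical helper, the points-to set is modelled as a plain list of strings, and the output format is invented rather than taken from the PR.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Always reports the size of the points-to set; lists the targets only when
// show_contents is true, since that part can become very large.
std::string describe_points_to_set(
  const std::string &pointer,
  const std::vector<std::string> &targets,
  bool show_contents)
{
  std::ostringstream out;
  out << pointer << ": " << targets.size() << " target(s)";
  if(show_contents)
  {
    out << " [";
    for(std::size_t i = 0; i < targets.size(); ++i)
    {
      if(i != 0)
        out << ", ";
      out << targets[i];
    }
    out << "]";
  }
  return out.str();
}
```

As the reply above points out, even the size-only line is emitted once per dereference, so the total output can still be large when a program dereferences many pointers.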

peterschrammel · 2020-09-23T15:55:30Z

src/pointer-analysis/value_set_dereference.cpp

@@ -11,10 +11,9 @@ Author: Daniel Kroening, [email protected]

 #include "value_set_dereference.h"

-#ifdef DEBUG


❗Please don't remove the ifdef here. iostream must not be included outside DEBUG here.

You can use iosfwd outside DEBUG - that's what you need here, as far as I can see.
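A small sketch of the include discipline described here, under the assumption that the code only needs to name `std::ostream` in signatures and write to a stream supplied by the caller. The functions `output_value` and `value_to_string` are hypothetical and stand in for whatever the real file declares.

```cpp
// Header part: <iosfwd> forward-declares std::ostream, which is enough to
// name it in a function signature.
#include <iosfwd>
#include <string>

void output_value(std::ostream &out, const std::string &value);
std::string value_to_string(const std::string &value);

// Implementation part: <ostream> supplies operator<< and <sstream> the
// in-memory stream. <iostream>, which brings in std::cout and std::cerr,
// stays behind DEBUG.
#include <ostream>
#include <sstream>

#ifdef DEBUG
#include <iostream>
#endif

void output_value(std::ostream &out, const std::string &value)
{
  out << "value: " << value << '\n';
#ifdef DEBUG
  std::cerr << "output_value called\n";
#endif
}

// Convenience wrapper that renders into a string instead of a global stream.
std::string value_to_string(const std::string &value)
{
  std::ostringstream out;
  output_value(out, value);
  return out.str();
}
```

Built without `-DDEBUG`, this translation unit never includes `<iostream>`, yet it can still format into any stream the caller hands over.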

peterschrammel · 2020-09-23T16:08:30Z

src/pointer-analysis/goto_program_dereference.h

@@ -36,7 +36,7 @@ class goto_program_dereferencet:protected dereference_callbackt
    : options(_options),
      ns(_ns),
      value_sets(_value_sets),
-      dereference(_ns, _new_symbol_table, *this, ID_nil, false)
+      dereference(_ns, _new_symbol_table, *this, ID_nil, false, messaget())


❗ Passing a default messaget instance doesn't look right here. The instance should come from the caller.

Is this a valid construct? The points-to sets are displayed during symex only. exprt dereference(const exprt &pointer, bool display_points_to_sets = false); sets display_points_to_sets to false by default, so calls to value_set_dereferencet::dereference() from the goto program conversion phase do not display the points-to sets. Similarly, is it okay to set the messaget here to a default messaget instance for goto_program_dereferencet?

    goto_program_dereferencet(
      const namespacet &_ns,
      symbol_tablet &_new_symbol_table,
      const optionst &_options,
      value_setst &_value_sets,
      const messaget &_log = messaget())
      : options(_options),
        ns(_ns),
        value_sets(_value_sets),
        dereference(_ns, _new_symbol_table, *this, ID_nil, false, _log)
    {
    }

Passing a valid messaget object to goto_program_dereferencet here would need a lot of additional refactoring across the goto program conversion code modules.

Ok, thanks for the explanation. Please put a code comment to clarify why a default messaget instance is passed in this place. The next person stumbling over that code should not have to redo your reasoning. Many thanks!

Should I retain the current implementation, i.e. dereference(_ns, _new_symbol_table, *this, ID_nil, false, messaget()), or replace it with the construct above?
I will add in the comment.

The one above is probably slightly cleaner as it pushes the messaget further up. But, still, please put a comment in goto_program_dereferencet then.

Fixed. Comment added.
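The outcome of this thread (a defaulted log parameter plus a comment explaining it) can be illustrated with stand-in types. `loggert` and `dereference_drivert` are hypothetical names and do not mirror CBMC's messaget or goto_program_dereferencet beyond the general shape; the sketch also stores the logger by value, which is a choice made here to avoid keeping a reference to the temporary default argument.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a logger; a default-constructed one is silent.
class loggert
{
public:
  loggert() = default;

  explicit loggert(std::vector<std::string> *_out) : out(_out)
  {
  }

  void print(const std::string &text) const
  {
    if(out != nullptr)
      out->push_back(text);
  }

private:
  std::vector<std::string> *out = nullptr;
};

class dereference_drivert
{
public:
  // The log defaults to a silent instance because callers in phases that
  // never display points-to sets should not have to thread a logger through.
  // Callers that do want output pass their own logger.
  explicit dereference_drivert(const loggert &_log = loggert()) : log(_log)
  {
  }

  void dereference(
    const std::string &pointer,
    bool display_points_to_sets = false) const
  {
    if(display_points_to_sets)
      log.print("dereferencing " + pointer);
  }

private:
  loggert log; // held by value so the default temporary cannot dangle
};
```

The comment on the constructor plays the role the reviewer asks for: it records why a default instance is acceptable, so a later reader does not have to reconstruct the reasoning.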

peterschrammel · 2020-09-24T12:32:47Z

Approved assuming that the last two comments will be fixed before merging.


…eference The points to set information provides some intuition about the complexity of the resulting case split expression in case of a pointer dereference. This commit adds parse option --show-points-to-sets to track the size and contents of the points-to set for pointers. cpp lint bmc_util.h Add error message for plain and xml format output


peterschrammel reviewed Aug 26, 2020


martin-cs reviewed Aug 26, 2020


danielsn added the aws Bugs or features of importance to AWS CBMC users label Aug 26, 2020

natasha-jeppu marked this pull request as ready for review September 10, 2020 19:08

natasha-jeppu requested review from chrisr-diffblue, kroening, romainbrenguier, smowton and tautschnig as code owners September 10, 2020 19:08

natasha-jeppu mentioned this pull request Sep 16, 2020

Possible performance optimisation: replacing multiple pointer dereferences with temporary variable storing dereferenced pointer #5494

Closed

peterschrammel reviewed Sep 23, 2020


peterschrammel approved these changes Sep 24, 2020


danielsn force-pushed the pointer-alias branch from 61b5a6b to 46376b6 Compare November 18, 2020 17:58

Natasha Yogananda Jeppu added 11 commits November 19, 2020 10:39

Add show_points_to_sets to symex configuration to detect if CBMC pars…

be9b086

…e option holds This commit propagates parse option information to the symex stage. It is then used to display the points-to sets.

Add module to track points-to set information for pointer dereferenci…

bb6f89a

…ng during symex This commit displays the points-to set information in json format. Support for other formats are yet to be added.

Regression test for --show-points-to-sets --json-ui

4e7eb30

Add module to track points-to set information for pointer dereferenci…

caa506a

…ng during symex This commit displays the points-to set information in json format. Support for other formats are yet to be added. clang format value_set_dereference.cpp symex_dereference.cpp

Pass message handler to display points-to sets data

1a7dcea

clang format value_set_derefernce.h

clang format regression test

cafc143

clang format regression test

e7fb35e

Add regression test for xml and plain output

clang format regression test

c1444c9

Natasha Yogananda Jeppu added 4 commits November 19, 2020 10:39

Fix parse options handling for --show-points-to-sets

96c6ac8

fix #ifdef DEBUG in headers, keep iostream inside DEBUG tags

7c70957

Add comment explaining the use if default messaget instance in goto_p…

f179946

…rogram_dereferencet

danielsn force-pushed the pointer-alias branch from 46376b6 to f179946 Compare November 19, 2020 15:48

danielsn merged commit 639b9e9 into diffblue:develop Nov 19, 2020
