Cover failed assertions #5636

hannes-steffenhagen-diffblue · 2020-11-27T18:48:02Z

This adds an option --cover-failed-assertions that prevents the
default behaviour of coverage stopping at failed assertions by turning
assertions into skips rather than assumes (which is the default
behaviour for coverage criteria other than assertion, which behaves
the same with or without the flag).

This is one way to address #5543

Each commit message has a non-empty body, explaining why the change was made.
Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
My commit message includes data points confirming performance improvements (if claimed).
My PR is restricted to a single feature or bugfix.
White-space or formatting changes outside the feature-related changed lines are in commits of their own.

hannes-steffenhagen-diffblue · 2020-11-27T18:49:29Z

src/goto-instrument/cover.cpp

+        }
+        else
+        {
+          i_it->turn_into_skip();


Btw we have to skip them rather than just leaving them be because otherwise cbmc would consider these assertions coverage goals.
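
For illustration, the difference between the two rewrites, and why leaving the assertions untouched would inflate the goal count, can be sketched with a toy model. The names `kindt`, `rewrite_existing_assertions` and `count_goals_naively` are hypothetical stand-ins invented for this sketch; they are not the real goto-program classes or the code in `cover.cpp`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-ins for goto-program instruction kinds; not CBMC's real types.
enum class kindt { ASSERT, ASSUME, SKIP };

// Rewrite the assertions already present in the program.
// With the flag set they become SKIP, so execution continues past a failure;
// without it they become ASSUME, so paths violating them are cut off.
void rewrite_existing_assertions(
  std::vector<kindt> &program,
  bool cover_failed_assertions)
{
  for(auto &kind : program)
  {
    if(kind == kindt::ASSERT)
      kind = cover_failed_assertions ? kindt::SKIP : kindt::ASSUME;
  }
}

// A naive checker treats every assertion it finds as a coverage goal.
std::size_t count_goals_naively(const std::vector<kindt> &program)
{
  std::size_t goals = 0;
  for(const auto kind : program)
  {
    if(kind == kindt::ASSERT)
      ++goals;
  }
  return goals;
}

// Usage sketch: two pre-existing assertions would be counted as goals
// unless they are rewritten first.
void demo_rewrite()
{
  std::vector<kindt> program = {kindt::ASSERT, kindt::SKIP, kindt::ASSERT};
  assert(count_goals_naively(program) == 2);
  rewrite_existing_assertions(program, true);
  assert(count_goals_naively(program) == 0);
}
```

In this model the coverage goals themselves would be added as new assertions after the rewrite, so only they remain to be counted.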

hannes-steffenhagen-diffblue · 2020-11-27T19:10:58Z

src/cbmc/README.md

-proceeds the same as in all-properties mode. Coverage solving is implemented by
-`bmc_covert`, but is structurally practically identical to
+equation solver; In cases where this behaviour is undesirable you can pass the
+`--cover-failed-assertions` which makes coverage checking continue even for


BTW shouldn’t this file be under doc rather than src? I don’t really understand our documentation structure...

The READMEs which are in each module are pulled into here: http://cprover.diffblue.com/folder-walkthrough.html (click on cbmc, for example).
We decided to keep them in the source folders so that they are as close as possible to the code.

codecov · 2020-11-27T19:27:58Z

Codecov Report

Merging #5636 (bad0d16) into develop (3fec6e1) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff            @@
##           develop    #5636   +/-   ##
========================================
  Coverage    69.35%   69.35%           
========================================
  Files         1241     1241           
  Lines       100596   100601    +5     
========================================
+ Hits         69771    69776    +5     
  Misses       30825    30825

Flag	Coverage Δ
cproversmt2	`43.28% <100.00%> (+0.12%)`	⬆️
regression	`66.25% <100.00%> (+<0.01%)`	⬆️
unit	`32.27% <77.77%> (+<0.01%)`	⬆️

Impacted Files	Coverage Δ
jbmc/src/jdiff/jdiff_parse_options.cpp	`68.50% <ø> (ø)`
jbmc/src/jdiff/jdiff_parse_options.h	`100.00% <ø> (ø)`
src/cbmc/cbmc_parse_options.cpp	`77.62% <ø> (ø)`
src/cbmc/cbmc_parse_options.h	`100.00% <ø> (ø)`
src/goto-diff/goto_diff_parse_options.cpp	`66.20% <ø> (ø)`
src/goto-diff/goto_diff_parse_options.h	`100.00% <ø> (ø)`
src/goto-instrument/cover.h	`100.00% <ø> (ø)`
src/goto-instrument/cover.cpp	`85.31% <100.00%> (+0.53%)`	⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

martin-cs

Approve because the refactor is good. I would prefer it if the new option could be folded into the argument for --cover, and I would prefer it even more if we could make assertion transformation orthogonal to, and preceding, coverage assertions, or at least have a conversation about it. But I also realise that you need to get stuff done, so approve.

martin-cs · 2020-12-02T21:22:45Z

src/goto-instrument/cover.h

-  " --cover CC                   create test-suite with coverage criterion CC\n"
+#define OPT_COVER                                                              \
+  "(cover):"                                                                   \
+  "(cover-failed-assertions)"


Is it possible to have this as an argument to --cover?

martin-cs · 2020-12-02T21:23:49Z

src/goto-instrument/cover.h

@@ -22,6 +22,10 @@ class message_handlert;
 class cmdlinet;
 class optionst;

+#define OPT_COVER "(cover):"


Thank you for this refactoring. Finding and removing unnecessary and redundant repetition is always good.
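
As a rough sketch of why a shared macro removes the duplication: adjacent string literals are concatenated by the compiler, so each tool can splice the shared fragment into its own option string. The `*_EXAMPLE` macros and `declares_option` below are hypothetical names made up for this illustration; only the `"(cover):"` and `"(cover-failed-assertions)"` fragments come from the diff.

```cpp
#include <cassert>
#include <cstring>

// Shared fragment, in the spirit of OPT_COVER in cover.h.
#define OPT_COVER_EXAMPLE                                                      \
  "(cover):"                                                                   \
  "(cover-failed-assertions)"

// Hypothetical option strings for two tools; only the fragment is shared.
#define TOOL_A_OPTIONS_EXAMPLE "(version)" OPT_COVER_EXAMPLE "(trace)"
#define TOOL_B_OPTIONS_EXAMPLE OPT_COVER_EXAMPLE "(verbosity):"

// True if `option_string` contains the given "(name)" entry.
bool declares_option(const char *option_string, const char *entry)
{
  return std::strstr(option_string, entry) != nullptr;
}

// Usage sketch: both tools pick up the new option from the one definition.
void demo_shared_options()
{
  assert(declares_option(TOOL_A_OPTIONS_EXAMPLE, "(cover-failed-assertions)"));
  assert(declares_option(TOOL_B_OPTIONS_EXAMPLE, "(cover-failed-assertions)"));
}
```

Adding an option to the shared fragment then reaches every tool that uses it, which is also why the new flag becomes visible in more than one executable.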

martin-cs · 2020-12-02T21:36:12Z

src/cbmc/README.md

-There is one further BMC mode that differs more fundamentally: when `--cover` is
-passed, assertions in the program text are converted into assumptions (these
+There is one further BMC mode that differs more fundamentally: when `--cover`
+is passed, assertions in the program text are converted into assumptions (these


I don't want to open a massive can of worms here but ... why is this the default? Is this actually useful? Given that there are options for --no-assertions, --no-assumptions and --assert-to-assume (whether they work or even do anything at all ... is an open question), could we make the transformation of assertions and assumptions orthogonal to --cover? It feels like this should be possible and even desirable...

@martin-cs I think this is the default because whoever wrote this had the expectation that programs should stop at failed assertions. I suppose we could change it, but I’m always a bit wary about changing defaults.

@martin-cs FWIW because cover treats each failed goal as something relating to coverage right now we have to do something with asserts. That said, I think the alternative (just making cover cleverer about how it counts coverage) would be more desirable.

I'd consider that as something we can take a look at in terms of future work though.

Yes, if you are doing a naive count then, I see the need to do something with pre-existing assertions or otherwise true assertions (generally a good thing) will count against your coverage. Smarter counting would certainly be one way to address this.

It feels to me that making the "doing something about asserts" independent might be a desirable option as well. I can see use-cases for converting to SKIP, to ASSUME and leaving as they are.

Then again, I am not a user of this code at the moment and do not have time to work on it, so my views may not be so important.

The tests fail when the `--paths` flag is set. Our understanding is that the paths code hasn't been exercised thoroughly, so this failure may or may not indicate a real issue; we need to investigate this at a later point.

peterschrammel · 2020-12-14T11:08:12Z

regression/validate-trace-xml-schema/check.py

@@ -34,6 +34,9 @@
    ['unknown-argument-suggestion', 'test.desc'],
    # this one produces XML intermingled with main XML output when used with --xml-ui
    ['graphml_witness2', 'test.desc'],
+    # these are producing coverage goals which aren't including in the schema


including -> included?

peterschrammel · 2020-12-14T11:14:55Z

src/goto-instrument/cover.cpp

+        }
+        else
+        {
+          i_it->turn_into_skip();


Why not make --no-assertions (aka turn-assertions-into-skips) useable with --cover then (as @martin-cs suggests)? That would make the behaviour more transparent to the user.

hannes-steffenhagen-diffblue · 2020-12-14T16:22:54Z

@peterschrammel --no-assertions is handled in goto_check, and it only turns user-provided assertions into skips. What we'd want for cover is to turn all non-cover related assertions into skips, user added or not. Making it work differently depending on whether or not --cover is passed would be confusing.

That said, maybe the better solution would be to make sure cover only checks cover-added assertions for coverage metrics instead of this skip-workaround...
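
A minimal sketch of that alternative, assuming each assertion carries a tag saying where it came from: only assertions tagged as coverage goals are counted, so the others would not need to be skipped at all. The struct `assertiont`, the field `property_class` and the tag value `"coverage"` are assumptions made for this illustration, not a description of how cbmc stores or counts goals.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy assertion record; `property_class` is an illustrative tag, assumed here
// to be "coverage" for goals added by the coverage instrumentation.
struct assertiont
{
  std::string property_class;
};

// Count only the assertions that were added as coverage goals.
std::size_t count_coverage_goals(const std::vector<assertiont> &assertions)
{
  std::size_t goals = 0;
  for(const auto &assertion : assertions)
  {
    if(assertion.property_class == "coverage")
      ++goals;
  }
  return goals;
}

// Usage sketch: user and generated checks are ignored by the count.
void demo_count_coverage_goals()
{
  const std::vector<assertiont> assertions = {
    {"coverage"}, {"assertion"}, {"pointer dereference"}, {"coverage"}};
  assert(count_coverage_goals(assertions) == 2);
}
```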

peterschrammel · 2020-12-14T16:28:23Z

...unless you use --cover assertion.

hannes-steffenhagen-diffblue · 2020-12-14T17:00:08Z

I don’t think the behaviour is any different for --cover assertion? It still only removes user assertions (from what I can tell --cover assertion doesn’t care specifically about this option at all; this is all happening in goto check), and the original complaint this is meant to address is that coverage changes based on whether or not you pass in the various *-check flags (which I think may well be expected behaviour in some circumstances, but is clearly not the only reasonable behaviour).

It’s just that (only) removing user assertions is actually what we want for --cover assertion I guess.

thomasspriggs

Seems reasonable. If it was up to me, I would squash the commits fixing the tests back into the commit where they were added. This is because it could make use of git bisect easier, if we ever need to use it. Please fix the spelling error noted by Peter before this is merged.

thomasspriggs · 2020-12-14T14:57:05Z

src/cbmc/cbmc_parse_options.h

@@ -73,8 +75,9 @@ class optionst;
  "(property):(stop-on-fail)(trace)" \
  "(error-label):(verbosity):(no-library)" \
  "(nondet-static)" \
-  "(version)" \
-  "(cover):(symex-coverage-report):" \
+  "(version)"        \


❔ Is this additional spacing accidental?

thomasspriggs · 2020-12-15T18:06:06Z

regression/cbmc/cover-failed-assertions/test-no-failed-assertions.desc

+^EXIT=0$
+^SIGNAL=0$
+--
+^warning: ignoring


⛏️ It would be nice to have a comment in each of the .desc files explaining what the test is intended to test. Even something pretty short could make this a whole lot easier to maintain.

thomasspriggs · 2020-12-15T18:09:52Z

src/goto-instrument/cover.h

+#define HELP_COVER                                                             \
+  " --cover CC                   create test-suite with coverage criterion "   \
+  "CC\n"                                                                       \
+  " --cover-failed-assertions    do not stop coverage checking at failed "     \


⛏️ Because of the preceding refactor, this option now applies to jdiff and goto-diff as well as cbmc. This means that we should probably test that it works with those entry points.

thomasspriggs · 2020-12-15T18:11:41Z

src/goto-instrument/cover.h

-  " --cover CC                   create test-suite with coverage criterion CC\n"
+#define OPT_COVER                                                              \
+  "(cover):"                                                                   \
+  "(cover-failed-assertions)"


⛏️ If we keep this as a separate option from --cover then we should check that --cover-failed-assertions is not passed without --cover and exit with a suitable error message in the case of that particular invalid usage.
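
One possible shape for such a check, sketched without the real command-line classes: `check_cover_options` and `is_rejected` are hypothetical helpers, a `std::set` of option names stands in for the parsed command line, and the error text is only an example.

```cpp
#include <cassert>
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of CBMC: reject --cover-failed-assertions
// when --cover was not given. `set_options` stands in for the parsed
// command line and holds the names of the options that were passed.
void check_cover_options(const std::set<std::string> &set_options)
{
  const bool has_cover = set_options.count("cover") != 0;
  const bool has_failed = set_options.count("cover-failed-assertions") != 0;
  if(has_failed && !has_cover)
  {
    throw std::invalid_argument(
      "--cover-failed-assertions requires --cover");
  }
}

// Returns true if the combination of options is rejected.
bool is_rejected(const std::set<std::string> &set_options)
{
  try
  {
    check_cover_options(set_options);
    return false;
  }
  catch(const std::invalid_argument &)
  {
    return true;
  }
}

// Usage sketch: only the flag on its own is an error.
void demo_check_cover_options()
{
  assert(is_rejected({"cover-failed-assertions"}));
  assert(!is_rejected({"cover", "cover-failed-assertions"}));
}
```

A real implementation would report the error through whatever mechanism the tool already uses for invalid command lines rather than a bare exception.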

peterschrammel

I understand that this needs a quick fix. But I don't think that the current solution will be the last. The options around goto check assertions, user assertions, cover, the various --no-X options, etc. need to be made consistent so that they can be used in a clearly understandable and orthogonal way without subtle side effects.

martin-cs · 2020-12-18T12:21:28Z

I realise this is easy for me to say as 1. it's not me doing the work and 2. it's not me in need of the urgent fix but... +1 agree with @peterschrammel . This has been a mess for too long. I will leave it up to others (those that 1 and 2 apply to) to decide how this PR should relate to that work.

Extract constants for --cover help message and options

cefd3ce

These were previously duplicated between 3 different executables.

hannes-steffenhagen-diffblue requested review from chrisr-diffblue, kroening, martin-cs, peterschrammel, smowton and tautschnig as code owners November 27, 2020 18:48

hannes-steffenhagen-diffblue mentioned this pull request Nov 27, 2020

Omitting property checking flags changes coverage checking #5543

Closed

hannes-steffenhagen-diffblue force-pushed the feature/cover-failed-assertions branch from d0d6748 to a442dbb Compare November 27, 2020 19:01

Update cbmc README.md with --cover-failed-assertions flag

d3bfa2b

martin-cs approved these changes Dec 2, 2020

NlightNFotis and others added 3 commits December 7, 2020 11:31

Update regexes in test to match actual lines matched.

b3d0bfa

Do not check XML traces on --cover tests

a86ffc7

peterschrammel reviewed Dec 14, 2020

thomasspriggs approved these changes Dec 15, 2020

hannes-steffenhagen-diffblue added 3 commits December 17, 2020 14:26

Fix typo

e93654c

Fix spacing

3d21a85

Add comments to tests

bad0d16

peterschrammel approved these changes Dec 18, 2020

hannes-steffenhagen-diffblue merged commit 932599e into diffblue:develop Jan 8, 2021
