Skip to content

Regression tests for the SMT bitvector arithmetic operator conversion #6648

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

NlightNFotis
Copy link
Contributor

This PR adds regression tests for the code added in #6614 .

Previously the code was exercises by unit tests. Regression tests
are allowing for more comfort around the correctness of the implementation
by testing in an end-to-end manner (from source code to verification result)

  • Each commit message has a non-empty body, explaining why the change was made.
  • Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • White-space or formatting changes outside the feature-related changed lines are in commits of their own.

Copy link
Contributor

@TGWDB TGWDB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mixing in of the Windows pipe fix is kind of separate, I understand the motivation for adding it here, but I'm not sure it's the best idea.

There are three things I think would improve this PR a lot.

  1. Regression tests for the unary arithmetic operators as well.
  2. Splitting the tests of the behaviour/outcome from the tests of the translation. At the moment there is no check of the actual results, only VERIFICATION FAILED for all of them.
  3. The detailed checking of the translation is very brittle, I can see this easily failing if we make a minor tweak or improvement somewhere, and I don't think the detail of the verbose SMT form is the main thing to test?

@codecov
Copy link

codecov bot commented Feb 8, 2022

Codecov Report

Merging #6648 (b900b19) into develop (afc53d8) will increase coverage by 0.01%.
The diff coverage is n/a.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #6648      +/-   ##
===========================================
+ Coverage    76.73%   76.74%   +0.01%     
===========================================
  Files         1579     1579              
  Lines       181999   182171     +172     
===========================================
+ Hits        139652   139806     +154     
- Misses       42347    42365      +18     
Impacted Files Coverage Δ
src/util/piped_process.cpp 81.48% <ø> (ø)
src/util/exception_utils.h 92.85% <0.00%> (-7.15%) ⬇️
src/solvers/flattening/boolbv_add_sub.cpp 48.33% <0.00%> (-3.06%) ⬇️
src/ansi-c/c_expr.h 100.00% <0.00%> (ø)
src/ansi-c/expr2c.cpp 65.94% <0.00%> (ø)
src/ansi-c/c_typecheck_base.h 100.00% <0.00%> (ø)
src/solvers/flattening/boolbv.h 62.50% <0.00%> (ø)
src/solvers/flattening/bv_utils.h 85.71% <0.00%> (ø)
src/solvers/smt2/smt2_tokenizer.h 96.96% <0.00%> (ø)
src/goto-programs/slice_global_inits.h 0.00% <0.00%> (ø)
... and 15 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update afc53d8...b900b19. Read the comment docs.

Sending command to SMT2 solver - \(check-sat\)
Sending command to SMT2 solver - \(get-value \(|B9|\)\)
Sending command to SMT2 solver - \(get-value \(|B10|\)\)
^VERIFICATION FAILED$
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For VERIFICATION FAILED tests we should add --trace to the arguments add use regexes to check the values of interest assigned in the trace.

@@ -0,0 +1,16 @@
#include <stdbool.h>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ Comment on the commit title - "second-degree polynomial" is a pretty long-winded way of saying "quadratic".

Sending command to SMT2 solver - \(get-value \(|B7|\)\)
Sending command to SMT2 solver - \(get-value \(|B8|\)\)
Sending command to SMT2 solver - \(get-value \(|B9|\)\)
^VERIFICATION FAILED$
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this test stands with multiple assertions where we (presumably) expect all of them to be refutable we should add regexes for each of them. Otherwise the test could still pass with a single assertion being refuted only.

int a;

// Simple algebraic identities - expected to drive the SMT backend
__CPROVER_assert(!(a + a == a * 2), "a plus a always equals two times a");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 This might be a better/stronger regression test if we remove the ! and check the properties we expect to hold cannot be refuted and that CBMC reports VERIFICATION SUCCESS.

unsigned int b = 12;
__CPROVER_assume(a > 15);

// expected to fail by assigning a = 4294967284u in the trace
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These values appear to be platform specific, given that the width of unsigned int is platform specific. I think it would be better if we included <stdint.h> and used types such as uint32_t, so that the values are the same across all platforms.

@NlightNFotis NlightNFotis force-pushed the smt_arithops_regression_tests branch 3 times, most recently from de68b68 to 0fbca50 Compare February 9, 2022 16:24
@NlightNFotis NlightNFotis force-pushed the smt_arithops_regression_tests branch from 0fbca50 to 1519d40 Compare February 10, 2022 11:15
These are expected to exercise that implements conversion of
arithmetic operators for signed ints, including integers of
multiple bit widths (using `long long` and `char` for now,
because `stdint.h` introduces typecasts and we don't support
them yet).
@NlightNFotis NlightNFotis force-pushed the smt_arithops_regression_tests branch from 1519d40 to b900b19 Compare February 11, 2022 10:02
@NlightNFotis NlightNFotis merged commit d5ff1ba into diffblue:develop Feb 11, 2022
@NlightNFotis NlightNFotis deleted the smt_arithops_regression_tests branch February 11, 2022 11:32
Comment on lines +351 to 377
const auto write_file = [&]() {
return WriteFile(
child_std_IN_Wr, message.c_str(), message_size, &bytes_written, NULL);
};
if(!write_file())
{
// Error handling with GetLastError ?
return send_responset::FAILED;
}
// `WriteFile` can return a success status but write 0 bytes if we write
// messages quickly enough. This problem has been observed when using a
// release build, but resolved when using a debug build or `--verbosity 10`.
// Presumably this happens if cbmc gets too far ahead of the sub process.
// Flushing the buffer and retrying should then succeed to write the message
// in this case.
if(bytes_written == 0)
{
FlushFileBuffers(child_std_IN_Wr);
if(!write_file())
{
// Error handling with GetLastError ?
return send_responset::FAILED;
}
}
INVARIANT(
message_size == bytes_written,
"Number of bytes written to sub process must match message size.");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems this part needs more work as regression tests are repeatedly failing that invariant.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up PR is here -#6675

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants