Skip to content

Add support for conversion of pointer arithmetic expressions to new SMT backend. #6866

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

NlightNFotis
Copy link
Contributor

This PR adds the capability to handle expressions like the following:

int *a;
int *b;

a + 1;
a + (2 * sizeof(int)
a - 2;
a - b;

to the new SMT backend, along with regression tests and unit tests.

  • Each commit message has a non-empty body, explaining why the change was made.
  • Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • White-space or formatting changes outside the feature-related changed lines are in commits of their own.

@codecov
Copy link

codecov bot commented May 18, 2022

Codecov Report

Merging #6866 (95cab0d) into develop (bcca9da) will increase coverage by 0.01%.
The diff coverage is 98.24%.

❗ Current head 95cab0d differs from pull request most recent head 34b2985. Consider uploading reports for the commit 34b2985 to get more accurate results

@@             Coverage Diff             @@
##           develop    #6866      +/-   ##
===========================================
+ Coverage    77.80%   77.81%   +0.01%     
===========================================
  Files         1567     1568       +1     
  Lines       179916   180023     +107     
===========================================
+ Hits        139988   140093     +105     
- Misses       39928    39930       +2     
Impacted Files Coverage Δ
..._incremental/smt2_incremental_decision_procedure.h 75.00% <ø> (ø)
...t/solvers/smt2_incremental/convert_expr_to_smt.cpp 99.65% <96.55%> (-0.23%) ⬇️
...c/solvers/smt2_incremental/convert_expr_to_smt.cpp 87.67% <100.00%> (+0.66%) ⬆️
.../solvers/smt2_incremental/pointer_size_mapping.cpp 100.00% <100.00%> (ø)
...ncremental/smt2_incremental_decision_procedure.cpp 94.73% <100.00%> (+0.11%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 679ab3f...34b2985. Read the comment docs.

Copy link
Contributor

@thomasspriggs thomasspriggs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial comments.

auto pointer_base_type = pointer_type.base_type();
exprt pointer_size_expr;
// There's a special case for a pointer subtype here: the case where the pointer is `void *`. This means
// that we don't know the underlying base type, so we're just assigning a size expression value of 1 (given
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ void * does not mean that "we don't know the underlying base type". It means the type is not defined or "is empty". For example the return type of malloc is void * and the memory to which the returned pointer refers to will have no type associated. Your explanation of why we are using 1 explains how this affects the maths which we are using but not why (or why not) it is the correct thing to do. Critically, pointer arithmetic on void pointers is disallowed by the C standard because void does not have a size. Using a size of 1 is actually a GCC extension. Source references - http://www.c-faq.com/ansi/voidparith.html and https://gcc.gnu.org/onlinedocs/gcc-4.8.0/gcc/Pointer-Arith.html

Copy link
Contributor

@thomasspriggs thomasspriggs May 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still unaddressed.


#include <unordered_map>

using pointer_size_mapt = std::unordered_map<typet, smt_termt, irep_full_hash>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we agreed to call this a type_size_map rather than a pointer size map? The keys of the map are expected to be the types the pointers point to, not the pointers themselves or the whole pointer type. I don't want to tie the naming of this map to pointers only, because we may want the type size information for other purposes as we add more functionality to the new decision procedure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in the refactor commits.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you rename the file to match as well please?

Copy link
Contributor Author

@NlightNFotis NlightNFotis May 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

/// expression argument.
/// \param expression: the expression we're building the map for.
/// \param ns: a namespace - passed to size_of_expr to find expression sizes.
/// \param pointer_size_map: the map containing the pointer.base_type() -> size (in bytes) mappings.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ Its worth stating that this function adds the mappings to this map. Your current document doesn't really explain that this is an "output" parameter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 658672c

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure what you mean by "Initially empty, by the function.". This function does not empty the map and the map may or may not contain existing entries when it is called. I suggest -

/// \param type_size_map:
///   A map of types to terms expressing the size of the type (in bytes). This
///   function adds new entries to the map for instances of pointer.base_type()
///   from \p expression which are not already keys in the map.

/// Establish pointer-sizes map for all pointers present in the
/// expression argument.
/// \param expression: the expression we're building the map for.
/// \param ns: a namespace - passed to size_of_expr to find expression sizes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ Its worth noting that the name space is needed for looking up (following) type symbols in the case where pointers have tag_typet, rather than a more completely defined type. Ideally the size_of_expr would have doxygen as well, but I am happy to consider that to be "off topic" for this PR.

@NlightNFotis NlightNFotis force-pushed the smt_pointer_arithmetic_conversion_final branch from 38bbc27 to 41f27a9 Compare May 19, 2022 12:48
@NlightNFotis
Copy link
Contributor Author

This PR is now actively being worked on, and includes a small number of commits that are going to be squashed away or spun out into their own PRs.


using type_size_mapt = std::unordered_map<typet, smt_termt, irep_full_hash>;

/// This function creates a map of types to their related sizes (in bytes).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This documentation is inaccurate as the function adds to the map rather than creating/constructing it. The map is (empty) constructed before this function is called.

/// expression argument.
/// \param expression: the expression we're building the map for.
/// \param ns: a namespace - passed to size_of_expr to find expression sizes.
/// \param pointer_size_map: the map containing the pointer.base_type() -> size (in bytes) mappings.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure what you mean by "Initially empty, by the function.". This function does not empty the map and the map may or may not contain existing entries when it is called. I suggest -

/// \param type_size_map:
///   A map of types to terms expressing the size of the type (in bytes). This
///   function adds new entries to the map for instances of pointer.base_type()
///   from \p expression which are not already keys in the map.


pointer_typet pointer_type =
*type_try_dynamic_cast<pointer_typet>(pointer.type());
auto base_type = pointer_type.base_type();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ I would probably inline pointer_type.base_type() into the call to at. But this should at least be const if you are going to introduce a new variable which is not mutated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of the related definitions have been consted.

@@ -0,0 +1,13 @@
#include <stdint.h>

#define NULL (void *)0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ I would prefer that one of the standard library definitions of NULL was used instead of defining it in the test, in order to avoid re-definition errors.

{
int *x;
int *y;
int *z = x - y;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

z should be of int type rather than int * type. The result of x - y is an integer not a pointer, so because you declared z as a pointer then the integer result of the subtraction is being implicitly cast back to a pointer. Because you have assumed that x != y the property that z != 0 should hold in a subsequent assertion.

SECTION("Addition of a pointer and a constant")
{
// (int32_t *)a + 2
const auto pointer_arith_expr = plus_exprt{pointer_a, two_bvint_32bit};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 I suggest using catchs INFO macro to add information of the pretty printing of pointer_arith_expr, so that more information is provided if this test fails in future.

const auto constructed_term =
test.convert(minus_exprt{pointer_b, pointer_a});
const auto expected_term =
smt_bit_vector_theoryt::subtract(smt_term_b, smt_term_a);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks to be missing the division of the subtraction result by the size.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fixed in a following commit?

{
int *x = malloc(sizeof(int));
float *y = x + 3;
int z = y - x;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This subtraction is actually non-compiling code if I run it through gcc. If I run cbmc with the sat backend then this code is currently accepted and analysed. As this is not valid code it should really be detected in the C type checking performed as part of the C front end. Adding such front-end checks would be outside of the scope of this PR however.

@NlightNFotis NlightNFotis force-pushed the smt_pointer_arithmetic_conversion_final branch 2 times, most recently from 6c7c885 to 26f5a62 Compare May 27, 2022 11:17
Copy link
Contributor

@thomasspriggs thomasspriggs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the serious issues have all been resolved.

⛏️ It could do with a regression test to check on the result of subtracting two pointers is as we expect in the case where the resulting integer is negative. That is because this is the case where the kind of division used is important. The wrong kind of division would result in a value of the wrong sign and far larger than expected.

@NlightNFotis NlightNFotis force-pushed the smt_pointer_arithmetic_conversion_final branch from 26f5a62 to c27ba48 Compare May 27, 2022 15:34
@NlightNFotis NlightNFotis force-pushed the smt_pointer_arithmetic_conversion_final branch from c27ba48 to a441756 Compare May 28, 2022 10:18
@NlightNFotis NlightNFotis force-pushed the smt_pointer_arithmetic_conversion_final branch from a441756 to 34b2985 Compare May 28, 2022 10:21
@NlightNFotis NlightNFotis merged commit 5303a9d into diffblue:develop May 28, 2022
@NlightNFotis NlightNFotis deleted the smt_pointer_arithmetic_conversion_final branch May 28, 2022 11:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants