Add support for conversion of pointer arithmetic expressions to new SMT backend. #6866

NlightNFotis · 2022-05-18T16:37:57Z

This PR adds the capability to handle expressions like the following:

int *a;
int *b;

a + 1;
a + (2 * sizeof(int));
a - 2;
a - b;

to the new SMT backend, along with regression tests and unit tests.
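
As a rough illustration of the arithmetic such a conversion has to encode, here is a minimal C sketch that works on integer addresses. It is not the backend's code: the helper names `scaled_add` and `scaled_diff` are hypothetical, and the sketch assumes a flat address space and two's complement conversion between `uintptr_t` and `ptrdiff_t`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: address of `base + n` for elements of `elem_size`
   bytes. The element offset is scaled by the element size before it is
   added to the address. */
uintptr_t scaled_add(uintptr_t base, ptrdiff_t n, size_t elem_size)
{
  assert(elem_size != 0);
  return base + (uintptr_t)(n * (ptrdiff_t)elem_size);
}

/* Hypothetical helper: element distance `a - b`. The byte difference is
   interpreted as a signed value and divided by the element size. */
ptrdiff_t scaled_diff(uintptr_t a, uintptr_t b, size_t elem_size)
{
  assert(elem_size != 0);
  return (ptrdiff_t)(a - b) / (ptrdiff_t)elem_size;
}
```

With 4-byte elements, `scaled_add(1000, 1, 4)` yields 1004 and `scaled_diff(1000, 1012, 4)` yields -3, matching what `a + 1` and `a - b` mean for `int *` operands on such a platform.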

Each commit message has a non-empty body, explaining why the change was made.
Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
My commit message includes data points confirming performance improvements (if claimed).
My PR is restricted to a single feature or bugfix.
White-space or formatting changes outside the feature-related changed lines are in commits of their own.

codecov · 2022-05-18T17:53:51Z

Codecov Report

Merging #6866 (95cab0d) into develop (bcca9da) will increase coverage by 0.01%.
The diff coverage is 98.24%.

❗ Current head 95cab0d differs from pull request most recent head 34b2985. Consider uploading reports for the commit 34b2985 to get more accurate results

@@             Coverage Diff             @@
##           develop    #6866      +/-   ##
===========================================
+ Coverage    77.80%   77.81%   +0.01%     
===========================================
  Files         1567     1568       +1     
  Lines       179916   180023     +107     
===========================================
+ Hits        139988   140093     +105     
- Misses       39928    39930       +2

Impacted Files	Coverage Δ
..._incremental/smt2_incremental_decision_procedure.h	`75.00% <ø> (ø)`
...t/solvers/smt2_incremental/convert_expr_to_smt.cpp	`99.65% <96.55%> (-0.23%)`	⬇️
...c/solvers/smt2_incremental/convert_expr_to_smt.cpp	`87.67% <100.00%> (+0.66%)`	⬆️
.../solvers/smt2_incremental/pointer_size_mapping.cpp	`100.00% <100.00%> (ø)`
...ncremental/smt2_incremental_decision_procedure.cpp	`94.73% <100.00%> (+0.11%)`	⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

thomasspriggs

Some initial comments.

thomasspriggs · 2022-05-18T19:07:48Z

src/solvers/smt2_incremental/pointer_size_mapping.cpp

+    auto pointer_base_type = pointer_type.base_type();
+    exprt pointer_size_expr;
+    // There's a special case for a pointer subtype here: the case where the pointer is `void *`. This means
+    // that we don't know the underlying base type, so we're just assigning a size expression value of 1 (given


⛏️ void * does not mean that "we don't know the underlying base type". It means the pointed-to type is not defined, or "is empty". For example, the return type of malloc is void * and the memory to which the returned pointer refers has no type associated with it. Your explanation of why we are using 1 explains how this affects the maths we are using, but not why (or why not) it is the correct thing to do. Critically, pointer arithmetic on void pointers is disallowed by the C standard because void does not have a size. Using a size of 1 is actually a GCC extension. Source references: http://www.c-faq.com/ansi/voidparith.html and https://gcc.gnu.org/onlinedocs/gcc-4.8.0/gcc/Pointer-Arith.html

This is still unaddressed.
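
For readers unfamiliar with the distinction, the following is a minimal sketch of how byte-wise stepping over untyped memory can be written without relying on the GCC extension. The helper name `advance_bytes` is hypothetical and is not part of this PR.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: advance a void pointer by `n` bytes.
   ISO C does not allow `p + n` when p has type `void *`, because void
   has no size. GCC accepts it as an extension and treats the size as 1.
   Casting to `char *` makes that byte-sized step explicit and portable. */
void *advance_bytes(void *p, size_t n)
{
  assert(p != NULL);
  return (char *)p + n;
}
```

Under the GCC extension, `p + n` on a `void *` gives the same address as `advance_bytes(p, n)`, which is the behaviour a size expression of 1 reproduces.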

thomasspriggs · 2022-05-18T19:13:44Z

src/solvers/smt2_incremental/pointer_size_mapping.h

+
+#include <unordered_map>
+
+using pointer_size_mapt = std::unordered_map<typet, smt_termt, irep_full_hash>;


I thought we agreed to call this a type_size_map rather than a pointer size map? The keys of the map are expected to be the types the pointers point to, not the pointers themselves or the whole pointer type. I don't want to tie the naming of this map to pointers only, because we may want the type size information for other purposes as we add more functionality to the new decision procedure.

Addressed in the refactor commits.

Could you rename the file to match as well please?

thomasspriggs · 2022-05-18T19:39:51Z

src/solvers/smt2_incremental/pointer_size_mapping.h

+/// expression argument.
+/// \param expression: the expression we're building the map for.
+/// \param ns: a namespace - passed to size_of_expr to find expression sizes.
+/// \param pointer_size_map: the map containing the pointer.base_type() -> size (in bytes) mappings.


⛏️ It's worth stating that this function adds the mappings to this map. Your current documentation doesn't really explain that this is an "output" parameter.

Addressed in 658672c

I am not sure what you mean by "Initially empty, by the function.". This function does not empty the map and the map may or may not contain existing entries when it is called. I suggest -

/// \param type_size_map:
///   A map of types to terms expressing the size of the type (in bytes). This
///   function adds new entries to the map for instances of pointer.base_type()
///   from \p expression which are not already keys in the map.

thomasspriggs · 2022-05-18T19:46:51Z

src/solvers/smt2_incremental/pointer_size_mapping.h

+/// Establish pointer-sizes map for all pointers present in the
+/// expression argument.
+/// \param expression: the expression we're building the map for.
+/// \param ns: a namespace - passed to size_of_expr to find expression sizes.


⛏️ It's worth noting that the namespace is needed for looking up (following) type symbols in the case where pointers have tag_typet, rather than a more completely defined type. Ideally size_of_expr would have doxygen as well, but I am happy to consider that to be "off topic" for this PR.

NlightNFotis · 2022-05-19T13:58:48Z

This PR is now actively being worked on, and includes a small number of commits that are going to be squashed away or spun out into their own PRs.

thomasspriggs · 2022-05-26T10:21:13Z

src/solvers/smt2_incremental/pointer_size_mapping.h

+
+using type_size_mapt = std::unordered_map<typet, smt_termt, irep_full_hash>;
+
+/// This function creates a map of types to their related sizes (in bytes).


This documentation is inaccurate as the function adds to the map rather than creating/constructing it. The map is (empty) constructed before this function is called.

thomasspriggs · 2022-05-26T16:03:15Z

src/solvers/smt2_incremental/pointer_size_mapping.h

+/// expression argument.
+/// \param expression: the expression we're building the map for.
+/// \param ns: a namespace - passed to size_of_expr to find expression sizes.
+/// \param pointer_size_map: the map containing the pointer.base_type() -> size (in bytes) mappings.


I am not sure what you mean by "Initially empty, by the function.". This function does not empty the map and the map may or may not contain existing entries when it is called. I suggest -

/// \param type_size_map:
///   A map of types to terms expressing the size of the type (in bytes). This
///   function adds new entries to the map for instances of pointer.base_type()
///   from \p expression which are not already keys in the map.

thomasspriggs · 2022-05-26T16:09:48Z

src/solvers/smt2_incremental/convert_expr_to_smt.cpp

+
+    pointer_typet pointer_type =
+      *type_try_dynamic_cast<pointer_typet>(pointer.type());
+    auto base_type = pointer_type.base_type();


⛏️ I would probably inline pointer_type.base_type() into the call to at. But this should at least be const if you are going to introduce a new variable which is not mutated.

All of the related definitions have been consted.

thomasspriggs · 2022-05-26T16:16:08Z

regression/cbmc-incr-smt2/pointer_arithmetic/addition_compound_expr.c

@@ -0,0 +1,13 @@
+#include <stdint.h>
+
+#define NULL (void *)0


⛏️ I would prefer that one of the standard library definitions of NULL was used instead of defining it in the test, in order to avoid re-definition errors.

thomasspriggs · 2022-05-26T16:27:14Z

regression/cbmc-incr-smt2/pointer_arithmetic/pointer_subtraction.c

+{
+  int *x;
+  int *y;
+  int *z = x - y;


z should be of int type rather than int * type. The result of x - y is an integer, not a pointer; because z is declared as a pointer, the integer result of the subtraction is implicitly cast back to a pointer. Because you have assumed that x != y, the property that z != 0 should hold in a subsequent assertion.
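
A minimal sketch of the point about the result type, using a hypothetical helper `element_distance`. In plain C, pointer subtraction is only defined for pointers into the same array object, so the sketch assumes both arguments satisfy that; the regression test itself relies on CBMC's treatment of uninitialised pointers instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: element distance between two pointers into the
   same array. Pointer subtraction yields the signed integer type
   ptrdiff_t, so the result is stored in an integer, not in an `int *`. */
ptrdiff_t element_distance(const int *x, const int *y)
{
  assert(x != NULL && y != NULL);
  return x - y;
}
```

For two distinct elements of one `int` array the result is non-zero, which is the property the comment above expects to hold once `x != y` is assumed.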

thomasspriggs · 2022-05-26T16:34:01Z

unit/solvers/smt2_incremental/convert_expr_to_smt.cpp

+  SECTION("Addition of a pointer and a constant")
+  {
+    // (int32_t *)a + 2
+    const auto pointer_arith_expr = plus_exprt{pointer_a, two_bvint_32bit};


💡 I suggest using Catch's INFO macro to add the pretty-printed form of pointer_arith_expr, so that more information is provided if this test fails in future.

thomasspriggs · 2022-05-26T16:39:01Z

unit/solvers/smt2_incremental/convert_expr_to_smt.cpp

+    const auto constructed_term =
+      test.convert(minus_exprt{pointer_b, pointer_a});
+    const auto expected_term =
+      smt_bit_vector_theoryt::subtract(smt_term_b, smt_term_a);


This looks to be missing the division of the subtraction result by the size.

I think this is fixed in a following commit?

thomasspriggs · 2022-05-26T17:00:46Z

regression/cbmc-incr-smt2/pointer_arithmetic/pointer_subtraction_diff_types.c

+{
+  int *x = malloc(sizeof(int));
+  float *y = x + 3;
+  int z = y - x;


This subtraction is actually non-compiling code if I run it through gcc. If I run cbmc with the sat backend then this code is currently accepted and analysed. As this is not valid code it should really be detected in the C type checking performed as part of the C front end. Adding such front-end checks would be outside of the scope of this PR however.

thomasspriggs

Looks like the serious issues have all been resolved.

⛏️ It could do with a regression test to check that the result of subtracting two pointers is as we expect in the case where the resulting integer is negative. That is because this is the case where the kind of division used is important. The wrong kind of division would result in a value of the wrong sign and far larger than expected.
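
To make the concern concrete, here is a small sketch on 32-bit values. The function names are hypothetical; the two functions correspond to the signed and unsigned bit-vector division operators (bvsdiv and bvudiv in SMT-LIB), and the sketch makes no claim about which one the backend emits at any given commit.

```c
#include <assert.h>
#include <stdint.h>

/* Byte difference divided by the element size using signed division. */
int32_t diff_signed(int32_t byte_diff, int32_t elem_size)
{
  assert(elem_size > 0);
  return byte_diff / elem_size;
}

/* The same bit pattern divided using unsigned division. A negative byte
   difference is reinterpreted as a very large unsigned number first. */
uint32_t diff_unsigned(int32_t byte_diff, int32_t elem_size)
{
  assert(elem_size > 0);
  return (uint32_t)byte_diff / (uint32_t)elem_size;
}
```

For a byte difference of -8 and 4-byte elements, signed division gives -2, while unsigned division gives 1073741822: positive and far larger than expected. For non-negative differences the two agree, which is why only a negative case exposes the choice.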

NlightNFotis requested a review from kroening as a code owner May 18, 2022 16:37

NlightNFotis self-assigned this May 18, 2022

NlightNFotis requested review from tautschnig, peterschrammel, chris-ryder, thomasspriggs and TGWDB as code owners May 18, 2022 16:37

thomasspriggs suggested changes May 18, 2022

NlightNFotis force-pushed the smt_pointer_arithmetic_conversion_final branch from 38bbc27 to 41f27a9 Compare May 19, 2022 12:48

NlightNFotis added the do not review label May 19, 2022

thomasspriggs mentioned this pull request May 24, 2022

Add get value response validation for smt bv constant descriptors #6879

Merged

7 tasks

NlightNFotis force-pushed the smt_pointer_arithmetic_conversion_final branch from dc7a300 to ec40a7b Compare May 25, 2022 09:44

thomasspriggs suggested changes May 26, 2022

NlightNFotis force-pushed the smt_pointer_arithmetic_conversion_final branch 2 times, most recently from 6c7c885 to 26f5a62 Compare May 27, 2022 11:17

thomasspriggs approved these changes May 27, 2022

TGWDB approved these changes May 27, 2022

NlightNFotis force-pushed the smt_pointer_arithmetic_conversion_final branch from 26f5a62 to c27ba48 Compare May 27, 2022 15:34

NlightNFotis removed the do not review label May 28, 2022

NlightNFotis force-pushed the smt_pointer_arithmetic_conversion_final branch from c27ba48 to a441756 Compare May 28, 2022 10:18

NlightNFotis added 7 commits May 28, 2022 11:20

Add support for mapping the sizes of expression subtypes in the new S…

dad512c

…MT backend. This will be needed later to support pointer arithmetic.

Add support for conversion of addition with pointer operand to new SM…

4a40bca

…T backend.

Add regression tests for plus_exprt conversion

1281a8d

Add a simple test for the conversion of pointer arithmetic/subtraction.

1c82d4e

This works without any changes to our conversion of `minus_exprt`s because there's a transformation between the frontend and the backend that converts a (*a - 3) to (*a + (-3)).
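
The equivalence this commit message relies on can be sketched in C as follows. The helper `sub_via_add` is hypothetical and only mirrors the rewrite described in the message; it is not the transformation's implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: subtract `n` elements from `p` by adding the
   negated offset, mirroring the rewrite of (a - 3) into (a + (-3)). */
int *sub_via_add(int *p, ptrdiff_t n)
{
  assert(p != NULL);
  return p + (-n);
}
```

As long as the result stays inside the same array, `sub_via_add(p, 3)` and `p - 3` denote the same element, so handling pointer-plus-integer is enough to cover pointer-minus-integer.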

Add conversion of minus_exprt with both operands being pointers

0e81d55

Add regression tests for the subtraction of two pointer values

ee56b4f

Add unit tests for pointer arithmetic conversion.

34b2985

NlightNFotis force-pushed the smt_pointer_arithmetic_conversion_final branch from a441756 to 34b2985 Compare May 28, 2022 10:21

NlightNFotis merged commit 5303a9d into diffblue:develop May 28, 2022

NlightNFotis deleted the smt_pointer_arithmetic_conversion_final branch May 28, 2022 11:51

