CONTRACTS: class dfcc_cfg_infot: loops and identifier scoping for DFCC #7671

remi-delmas-3000 · 2023-04-18T16:26:25Z

This class extracts loop information from a goto program and:

tags goto instructions with loop markers to facilitate loop instrumentation in the presence of nested loops
presents loop contract clauses it in a structured way for loop contract instrumentation
computes which function-local and loop-local variables must be tracked explicitly in function and loop write sets

This PR is extracted from a larger PR #7541 and can only be tested once all features are merged.

Each commit message has a non-empty body, explaining why the change was made.
Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
My commit message includes data points confirming performance improvements (if claimed).
My PR is restricted to a single feature or bugfix.
White-space or formatting changes outside the feature-related changed lines are in commits of their own.

codecov · 2023-04-18T17:35:13Z

Codecov Report

Patch coverage has no change and project coverage change: -0.01 ⚠️

Comparison is base (0e78362) 78.51% compared to head (569804b) 78.51%.

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #7671      +/-   ##
===========================================
- Coverage    78.51%   78.51%   -0.01%     
===========================================
  Files         1674     1674              
  Lines       191946   191940       -6     
===========================================
- Hits        150715   150709       -6     
  Misses       41231    41231

Impacted Files	Coverage Δ
...t/contracts/dynamic-frames/dfcc_spec_functions.cpp	`91.51% <ø> (-0.30%)`	⬇️

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

src/goto-instrument/contracts/dynamic-frames/dfcc_cfg_info.h

src/goto-instrument/contracts/dynamic-frames/dfcc_cfg_info.cpp

src/goto-instrument/contracts/dynamic-frames/dfcc_cfg_info.h

src/goto-instrument/contracts/dynamic-frames/dfcc_cfg_info.cpp

tautschnig

In this first round I have just taken a look at the interface, not the implementation. I believe this needs to be changed.

src/goto-instrument/contracts/dynamic-frames/dfcc_cfg_info.cpp

src/goto-instrument/contracts/dynamic-frames/dfcc_cfg_info.h

tautschnig · 2023-04-21T11:16:24Z

src/goto-instrument/contracts/dynamic-frames/dfcc_cfg_info.h

+    const irep_idt &function_id,
+    goto_functiont &goto_function,
+    const exprt &top_level_write_set,
+    const dfcc_loop_contract_modet loop_contract_mode,
+    goto_modelt &goto_model,


You are taking a function_id, a goto_function, and a goto_model. How do these relate? I'd hope that the goto_function is the value where function_id is the key in goto_model's function map. If so, please omit the goto_function parameter and compute goto_function. Else you are just asking for inconsistency.

Yes, the parameter goto_function should be removed. I will fix it.

There are places in contract instrumentation where we instrument goto programs snippets that will eventually land in the function named function_id but are not there yet. These snippets are temporarily embedded in dummy goto functions because natural loop or may alias functions only accept goto_functions as inputs.
These goto functions are not found in the function table of the goto model. Fresh symbols generated when instrumenting these dummy goto functions have to use function_id as base name so it needs to be passed around, so does the goto model because it carries the symbol table which is needed when creating fresh symbols.

Making the suggested change will force us to rewrite large parts of the instrumentation code in dfcc_instrumentt and dfcc_wrapper_functiont, or to temporarily insert/remove dummy goto functions in the goto model just to be able to pass the goto model and the key instead of the goto model, the key and the goto function.

Actually, @remi-delmas-3000 is correct. And removing goto_function will break the tests. I will undo the fix.

We need to fix this. The DFCC code is becoming inscrutable because of undocumented hacks of this kind.

if you look at dfcc_instrumentt in #7541 you'll see the different entry points for instrumentation, hopefully that makes things clearer.

I removed goto_model and usesymbol_table instead to make it less confusing.

src/goto-instrument/contracts/dynamic-frames/dfcc_cfg_info.h

src/goto-instrument/contracts/dynamic-frames/dfcc_cfg_info.cpp

tautschnig · 2023-04-24T10:20:41Z

src/goto-instrument/contracts/dynamic-frames/dfcc_cfg_info.cpp

+  {
+    // This is a complex expression. Compute the set of root objects expressions
+    // it can assign.
+    auto root_objects = dfcc_root_objects(lhs);


How about you always do this? For a symbol expression, dfcc_root_objects will immediately return just that symbol. I think it would reduce the code complexity here, and also make behaviour more consistent.

This is in the else branch, so it is not a symbol expression ... What do you mean by "more consistent" ?

I thought you could do auto root_objects = dfcc_root_objects(lhs); at the beginning of this function and then the if branch that currently exists at the top would be eliminated for root_objects would equally cover symbol expressions (you would have exactly one entry in root_idents below). I would then expect that all the use of tracked and local be consistent, which is not obvious from the code in the if case above: it doesn't even look at tracked, and I don't know why this is ok to do. Conversely, this else branch never does dfcc_is_cprover_symbol, and again I don't know why this is ok (not) to do.

I've modified this a bit

src/goto-instrument/contracts/dynamic-frames/dfcc_cfg_info.cpp

src/goto-instrument/contracts/dynamic-frames/dfcc_cfg_info.h

src/goto-instrument/contracts/dynamic-frames/dfcc_cfg_info.cpp

src/goto-instrument/contracts/dynamic-frames/dfcc_cfg_info.h

src/goto-instrument/contracts/dynamic-frames/dfcc_cfg_info.cpp

tautschnig · 2023-04-27T12:12:28Z

src/goto-instrument/contracts/dynamic-frames/dfcc_cfg_info.cpp

+    std::unordered_set<irep_idt> loop_tracked;
+    for(const auto &ident : loop_local)
+    {
+      if(dirty(ident))
+      {
+        loop_tracked.insert(ident);
+      }
+      else
+      {
+        // Check if this ident is touched by one of the inner loops
+        for(const auto inner_loop_id :
+            loop_nesting_graph.get_predecessors(loop_id))
+        {
+          if(is_assigned(ident, loop_info_map.at(inner_loop_id).assigns))
+            loop_tracked.insert(ident);
+        }
+      }
+    }


This piece of code is now repeated twice (with the necessary local modifications) and should be factored out into a method or function.

src/goto-instrument/contracts/dynamic-frames/dfcc_cfg_info.cpp

tautschnig · 2023-04-27T12:26:50Z

src/goto-instrument/contracts/dynamic-frames/dfcc_cfg_info.cpp

+    exprt::operandst invariant_clauses = get_invariants(loop.latch);
+    exprt::operandst decreases_clauses = get_decreases(loop.latch);
+    exprt::operandst assigns_clauses = get_assigns(loop.latch);
+    std::set<exprt> assigns;
+    exprt invariant_expr = nil_exprt();
+
+    // Generate defaults for all clauses if some clause is defined
+    if(
+      !invariant_clauses.empty() || !decreases_clauses.empty() ||
+      !assigns_clauses.empty())
+    {
+      if(invariant_clauses.empty())
+      {
+        // use a default invariant if none given.
+        invariant_expr = true_exprt{};
+        // assigns clause is missing; we will try to automatic inference
+        log.warning() << "No invariant provided for loop " << function_id << "."
+                      << loop.latch->loop_number << " at "
+                      << loop.head->source_location()
+                      << ". Using 'true' as a sound default invariant. Please "
+                         "provide an invariant the default is too weak."
+                      << messaget::eom;
+      }
+      else
+      {
+        // conjoin all invariant clauses
+        invariant_expr = conjunction(invariant_clauses);
+      }
+
+      // unpack assigns clause targets
+      if(!assigns_clauses.empty())
+      {
+        for(auto &operand : assigns_clauses)
+        {
+          assigns.insert(operand);
+        }
+      }
+      else
+      {
+        // infer assigns clause targets if none given
+        auto inferred = dfcc_infer_loop_assigns(
+          local_may_alias, loop.instructions, loop.head->source_location(), ns);
+        log.warning() << "No assigns clause provided for loop " << function_id
+                      << "." << loop.latch->loop_number << " at "
+                      << loop.head->source_location() << ". The inferred set {";
+        bool first = true;
+        for(const auto &expr : inferred)
+        {
+          if(!first)
+          {
+            log.status() << ", ";
+          }
+          first = false;
+          log.status() << format(expr);
+        }
+        log.status() << "} might be incomplete or imprecise, please provide an "
+                        "assigns clause if the analysis fails."
+                     << messaget::eom;
+        assigns.swap(inferred);
+      }
+
+      if(decreases_clauses.empty())
+      {
+        log.warning() << "No decrease clause provided for loop " << function_id
+                      << "." << loop.latch->loop_number << " at "
+                      << loop.head->source_location()
+                      << ". Termination will not be checked." << messaget::eom;
+      }


I think this deserves factoring out into a method of its own. I think it would just need to take loop as argument, and could return assigns/invariant_expr/decreases_clauses via a struct that has those three members.

tautschnig · 2023-04-27T12:31:36Z

src/goto-instrument/contracts/dynamic-frames/dfcc_cfg_info.cpp

+    {
+      const auto head = check_inner_loops_have_contracts(loop_nesting_graph);
+      if(head.has_value())
+      {
+        throw invalid_source_file_exceptiont(
+          "Found loop without contract nested in a loop with a "
+          "contract.\nPlease "
+          "provide a contract or unwind this loop before applying loop "
+          "contracts.",
+          head.value()->source_location());
+      }
+    }
+
+    auto topsorted = loop_nesting_graph.topsort();
+    for(const auto idx : topsorted)
+    {
+      topsorted_loops.push_back(idx);
+    }


I think all of this could be moved out of the if body. Else its hard to understand why just tag_loop_instructions has to live outside that body. (Alternatively, if my earlier suggestion of not tagging top-level instructions is implemented, you could safely move tag_loop_instructions inside this if.

The idea we had here is that unless they are actually used, we do not impose the check_inner_loops_have_contracts constraints.

Not tagging top-level instructions would have far reaching consequences outside of this PR. The tag has to exist even it trivial, as it represents a key in a total mapping from instructions to write sets, and is used by everything else to lookup the write set variables to use for instrumentation.

kept logic as-is but added comment

tautschnig

Thank you for all the refactoring! I still have some questions/comments, but they seem rather minor now.

tautschnig · 2023-05-02T12:29:21Z

src/goto-instrument/contracts/dynamic-frames/dfcc_cfg_info.cpp

+  // tag remaining instructions as top level
+  for(auto target = goto_program.instructions.begin();
+      target != goto_program.instructions.end();
+      target++)
+  {
+    // Skip instructions that are already tagged (belong to loops).
+    auto loop_id_opt = dfcc_get_loop_id(target);
+    if(loop_id_opt.has_value())
+    {
+      continue;
+    }
+    dfcc_set_loop_id(target, top_level_id);
+    dfcc_set_loop_top_level(target);
+  }


I keep wondering: why do we actually need to tag these? Can't we just leave them untagged?

the instrumentation passes that follow expect to find a tag on each instruction, and use it as a key to query the write set variable from the dfcc_cfg_info class.

Tagging all instructions amounts to using size_t as key type and a special value to represent the top-level scope.

Not tagging some instructions amounts to using optionalt<size_t> as key type, with none the for top level scope.

I think 1) allows for better sanity checks.

With 1), if you discover an instruction without tag then it is an error for sure. With 2), it could be because it is a top level instruction, or it could be a loop instruction that was skipped because of a bug somewhere.

tautschnig · 2023-05-02T12:30:55Z

src/goto-instrument/contracts/dynamic-frames/dfcc_cfg_info.cpp

+{
+  messaget log(message_handler);
+
+  const auto loop = loop_nesting_graph[loop_id];


Nit pick: I believe this will create a copy - don't you want a reference instead?

tautschnig · 2023-05-02T12:32:05Z

src/goto-instrument/contracts/dynamic-frames/dfcc_cfg_info.cpp

+
+  // initialise defaults
+  struct contract_clausest result;
+  result.invariant_expr = nil_exprt();


Good candidate for a (default) constructor?

tautschnig · 2023-05-02T12:35:17Z

src/goto-instrument/contracts/dynamic-frames/dfcc_cfg_info.cpp

+      bool first = true;
+      for(const auto &expr : inferred)
+      {
+        if(!first)
+        {
+          log.status() << ", ";
+        }
+        first = false;
+        log.status() << format(expr);
+      }
+      log.status() << "} might be incomplete or imprecise, please provide an "
+                      "assigns clause if the analysis fails."
+                   << messaget::eom;


You might want to use log.conditional_output(log.status(), [&](messaget::mstreamt &mstream) { ... < the above code > } to make clear that all of this is just about logging. (I usually recommend this for debug output to improve performance; here it's more about readability.)

This should actually be all printed as a warning.

tautschnig · 2023-05-02T12:36:42Z

src/goto-instrument/contracts/dynamic-frames/dfcc_cfg_info.cpp

+  auto loop_id = loop_id_opt.value();
+  if(is_top_level_id(loop_id))
+  {
+    return top_level_write_set;
+  }
+  else
+  {
+    return loop_info_map.at(loop_id).addr_of_write_set_var;
+  }


So does the above code still require use of one of those methods?

tautschnig · 2023-05-02T12:41:44Z

src/goto-instrument/contracts/dynamic-frames/dfcc_cfg_info.cpp

+  {
+    // This is a complex expression. Compute the set of root objects expressions
+    // it can assign.
+    auto root_objects = dfcc_root_objects(lhs);


I thought you could do auto root_objects = dfcc_root_objects(lhs); at the beginning of this function and then the if branch that currently exists at the top would be eliminated for root_objects would equally cover symbol expressions (you would have exactly one entry in root_idents below). I would then expect that all the use of tracked and local be consistent, which is not obvious from the code in the if case above: it doesn't even look at tracked, and I don't know why this is ok to do. Conversely, this else branch never does dfcc_is_cprover_symbol, and again I don't know why this is ok (not) to do.

remi-delmas-3000 requested review from tautschnig, feliperodri, martin-cs, chris-ryder, peterschrammel and kroening as code owners April 18, 2023 16:26

remi-delmas-3000 self-assigned this Apr 18, 2023

remi-delmas-3000 added aws Bugs or features of importance to AWS CBMC users aws-high Code Contracts Function and loop contracts labels Apr 18, 2023

remi-delmas-3000 mentioned this pull request Apr 18, 2023

CONTRACTS: class to instrument a single loop #7672

Merged

7 tasks

remi-delmas-3000 force-pushed the contracts-dfcc-cfg-info branch from ba28654 to 4e65b80 Compare April 18, 2023 16:36

remi-delmas-3000 mentioned this pull request Apr 18, 2023

CONTRACTS: Dynamic frames for loop contracts #7541

Merged

7 tasks

remi-delmas-3000 assigned tautschnig, feliperodri and qinheping Apr 18, 2023

feliperodri approved these changes Apr 20, 2023

View reviewed changes

feliperodri removed their assignment Apr 20, 2023

remi-delmas-3000 force-pushed the contracts-dfcc-cfg-info branch 2 times, most recently from 4d8ba1c to 75b519a Compare April 20, 2023 19:36

qinheping force-pushed the contracts-dfcc-cfg-info branch from 75b519a to 2071d5f Compare April 20, 2023 22:11

qinheping approved these changes Apr 20, 2023

View reviewed changes

src/goto-instrument/contracts/dynamic-frames/dfcc_cfg_info.h Outdated Show resolved Hide resolved

src/goto-instrument/contracts/dynamic-frames/dfcc_cfg_info.cpp Outdated Show resolved Hide resolved

tautschnig requested changes Apr 21, 2023

View reviewed changes

tautschnig removed their assignment Apr 21, 2023

qinheping force-pushed the contracts-dfcc-cfg-info branch 3 times, most recently from bfc6f6c to 39a0a11 Compare April 21, 2023 21:12

tautschnig reviewed Apr 24, 2023

View reviewed changes

qinheping force-pushed the contracts-dfcc-cfg-info branch from 39a0a11 to 428d542 Compare April 25, 2023 07:02

qinheping force-pushed the contracts-dfcc-cfg-info branch from 428d542 to 2ae2d4a Compare April 26, 2023 04:06

qinheping requested a review from tautschnig April 26, 2023 15:30

tautschnig reviewed Apr 27, 2023

View reviewed changes

remi-delmas-3000 force-pushed the contracts-dfcc-cfg-info branch from 2ae2d4a to eaf9f6d Compare May 1, 2023 23:56

tautschnig approved these changes May 2, 2023

View reviewed changes

remi-delmas-3000 force-pushed the contracts-dfcc-cfg-info branch from eaf9f6d to 3f2af0f Compare May 2, 2023 16:39

CONTRACTS: class dfcc_cfg_infot: loops and identifier scoping for DFCC

569804b

remi-delmas-3000 force-pushed the contracts-dfcc-cfg-info branch from 3f2af0f to 569804b Compare May 2, 2023 19:05

tautschnig enabled auto-merge May 3, 2023 12:50

tautschnig merged commit fa60087 into diffblue:develop May 3, 2023

remi-delmas-3000 mentioned this pull request May 3, 2023

CONTRACTS: re-enable side effect checking in assigns/frees clauses (HOTFIX) #7699

Merged

7 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CONTRACTS: class dfcc_cfg_infot: loops and identifier scoping for DFCC #7671

CONTRACTS: class dfcc_cfg_infot: loops and identifier scoping for DFCC #7671

remi-delmas-3000 commented Apr 18, 2023

codecov bot commented Apr 18, 2023 •

edited

Loading

tautschnig left a comment

tautschnig Apr 21, 2023

qinheping Apr 21, 2023

remi-delmas-3000 Apr 21, 2023

qinheping Apr 21, 2023

tautschnig Apr 24, 2023

remi-delmas-3000 Apr 25, 2023

qinheping Apr 26, 2023

tautschnig Apr 24, 2023

remi-delmas-3000 Apr 28, 2023

tautschnig May 2, 2023

remi-delmas-3000 May 2, 2023

tautschnig Apr 27, 2023

remi-delmas-3000 Apr 28, 2023 •

edited

Loading

tautschnig Apr 27, 2023

remi-delmas-3000 Apr 28, 2023

remi-delmas-3000 May 1, 2023

tautschnig Apr 27, 2023

remi-delmas-3000 Apr 28, 2023

remi-delmas-3000 May 1, 2023

tautschnig left a comment

tautschnig May 2, 2023

remi-delmas-3000 May 2, 2023

tautschnig May 2, 2023

remi-delmas-3000 May 2, 2023

tautschnig May 2, 2023

remi-delmas-3000 May 2, 2023

tautschnig May 2, 2023

remi-delmas-3000 May 2, 2023

tautschnig May 2, 2023

tautschnig May 2, 2023

CONTRACTS: class dfcc_cfg_infot: loops and identifier scoping for DFCC #7671

CONTRACTS: class dfcc_cfg_infot: loops and identifier scoping for DFCC #7671

Conversation

remi-delmas-3000 commented Apr 18, 2023

codecov bot commented Apr 18, 2023 • edited Loading

Codecov Report

tautschnig left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

remi-delmas-3000 Apr 28, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tautschnig left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

codecov bot commented Apr 18, 2023 •

edited

Loading

remi-delmas-3000 Apr 28, 2023 •

edited

Loading