-
Notifications
You must be signed in to change notification settings - Fork 273
Proper handling of library and __CPROVER built-in functions in contracts #7197
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Labels
Comments
7 tasks
remi-delmas-3000
pushed a commit
to remi-delmas-3000/cbmc
that referenced
this issue
Oct 31, 2022
- Fixes diffblue#7197 - Fixes diffblue#7198 - Fixes diffblue#7206 Overview --- The new features brought by the system are: - Allow to use C functions to specify reusable/parametric assigns clauses and frees clauses in the contracts language; - Allow function calls in pre- and post-conditions and check the absence of side-effects; - Checking contracts on recursive functions (by assuming preconditions and asserting post conditions on the top level call and asserting pre conditions and assuming post conditions on the recursive call) - The frame checks are fully dynamic and does not require inlining the program or unwinding loops a priori; - The method is compatible with deferred resolution of function pointers calls during symex; - All CPROVER library functions are supported; In addition: - The checks are implemented as C library `src/ansi-c/library/cprover_contracts.c`, the instrumentation insert calls to the library in the program; - Contract checking and contract replacement mostly share the same code; - The whole system's user documentation and developer documentation is written in the doxygen format and found in `src/goto-instrument/contracts/doc/`. Limitations: - loop contracts are not yet ported to the new framework; - deferred function pointer resolution is not yet implemented, user-specified function pointer restriction is still required; - havocing during contract replacement still relies on snapshots known to affect performance; - more work is performed by symbolic execution which can be felt on runtime; Method --- Let’s call `f_top` the top-level function against which the contract is being checked. We assume `f_top` is itself called from a `harness` function, which may call other functions besides `f_top`. Let’s consider a function `f` that is either `f_top` or one of the functions called (directly or indirectly) by `f_top`. The goal is to verify that `f` writes only to memory locations that are part of the frame of the call to `f_top` but allowed by the contract or that were allocated after entering `f_top` and are hence not part of the frame of the call. To do that, we add ghost code to the program to build representations the following sets of memory locations as the program runs: (`contract_assigns`, `contract_frees`, `allocated`, `deallocated`) where: - `contract_assigns` is the set of memory locations specified in the *assigns clause* of the contract being checked - `contract_frees` is the set of pointers specified in the *frees clause* of the contract being checked - `allocated` is the set objects identifiers allocated on the stack or the heap since entering `f_top` - `deallocated` is the set of heap objects identifiers deallocated since entering `f_top` (allows verification that an object was freed as a post-condition of the contract) The instrumentation transforms all user-defined goto functions and library functions: ```c ret_t f(<parameters>); ``` Into functions accepting an extra parameter that is a pointer to a dynamic representation of the sets introduced above: ```c ret_t f(<parameters>, __CPROVER_contracts_write_set_ptr_t write_set); ``` Function bodies is instrumented so that when the write_set pointer is NULL no checks are performed, when it is not NULL the following checks are performed: - Assignments LHS are checked for inclusion in `contract_assigns` or `allocated`, - DECL objects are recorded in `allocated` - Dynamically allocated objects are recorded in `allocated` - DEAD objects or objects assigned to `__CPROVER_dead_object` are removed from `allocated` - dynamic objects deallocated with `__CPROVER_deallocate` are checked for inclusion in `contract_frees` or `allocated`, and recorded in `deallocated` (so that if needed we can check in post conditions that the function indeed deallocated the expected memory). - The current write set is propagated to all functions calls or function pointer calls. Since the write set is dynamically updated, a same function called at different call sites can see a different write set. Write sets are initialised from contracts descriptions and injected in function calls that are checked against contracts, or used to perform inclusion checks against write sets obtained from callers for functions that are replaced by a contract. Implementation --- The class `dfcc_utilst` provides methods to manipulate function symbols in the symbol table: - create fresh local symbols - create parameter symbols - create static symbols - clone and rename function symbols - add function parameters - rename function parameters The file `cprover_contracts.c` defines: - C types representing write sets - C functions to create/populate/release write sets - C functions to do write sets inclusion checks - C functions implementing built-in predicates `__CPROVER_is_freeable`, `__CPROVER_was_freed`, `__CPROVER_is_fresh` The class `dfcc_libraryt` provides a programmatic interface to load and use the library-defined types and functions and perform some optimisations like inlining or unwinding loops in library functions. The class `dfcc_contract_modet` is an enum representing the CHECK and REPLACE modes for contracts. The class `dfcc_is_fresht` Rewrites calls to `__CPROVER_is_fresh` into calls to the library implementation `__CPROVER_contracts_is_fresh`. The class `dfcc_is_freeablet` rewrites calls to `__CPROVER_is_freeable` into calls to the library implementation `__CPROVER_contracts_is_freeable` and rewrites calls to `__CPROVER_was_freed` into calls to the library implementation `__CPROVER_contracts_was_freed`. The class `goto_program_cfg_infot` extends `cfg_infot` and computes the set of program-locals from DECLs found in the instruction sequence. Contrary to the existing `function_cfg_infot` this new class does consider parameters symbols as locals. The class `dfcc_instrumentt` instruments GOTO functions with write-set checks. It also instruments proof harness functions and sub-sequences of GOTO instructions extracted from larger GOTO programs. The class `dfcc_spec_functionst` rewrites assigns clause and frees clause specification functions into functions that populate a given write-set, and can be checked for side-effects against a second write-set. The class `dfcc_contract_functionst` allows to translate the assigns clause (or frees clause) of a contract expressed in DSL style into GOTO a function describing the same set of targets as the clause. It also allows to generate a havoc function from an assigns clause, that havocs the targets described by the clause in a type-directed way. The class `dfcc_dsl_wrapper_programt` generates a sequence of GOTO instruction from a contract expressed in the DSL syntax. The instructions either encode a checked call to the function under verification (CHECK mode) or an abstraction derived from the contract (REPLACE mode). In CHECK mode, it generates GOTO instructions that assume preconditions, snapshots history variables, creates a write set instance, call the function under verification asserts the postconditions, and returns the call's return value. In REPLACE mode, it generates GOTO instructions encoding the nondeterministic abstraction defined by the contract: assert preconditions, create and havoc assigns clause targets, create a nondet return value for the call, nondeterministically free frees clause targets, assume postconditions, return a value that satisfies the post conditions. These instructions are meant to be added to the GOTO function that serves as a wrapper for the function under verification or being replaced by the contract. The class `dfcc_contract_handlert` is an abstract interface for generating GOTO instructions from contracts for CHECK or REPLACE modes. It delegates the actual generation to `dfcc_dsl_wrapper_programt`. The class `dfcc_swap_and_wrapt`, given a `function_id`, a `contract_id`, a contract `mode`, and a `dfcc_contract_handlert` for the contract, swaps the body of the function to a function with a mangled name and instruments the mangled function for dynamic frame condition checking using `dfcc_instrument`. The body of the original function is then replaced with instructions generated by the contract handler for the required contract mode (CHECK or REPLACE). To handle recursive functions in contract checking mode, the body of the wrapper function is generated in such a way the first invocation results in checking the contract against the mangled function and recursive invocations of the wrapper behave like the contract abstraction (induction hypothesis for the recursive call). The `dfcct` class receives the proof harness function, the map of functions and contracts to check and the map of functions to replace with contracts. It instruments the proof harness, performs swap-and-wrap operations for each of the checked and replaced functions. It then instruments all other functions for DFCC. Regression Tests --- Regression tests are found in `regression/contracts-dfcc`. These tests are ported from `regression/contracts` and only contain tests for function contracts.
remi-delmas-3000
pushed a commit
to remi-delmas-3000/cbmc
that referenced
this issue
Nov 1, 2022
- Fixes diffblue#7197 - Fixes diffblue#7198 - Fixes diffblue#7206 Overview --- The new features brought by the system are: - Allow to use C functions to specify reusable/parametric assigns clauses and frees clauses in the contracts language; - Allow function calls in pre- and post-conditions and check the absence of side-effects; - Checking contracts on recursive functions (by assuming preconditions and asserting post conditions on the top level call and asserting pre conditions and assuming post conditions on the recursive call) - The frame checks are fully dynamic and does not require inlining the program or unwinding loops a priori; - The method is compatible with deferred resolution of function pointers calls during symex; - All CPROVER library functions are supported; In addition: - The checks are implemented as C library `src/ansi-c/library/cprover_contracts.c`, the instrumentation insert calls to the library in the program; - Contract checking and contract replacement mostly share the same code; - The whole system's user documentation and developer documentation is written in the doxygen format and found in `src/goto-instrument/contracts/doc/`. Limitations: - loop contracts are not yet ported to the new framework; - deferred function pointer resolution is not yet implemented, user-specified function pointer restriction is still required; - havocing during contract replacement still relies on snapshots known to affect performance; - more work is performed by symbolic execution which can be felt on runtime; Method --- Let’s call `f_top` the top-level function against which the contract is being checked. We assume `f_top` is itself called from a `harness` function, which may call other functions besides `f_top`. Let’s consider a function `f` that is either `f_top` or one of the functions called (directly or indirectly) by `f_top`. The goal is to verify that `f` writes only to memory locations that are part of the frame of the call to `f_top` but allowed by the contract or that were allocated after entering `f_top` and are hence not part of the frame of the call. To do that, we add ghost code to the program to build representations the following sets of memory locations as the program runs: (`contract_assigns`, `contract_frees`, `allocated`, `deallocated`) where: - `contract_assigns` is the set of memory locations specified in the *assigns clause* of the contract being checked - `contract_frees` is the set of pointers specified in the *frees clause* of the contract being checked - `allocated` is the set objects identifiers allocated on the stack or the heap since entering `f_top` - `deallocated` is the set of heap objects identifiers deallocated since entering `f_top` (allows verification that an object was freed as a post-condition of the contract) The instrumentation transforms all user-defined goto functions and library functions: ```c ret_t f(<parameters>); ``` Into functions accepting an extra parameter that is a pointer to a dynamic representation of the sets introduced above: ```c ret_t f(<parameters>, __CPROVER_contracts_write_set_ptr_t write_set); ``` Function bodies is instrumented so that when the write_set pointer is NULL no checks are performed, when it is not NULL the following checks are performed: - Assignments LHS are checked for inclusion in `contract_assigns` or `allocated`, - DECL objects are recorded in `allocated` - Dynamically allocated objects are recorded in `allocated` - DEAD objects or objects assigned to `__CPROVER_dead_object` are removed from `allocated` - dynamic objects deallocated with `__CPROVER_deallocate` are checked for inclusion in `contract_frees` or `allocated`, and recorded in `deallocated` (so that if needed we can check in post conditions that the function indeed deallocated the expected memory). - The current write set is propagated to all functions calls or function pointer calls. Since the write set is dynamically updated, a same function called at different call sites can see a different write set. Write sets are initialised from contracts descriptions and injected in function calls that are checked against contracts, or used to perform inclusion checks against write sets obtained from callers for functions that are replaced by a contract. Implementation --- The class `dfcc_utilst` provides methods to manipulate function symbols in the symbol table: - create fresh local symbols - create parameter symbols - create static symbols - clone and rename function symbols - add function parameters - rename function parameters The file `cprover_contracts.c` defines: - C types representing write sets - C functions to create/populate/release write sets - C functions to do write sets inclusion checks - C functions implementing built-in predicates `__CPROVER_is_freeable`, `__CPROVER_was_freed`, `__CPROVER_is_fresh` The class `dfcc_libraryt` provides a programmatic interface to load and use the library-defined types and functions and perform some optimisations like inlining or unwinding loops in library functions. The class `dfcc_contract_modet` is an enum representing the CHECK and REPLACE modes for contracts. The class `dfcc_is_fresht` Rewrites calls to `__CPROVER_is_fresh` into calls to the library implementation `__CPROVER_contracts_is_fresh`. The class `dfcc_is_freeablet` rewrites calls to `__CPROVER_is_freeable` into calls to the library implementation `__CPROVER_contracts_is_freeable` and rewrites calls to `__CPROVER_was_freed` into calls to the library implementation `__CPROVER_contracts_was_freed`. The class `goto_program_cfg_infot` extends `cfg_infot` and computes the set of program-locals from DECLs found in the instruction sequence. Contrary to the existing `function_cfg_infot` this new class does consider parameters symbols as locals. The class `dfcc_instrumentt` instruments GOTO functions with write-set checks. It also instruments proof harness functions and sub-sequences of GOTO instructions extracted from larger GOTO programs. The class `dfcc_spec_functionst` rewrites assigns clause and frees clause specification functions into functions that populate a given write-set, and can be checked for side-effects against a second write-set. The class `dfcc_contract_functionst` allows to translate the assigns clause (or frees clause) of a contract expressed in DSL style into GOTO a function describing the same set of targets as the clause. It also allows to generate a havoc function from an assigns clause, that havocs the targets described by the clause in a type-directed way. The class `dfcc_dsl_wrapper_programt` generates a sequence of GOTO instruction from a contract expressed in the DSL syntax. The instructions either encode a checked call to the function under verification (CHECK mode) or an abstraction derived from the contract (REPLACE mode). In CHECK mode, it generates GOTO instructions that assume preconditions, snapshots history variables, creates a write set instance, call the function under verification asserts the postconditions, and returns the call's return value. In REPLACE mode, it generates GOTO instructions encoding the nondeterministic abstraction defined by the contract: assert preconditions, create and havoc assigns clause targets, create a nondet return value for the call, nondeterministically free frees clause targets, assume postconditions, return a value that satisfies the post conditions. These instructions are meant to be added to the GOTO function that serves as a wrapper for the function under verification or being replaced by the contract. The class `dfcc_contract_handlert` is an abstract interface for generating GOTO instructions from contracts for CHECK or REPLACE modes. It delegates the actual generation to `dfcc_dsl_wrapper_programt`. The class `dfcc_swap_and_wrapt`, given a `function_id`, a `contract_id`, a contract `mode`, and a `dfcc_contract_handlert` for the contract, swaps the body of the function to a function with a mangled name and instruments the mangled function for dynamic frame condition checking using `dfcc_instrument`. The body of the original function is then replaced with instructions generated by the contract handler for the required contract mode (CHECK or REPLACE). To handle recursive functions in contract checking mode, the body of the wrapper function is generated in such a way the first invocation results in checking the contract against the mangled function and recursive invocations of the wrapper behave like the contract abstraction (induction hypothesis for the recursive call). The `dfcct` class receives the proof harness function, the map of functions and contracts to check and the map of functions to replace with contracts. It instruments the proof harness, performs swap-and-wrap operations for each of the checked and replaced functions. It then instruments all other functions for DFCC. Regression Tests --- Regression tests are found in `regression/contracts-dfcc`. These tests are ported from `regression/contracts` and only contain tests for function contracts.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
CBMC version: 5.67.0
Operating system: N/A
These classes of functions need special care during contract instrumentation:
__CPROVER
functions__CPROVER
library functions (have a written implementation)src/ansi-c/library/*.c
)In the current workflow, instrumentation is performed on user code before __CPROVER or POSIX libraries get added the goto model. This is done on purpose, to avoid dragging internal CBMC machinery in user contracts (library functions), and also to avoid instrumenting the body of trusted functions, and to avoid bringing loops that have no contracts in the instrumentation phase.
As a consequence:
As of today (Dec 1st 2021):
Steps to improve:
The key with these functions is to avoid dragging the __CPROVER internals into user contracts
The text was updated successfully, but these errors were encountered: