Skip to content

TG-838 Documentation of the string solver #1455

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
211 changes: 211 additions & 0 deletions src/solvers/refinement/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,211 @@
\page string-solver String solver

\author Romain Brenguier

\tableofcontents

\section string_solver_interface String solver interface

\subsection general_interface General interface

The common interface for solvers in CProver is inherited from
`decision_proceduret` and is the common interface for all solvers.
It is essentially composed of these three functions:

- `string_refinementt::set_to(const exprt &expr, bool value)`:
\copybrief string_refinementt::set_to
- `string_refinementt::dec_solve()`:
\copybrief string_refinementt::dec_solve
- `string_refinementt::get(const exprt &expr) const`:
\copybrief string_refinementt::get

For each goal given to CProver:
- `set_to` is called on several equations, roughly one for each step of the
symbolic execution that leads to that goal;
- `dec_solve` is called to determine whether the goal is reachable given these
equations;
- `get` is called by the interpreter to obtain concrete value to build a trace
leading to the goal;
- The same process can be repeated for further goals, in that case the
constraints added by previous calls to `set_to` remain valid.

\subsection specificity Specificity of the string solver

The specificity of the solver is in what kind of expressions `set_to` accepts
and understands. `string_refinementt::set_to` accepts all constraints that are
normally accepted by `bv_refinementt`.

`string_refinementt::set_to` also understands constraints of the form:
* `char_pointer1 = b ? char_pointer2 : char_pointer3` where `char_pointer<i>`
variables are of type pointer to characters and `b` is a Boolean
expression.
* `i = cprover_primitive(args)` where `i` is of signed bit vector type.
String primitives are listed in the next section.

\note In the implementation, equations that are not of these forms are passed
to an embedded `bv_refinementt` solver.

\subsection string-representation String representation in the solver

String primitives can have arguments which are pointers to characters.
These pointers represent strings.
To each of these pointers the string solver associate a char array
which represents the content of the string.
If the pointer is the address of an actual array in the program they should be
linked by using the primitive `cprover_string_associate_array_to_pointer`.
The length of the array can also be linked to a variable of the program using
`cprover_string_associate_length_to_array`.

\warning The solver assumes the memory pointed by the arguments is immutable
which is not something that is true in general for C pointers for instance.
Therefore for each transformation on a string, it is assumed the program
allocates a new string before calling a primitive.

\section primitives String primitives

\subsection basic-primitives Basic access:

* `cprover_string_associate_array_to_pointer` : Link with an array of
characters of the program.
* `cprover_string_associate_length_to_array` : Link the length of the array
with the given integer value.
* `cprover_string_char_at` :
\copybrief string_constraint_generatort::add_axioms_for_char_at(const function_application_exprt &f)
\link string_constraint_generatort::add_axioms_for_char_at(const function_application_exprt &f) More... \endlink
* `cprover_string_length` :
\copybrief string_constraint_generatort::add_axioms_for_length(const function_application_exprt &f)
\link string_constraint_generatort::add_axioms_for_length(const function_application_exprt &f) More... \endlink

\subsection comparisons Comparisons:

* `cprover_string_compare_to` :
\copybrief string_constraint_generatort::add_axioms_for_compare_to(const function_application_exprt &f)
\link string_constraint_generatort::add_axioms_for_compare_to(const function_application_exprt &f) More... \endlink
* `cprover_string_contains` :
\copybrief string_constraint_generatort::add_axioms_for_contains(const function_application_exprt &f)
\link string_constraint_generatort::add_axioms_for_contains(const function_application_exprt &f) More... \endlink
* `cprover_string_equals` :
\copybrief string_constraint_generatort::add_axioms_for_equals(const function_application_exprt &f)
\link string_constraint_generatort::add_axioms_for_equals(const function_application_exprt &f) More... \endlink
* `cprover_string_equals_ignore_case` :
\copybrief string_constraint_generatort::add_axioms_for_equals_ignore_case(const function_application_exprt &f)
\link string_constraint_generatort::add_axioms_for_equals_ignore_case(const function_application_exprt &f) More... \endlink
* `cprover_string_hash_code` :
\copybrief string_constraint_generatort::add_axioms_for_hash_code
\link string_constraint_generatort::add_axioms_for_hash_code More... \endlink
* `cprover_string_is_prefix` :
\copybrief string_constraint_generatort::add_axioms_for_is_prefix
\link string_constraint_generatort::add_axioms_for_is_prefix More... \endlink
* `cprover_string_is_suffix` :
\copybrief string_constraint_generatort::add_axioms_for_is_suffix
\link string_constraint_generatort::add_axioms_for_is_suffix More... \endlink
* `cprover_string_index_of` :
\copybrief string_constraint_generatort::add_axioms_for_index_of(const function_application_exprt &f)
\link string_constraint_generatort::add_axioms_for_index_of(const function_application_exprt &f) More... \endlink
* `cprover_string_last_index_of` :
\copybrief string_constraint_generatort::add_axioms_for_last_index_of(const function_application_exprt &f)
\link string_constraint_generatort::add_axioms_for_last_index_of(const function_application_exprt &f) More... \endlink

\subsection transformations Transformations:

* `cprover_string_char_set` :
\copybrief string_constraint_generatort::add_axioms_for_char_set(const function_application_exprt &f)
\link string_constraint_generatort::add_axioms_for_char_set(const function_application_exprt &f) More... \endlink
* `cprover_string_concat` :
\copybrief string_constraint_generatort::add_axioms_for_concat(const function_application_exprt &f)
\link string_constraint_generatort::add_axioms_for_concat(const function_application_exprt &f) More... \endlink
* `cprover_string_delete` :
\copybrief string_constraint_generatort::add_axioms_for_delete(const function_application_exprt &f)
\link string_constraint_generatort::add_axioms_for_delete(const function_application_exprt &f) More... \endlink
* `cprover_string_insert` :
\copybrief string_constraint_generatort::add_axioms_for_insert(const function_application_exprt &f)
\link string_constraint_generatort::add_axioms_for_insert(const function_application_exprt &f) More... \endlink
* `cprover_string_replace` :
\copybrief string_constraint_generatort::add_axioms_for_replace(const function_application_exprt &f)
\link string_constraint_generatort::add_axioms_for_replace(const function_application_exprt &f) More... \endlink
* `cprover_string_set_length` :
\copybrief string_constraint_generatort::add_axioms_for_set_length(const function_application_exprt &f)
\link string_constraint_generatort::add_axioms_for_set_length(const function_application_exprt &f) More... \endlink
* `cprover_string_substring` :
\copybrief string_constraint_generatort::add_axioms_for_substring(const function_application_exprt &f)
\link string_constraint_generatort::add_axioms_for_substring(const function_application_exprt &f) More... \endlink
* `cprover_string_to_lower_case` :
\copybrief string_constraint_generatort::add_axioms_for_to_lower_case(const function_application_exprt &f)
\link string_constraint_generatort::add_axioms_for_to_lower_case(const function_application_exprt &f) More... \endlink
* `cprover_string_to_upper_case` :
\copybrief string_constraint_generatort::add_axioms_for_to_upper_case(const function_application_exprt &f)
\link string_constraint_generatort::add_axioms_for_to_upper_case(const function_application_exprt &f) More... \endlink
* `cprover_string_trim` :
\copybrief string_constraint_generatort::add_axioms_for_trim(const function_application_exprt &f)
\link string_constraint_generatort::add_axioms_for_trim(const function_application_exprt &f) More... \endlink

\subsection conversions Conversions:

* `cprover_string_format` :
\copybrief string_constraint_generatort::add_axioms_for_format(const function_application_exprt &f)
\link string_constraint_generatort::add_axioms_for_format(const function_application_exprt &f) More... \endlink
* `cprover_string_from_literal` :
\copybrief string_constraint_generatort::add_axioms_from_literal(const function_application_exprt &f)
\link string_constraint_generatort::add_axioms_from_literal(const function_application_exprt &f) More... \endlink
* `cprover_string_of_int` :
\copybrief string_constraint_generatort::add_axioms_from_int(const function_application_exprt &f)
\link string_constraint_generatort::add_axioms_from_int(const function_application_exprt &f) More... \endlink
* `cprover_string_of_float` :
\copybrief string_constraint_generatort::add_axioms_for_string_of_float(const function_application_exprt &f)
\link string_constraint_generatort::add_axioms_for_string_of_float(const function_application_exprt &f) More... \endlink
* `cprover_string_of_float_scientific_notation` :
\copybrief string_constraint_generatort::add_axioms_from_float_scientific_notation(const function_application_exprt &f)
\link string_constraint_generatort::add_axioms_from_float_scientific_notation(const function_application_exprt &f) More... \endlink
* `cprover_string_of_char` :
\copybrief string_constraint_generatort::add_axioms_from_char(const function_application_exprt &f)
\link string_constraint_generatort::add_axioms_from_char(const function_application_exprt &f) More... \endlink
* `cprover_string_parse_int` :
\copybrief string_constraint_generatort::add_axioms_for_parse_int(const function_application_exprt &f)
\link string_constraint_generatort::add_axioms_for_parse_int(const function_application_exprt &f) More... \endlink

\subsection deprecated Deprecated primitives:

* `cprover_string_concat_code_point`, `cprover_string_code_point_at`,
`cprover_string_code_point_before`, `cprover_string_code_point_count`:
Java specific, should be part of Java models.
* `cprover_string_offset_by_code_point`, `cprover_string_concat_char`,
`cprover_string_concat_int`, `cprover_string_concat_long`,
`cprover_string_concat_bool`, `cprover_string_concat_double`,
`cprover_string_concat_float`, `cprover_string_insert_int`,
`cprover_string_insert_long`, `cprover_string_insert_bool`,
`cprover_string_insert_char`, `cprover_string_insert_double`,
`cprover_string_insert_float` :
Should be done in two steps: conversion from primitive type and call
to the string primitive.
* `cprover_string_array_of_char_pointer`, `cprover_string_to_char_array` :
Pointer to char array association
is now handled by `string_constraint_generatort`, there is no need for
explicit conversion.
* `cprover_string_intern` : Never tested.
* `cprover_string_is_empty` :
Should use `cprover_string_length(s) == 0` instead.
* `cprover_string_empty_string` : Can use literal of empty string instead.
* `cprover_string_of_long` : Should be the same as `cprover_string_of_int`.
* `cprover_string_delete_char_at` : A call to
`cprover_string_delete_char_at(s, i)` would be the same thing as
`cprover_string_delete(s, i, i+1)`.
* `cprover_string_of_bool` :
Language dependent, should be implemented in the models.
* `cprover_string_copy` : Same as `cprover_string_substring(s, 0)`.
* `cprover_string_of_int_hex` : Same as `cprover_string_of_int(s, 16)`.
* `cprover_string_of_double` : Same as `cprover_string_of_float`.

\section algorithm Decision algorithm

\copydetails string_refinementt::dec_solve

\subsection instantiation Instantiation

This is done by generate_instantiations(messaget::mstreamt &stream, const namespacet &ns, const string_constraint_generatort &generator, const index_set_pairt &index_set, const string_axiomst &axioms).
\copydetails generate_instantiations(messaget::mstreamt &stream, const namespacet &ns, const string_constraint_generatort &generator, const index_set_pairt &index_set, const string_axiomst &axioms)

\subsection axiom-check Axiom check

\copydetails check_axioms(const string_axiomst &axioms, string_constraint_generatort &generator, const std::function<exprt(const exprt &)> &get, messaget::mstreamt &stream, const namespacet &ns, std::size_t max_string_length, bool use_counter_example, ui_message_handlert::uit ui, const union_find_replacet &symbol_resolve)
\link check_axioms(const string_axiomst &axioms, string_constraint_generatort &generator, const std::function<exprt(const exprt &)> &get, messaget::mstreamt &stream, const namespacet &ns, std::size_t max_string_length, bool use_counter_example, ui_message_handlert::uit ui, const union_find_replacet &symbol_resolve)
(See function documentation...) \endlink
50 changes: 27 additions & 23 deletions src/solvers/refinement/string_constraint.h
Original file line number Diff line number Diff line change
Expand Up @@ -26,29 +26,32 @@ Author: Romain Brenguier, [email protected]
#include <util/string_expr.h>
#include <langapi/language_util.h>

/*! \brief Universally quantified string constraint

This represents a universally quantified string constraint as laid out in
DOI: 10.1007/978-3-319-03077-7. The paper seems to specify a universal
constraint as follows.

A universal constraint is of the form \f$\forall n. L(n) \rightarrow
P(n, s_0,\ldots, s_k)\f$ where

1. \f$L(n)\f$ does not contain string indices [possibly not required, but
implied by examples]
2. \f$\forall n. L(n) \rightarrow P'\left(s_0[f_0(n)],\ldots, s_k[f_k(n)]
\right)\f$, i.e. when focusing on one specific string, all indices are
the same [stated in a roundabout manner]
3. Each \f$f\f$ is linear and therefore has an inverse [explicitly stated]
4. \f$L(n)\f$ and \f$P(n, s_0,\ldots, s_k)\f$ may contain other (free)
variables, but in \f$P\f$, \f$n\f$ can only occur as an argument to an
\f$f\f$ [explicitly stated, implied]

We extend this slightly by restricting n to be in a specific range, but this
is another implication which can be pushed in to \f$L(n)\f$.
*/

/// ### Universally quantified string constraint
///
/// This represents a universally quantified string constraint as laid out in
/// DOI: 10.1007/978-3-319-03077-7. The paper seems to specify a universal
/// constraint as follows.
///
/// A universal constraint is of the form
/// \f$ \forall i.\ PI(i) \Rightarrow PV(i)\f$
/// where \f$PI\f$ and \f$PV\f$ satisfies the following conditions:
///
/// * The predicate `PI` , called the index guard, must follow the grammar
/// * \f$iguard : iguard \land iguard \mid iguard \lor iguard \mid
/// iterm \le iterm \mid iterm = iterm \f$
/// * \f$iterm : integer\_constant1 \times i + integer\_constant2 \f$
///
/// * The predicate `PV` is called the value constraint.
/// The index variable `i` can only be used in array read expressions of
/// the form `a[i]`.
/// ie. `PV` is of the form \f$P'(s_0[f_0(i)],\ldots, s_k[f_k(i)]
/// )\f$, moreover when focusing on one specific string, all indices are
/// the same [stated in a roundabout manner].
/// \f$L(n)\f$ and \f$P(n, s_0,\ldots, s_k)\f$ may contain other (free)
/// variables, but in \f$P\f$, \f$n\f$ can only occur as an argument to an
/// \f$f\f$ [explicitly stated, implied].
///
/// \todo The fact that we follow this grammar is not enforced at the moment.
class string_constraintt: public exprt
{
public:
Expand Down Expand Up @@ -155,6 +158,7 @@ inline std::string from_expr(
from_expr(ns, identifier, expr.body());
}

/// Constraints to encode non containement of strings.
class string_not_contains_constraintt: public exprt
{
public:
Expand Down
Loading