Skip to content

Commit 5bd6ded

Browse files
committed
Documentation: Layout of BMC section.
1 parent b1479fb commit 5bd6ded

File tree

1 file changed

+78
-74
lines changed

1 file changed

+78
-74
lines changed

doc/architectural/background-concepts.md

Lines changed: 78 additions & 74 deletions
Original file line numberDiff line numberDiff line change
@@ -778,54 +778,58 @@ unsigned long fac = 1;
778778
Now, from the C standard and the most common C ABIs, we know that internally,
779779
`fac` will be represented as a binary number with 64 bits. So, if we wish to
780780
reason about the contents of the variable `fac`, we might as well represent it
781-
as a vector of 64 propositional variables, say `fac_0` to `fac_{63}`, where
782-
`fac = 2^{63} fac_{63} + ... + 2^0 fac_0`. We can then assert that `fac=1` using
781+
as a vector of 64 propositional variables, say `fac`<sub>0</sub> to
782+
`fac`<sub>63</sub>, where
783+
`fac` = 2<sup>63</sup> `fac`<sub>63</sub> + ... + 2<sup>0</sup> `fac`<sub>0</sub>.
784+
We can then assert that `fac`=1 using
783785
the propsitional formula
784-
`fac_{63} = 0 and ... and fac_1 = 0 and fac_0 = 1`, where we define the formula
785-
`A = B` as `(A and B) or ((not A) and (not B))`.
786+
`fac`<sub>63</sub> = 0 and ... and `fac`<sub>1</sub> = 0 and `fac`<sub>0</sub> = 1,
787+
where we define the formula A = B as ''(A and B) or ((not A) and (not B))''.
786788

787789
We call this a *bit vector* representation. Compare the Wikipedia page on
788790
[Binary numbers](https://en.wikipedia.org/wiki/Binary_number).
789791

790792
Bit vector representations can also be used to describe operations on binary
791-
numbers. For instance, suppose we have two four-bit numbers `A_3, ... A_0`
792-
(representing a number `A`) and `B_3, ..., B_0` (representing a number `b`)
793+
numbers. For instance, suppose we have two four-bit numbers A_3, ... A_0
794+
(representing a number A) and B_3, ..., B_0 (representing a number b)
793795
and wish to add them. As detailed on the page on
794796
[Binary adders](https://en.wikipedia.org/wiki/Adder_(electronics)),
795-
we define three additional bit vectors, the *carries* `C_0, ..., C_3`,
796-
the *partial sum* `P_0, ..., P_4` and
797-
the *sum* `S_0, ..., S_4` , representing a number `S` such that `S=A+B`.
797+
we define three additional bit vectors, the *carries* C<sub>0</sub>, ..., C<sub>3</sub>,
798+
the *partial sum* P<sub>0</sub>, ..., P<sub>4</sub> and
799+
the *sum* S<sub>0</sub>, ..., S<sub>4</sub> , representing a number S such that
800+
S=A+B.
798801
Note that the sum vector has one bit more - why? How
799802
is this related to arithmetic overflow in C?
800803

801804
For convenience, we first define the *half-adder*. This is given as two
802-
formulas, `HA_S(A,B) = (A and not B) or (B and not A)`, which gives the
803-
sum of the bits `A` and `B`, and `HA_C(A,B) = A and B`, which indicates
805+
formulas, HA_S(A,B) = (A and not B) or (B and not A), which gives the
806+
sum of the bits A and B, and HA_C(A,B) = A and B, which indicates
804807
whether the result of the sum was too big to fit into one bit (so we carry
805808
a one to the next bit).
806809

807810
Using the half-adder formulas, we can define the *full adder*, again given as
808811
two formulas, one for sum and the other for carry. We have
809-
`FA_S(A,B,C_in) = HA(HA(A,B),C_in)`, giving the sum of `A`, `B` and `C_in`,
810-
and `FA_C(A,B,C_in) = HA_C(A,B) or (C_in and HA_S(A,B))`, which states
812+
FA_S(A,B,C_in) = HA(HA(A,B),C_in), giving the sum of A, B and C_in,
813+
and FA_C(A,B,C_in) = HA_C(A,B) or (C_in and HA_S(A,B)), which states
811814
whether the result is too big to fit into a bit. Not that the full adder
812815
has an additional input for carries; in the following step, we will use it
813816
to chain the full adders together to compute the actual sum.
814817

815818
Using the helper variables we have above, we can give the sum of the four-bit
816-
numbers as `C_0 = FA_C(A_0,B_0,0) and C_1 = FA_C(A_1,B_1,C_0) and
817-
C_2 = FA_C(A_2,B_2,C_1) and C_3 = FA_C(A_3,B_3,C_2) and
818-
S_0 = FA_S(A_0,B_0,0) and S_1 = FA_S(A_1,B_1,C_0) and
819-
and S_2 = FA_S(A_2,B_2,C_1) and S_3 = FA_S(A_3,B_3,C_2)
820-
and S_3 = FA_S(0,0,C_3)`.
819+
numbers as
820+
> C_0 = FA_C(A_0,B_0,0) and C_1 = FA_C(A_1,B_1,C_0) and
821+
> C_2 = FA_C(A_2,B_2,C_1) and C_3 = FA_C(A_3,B_3,C_2) and
822+
> S_0 = FA_S(A_0,B_0,0) and S_1 = FA_S(A_1,B_1,C_0) and
823+
> and S_2 = FA_S(A_2,B_2,C_1) and S_3 = FA_S(A_3,B_3,C_2)
824+
> and S_3 = FA_S(0,0,C_3).
821825
822826
Other arithmetic operations on binary number can be expressed using propositional
823827
logic as well; the details can be found in the linked articles, as well
824828
as [Two's complement](https://en.wikipedia.org/wiki/Two%27s_complement) for
825829
handling signed integers and [IEEE 754](https://en.wikipedia.org/wiki/IEEE_754)
826830
for floating point numbers.
827831

828-
In the following, we will simply write formulas such as `S=A+B`, with the
832+
In the following, we will simply write formulas such as S=A+B, with the
829833
understanding that this is internally represented using the appropriate bit
830834
vectors.
831835

@@ -842,14 +846,14 @@ int sum(int a, int b)
842846
}
843847
```
844848
To describe the behavior of this program, we introduce the appropriately-sized
845-
bit vectors `A` and `B`, and an additional helper vector `return`. The `A` and
846-
`B` bit vectors reflect the values of the parameters `a` and `b`, while
847-
`return` contains the return value of the function. As we have seen above, we
848-
can describe the value of `a+b` as `A+B` -- remember that this is an
849+
bit vectors A and B, and an additional helper vector return. The A and
850+
B bit vectors reflect the values of the parameters `a` and `b`, while
851+
return contains the return value of the function. As we have seen above, we
852+
can describe the value of `a+b` as A+B -- remember that this is an
849853
abbreviation for a moderately complex formula on bit vectors!
850854
851855
From the semantics of the `return` instruction, we know that this program will
852-
return the value `a+b`, so we can describe its behavior as `return = A+B`.
856+
return the value `a+b`, so we can describe its behavior as return = A+B.
853857
854858
Let us consider a slighly more complex program.
855859
```C
@@ -861,7 +865,7 @@ int calculate(int x)
861865
}
862866
```
863867
We again introduce several bit vectors. For the parameter `x`, we introduce a
864-
bit vector `X`, and for the return value, `return`. But we also have to deal
868+
bit vector X, and for the return value, return. But we also have to deal
865869
with the (local) variable `y`, which gets two assignments. Furthermore, we
866870
now have a program with three instructions.
867871

@@ -882,15 +886,15 @@ int calculate(int x.1)
882886
```
883887
In this form, we know that each variable is assigned to at most once.
884888
To capture the behavior of this program, we translate it statement-by-statement
885-
into a propositional formula. We introduce two bit vectors `Y1` and `Y2` to
886-
stand for `y.1` and `y.2` (we map `X` to `x.1` and `return` to the return
887-
value). `int y.1 = x.1 * x.1` becomes `Y1 = X * X`, `y.2 = y.1 + x.1` becomes
888-
`Y2 = Y1 + X` and `return y.2` becomes `return = Y2`.
889+
into a propositional formula. We introduce two bit vectors Y1 and Y2 to
890+
stand for `y.1` and `y.2` (we map X to `x.1` and return to the return
891+
value). `int y.1 = x.1 * x.1` becomes Y1 = X * X, `y.2 = y.1 + x.1` becomes
892+
Y2 = Y1 + X and `return y.2` becomes return = Y2.
889893
890894
To tie the three formulas together into a description of the while program,
891895
we note that the three instructions form a single basic block, so we know they
892896
are always executed as a unit. In this case, it is sufficient to simple connect
893-
them with `and`: `Y1 = X * X and Y2 = Y1 + X and return = Y2`. Note that this
897+
them with ''and'': Y1 = X * X and Y2 = Y1 + X and return = Y2. Note that this
894898
propositional formula does not actually describe the order of execution of the
895899
statements, but simply summarizes their outcomes! Once we have non-trivial
896900
control flow, we have to do some extra work in this model.
@@ -908,7 +912,7 @@ int max(int a, int b)
908912
return result;
909913
}
910914
```
911-
Bringing this into SSA form, we have the following program:
915+
Bringing this into SSA form, we have the following program (we write `Phi` for &Phi;):
912916
```C
913917
int max(int a, int b)
914918
{
@@ -917,19 +921,19 @@ int max(int a, int b)
917921
result.1 = b;
918922
else
919923
result.2 = a;
920-
return &Phi;(result.1,result.2);
924+
return Phi(result.1,result.2);
921925
}
922926
```
923-
We again introduce bit vectors `A` (for parameter `a`), `B` (for parameters `b`),
924-
`R1` (for `result.1`), `R2` (for `result.2`) and
925-
`return` (for the return value). The interesting question in this case is
927+
We again introduce bit vectors A (for parameter `a`), B (for parameters `b`),
928+
R1 (for `result.1`), R2 (for `result.2`) and
929+
return (for the return value). The interesting question in this case is
926930
how we can handle the &Phi; node: so far, it is a ''magic'' operator that selects
927931
the correct value.
928932
929933
As a first step, we modify the SSA form slightly by introducing an additional
930-
propositional variable `C` that tracks which branch of the `if` was taken.
934+
propositional variable C that tracks which branch of the `if` was taken.
931935
We call this variabel the *code guard variable*, or *guard* for short.
932-
Additionally, we add `C` to the &Phi; node as a new first parameter, describing
936+
Additionally, we add C to the &Phi; node as a new first parameter, describing
933937
which input to use as a result.
934938
The corresponding program looks something like this:
935939
```C
@@ -942,39 +946,39 @@ int max(int a, int b)
942946
result.1 = b;
943947
else
944948
result.2 = a;
945-
return &Phi;(C,result.1,result.2);
949+
return Phi(C,result.1,result.2);
946950
}
947951
```
948952
For the encoding of the program, we introduce a new propositional junctor,
949-
&Rarr;, where `A &Rarr; B` is equivalent to `(not A) or B`. It can be understood
950-
as ''if `A` holds, `B` must hold as well''.
953+
&rArr;, where ''A &rArr; B'' is equivalent to ''(not A) or B''.
954+
It can be understood as ''if A holds, B must hold as well''.
951955

952956
With these ingredients, we can encode the program. First of all, we translate
953957
the basic statements of the program:
954-
- `C = a<b` maps to `C = A<B`, for an appropriate formula `A<B`.
955-
- `result.1 = b` becomes `R1 = B`, and `result.2 = a` becomes `R2 = A`.
958+
- `C = a<b` maps to C = A&lt;B, for an appropriate formula A&lt;B.
959+
- `result.1 = b` becomes R1 = B, and `result.2 = a` becomes R2 = A.
956960

957961
To handle the `if` statement, we simply make the execution of each branch
958-
conditional on `C` using the &Rarr; junctor:
962+
conditional on C using the &rArr; junctor:
959963
```C
960964
if (C)
961965
result.1 = b;
962966
else
963967
result.2 = a;
964968
```
965-
becomes `(C &Rarr; R1 = B) and ((not C) &Rarr; R2 = A)`, stating that
966-
the equation for the first assignment holds when `C` is true, and that for
967-
the second assignment holds when `C` is false.
969+
becomes (C &rArr; R1 = B) and ((not C) &rArr; R2 = A), stating that
970+
the equation for the first assignment holds when C is true, and that for
971+
the second assignment holds when C is false.
968972

969-
Finally, the &Phi; node is again resolved using the &Rarr; junctor: we
973+
Finally, the &Phi; node is again resolved using the &rArr; junctor: we
970974
can encode the `return` statement as
971-
`(C &Rarr; return = R1) and ((not C) &Rarr; return = R2)`.
975+
(C &rArr; return = R1) and ((not C) &rArr; return = R2).
972976

973977
At this point, it remains to tie the statements together; we find that we can
974978
again simply connect them with ''and'', since the statements are always executed
975979
in sequence. We get:
976-
`C = a<b and (C &Rarr; R1 = B) and (C &Rarr; return = R1) and
977-
((not C) &Rarr; R2 = A) and ((not C) &Rarr; return = R2)`.
980+
> C = a&lt;b and (C &rArr; R1 = B) and (C &rArr; return = R1) and
981+
> ((not C) &rArr; R2 = A) and ((not C) &rArr; return = R2).
978982
979983
We can extend this approach quite straightforwardly to other constructs, but
980984
one obvious problem remains: We have not described how to handle loops. This
@@ -1079,48 +1083,48 @@ unsigned long factorial(unsigned n) {
10791083
}
10801084
}
10811085
}
1082-
return &Phi(C1, &Phi(C2, &Phi(C3, fac.4, fac.3), fac.2), fac.1);
1086+
return Phi(C1, Phi(C2, Phi(C3, fac.4, fac.3), fac.2), fac.1);
10831087
}
10841088
```
1085-
We translate `IGNORE` into the formula `false` - this will later allow
1089+
We translate `IGNORE` into the formula **false** - this will later allow
10861090
us to rule out all paths that reach this point.
10871091

10881092
The corresponding propositional formula can then be written as (check
10891093
that this is equivalent to the formula you would be getting by following
10901094
the translation procedure described above):
1091-
```
1092-
fac.1 = 1 and i.1 = 1 and C1 = i.1 <= n and
1093-
((not C1) &Rarr; return = fac.1) and
1094-
C1 &Rarr; (
1095-
fac.2 = fac.1 * i.1 and i.2 = i.1 + 1 and C2 = i.2 <= n and
1096-
((not C2) &Rarr; return = fac.2) and
1097-
C2 &Rarr; (
1098-
fac.3 = fac.2 * i.2 and i.3 = i.2 + 1 and C3 = i.3 <= n and
1099-
((not C3) &Rarr; return = fac.3) and
1100-
C3 &Rarr; (
1101-
fac.4 = fac.3 * i.3 and i.4 = i.3 + 1 and C4 = i.4 <= n and
1102-
((not C4) &Rarr; return = fac.4) and
1103-
(C4 &Rarr; false)
1104-
)
1105-
)
1106-
)
1107-
```
1108-
In the following, we reference this formula as `FA(n, result)`.
1095+
1096+
> fac.1 = 1 and i.1 = 1 and C1 = i.1 &lt;= n and
1097+
> ((not C1) &rArr; return = fac.1) and
1098+
> C1 &rArr; (
1099+
>> fac.2 = fac.1 * i.1 and i.2 = i.1 + 1 and C2 = i.2 &lt;= n and
1100+
>> ((not C2) &rArr; return = fac.2) and
1101+
>> C2 &rArr; (
1102+
>>> fac.3 = fac.2 * i.2 and i.3 = i.2 + 1 and C3 = i.3 &lt;= n and
1103+
>>> ((not C3) &rArr; return = fac.3) and
1104+
>>> C3 &rArr; (
1105+
>>>> fac.4 = fac.3 * i.3 and i.4 = i.3 + 1 and C4 = i.4 &lt;= n and
1106+
>>>> ((not C4) &rArr; return = fac.4) and
1107+
>>>> (C4 &rArr; false)
1108+
>>> )
1109+
>>> )
1110+
>> )
1111+
1112+
In the following, we reference this formula as FA(n, result).
11091113

11101114
At this point, we know how to encode programs as propositional formulas.
11111115
Our goal was to reason about programs, and in particular, to check whether
11121116
a certain property holds. Suppose, for example, that we want to check if there
11131117
is a way that the `factorial` function returns `6`. One way to do this is to
1114-
look at the following propositional formula: `FA(n, result) and result = 6`.
1118+
look at the following propositional formula: FA(n, result) and result = 6.
11151119
If this formula has a model (i.e., if we can find a satisfying assignment to
1116-
all variables, and in particular, to `n`), we can extract the required value
1120+
all variables, and in particular, to n), we can extract the required value
11171121
for the parameter `n` from that model. As we have discussed above, this can
11181122
be done using a SAT solver: If you run, say, MiniSAT on this formula, you will
1119-
get a model involving `n=3`.
1123+
get a model involving n=3.
11201124

11211125
Be aware that this method has very clear limits: We know that the factorial of
11221126
`5` is `120`, but with the formula above, evaluating
1123-
`FA(n, result) and result=120` would yield ''unsatisfiable''! This is because
1127+
''FA(n, result) and result=120'' would yield ''unsatisfiable''! This is because
11241128
we limited the number of loop iterations, and to reach 120, we have to execute
11251129
the loop more than three times.
11261130
That being said, for typical CPROVER use cases, we can often make do with a

0 commit comments

Comments
 (0)