@@ -778,54 +778,58 @@ unsigned long fac = 1;
778
778
Now, from the C standard and the most common C ABIs, we know that internally,
779
779
` fac ` will be represented as a binary number with 64 bits. So, if we wish to
780
780
reason about the contents of the variable ` fac ` , we might as well represent it
781
- as a vector of 64 propositional variables, say ` fac_0 ` to ` fac_{63} ` , where
782
- ` fac = 2^{63} fac_{63} + ... + 2^0 fac_0 ` . We can then assert that ` fac=1 ` using
781
+ as a vector of 64 propositional variables, say ` fac ` <sub >0</sub > to
782
+ ` fac ` <sub >63</sub >, where
783
+ ` fac ` = 2<sup >63</sup > ` fac ` <sub >63</sub > + ... + 2<sup >0</sup > ` fac ` <sub >0</sub >.
784
+ We can then assert that ` fac ` =1 using
783
785
the propsitional formula
784
- ` fac_{63} = 0 and ... and fac_1 = 0 and fac_0 = 1` , where we define the formula
785
- ` A = B ` as ` (A and B) or ((not A) and (not B)) ` .
786
+ ` fac ` < sub >63</ sub > = 0 and ... and ` fac ` < sub >1</ sub > = 0 and ` fac ` < sub >0</ sub > = 1,
787
+ where we define the formula A = B as '' (A and B) or ((not A) and (not B))'' .
786
788
787
789
We call this a * bit vector* representation. Compare the Wikipedia page on
788
790
[ Binary numbers] ( https://en.wikipedia.org/wiki/Binary_number ) .
789
791
790
792
Bit vector representations can also be used to describe operations on binary
791
- numbers. For instance, suppose we have two four-bit numbers ` A_3, ... A_0 `
792
- (representing a number ` A ` ) and ` B_3, ..., B_0 ` (representing a number ` b ` )
793
+ numbers. For instance, suppose we have two four-bit numbers A_3, ... A_0
794
+ (representing a number A ) and B_3, ..., B_0 (representing a number b )
793
795
and wish to add them. As detailed on the page on
794
796
[ Binary adders] ( https://en.wikipedia.org/wiki/Adder_(electronics) ) ,
795
- we define three additional bit vectors, the * carries* ` C_0, ..., C_3 ` ,
796
- the * partial sum* ` P_0, ..., P_4 ` and
797
- the * sum* ` S_0, ..., S_4 ` , representing a number ` S ` such that ` S=A+B ` .
797
+ we define three additional bit vectors, the * carries* C<sub >0</sub >, ..., C<sub >3</sub >,
798
+ the * partial sum* P<sub >0</sub >, ..., P<sub >4</sub > and
799
+ the * sum* S<sub >0</sub >, ..., S<sub >4</sub > , representing a number S such that
800
+ S=A+B.
798
801
Note that the sum vector has one bit more - why? How
799
802
is this related to arithmetic overflow in C?
800
803
801
804
For convenience, we first define the * half-adder* . This is given as two
802
- formulas, ` HA_S(A,B) = (A and not B) or (B and not A) ` , which gives the
803
- sum of the bits ` A ` and ` B ` , and ` HA_C(A,B) = A and B ` , which indicates
805
+ formulas, HA_S(A,B) = (A and not B) or (B and not A), which gives the
806
+ sum of the bits A and B , and HA_C(A,B) = A and B, which indicates
804
807
whether the result of the sum was too big to fit into one bit (so we carry
805
808
a one to the next bit).
806
809
807
810
Using the half-adder formulas, we can define the * full adder* , again given as
808
811
two formulas, one for sum and the other for carry. We have
809
- ` FA_S(A,B,C_in) = HA(HA(A,B),C_in) ` , giving the sum of ` A ` , ` B ` and ` C_in ` ,
810
- and ` FA_C(A,B,C_in) = HA_C(A,B) or (C_in and HA_S(A,B)) ` , which states
812
+ FA_S(A,B,C_in) = HA(HA(A,B),C_in), giving the sum of A, B and C_in,
813
+ and FA_C(A,B,C_in) = HA_C(A,B) or (C_in and HA_S(A,B)), which states
811
814
whether the result is too big to fit into a bit. Not that the full adder
812
815
has an additional input for carries; in the following step, we will use it
813
816
to chain the full adders together to compute the actual sum.
814
817
815
818
Using the helper variables we have above, we can give the sum of the four-bit
816
- numbers as `C_0 = FA_C(A_0,B_0,0) and C_1 = FA_C(A_1,B_1,C_0) and
817
- C_2 = FA_C(A_2,B_2,C_1) and C_3 = FA_C(A_3,B_3,C_2) and
818
- S_0 = FA_S(A_0,B_0,0) and S_1 = FA_S(A_1,B_1,C_0) and
819
- and S_2 = FA_S(A_2,B_2,C_1) and S_3 = FA_S(A_3,B_3,C_2)
820
- and S_3 = FA_S(0,0,C_3)`.
819
+ numbers as
820
+ > C_0 = FA_C(A_0,B_0,0) and C_1 = FA_C(A_1,B_1,C_0) and
821
+ > C_2 = FA_C(A_2,B_2,C_1) and C_3 = FA_C(A_3,B_3,C_2) and
822
+ > S_0 = FA_S(A_0,B_0,0) and S_1 = FA_S(A_1,B_1,C_0) and
823
+ > and S_2 = FA_S(A_2,B_2,C_1) and S_3 = FA_S(A_3,B_3,C_2)
824
+ > and S_3 = FA_S(0,0,C_3).
821
825
822
826
Other arithmetic operations on binary number can be expressed using propositional
823
827
logic as well; the details can be found in the linked articles, as well
824
828
as [ Two's complement] ( https://en.wikipedia.org/wiki/Two%27s_complement ) for
825
829
handling signed integers and [ IEEE 754] ( https://en.wikipedia.org/wiki/IEEE_754 )
826
830
for floating point numbers.
827
831
828
- In the following, we will simply write formulas such as ` S=A+B ` , with the
832
+ In the following, we will simply write formulas such as S=A+B, with the
829
833
understanding that this is internally represented using the appropriate bit
830
834
vectors.
831
835
@@ -842,14 +846,14 @@ int sum(int a, int b)
842
846
}
843
847
```
844
848
To describe the behavior of this program, we introduce the appropriately-sized
845
- bit vectors `A` and `B` , and an additional helper vector ` return` . The `A` and
846
- `B` bit vectors reflect the values of the parameters `a` and `b`, while
847
- ` return` contains the return value of the function. As we have seen above, we
848
- can describe the value of `a+b` as ` A+B` -- remember that this is an
849
+ bit vectors A and B , and an additional helper vector return. The A and
850
+ B bit vectors reflect the values of the parameters `a` and `b`, while
851
+ return contains the return value of the function. As we have seen above, we
852
+ can describe the value of `a+b` as A+B -- remember that this is an
849
853
abbreviation for a moderately complex formula on bit vectors!
850
854
851
855
From the semantics of the `return` instruction, we know that this program will
852
- return the value `a+b`, so we can describe its behavior as ` return = A+B` .
856
+ return the value `a+b`, so we can describe its behavior as return = A+B.
853
857
854
858
Let us consider a slighly more complex program.
855
859
```C
@@ -861,7 +865,7 @@ int calculate(int x)
861
865
}
862
866
```
863
867
We again introduce several bit vectors. For the parameter ` x ` , we introduce a
864
- bit vector ` X ` , and for the return value, ` return ` . But we also have to deal
868
+ bit vector X , and for the return value, return. But we also have to deal
865
869
with the (local) variable ` y ` , which gets two assignments. Furthermore, we
866
870
now have a program with three instructions.
867
871
@@ -882,15 +886,15 @@ int calculate(int x.1)
882
886
```
883
887
In this form, we know that each variable is assigned to at most once.
884
888
To capture the behavior of this program, we translate it statement-by-statement
885
- into a propositional formula. We introduce two bit vectors `Y1` and `Y2` to
886
- stand for `y.1` and `y.2` (we map `X` to `x.1` and ` return` to the return
887
- value). `int y.1 = x.1 * x.1` becomes ` Y1 = X * X` , `y.2 = y.1 + x.1` becomes
888
- ` Y2 = Y1 + X` and `return y.2` becomes ` return = Y2` .
889
+ into a propositional formula. We introduce two bit vectors Y1 and Y2 to
890
+ stand for `y.1` and `y.2` (we map X to `x.1` and return to the return
891
+ value). `int y.1 = x.1 * x.1` becomes Y1 = X * X, `y.2 = y.1 + x.1` becomes
892
+ Y2 = Y1 + X and `return y.2` becomes return = Y2.
889
893
890
894
To tie the three formulas together into a description of the while program,
891
895
we note that the three instructions form a single basic block, so we know they
892
896
are always executed as a unit. In this case, it is sufficient to simple connect
893
- them with ` and`: ` Y1 = X * X and Y2 = Y1 + X and return = Y2` . Note that this
897
+ them with '' and'': Y1 = X * X and Y2 = Y1 + X and return = Y2. Note that this
894
898
propositional formula does not actually describe the order of execution of the
895
899
statements, but simply summarizes their outcomes! Once we have non-trivial
896
900
control flow, we have to do some extra work in this model.
@@ -908,7 +912,7 @@ int max(int a, int b)
908
912
return result;
909
913
}
910
914
```
911
- Bringing this into SSA form, we have the following program:
915
+ Bringing this into SSA form, we have the following program (we write ` Phi ` for & Phi ; ) :
912
916
``` C
913
917
int max (int a, int b)
914
918
{
@@ -917,19 +921,19 @@ int max(int a, int b)
917
921
result.1 = b;
918
922
else
919
923
result.2 = a;
920
- return & Phi ; (result.1,result.2);
924
+ return Phi(result.1,result.2);
921
925
}
922
926
```
923
- We again introduce bit vectors `A` (for parameter `a`), `B` (for parameters `b`),
924
- `R1` (for `result.1`), `R2` (for `result.2`) and
925
- ` return` (for the return value). The interesting question in this case is
927
+ We again introduce bit vectors A (for parameter `a`), B (for parameters `b`),
928
+ R1 (for `result.1`), R2 (for `result.2`) and
929
+ return (for the return value). The interesting question in this case is
926
930
how we can handle the Φ node: so far, it is a ''magic'' operator that selects
927
931
the correct value.
928
932
929
933
As a first step, we modify the SSA form slightly by introducing an additional
930
- propositional variable `C` that tracks which branch of the `if` was taken.
934
+ propositional variable C that tracks which branch of the `if` was taken.
931
935
We call this variabel the *code guard variable*, or *guard* for short.
932
- Additionally, we add `C` to the Φ node as a new first parameter, describing
936
+ Additionally, we add C to the Φ node as a new first parameter, describing
933
937
which input to use as a result.
934
938
The corresponding program looks something like this:
935
939
```C
@@ -942,39 +946,39 @@ int max(int a, int b)
942
946
result.1 = b;
943
947
else
944
948
result.2 = a;
945
- return & Phi; (C,result.1,result.2);
949
+ return Phi(C,result.1,result.2);
946
950
}
947
951
```
948
952
For the encoding of the program, we introduce a new propositional junctor,
949
- &Rarr ; , where ` A &Rarr ; B ` is equivalent to ` (not A) or B ` . It can be understood
950
- as ''if ` A ` holds, ` B ` must hold as well''.
953
+ &rArr ; , where '' A &rArr ; B'' is equivalent to '' (not A) or B''.
954
+ It can be understood as ''if A holds, B must hold as well''.
951
955
952
956
With these ingredients, we can encode the program. First of all, we translate
953
957
the basic statements of the program:
954
- - ` C = a<b ` maps to ` C = A<B ` , for an appropriate formula ` A<B ` .
955
- - ` result.1 = b ` becomes ` R1 = B ` , and ` result.2 = a ` becomes ` R2 = A ` .
958
+ - ` C = a<b ` maps to C = A& lt ; B , for an appropriate formula A & lt ; B .
959
+ - ` result.1 = b ` becomes R1 = B, and ` result.2 = a ` becomes R2 = A.
956
960
957
961
To handle the ` if ` statement, we simply make the execution of each branch
958
- conditional on ` C ` using the &Rarr ; junctor:
962
+ conditional on C using the &rArr ; junctor:
959
963
``` C
960
964
if (C)
961
965
result.1 = b;
962
966
else
963
967
result.2 = a;
964
968
```
965
- becomes ` (C &Rarr ; R1 = B) and ((not C) &Rarr ; R2 = A) ` , stating that
966
- the equation for the first assignment holds when ` C ` is true, and that for
967
- the second assignment holds when ` C ` is false.
969
+ becomes (C &rArr ; R1 = B) and ((not C) &rArr ; R2 = A), stating that
970
+ the equation for the first assignment holds when C is true, and that for
971
+ the second assignment holds when C is false.
968
972
969
- Finally, the &Phi ; node is again resolved using the &Rarr ; junctor: we
973
+ Finally, the &Phi ; node is again resolved using the &rArr ; junctor: we
970
974
can encode the ` return ` statement as
971
- ` (C &Rarr ; return = R1) and ((not C) &Rarr ; return = R2) ` .
975
+ (C &rArr ; return = R1) and ((not C) &rArr ; return = R2).
972
976
973
977
At this point, it remains to tie the statements together; we find that we can
974
978
again simply connect them with ''and'', since the statements are always executed
975
979
in sequence. We get:
976
- ` C = a< b and (C &Rarr ; R1 = B) and (C &Rarr ; return = R1) and
977
- ((not C) &Rarr ; R2 = A) and ((not C) &Rarr ; return = R2)` .
980
+ > C = a& lt ; b and (C &rArr ; R1 = B) and (C &rArr ; return = R1) and
981
+ > ((not C) &rArr ; R2 = A) and ((not C) &rArr ; return = R2).
978
982
979
983
We can extend this approach quite straightforwardly to other constructs, but
980
984
one obvious problem remains: We have not described how to handle loops. This
@@ -1079,48 +1083,48 @@ unsigned long factorial(unsigned n) {
1079
1083
}
1080
1084
}
1081
1085
}
1082
- return & Phi(C1, & Phi(C2, & Phi(C3, fac.4, fac.3), fac.2), fac.1);
1086
+ return Phi(C1, Phi(C2, Phi(C3, fac.4, fac.3), fac.2), fac.1);
1083
1087
}
1084
1088
```
1085
- We translate ` IGNORE ` into the formula ` false ` - this will later allow
1089
+ We translate ` IGNORE ` into the formula ** false** - this will later allow
1086
1090
us to rule out all paths that reach this point.
1087
1091
1088
1092
The corresponding propositional formula can then be written as (check
1089
1093
that this is equivalent to the formula you would be getting by following
1090
1094
the translation procedure described above):
1091
- ```
1092
- fac.1 = 1 and i.1 = 1 and C1 = i.1 < = n and
1093
- ((not C1) &Rarr ; return = fac.1) and
1094
- C1 &Rarr ; (
1095
- fac.2 = fac.1 * i.1 and i.2 = i.1 + 1 and C2 = i.2 < = n and
1096
- ((not C2) &Rarr ; return = fac.2) and
1097
- C2 &Rarr ; (
1098
- fac.3 = fac.2 * i.2 and i.3 = i.2 + 1 and C3 = i.3 < = n and
1099
- ((not C3) &Rarr ; return = fac.3) and
1100
- C3 &Rarr ; (
1101
- fac.4 = fac.3 * i.3 and i.4 = i.3 + 1 and C4 = i.4 < = n and
1102
- ((not C4) &Rarr ; return = fac.4) and
1103
- (C4 &Rarr ; false)
1104
- )
1105
- )
1106
- )
1107
- ```
1108
- In the following, we reference this formula as ` FA(n, result) ` .
1095
+
1096
+ > fac.1 = 1 and i.1 = 1 and C1 = i.1 & lt ; = n and
1097
+ > ((not C1) &rArr ; return = fac.1) and
1098
+ > C1 &rArr ; (
1099
+ > > fac.2 = fac.1 * i.1 and i.2 = i.1 + 1 and C2 = i.2 & lt ; = n and
1100
+ > > ((not C2) &rArr ; return = fac.2) and
1101
+ > > C2 &rArr ; (
1102
+ > > > fac.3 = fac.2 * i.2 and i.3 = i.2 + 1 and C3 = i.3 < = n and
1103
+ > > > ((not C3) &rArr ; return = fac.3) and
1104
+ > > > C3 &rArr ; (
1105
+ > > > > fac.4 = fac.3 * i.3 and i.4 = i.3 + 1 and C4 = i.4 < = n and
1106
+ > > > > ((not C4) &rArr ; return = fac.4) and
1107
+ > > > > (C4 &rArr ; false)
1108
+ > > > )
1109
+ > > > )
1110
+ > > )
1111
+
1112
+ In the following, we reference this formula as FA(n, result).
1109
1113
1110
1114
At this point, we know how to encode programs as propositional formulas.
1111
1115
Our goal was to reason about programs, and in particular, to check whether
1112
1116
a certain property holds. Suppose, for example, that we want to check if there
1113
1117
is a way that the ` factorial ` function returns ` 6 ` . One way to do this is to
1114
- look at the following propositional formula: ` FA(n, result) and result = 6 ` .
1118
+ look at the following propositional formula: FA(n, result) and result = 6.
1115
1119
If this formula has a model (i.e., if we can find a satisfying assignment to
1116
- all variables, and in particular, to ` n ` ), we can extract the required value
1120
+ all variables, and in particular, to n ), we can extract the required value
1117
1121
for the parameter ` n ` from that model. As we have discussed above, this can
1118
1122
be done using a SAT solver: If you run, say, MiniSAT on this formula, you will
1119
- get a model involving ` n=3 ` .
1123
+ get a model involving n=3.
1120
1124
1121
1125
Be aware that this method has very clear limits: We know that the factorial of
1122
1126
` 5 ` is ` 120 ` , but with the formula above, evaluating
1123
- ` FA(n, result) and result=120 ` would yield ''unsatisfiable''! This is because
1127
+ '' FA(n, result) and result=120'' would yield ''unsatisfiable''! This is because
1124
1128
we limited the number of loop iterations, and to reach 120, we have to execute
1125
1129
the loop more than three times.
1126
1130
That being said, for typical CPROVER use cases, we can often make do with a
0 commit comments