Skip to content

Commit 943ca49

Browse files
committed
Add new JSON value assignment feature
This commit adds a new file containing a recursive algorithm which, given an exprt and a JSON representation of a value for the exprt, produces codet that assigns the exprt to that value. Some helper functions in other files are also added.
1 parent 9383137 commit 943ca49

10 files changed

+1034
-7
lines changed

jbmc/src/java_bytecode/Makefile

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,5 @@
1-
SRC = bytecode_info.cpp \
1+
SRC = assignments_from_json.cpp \
2+
bytecode_info.cpp \
23
character_refine_preprocess.cpp \
34
ci_lazy_methods.cpp \
45
ci_lazy_methods_needed.cpp \

jbmc/src/java_bytecode/assignments_from_json.cpp

Lines changed: 843 additions & 0 deletions
Large diffs are not rendered by default.
Lines changed: 119 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,119 @@
1+
/*******************************************************************\
2+
3+
Module: Assignments to values specified in JSON files
4+
5+
Author: Diffblue Ltd.
6+
7+
\*******************************************************************/
8+
9+
/// \file
10+
11+
#ifndef CPROVER_JAVA_BYTECODE_ASSIGNMENTS_FROM_JSON_H
12+
#define CPROVER_JAVA_BYTECODE_ASSIGNMENTS_FROM_JSON_H
13+
14+
#include <util/std_code.h>
15+
16+
/// Information to store when several references point to the same Java object.
17+
struct object_creation_referencet
18+
{
19+
/// Expression for the symbol that stores the value that may be reference
20+
/// equal to other values.
21+
exprt expr;
22+
23+
/// If `symbol` is an array, this expression stores its length.
24+
optionalt<symbol_exprt> array_length;
25+
};
26+
27+
class jsont;
28+
class symbol_table_baset;
29+
class ci_lazy_methods_neededt;
30+
31+
/// Given an expression \p expr representing a Java object or primitive and a
32+
/// JSON representation \p json of the value of a Java object or primitive of a
33+
/// compatible type, adds statements to \p block to assign \p expr to the
34+
/// deterministic value specified by \p json.
35+
/// The expected format of the JSON representation is mostly the same as that
36+
/// of the json-io serialization library (https://github.com/jdereg/json-io) if
37+
/// run with the following options, as of version 4.10.1:
38+
/// - A type name map with identity mappings such as
39+
/// `("java.lang.Boolean", "java.lang.Boolean")` for all primitive wrapper
40+
/// types, java.lang.Class, java.lang.String and java.util.Date. That is, we
41+
/// are not using the json-io default shorthands for those types.
42+
/// - `WRITE_LONGS_AS_STRINGS` should be set to `true` to avoid a loss of
43+
/// precision when representing longs.
44+
///
45+
/// Some examples of json-io representations that may not be obvious include:
46+
/// - The representation of a Java object generally may or may not contain a
47+
/// `"@type"` key. The value corresponding to such a key specifies the runtime
48+
/// type of the object (or the boxed type if the object is primitive). If no
49+
/// `"@type"` key is present, it is assumed that the runtime type is the same
50+
/// as the compile-time type. Most reference-typed objects are represented as
51+
/// JSON objects (i.e. key-value maps) either way, so the `"@type"` key is
52+
/// just an additional key in the map. However, primitive types, arrays and
53+
/// string types without a `"@type"` key are not represented as JSON objects.
54+
/// For example, "untyped" ints are just represented as e.g. 1234, i.e. a JSON
55+
/// number. The "typed" version of this int then becomes
56+
/// `{"@type":"java.lang.Integer","value":1234}`, i.e. a JSON object, with the
57+
/// original ("untyped") JSON number stored in the "value" entry. For arrays,
58+
/// the corresponding key is called "@items", not "value". Typed versions of
59+
/// primitive types are probably not necessary, but json-io will sometimes
60+
/// produce such values, which is why we support both typed and untyped
61+
/// versions.
62+
/// - The way we deal with reference-equal objects is that they all get assigned
63+
/// the same ID, and exactly one of them will have an `{"@id": some_ID}`
64+
/// entry, in addition to its usual representation. All the other objects with
65+
/// this ID are represented simply as `{"@ref": some_ID}`, with no further
66+
/// entries.
67+
///
68+
/// The above rule has the following exceptions:
69+
/// - It seems that strings are always printed in "primitive" representation by
70+
/// json-io, i.e.\ they are always JSON strings, and never JSON objects with
71+
/// a `@type` key. For cases where we don't know that an expression has a
72+
/// string type (e.g.\ if its type is generic and specialized to
73+
/// java.lang.String), we need to sometimes represent strings as JSON objects
74+
/// with a `@type` key. In this case, the content of the string will be the
75+
/// value associated with a `value` key (similarly to StringBuilder in
76+
/// json-io). See \ref get_untyped_string.
77+
/// - json-io does not include the `ordinal` field of enums in its
78+
/// representation, but our algorithm depends on it being present. It may be
79+
/// possible to rewrite parts of it to set the ordinal depending on the order
80+
/// of elements seen in the `$VALUES` array, but it would generally make
81+
/// things more complicated.
82+
///
83+
/// For examples of JSON representations of objects, see the regression tests
84+
/// for this feature in
85+
/// jbmc/regression/jbmc/deterministic_assignments_json.
86+
///
87+
/// Known limitations:
88+
/// - If two reference-equal objects are defined in the same function, they are
89+
/// correctly assigned the same value.
90+
/// However, the case where they are defined in two different functions is not
91+
/// supported. The object that is stored as a `{"@ref":1}` or similar will
92+
/// generally either point to a freshly allocated symbol or an out-of-scope
93+
/// symbol. The `{"@id":1}` (or similar) object may be assigned correctly, or
94+
/// it may point to an out-of-scope symbol. This is because the symbol for the
95+
/// shared value is currently allocated dynamically. To fix this limitation,
96+
/// static allocation would have to be used instead, together with a static
97+
/// boolean to keep track of whether or not the symbol has been allocated
98+
/// already.
99+
/// - The special floating-point values NaN and positive/negative infinity are
100+
/// not supported. Note that in json-io 4.10.1, these are printed as "null".
101+
/// Future versions of json-io will support these values, and this function
102+
/// should be consistent with that if possible.
103+
/// - json-io prints \\uFFFF as a single character, which is not read correctly
104+
/// by the JSON parser.
105+
/// - Not all assignments have source locations, and those that do only link to
106+
/// a function, not a line number.
107+
///
108+
/// For parameter documentation, see \ref det_creation_infot.
109+
void assign_from_json(
110+
const exprt &expr,
111+
const jsont &json,
112+
const irep_idt &function_id,
113+
code_blockt &block,
114+
symbol_table_baset &symbol_table,
115+
optionalt<ci_lazy_methods_neededt> &needed_lazy_methods,
116+
size_t max_user_array_length,
117+
std::unordered_map<std::string, object_creation_referencet> &references);
118+
119+
#endif // CPROVER_JAVA_BYTECODE_ASSIGNMENTS_FROM_JSON_H

jbmc/src/java_bytecode/java_bytecode_language.cpp

Lines changed: 7 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1110,7 +1110,13 @@ bool java_bytecode_languaget::convert_single_method(
11101110
declaring_class(symbol_table.lookup_ref(function_id));
11111111
INVARIANT(class_name, "json_clinit must be declared by a class.");
11121112
writable_symbol.value = get_json_clinit_body(
1113-
*class_name, static_values_file, symbol_table, get_message_handler());
1113+
*class_name,
1114+
static_values_file,
1115+
symbol_table,
1116+
get_message_handler(),
1117+
needed_lazy_methods,
1118+
max_user_array_length,
1119+
references);
11141120
break;
11151121
}
11161122
case synthetic_method_typet::STUB_CLASS_STATIC_INITIALIZER:

jbmc/src/java_bytecode/java_bytecode_language.h

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -10,6 +10,7 @@ Author: Daniel Kroening, [email protected]
1010
#ifndef CPROVER_JAVA_BYTECODE_JAVA_BYTECODE_LANGUAGE_H
1111
#define CPROVER_JAVA_BYTECODE_JAVA_BYTECODE_LANGUAGE_H
1212

13+
#include "assignments_from_json.h"
1314
#include "ci_lazy_methods.h"
1415
#include "ci_lazy_methods_needed.h"
1516
#include "java_class_loader.h"
@@ -227,6 +228,11 @@ class java_bytecode_languaget:public languaget
227228
std::unordered_set<std::string> no_load_classes;
228229

229230
std::vector<load_extra_methodst> extra_methods;
231+
232+
/// Shared map to be used in all calls to \ref assign_from_json.
233+
/// It tracks objects that have been specified as reference-equal in the JSON
234+
/// file by mapping IDs of such objects to symbols that store their values.
235+
std::unordered_map<std::string, object_creation_referencet> references;
230236
};
231237

232238
std::unique_ptr<languaget> new_java_bytecode_language();

jbmc/src/java_bytecode/java_static_initializers.cpp

Lines changed: 14 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,7 @@ Author: Chris Smowton, [email protected]
77
\*******************************************************************/
88

99
#include "java_static_initializers.h"
10+
#include "assignments_from_json.h"
1011
#include "java_object_factory.h"
1112
#include "java_utils.h"
1213
#include <goto-programs/class_hierarchy.h>
@@ -755,7 +756,10 @@ code_blockt get_json_clinit_body(
755756
const irep_idt &class_id,
756757
const std::string &static_values_file,
757758
symbol_table_baset &symbol_table,
758-
message_handlert &message_handler)
759+
message_handlert &message_handler,
760+
optionalt<ci_lazy_methods_neededt> needed_lazy_methods,
761+
size_t max_user_array_length,
762+
std::unordered_map<std::string, object_creation_referencet> &references)
759763
{
760764
jsont json;
761765
if(
@@ -792,9 +796,15 @@ code_blockt get_json_clinit_body(
792796
code_blockt body;
793797
for(const auto &value_pair : static_field_values)
794798
{
795-
// TODO append code to `body` to assign value_pair.first to
796-
// TODO value_pair.second
797-
(void)value_pair;
799+
assign_from_json(
800+
value_pair.first,
801+
value_pair.second,
802+
clinit_function_name(class_id),
803+
body,
804+
symbol_table,
805+
needed_lazy_methods,
806+
max_user_array_length,
807+
references);
798808
}
799809
return body;
800810
}

jbmc/src/java_bytecode/java_static_initializers.h

Lines changed: 12 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -9,6 +9,8 @@ Author: Chris Smowton, [email protected]
99
#ifndef CPROVER_JAVA_BYTECODE_JAVA_STATIC_INITIALIZERS_H
1010
#define CPROVER_JAVA_BYTECODE_JAVA_STATIC_INITIALIZERS_H
1111

12+
#include "assignments_from_json.h"
13+
#include "ci_lazy_methods_needed.h"
1214
#include "java_object_factory_parameters.h"
1315
#include "select_pointer_type.h"
1416
#include "synthetic_methods_map.h"
@@ -62,12 +64,21 @@ code_ifthenelset get_clinit_wrapper_body(
6264
/// values are maps from static field names to values.
6365
/// \param symbol_table: used to look up and create new symbols
6466
/// \param message_handler: used to log any errors with parsing the JSON
67+
/// \param needed_lazy_methods: used to mark any runtime types given in the JSON
68+
/// file as needed
69+
/// \param max_user_array_length: maximum value for constant or variable length
70+
/// arrays. Any arrays that were specified to be of nondeterministic length in
71+
/// the JSON file will be limited by this value.
72+
/// \param references: map to keep track of reference-equal objets.
6573
/// \return the body of the json_clinit function as a code block.
6674
code_blockt get_json_clinit_body(
6775
const irep_idt &class_id,
6876
const std::string &static_values_file,
6977
symbol_table_baset &symbol_table,
70-
message_handlert &message_handler);
78+
message_handlert &message_handler,
79+
optionalt<ci_lazy_methods_neededt> needed_lazy_methods,
80+
size_t max_user_array_length,
81+
std::unordered_map<std::string, object_creation_referencet> &references);
7182

7283
class stub_global_initializer_factoryt
7384
{

jbmc/src/java_bytecode/java_utils.cpp

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -183,6 +183,18 @@ irep_idt resolve_friendly_method_name(
183183
}
184184
}
185185

186+
pointer_typet pointer_to_replacement_type(
187+
const pointer_typet &given_pointer_type,
188+
const java_class_typet &replacement_class_type)
189+
{
190+
if(can_cast_type<struct_tag_typet>(given_pointer_type.subtype()))
191+
{
192+
struct_tag_typet struct_tag_subtype{replacement_class_type.get_name()};
193+
return pointer_type(struct_tag_subtype);
194+
}
195+
return pointer_type(replacement_class_type);
196+
}
197+
186198
dereference_exprt checked_dereference(const exprt &expr)
187199
{
188200
dereference_exprt result(expr);

jbmc/src/java_bytecode/java_utils.h

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -77,6 +77,13 @@ irep_idt resolve_friendly_method_name(
7777
const symbol_table_baset &symbol_table,
7878
std::string &error);
7979

80+
/// Given a pointer type to a Java class and a type representing a more specific
81+
/// Java class, return a pointer type to the more specific class with the same
82+
/// structure as the original pointer type.
83+
pointer_typet pointer_to_replacement_type(
84+
const pointer_typet &given_pointer_type,
85+
const java_class_typet &replacement_class_type);
86+
8087
/// Dereference an expression and flag it for a null-pointer check
8188
/// \param expr: expression to dereference and check
8289
dereference_exprt checked_dereference(const exprt &expr);

src/util/json.h

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -446,4 +446,16 @@ inline const json_objectt &to_json_object(const jsont &json)
446446
return static_cast<const json_objectt &>(json);
447447
}
448448

449+
inline json_stringt &to_json_string(jsont &json)
450+
{
451+
PRECONDITION(json.kind == jsont::kindt::J_STRING);
452+
return static_cast<json_stringt &>(json);
453+
}
454+
455+
inline const json_stringt &to_json_string(const jsont &json)
456+
{
457+
PRECONDITION(json.kind == jsont::kindt::J_STRING);
458+
return static_cast<const json_stringt &>(json);
459+
}
460+
449461
#endif // CPROVER_UTIL_JSON_H

0 commit comments

Comments
 (0)