Skip to content

[TG-7706] Add new JSON value assignment feature for static fields #4631

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
May 24, 2019

Conversation

antlechner
Copy link
Contributor

This new feature allows us to read values from a JSON file and use them in assignments in JBMC. This is currently used to replace the static initialiser, clinit, with a synthetic function json_clinit that is equivalent but only contains assignments to static fields. This only happens when the option --static-values is given. Without this option, there is no change in behaviour.

I'm not sure if it is necessary to include the core models in all the regression tests; it looks like they are required for enums, Strings and inheritance from Object. I can investigate this further but it should not delay reviews.

@allredj and @jeannielynnmoulton offered to review this PR. A code owner will also have to approve of course, but maybe the code owner review can be less in-depth after two full reviews.

  • Each commit message has a non-empty body, explaining why the change was made.
  • Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • n/a White-space or formatting changes outside the feature-related changed lines are in commits of their own.

@antlechner antlechner force-pushed the antonia/json-value-retriever branch from 1f00acc to b2879d4 Compare May 9, 2019 10:03
@antlechner antlechner force-pushed the antonia/json-value-retriever branch from b2879d4 to 130fd2d Compare May 9, 2019 11:47
@antlechner antlechner added Needs TG approval 🦉 Only merge with explicit approval from test-gen maintainers and removed do not merge labels May 9, 2019
@antlechner antlechner force-pushed the antonia/json-value-retriever branch from 130fd2d to 25df4f1 Compare May 9, 2019 13:52
@antlechner
Copy link
Contributor Author

The AWS "cbmc" build is failing (but showing up as not starting in Github for some reason) because this is not yet based on #4625. I will rebase this PR when the test-gen -> cbmc pointer has been advanced, as this PR requires some changes in TG too.

Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed up to "Refactor allocate_dynamic_object into two functions "

@antlechner antlechner force-pushed the antonia/json-value-retriever branch from 25df4f1 to 1ce80c1 Compare May 9, 2019 17:58
@antlechner
Copy link
Contributor Author

@allredj I addressed all of your comments so far. 👍

Copy link
Contributor

@jeannielynnmoulton jeannielynnmoulton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would like to see the std::stoi comment tested and addressed. There is maybe some room for simplification in some places that I suggested, but these are not blocking.

Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stunning!
A few comments and questions, but I can't find much to complain about.
I can't wait to see this in action on our benchmarks!

"OtherClass":{
"otherField":{"@ref":1}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great to include the json-io command line that generated that file. And also the json-io version. Maybe in a README in this folder.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The version is mentioned in assignments_from_json.h but I can mention it again in the README. I did use a script to generate this file but it's messy and not really merge-able, and I did some manual changes after running it... I thought we decided that the script wasn't necessary for this PR? But if you think it would be useful, I'll tidy it up and add it to this commit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think having the version here would be good. It's all about understanding where this file comes from, and how we can reproduce it. It's unfortunate that you had to do some manual tweaks. What is the extend of those?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a script to generate something very close to clinit-state.json - minus some newlines, as the enum/String custom writers don't support pretty printing. I documented this in the README. The version of json-io is in the pom file of the newly added Maven project (for the script).

@tautschnig
Copy link
Collaborator

@antlechner Could you please sync with @hannes-steffenhagen-diffblue and @chrisr-diffblue (and possibly others) about the following: isn't this repeating some effort that is going into goto-harness? Specifically, in goto-harness there already is support for reading a JSON memory snapshot. I acknowledge that goto-harness wouldn't work with Java, because we assume goto binaries to exist. But using JSON to represent initial values seems like a commonality?

@antlechner antlechner force-pushed the antonia/json-value-retriever branch from b9153f0 to 93deef3 Compare May 17, 2019 17:48
@@ -207,6 +209,7 @@ static void clinit_wrapper_do_recursive_calls(
const irep_idt &class_name,
code_blockt &init_body,
const bool nondet_static,
const bool use_json_clinit,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like json in the name as the source format of the data is irrelevant; it could as well be read from an XML file or whatever.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I renamed this to replace_clinit. Similarly for other variable names, see my reply to @smowton 's comment below.

@@ -27,6 +27,9 @@ enum class synthetic_method_typet
/// These are generated for both user and stub types, to ensure the actual
/// static initializer is only run once on any given path.
STATIC_INITIALIZER_WRAPPER,
/// Only exists if the `--static-values` option was used.
/// TODO Currently has no content, future commits will add it.
JSON_STATIC_INITIALIZER,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

STATIC_INITIALIZER_VALUES ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now called USER_SPECIFIED_STATIC_INITIALIZER, for consistency with the (new) name of the new function symbol.

if(!json.is_object())
return {};
const auto &json_object = to_json_object(json);
if(json_object.find("@type") == json_object.end())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean the @? It's directly taken from the json-io library, and in the Google style guide it says "JSON maps can use any Unicode character in key names."

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depends on whether it is considered a key in a map or a property name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

property name - the name (or key) portion of the property.

Looks like keys and property names are the same thing?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not quite. A key is itself a piece of data, eg a parameter name that is mapped to its type, whereas a property name is a name for a piece of data, eg 'zipCode' in an address object.

@@ -10,11 +10,12 @@ Author: Chris Smowton, [email protected]
#include "java_object_factory.h"
#include "java_utils.h"
#include <goto-programs/class_hierarchy.h>
#include <util/std_types.h>
#include <util/std_expr.h>
#include <json/json_parser.h>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer if the use of JSON was limited to the assignments_from_json class.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JSON is now used/referenced only in assignments_from_json and the get_user_specified_clinit_body function in this file. The latter can be changed to allow other formats if/when we support them.

@antlechner antlechner force-pushed the antonia/json-value-retriever branch from 93deef3 to 42b70d3 Compare May 20, 2019 14:48
@antlechner
Copy link
Contributor Author

The only bit not clear to me is: are the formats of the JSON files necessarily different?

@tautschnig I would have to go through the details of goto-harness to answer this question. The format I used in this PR is almost the same as what you get if you run JsonWriter.objectToJson from the json-io library on a map that stores field names and values. The output from other Java to JSON serialising libraries (e.g. Gson) is similar. So I think the format of JSON files for the new feature is close to what someone with experience in Java-to-JSON serialisation would expect. I will tidy up the script I used to generate the JSON, as requested in a review comment by @allredj , and add it to the regression test folder, to make it clearer why I chose this format (following a standard) and easier to try out the feature on other examples.

@antlechner antlechner force-pushed the antonia/json-value-retriever branch from 42b70d3 to d0e5031 Compare May 20, 2019 15:17
@antlechner
Copy link
Contributor Author

@smowton

  1. I renamed json_clinit to user_specified_clinit. Related variables are renamed as well, e.g. use_json_clinit -> replace_clinit, is_json_clinit_needed -> is_user_clinit_needed, json_clinit_name -> user_specified_clinit_name.
  2. I renamed everything starting with det_. For structs I replaced the prefix with object_, and for symbol names I replaced it with user_specified.

Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️
This PR failed Diffblue compatibility checks (cbmc commit: 42b70d3).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/112477579
Status will be re-evaluated on next push.
Common spurious failures include: the cbmc commit has disappeared in the mean time (e.g. in a force-push); the author is not in the list of contributors (e.g. first-time contributors); compatibility was already broken by an earlier merge.

Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️
This PR failed Diffblue compatibility checks (cbmc commit: d0e5031).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/112480809
Status will be re-evaluated on next push.
Common spurious failures include: the cbmc commit has disappeared in the mean time (e.g. in a force-push); the author is not in the list of contributors (e.g. first-time contributors); compatibility was already broken by an earlier merge.

@antlechner antlechner force-pushed the antonia/json-value-retriever branch from d0e5031 to 6a012bb Compare May 23, 2019 17:34
Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: 6a012bb).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/112988694

@antlechner antlechner changed the title Add new JSON value assignment feature for static fields [TG-7706] Add new JSON value assignment feature for static fields May 24, 2019

<properties>
<maven.compiler.source>1.8</maven.compiler.source>
<maven.compiler.target>1.8</maven.compiler.target>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ weird indent

output.write("\"ordinal\":" + enumm.ordinal());
for (Field field : enumm.getDeclaringClass().getDeclaredFields()) {
if ((field.getModifiers() & Modifier.STATIC) != Modifier.STATIC) {
output.write(",\"" + field.getName() + "\":");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ indent

antlechner added 11 commits May 24, 2019 13:58
The option expects the path to a JSON file as an argument, and the file
is expected to contain a map of maps: the first map uses class names as
keys, the inner maps use field names, and the values of the inner maps
are initial values of static fields.
This path is then stored in a new field of java_bytecode_languaget. From
here, we will be able to look for the file if it was specified, parse
it, and construct assignments that are equivalent to and replace the
clinit function.
The next commit will reuse most of this code for creating a new function
symbol.
This function symbol is created in the same way as the one for
clinit_wrapper.
In this commit, the function has an empty body. It will be given content
in the following commits.
If a file containing static values is given, the call to clinit within
clinit_wrapper is replaced with a call to the new
user_specified_clinit function.
To get the assignments from the input file, we will have to
- parse the file
- look up the given class name in the file
- for each static field of this class in the symbol table, look up the
  field name in the class entry of the file
- if an entry was found, the "TODO" is hit: in the following commits the
  actual assignments will be added here.
One returns the expression representing the pointer to the newly
allocated memory, and the other returns the expression representing the
dereferenced pointer.
This is so that the function can be used in more than just one place.
This commit adds a new file containing a recursive algorithm which,
given an exprt and a JSON representation of a value for the exprt,
produces codet that assigns the exprt to that value.
Some helper functions in other files are also added.
The test folder contains a file clinit-state.json that is used as an
argument for the --static-values option in all of the tests in this
folder.
This will make it easier to adapt the tests or try out the
--static-values option on other examples.
@antlechner antlechner force-pushed the antonia/json-value-retriever branch from 6a012bb to c9546e5 Compare May 24, 2019 12:58
Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: c9546e5).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/113095894

@antlechner antlechner merged commit 10fa912 into diffblue:develop May 24, 2019
@antlechner antlechner deleted the antonia/json-value-retriever branch May 24, 2019 15:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Needs TG approval 🦉 Only merge with explicit approval from test-gen maintainers new feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants