-
Notifications
You must be signed in to change notification settings - Fork 273
[SEC-180] Add ability to overlay classes with new definitions of existing methods #1816
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SEC-180] Add ability to overlay classes with new definitions of existing methods #1816
Conversation
Would it be possible to have a more complete description of what this is supposed to do? I am struggling to parse "Add ability to overlay classes with new definitions of existing methods", because "classes" and "methods" doesn't really seem to fit together? |
3b8a802
to
f7e9c27
Compare
@tautschnig absolutely. Thank you for asking. |
Thanks a lot for amending the description, that's very helpful! Would you mind copying the same to the commit message? You might very well have done this already, and just not pushed -- there is of course not rush, I'd just hope to have it in the very final version. |
I had a vague memory that commit messages were what and PR descriptions were why so I hadn't done this. It is however in a comment in class_loader.cpp. Will update when I do my next push. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed up to file: src/java_bytecode/java_bytecode_language.cpp
then eyes started to glaze over
/// \param parse_trees: The parse trees found for the class to be converted. | ||
/// \remarks | ||
/// Allows multiple definitions of the same class to appear on the | ||
/// classpath, so long as all but the first definition are marked with the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This implies if the first is annotated with OverlayClassImplementation
this won't work, is that correct/desirable (it seems like if exactly one are marked with the OverlayClassImplementation
tag as the ordering of loading classes doesn't seem like an ideal thing to depend on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should definitely put your overlay classes in a folder/jar that you add to the end of your classpath. We depend on the order of the classpath already as we currently only load the first instance seen.
// Check that the first class implementation isn't an overlay | ||
if(is_overlay_class(parse_tree.parsed_class)) | ||
{ | ||
warning() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK so yeah the comment is correct - perhaps a task for a future day to allow the classes to come in any order
else if(loading_success) | ||
{ | ||
overlay_classest overlay_classes; | ||
for(auto parse_tree_it = std::next(parse_trees.begin()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::filter(
std::next(parse_trees.begin()),
parse_trees.end(),
std::front_inserter(overlay_classes),
[](parse_tree) -> { return parse_tree.loading_succesful && is_overlay_class(parse_tree); });
?
(you can probably still get the warning in inside the lambda, or could separately find mutliple non-overlay classes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In many languages I'd do this, but in C++ it doesn't really reduce the code and I'm not convinced it would do the std::cref
correctly.
debug() | ||
<< "Adding symbol from overlay class: field '" << field.name << "'" | ||
<< eom; | ||
convert(*class_symbol, field); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So all the fields from the overlay classes are added to the original? This might benefit from a comment
<< eom; | ||
convert(*class_symbol, field); | ||
} | ||
} | ||
for(const auto &field : c.fields) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would add the "real" fields first, then if an overlay field is going to hide an existing field can drop a warning, unless there is a good reason not to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if you disagree or didn't see this, so just flagging it up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fields like @class_identifier
will always be duplicated and can't be marked as overlays. I am happy that the final overlay gets to specify the final type/value of the fields, though we should probably warn if the types don't match.
{ | ||
const irep_idt method_identifier= | ||
id2string(qualified_classname)+ | ||
"."+id2string(method.name)+ | ||
":"+method.descriptor; | ||
if(method_bytecode.contains_method(method_identifier)) | ||
{ | ||
if(overlay_methods.erase(method_identifier) == 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feels like this might even be an invariant?
@@ -316,6 +420,21 @@ void java_bytecode_convert_classt::convert(const classt &c) | |||
symbol_table, | |||
get_message_handler()); | |||
method_bytecode.add(qualified_classname, method_identifier, method); | |||
if(is_overlay_method(method)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again this feels like an invariant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not an invariant because it depends on user data rather than on CBMC programme logic.
{ | ||
if(annotation.type.id() != ID_pointer) | ||
return false; | ||
irept type = annotation.type.get_sub()[0]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typet type = annotation.type.subtype()
?
if(type.id() == ID_symbol) | ||
{ | ||
return | ||
type.get_named_sub()[ID_identifier].id() == annotation_type_name; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to_symbol_type(type).get_identifier()
@@ -44,7 +44,7 @@ Author: Daniel Kroening, [email protected] | |||
void java_bytecode_languaget::get_language_options(const cmdlinet &cmd) | |||
{ | |||
assume_inputs_non_null=cmd.isset("java-assume-inputs-non-null"); | |||
java_class_loader.use_core_models=!cmd.isset("no-core-models"); | |||
java_class_loader.set_use_core_models(!cmd.isset("no-core-models")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 but would have been better in a separate commit for future reference
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was only necessary because we need to clear a cache now when this is set so specific to this change.
fb37311
to
087c09c
Compare
@thk123 - many comments added and your last two style suggestions applied. |
94e9b7b
to
0afde90
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed the other half. Found it quite hard to review the changes to java_class_loaded, I think it would have helped to split some of into separate commits (e.g. swapping some of the iterator based for loops to range based for loops)
I also note there are no tests - I really think a few simple uses cases of overlays would go along way to helping reviewers understand exactly what is going on (I'm sure you have your own tests in a separate repo but it'd be great if we can extract some kernel of them into this PR).
continue; | ||
try | ||
{ | ||
mark_java_implicitly_generic_class_type( | ||
c.second.parsed_class.name, symbol_table); | ||
c.second.front().parsed_class.name, symbol_table); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason we only care about the first parse tree in this case? Note an implicitly generic class means something like:
class Outer<T> {
class Inner {
T field;
}
}
Inner isn't explicitly generic (has no generic parameters), but clearly is in some sense a generic type as one of its fields is of type T
. I suspect you are better placed to know if it is important for any overlayed class parse trees to be correctly, but my feeling is it probably should be (the parse essentially means that the type of field
is correctly updated to be Outer::T
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We care about them all, but they all have the same name. So c.second.front().parsed_class.name
should be the same as c.second[x].parsed_class.name
. At least that was my reasoning. I may be wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah sorry obviously didn't read it carefully enough - you are of course correct here.
0cb53b0
to
409df56
Compare
409df56
to
a75aa7d
Compare
@thk123 - added some tests, hope this helps. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test could do with overlaid fields, some general feedback, don't think anything blocking (other than elaborating the tests)
std::list<std::reference_wrapper<const classt>> | ||
loaded_parse_trees; | ||
for(const java_bytecode_parse_treet &parse_tree : parse_trees) | ||
if(parse_tree.loading_successful) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since for loop body is over multiple lines, please wrap in braces
if(parse_tree.loading_successful) | ||
loaded_parse_trees.push_back(std::cref(parse_tree.parsed_class)); | ||
auto parse_tree_it = loaded_parse_trees.begin(); | ||
// Check that the first class implementation isn't an overlay |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Check that the first class implementation is an overlay
Seems more accurate, since it also checks the second isn't an overlay if the first is right?
I'd be tempted to write this differently (e.g. std::find
the first non-overlay class and then iterate up to it with the warnings) as this feels backwards to me (the expected condition is that the loop breaks on the first element). But I think that is just a preference so definitely ignore if you disagree.
const classt &parsed_class = *parse_tree_it; | ||
// Collect overlay classes | ||
overlay_classest overlay_classes; | ||
for(++parse_tree_it; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally would avoid variable reuse here and do something like const auto overlay_class_it = std::next(parse_tree_it)
so you have a more "normal" looking for loop, but again definitely personal preference if you disagree.
{ | ||
const classt &overlay_class = *parse_tree_it; | ||
// Ignore non-initial classes that aren't overlays | ||
if(is_overlay_class(overlay_class)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer making the expected route really clear at the expense of the error states being a bit more cumbersome (in this case, I'd use std::filter
to make it clear you are filtering out the non-overlay classes, but again clearly a personal preference
@@ -160,14 +224,11 @@ static optionalt<std::string> extract_generic_interface_reference( | |||
return {}; | |||
} | |||
|
|||
void java_bytecode_convert_classt::convert(const classt &c) | |||
void java_bytecode_convert_classt::convert( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please update the (non-existant doxygen docs) to at least explain your parameter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done at the declaration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As much as it pains me as clearly the deceleration is the correct place - the coding guidelines say to do docs the implementation which is therefore where people look for this - could you move it?
"and also exists in the real class" | ||
<< eom; | ||
} | ||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So an method in an overlay class overwrites methods in the original class even if not marked as an overlay method (with only a warning) - I'd expect the behaviour in this case would be to not load the overlay method TBH.
create_stub_global_symbols( | ||
c.second, symbol_table_journal, class_hierarchy, *this); | ||
for(const java_bytecode_parse_treet &parse_tree : class_to_trees.second) | ||
create_stub_global_symbols( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For loop body now over multiple lines please wrap in braces
java_bytecode_parse_treet parse_tree; | ||
std::istringstream istream(data); | ||
if(java_bytecode_parse(istream, parse_tree, get_message_handler())) | ||
return optionalt<java_bytecode_parse_treet>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer return {}
for the optional without a value for consistency
@@ -0,0 +1,4 @@ | |||
package com.diffblue; | |||
|
|||
public @interface OverlayMethodImplementation { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might consider pulling this into a separate repo so you don't have to keep providing implementations of these OverlayMethod/Class attributes.
{ | ||
@OverlayMethodImplementation | ||
public static void main(String[] args) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To make this test more robust I'd probably put an assert(true)
here an an assert(false)
in the other original one.
a75aa7d
to
d43e726
Compare
ee13c28
to
a13d103
Compare
@tautschnig if we didn't already have a clean solution that'd be an ingenious hack to dodge the work, but given we've got one (and platform-specific tests are surely needed for some other things, e.g. compiler-specific options to pass to goto-cc without producing an error on the other platform?), I suggest we go with the solution proposed. |
Would it be possible to factor the |
a13d103
to
5d690ec
Compare
@tautschnig @NathanJPhillips here's an example of the shell-script solution: smowton@bf075a3 Shall we just go with that for now, since it's simple, non-invasive and gets this PR out of its current quagmire? |
It's almost scary that you can actually use backticks in the command line arguments, but indeed a nice hack. I'm not going to argue against it. |
50e37f9
to
22a106d
Compare
Allows multiple definitions of the same class to appear on the classpath, so long as all but the first definition are marked with the attribute @java::com.diffblue.security.OverlayClassImplementation. Overlay class definitions can contain methods with the same signature as methods in the original class, so long as these are marked with the attribute @java::com.diffblue.security.OverlayMethodImplementation; such overlay methods are replaced in the original file with the version found in the last overlay on the classpath. Later definitions can also contain new supporting methods and fields that are merged in. This will allow insertion of Java methods into library classes to handle, for example, modelling dependency injection.
22a106d
to
baf33f8
Compare
7202003 Merge branch 'develop' of github.com:diffblue/cbmc into CBMC_subtree_2018-04-10 768e8a6 Merge pull request diffblue#2009 from romainbrenguier/solvers/sparse-arrays-in-get 69f6e7b Merge pull request diffblue#2032 from tautschnig/replace-rename-performance ec43b00 Merge pull request diffblue#2026 from tautschnig/sat-clean 64b336a Refactor interval_sparse_array::concretize 5392645 Reserve size of array in concretize 8277e92 Stop using concretize_array_expr in unit tests 1a08772 Tests where model involves long strings a6c4010 Stop using concretize_arrays_in_expression 6d87233 Use concretize instead of fill_in_array 052d503 Use get_array in get_char_array_and_concretize 8ed138b Remove unused header c58ac60 Avoid copy in ranged for loops over expressions beff419 Use sparse arrays in get_array 9b286d9 Use sparse arrays in string_refinement::get 037f631 Refactor string_refinementt::get dfc584a Add concretize function for interval_sparse_array cb10550 Use sparse arrays in substitute_array_access ce4c008 Remove plus_exprt_with_overflow_check 4779b25 Get rid of calls to plus_exprt_with_overflow c40b836 Truncate string concatenations in case of overflow c52c813 Add a sum_overflows function 0d03591 Add an `at` function for access in sparse arrays 848dd95 Add an interval_sparse_array::of_expr function 5f07bf0 Initialization of sparse array from array-list 5b4d618 Reduce number of constraints in format ede2fa1 Clear string_dependencies in calls to dec_solve d483c81 Initialize sparse array from array_exprt a914153 Use map instead of vector for sparse array entries 20d2445 Remove unused rename(expr, old_id, new_id) 430d402 Use exprt::depth_iterator in rename_symbolt 54e5b85 Use a const expr to avoid unnecessary detach 6f71ff6 replace_symbolt: stop early if there is nothing to replace with 1dbb162 Clean locally built SAT solver objects 28c076b Merge pull request diffblue#2013 from LAJW/lajw/java-no-load-class e6a9127 Merge pull request diffblue#2020 from tautschnig/sat-cleanup 70741ff Remove support for Precosat 2aa81eb Remove support for SMVSAT a0fd3f7 Remove support for Limmat as a SAT solver 28cca9c Remove unused DIMACS parser 1d81306 Merge pull request diffblue#2015 from tautschnig/fix-smt2_solver-clean 392144d Makefiles: Place .d suffix used for dependencies in DEPEXT variable 839d32a smt2_solver.{o,d} should be removed by "make clean" 48e427a Merge pull request diffblue#1979 from romainbrenguier/regression/fix-indexOf-test c2f3726 Merge pull request diffblue#1976 from romainbrenguier/regression/activate-dependency-tests 42ecfa2 Add --java-no-load-class option 69fb74a Merge pull request diffblue#1995 from tautschnig/byte-update-soundness aa766ae Set string-max-length in indexOf test 988b818 Merge pull request diffblue#1990 from tautschnig/missing-header 8300147 Abort on byte_update(pointer) 4a8d9b4 Include missing header a695814 Merge pull request diffblue#1986 from thk123/revert/1816/overlay-classes 9933b58 Revert "Merge pull request diffblue#1816 from NathanJPhillips/feature/overlay-methods" cd9b839 Revert "Merge pull request diffblue#1982 from NathanJPhillips/bugfix/load-object-once" 58beeb4 Merge pull request diffblue#1978 from svorenova/lambda_tg2478_cont 1e3a9cd Merge pull request diffblue#1980 from NathanJPhillips/tests/irept 772b603 Merge pull request diffblue#1982 from NathanJPhillips/bugfix/load-object-once 0902ae7 Tests to demonstrate expected sharing behaviour of irept 2239ec1 Unit test of irept's public API 81ac259 Prevent test running on symex-driven lazy loading eae194f Allow incorrect paths for jar files on the classpath without crashing 6749fd0 Tests to ensure invalid values in the classpath are ignored 3c95aa1 Prevent attempting to load any class more than once c520331 Unit test to show java.lang.Object can be loaded from an explicit model 05a7b4a Merge pull request diffblue#1981 from smowton/smowton/cleanup/missing-docstring e4dc2aa Merge pull request diffblue#1946 from thk123/feature/TG-1811/unit-tests-for-invokedynamic-static-lambda 2b53d3b Merge pull request diffblue#1983 from svorenova/lambda_tg2478_fix f428247 Tidying up for lambdas f3b1379 Merge pull request diffblue#1975 from romainbrenguier/bugfix/literal-length-TG2878 7d4441d Regression test for lambda in a package 7366fda Format class name for lambda method handles 7db42c0 Add missing Doxygen parameter description 31fa0fe Addressing review comments 2a0927c Merge pull request diffblue#1643 from NathanJPhillips/bugfix/string-solver-function-type-mistake 229d1ee Merge pull request diffblue#1977 from romainbrenguier/bugfix/string-equals-TG-2179 97a6713 Merge pull request diffblue#1816 from NathanJPhillips/feature/overlay-methods e2d4b09 Updates for merge 674c9f0 Activate tests for StringBuffer concat in loops a997ffb Add verification test for String.equals 1ab596d Merge commit '3b8120f3a8c9ed3a343493a44ac454ae265946c1' into develop f7602af Merge commit 'bb88574aaa4043f0ebf0ad6881ccaaeb1f0413ff' into merge-develop-20180327 55d36b5 Fix code for String.equals baf33f8 Added regression tests 9984fc1 Add ability to overlay classes with new definitions of existing methods 66c529c Tidied up java_class_loader_limitt 397c14e Correcting typos and adding documentation to unit tests 54f1c54 Adding checks for the super class of the generated class df895d3 Temp checkin for checking components 44a5dcb Adding check for inheritance 28bfc37 Amending path to reflect new location e27151b Correcting typo in the scenario name d7356be Modified behaviour to find function calls ac33761 Use raw strings to avoid unnecessary escaping 4201db9 Adding tests for static lambdas f9adaa6 Adding tests for member methods 5f5994b Pull out the logic for getting the inital assignment 7f843a7 Add natural language explanation of the test's checks fa27117 Introduce tests for lambdas that are member variables 8999cf6 Added utiltity for getting this member components 3e0e12e Adding tests for the other two returning lambdas that don't capture f3ddee6 Adding tests to verify the return of the lambda wrapper method db6756e Adding checks for parameters of the called function d2ed92b Adding test for lambda taking array parameters d171f64 Swap finding variable values to use regex 99c21ed Extended find pointer assignment to take a regex 5610aca Added unit test for lambda assigned 7b0cee1 Refactored test method to allow reuse bea730d Adding utility for verifying a set of statements contains a function call ee2179c Introduce checks the the function body for Execute calls the correct lambda 2348d10 Adding unit test for checking local lambda conversion 46fa176 Extending require utilities to be used in test 6df8d6b Extended require_goto_statements to provide meaningful errors 39282a6 Add debug information for working directory 770eb2a Remove methods without a implementation or usage 5cbb758 Merge pull request diffblue#1937 from svorenova/lambda_tg2711 8cfd9bf Merge pull request diffblue#1968 from smowton/smowton/cleanup/remove-exceptions-clarity 6ca3272 Merge pull request diffblue#1973 from karkhaz/kk-c++2a-fixes 212da75 Making members of a test utility class non-const d57fe53 Renaming the folder with lambda unit tests 823f2a7 Adding a unit test for lambda method handles in class symbol 8b172b4 Adding a utility function for lambda method handles in struct 844bb20 Pulling out a utility function to a separate file 3585f73 Updating unit tests for parsing lambda method table 078dc0f Updating a utility function a8ac3d4 Pass lambda method handles to method instruction conversion bf04c93 Add lambda method handles during class conversion a518393 Introduce lambda method handles in java class type 37afd9a Store the full method reference of lambda method handles 0a49697 Store bootstrap method index for invokedynamic instructions 56c9c02 Add test for string literals length af958f0 Correct length field of string literals a190534 Remove-exceptions: make lambda types explicit b052a4d Fix warnings emitted by C++2a compiler 20eaf21 Fix type of call to forName git-subtree-dir: cbmc git-subtree-split: 7202003985a99fb6563cf4d0fb8e7f2c727cc040
Allows multiple definitions of the same class to appear on the classpath, so long as all but the first definition are marked with the attribute
@java::com.diffblue.security.OverlayClassImplementation
. Overlay class definitions can contain methods with the same signature as methods in the original class, so long as these are marked with the attribute@java::com.diffblue.security.OverlayMethodImplementation
; such overlay methods are replaced in the original file with the version found in the last overlay on the classpath. Later definitions can also contain new supporting methods and fields that are merged in.This will allow insertion of Java methods into library classes to handle, for example, modelling dependency injection.