-
Notifications
You must be signed in to change notification settings - Fork 273
[TG-3694] Parse and store inner class information #2493
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[TG-3694] Parse and store inner class information #2493
Conversation
af174c5
to
d78c440
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty good - couple of suggestions and I think your tests need tweaking.
"The number of bytes to be read for the InnerClasses attribute does not " | ||
"match the attribute length."); | ||
|
||
const std::function<pool_entryt &(u2)> pool_entry_lambda = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should probably add this exception to the coding standard, but I think auto
is preferable for a lambda for two reasons:
- By putting it into a
std::function
you allow the value to be null whereas theauto
allows the compiler to pick a precise lambda type - Probably doesn't help readability as the inputs and outputs are repeated on the rhs (
[this](u2 index) -> pool_entry &
)
|
||
const std::function<pool_entryt &(u2)> pool_entry_lambda = | ||
[this](u2 index) -> pool_entryt & { return pool_entry(index); }; | ||
std::function<std::string(std::string, char)> remove_separator_char = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: const
UNUSED bool is_protected = inner_class_access_flags & ACC_PROTECTED; | ||
|
||
// If the original parsed class name matches the inner class name | ||
// the parsed class is an inner class, so overwrite the parsed class' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this work? Also possibly I think you might be able to check whether outer_class_info_index == 0
?
If C is not a member of a class or an interface (that is, if C is a top-level class or interface (JLS §7.6) or a local class (JLS §14.3) or an anonymous class (JLS §15.9.5)), the value of the outer_class_info_index item must be zero.
Though that sounds like anonymous and local classes would be marked as not inner, which I guess is true?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anonymous classes and local classes are both inner classes. If they are not being marked as such then we have a problem. I think the accessibility for these should always be private.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about this, it may be advantageous to have a is_anonymous / is_local flag set at this point. We currently have an is_anonymous
method in state_typet
, which attempts to work this out based on regexing the name, which isn't the ideal approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problems on test-gen were related to treating anonymous/local classes the same as inner classes. This has been updated by checking that outer_class_info_index is 0 or not (it's 0 when it is outer, interface, local or anonymous). @thk123 The reason we can't just check outer_class_info_index is because the we only want to overwrite the accessibility information when the current parsed class (not the current parsed inner class information) is an inner class. As far as I can tell, there is no magic way to do this from the spec.
@@ -1640,6 +1640,48 @@ void java_bytecode_parsert::rclass_attribute(classt &parsed_class) | |||
parsed_class.attribute_bootstrapmethods_read = true; | |||
read_bootstrapmethods_entry(parsed_class); | |||
} | |||
else if(attribute_name == "InnerClasses") | |||
{ | |||
u2 number_of_classes = read_u2(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might consider pulling this out into a read_innerclass_attribute
which would allow you to document when you'd expect to see this and an overview of the info we're getting out of it, as I can't get my head around when this attribute appears and what we find in it, despite reading the spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The spec is not great, agreed.
jbmc/src/java_bytecode/java_types.h
Outdated
@@ -111,6 +111,16 @@ class java_class_typet:public class_typet | |||
return set(ID_access, access); | |||
} | |||
|
|||
const irep_idt &get_inner_class() const |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Return a bool (using get_bool
as done in get_final
) as this hides the messiness that everything is strings from the user of this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could these methods be called get_is_inner_class
and set_is_inner_class
instead? The calls to the current method names superficially appear to be a call to a method on an outer class to get an inner class belonging to it.
THEN("The class should be marked as public") | ||
{ | ||
const symbolt &class_symbol = | ||
*new_symbol_table.lookup("java::InnerClasses$PublicInnerClass"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: use lookup_ref
if you are going to immediately dereference the return from lookup
as it will produce a nicer error if the symbol doesn't exist.
@@ -0,0 +1,26 @@ | |||
public class InnerClasses { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a non-public class to this file (in case that means there are two inner class attributes?) Unless that isn't true, in which case might not be worth it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a non-public outer class 👍
*new_symbol_table.lookup("java::InnerClasses$PublicInnerClass"); | ||
const java_class_typet java_class = | ||
to_java_class_type(class_symbol.type); | ||
REQUIRE(java_class.get_inner_class() == ID_is_inner_class); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is wrong, I think the string will be something like "1" or "true" - which probably means if this passes CI these unit tests aren't being run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, it is the missing from Makefile as the CMake build fails:
https://travis-ci.org/diffblue/cbmc/jobs/397934975#L7792
1 == is_inner_class
{ | ||
WHEN("Parsing the InnerClasses attribute for a public inner class") | ||
{ | ||
const symbol_tablet &new_symbol_table = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you always load the same file, I'd probably do this at the top of scenario to reduce duplication
@@ -0,0 +1,84 @@ | |||
/*******************************************************************\ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing from the Makefile
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure what this means or how to add it. I will ask you when you are done with your current conversation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some nit picking. Couple of requests for extensions to the unit test.
@@ -1640,6 +1640,48 @@ void java_bytecode_parsert::rclass_attribute(classt &parsed_class) | |||
parsed_class.attribute_bootstrapmethods_read = true; | |||
read_bootstrapmethods_entry(parsed_class); | |||
} | |||
else if(attribute_name == "InnerClasses") | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This section should be moved into a new function/method, to avoid rclass_attribute
getting too long.
"The number of bytes to be read for the InnerClasses attribute does not " | ||
"match the attribute length."); | ||
|
||
const std::function<pool_entryt &(u2)> pool_entry_lambda = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: As this pool_entry_lambda
is only used once, I would move it into the call to .get_name
, rather than storing it in a variable.
UNUSED u2 inner_name_index = read_u2(); | ||
u2 inner_class_access_flags = read_u2(); | ||
|
||
if(inner_class_info_index != 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bike shedding: I would prefer if(inner_class_info_index == 0) continue;
over indenting the remainder of the loop. I think opinions on this one are likely to vary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been asked to do worse. I am not sure which is clearer, but I've changed it 👍 No I haven't because the bytes need to be read anyway. No I am being stupid, changed :) Can you do double strikethrough?
u2 number_of_classes = read_u2(); | ||
u4 number_of_bytes_to_be_read = number_of_classes * 8 + 2; | ||
INVARIANT( | ||
number_of_bytes_to_be_read == attribute_length, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is good that we have this check. However I think this code could be more robust, if we have the condition number_of_bytes_to_be_read <= attribute_length
and a call to skip_bytes
for the difference at the end of this block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see what you are saying, but I am not sure that it's best. If the bytes don't match, I don't think we can guarantee that anything makes sense because that would mean that the bytecode spec has changed basically, and so should probably just fail hard.
|
||
// If the original parsed class name matches the inner class name | ||
// the parsed class is an inner class, so overwrite the parsed class' | ||
// access information and mark it as an inner class | ||
UNUSED bool is_inner_class = | ||
bool is_inner_class = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could use parsed_class.is_inner_class =
here instead of declaring an additional local variable and hence save a line of code inside the if block below.
private PrivateInnerClass(int i) { | ||
this.i = i; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you extend this test with some Inner Inner class examples, to show that things still work when the classes are nested an additional layer deep?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you also extend this with an anonymous class and a local class?
this.i = i; | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI: The clion/intellij settings can be tweaked to automatically add a new line to the end of files when it is missing. Should save a bunch of time on comments around this in your PRs.
jbmc/src/java_bytecode/java_types.h
Outdated
@@ -111,6 +111,16 @@ class java_class_typet:public class_typet | |||
return set(ID_access, access); | |||
} | |||
|
|||
const irep_idt &get_inner_class() const |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could these methods be called get_is_inner_class
and set_is_inner_class
instead? The calls to the current method names superficially appear to be a call to a method on an outer class to get an inner class belonging to it.
WHEN("Parsing the InnerClasses attribute for a public inner class") | ||
{ | ||
const symbol_tablet &new_symbol_table = | ||
load_java_class("InnerClasses", "./java_bytecode/java_bytecode_parser"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also have some tests where an inner class is loaded first, instead of an outer class containing inner classes?
.get_name(pool_entry_lambda); | ||
bool is_private = inner_class_access_flags & ACC_PRIVATE; | ||
bool is_public = inner_class_access_flags & ACC_PUBLIC; | ||
bool is_protected = inner_class_access_flags & ACC_PROTECTED; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: These three lines give a clang-tidy warning in clion, due to the implicit cast from integer to bool. This could be fixed as follows - bool is_protected = (inner_class_access_flags & ACC_PROTECTED) != 0;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or a read_bitflag
method but I wouldn't block this PR on it since this is a common approach used in this file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking through the rest of this file, all the other instances of this have already been fixed to avoid the implicit cast.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I considered it to be an optional "NIT" comment. I guess it would have been helpful if @thk123 had commented with something along the lines of "This is not a NIT, fix it."
b391e3d
to
7c79977
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Passed Diffblue compatibility checks (cbmc commit: a129281).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/77913660
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Passed Diffblue compatibility checks (cbmc commit: bccf9cc).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/77914476
In this commit, nothing is done with the data.
Though this data is stored, it will not be used in test-gen yet because test-gen is assuming all non-public inner classes are private.
bccf9cc
to
c0c1029
Compare
@thk123 @thomasspriggs This is ready for re-review. Seems to have passed the compatibility check last night, which is what I would expect now that it is treating anonymous/local classes correctly and test-gen is still ignoring most of inner classes, so I'm reasonably confident that PR2014 will pass too. I doubt the test-gen changes one will make it into the sprint, but it's here anyway for reference. Test-gen with changes: https://github.com/diffblue/test-gen/pull/2012 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Passed Diffblue compatibility checks (cbmc commit: c0c1029).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/77954988
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All comments look to be addressed, couple of suggestions for making the comment on rinner_classes_attribute
even better. I also think the enum for accessibility of classt
would be a big win.
UNUSED u2 inner_name_index = read_u2(); | ||
u2 inner_class_access_flags = read_u2(); | ||
|
||
if(inner_class_info_index == 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment for why would be good here? My reading of the spec is it can't be zero?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could still do with a comment here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed
.get_name(pool_entry_lambda); | ||
bool is_private = inner_class_access_flags & ACC_PRIVATE; | ||
bool is_public = inner_class_access_flags & ACC_PUBLIC; | ||
bool is_protected = inner_class_access_flags & ACC_PROTECTED; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or a read_bitflag
method but I wouldn't block this PR on it since this is a common approach used in this file.
{ | ||
// This is a marker for an anonymous or local class | ||
// which are treated as private | ||
parsed_class.is_private = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry if spotted this before, but having three booleans to represent a single value is error prone, I'd prefer an enum in the classt
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a common approach in this file and it effectively matches the java bytecode.
/// Corresponds to the element_value structure | ||
/// Described in Java 8 specification 4.7.6 | ||
/// https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-4.html#jvms-4.7.6 | ||
/// Parses the bytes of the InnerClasses attribute for the current parsed class, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: fewer words "Parses the bytes of the InnerClass attribute"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clarity: current parsed class or current parsed class file? (If the former, ignore me)
/// Described in Java 8 specification 4.7.6 | ||
/// https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-4.html#jvms-4.7.6 | ||
/// Parses the bytes of the InnerClasses attribute for the current parsed class, | ||
/// which contains any array of information about inner classes. We are |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clarity: Inner classes of that class?
/// in its inner class array. If the parsed class is not an inner class, then it | ||
/// is ignored. When a parsed class is an inner class, the accessibility | ||
/// information for the parsed class is overwritten, and the parsed class is | ||
/// marked as an inner class. Note that is outer_class_info_index == 0, the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably stick this note by the code it relates to rather than here.
{ | ||
// This is a marker for an anonymous or local class | ||
// which are treated as private | ||
parsed_class.is_private = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a common approach in this file and it effectively matches the java bytecode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Passed Diffblue compatibility checks (cbmc commit: 6ce7b13).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/77989278
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Passed Diffblue compatibility checks (cbmc commit: 9ba55e2).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/77998632
On a subsequent PR you said you're just waiting on CI, but I think these two comments are outstanding: |
@thk123 I have removed the |
Agree NIT means optional, but since there was another change (the docs, or as it turns out, removing the if) thought I'd draw your attention to it 🙂. Will merge when passes CI. |
Waiting for CI to pass on bump: https://github.com/diffblue/test-gen/pull/2012 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Passed Diffblue compatibility checks (cbmc commit: 5350133).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/78105793
6409eae Merge pull request diffblue#2449 from tautschnig/c++-template-cleanup 1134455 Merge pull request diffblue#2443 from tautschnig/vs-rdecl 6e9e655 Merge pull request diffblue#2520 from smowton/smowton/feature/constant-propagator-selectivity 9013779 Merge pull request diffblue#2383 from tautschnig/no-sed 56a487e Constant propagator: add callback to filter tracked values a354513 Merge pull request diffblue#2510 from polgreen/hex_trace 3545be4 Merge pull request diffblue#2523 from peterschrammel/do-not-ignore-full-slice 819c683 Merge pull request diffblue#2493 from jeannielynnmoulton/jeannie/InnerClasses a018652 allow unsigned long instead of unsigned in regression test for hex_trace d5bbdd8 represent numerical values in CBMC trace in hex 19bddce Do not ignore --full-slice 5350133 Refactor inner class parsing. 6e554d9 Merge pull request diffblue#2500 from diffblue/git-version-speedup 9ba55e2 Marks anonymous classes as inner classes b96c7ba move build commands for version.h from common to util/ 6ce7b13 Clarifies language in documentation. c0c1029 Fixes parsing for anonymous classes 1930aef Refactors parsing of inner classes attribute. b28562b Adds unit test for parsing inner classes. c336455 Stores inner class data in java class types. 457bac9 Parses InnerClasses attribute of java bytecode. 34bd58f Clean up unused template instantiation symbols fea1f47 Remove unused parameter from rDeclarator e00c6ee Set BUILD_ENV via make variable instead of patching via sed git-subtree-dir: cbmc git-subtree-split: 6409eae
This parses the InnerClasses attribute of the java bytecode with the intention of determining
If the parsed class is an inner class, then the original accessibility is overwritten by the information from the InnerClasses attribute.
Whether the class is an inner class is then stored in java_class_typet.
...
@thk123 @thomasspriggs Currently there is one test failing in test gen with the changes that will require some investigation, but I am on AL Friday and wanted to get some preliminary reviews so with some luck, this might be finished the sprint. Thanks.
@thk123 Could you please add an in progress label or equivalent. I have no privileges.