-
Notifications
You must be signed in to change notification settings - Fork 274
Synthesise lambda classes #4787
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Synthesise lambda classes #4787
Conversation
@smowton For Java-clueless people like myself: can you explain why this change is desirable to have? |
456c896
to
dc7c5d2
Compare
At the moment |
@smowton Thanks! Would you mind adding that info to the commit message? |
d98efc4
to
a94e7a9
Compare
@tautschnig done |
Codecov Report
@@ Coverage Diff @@
## develop #4787 +/- ##
===========================================
+ Coverage 69.26% 69.38% +0.12%
===========================================
Files 1309 1310 +1
Lines 108425 108662 +237
===========================================
+ Hits 75105 75400 +295
+ Misses 33320 33262 -58
Continue to review full report at Codecov.
|
Added an expansion of the |
a0a6485
to
87a435f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✔️
Passed Diffblue compatibility checks (cbmc commit: a94e7a9).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/115779638
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✔️
Passed Diffblue compatibility checks (cbmc commit: 87a435f).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/115786915
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✔️
Passed Diffblue compatibility checks (cbmc commit: 447de94).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/115796756
447de94
to
4b8837e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✔️
Passed Diffblue compatibility checks (cbmc commit: 4b8837e).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/115930962
Pinging @thk123 @peterschrammel for review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
pop(parameters.size()); | ||
if(!symbol_table.has_symbol(synthetic_class_name)) | ||
{ | ||
// We failed to parse the invokedynamic handle as a Java 8+ lambda; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we output a warning in this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A debug message is already produced in java_bytecode_parser.cpp
in this case. That's probably enough for now, as otherwise we would print a warning for every non-Java use of invokedynamic, which would be pretty user-unfriendly
@@ -134,6 +134,18 @@ void create_invokedynamic_synthetic_classes( | |||
<< messaget::eom; | |||
continue; | |||
} | |||
else if(implemented_interface_type.methods().size() != 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(commit message) distiniguish -> distinguish
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚫 Missing tests for:
- capturing a primitive array ✔️
- capturing a reference array ✔️
- capturing an object factory produced array/reference type (reference done, but can't see array?)
- returning something captured (sort of tested - but might be good to verify the reference equality holds?)
- assigning a lambda from an interface to an interface:
Func<A, B> x = (a) -> return a;
Func<A, B> y = x;
y.Apply(1);
- lambda that capture members of outer classes ✔️
- lamdbas that are initalized in
<clinit>
(missing but in fairness GH had rendered this bullet point incomprehensible)
The tests are also quite difficult to figure out what is tested due to the cryptic names (not yours but still)
There are also unit tests that should, with a bit of tweaking, now work inside jbmc/unit/java_bytecode/java_bytecode_convert_method
- could these be turned on?
My biggest complaint on the code is there are huge chunks of straight line code that would be easier to digest if they were split up into smaller functions. You're half way there - you've broken them up with whitespace and stuck a comment that is essentially a function name above them.
@@ -94,6 +94,35 @@ static void assign_parameter_names( | |||
} | |||
} | |||
|
|||
void create_method_stub_symbol( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⛏️ I think would be more natural if this returned the symbol, and elt the caller add to the symbol table
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@thk123 This directly manipulates the symbol table, so returning/constness is not very relevant. I don't see a way to make this substantially better 😞
symbol_table.add(symbol); | ||
create_method_stub_symbol( | ||
invoked_method_id, | ||
arg0.get(ID_C_base_name), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Lots of the parameters are derived from ID_C_base_name
, perhaps just pass that in and let the method do this substring nonsense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or possibly even better - pass in arg0
/// \param method_identifier: The fully qualified method name, including type | ||
/// descriptor (e.g. "x.y.z.f:(I)") | ||
/// \param m: The parsed method object to convert. | ||
/// \param symbol_table: The global symbol table (will be modified). | ||
/// \param message_handler: A message handler to collect warnings. | ||
void java_bytecode_convert_method_lazy( | ||
const symbolt &class_symbol, | ||
symbolt &class_symbol, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 This should be unit tested
{ | ||
std::vector<irep_idt> function_symbols; | ||
|
||
for(const auto &id_and_symbol : symbol_table) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 This looks like a natural fit for a filter
else if(unknown == 1) | ||
assert lt.g(1.0f, 2, (x, y) -> x.intValue() + y) == 4; // should fail | ||
else if(unknown == 2) | ||
assert lt.f(1.0f, 2, 3) == 9; // should succeed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⛏️ f
-> capture_local
else if(unknown == 3) | ||
assert lt.f(1.0f, 2, 3) == 10; // should fail | ||
else if(unknown == 4) | ||
assert lt.i(2) == 10; // should succeed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i
-> capture_local_primitive
else if(unknown == 5) | ||
assert lt.i(2) == 11; // should fail | ||
else if(unknown == 6) | ||
assert lt.j(3) == 30; // should succeed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
j
-> capture_member
else if(unknown == 7) | ||
assert lt.j(3) == 31; // should fail | ||
else if(unknown == 8) | ||
assert lt.k(4) == 40; // should succeed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
k
-> capture_reference_member
assert lt.k(4) == 41; // should fail | ||
else if(unknown == 10) | ||
{ | ||
assert lt.l(5) == 70; // should succeed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
l
-> side_effect_lambda
4b8837e
to
b990669
Compare
76d3d13
to
e6e24d0
Compare
e6e24d0
to
10caab2
Compare
@LAJW reviewing my own PR, realised the lambda1 test is needlessly changed from tab size 4 to 2. I'd fix it myself but I'm on the road until Sunday with only a phone. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all the clean up Lukasz - this is looking much better. There are a couple of tests still missing:
- capturing an object factory produced array type (reference done, but not array?):
int testFunction(Foo[] inputArr) {
lambda = () -> return inputArr.length;
lambda.execute();
}
- returning something captured (sort of tested - but might be good to verify the reference equality holds?):
int testFunction() {
Object reference = new Object();
lambda = () -> return reference;
Object o2 = lambda.execute();
assert o2 == reference;
}
- assigning a lambda from an interface to an interface:
Func<A, B> x = (a) -> return a;
Func<A, B> y = x;
y.Apply(1);
- lamdbas that are initalized in (missing but in fairness GH had rendered this bullet point incomprehensible)
static Func<A, A> x = (a) -> return a;
int testFunction() {
A a = new A();
A result = x.Apply(a);
assert(a == result);
}
It would also be really good to tidy up the variable names in the existing test as I commented in the original review, but this can be done in a follow up PR.
Finally could we have some rudimentary tests in TG with a bump pointing at this branch.
@@ -66,8 +66,8 @@ void create_invokedynamic_synthetic_classes( | |||
synthetic_methods_mapt &synthetic_methods, | |||
message_handlert &message_handler) | |||
{ | |||
messaget log(message_handler); | |||
namespacet ns(symbol_table); | |||
messaget log{message_handler}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⛏️ Putting this sort of changes in the same commit increases the burden for the reviewer, if possible put them in a separate commit in future
ns.lookup(implemented_interface_tag.get_identifier()); | ||
const java_class_typet &implemented_interface_type = | ||
to_java_class_type(implemented_interface_symbol.type); | ||
const java_class_typet &implemented_interface_type = [&] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❓ I'll trust your answer - but is storing the result of function in a reference (albeit const ref) a good idea?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const ref
extends the lifespan of the value on the right. The &
is superfluous. This is the result of me trying to not replace all types with const auto
😁
constructor_symbol.type = constructor_type; | ||
set_declaring_class(constructor_symbol, synthetic_class_name); | ||
return constructor_symbol; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⛏️ This commit Convert lambdas into functions
seems to need clang formatting?
} | ||
} | ||
|
||
symbolt create_implemented_method_symbol( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚫 Please make the commit message more informative:
Conver the last function
-> Pull out function create_implemented_method_symbol for invokedynamic conversion
@@ -118,7 +118,7 @@ static optionalt<irep_idt> interface_method_id( | |||
return implemented_interface_type.methods().at(0).get_name(); | |||
} | |||
|
|||
const java_class_typet synthetic_class_type( | |||
symbolt synthetic_class_symbol( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💚
} | ||
|
||
for(const auto &id : function_symbols) | ||
const auto &id = id_and_symbol.first; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⛏️ id
is better than cmb
, but might be easier to just use symbol.name
whether id
is used, since then obvious is just the id of the symbol we're talking abotu
} | ||
} | ||
|
||
static const symbolt &get_or_create_method_symbol( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❓ I still don't understand what this method is for or why is lambda specific
<< " with unknown handle type" << messaget::eom; | ||
continue; | ||
} | ||
const auto &implemented_interface_tag = to_struct_tag_type( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Name suggestion: functional_interface_tag
This currently accounts for those with default methods, and those which re-declare as abstract methods inherited from Object, such as equals(Object). In the future we should distinguish and disregard these methods as candidates for the intended lambda method.
Because nobody could decie between: get_, create_, synthesize_, etc...
Enough code in cbmc and its users relies on method names having the form 'java::package.Class.method:(type)' that the existing method for synthesising lambda class names, which could yield things like java::synthetic_lambda$main:()V$0.<init>, could cause problems when that colon character in the middle of the name was mistaken for the function name's trailing type information. Similarly we might end up needing to append fake type information to the <init> and apply symbol names to appear "normal" enough.
Change required by clang builds
2f1cb06
to
c1a2f4d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR failed Diffblue compatibility checks (cbmc commit: a316fd5).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/122445309
Status will be re-evaluated on next push.
Common spurious failures include: the cbmc commit has disappeared in the mean time (e.g. in a force-push); the author is not in the list of contributors (e.g. first-time contributors); compatibility was already broken by an earlier merge.
This scans invokedynamic instructions during class-loading, creating a synthetic type for each one that we can understand, and registers its constructor (which captures) and method (which passes captured and actual parameters to the real lambda function) as synthetic methods. These are then created on demand using the same mechanism as e.g. synthetic static initializer methods.
Needs tests, but the code is in reviewable state.