Skip to content

Synthesise lambda classes #4787

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

smowton
Copy link
Contributor

@smowton smowton commented Jun 13, 2019

This scans invokedynamic instructions during class-loading, creating a synthetic type for each one that we can understand, and registers its constructor (which captures) and method (which passes captured and actual parameters to the real lambda function) as synthetic methods. These are then created on demand using the same mechanism as e.g. synthetic static initializer methods.

Needs tests, but the code is in reviewable state.

@tautschnig
Copy link
Collaborator

@smowton For Java-clueless people like myself: can you explain why this change is desirable to have?

@smowton smowton force-pushed the smowton/feature/synthesise-lambda-classes branch from 456c896 to dc7c5d2 Compare June 13, 2019 17:09
@smowton
Copy link
Contributor Author

smowton commented Jun 13, 2019

At the moment invokedynamic instructions are always stubbed to return null. With this change they actually work, so you can verify programs that use lambdas, such as assert (MyInterface)(x -> x + 1)(1) == 2

@tautschnig
Copy link
Collaborator

@smowton Thanks! Would you mind adding that info to the commit message?

@smowton smowton force-pushed the smowton/feature/synthesise-lambda-classes branch 2 times, most recently from d98efc4 to a94e7a9 Compare June 17, 2019 09:10
@smowton
Copy link
Contributor Author

smowton commented Jun 17, 2019

@tautschnig done

@codecov-io
Copy link

codecov-io commented Jun 17, 2019

Codecov Report

Merging #4787 into develop will increase coverage by 0.12%.
The diff coverage is 99.06%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #4787      +/-   ##
===========================================
+ Coverage    69.26%   69.38%   +0.12%     
===========================================
  Files         1309     1310       +1     
  Lines       108425   108662     +237     
===========================================
+ Hits         75105    75400     +295     
+ Misses       33320    33262      -58
Impacted Files Coverage Δ
...java_bytecode/java_bytecode_convert_method_class.h 100% <ø> (ø) ⬆️
jbmc/src/java_bytecode/java_bytecode_language.cpp 93.58% <100%> (+0.25%) ⬆️
jbmc/src/java_bytecode/lambda_synthesis.cpp 100% <100%> (ø)
...src/java_bytecode/java_bytecode_convert_method.cpp 97.52% <95.91%> (+0.03%) ⬆️
src/goto-programs/slice_global_inits.cpp 88% <0%> (-3.67%) ⬇️
src/ansi-c/c_typecheck_expr.cpp 72.52% <0%> (-1.21%) ⬇️
src/util/std_code.h 91.18% <0%> (-1.13%) ⬇️
src/util/simplify_expr.cpp 87.38% <0%> (-0.23%) ⬇️
src/ansi-c/ansi_c_convert_type.cpp 82.44% <0%> (-0.2%) ⬇️
src/ansi-c/c_typecheck_initializer.cpp 75.64% <0%> (-0.14%) ⬇️
... and 22 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 33c00c1...a316fd5. Read the comment docs.

@smowton
Copy link
Contributor Author

smowton commented Jun 17, 2019

Added an expansion of the jbmc/lambda1 test to actually check behaviour

@smowton smowton force-pushed the smowton/feature/synthesise-lambda-classes branch from a0a6485 to 87a435f Compare June 17, 2019 10:45
Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: a94e7a9).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/115779638

Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: 87a435f).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/115786915

Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: 447de94).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/115796756

@smowton smowton force-pushed the smowton/feature/synthesise-lambda-classes branch from 447de94 to 4b8837e Compare June 18, 2019 08:48
Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: 4b8837e).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/115930962

@smowton
Copy link
Contributor Author

smowton commented Jun 18, 2019

Pinging @thk123 @peterschrammel for review

Copy link
Member

@peterschrammel peterschrammel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

pop(parameters.size());
if(!symbol_table.has_symbol(synthetic_class_name))
{
// We failed to parse the invokedynamic handle as a Java 8+ lambda;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we output a warning in this case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A debug message is already produced in java_bytecode_parser.cpp in this case. That's probably enough for now, as otherwise we would print a warning for every non-Java use of invokedynamic, which would be pretty user-unfriendly

@@ -134,6 +134,18 @@ void create_invokedynamic_synthetic_classes(
<< messaget::eom;
continue;
}
else if(implemented_interface_type.methods().size() != 1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(commit message) distiniguish -> distinguish

thk123
thk123 previously requested changes Jun 19, 2019
Copy link
Contributor

@thk123 thk123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚫 Missing tests for:

  • capturing a primitive array ✔️
  • capturing a reference array ✔️
  • capturing an object factory produced array/reference type (reference done, but can't see array?)
  • returning something captured (sort of tested - but might be good to verify the reference equality holds?)
  • assigning a lambda from an interface to an interface:
Func<A, B> x = (a) -> return a;
Func<A, B> y = x;
y.Apply(1);
  • lambda that capture members of outer classes ✔️
  • lamdbas that are initalized in <clinit> (missing but in fairness GH had rendered this bullet point incomprehensible)

The tests are also quite difficult to figure out what is tested due to the cryptic names (not yours but still)

There are also unit tests that should, with a bit of tweaking, now work inside jbmc/unit/java_bytecode/java_bytecode_convert_method - could these be turned on?

My biggest complaint on the code is there are huge chunks of straight line code that would be easier to digest if they were split up into smaller functions. You're half way there - you've broken them up with whitespace and stuck a comment that is essentially a function name above them.

@@ -94,6 +94,35 @@ static void assign_parameter_names(
}
}

void create_method_stub_symbol(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ I think would be more natural if this returned the symbol, and elt the caller add to the symbol table

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thk123 This directly manipulates the symbol table, so returning/constness is not very relevant. I don't see a way to make this substantially better 😞

symbol_table.add(symbol);
create_method_stub_symbol(
invoked_method_id,
arg0.get(ID_C_base_name),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Lots of the parameters are derived from ID_C_base_name, perhaps just pass that in and let the method do this substring nonsense

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or possibly even better - pass in arg0

/// \param method_identifier: The fully qualified method name, including type
/// descriptor (e.g. "x.y.z.f:(I)")
/// \param m: The parsed method object to convert.
/// \param symbol_table: The global symbol table (will be modified).
/// \param message_handler: A message handler to collect warnings.
void java_bytecode_convert_method_lazy(
const symbolt &class_symbol,
symbolt &class_symbol,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 This should be unit tested

{
std::vector<irep_idt> function_symbols;

for(const auto &id_and_symbol : symbol_table)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 This looks like a natural fit for a filter

else if(unknown == 1)
assert lt.g(1.0f, 2, (x, y) -> x.intValue() + y) == 4; // should fail
else if(unknown == 2)
assert lt.f(1.0f, 2, 3) == 9; // should succeed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ f -> capture_local

else if(unknown == 3)
assert lt.f(1.0f, 2, 3) == 10; // should fail
else if(unknown == 4)
assert lt.i(2) == 10; // should succeed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i -> capture_local_primitive

else if(unknown == 5)
assert lt.i(2) == 11; // should fail
else if(unknown == 6)
assert lt.j(3) == 30; // should succeed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

j -> capture_member

else if(unknown == 7)
assert lt.j(3) == 31; // should fail
else if(unknown == 8)
assert lt.k(4) == 40; // should succeed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

k -> capture_reference_member

assert lt.k(4) == 41; // should fail
else if(unknown == 10)
{
assert lt.l(5) == 70; // should succeed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

l -> side_effect_lambda

@smowton smowton force-pushed the smowton/feature/synthesise-lambda-classes branch from 4b8837e to b990669 Compare June 19, 2019 13:53
@smowton smowton force-pushed the smowton/feature/synthesise-lambda-classes branch from 76d3d13 to e6e24d0 Compare July 8, 2019 12:58
@LAJW LAJW force-pushed the smowton/feature/synthesise-lambda-classes branch from e6e24d0 to 10caab2 Compare July 9, 2019 15:54
@smowton
Copy link
Contributor Author

smowton commented Jul 10, 2019

@LAJW reviewing my own PR, realised the lambda1 test is needlessly changed from tab size 4 to 2. I'd fix it myself but I'm on the road until Sunday with only a phone.

Copy link
Contributor

@thk123 thk123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the clean up Lukasz - this is looking much better. There are a couple of tests still missing:

  • capturing an object factory produced array type (reference done, but not array?):
int testFunction(Foo[] inputArr) {
  lambda = () -> return inputArr.length;
  lambda.execute();
}
  • returning something captured (sort of tested - but might be good to verify the reference equality holds?):
int testFunction() {
  Object reference = new Object();
  lambda = () -> return reference;
  Object o2 = lambda.execute();
  assert o2 == reference;
}
  • assigning a lambda from an interface to an interface:
Func<A, B> x = (a) -> return a;
Func<A, B> y = x;
y.Apply(1);
  • lamdbas that are initalized in (missing but in fairness GH had rendered this bullet point incomprehensible)
static Func<A, A> x = (a) -> return a;
int testFunction() {
  A a = new A();
  A result = x.Apply(a);
  assert(a == result);
}

It would also be really good to tidy up the variable names in the existing test as I commented in the original review, but this can be done in a follow up PR.

Finally could we have some rudimentary tests in TG with a bump pointing at this branch.

@@ -66,8 +66,8 @@ void create_invokedynamic_synthetic_classes(
synthetic_methods_mapt &synthetic_methods,
message_handlert &message_handler)
{
messaget log(message_handler);
namespacet ns(symbol_table);
messaget log{message_handler};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ Putting this sort of changes in the same commit increases the burden for the reviewer, if possible put them in a separate commit in future

ns.lookup(implemented_interface_tag.get_identifier());
const java_class_typet &implemented_interface_type =
to_java_class_type(implemented_interface_symbol.type);
const java_class_typet &implemented_interface_type = [&] {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❓ I'll trust your answer - but is storing the result of function in a reference (albeit const ref) a good idea?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const ref extends the lifespan of the value on the right. The & is superfluous. This is the result of me trying to not replace all types with const auto 😁

constructor_symbol.type = constructor_type;
set_declaring_class(constructor_symbol, synthetic_class_name);
return constructor_symbol;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ This commit Convert lambdas into functions seems to need clang formatting?

}
}

symbolt create_implemented_method_symbol(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚫 Please make the commit message more informative:
Conver the last function -> Pull out function create_implemented_method_symbol for invokedynamic conversion

@@ -118,7 +118,7 @@ static optionalt<irep_idt> interface_method_id(
return implemented_interface_type.methods().at(0).get_name();
}

const java_class_typet synthetic_class_type(
symbolt synthetic_class_symbol(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💚

}

for(const auto &id : function_symbols)
const auto &id = id_and_symbol.first;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ id is better than cmb, but might be easier to just use symbol.name whether id is used, since then obvious is just the id of the symbol we're talking abotu

}
}

static const symbolt &get_or_create_method_symbol(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❓ I still don't understand what this method is for or why is lambda specific

<< " with unknown handle type" << messaget::eom;
continue;
}
const auto &implemented_interface_tag = to_struct_tag_type(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Name suggestion: functional_interface_tag

smowton and others added 25 commits August 6, 2019 09:43
This currently accounts for those with default methods, and those which re-declare as
abstract methods inherited from Object, such as equals(Object). In the future we should
distinguish and disregard these methods as candidates for the intended lambda method.
Because nobody could decie between: get_, create_, synthesize_, etc...
Enough code in cbmc and its users relies on method names having the form
'java::package.Class.method:(type)' that the existing method for synthesising
lambda class names, which could yield things like java::synthetic_lambda$main:()V$0.<init>,
could cause problems when that colon character in the middle of the name was mistaken
for the function name's trailing type information. Similarly we might end up
needing to append fake type information to the <init> and apply symbol names to
appear "normal" enough.
Change required by clang builds
@smowton smowton force-pushed the smowton/feature/synthesise-lambda-classes branch from 2f1cb06 to c1a2f4d Compare August 6, 2019 08:45
Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️
This PR failed Diffblue compatibility checks (cbmc commit: a316fd5).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/122445309
Status will be re-evaluated on next push.
Common spurious failures include: the cbmc commit has disappeared in the mean time (e.g. in a force-push); the author is not in the list of contributors (e.g. first-time contributors); compatibility was already broken by an earlier merge.

@LAJW LAJW merged commit 719e9a9 into diffblue:develop Aug 8, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants