[TG-1931] Resolve calls to java.lang.Object functions to more specific ones #1731

Merged
mgudemann merged 5 commits into develop from bugfix/all_resolved_calls
Feb 7, 2018

mgudemann
Contributor

needs a public facing regression test

@mgudemann mgudemann self-assigned this Jan 12, 2018
@mgudemann mgudemann changed the title Gather all resolved calls, filter for concrete ones [TG-1931] Gather all resolved calls, filter for concrete ones Jan 12, 2018
@smowton
Contributor

smowton commented Jan 12, 2018

Please write a commit message explaining the problem and how this fixes it? (Sorry, hit the close button by mistake)

@smowton smowton closed this Jan 12, 2018
@smowton smowton reopened this Jan 12, 2018
@tautschnig
Collaborator

Apart from adding the test, could the commit message please also be made a bit more verbose? Which part of the code is this affecting/where are you gathering calls, and why are you doing this?

Contributor

@thk123 thk123 left a comment

As already pointed out, this needs a test and a better commit message explaining why this is an appropriate fix.

resolve_function_call);
}
}

/// Used to get dispatch entries to call for the given function
/// \par function: function that should be called
Contributor

Nit: param

  functions.begin(),
  functions.end(),
  [&concrete_call](const dispatch_table_entryt &entry) {
    return concrete_call.find(entry.class_id) != concrete_call.end();
Contributor

Maybe being end-of-the-week dense, but rather than first finding the functions to remove then removing them, can this predicate not simply be return entry.symbol_expr != root_function.symbol_expr;?

Contributor Author

That would also remove those for which the only found symbol_expr for the function call is the root_function.symbol_expr, wouldn't it? For a class that doesn't implement its own toString for example, it makes sense to call java.lang.Object.toString. Therefore I collect first entries that do have a more specialized concrete call and remove only those.

But as discussed, it might make sense for example to first have the classes with a more concrete resolved function call and then group those with root_function.symbol_expr-"only" into a single conditional to save some GOTO instructions.

For example

        // 73 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        IF "java::java.lang.StringBuilder" == arg0a->@class_identifier THEN GOTO 3
...
        IF "java::java.util.ArrayList" == arg0a->@class_identifier THEN GOTO 5
        // 58 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        IF "java::java.util.Arrays$ArrayList" == arg0a->@class_identifier THEN GOTO 5
...
// all special cases treated, for the rest call `java.lang.Object.toString`
        // 268 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
     1: arg0a . java.lang.Object.toString:()Ljava/lang/String;();
        // 269 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        return_tmp0 = java.lang.Object.toString:()Ljava/lang/String;#return_value;
...
     3: (struct java.lang.StringBuilder *)arg0a . java.lang.StringBuilder.toString:()Ljava/lang/String;();
        // 277 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        return_tmp0 = java.lang.StringBuilder.toString:()Ljava/lang/String;#return_value;
...
        // 284 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
     5: (struct java.util.AbstractCollection *)arg0a . java.util.AbstractCollection.toString:()Ljava/lang/String;();
        // 285 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
...

Collaborator

That would also remove those for which the only found symbol_expr for the function call is the root_function.symbol_expr, wouldn't it? For a class that doesn't implement its own toString for example, it makes sense to call java.lang.Object.toString. Therefore I collect first entries that do have a more specialized concrete call and remove only those.

I still don't understand: concrete_calls is created as a set of keys that are a subset of those contained in functions (for some loose notion of keys). Then you remove from functions all those that have their key in concrete_calls, so you remove an element whenever the previous loop inserted it into concrete_calls. Thus it seems that

std::remove_if(
  functions.begin(),
  functions.end(),
  [&root_function](const dispatch_table_entryt &entry) {
    return entry.symbol_expr != root_function.symbol_expr;
  });

would achieve the same, as @thk123 suggested.

What are we missing?
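
A side note on the snippet quoted in this comment: `std::remove_if` on its own only moves the kept elements to the front and returns the new logical end, so the container has to be shrunk with `erase` afterwards. The following is a minimal sketch of that idiom with the predicate suggested in the review. It assumes a simplified stand-in for `dispatch_table_entryt` that uses plain strings instead of `symbol_exprt`; the names `simple_entryt` and `drop_non_root` are hypothetical and not part of the PR.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for dispatch_table_entryt:
// strings replace irep_idt and symbol_exprt.
struct simple_entryt
{
  std::string class_id;
  std::string function_id;
};

// Applies the predicate suggested in the review: every entry whose target
// differs from the root function is removed. The erase call is what
// actually shrinks the vector.
void drop_non_root(
  std::vector<simple_entryt> &functions,
  const std::string &root_id)
{
  functions.erase(
    std::remove_if(
      functions.begin(),
      functions.end(),
      [&root_id](const simple_entryt &entry) {
        return entry.function_id != root_id;
      }),
    functions.end());
}
```

With entries for classes `A` and `C` resolved to the root function and `B` resolved to its own override, only the entries for `A` and `C` remain, in their original order.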

@mgudemann mgudemann force-pushed the bugfix/all_resolved_calls branch from 3c73ef7 to a451a49 Compare January 12, 2018 17:07
@mgudemann
Contributor Author

@smowton @thk123 @tautschnig I added some explanation in the commit message for what is done in the PR. As so often, there's an internal regression test, but I will try to derive a public one from that.

const irep_idt class_id=function.get(ID_C_class);
const std::string class_id_string(id2string(class_id));
// method/function name of function to call
const irep_idt component_name=function.get(ID_component_name);
Collaborator

How about self-documenting code: rename the variable?

@mgudemann mgudemann force-pushed the bugfix/all_resolved_calls branch from a451a49 to 6ef344b Compare January 13, 2018 09:39
resolve_function_call);

if(root_function.symbol_expr!=symbol_exprt())
functions.push_back(root_function);

// remove all classes from dispatch table where concrete call has been found
Contributor

I believe this currently filters nothing, as the set contains those entries that resolved to a function other than the root, and the filter condition requires membership of that set and having resolved to the root function.

@peterschrammel
Member

@mgudemann, is there a TG companion PR for this?

@mgudemann
Contributor Author

mgudemann commented Jan 16, 2018

@smowton After removing the visited set, there can be multiple entries for one class, e.g., one where the root symbol expression is found and one where a more specialized one is found. The idea above is to first record those classes where the function call is resolved to something other than root (but potentially also to root), and then to remove, for the recorded classes, the entries that are resolved to root. Therefore something that was, e.g., resolved to java.lang.Object.toString and also to own.package.MyObject.toString will retain only the second entry.
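
The two-step filter described in this comment can be sketched as follows. This is only an illustration under simplifying assumptions: `dispatch_entryt` is a hypothetical string-based stand-in for `dispatch_table_entryt`, and `drop_shadowed_root_entries` is an invented helper name. It shows why the result differs from the single-predicate suggestion once a class can appear more than once in the table.

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for dispatch_table_entryt.
struct dispatch_entryt
{
  std::string class_id;
  std::string function_id;
};

// Removes root-function entries only for classes that also have a more
// specific resolution; classes resolved to the root alone keep their entry.
void drop_shadowed_root_entries(
  std::vector<dispatch_entryt> &functions,
  const std::string &root_id)
{
  // Step 1: record classes with at least one non-root resolution.
  std::set<std::string> concrete_call;
  for(const dispatch_entryt &entry : functions)
  {
    if(entry.function_id != root_id)
      concrete_call.insert(entry.class_id);
  }

  // Step 2: drop the root entries of exactly those classes.
  functions.erase(
    std::remove_if(
      functions.begin(),
      functions.end(),
      [&](const dispatch_entryt &entry) {
        return concrete_call.count(entry.class_id) != 0 &&
               entry.function_id == root_id;
      }),
    functions.end());
}
```

For a table holding `MyObject` twice (once resolved to `Object.toString`, once to `MyObject.toString`) and `Plain` once (resolved to `Object.toString`), the sketch keeps the `MyObject.toString` entry and the `Plain` entry.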

    return
      concrete_call.find(entry.class_id) != concrete_call.end() &&
      entry.symbol_expr == root_function.symbol_expr;
  });
Collaborator

I maintain that I don't understand why this code has to be as complicated.

Contributor

I don't know if it is complicated but it seems to do some unnecessary sorting.
Maybe pull out this code in a function which makes the intent clear, something like sort_java_lang_object_at_the_end(dispatch_table_entries &).
Then it will be easier to see how it can be improved.
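
One possible shape for such a helper is sketched below. It is not the code that was merged: `sorted_entryt` is a hypothetical string-based stand-in for the dispatch table entry, and the function name merely echoes the suggestion above. The comparator groups entries by call target and places the root function's entries last, so that they can serve as the fall-through case.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for dispatch_table_entryt.
struct sorted_entryt
{
  std::string class_id;
  std::string function_id;
};

// Groups entries by call target and moves all entries that resolve to the
// root function (e.g. a java.lang.Object method) to the end.
void sort_root_at_the_end(
  std::vector<sorted_entryt> &functions,
  const std::string &root_id)
{
  std::stable_sort(
    functions.begin(),
    functions.end(),
    [&root_id](const sorted_entryt &a, const sorted_entryt &b) {
      const bool a_root = a.function_id == root_id;
      const bool b_root = b.function_id == root_id;
      if(a_root != b_root)
        return b_root; // non-root entries sort before root entries
      return a.function_id < b.function_id;
    });
}
```

`std::stable_sort` is used so that entries with the same target keep their relative order, which makes the outcome deterministic for a given input order.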

@smowton
Contributor

smowton commented Jan 16, 2018

Hmmm. How about this alternate approach:

At the moment we have

if(function.symbol_expr == symbol_exprt())
{
  const resolve_concrete_function_callt::concrete_function_callt
    &resolved_call = resolve_function_call(child, component_name);

That takes care of the problem for any method except those implemented by java.lang.Object (i.e. hashCode and toString) which are concrete everywhere. Therefore we could simply handle the case where the current definition is given by j.l.O and treat it the same way.

It's an ugly solution, since it's obviously very Java specific, but so is resolve-concrete-call, which only looks to a class' first parent as a possible source of method implementations. If we want to generalise to C++ or other multiple-inheriting environments (Java 9 default implementations?) then we should do one walk to discover classes of interest, then visit them in topological order, at each juncture taking the most-derived of our parents as the source of the method implementation.

@mgudemann
Contributor Author

To give you more context @tautschnig.

For the original test case we currently have for the following Java code

import java.util.ArrayList;
class ToStringObject {
    public static String test() {
        ArrayList<Integer> al = new ArrayList<>();
        return toStr(al);
    }
    public static String toStr(Object o) {
        return o.toString();
    }
}

the following GOTO

ToStringObject.toStr() /* java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; */
        // 43 no location
        struct java.lang.String *return_tmp0;
        // 44 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        // Labels: pc0
        ASSERT !(arg0a == null) // Throw null
        // 45 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        IF "java::java.lang.Class" == arg0a->@class_identifier THEN GOTO 1
        // 46 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        IF "java::java.lang.AbstractStringBuilder" == arg0a->@class_identifier THEN GOTO 2
        // 47 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        IF "java::java.util.HashMap$Values" == arg0a->@class_identifier THEN GOTO 4
        // 48 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        IF "java::java.util.AbstractCollection" == arg0a->@class_identifier THEN GOTO 4
        // 49 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        IF "java::java.lang.StringBuilder" == arg0a->@class_identifier THEN GOTO 5
        // 50 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        IF "java::java.lang.String" == arg0a->@class_identifier THEN GOTO 6
        // 51 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        IF "java::java.lang.Integer" == arg0a->@class_identifier THEN GOTO 7
        // 52 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        arg0a . java.lang.Object.toString:()Ljava/lang/String;();
        // 53 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        return_tmp0 = java.lang.Object.toString:()Ljava/lang/String;#return_value;
        // 54 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        dead java.lang.Object.toString:()Ljava/lang/String;#return_value;
        // 55 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        GOTO 8
...

Which means that for ArrayList.toString we'd call the java.lang.Object implementation, although java.util.AbstractCollection.toString:()Ljava/lang/String;(); would be the correct one.
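
To make the expected resolution concrete, here is a small sketch of a lookup that walks up a single-parent chain, in the spirit of the earlier remark that resolve-concrete-call only looks to a class' first parent. The types and the function are hypothetical simplifications (string identifiers, no interfaces) and do not mirror the real `resolve_concrete_function_callt` interface.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical model of a class: its single parent and the methods it
// defines itself. Interfaces are deliberately left out.
struct class_infot
{
  std::string parent;            // empty for the hierarchy root
  std::set<std::string> methods; // methods defined directly in this class
};

// Walks from class_id towards the root and returns "Class.method" for the
// first class that defines the method, or an empty string if none does.
std::string resolve_first_parent_chain(
  const std::map<std::string, class_infot> &hierarchy,
  std::string class_id,
  const std::string &method)
{
  while(!class_id.empty())
  {
    const auto it = hierarchy.find(class_id);
    if(it == hierarchy.end())
      return "";
    if(it->second.methods.count(method) != 0)
      return class_id + "." + method;
    class_id = it->second.parent;
  }
  return "";
}
```

In a model where `java.util.ArrayList` extends `java.util.AbstractList`, which extends `java.util.AbstractCollection`, and only `AbstractCollection` and `java.lang.Object` define `toString`, the lookup for `ArrayList` stops at `java.util.AbstractCollection.toString` and never reaches `java.lang.Object`.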

In my current PR I get the following for this (with the current model loaded, i.e., all possible subclasses of JLO appear somehow):

ToStringObject.toStr() /* java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; */
        // 43 no location
        struct java.lang.String *return_tmp0;
        // 44 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        // Labels: pc0
        ASSERT !(arg0a == null) // Throw null
        // 45 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        IF "java::java.lang.AbstractStringBuilder" == arg0a->@class_identifier || "java::java.lang.StringBuffer" == arg0a->@class_identifier THEN GOTO 1
        // 46 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        IF "java::java.lang.Class" == arg0a->@class_identifier THEN GOTO 3
        // 47 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        IF "java::java.util.ArrayList" == arg0a->@class_identifier || "java::java.util.AbstractList" == arg0a->@class_identifier || "java::java.util.ArrayList$SubList" == arg0a->@class_identifier || "java::java.util.Arrays$ArrayList" == arg0a->@class_identifier || "java::java.util.AbstractCollection" == arg0a->@class_identifier || "java::java.util.HashMap$KeySet" == arg0a->@class_identifier || "java::java.util.HashMap$Values" == arg0a->@class_identifier || "java::java.util.HashMap$EntrySet" == arg0a->@class_identifier || "java::java.util.AbstractSet" == arg0a->@class_identifier THEN GOTO 4
        // 48 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        IF "java::java.lang.Integer" == arg0a->@class_identifier THEN GOTO 5
        // 49 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        IF "java::java.lang.String" == arg0a->@class_identifier THEN GOTO 6
        // 50 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        IF "java::java.lang.StringBuilder" == arg0a->@class_identifier THEN GOTO 7
        // 51 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        arg0a . java.lang.Object.toString:()Ljava/lang/String;();
        // 52 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        return_tmp0 = java.lang.Object.toString:()Ljava/lang/String;#return_value;
        // 53 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        dead java.lang.Object.toString:()Ljava/lang/String;#return_value;
        // 54 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        GOTO 8
        // 55 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
     1: (struct java.lang.AbstractStringBuilder *)arg0a . java.lang.AbstractStringBuilder.toString:()Ljava/lang/String;();
        // 56 no location
        IF !NONDET(_Bool) THEN GOTO 2
        // 57 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        return_tmp0 = null;
        // 58 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
     2: GOTO 8
        // 59 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
     3: (struct java.lang.Class *)arg0a . java.lang.Class.toString:()Ljava/lang/String;();
        // 60 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        return_tmp0 = java.lang.Class.toString:()Ljava/lang/String;#return_value;
        // 61 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        dead java.lang.Class.toString:()Ljava/lang/String;#return_value;
        // 62 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        GOTO 8
        // 63 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
     4: (struct java.util.AbstractCollection *)arg0a . java.util.AbstractCollection.toString:()Ljava/lang/String;();
        // 64 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        return_tmp0 = java.util.AbstractCollection.toString:()Ljava/lang/String;#return_value;
        // 65 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        dead java.util.AbstractCollection.toString:()Ljava/lang/String;#return_value;
        // 66 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        GOTO 8
        // 67 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
     5: (struct java.lang.Integer *)arg0a . java.lang.Integer.toString:()Ljava/lang/String;();
        // 68 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        return_tmp0 = java.lang.Integer.toString:()Ljava/lang/String;#return_value;
        // 69 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        dead java.lang.Integer.toString:()Ljava/lang/String;#return_value;
        // 70 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        GOTO 8
        // 71 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
     6: (struct java.lang.String *)arg0a . java.lang.String.toString:()Ljava/lang/String;();
        // 72 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        return_tmp0 = java.lang.String.toString:()Ljava/lang/String;#return_value;
        // 73 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        dead java.lang.String.toString:()Ljava/lang/String;#return_value;
        // 74 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        GOTO 8
        // 75 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
     7: (struct java.lang.StringBuilder *)arg0a . java.lang.StringBuilder.toString:()Ljava/lang/String;();
        // 76 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        return_tmp0 = java.lang.StringBuilder.toString:()Ljava/lang/String;#return_value;
        // 77 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        dead java.lang.StringBuilder.toString:()Ljava/lang/String;#return_value;
        // 78 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 2
     8: ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String;#return_value = return_tmp0;
        // 79 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 2
        dead return_tmp0;
        // 80 no location
        END_FUNCTION
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This groups together the class_ids with the same call target, ending with a call to java.lang.Object which is simply the last, non-explicit case.

Do we agree that at least this result is more or less what we want to have? Is my assumption correct that having larger disjunctions instead of guarded GOTO is beneficial from a performance point of view? In any case it should be easier on the --depth parameter if many such methods are encountered, right?
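
The grouping itself can be illustrated independently of the GOTO machinery. The sketch below collects class identifiers per call target and renders one disjunctive guard per target as text, mimicking the dump above. It is an assumption-laden illustration: the helper name `group_guards` and the use of string pairs are invented here, and the sketch says nothing about the performance question raised in this comment.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Input: (class_id, function_id) pairs. Output: one (function_id, guard)
// pair per distinct call target, where the guard is the disjunction of all
// class-identifier tests that lead to this target.
std::vector<std::pair<std::string, std::string>> group_guards(
  const std::vector<std::pair<std::string, std::string>> &entries)
{
  // Collect the class identifiers for each call target.
  std::map<std::string, std::vector<std::string>> classes_by_target;
  for(const auto &entry : entries)
    classes_by_target[entry.second].push_back(entry.first);

  // Render one textual guard per target.
  std::vector<std::pair<std::string, std::string>> guards;
  for(const auto &group : classes_by_target)
  {
    std::string condition;
    for(const std::string &class_id : group.second)
    {
      if(!condition.empty())
        condition += " || ";
      condition += "\"java::" + class_id + "\" == arg0a->@class_identifier";
    }
    guards.emplace_back(group.first, condition);
  }
  return guards;
}
```

Two classes that share `java.util.AbstractCollection.toString` end up in a single guard joined by `||`, while a class with its own `toString` gets a guard of its own.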

@mgudemann
Contributor Author

@smowton not sure that I understand you here correctly. The problem here at hand is that the original implementation tries to optimize which classes it visits which can be problematic as shown in the above example and the JIRA issue.

The problem is not that symbol_expr is empty but that it contains the wrong function to call. As I see it, if we have a function with parameter T t then we'll need to give an explicit function call for each possible subclass of type T. For JLO this effectively means having to enumerate all known object types.

@smowton
Contributor

smowton commented Jan 24, 2018

What I mean is, currently AFAIK we re-visit a type if we don't have a resolution for it yet (i.e. we've previously seen it via an interface, but might be about to visit it via a concrete supertype). We could simply expand that logic to say re-visit also if the current resolution is to java.lang.Object.*
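
A rough sketch of that extended re-visit rule follows, assuming a map from class identifier to the identifier of the resolved function. The function `record_resolution`, the string-based table and the `java.lang.Object.` prefix test are all hypothetical simplifications made for this illustration, not the actual implementation.

```cpp
#include <map>
#include <string>

// Records `resolved` for `class_id` unless the table already holds a more
// specific resolution. An existing entry may be replaced only if it is
// empty or points at a java.lang.Object method. Returns true on update.
bool record_resolution(
  std::map<std::string, std::string> &table,
  const std::string &class_id,
  const std::string &resolved,
  const std::string &root_prefix = "java.lang.Object.")
{
  if(resolved.empty())
    return false; // nothing to record

  const auto it = table.find(class_id);
  const bool unresolved = it == table.end() || it->second.empty();
  const bool resolved_to_root =
    !unresolved &&
    it->second.compare(0, root_prefix.size(), root_prefix) == 0;

  if(!unresolved && !resolved_to_root)
    return false; // keep the existing, more specific resolution

  table[class_id] = resolved;
  return true;
}
```

Under this rule a class first seen with `java.lang.Object.toString` is upgraded when a later visit finds `java.util.AbstractCollection.toString`, and a subsequent visit that only finds the `java.lang.Object` method again leaves the table untouched.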

@mgudemann mgudemann force-pushed the bugfix/all_resolved_calls branch from 6ef344b to 1f32d34 Compare January 24, 2018 10:17
@mgudemann
Contributor Author

@smowton imagine the method doesn't take a JLO parameter, but some other object type T. If there's a subclass of T which overrides a function, we'd still like to use the more specialized version. Or is this case not possible?

@smowton
Contributor

smowton commented Jan 24, 2018

The only case where we can encounter interfaces while walking child types is if our starting point was either itself an interface, or is java.lang.Object. Arguably that means we should make interfaces not children of j.l.O! But in general if we started from a concrete type then our only children are also concrete types and the child graph is a tree, so this bug doesn't arise. The only case where it does is when we start either at an interface (with no definitions) or at Object (with toString and hashCode definitions).

@mgudemann
Contributor Author

mgudemann commented Jan 24, 2018

In any case, my main question now is whether the result as above is what we want to have. How we get there in the end is a separate thing and I agree that it can be done in a simpler way than what I currently propose.
@smowton it could be an abstract class with an implementation for the method in question, too

@smowton
Contributor

smowton commented Jan 24, 2018

If it's an abstract class (but not an interface) then its child class graph is again a tree.

That GOTO program looks good except for the label 2:, which looks like an empty callee block got written out? It's harmless in this program but potentially concerning.

@mgudemann
Contributor Author

@smowton maybe an issue with String handling somehow (this is for StringBuilder), I think this is the same in the current implementation.

It returns null, depending on the nondeterministic Boolean in instruction 56:

        // 55 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
     1: (struct java.lang.AbstractStringBuilder *)arg0a . java.lang.AbstractStringBuilder.toString:()Ljava/lang/String;();
        // 56 no location
        IF !NONDET(_Bool) THEN GOTO 2
        // 57 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
        return_tmp0 = null;
        // 58 file ToStringObject.java line 9 function java::ToStringObject.toStr:(Ljava/lang/Object;)Ljava/lang/String; bytecode-index 1
     2: GOTO 8

@mgudemann mgudemann force-pushed the bugfix/all_resolved_calls branch 3 times, most recently from e79e861 to 48baaab Compare January 26, 2018 16:51
Contributor

@smowton smowton left a comment

Looks good, couple of nitpicks.

// Emit target if end of dispatch table is reached or if the next element is
// dispatched to another function call. Assumes entries in the functions
// variable to be sorted for the identifier of the function to be called.
auto l_it = it;
Contributor

= std::next(it)

// variable to be sorted for the identifier of the function to be called.
auto l_it = it;
l_it++;
bool next_emit_target = l_it == functions.crend();
Contributor

Probably clearer as a straightforward ||

bool next_emit_target = l_it == functions.crend();
if(!next_emit_target)
{
next_emit_target |=
Collaborator

You know that next_emit_target is false, so |= is the same as = - seems fishy?

@mgudemann mgudemann force-pushed the bugfix/all_resolved_calls branch from 48baaab to 3b2973d Compare January 29, 2018 08:01
{
  next_emit_target =
    l_it->symbol_expr.get_identifier() != fun.symbol_expr.get_identifier();
}
Contributor

@romainbrenguier romainbrenguier Jan 29, 2018

These 6 lines could be rewritten as bool next_emit_target = (l_it == functions.crend()) || (l_it->symbol_expr.get_identifier() != fun.symbol_expr.get_identifier());
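
The combined effect of the review suggestions on this hunk (`std::next` plus a single `||`) can be sketched in isolation. The example below assumes the entries are reduced to their function identifiers and already sorted so that equal identifiers are adjacent; `emit_flags` is a hypothetical helper that only reports, for each entry visited in reverse order, whether a call target would be emitted after it.

```cpp
#include <iterator>
#include <string>
#include <vector>

// Walks the (sorted) function identifiers in reverse and returns, for each
// visited entry, whether it is the last one of its group, i.e. whether the
// end of the table is reached or the next entry has a different identifier.
std::vector<bool> emit_flags(const std::vector<std::string> &function_ids)
{
  std::vector<bool> flags;
  for(auto it = function_ids.crbegin(); it != function_ids.crend(); ++it)
  {
    const auto l_it = std::next(it);
    // Short-circuit evaluation guarantees l_it is only dereferenced when
    // it does not point past the end.
    const bool next_emit_target =
      l_it == function_ids.crend() || *l_it != *it;
    flags.push_back(next_emit_target);
  }
  return flags;
}
```

The short-circuit behaviour of `||` is what makes the one-line form safe: the dereference on the right-hand side is never evaluated for the past-the-end iterator.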

exprt::operandst or_ops;
for(const auto &id : class_ids)
{
exprt c_id1 = constant_exprt(id, string_typet());
Contributor

const constant_exprt c_id1(id, string_typet());

for(const auto &id : class_ids)
{
exprt c_id1 = constant_exprt(id, string_typet());
equal_exprt class_id_test(c_id1, c_id2);
Contributor

const


goto_programt::targett t4 = new_code_gotos.add_instruction();
t4->source_location = vcall_source_loc;
t4->make_goto(insertit.first->second, disjunction(or_ops));
Contributor

I find the variable name t4 not very descriptive, I would suggest something like target_for_goto_instruction.

resolve_function_call);

if(root_function.symbol_expr!=symbol_exprt())
functions.push_back(root_function);

// Sort for call symbol expr grouping, keep java.lang.Object entries at the
Contributor

not sure how to parse that

resolve_function_call);

if(root_function.symbol_expr!=symbol_exprt())
functions.push_back(root_function);

// Sort for call symbol expr grouping, keep java.lang.Object entries at the
// end for fall through. The reasoning is that this is case with most entries
Contributor

the case

@mgudemann mgudemann force-pushed the bugfix/all_resolved_calls branch 5 times, most recently from 71e7695 to c419046 Compare January 29, 2018 16:01
@mgudemann mgudemann requested a review from forejtv as a code owner January 30, 2018 08:44
@mgudemann mgudemann force-pushed the bugfix/all_resolved_calls branch 6 times, most recently from 861788b to 2ed0e55 Compare February 1, 2018 11:07
Matthias Güdemann added 5 commits February 5, 2018 16:04
This fixes an issue in test-generator where a call to `toString` is dispatched
to the `toString` method from `java.lang.Object` instead of the overridden
method from `ArrayList`.

Cite from the issue about the current implementation:

> It tracks a `visited` set preventing it from listing callees twice when
> multiple inheritance is in play (e.g. Java interfaces). Unfortunately this
> malfunctions when we visit a type twice, once via its interfaces and once via
> its concrete subclass which provides a definition

This extends the original implementation, by resolving dispatch entries where an
initial step resolved to a java.lang.Object function. This case can be an error,
as some classes might be visited multiple times, first for an interface and then
for the concrete class.

The original implementation kept a visited set that recorded visited classes
which resulted in some functions not correctly being resolved. This set is
replaced with a map from class identifiers to dispatch table entries.

These tests were failing when the new way was used which groups class
identifiers that have the same resolved function call.

Unfortunately the result is not unique here which makes those tests not too
useful.

In the original implementation, D.toString would be resolved to
java.lang.Object.toString().

The unit test checks whether the correct function is called. It is therefore
more precise than the regression test `virtual10`.

The test was failing on two Travis goals, due to the config object being
persistent between different tests in "unit". The link order made another test
set `config.main` to non-empty which made this test fail due to that function
not being found.
@mgudemann mgudemann force-pushed the bugfix/all_resolved_calls branch from 2ed0e55 to 814cfcc Compare February 5, 2018 15:37
@mgudemann mgudemann merged commit 52dfc36 into develop Feb 7, 2018
@tautschnig tautschnig deleted the bugfix/all_resolved_calls branch March 29, 2018 07:12
smowton pushed a commit to smowton/cbmc that referenced this pull request May 9, 2018
f7602af Merge commit 'bb88574aaa4043f0ebf0ad6881ccaaeb1f0413ff' into merge-develop-20180327
906aeb3 Merge pull request diffblue#349 from diffblue/owen-jones-diffblue/fix-compilation-for-release
3d8423c Merge pull request diffblue#350 from diffblue/owen-jones-diffblue/skip-duplicate-callsites-in-lazy-methods
73fb488 bugfix from upstream repo for generic crash
fd76555 Speed up resolution of virtual callsites in lazy loading
3fd28f3 Replace assert(X) by UNREACHABLE/INVARIANT(X)
557158e Merge pull request diffblue#334 from diffblue/pull-support-20180216
1e48132 Merge from master, 20180216
ad7b28e Updates requsted in the PR: mostly rename 'size -> length'.
e3fcb9b Introducing MAX_FILE_NAME_SIZE constant.
bb88574 Merge pull request diffblue#1806 from thk123/refactor/address-review-comments-from-1796
db9c214 Merge pull request diffblue#1850 from tautschnig/include-cleanup
78fbf08 Merge pull request diffblue#1844 from smowton/smowton/feature/prepare-symex-for-lazy-loading
4098ed5 Merge pull request diffblue#1849 from smowton/smowton/cleanup/java-main-function-types
06f3e83 Use C++ headers instead of their C variants
e918a91 Goto-symex: add support for general function accessor
9e31303 Symex: switch to incrementally populating address-taken locals
ac5af68 Address-taken locals analysis: support incremental analysis
fe775f3 Merge pull request diffblue#1843 from peterschrammel/instructions-function
a6fd729 Cleanup tests with anomalous main functions
5df3fca Clean up get_function_id hacks
552b100 Set function member of each goto instruction in goto program passes
38e6e4a Merge pull request diffblue#1813 from smowton/smowton/fix/cleanup-unused-clinits
278e4e6 Merge pull request diffblue#1826 from smowton/smowton/fix/java-inherited-static-fields
a2ebb33 Merge pull request diffblue#1713 from karkhaz/kk-debug-timestamps
7b5dd17 Merge pull request diffblue#1834 from diffblue/library-preconditions
1da5be1 Add tests for inherited static fields
19d622b Add tests to verify that synthetic method removal is performed
adc9fd4 Java frontend: clean up unused clinit symbols
f15c312 Exclude j.l.O from possible static field hosts.
afa443c US spelling of initialize
d4d4a9a Tolerate stub static fields defined on non-stub types
873e1f6 Guess public access for stub globals
d6783d8 Java method converter: look for inherited static fields
32cc538 Insert stub static globals onto any incomplete ancestor class
b2d3d61 Search static fields inherited from interfaces
045ac05 Create stub globals on the first parent incomplete class
5b3cde5 Use a common instance of class_hierarchyt in get_inherited_component
e73e756 Create stub globals: check for inherited globals
bea6371 Annotate static fields with accessibility
168c2a8 Generalise get_inherited_method
f3160e1 resolve_concrete_function_callt -> resolve_inherited_componentt
82549de Emit timestamps on each line of output
3f6965b Replace util/timer* with std::chrono
ef08ae2 Merge pull request diffblue#1820 from smowton/smowton/fix/remove-string-solver-iteration-limit
0f20482 Merge pull request diffblue#1836 from karkhaz/kk-remove-unused-lambda-capture
e8105bd Merge pull request diffblue#1833 from diffblue/symex_class_cleanup
f6f45fc turn some assertions in the stdlib.h models into preconditions
9ea0cc6 pre-conditions for strings
fbd54df Remove unused lambda capture
9620802 Merge pull request diffblue#1815 from smowton/smowton/feature/replace-clinit-unwinder
9b59631 Merge pull request diffblue#1828 from smowton/smowton/cleanup/remove-recreate-initialize
1ac9abe Remove string refinement iteration limit
c94548c preconditions for delete and delete[]
9ba7fe2 cleanup of some noise (mostly obvious declarators) in the goto_symext class
bb64ea6 clean up symex_assign vs. symex_assign_rec
932a38f Merge pull request diffblue#1827 from karkhaz/kk-symex-operator-tidy
968d97e Remove __CPROVER_initialize recreation
06a220a Reimplement remove-static-init-loops to avoid need to inspect GOTO program
2bb98d9 Merge pull request diffblue#1819 from romainbrenguier/refactor/coverage-instrumentation
6492b3a Rearrange cover_basic_blocks header
a9549e7 Define constants as const
fa35ccd Pull continuation_of_block function out
560d712 Declare constants as const
35422f3 Make update_covered_lines a static function
b4cadf8 [path explore 1/8] Tidy up symext top-level funs
0f3ae1a Make representative_inst an optional
dc696a4 Make format_number_range function instead of class
678218a Merge pull request diffblue#1825 from thk123/refactor/corrected-path-of-language-file
ba76a8f Merge pull request diffblue#1751 from tautschnig/fix-1748
b665269 Correcting path to a file
d0889a8 Merge pull request diffblue#1822 from diffblue/legacy-language
441f706 Merge pull request diffblue#1823 from diffblue/cleanup
11714b2 use constant_exprt version of to_integer
733f3b8 remove old-style constructor for member_exprt
63f09ac remove unused function make_next_state
77c8b9c remove translation for certain boolean program constructs
1bac484 cleanout decision_proceduret::in_core
d8967f5 moving language.h and language_file.h to langapi folder
f9b9599 Merge pull request diffblue#1761 from diffblue/function_typet
dd040e5 Added function_typet.
bcd88a0 Merge pull request diffblue#1821 from smowton/smowton/feature/test-pl-tags
45f0939 test.pl: add support for tagging test-cases
40b8c03 Updates requested in PR - mainly rename of functions.
7f868e2 Reused private code in 'remove_virtual_functions.cpp' by making it public.
ae6775a Merge pull request diffblue#1790 from martin-cs/fix/correct-domain-interface
d7bb937 Catch the case when a GOTO instruction is effectively a SKIP.
b2fba97 Correct domain transformers so that they meet the preconditions.
d447c26 Document the invariants on iterators arguments to transform and merge.
e3db794 Whitespace changes to keep clang-format happy.
1990994 Revert "Add edge type parameter to ai transform method"
3ca91bc Revert "Fix iterator comparison bug in reaching_definitions.cpp"
ac036fd Revert "Fix iterator equality check bug in dependence_graph.cpp"
86cadcd Revert "Fix iterator equality check bug in custom_bitvector_analysis.cpp"
db925de Revert "Fix iterator equality check bug in constant_propagator.cpp"
2c69364 Merge pull request diffblue#1811 from cesaro/iterator-fix
807268e Fixes the symbol_base_tablet iterator
0df054c Merge pull request diffblue#1781 from smowton/smowton/feature/java-create-stub-globals-earlier
e163ab6 Java frontend: create synthetic static initialisers for stub globals
fbcb423 Merge pull request diffblue#1802 from NathanJPhillips/feature/symbol_iterator
e106cf8 Merge pull request diffblue#1793 from smowton/smowton/cleanup/remove-java-new-lowering-pass
52dfc36 Merge pull request diffblue#1731 from diffblue/bugfix/all_resolved_calls
f123ae9 Adding comment referencing where the invariant comes from
f7c89e1 Add iterator for symbol_table_baset
150f826 Merge pull request diffblue#1801 from hannes-steffenhagen-diffblue/add-idea-gitignore
745afbc Merge pull request diffblue#1796 from thk123/refactor/bytecode-parsing-tidy
6362295 Add .idea (CLion) directory to .gitignore
31da890 Revert "Do lowering of java_new as a function-level pass"
6f6fda7 Merge pull request diffblue#1794 from smowton/smowton/fix/goto-diff-test-escapes
4a538d2 Adding comments on the non-standard patternt
9bfe177 Adding an early guard for correctly parsed exception table
6fe1808 Improved error reporting on invalid constant pool index
93dab4c Escape curly braces in regexes
814cfcc Adapt failing unit test for value set analysis
3ff90bc Add unit test
e67a96e Add regression test
3df5348 Adapt regression tests for virtual functions.
09efc90 Re-Resolve function calls if only java.lang.Object was found.
a619e48 Merge pull request diffblue#1763 from jeannielynnmoulton/base_class_info_tg1287
0b8dd57 Merge pull request diffblue#1785 from smowton/smowton/fix/core-models-cmake-script
50dcec8 Adding unit tests for extracting generic bases' info
54df3a1 Correcting generic parameters in bases for implicitly generic classes
6d691d7 Parsing generic bases' information into the class symbol
7d041f0 Defining a new type for generic bases
f10eb71 Fix Java core-models build script
8d66028 Merge pull request diffblue#1774 from smowton/smowton/feature/java-create-clinit-state-earlier
679d9b8 Java frontend: create static initialiser globals ahead of time
4a93a29 Merge pull request diffblue#1788 from smowton/smowton/fix/java_tcmp_nan
6ad8ffd Fix Java ternary-compare against NaN
1e0ac30 Turn get_may, set_may, etc into irep_ids
b0cb1ee Merge pull request diffblue#1766 from smowton/smowton/feature/java-frontend-create-literal-globals-early
a2e3af5 Merge pull request diffblue#1744 from smowton/smowton/feature/instrument_cover_per_function
22ae7aa Merge pull request diffblue#1637 from tautschnig/bswap
a1a972f Merge pull request diffblue#1776 from smowton/smowton/feature/class-hierarchy-grapht
ef3c598 Merge pull request diffblue#1775 from diffblue/refactor/set_classpath
45dd840 Merge pull request diffblue#1728 from romainbrenguier/refactor/split-axiom-vectors
8a27950 Java frontend: create String an Class literals earlier
d95cb12 Move string literal initialisation into separate file
515ebdd CI lazy methods: scan global initialisers for global references
cab7b52 C front-end: fix promotion order for types ranking lower than int
c450328 Support for --16 on Visual Studio, no _WIN64 in 32-bit mode
b2c4188 Do not use non-trivial system headers with --32
80b972b Use split_string in set_classpath
fdb2ebc Merge pull request diffblue#1773 from smowton/smowton/feature/string-solver-ensure-class-graph-consistency
a6eed7c Add class-hierarchy variant based on grapht
311af6d Coverage: fully support instrumenting one function at a time
ceafd85 Java string solver: ensure base types are loaded
1e17db6 Merge pull request diffblue#1735 from cesaro/core-models
6844760 Merge pull request diffblue#1769 from smowton/smowton/fix/nondet-initialize-after-initialize
a8e659c Fixed CMake linker ODR violations caused by a regression-test
f66288b Internalize core models of the Java Class Library
34216f5 Refactor jar loading
ed008f9 Add constructors for having memory-loaded jar files This allows the jar_file class to load from a buffer (c array) as opposed to a file
86a34c9 Merge pull request diffblue#1765 from smowton/smowton/fix/ci-lazy-methods-array-element-types
1e11f6d Add test for multiple array types in single method
5009cbb CI lazy methods: re-explore array types with different element types
857fcf9 Cleanup unused fields in string refinement
51d86f5 Adapt unit tests for splitted axiom vectors
1843e44 Split string generator axioms into separate vectors
5669d9b Java: run nondet-initialize method *after* initialization
0b5a5c3 Rename test case
3440018 Provide function name in goto_model_functiont
f17e2c8 Merge pull request diffblue#1741 from smowton/smowton/feature/add_failed_symbols_per_function
f65f0fd Merge pull request diffblue#1764 from smowton/smowton/feature/java-infer-opaque-type-fields-earlier
dbc00a7 Add doxygen to add-failed-symbols
3788467 JBMC: add failed symbols on a per-function basis
e934867 Provide a journalling symbol table to process-goto-function
e86e2a0 Java: infer opaque type fields before method conversion
f0f50e3 Journalling symbol table: enable nesting
58d5980 Merge pull request diffblue#1740 from smowton/smowton/feature/adjust_float_expressions_per_function
c91ff69 JBMC: adjust float expressions per function
eed983a JBMC: add property checks on a per-function basis
db3bc99 JBMC: run convert-nondet on a per-function basis
99ea8fe JBMC: run replace-Java-nondet on function-by-function basis
bfd4f50 Merge pull request diffblue#1730 from smowton/smowton/feature/remove_returns_per_function
96569c3 JBMC: remove return values on a per-function basis
a7595c1 Remove returns: support running per-function
fd6e195 Merge pull request diffblue#1718 from cesaro/concurrency-team-small-fixes
e6fe617 Merge pull request diffblue#1705 from jgwilson42/goto-diff-tests
22afc5c Fixes wrong invocation order for static initializers
5c3997d Refectors how CBMC interprets a codet thread-block
001c1a2 ireps of type "ID_atomic_begin" and "ID_atomic_end" will now be properly displayed when the "show-symbol-table" flag is specified.
d978ef9 Folder build/ ignored.
bc145fd Merge pull request diffblue#1756 from romainbrenguier/tests/index-of-corrections#TG-2246
47b4ee9 Merge pull request diffblue#1725 from cesaro/exception-handlig-fixes
d397d6a Merge pull request diffblue#1726 from diffblue/multi_ary_expr2c
bd95317 Merge pull request diffblue#1753 from diffblue/xor_exprt
1d4af6d Merge pull request diffblue#1747 from NathanJPhillips/feature/upstream-cleanup
9c7debb Merge pull request diffblue#1750 from pkesseli/feature/sat-interrupt
f11c995 Merge pull request diffblue#1749 from pkesseli/ci/remove-unapproved
981c8e0 Merge pull request diffblue#1743 from tautschnig/dump-c-fix
bcb076b Correct tests for String.indexOf
ef5c6f0 Merge pull request diffblue#1742 from owen-jones-diffblue/owen-jones-diffblue/small-shared-ptr
6c9f05e Fixes to exception handling behaviour
80dd48a added multi-ary xor_exprt
703e4a3 Remove unapproved C++11 header warning.
bf7ed1a Merge pull request diffblue#313 from diffblue/owen-jones-diffblue/add-structured-lhs-to-value-set
cc9398d Expose MiniSAT's `interrupt()`
8360233 Merge pull request diffblue#1646 from peterschrammel/list-goto-functions
e4a2763 Tests for scope changes for variables and functions
8ee1956 goto-diff tests for package name changes
ce3a5e9 Basic tests for java goto-diff
3bf9987 Compare access qualifiers in goto-diff
f71cc7f Attach class name to method symbol
1f06d35 Merge pull request diffblue#312 from diffblue/pull-support-20180112
fda9daa Cleanup of create-and-swap to emplace
e42e97a Merge commit '23666e3af35673c734c9816ffc131b6b9a379e86' into pull-support-20180112
53f1a41 Populate structured_lhs in all `entryt`s
d7121f2 dump-c: fix support of use-system-headers
eb5ec24 Merge pull request diffblue#1736 from hannes-steffenhagen-diffblue/develop_fix-bitfield-pretty-printing
7a0de46 Add comment suggested by @owen-jones-diffblue
b741d4b Use small intrusive pointer in irep
434cc99 Merge pull request diffblue#1732 from peterschrammel/catch-sat-memout
8ae53bb Merge pull request diffblue#1733 from peterschrammel/mem-limit
574101c Add `structured_lhs` field to entryt
4f1a67a Uses alternatives to get_string in type2name when possible
b46149d Merge pull request diffblue#1719 from smowton/smowton/cleanup/remove_exceptions_single_global
82a7ec6 Adds regression test for bitfield naming bug
651d8d1 Fixes use of wrong identifier when pretty printing bitfield types
638937a Merge pull request diffblue#1709 from romainbrenguier/doc/string-solver-intro
1d1be4c Move non-string-specific part of doc to solvers
549eb57 Delete trailing whitespaces
db3e044 Add introduction to string solver documentation
74be7fb Merge pull request diffblue#1729 from romainbrenguier/refactor/unused-nonempty-option
d101b22 Set memory limit utility
ef45a1d Replace assertions by invariants
84e04a7 Catch Glucose::OutOfMemoryException
89fc48d Replace assertions by invariants
5e85701 Catch Minisat::OutOfMemoryException
b8cee29 Enable list-goto-functions in clobber
d902ec8 Replace cout by message stream in show-goto-functions
d970673 Move show-loops in the right place in goto-diff
e1227ef Enable list-goto-functions in goto diff
7e1110c Enable list-goto-functions in goto-instrument
0fb4868 Enable list-goto-functions in goto-analyzer
e67abfa Remove exceptions: switch to single inflight exception global
2fabbd4 Enable list-goto-functions in JBMC
9e1705f Enable list-goto-functions in CBMC
ebd8248 Add list-goto-functions command line option
2fe43a9 Add parameter to list goto functions without printing bodies
3d492fe Add documentation of return values
5a8eea5 Remove the string-non-empty option
9810f92 Drop string_non_empty field for string refinement
fec16d7 expr2c now distinguishes binary and multi-ary expressions
d16a918 C library: network byteorder functions
05bc9ed Implement bswap in SAT back-end
4f37035 Introduce bswap_exprt

git-subtree-dir: cbmc
git-subtree-split: f7602af
6 participants