Skip to content

Add documentation on remove-instanceof [DOC-25] #2775

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 2, 2018

Conversation

thomasspriggs
Copy link
Contributor

Add documentation on remove-instanceof.

```java
Object c = new C();
Object d = new D();
boolean b1 = a instanceof A; // true

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mixed up variables here


For a simple example where class `A` is derived from `Object` and no classes derrive from `A`, the expression `e instanceof A` needs to be lowered to an expression equivalent to `e == null ? false : e->@class_identifier == "A"`. The null case needs to be taken into account first, as we can't lookup the value of a `@class_identifier` field in a null pointer. Note that according to the Java specification, `null` is not an instance of any type.

For an example with inheritance where class `D` is derrived from `C` and `C` is derrived from `Object`, then `e instanceof C` needs to be lowered to an expression equivalent to `e == null ? false : (e->@class_identifier == "C" || e->@class_identifier == "D")`. To lower `e instanceof C` correctly, we need to take into account each of the classes derrived from `C`. Working out what classes derive from a given class type is performed using an instance of the `class_hierarchyt` class.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Irritatingly doxygen is warning about the use of @class_identifier dunno if it is better to just use class_identifier to prevent the warning or just ignore - or see if you can fix in the doxygen settings

```
Then, `x` will be set to `true` and `y` will be set to `false`.

`instanceof` also works with inheritance. Given classes `C` and `D`, where `C` derrives from `Object` and `D` derives from `C`. If the following example java code is executed -
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Throughout for variations, the correct spelling is "derive"


`instanceof` has its own instruction type in the java bytecode. We need to rewrite it in terms of other, more fundamental operations in order to analyse it. This operation is referred to as "lowering" or in this specific case - `remove_instanceof`.

For a simple example where class `A` is derived from `Object` and no classes derrive from `A`, the expression `e instanceof A` needs to be lowered to an expression equivalent to `e == null ? false : e->@class_identifier == "A"`. The null case needs to be taken into account first, as we can't lookup the value of a `@class_identifier` field in a null pointer. Note that according to the Java specification, `null` is not an instance of any type.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In java speak, "extends" is probably preferred to "derives"

@@ -43,6 +43,45 @@ To be documented.

To be documented.

\section java-bytecode-remove-instanceof Remove instanceof

Java has an operator called `instanceof`, which is used to check if an instance of a class is of a particular type. For example given that `A` and `B` are the identifiers of classes, which derive directly from Object. If the following example java code is executed -
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a summary of the operator is useful - but I'd probably stop short of the full inheritance example (you could exapnd your initial example to include a instanceof Object. You can instead link to this:https://docs.oracle.com/javase/tutorial/java/nutsandbolts/op2.html (irritatingly you can't link to the instanceof section)

```
Then, `b1` will be `true`, `b2` will be `false`, `b3` will be `true` and `b4` will be `true`.

`instanceof` has its own instruction type in the java bytecode. We need to rewrite it in terms of other, more fundamental operations in order to analyse it. This operation is referred to as "lowering" or in this specific case - `remove_instanceof`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One think that I think is useful is specific Java -> Java bytecode that would be translated by this step. I'd also avoid redefining "lowering" and why it is needed since imagine you are reading about all these different transformations, you don't want to keep reading why lowering is necessary.

I also don't know, but apart from the instanceof operator, does any other Java code get translated into an instanceof (e.g. do casts? catch blocks?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DOC-25 should also create a description of the pipeline of lowering stages.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I will add that overview sentence.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have taken this overview sentence and made it into this PR: #2792


`instanceof` has its own instruction type in the java bytecode. We need to rewrite it in terms of other, more fundamental operations in order to analyse it. This operation is referred to as "lowering" or in this specific case - `remove_instanceof`.

For a simple example where class `A` is derived from `Object` and no classes derrive from `A`, the expression `e instanceof A` needs to be lowered to an expression equivalent to `e == null ? false : e->@class_identifier == "A"`. The null case needs to be taken into account first, as we can't lookup the value of a `@class_identifier` field in a null pointer. Note that according to the Java specification, `null` is not an instance of any type.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think docs are better when they are less wordy as quicker to get the info out (i.e. pull the Java code out into a block rather than embedding inside a sentence).


For a simple example where class `A` is derived from `Object` and no classes derrive from `A`, the expression `e instanceof A` needs to be lowered to an expression equivalent to `e == null ? false : e->@class_identifier == "A"`. The null case needs to be taken into account first, as we can't lookup the value of a `@class_identifier` field in a null pointer. Note that according to the Java specification, `null` is not an instance of any type.

For an example with inheritance where class `D` is derrived from `C` and `C` is derrived from `Object`, then `e instanceof C` needs to be lowered to an expression equivalent to `e == null ? false : (e->@class_identifier == "C" || e->@class_identifier == "D")`. To lower `e instanceof C` correctly, we need to take into account each of the classes derrived from `C`. Working out what classes derive from a given class type is performed using an instance of the `class_hierarchyt` class.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would do one example (the slightly more complex one) and work through that since the info in the first one is entirely contained in the second and it isn't super complex IMO.


For an example with inheritance where class `D` is derrived from `C` and `C` is derrived from `Object`, then `e instanceof C` needs to be lowered to an expression equivalent to `e == null ? false : (e->@class_identifier == "C" || e->@class_identifier == "D")`. To lower `e instanceof C` correctly, we need to take into account each of the classes derrived from `C`. Working out what classes derive from a given class type is performed using an instance of the `class_hierarchyt` class.

In terms of the goto program, the `instanceof` operator is identified in the expression tree, based on the `exprt.id()` being `ID_java_instance_of`. The left hand side of the operator must be a pointer and the right hand side must be a type.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Too much detail here I think - try and get the fact of the java_instance_of as part of the previous worked example.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refer to the class docs for these details.


In terms of the goto program, the `instanceof` operator is identified in the expression tree, based on the `exprt.id()` being `ID_java_instance_of`. The left hand side of the operator must be a pointer and the right hand side must be a type.

The lowered form has a refinement to help with null pointer analysis later on. Instead of dereferencing the pointer to get @class_identifier in each comparision, it is dereferenced a single time and the result is reused. This is easier to analyse for null pointers, because there is only a single dereference to consider. So the constructed program takes the form -
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is useful but probably too in depth for an overview on the pass - I suggest pulling out of this PR and putting on the remove_instanceof doc itself.

@@ -43,6 +43,45 @@ To be documented.

To be documented.

\section java-bytecode-remove-instanceof Remove instanceof

Java has an operator called `instanceof`, which is used to check if an instance of a class is of a particular type. For example given that `A` and `B` are the identifiers of classes, which derive directly from Object. If the following example java code is executed -
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lead with a one sentence summary of the whole thing before going into detail about the instanceof operator. I.e.

\ref remove_instanceof.h removes the bytecode instruction `instanceof ` and replaces it with two instructions:
 - check whether the pointer is null
 - if not null, does the class identifier match the type of any of its subtypes

@@ -43,6 +43,45 @@ To be documented.

To be documented.

\section java-bytecode-remove-instanceof Remove instanceof

Java has an operator called `instanceof`, which is used to check if an instance of a class is of a particular type. For example given that `A` and `B` are the identifiers of classes, which derive directly from Object. If the following example java code is executed -
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Link the source file resp. class at the beginning

```
Then, `b1` will be `true`, `b2` will be `false`, `b3` will be `true` and `b4` will be `true`.

`instanceof` has its own instruction type in the java bytecode. We need to rewrite it in terms of other, more fundamental operations in order to analyse it. This operation is referred to as "lowering" or in this specific case - `remove_instanceof`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DOC-25 should also create a description of the pipeline of lowering stages.


For an example with inheritance where class `D` is derrived from `C` and `C` is derrived from `Object`, then `e instanceof C` needs to be lowered to an expression equivalent to `e == null ? false : (e->@class_identifier == "C" || e->@class_identifier == "D")`. To lower `e instanceof C` correctly, we need to take into account each of the classes derrived from `C`. Working out what classes derive from a given class type is performed using an instance of the `class_hierarchyt` class.

In terms of the goto program, the `instanceof` operator is identified in the expression tree, based on the `exprt.id()` being `ID_java_instance_of`. The left hand side of the operator must be a pointer and the right hand side must be a type.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refer to the class docs for these details.

@thk123 thk123 changed the title Add documentation on remove-instanceof Add documentation on remove-instanceof [DOC-25] Sep 6, 2018
@thomasspriggs thomasspriggs force-pushed the tas/doc_remove_instanceof branch 2 times, most recently from ddb2086 to b68f550 Compare September 26, 2018 14:54
/// must be a type.
///
/// The lowered form has a refinement to help with null pointer analysis later
/// on. Instead of dereferencing the pointer to get @class_identifier in each
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doxygen is complaining about the @ here looks like you can use \@

@thomasspriggs thomasspriggs force-pushed the tas/doc_remove_instanceof branch from b68f550 to b2b7dc3 Compare September 28, 2018 11:11
Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Passed Diffblue compatibility checks (cbmc commit: b2b7dc3).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/86282968

@thomasspriggs
Copy link
Contributor Author

@jeannielynnmoulton Re-review?

@thk123 thk123 merged commit 5c62ca9 into diffblue:develop Oct 2, 2018
@thk123 thk123 deleted the tas/doc_remove_instanceof branch October 2, 2018 10:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants