-
Notifications
You must be signed in to change notification settings - Fork 273
Making the instrumented clinit_wrapper function thread-safe #1786
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Making the instrumented clinit_wrapper function thread-safe #1786
Conversation
a79f687
to
76b1842
Compare
. @smowton should definitely look at this as has been touching this recently, I'd also appreciate a test-gen pointer bump. |
76b1842
to
9694036
Compare
/// This wrapper is called immediately before any read or write of a static | ||
/// field or whenever an instance of the class is created. | ||
/// When class \p classname does not need static initialization we | ||
/// return a nil irep. We also explicitely avoid introducing calls to the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
explicitly
|
||
// Initialize the syper-class C' and | ||
// the implemented interfaces l_1 ... l_n. | ||
// see JVM Spec p.359 step 7, for the exact defination of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
definition
const clinit_statest state) | ||
{ | ||
mp_integer initv(static_cast<int>(state)); | ||
constant_exprt init_s=from_integer(initv, java_int_type()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you require a full int
here?
/// place, calls the actual <clinit> method if not, and initializes super- | ||
/// classes and interfaces. | ||
/// \param classname: Class name | ||
/// This function retrieves from or creates into the symbol table a wrapper |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be worth factoring out the clinit generation into a separate .cpp file in order to reduce the size of this monster.cpp.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed, will do this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hang on! This is already done in the PR this is racing against :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some nitpicks, and some bad news: this is going to clash with #1774, which I hope to get in later today, and reworks how static init methods are generated to create them lazily, and create their state variables (the bool already_run
, and now your new state
variable) before method conversion.
The good news however: this should be quite easy to rebase on #1774 -- simply create your additional global where #1774 does and copy-paste your init code into my new java_static_initializers.cpp
file.
The local variable should be created during method elaboration (i.e. in get_clinit_wrapper_body
).
// if we are initializing class C due to a use of class C within method | ||
// `C.<clinit>` then avoid the recursive (and unnecessary) call to | ||
// the wrapper. | ||
const irep_idt &real_clinit_name=id2string(classname)+".<clinit>:()V"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is new code, use the new clang-format-compatible / human-readable operator spacing (i.e. spaces around ==, =, etc)
true, false); | ||
|
||
code_blockt function_body; | ||
codet atomic_begin(ID_atomic_begin); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider making a std_code
entry for these?
function_body.add(conditional); | ||
} | ||
|
||
// Initialize the syper-class C' and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
superclass
// This is achieved by iterating through the base types and | ||
// adding recursive calls to the clinit_wrapper() | ||
{ | ||
const symbolt &class_symbol=*symbol_table.lookup(classname); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lookup_ref
const irep_idt &name, | ||
const typet &type, | ||
const exprt &value, | ||
const bool is_thread_local, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally base_name is the unqualified name, e.g. name
== java::Class.field
=> base_name
== field
.
/// /param is_thread_local: if true this symbol will be set as thread local | ||
/// /param is_static_lifetime: if true this symbol will be set as static | ||
/// /return returns new symbol. | ||
symbolt java_bytecode_convert_methodt::add_new_symbol( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This thing is rather specific! Probably rename it "add_variable_symbol"?
const irep_idt &name, | ||
const typet &type, | ||
const exprt &value, | ||
const bool is_thread_local, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These aren't very obvious at the callsite -- perhaps use two-valued enums so we don't mix up true, false
and false, true
?
/// equal exprt. | ||
/// /return returns a equal_exprt, equating \p symbol_expr to the integer | ||
/// representation of \p state | ||
equal_exprt java_bytecode_convert_methodt::gen_clinit_eqexpr( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These don't really need to be members -- they're clearly static
helpers
@smowton, oks will rebase on top of 1774 |
f8a32ea
to
6691b0d
Compare
@smowton , rebase done, can you take a look please? |
7143b9f
to
72c061b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some suggestions
{ | ||
NOT_INIT, | ||
IN_PROGRESS, | ||
RETURN |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
INIT_COMPLETE?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of 'RETURN'? yes INIT_COMPLETE is better, changing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Degiorgio Do not forget to update the documentation above
equal_exprt check_already_run( | ||
already_run_symbol.symbol_expr(), | ||
false_exprt()); | ||
const symbolt &clinit_s_sym = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use _state
instead of _s
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
|
||
add_new_symbol( | ||
symbol_table, | ||
clinit_local_state_variable_name(class_name), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename this variable _thread_local
rather than just _local
to clarify we mean thread-locality not that it's a function local
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
/// java::C::__CPROVER_clinit_state = IN_PROGRESS | ||
/// was_ready = false; | ||
/// } | ||
/// else If(java::C::__CPROVER_clinit_state == READY) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of date? AFAIK there is no state named READY anymore
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep this should be RETURN
/// java::C::__CPROVER_clinit_local_state = RETURN; | ||
/// | ||
/// ATOMIC_BEGIN | ||
/// Assume(java::C::__CPROVER_clinit_state != IN_PROGRESS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
?? Will that work? Presumably we want the other thread to busy-wait for IN_PROGRESS to transition to DONE in this case. If this is indeed magic that symex understands like a wait then write that in the comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that will work. It's exactly how POSIX mutexes are implemented in pthread_mutex_lock
, see here:
https://github.com/diffblue/cbmc/blob/develop/src/ansi-c/library/pthread_lib.c#L132
Thanks, we will indeed document it.
true); | ||
|
||
// Create a local variable "was_ready". This variable is used optimization as | ||
// an optimization to make even more clear to the symex engine that it does |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo: "is used as an optimization"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
0b9e580
to
bfe8097
Compare
@smowton all suggestions applied, please have a another look |
43ed109
to
43e69ac
Compare
@peterschrammel can you take another look at this please? All the issues you raised should be addressed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, more stuff to fix
/// equal exprt. | ||
/// /return returns a equal_exprt, equating \p symbol_expr to the integer | ||
/// representation of \p state | ||
equal_exprt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
symbol_table, | ||
clinit_local_init_complete_var_name(class_name), | ||
bool_typet(), | ||
false_exprt(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fwiw the value
member of a local variable is ignored; it's up to you to generate an initialiser. Therefore recommend passing nil_exprt()
there and checking in add-new-symbol that function-scoped (non-global) variables do not attempt to specify an initialiser this way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
static_assert((int) clinit_statest::INIT_COMPLETE==2, "Check comment above"); | ||
#endif | ||
|
||
/// bool init_complete; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Non-doxy (//) comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
true, | ||
true); | ||
|
||
// Create a local variable "init_complete". This variable is used as an |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this comment is inaccurate, does it belong to the thread-local variable instead? The function-local one is just used to copy the global so you can inspect it and then maybe return outside a critical section.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the comment is misleading agreed. rewording.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are using a local boolean for inspection on purpose. The reason being that inspecting the global state resulted in symex related propagation issues in multi-threaded code.
// Create a local variable "init_complete". This variable is used as an | ||
// optimization to make even more clear to the symex engine that it does not | ||
// need to unwind ifinitely many copes of the wrapper | ||
add_new_symbol( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also the local should be created in the populate-function-body phase. Only globals should be created at this stage (i.e. the global, the thread-scoped variable, and the function symbol itself).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
de6a0e4
to
ef8bf61
Compare
A temporary variable is introduced later for this if_exprt and the mode needs to be known for the variable, other an invariant may fail.
This checks the ID_mode fix for if_expr in preprocessing works correctly.
This is an implementation detail that need not be exposed to users - the interface is the free function base_type_eq.
... and sort the remaining ones lexicographically.
The previous version of clinit_wrapper did not conform to the JVM specification (section 5.5) as it did not support concurrency. More specifically, the clinit_wrapper has to be carefully synchronised as other threads may try to initialize a given class or interface at the same-time. This commit makes the clinit_wrapper thread-safe by introducing a critical section and two state variables. The commit implements a simplification of the algorithm defined in section 5.5 of the JVM specification. For instance, exceptions thrown during the execution of static initializers are not properly handled.
Setting this option will enable support for analysing multi-threaded java programs. For now, this means that a thread-safe version of clinit_wrapper is generated. This was made opt-in instead of opt-out as the inclusion of thread-safe clinit wrapper has negative implications for performance. Commit also adds appropriate regression tests that use this flag to test the resulting modifications to the clinit_wrapper
a5ab2ef
to
4c7b646
Compare
The previous version of clinit_wrapper did not conform to the JVM specification (section 5.5) as it did not support concurrency. More specifically, the clinit_wrapper has to be carefully synchronized
as other threads may try to initialize a given class or interface at the same-time.
This commit adds a new implementation of the clinit_wrapper that is thread-safe. This is achieved by introducing a critical section and two state variables. The commit implements a simplification of the algorithm defined in section 5.5 of the JVM spec. For instance exceptions thrown during the execution of static initializers are not properly handled. To use this implementation instead of the old one, a new option has been added 'java-threading'. We made this opt-in instead of simply replacing the old one for performance reasons.
Commit duplicates some regression tests, while these tests do not exhibit concurrency, the resulting change to the clinit_wrapper trigged by specifying 'java-threading' effects non-concurrent code.
Note: this change is required in-order to add concurrency support to the java-front-end.