Skip to content

Making the instrumented clinit_wrapper function thread-safe #1786

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

Degiorgio
Copy link
Contributor

@Degiorgio Degiorgio commented Jan 31, 2018

The previous version of clinit_wrapper did not conform to the JVM specification (section 5.5) as it did not support concurrency. More specifically, the clinit_wrapper has to be carefully synchronized
as other threads may try to initialize a given class or interface at the same-time.

This commit adds a new implementation of the clinit_wrapper that is thread-safe. This is achieved by introducing a critical section and two state variables. The commit implements a simplification of the algorithm defined in section 5.5 of the JVM spec. For instance exceptions thrown during the execution of static initializers are not properly handled. To use this implementation instead of the old one, a new option has been added 'java-threading'. We made this opt-in instead of simply replacing the old one for performance reasons.

Commit duplicates some regression tests, while these tests do not exhibit concurrency, the resulting change to the clinit_wrapper trigged by specifying 'java-threading' effects non-concurrent code.

Note: this change is required in-order to add concurrency support to the java-front-end.

@thk123
Copy link
Contributor

thk123 commented Jan 31, 2018

. @smowton should definitely look at this as has been touching this recently, I'd also appreciate a test-gen pointer bump.

@Degiorgio Degiorgio force-pushed the concurrency-support-for-clinit branch from 76b1842 to 9694036 Compare February 1, 2018 10:31
/// This wrapper is called immediately before any read or write of a static
/// field or whenever an instance of the class is created.
/// When class \p classname does not need static initialization we
/// return a nil irep. We also explicitely avoid introducing calls to the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

explicitly


// Initialize the syper-class C' and
// the implemented interfaces l_1 ... l_n.
// see JVM Spec p.359 step 7, for the exact defination of
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

definition

const clinit_statest state)
{
mp_integer initv(static_cast<int>(state));
constant_exprt init_s=from_integer(initv, java_int_type());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you require a full int here?

/// place, calls the actual <clinit> method if not, and initializes super-
/// classes and interfaces.
/// \param classname: Class name
/// This function retrieves from or creates into the symbol table a wrapper
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be worth factoring out the clinit generation into a separate .cpp file in order to reduce the size of this monster.cpp.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed, will do this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hang on! This is already done in the PR this is racing against :)

Copy link
Contributor

@smowton smowton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some nitpicks, and some bad news: this is going to clash with #1774, which I hope to get in later today, and reworks how static init methods are generated to create them lazily, and create their state variables (the bool already_run, and now your new state variable) before method conversion.

The good news however: this should be quite easy to rebase on #1774 -- simply create your additional global where #1774 does and copy-paste your init code into my new java_static_initializers.cpp file.

The local variable should be created during method elaboration (i.e. in get_clinit_wrapper_body).

// if we are initializing class C due to a use of class C within method
// `C.<clinit>` then avoid the recursive (and unnecessary) call to
// the wrapper.
const irep_idt &real_clinit_name=id2string(classname)+".<clinit>:()V";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is new code, use the new clang-format-compatible / human-readable operator spacing (i.e. spaces around ==, =, etc)

true, false);

code_blockt function_body;
codet atomic_begin(ID_atomic_begin);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider making a std_code entry for these?

function_body.add(conditional);
}

// Initialize the syper-class C' and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

superclass

// This is achieved by iterating through the base types and
// adding recursive calls to the clinit_wrapper()
{
const symbolt &class_symbol=*symbol_table.lookup(classname);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lookup_ref

const irep_idt &name,
const typet &type,
const exprt &value,
const bool is_thread_local,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally base_name is the unqualified name, e.g. name == java::Class.field => base_name == field.

/// /param is_thread_local: if true this symbol will be set as thread local
/// /param is_static_lifetime: if true this symbol will be set as static
/// /return returns new symbol.
symbolt java_bytecode_convert_methodt::add_new_symbol(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This thing is rather specific! Probably rename it "add_variable_symbol"?

const irep_idt &name,
const typet &type,
const exprt &value,
const bool is_thread_local,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These aren't very obvious at the callsite -- perhaps use two-valued enums so we don't mix up true, false and false, true?

/// equal exprt.
/// /return returns a equal_exprt, equating \p symbol_expr to the integer
/// representation of \p state
equal_exprt java_bytecode_convert_methodt::gen_clinit_eqexpr(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These don't really need to be members -- they're clearly static helpers

@Degiorgio
Copy link
Contributor Author

@smowton, oks will rebase on top of 1774

@Degiorgio Degiorgio force-pushed the concurrency-support-for-clinit branch 2 times, most recently from f8a32ea to 6691b0d Compare February 12, 2018 10:34
@Degiorgio
Copy link
Contributor Author

Degiorgio commented Feb 12, 2018

@smowton , rebase done, can you take a look please?

@Degiorgio Degiorgio force-pushed the concurrency-support-for-clinit branch 4 times, most recently from 7143b9f to 72c061b Compare February 12, 2018 12:55
Copy link
Contributor

@smowton smowton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some suggestions

{
NOT_INIT,
IN_PROGRESS,
RETURN
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

INIT_COMPLETE?

Copy link
Contributor Author

@Degiorgio Degiorgio Feb 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of 'RETURN'? yes INIT_COMPLETE is better, changing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Degiorgio Do not forget to update the documentation above

equal_exprt check_already_run(
already_run_symbol.symbol_expr(),
false_exprt());
const symbolt &clinit_s_sym =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use _state instead of _s

Copy link
Contributor Author

@Degiorgio Degiorgio Feb 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


add_new_symbol(
symbol_table,
clinit_local_state_variable_name(class_name),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename this variable _thread_local rather than just _local to clarify we mean thread-locality not that it's a function local

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

/// java::C::__CPROVER_clinit_state = IN_PROGRESS
/// was_ready = false;
/// }
/// else If(java::C::__CPROVER_clinit_state == READY)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of date? AFAIK there is no state named READY anymore

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep this should be RETURN

/// java::C::__CPROVER_clinit_local_state = RETURN;
///
/// ATOMIC_BEGIN
/// Assume(java::C::__CPROVER_clinit_state != IN_PROGRESS)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?? Will that work? Presumably we want the other thread to busy-wait for IN_PROGRESS to transition to DONE in this case. If this is indeed magic that symex understands like a wait then write that in the comments.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that will work. It's exactly how POSIX mutexes are implemented in pthread_mutex_lock, see here:

https://github.com/diffblue/cbmc/blob/develop/src/ansi-c/library/pthread_lib.c#L132

Thanks, we will indeed document it.

true);

// Create a local variable "was_ready". This variable is used optimization as
// an optimization to make even more clear to the symex engine that it does
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: "is used as an optimization"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@Degiorgio Degiorgio force-pushed the concurrency-support-for-clinit branch 2 times, most recently from 0b9e580 to bfe8097 Compare February 13, 2018 16:17
@cesaro
Copy link
Contributor

cesaro commented Feb 13, 2018

@smowton all suggestions applied, please have a another look

@Degiorgio Degiorgio force-pushed the concurrency-support-for-clinit branch 3 times, most recently from 43ed109 to 43e69ac Compare February 15, 2018 14:47
@Degiorgio
Copy link
Contributor Author

@peterschrammel can you take another look at this please? All the issues you raised should be addressed.

Copy link
Contributor

@smowton smowton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, more stuff to fix

/// equal exprt.
/// /return returns a equal_exprt, equating \p symbol_expr to the integer
/// representation of \p state
equal_exprt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

static?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

symbol_table,
clinit_local_init_complete_var_name(class_name),
bool_typet(),
false_exprt(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fwiw the value member of a local variable is ignored; it's up to you to generate an initialiser. Therefore recommend passing nil_exprt() there and checking in add-new-symbol that function-scoped (non-global) variables do not attempt to specify an initialiser this way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

static_assert((int) clinit_statest::INIT_COMPLETE==2, "Check comment above");
#endif

/// bool init_complete;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-doxy (//) comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

true,
true);

// Create a local variable "init_complete". This variable is used as an
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comment is inaccurate, does it belong to the thread-local variable instead? The function-local one is just used to copy the global so you can inspect it and then maybe return outside a critical section.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the comment is misleading agreed. rewording.

Copy link
Contributor Author

@Degiorgio Degiorgio Feb 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are using a local boolean for inspection on purpose. The reason being that inspecting the global state resulted in symex related propagation issues in multi-threaded code.

// Create a local variable "init_complete". This variable is used as an
// optimization to make even more clear to the symex engine that it does not
// need to unwind ifinitely many copes of the wrapper
add_new_symbol(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also the local should be created in the populate-function-body phase. Only globals should be created at this stage (i.e. the global, the thread-scoped variable, and the function symbol itself).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@Degiorgio Degiorgio force-pushed the concurrency-support-for-clinit branch 2 times, most recently from de6a0e4 to ef8bf61 Compare February 15, 2018 17:22
tautschnig and others added 22 commits May 2, 2018 17:52
A temporary variable is introduced later for this if_exprt and the mode
needs to be known for the variable, other an invariant may fail.
This checks the ID_mode fix for if_expr in preprocessing works correctly.
This is an implementation detail that need not be exposed to users - the
interface is the free function base_type_eq.
... and sort the remaining ones lexicographically.
The previous version of clinit_wrapper did not conform to the JVM
specification (section 5.5) as it did not support concurrency. More
specifically, the clinit_wrapper has to be carefully synchronised
as other threads may try to initialize a given class or interface
at the same-time.

This commit makes the clinit_wrapper thread-safe by introducing a
critical section and two state variables. The commit implements a
simplification of the algorithm defined in section 5.5 of the JVM
specification. For instance, exceptions thrown during the execution
of static initializers are not properly handled.
Setting this option will enable  support for analysing
multi-threaded java programs. For now, this means that a thread-safe
version of clinit_wrapper is generated. This was made opt-in instead
of opt-out as the inclusion of thread-safe clinit wrapper has negative
implications for performance.

Commit also adds appropriate regression tests that use this flag to test
the resulting modifications to the clinit_wrapper
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants