-
Notifications
You must be signed in to change notification settings - Fork 274
Fix bug that prevented --malloc-may-fail from working #5500
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
#include <assert.h> | ||
#include <stdlib.h> | ||
|
||
int main(void) | ||
{ | ||
int *array = calloc(10, sizeof(int)); | ||
assert(array != NULL); | ||
free(array); | ||
return 0; | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
CORE | ||
test.c | ||
--malloc-may-fail --malloc-fail-null | ||
^EXIT=10$ | ||
^SIGNAL=0$ | ||
\[main.assertion.1\] line 7 assertion array != .*: FAILURE | ||
-- | ||
-- | ||
Compiling a file with goto-cc should not affect how --malloc-may-fail behaves | ||
Regex in the assertion because on travis for some reason the preprocessor seems to run | ||
before cbmc. |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -186,17 +186,15 @@ void ansi_c_internal_additions(std::string &code) | |
CPROVER_PREFIX "size_t size, " CPROVER_PREFIX "bool zero);\n" | ||
"const void *" CPROVER_PREFIX "alloca_object = 0;\n" | ||
|
||
"int " CPROVER_PREFIX "malloc_failure_mode="+ | ||
std::to_string(config.ansi_c.malloc_failure_mode)+";\n" | ||
"extern int " CPROVER_PREFIX "malloc_failure_mode;\n" | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can the malloc-X options be removed from goto-cc now? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @peterschrammel we never added them to |
||
"int " CPROVER_PREFIX "malloc_failure_mode_return_null="+ | ||
std::to_string(config.ansi_c.malloc_failure_mode_return_null)+";\n" | ||
"int " CPROVER_PREFIX "malloc_failure_mode_assert_then_assume="+ | ||
std::to_string(config.ansi_c.malloc_failure_mode_assert_then_assume)+";\n" | ||
CPROVER_PREFIX "size_t " CPROVER_PREFIX "max_malloc_size="+ | ||
integer2string(max_malloc_size(config.ansi_c.pointer_width, config | ||
.bv_encoding.object_bits))+";\n" | ||
CPROVER_PREFIX "bool " CPROVER_PREFIX "malloc_may_fail = " + | ||
std::to_string(config.ansi_c.malloc_may_fail) + ";\n" | ||
"extern " CPROVER_PREFIX "bool " CPROVER_PREFIX "malloc_may_fail;\n" | ||
|
||
// this is ANSI-C | ||
"extern " CPROVER_PREFIX "thread_local const char __func__[" | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -48,6 +48,7 @@ Author: Daniel Kroening, [email protected] | |
#include <goto-checker/stop_on_fail_verifier.h> | ||
#include <goto-checker/stop_on_fail_verifier_with_fault_localization.h> | ||
|
||
#include <goto-programs/add_malloc_may_fail_variable_initializations.h> | ||
#include <goto-programs/adjust_float_expressions.h> | ||
#include <goto-programs/initialize_goto_model.h> | ||
#include <goto-programs/instrument_preconditions.h> | ||
|
@@ -879,6 +880,8 @@ bool cbmc_parse_optionst::process_goto_program( | |
link_to_library( | ||
goto_model, log.get_message_handler(), cprover_c_library_factory); | ||
|
||
add_malloc_may_fail_variable_initializations(goto_model); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If this is now part of the correct set up for goto programs can you either A. add this to all such set-up code ( goto-analyzer, goto-diff, goto-instrument at least) or B. add it to one of the existing methods -- I would not oppose it being part of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It definitely can’t go into The thing with There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. To summarise above, I don’t really agree from what I understand of the situation, but maybe I’m missing something. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. They shouldn't just be My concern is avoiding divergence in how the tools set up and configure goto programs. This has been a source of subtle bugs in the past. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @martin-cs Again, I could put this code in goto-analyzer but I still don’t quite understand what good it’ll do. The models library as a matter of fact is currently not being used for goto-analyzer (and as far as I know this has been the case for at least 5 years), so if I add this code it’ll naturally go untested. So the problem is that then I’m adding code to goto-analyzer that is vestigial to begin with (i.e. literally a no-op in all circumstances). While I do understand that we want to keep goto transforms consistent between different tools in this particular case I don’t see the value in it. Right now goto-analyzer doesn’t even understand things like I think we’re both concerned about maintainability here, but we seem to disagree on which way round is better for maintainability. @peterschrammel @piotr-grabalski @NlightNFotis @thomasspriggs would you mind chipping in? Very much willing to be overruled here, but I want to be sure we’re doing the right thing here. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, I think it should be called by goto-analyzer too. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The main reason I’d want it to be in the models library is because IMHO whether or not this can return NULL is part of the model, and you should be able to figure out what behaviour the models library has from looking at the library code. Instrumenting logic means distributing the logic between the models library source files and different parts of the C++ code base, but like you said we should discuss how to move forward with this when we’ve actually fixed the problem. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @hannes-steffenhagen-diffblue would you be OK adding a comments (and commented out code?) in the other locations saying why it shouldn't / isn't / may be needed? Would be happy to approve that and get it merged in an hour or so when the CI passes? @peterschrammel I feel the library solution has a number of advantages; should we talk about this at some point? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🤔 There is kind of a design decision to be made, as to whether these checks should be part of the model or to be an instrumentation pass. I liked the idea of this being an instrumentation pass. ⛏️ If we are sticking with them being part of the implementation of the model, then I think the approach outlined in this PR isn't good in the long term either. The reason being that this is separating correctly configuring the model/library from adding it. This leaves things in an invalid state in between the two stages, where it is added but not correctly configured. However I am willing to approve as is because this is the simplest way to keep the existing user interface and fix the issue in a short period of time. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @thomasspriggs agreed -- we do need to discuss this. Adding the models and configuring them are two different things and yes, we do need to think about it. Do you want to open an issue to discuss this? |
||
|
||
if(options.get_bool_option("string-abstraction")) | ||
string_instrumentation(goto_model, log.get_message_handler()); | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -26,6 +26,7 @@ Author: Daniel Kroening, [email protected] | |
|
||
#include <jsil/jsil_language.h> | ||
|
||
#include <goto-programs/add_malloc_may_fail_variable_initializations.h> | ||
#include <goto-programs/adjust_float_expressions.h> | ||
#include <goto-programs/goto_convert_functions.h> | ||
#include <goto-programs/goto_inline.h> | ||
|
@@ -814,7 +815,12 @@ bool goto_analyzer_parse_optionst::process_goto_program( | |
link_to_library( | ||
goto_model, ui_message_handler, cprover_cpp_library_factory); | ||
link_to_library(goto_model, ui_message_handler, cprover_c_library_factory); | ||
#endif | ||
|
||
// these are commented out as well because without the library | ||
// this initialization code doesn’t make any sense | ||
add_malloc_may_fail_variable_initializations(goto_model); | ||
|
||
#endif | ||
|
||
// remove function pointers | ||
log.status() << "Removing function pointers and virtual functions" | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,62 @@ | ||
/// \file | ||
/// \author Diffblue Ltd. | ||
|
||
#include "add_malloc_may_fail_variable_initializations.h" | ||
|
||
#include "goto_model.h" | ||
|
||
#include <util/arith_tools.h> | ||
#include <util/config.h> | ||
#include <util/irep.h> | ||
|
||
template <typename T> | ||
code_assignt make_integer_assignment( | ||
const symbol_table_baset &symbol_table, | ||
const irep_idt &symbol_name, | ||
T &&value) | ||
{ | ||
const auto &symbol = symbol_table.lookup_ref(symbol_name).symbol_expr(); | ||
return code_assignt{symbol, from_integer(value, symbol.type())}; | ||
} | ||
|
||
void add_malloc_may_fail_variable_initializations(goto_modelt &goto_model) | ||
{ | ||
const auto malloc_may_fail_id = irep_idt{CPROVER_PREFIX "malloc_may_fail"}; | ||
const auto malloc_failure_mode_id = | ||
irep_idt{CPROVER_PREFIX "malloc_failure_mode"}; | ||
if(!goto_model.symbol_table.has_symbol(malloc_may_fail_id)) | ||
{ | ||
// if malloc_may_fail isn't in the symbol table (i.e. builtin library not | ||
// loaded) then we don't generate initialisation code for it. | ||
return; | ||
} | ||
|
||
INVARIANT( | ||
goto_model.symbol_table.has_symbol(malloc_failure_mode_id), | ||
"if malloc_may_fail is in the symbol table then so should " | ||
"malloc_failure_mode"); | ||
|
||
auto const initialize_function_name = CPROVER_PREFIX "initialize"; | ||
PRECONDITION( | ||
goto_model.get_goto_functions().function_map.count( | ||
initialize_function_name) == 1); | ||
auto &initialize_function = | ||
goto_model.goto_functions.function_map.find(initialize_function_name) | ||
->second; | ||
const auto initialize_function_end = | ||
--initialize_function.body.instructions.end(); | ||
|
||
initialize_function.body.insert_before( | ||
initialize_function_end, | ||
goto_programt::make_assignment(make_integer_assignment( | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ⛏️ I think I would have replaced the existing assignment in cprover_initialise, rather than adding extra assignments. Your approach will make the resulting code in There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @thomasspriggs I don’t understand how having an extra assignment is more difficult to read/debug than adding extra logic to functions at runtime |
||
goto_model.symbol_table, | ||
malloc_may_fail_id, | ||
config.ansi_c.malloc_may_fail))); | ||
|
||
initialize_function.body.insert_before( | ||
initialize_function_end, | ||
goto_programt::make_assignment(make_integer_assignment( | ||
goto_model.symbol_table, | ||
malloc_failure_mode_id, | ||
config.ansi_c.malloc_failure_mode))); | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
/// \file | ||
/// \author Diffblue Ltd. | ||
/// Adds an instrumentation to add initialisations for variables | ||
/// used by the --malloc-may-fail and related flags in cbmc | ||
|
||
// NOLINT(whitespace/line_length) | ||
#ifndef CPROVER_GOTO_PROGRAMS_ADD_MALLOC_MAY_FAIL_VARIABLE_INITIALIZATIONS_H | ||
// NOLINT(whitespace/line_length) | ||
#define CPROVER_GOTO_PROGRAMS_ADD_MALLOC_MAY_FAIL_VARIABLE_INITIALIZATIONS_H | ||
|
||
class goto_modelt; | ||
|
||
/// Some variables have different initial values based on what flags are being | ||
/// passed to cbmc; since the _start function might've been generated by goto-cc | ||
/// before getting passed over here we need to add the initialisation of these | ||
/// variables here as a post-processing step | ||
void add_malloc_may_fail_variable_initializations(goto_modelt &goto_model); | ||
|
||
// NOLINT(whitespace/line_length) | ||
#endif // CPROVER_GOTO_PROGRAMS_ADD_MALLOC_MAY_FAIL_VARIABLE_INITIALIZATIONS_H |
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚫 I am not sure this comment is helpful. JBMC uses its own models rather than the ansi-c models. Also the java language uses
new
, not malloc. This is parsed from the class file intoID_java_new
/ID_java_new_array
, which is then lowered directly intoID_allocate
. This bypasses the use of a malloc function entirely. Seejbmc/src/java_bytecode/remove_java_new.cpp
for details.Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@thomasspriggs there are valid reasons why you might want to add the ansi-c library here; e.g. if there was JNI support and you wanted to link your java models to C models. Not a thing as of now as far as I know, but it's not inconceivable that his would be a concern in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, that make a little more sense, if you were thinking about JNI support. AFAIK JNI support isn't planned for the near future. I also suspect that it would be tricky to implement given the complexities of lazy loading.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that’s why I didn’t want to put it here in the first place! But if you wanted to load the C library, you’d have to add this code and the flags as well, which I believe was @martin-cs's concern and why he asked for these comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@thomasspriggs : Yes, as @hannes-steffenhagen-diffblue observed JNI is one thing. Another is that there are a lot of implicit assumptions in the modelling and instrumentation that the C library is present. It is likely that future modelling of more sophisticated things may well make use of / depend on the C library and I would hate to see the intermediate representation diverge because of this.
@hannes-steffenhagen-diffblue : yes, my concern is in divergence of how we handle things and the "well it works if you use X but not if you use Y first, then X". This way the divergence is clear and documented in the relevant place (ideally it would be in
goto-instrument
andgoto-diff
too).