Skip to content

Fix bug that prevented --malloc-may-fail from working #5500

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions jbmc/src/jbmc/jbmc_parse_options.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -726,6 +726,10 @@ int jbmc_parse_optionst::get_goto_program(

goto_modelt &goto_model = dynamic_cast<goto_modelt &>(*goto_model_ptr);

// if we added the ansi-c library models here this should also call
// add_malloc_may_fail_variable_initializations(goto_model);
// here
Copy link
Contributor

@thomasspriggs thomasspriggs Sep 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚫 I am not sure this comment is helpful. JBMC uses its own models rather than the ansi-c models. Also the java language uses new, not malloc. This is parsed from the class file into ID_java_new / ID_java_new_array, which is then lowered directly into ID_allocate. This bypasses the use of a malloc function entirely. See jbmc/src/java_bytecode/remove_java_new.cpp for details.

Copy link
Contributor Author

@hannes-steffenhagen-diffblue hannes-steffenhagen-diffblue Sep 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thomasspriggs there are valid reasons why you might want to add the ansi-c library here; e.g. if there was JNI support and you wanted to link your java models to C models. Not a thing as of now as far as I know, but it's not inconceivable that his would be a concern in the future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, that make a little more sense, if you were thinking about JNI support. AFAIK JNI support isn't planned for the near future. I also suspect that it would be tricky to implement given the complexities of lazy loading.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that’s why I didn’t want to put it here in the first place! But if you wanted to load the C library, you’d have to add this code and the flags as well, which I believe was @martin-cs's concern and why he asked for these comments.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thomasspriggs : Yes, as @hannes-steffenhagen-diffblue observed JNI is one thing. Another is that there are a lot of implicit assumptions in the modelling and instrumentation that the C library is present. It is likely that future modelling of more sophisticated things may well make use of / depend on the C library and I would hate to see the intermediate representation diverge because of this.

@hannes-steffenhagen-diffblue : yes, my concern is in divergence of how we handle things and the "well it works if you use X but not if you use Y first, then X". This way the divergence is clear and documented in the relevant place (ideally it would be in goto-instrument and goto-diff too).


if(cmdline.isset("validate-goto-model"))
{
goto_model.validate();
Expand Down
4 changes: 2 additions & 2 deletions regression/goto-analyzer/constant_propagation_01/test.desc
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@ main.c
--constants --simplify out.gb
^EXIT=0$
^SIGNAL=0$
^Simplified: assert: 1, assume: 0, goto: 1, assigns: 8, function calls: 0$
^Unmodified: assert: 0, assume: 0, goto: 0, assigns: 18, function calls: 2$
^Simplified: assert: 1, assume: 0, goto: 1, assigns: 7, function calls: 0$
^Unmodified: assert: 0, assume: 0, goto: 0, assigns: 17, function calls: 2$
--
^warning: ignoring
4 changes: 2 additions & 2 deletions regression/goto-analyzer/constant_propagation_02/test.desc
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@ main.c
--constants --simplify out.gb
^EXIT=0$
^SIGNAL=0$
^Simplified: assert: 1, assume: 0, goto: 1, assigns: 9, function calls: 0$
^Unmodified: assert: 0, assume: 0, goto: 0, assigns: 17, function calls: 2$
^Simplified: assert: 1, assume: 0, goto: 1, assigns: 8, function calls: 0$
^Unmodified: assert: 0, assume: 0, goto: 0, assigns: 16, function calls: 2$
--
^warning: ignoring
4 changes: 2 additions & 2 deletions regression/goto-analyzer/constant_propagation_03/test.desc
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@ main.c
--constants --simplify out.gb
^EXIT=0$
^SIGNAL=0$
^Simplified: assert: 1, assume: 0, goto: 1, assigns: 9, function calls: 0$
^Unmodified: assert: 0, assume: 0, goto: 0, assigns: 17, function calls: 2$
^Simplified: assert: 1, assume: 0, goto: 1, assigns: 8, function calls: 0$
^Unmodified: assert: 0, assume: 0, goto: 0, assigns: 16, function calls: 2$
--
^warning: ignoring
4 changes: 2 additions & 2 deletions regression/goto-analyzer/constant_propagation_04/test.desc
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@ main.c
--constants --simplify out.gb
^EXIT=0$
^SIGNAL=0$
^Simplified: assert: 1, assume: 0, goto: 1, assigns: 9, function calls: 0$
^Unmodified: assert: 0, assume: 0, goto: 0, assigns: 17, function calls: 2$
^Simplified: assert: 1, assume: 0, goto: 1, assigns: 8, function calls: 0$
^Unmodified: assert: 0, assume: 0, goto: 0, assigns: 16, function calls: 2$
--
^warning: ignoring
4 changes: 2 additions & 2 deletions regression/goto-analyzer/constant_propagation_07/test.desc
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@ main.c
--constants --simplify out.gb
^EXIT=0$
^SIGNAL=0$
^Simplified: assert: 1, assume: 0, goto: 3, assigns: 14, function calls: 0$
^Unmodified: assert: 0, assume: 0, goto: 1, assigns: 15, function calls: 2$
^Simplified: assert: 1, assume: 0, goto: 3, assigns: 13, function calls: 0$
^Unmodified: assert: 0, assume: 0, goto: 1, assigns: 14, function calls: 2$
--
^warning: ignoring
4 changes: 2 additions & 2 deletions regression/goto-analyzer/constant_propagation_12/test.desc
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@ main.c
--constants --simplify out.gb
^EXIT=0$
^SIGNAL=0$
^Simplified: assert: 1, assume: 0, goto: 1, assigns: 8, function calls: 0$
^Unmodified: assert: 0, assume: 0, goto: 1, assigns: 16, function calls: 2$
^Simplified: assert: 1, assume: 0, goto: 1, assigns: 7, function calls: 0$
^Unmodified: assert: 0, assume: 0, goto: 1, assigns: 15, function calls: 2$
--
^warning: ignoring
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
#include <assert.h>
#include <stdlib.h>

int main(void)
{
int *array = calloc(10, sizeof(int));
assert(array != NULL);
free(array);
return 0;
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
CORE
test.c
--malloc-may-fail --malloc-fail-null
^EXIT=10$
^SIGNAL=0$
\[main.assertion.1\] line 7 assertion array != .*: FAILURE
--
--
Compiling a file with goto-cc should not affect how --malloc-may-fail behaves
Regex in the assertion because on travis for some reason the preprocessor seems to run
before cbmc.
6 changes: 2 additions & 4 deletions src/ansi-c/ansi_c_internal_additions.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -186,17 +186,15 @@ void ansi_c_internal_additions(std::string &code)
CPROVER_PREFIX "size_t size, " CPROVER_PREFIX "bool zero);\n"
"const void *" CPROVER_PREFIX "alloca_object = 0;\n"

"int " CPROVER_PREFIX "malloc_failure_mode="+
std::to_string(config.ansi_c.malloc_failure_mode)+";\n"
"extern int " CPROVER_PREFIX "malloc_failure_mode;\n"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the malloc-X options be removed from goto-cc now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@peterschrammel we never added them to goto-cc. What we did add to goto-cc was --object-bits, which could be removed with a similar approach (or by doing the check the object bits are used for in a different way entirely the way @danpoe was suggesting).

"int " CPROVER_PREFIX "malloc_failure_mode_return_null="+
std::to_string(config.ansi_c.malloc_failure_mode_return_null)+";\n"
"int " CPROVER_PREFIX "malloc_failure_mode_assert_then_assume="+
std::to_string(config.ansi_c.malloc_failure_mode_assert_then_assume)+";\n"
CPROVER_PREFIX "size_t " CPROVER_PREFIX "max_malloc_size="+
integer2string(max_malloc_size(config.ansi_c.pointer_width, config
.bv_encoding.object_bits))+";\n"
CPROVER_PREFIX "bool " CPROVER_PREFIX "malloc_may_fail = " +
std::to_string(config.ansi_c.malloc_may_fail) + ";\n"
"extern " CPROVER_PREFIX "bool " CPROVER_PREFIX "malloc_may_fail;\n"

// this is ANSI-C
"extern " CPROVER_PREFIX "thread_local const char __func__["
Expand Down
3 changes: 3 additions & 0 deletions src/cbmc/cbmc_parse_options.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -48,6 +48,7 @@ Author: Daniel Kroening, [email protected]
#include <goto-checker/stop_on_fail_verifier.h>
#include <goto-checker/stop_on_fail_verifier_with_fault_localization.h>

#include <goto-programs/add_malloc_may_fail_variable_initializations.h>
#include <goto-programs/adjust_float_expressions.h>
#include <goto-programs/initialize_goto_model.h>
#include <goto-programs/instrument_preconditions.h>
Expand Down Expand Up @@ -879,6 +880,8 @@ bool cbmc_parse_optionst::process_goto_program(
link_to_library(
goto_model, log.get_message_handler(), cprover_c_library_factory);

add_malloc_may_fail_variable_initializations(goto_model);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is now part of the correct set up for goto programs can you either A. add this to all such set-up code ( goto-analyzer, goto-diff, goto-instrument at least) or B. add it to one of the existing methods -- I would not oppose it being part of link_to_library as it is part of the handling of the library.

Copy link
Contributor Author

@hannes-steffenhagen-diffblue hannes-steffenhagen-diffblue Sep 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It definitely can’t go into link_to_library and shouldn’t go into any of the instrumentation tools (i.e. it can’t go into link_to_library because it shouldn’t go into the instrumentation tools). The reason this is put in this location is specifically to avoid instrumentation code setting these variables, they should only be set by cbmc. Arguably jbmc and goto-analyzer too, but definitely not goto-instrument etc.

The thing with goto-analyzer though is also that because the stdlib models are cbmc models, I don't think they'd even make sense for goto-analyzer (and I don't think they're loaded). Similar story with jbmc, I don't think it loads the C models (that would make more sense like to verify C code interacting with java code I guess, but I don't know if anyone's actually doing that).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To summarise above, I don’t really agree from what I understand of the situation, but maybe I’m missing something.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They shouldn't just be cbmc models; goto-analyzer will need them as well. They may not work that well at the moment but let's not make them any more broken than they currently are. In jbmc I don't know what their models are written in, Java or C. It would be good if it was possible for it to work.

My concern is avoiding divergence in how the tools set up and configure goto programs. This has been a source of subtle bugs in the past.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@martin-cs Again, I could put this code in goto-analyzer but I still don’t quite understand what good it’ll do. The models library as a matter of fact is currently not being used for goto-analyzer (and as far as I know this has been the case for at least 5 years), so if I add this code it’ll naturally go untested. So the problem is that then I’m adding code to goto-analyzer that is vestigial to begin with (i.e. literally a no-op in all circumstances). While I do understand that we want to keep goto transforms consistent between different tools in this particular case I don’t see the value in it. Right now goto-analyzer doesn’t even understand things like __CPROVER_allocate etc (or rather, I don’t think any of the domains do). For me the problem here is I’m adding code to places that I know does nothing right now, and that I can’t be sure will do the right thing whenever someone changes the surrounding context in such a way that would allow it to do something. Whenever someone in the future would read that goto-analyzer code and asks "what is this bit of the code for", the true answer would be "it does absolutely nothing because the preconditions for it doing anything are not there". If it ever starts doing anything we’ll have no way of knowing it’s doing the right thing because we didn’t (and couldn’t) test it until then.

I think we’re both concerned about maintainability here, but we seem to disagree on which way round is better for maintainability.

@peterschrammel @piotr-grabalski @NlightNFotis @thomasspriggs would you mind chipping in? Very much willing to be overruled here, but I want to be sure we’re doing the right thing here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think it should be called by goto-analyzer too.
IMO, this shouldn't be implemented through the models library at all, but by a pass over the goto model - but that may follow afterwards. This PR is for urgently fixing a major bug in CBMC when used with goto-cc. I.e. priority is getting the bug fixed today. Tomorrow we can look how to improve the solution.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main reason I’d want it to be in the models library is because IMHO whether or not this can return NULL is part of the model, and you should be able to figure out what behaviour the models library has from looking at the library code. Instrumenting logic means distributing the logic between the models library source files and different parts of the C++ code base, but like you said we should discuss how to move forward with this when we’ve actually fixed the problem.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hannes-steffenhagen-diffblue would you be OK adding a comments (and commented out code?) in the other locations saying why it shouldn't / isn't / may be needed? Would be happy to approve that and get it merged in an hour or so when the CI passes?

@peterschrammel I feel the library solution has a number of advantages; should we talk about this at some point?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 There is kind of a design decision to be made, as to whether these checks should be part of the model or to be an instrumentation pass. I liked the idea of this being an instrumentation pass.

⛏️ If we are sticking with them being part of the implementation of the model, then I think the approach outlined in this PR isn't good in the long term either. The reason being that this is separating correctly configuring the model/library from adding it. This leaves things in an invalid state in between the two stages, where it is added but not correctly configured. However I am willing to approve as is because this is the simplest way to keep the existing user interface and fix the issue in a short period of time.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thomasspriggs agreed -- we do need to discuss this. Adding the models and configuring them are two different things and yes, we do need to think about it. Do you want to open an issue to discuss this?


if(options.get_bool_option("string-abstraction"))
string_instrumentation(goto_model, log.get_message_handler());

Expand Down
8 changes: 7 additions & 1 deletion src/goto-analyzer/goto_analyzer_parse_options.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,7 @@ Author: Daniel Kroening, [email protected]

#include <jsil/jsil_language.h>

#include <goto-programs/add_malloc_may_fail_variable_initializations.h>
#include <goto-programs/adjust_float_expressions.h>
#include <goto-programs/goto_convert_functions.h>
#include <goto-programs/goto_inline.h>
Expand Down Expand Up @@ -814,7 +815,12 @@ bool goto_analyzer_parse_optionst::process_goto_program(
link_to_library(
goto_model, ui_message_handler, cprover_cpp_library_factory);
link_to_library(goto_model, ui_message_handler, cprover_c_library_factory);
#endif

// these are commented out as well because without the library
// this initialization code doesn’t make any sense
add_malloc_may_fail_variable_initializations(goto_model);

#endif

// remove function pointers
log.status() << "Removing function pointers and virtual functions"
Expand Down
3 changes: 2 additions & 1 deletion src/goto-programs/Makefile
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
SRC = adjust_float_expressions.cpp \
SRC = add_malloc_may_fail_variable_initializations.cpp \
adjust_float_expressions.cpp \
builtin_functions.cpp \
class_hierarchy.cpp \
class_identifier.cpp \
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,62 @@
/// \file
/// \author Diffblue Ltd.

#include "add_malloc_may_fail_variable_initializations.h"

#include "goto_model.h"

#include <util/arith_tools.h>
#include <util/config.h>
#include <util/irep.h>

template <typename T>
code_assignt make_integer_assignment(
const symbol_table_baset &symbol_table,
const irep_idt &symbol_name,
T &&value)
{
const auto &symbol = symbol_table.lookup_ref(symbol_name).symbol_expr();
return code_assignt{symbol, from_integer(value, symbol.type())};
}

void add_malloc_may_fail_variable_initializations(goto_modelt &goto_model)
{
const auto malloc_may_fail_id = irep_idt{CPROVER_PREFIX "malloc_may_fail"};
const auto malloc_failure_mode_id =
irep_idt{CPROVER_PREFIX "malloc_failure_mode"};
if(!goto_model.symbol_table.has_symbol(malloc_may_fail_id))
{
// if malloc_may_fail isn't in the symbol table (i.e. builtin library not
// loaded) then we don't generate initialisation code for it.
return;
}

INVARIANT(
goto_model.symbol_table.has_symbol(malloc_failure_mode_id),
"if malloc_may_fail is in the symbol table then so should "
"malloc_failure_mode");

auto const initialize_function_name = CPROVER_PREFIX "initialize";
PRECONDITION(
goto_model.get_goto_functions().function_map.count(
initialize_function_name) == 1);
auto &initialize_function =
goto_model.goto_functions.function_map.find(initialize_function_name)
->second;
const auto initialize_function_end =
--initialize_function.body.instructions.end();

initialize_function.body.insert_before(
initialize_function_end,
goto_programt::make_assignment(make_integer_assignment(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ I think I would have replaced the existing assignment in cprover_initialise, rather than adding extra assignments. Your approach will make the resulting code in cprover_initialise more difficult to read, when it is printed for debugging purposes. I think that replacing the existing assignment would also have reduced the need to modify the regression tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thomasspriggs I don’t understand how having an extra assignment is more difficult to read/debug than adding extra logic to functions at runtime

goto_model.symbol_table,
malloc_may_fail_id,
config.ansi_c.malloc_may_fail)));

initialize_function.body.insert_before(
initialize_function_end,
goto_programt::make_assignment(make_integer_assignment(
goto_model.symbol_table,
malloc_failure_mode_id,
config.ansi_c.malloc_failure_mode)));
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
/// \file
/// \author Diffblue Ltd.
/// Adds an instrumentation to add initialisations for variables
/// used by the --malloc-may-fail and related flags in cbmc

// NOLINT(whitespace/line_length)
#ifndef CPROVER_GOTO_PROGRAMS_ADD_MALLOC_MAY_FAIL_VARIABLE_INITIALIZATIONS_H
// NOLINT(whitespace/line_length)
#define CPROVER_GOTO_PROGRAMS_ADD_MALLOC_MAY_FAIL_VARIABLE_INITIALIZATIONS_H

class goto_modelt;

/// Some variables have different initial values based on what flags are being
/// passed to cbmc; since the _start function might've been generated by goto-cc
/// before getting passed over here we need to add the initialisation of these
/// variables here as a post-processing step
void add_malloc_may_fail_variable_initializations(goto_modelt &goto_model);

// NOLINT(whitespace/line_length)
#endif // CPROVER_GOTO_PROGRAMS_ADD_MALLOC_MAY_FAIL_VARIABLE_INITIALIZATIONS_H