-
Notifications
You must be signed in to change notification settings - Fork 273
Set addr_bits via command line and per language #1161
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
As much as I hope to get #1086 done (and fully working) some time soon, this seems to be a very reasonable workaround. The bits I'd ask to improve are the user-facing messages, as the majority of users will have no idea about the encoding. |
823b7e8
to
50a56a1
Compare
50a56a1
to
9d9080f
Compare
src/cbmc/cbmc_parse_options.cpp
Outdated
@@ -1060,6 +1062,7 @@ void cbmc_parse_optionst::help() | |||
" --graphml-witness filename write the witness in GraphML format to filename\n" // NOLINT(*) | |||
"\n" | |||
"Backend options:\n" | |||
" --addr-bits n number of bits used for object addresses\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say "object-bits", but more important is the message, which we need to get as clear as possible. How about "number of bits to distinguish objects that have their address taken (default: 8)". Comments welcome!
throw "too many variables"; | ||
const std::size_t max_objects=std::size_t(1)<<object_bits; | ||
if(a==max_objects) | ||
throw "too many dynamic objects: maximum number of objects is set to 2^n="+ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"too many addressed objects" maybe? Taking the address of a local or static variable also counts...
src/util/config.cpp
Outdated
} | ||
|
||
// set addr_bits according to entry point language | ||
if(symbol_table.has_symbol(CPROVER_PREFIX "_start")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use else if
and avoid duplicate error messages by just using the "DATA_INVARIANT" below?
src/cbmc/cbmc_parse_options.cpp
Outdated
@@ -662,6 +662,8 @@ int cbmc_parse_optionst::get_goto_program( | |||
if(!binaries.empty()) | |||
config.set_from_symbol_table(symbol_table); | |||
|
|||
config.set_addr_bits(cmdline, symbol_table); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this should be done by configt
automatically, as doing this only in "cbmc_parse_options.cpp" may not be sufficient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean by 'automatically'?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
config.set(cmdline)
(line 103) should just take care of this, I'd say.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can't set language defaults at this point because the language isn't known yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, we can always set defaults for all languages, can't we? If we have a design that requires calling the config
1) multiple times and 2) in a certain sequence then this is bound to break.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Setting the config has already been a multistage process before, in particular there are side conditions that the symbol table has been populated. Ie. You don't get the final config before having processed all input files. I see two options: 1) update the config after loading input files (means hooking into language_Ui.final and read Goto binary), 2) introduce a config.finalise call that must be invoked after having read the last input file.
The first option wouldn't require any collaboration of *_parse_optionst, but scatters initialisation of config over several functions, which might become hard to understand. The second one does require collaboration but makes it a bit more uniform wrt how it is currently done in master.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My apologies, yes, it is a two-step process in case goto binaries are being read. But then shouldn't those two same steps suffice to adjust the address bits?
Your suggestions go further to eliminate even that second step, or rather avoid the redundancy across tools. Indeed read_object_and_link
could be a good place to invoke config.set_from_symbol_table
.
It's only called from within configt.
0bf0ba7
to
ba6e04d
Compare
ba6e04d
to
d1c7938
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, a few minor notes
@@ -0,0 +1,9 @@ | |||
CORE | |||
main.c | |||
--32 --little-endian --object-bits 25 --pointer-check |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please comment either in this test,desc or in the test body what this assembly is testing
|
||
p.offset= | ||
binary2integer(value.substr(BV_ADDR_BITS, std::string::npos), true); | ||
binary2integer( | ||
value.substr(config.bv_encoding.object_bits, std::string::npos), true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style: break line before true);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is legal if it fits into the line.
bv_encoding.object_bits=cpp.default_object_bits; | ||
DATA_INVARIANT( | ||
0<bv_encoding.object_bits && bv_encoding.object_bits<ansi_c.pointer_width, | ||
"object_bits should fit into pointer width"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest reporting the outcome (and why) so that user reports show the decision process. For example
Running with 8 object bits, 24 offset bits (user-specified)
or
Running with 16 object bits, 48 offset bits (default for java language)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. Added output to status stream.
d1c7938
to
5c6134a
Compare
@tautschnig, @smowton, I have taken into account your comments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, thank you for all the refactoring!
One is detected with --pointer-check enabled, the other one is currently not detected.
Bug exhibited by cbmc-java/iterator2 when run with --object-bits 35
5c6134a
to
ded2d1c
Compare
@tautschnig, do we need still need these old develop->master PRs? |
I don't really think we need those anymore, unless they have resulted in useful discussion and should thus yield fixes in develop. |
I'll check before closing. |
Closing as merged in develop with "master" no longer being updated. |
In Java we often need more than 256 objects (which is a bottleneck), whereas some other people work with very large objects. Setting the number of address bits via the command line allows everyone who really needs it to adjust the encoding. Additionally language defaults are set so that for normal usage the option is not required.
1c4191a adds a test that should fail if we checked that offset arithmetic didn't overflow (which is not detected at the moment).
Tested with with --addr-bits n for n in [-1,65] on cbmc and cbmc-java regression tests. Found one bug in getting models from the solver 50a56a1.
Improved error messages (which looks ugly as it is now and shows the need for properly typed exceptions throughout the codebase).