Skip to content

Set addr_bits via command line and per language #1161

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 10 commits into from

Conversation

peterschrammel
Copy link
Member

@peterschrammel peterschrammel commented Jul 21, 2017

In Java we often need more than 256 objects (which is a bottleneck), whereas some other people work with very large objects. Setting the number of address bits via the command line allows everyone who really needs it to adjust the encoding. Additionally language defaults are set so that for normal usage the option is not required.

1c4191a adds a test that should fail if we checked that offset arithmetic didn't overflow (which is not detected at the moment).

Tested with with --addr-bits n for n in [-1,65] on cbmc and cbmc-java regression tests. Found one bug in getting models from the solver 50a56a1.

Improved error messages (which looks ugly as it is now and shows the need for properly typed exceptions throughout the codebase).

@tautschnig
Copy link
Collaborator

As much as I hope to get #1086 done (and fully working) some time soon, this seems to be a very reasonable workaround. The bits I'd ask to improve are the user-facing messages, as the majority of users will have no idea about the encoding.

@peterschrammel peterschrammel force-pushed the object-encoding branch 3 times, most recently from 823b7e8 to 50a56a1 Compare July 22, 2017 20:26
@@ -1060,6 +1062,7 @@ void cbmc_parse_optionst::help()
" --graphml-witness filename write the witness in GraphML format to filename\n" // NOLINT(*)
"\n"
"Backend options:\n"
" --addr-bits n number of bits used for object addresses\n"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say "object-bits", but more important is the message, which we need to get as clear as possible. How about "number of bits to distinguish objects that have their address taken (default: 8)". Comments welcome!

throw "too many variables";
const std::size_t max_objects=std::size_t(1)<<object_bits;
if(a==max_objects)
throw "too many dynamic objects: maximum number of objects is set to 2^n="+
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"too many addressed objects" maybe? Taking the address of a local or static variable also counts...

}

// set addr_bits according to entry point language
if(symbol_table.has_symbol(CPROVER_PREFIX "_start"))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use else if and avoid duplicate error messages by just using the "DATA_INVARIANT" below?

@@ -662,6 +662,8 @@ int cbmc_parse_optionst::get_goto_program(
if(!binaries.empty())
config.set_from_symbol_table(symbol_table);

config.set_addr_bits(cmdline, symbol_table);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this should be done by configt automatically, as doing this only in "cbmc_parse_options.cpp" may not be sufficient.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by 'automatically'?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

config.set(cmdline) (line 103) should just take care of this, I'd say.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't set language defaults at this point because the language isn't known yet.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, we can always set defaults for all languages, can't we? If we have a design that requires calling the config 1) multiple times and 2) in a certain sequence then this is bound to break.

Copy link
Member Author

@peterschrammel peterschrammel Jul 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting the config has already been a multistage process before, in particular there are side conditions that the symbol table has been populated. Ie. You don't get the final config before having processed all input files. I see two options: 1) update the config after loading input files (means hooking into language_Ui.final and read Goto binary), 2) introduce a config.finalise call that must be invoked after having read the last input file.
The first option wouldn't require any collaboration of *_parse_optionst, but scatters initialisation of config over several functions, which might become hard to understand. The second one does require collaboration but makes it a bit more uniform wrt how it is currently done in master.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My apologies, yes, it is a two-step process in case goto binaries are being read. But then shouldn't those two same steps suffice to adjust the address bits?

Your suggestions go further to eliminate even that second step, or rather avoid the redundancy across tools. Indeed read_object_and_link could be a good place to invoke config.set_from_symbol_table.

@peterschrammel peterschrammel force-pushed the object-encoding branch 2 times, most recently from 0bf0ba7 to ba6e04d Compare July 25, 2017 14:02
Copy link
Contributor

@smowton smowton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, a few minor notes

@@ -0,0 +1,9 @@
CORE
main.c
--32 --little-endian --object-bits 25 --pointer-check
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please comment either in this test,desc or in the test body what this assembly is testing


p.offset=
binary2integer(value.substr(BV_ADDR_BITS, std::string::npos), true);
binary2integer(
value.substr(config.bv_encoding.object_bits, std::string::npos), true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style: break line before true);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is legal if it fits into the line.

bv_encoding.object_bits=cpp.default_object_bits;
DATA_INVARIANT(
0<bv_encoding.object_bits && bv_encoding.object_bits<ansi_c.pointer_width,
"object_bits should fit into pointer width");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest reporting the outcome (and why) so that user reports show the decision process. For example
Running with 8 object bits, 24 offset bits (user-specified) or
Running with 16 object bits, 48 offset bits (default for java language)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. Added output to status stream.

@peterschrammel
Copy link
Member Author

@tautschnig, @smowton, I have taken into account your comments.

Copy link
Collaborator

@tautschnig tautschnig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, thank you for all the refactoring!

@tautschnig tautschnig changed the base branch from master to develop August 22, 2017 12:20
@tautschnig tautschnig changed the base branch from develop to master August 22, 2017 15:18
@tautschnig tautschnig changed the title Set addr_bits via command line and per language [develop->master] Set addr_bits via command line and per language Aug 22, 2017
@tautschnig tautschnig changed the title [develop->master] Set addr_bits via command line and per language Set addr_bits via command line and per language Sep 2, 2017
@peterschrammel
Copy link
Member Author

@tautschnig, do we need still need these old develop->master PRs?

@tautschnig
Copy link
Collaborator

I don't really think we need those anymore, unless they have resulted in useful discussion and should thus yield fixes in develop.

@peterschrammel
Copy link
Member Author

I'll check before closing.

@tautschnig
Copy link
Collaborator

Closing as merged in develop with "master" no longer being updated.

@tautschnig tautschnig closed this Mar 2, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants