Skip to content

Allow static attribute to be stripped from symbols #4409

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Apr 2, 2019

Conversation

karkhaz
Copy link
Collaborator

@karkhaz karkhaz commented Mar 19, 2019

This PR is intended to help users that want to write CBMC proofs for individual functions. The typical way to do this is to write a 'harness' in a different file, which calls the function under test with unconstrained values. However, CBMC cannot see the definition of that function if it is marked static (in the C language. static in Java means something else). We would like to enable writing harnesses even for static functions, or functions that call static functions.

There are two issues solved in this PR.

  1. goto-cc removes symbols from the symbol table if it determines that those symbols are not externally visible. This PR adds the --keep-implementations flag to goto-cc, which prevents it from removing those symbols from the symbol table.
  2. Symbols naming functions defined as static have the file_local flag set. This PR adds the --strip-static flag to goto-instrument, which unsets the file_local flag on all symbols whose name is given as an argument.

So the intended workflow is: compile the file under test with goto-cc, passing in a list of functions that should be exposed. Then pass that same list of functions to goto-instrument so that it can unset the flag. This goto binary can then be successfully linked with a harness that calls the static function.

Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: 82a83e6).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/105034698

@karkhaz karkhaz assigned smowton, tautschnig and martin-cs and unassigned karkhaz Mar 19, 2019
@martin-cs
Copy link
Collaborator

martin-cs commented Mar 27, 2019 via email

@tautschnig
Copy link
Collaborator

Question: it seems like I will need to recursively walk over irepts to do the name-mangling in the program. Is there an existing way of doing this, i.e. some sort of visitor-pattern thing in the CBMC codebase, or do I just do it manually? Asking because I did a similar thing for the linker script work (where I had to match multi-level nested irepts to change their type)...I remember having a conversation with @tautschnig long ago about having a tool called goto-grep to find particular shapes in irepts, and the natural extension of that would be to have goto-sed ;-)

Maybe rename_symbolt is all that you need?

karkhaz added 2 commits March 31, 2019 17:19
This commit allows remove_internal_symbols() to take a message_handlert
object for logging in preparation for a commit where this function will
emit debugging information.
@karkhaz karkhaz force-pushed the kk-strip-statics branch 8 times, most recently from 00b1797 to 067a665 Compare April 2, 2019 02:37
@karkhaz
Copy link
Collaborator Author

karkhaz commented Apr 2, 2019

I'm going to leave the tests to run overnight, but I think I'm ready for review.

Summary

  • Everything is done in goto-cc now. There is no more goto-instrument pass.
  • The flag is called --export-function-local-symbols, because there isn't actually a GCC flag called -fkeep-static-functions (you probably meant -fkeep-inline-functions or -fkeep-static-consts).
  • If the user passes that flag, all static functions in the translation unit are kept and name-mangled.

Name mangling

I've included two implementations. The active one mangles function foo() in path/to/bar.c into

__CPROVER_file_local_bar_c_foo()

which should be good enough almost all of the time. To deal with having the same-named function in a same-named file in a different part of the codebase, you can pass an argument to --mangle-suffix:

__CPROVER_file_local_bar_c_SUFFIX_foo()

this way, users decide whether they need a suffix. They can then pass a different suffix to each invocation of goto-cc. That way, all functions in the same file will be mangled with the same suffix, but different files will have different suffixes.

there is another implementation that hashes the pathname using Daniel J. Bernstein's hash, but it's not currently used. The implementation is chosen using templates, so there's no runtime cost to keeping both. The idea with that implementation is that it takes the entire file path into account, so there's no need for a suffix (but you can still provide one if you like). I turned it off because it's not really readable, I'd like harnesses to be readable.

__CPROVER_file_local_4a82bea1_foo()

Misc

  • rename_symbolt worked nicely, thank you!
  • regression/goto-cc-file-local/chain.sh has become interesting. Usually we use the third line of test.desc to pass command-line args to whatever tool chain.sh is invoking, but I'm instead passing arguments to chain.sh itself to control what it does. Depending on the test, it sometimes links the goto-binary and sometimes doesn't, and then sometimes runs goto-instrument and other times runs cbmc.
  • @martin-cs I just remembered why I wanted to only selectively export functions. The codebase I'm working on is full of function pointers. The more function implementations we have lying around, the more exponential blowup CBMC will encounter while exploring every possible function implementation that fits the signature of a function pointer. Nevertheless, I reasoned that this is such a serious performance problem that I should spend time actually fixing it, rather than merely avoiding making it worse by only exporting some functions. So I'll get round to fixing the function pointer problem one day.

@karkhaz karkhaz assigned smowton, tautschnig and martin-cs and unassigned karkhaz Apr 2, 2019
@karkhaz
Copy link
Collaborator Author

karkhaz commented Apr 2, 2019

(assigning folks for re-review because I've changed the approach significantly. Thank you!)

Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: 067a665).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/106670075

karkhaz added 2 commits April 2, 2019 08:50
This is to avoid breaking the API while introducing a new parameter,
symbols_to_keep. This parameter will be used in a later commit that will
prevent the typecheck method from stripping out user-requested symbols.
This switch makes goto-cc do three things:

- retain the implementations of all static functions.
- mangle the names of all static functions.
- unset the file_local flag on all static function symbols.

This feature is intended to help users writing CBMC checks for
file-local functions (i.e., those declared as `static` in the C
language). The intended workflow is: compile the code under test with
the -fkeep-static-functions flag. Then link the code under test with
the 'harness' file, again using the flag. This allows the harness to
call into static functions in the codebase under test.
@karkhaz karkhaz force-pushed the kk-strip-statics branch from 067a665 to 4438e77 Compare April 2, 2019 15:51
@karkhaz
Copy link
Collaborator Author

karkhaz commented Apr 2, 2019

Thanks @smowton, I've resolved all your suggestions.

Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: 4438e77).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/106771499

@tautschnig tautschnig merged commit 14b15f6 into diffblue:develop Apr 2, 2019
@karkhaz karkhaz deleted the kk-strip-statics branch April 2, 2019 17:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants