-
Notifications
You must be signed in to change notification settings - Fork 274
Add support for variables entering scope via a goto #8091
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
ee4c7d7
to
747bb63
Compare
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## develop #8091 +/- ##
========================================
Coverage 79.09% 79.10%
========================================
Files 1695 1695
Lines 196514 196596 +82
========================================
+ Hits 155436 155517 +81
- Misses 41078 41079 +1 ☔ View full report in Codecov by Sentry. |
36a28a6
to
6575151
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not convinced that we need to insert a fresh program variable. What if we used dominator analysis instead, as follows: for any use of a variable, lift its declaration to a place so that the declaration dominates all uses. Thereafter, move the "DEAD" statement so that it post-dominates the declaration.
src/goto-programs/scope_tree.h
Outdated
struct declaration_statet | ||
{ | ||
goto_programt::targett instruction; | ||
std::unordered_set<irep_idt, irep_id_hash> if_conditions_added; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no idea what "if_conditions_added" means.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have now renamed and documented this in the two commits I just pushed. Please take another look to see if the meaning is clear to you now.
// } | ||
// to code which looks like - | ||
// { | ||
// __CPROVER_bool going_to::user_label; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not convinced that something that can be solved at compile time should introduce runtime cost.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel confident that the change proposed in this PR will result in an analysis which is correct. I am not certain that the the issue can be solved at compile time in all cases. As such I would propose that we fix first and that optimisation should be considered as a follow-up if there is indeed a notable runtime cost. Please also bear in mind that this transformation is only performed in the case of a goto jumping into a scope and not for other constructs.
If you look at the regression/cbmc/declaration-goto/re-enter_scope.c
included in this PR then I would say that -
- The
assert(i == 42)
assertion should fail. The scope of thei
andj
variables does not intersect, therefore a C compiler should be free to use the same stack location for both variables. Therefore thei
variable is not guaranteed to hold the same value on re-entering the scope. - If we attempt to solve this problem in cbmc by moving the declaration of the
i
variable alone, then cbmc would reportSUCCESS
forassert(i == 42)
when it should fail.
I think that approach may have otherwise unforeseen consequences in the presence of loops. If a goto into a loop causes a variable to come into scope, then the common ancestor would be outside of the loop. Moving the declaration there would result in a single variable being reused for all loop iterations, rather than a fresh variable being declared in each iteration. This could then lead to all loop iterations then seeing the same non-det value, rather than a fresh nondet value for each iteration. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few minor changes.
Also some architectural commits may be squashed to obtain a slightly cleaner history (the first 3 commits may be squashed, the one changing if_conditions_added
can be lifted to when that function was added, ...).
const build_declaration_hops_inputst &inputs) | ||
{ | ||
// In the case of a goto jumping into a scope, the declarations (but not the | ||
// initialisations) need to be executed. This function performs a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⛏️ US english
// initialisations) need to be executed. This function performs a | |
// initializations) need to be executed. This function performs a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our coding standards do not specify either US or British English. Given that the US spelling can be understood by users of British English and vice versa, it does not seem worthwhile to change this.
src/goto-programs/goto_convert.cpp
Outdated
// is illegal for C++ non-pod types and impossible in Java in any case. | ||
// This is however valid C. | ||
// Variables *entering* scope on goto, is illegal for C++ non-pod types | ||
// and impossible in Java. This is however valid C and should be taken |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⛏️ Actually in C it's illegal to jump entering the scope of a variable-length array or variably-modified typed variables, so the comment is wrong and should rephrased.
However in general it is possible, so I'm just raising as a :pick and not blocking
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added this detail to the comment.
30ce02b
to
ae2db07
Compare
I have addressed your various comments. The commits dealing with the rename were added in order to address Michael's comment. I did not wish to rebase whilst you were in the middle of reviewing as this can cause the github UI to lose your place. The Initial commits are separate as I was concerned about the legibility of a commit which combines a file rename with changes in the file being renamed. |
We'll need to split up goto conversion by language. |
Hi, I was reading the C11 standard about storage durations and goto statements (6.2.4 Storage durations of objects), and how it interacts with gotos and variable length arrays:
So I would think that you also need to handle cases where a goto jumps into a scope after the declaration, in that case the symbol must be declared but has a undefined value. Right now the fix seems to handle only the cas where a goto jumps over some declarations within a block (block in the C source sense). At goto level things could be handled by moving all DECLs of a C block to the beginning of the first GOTO block encoding that C block. That would handle gotos jumping over C decls. For gotos jumping somehwere in the middle a C block from outside of it, you could insert a new GOTO basic block of DECLs that declares all locals of the corresponding C block. {
int x;
if (x >0) goto LABEL;
{
int y = 0;
LABEL:
y += 1;
assert(y ==1);
}
} would give this GOTO program :
|
@remi-delmas-3000 Before we arrived at the solution presented in this PR we tried an implementation which relied on multiple declarations for the same variable. This approach did not yield the results we were hoping for as the two declarations were treated as declarations for different variables despite them both using the same identifier. We did not see the the phi functions we expected during The solution presented in this PR is also expected to work for backwards gotos. Example input with backwards goto.#include <assert.h>
#include <stdbool.h>
int main()
{
bool if_condition1;
if(if_condition1)
{
int x;
x = 0;
my_label:
assert(x == 0);
}
bool if_condition2;
if(if_condition2)
goto my_label;
return 0;
} Converted goto.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Summarising the discussion above:
- What's the performance impact of this PR? None, except when the program actually exhibits the situation that the PR fixes.
- What are alternative solutions? Moving decl and dead statements to pre/postdominator locations and inserting nondet assignments in the appropriate places.
- Should goto conversion be language specific? Control flow and scoping semantics needs to be added at some point, either in the construction of the symbol table, the conversion to the goto program or during symbolic execution. Currently, this is done in the goto conversion. We assume that all supported front ends produce symbol tables that are compatible with what goto conversion is doing. If the language semantics deviates then we patch up the goto program (as we do for Java). This PR doesn't seem to be a good enough reason to change that, in particular as C is the only language concerned.
@thomasspriggs, needs a rebase. |
Because this will be needed later for entering a scope via a C front end goto statement. The declaration is optional because it is only needed for named front end declarations. It is not needed for anonymous symbols which cannot be referenced by user code.
This data structure is going to be used for declarations as well as destructors. So it needs to be renamed to match the expanded use case.
To match the new class names.
So it can be retrieved when producing the code for jumping into a scope.
Jumping into scope will may cause these declarations to be executed. Therefore we need to know what the declarations are for each scope in order acheive this.
So that we can track what changes have been made to each declaration.
When a goto which jumps into a scope is converted, the declarations for that scope need to be passed through. It needs to be the same declaration in both execution paths, so that symex will correctly merge the two different states when generating the phi functions. The requirement for executing the same declaration whether entering the scope through a regular route vs a goto into the middle of the block is acheived by setting up a chain of hops where - * Original goto hops to first declaration. * After first declaration we hop to the next declaration if entering via a goto. * Afer the final declaration we hop to the originally specified label. The identifiers in the existing `symex_should_filter_value_sets` test need to be incremented, because the instance being checked is now the second time the variable is added to the scope.
The names of the parameters of `unwind_destructor_stack` were the opposite way round in the header from the `.cpp`. The order in the `.cpp` is the actual order as it is currently implemented. So by (un)swapping the order in the header the names are made consistent.
ae2db07
to
49e8e53
Compare
We must convert the condition of an if/then/else first to make sure any declarations produced by converting the condition have a suitable scope-tree node. Previously, gotos in either branch of the if-then-else were led to believe that declarations resulting from condition were not yet in scope. The feature from diffblue#8091 would, thus, result in a spurious loop back to the declaration to put it in scope before following the goto edge.
We must convert the condition of an if/then/else first to make sure any declarations produced by converting the condition have a suitable scope-tree node. Previously, gotos in either branch of the if-then-else were led to believe that declarations resulting from condition were not yet in scope. The feature from diffblue#8091 would, thus, result in a spurious loop back to the declaration to put it in scope before following the goto edge.
We must convert the condition of an if/then/else first to make sure any declarations produced by converting the condition have a suitable scope-tree node. Previously, gotos in either branch of the if-then-else were led to believe that declarations resulting from condition were not yet in scope. The feature from diffblue#8091 would, thus, result in a spurious loop back to the declaration to put it in scope before following the goto edge.
We must convert the condition of an if/then/else first to make sure any declarations produced by converting the condition have a suitable scope-tree node. Previously, gotos in either branch of the if-then-else were led to believe that declarations resulting from condition were not yet in scope. The feature from diffblue#8091 would, thus, result in a spurious loop back to the declaration to put it in scope before following the goto edge.
We must convert the condition of an if/then/else first to make sure any declarations produced by converting the condition have a suitable scope-tree node. Previously, gotos in either branch of the if-then-else were led to believe that declarations resulting from condition were not yet in scope. The feature from diffblue#8091 would, thus, result in a spurious loop back to the declaration to put it in scope before following the goto edge.
We must convert the condition of an if/then/else first to make sure any declarations produced by converting the condition have a suitable scope-tree node. Previously, gotos in either branch of the if-then-else were led to believe that declarations resulting from condition were not yet in scope. The feature from diffblue#8091 would, thus, result in a spurious loop back to the declaration to put it in scope before following the goto edge.
We must convert the condition of an if/then/else first to make sure any declarations produced by converting the condition have a suitable scope-tree node. Previously, gotos in either branch of the if-then-else were led to believe that declarations resulting from condition were not yet in scope. The feature from diffblue#8091 would, thus, result in a spurious loop back to the declaration to put it in scope before following the goto edge.
We must convert the condition of an if/then/else first to make sure any declarations produced by converting the condition have a suitable scope-tree node. Previously, gotos in either branch of the if-then-else were led to believe that declarations resulting from condition were not yet in scope. The feature from diffblue#8091 would, thus, result in a spurious loop back to the declaration to put it in scope before following the goto edge.
We must convert the condition of an if/then/else first to make sure any declarations produced by converting the condition have a suitable scope-tree node. Previously, gotos in either branch of the if-then-else were led to believe that declarations resulting from condition were not yet in scope. The feature from diffblue#8091 would, thus, result in a spurious loop back to the declaration to put it in scope before following the goto edge.
This PR adds support for variables entering scope via a goto. This fixes the following github issue - #7997