Skip to content

Bug fix using format function on code_declt containing initialization #5833

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

thomasspriggs
Copy link
Contributor

Bug fix using format function on code_declt containing initialization. Functionality was previously unimplemented, which resulted in part of the statement being missing from the formatted output.

  • Each commit message has a non-empty body, explaining why the change was made.
  • Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • N/A The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • N/A My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • White-space or formatting changes outside the feature-related changed lines are in commits of their own.

This was previously unimplemented, which resulted in the part of the
statement being missing from the formatted output.
To confirm the functionality added is correct and continues to work when
follow up refactorings are applied.
So that the initial value can be read out of a `code_declt` without
directly accessing the operands.
To avoid directly reading from the operands.
So that the initial value can be written to a `code_declt` without
directly accessing the operands.
To avoid directly writing to the operands of a `code_declt`.
To avoid repeatedly checking the same condition and so that `value` is
only in scope where it is required and non-empty.
To avoid directly setting the operands.
To avoid directly accessing the operands.
@codecov
Copy link

codecov bot commented Feb 17, 2021

Codecov Report

Merging #5833 (dae3eca) into develop (9778868) will increase coverage by 0.01%.
The diff coverage is 96.87%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #5833      +/-   ##
===========================================
+ Coverage    72.85%   72.86%   +0.01%     
===========================================
  Files         1423     1422       -1     
  Lines       154072   154194     +122     
===========================================
+ Hits        112244   112357     +113     
- Misses       41828    41837       +9     
Impacted Files Coverage Δ
src/util/std_code.h 94.00% <90.90%> (-0.06%) ⬇️
src/goto-instrument/dump_c.cpp 79.97% <100.00%> (-0.08%) ⬇️
src/goto-instrument/goto_program2code.cpp 68.10% <100.00%> (ø)
src/goto-programs/goto_program.h 89.36% <100.00%> (+0.10%) ⬆️
src/util/format_expr.cpp 85.24% <100.00%> (+0.24%) ⬆️
unit/util/format.cpp 100.00% <100.00%> (ø)
src/solvers/flattening/boolbv.h 69.56% <0.00%> (-4.12%) ⬇️
src/solvers/flattening/bv_pointers.cpp 81.18% <0.00%> (-1.24%) ⬇️
src/solvers/flattening/boolbv_width.cpp 76.69% <0.00%> (-0.23%) ⬇️
src/util/endianness_map.h 100.00% <0.00%> (ø)
... and 16 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c408367...dae3eca. Read the comment docs.

@SaswatPadhi
Copy link
Contributor

SaswatPadhi commented Feb 17, 2021

The weakest precondition computation here assumes no initial value for a code_declt:
https://github.com/diffblue/cbmc/blob/develop/src/goto-programs/wp.cpp#L224-L235

This function would also need an update -- it currently ignores the initial value, even if it's present. There might be more such cases elsewhere in the codebase?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose I find it a bit surprising that we have initialisers in code_declts to begin with.


/// Sets the value to which this declaration initializes the declared
/// variable. Empty optional maybe passed to remove existing initialisation.
void set_initial_value(optionalt<exprt> initial_value)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If these can exist with optional initial_values then I'd much prefer we had a constructor that took an initial value as the argument rather than having a separate setter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the examples I have found so far were mutating after construction. I could add the facility to set this via the constructor as well.

@thomasspriggs
Copy link
Contributor Author

The weakest precondition computation here assumes no initial value for a code_declt:
https://github.com/diffblue/cbmc/blob/develop/src/goto-programs/wp.cpp#L224-L235

This function would also need an update -- it currently ignores the initial value, even if it's present. There might be more such cases elsewhere in the codebase?

This PR doesn't make any changes to what data is present. I assume that function currently works due to the assignment/initialisation being separated out before the function is reached. I have tried adding a precondition in that function to initialisation not being in the declaration and all tests passed. So I may tack that change onto this PR, or add it as a separate follow-up PR.

Copy link
Collaborator

@martin-cs martin-cs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR makes me concerned. Not because of what it does as such but more because of the context. I think you have uncovered one of the historic ambiguities over what, exactly, a well-formed goto_programt is.

I /guess/ what has happened is:

  1. DECL instructions used to have an initialiser statement.
  2. At some point it was decided to simplify things by having DECL be just declaration and initialisation being done with a following ASSIGN.
  3. Most of the code was converted over to split up initialisation. Some older, at-the-time untested, code was left. Also the option was not removed from goto_programt for some reason (backwards compatibility, overlooked, forgotten?).

So, perhaps the question I would ask is not /does/ it have initialisers but /should/ it have initialisers. Personally I would say no because:

  • Quite a bit of the code doesn't support it, it would be a non-trivial effort to do so.
  • AFAIK it is not currently widely used.
  • Orthogonality and simplicity in goto_programt is good because it makes analysis code easier to write.

So, sorry @thomasspriggs for hijacking what seems like a good and simple tidy up but ... should we just remove this all together?

Some of the comments imply that people have found sections of the code that do / don't take these initialisers into account. It might help the discussion if you could post links to these.

@thomasspriggs
Copy link
Contributor Author

This PR makes me concerned. Not because of what it does as such but more because of the context. I think you have uncovered one of the historic ambiguities over what, exactly, a well-formed goto_programt is.

I /guess/ what has happened is:

  1. DECL instructions used to have an initialiser statement.
  2. At some point it was decided to simplify things by having DECL be just declaration and initialisation being done with a following ASSIGN.
  3. Most of the code was converted over to split up initialisation. Some older, at-the-time untested, code was left. Also the option was not removed from goto_programt for some reason (backwards compatibility, overlooked, forgotten?).

So, perhaps the question I would ask is not /does/ it have initialisers but /should/ it have initialisers. Personally I would say no because:

  • Quite a bit of the code doesn't support it, it would be a non-trivial effort to do so.
  • AFAIK it is not currently widely used.
  • Orthogonality and simplicity in goto_programt is good because it makes analysis code easier to write.

I agree that it would be good to be certain whether it is or is not valid to include this on a code_declt or not. My current assumption is that is valid for the parse tree in the symbol table to have such an initialisation in a declaration, but not for a declaration statement in a goto program to have it. Such a distinction would be tricky to enforce given that the same data structure is used in both contexts. I will investigate how the C parser currently handles this and reply again. I would suggest that it could be better for this to be well documented and marked as deprecated, than it would be for it to be left as an undocumented gotcha for future maintainers.

So, sorry @thomasspriggs for hijacking what seems like a good and simple tidy up but ... should we just remove this all together?

My concern with trying to remove this altogether is that it is going to be difficult to be sure that we have found and removed all usages. code_declt has over 100 usages across the code base when I search for it. Any of these usages that exist could be accessing this initialisation directly through the operands. Making it potentially very time-consuming to review them all.

Some of the comments imply that people have found sections of the code that do / don't take these initialisers into account. It might help the discussion if you could post links to these.

The usages which I have looked at so far are either touched by this PR already, or mentioned in Saswat's comment.

@tautschnig
Copy link
Collaborator

Just a quick thought: should we introduce something like code_frontend_declt?

This function does not account for the case where a code_declt contains
`initialization`. Given that the test suite passes with this
PRECONDITION is in place, such statements must be separated out before
this function is reached. Should this condition be violated in future,
then the CONDITION can be removed and the handling of this case
implemented.
@thomasspriggs
Copy link
Contributor Author

Just a quick thought: should we introduce something like code_frontend_declt?

Or a code_goto_declt?

Initializations must be sepearated out of `code_declt` into a separate
assignment before adding to goto program. This is the case already.
These new INVARIANTs check that this continues to be true.
@thomasspriggs
Copy link
Contributor Author

@martin-cs I have investigated a bit further and found that declarations which initialise are produced by the C front end. I have added additional commits which add invariants/preconditions to disallow the declarations in goto-programs from containing initialisations as these should be removed by this point. Please can you re-review? If the clean-up which is in this PR is controversial then I can drop all but the first two commits from it. I do want to get the fix to format merged but I don't want to spend time expanding the scope of this PR to cover changes around using more separate data structures in the front end vs goto-programs.

Copy link
Collaborator

@tautschnig tautschnig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a reasonable intermediate fix, but the long-term route should likely be one of having a separate some_variant_of_declt class that's to be used by the front-end.

Comment on lines +202 to +206
const auto &decl = expr_checked_cast<code_declt>(code);
INVARIANT(
!decl.initial_value(),
"code_declt in goto program may not contain initialization.");
return decl;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While the INVARIANT in set_decl is probably the right thing, this one may better be placed in goto_programt::instructiont::validate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that this PR is now approved, I am going to merge as is and clean up this as a follow-up PR.

@thomasspriggs thomasspriggs merged commit c7985bd into diffblue:develop Feb 22, 2021
@thomasspriggs thomasspriggs deleted the tas/format_initialisation branch February 22, 2021 14:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants