Skip to content

More precise control-flow graph using the simplifier #2031

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

tautschnig
Copy link
Collaborator

No description provided.

… false

Apply the simplifier as, e.g., the C front end may generate expressions such as
0!=0 for typing purposes.
const goto_functionst &goto_functions,
const goto_programt &goto_program,
I PC)
{
const goto_programt::instructiont &instruction=*PC;
entryt entry=entry_map[PC];
assert(this->size()>entry);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pedantry but please can this be an invariant rather than an assert?

@@ -399,12 +446,13 @@ void cfg_baset<T, P, I>::compute_edges(
break;

case ASSUME:
case ASSERT:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree with this change because I don't think that's the right semantics for assert.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general "you can't get past 'assert(false)'" seems reasonable, but I note in --cover mode we intentionally sprinkle ASSERT(FALSE) all over the place and then selectively lift the assertions to try to achieve further coverage. In this case stopping at the first one isn't correct -- but I don't know if cfgt would ever be used on the GOTO program that had the coverage assertions added.

@martin-cs
Copy link
Collaborator

So... semantics of asserts and assumes... This hasn't been documented but ... it probably should be. IMHO:

ASSUME : if the condition is true continue as if nothing has happened. If it is false then immediately halt the whole process (@cesaro @tautschnig and others who have worked on multi-threading, should this be thread or process). Any assertion failures that happened before this point are still reported.

ASSERT : if the condition is true continue as if nothing has happened. If it is false then report a VC failure and continue. Any assertion failures that happened before this point are still reported.

If assert terminates analysis then 1. we have to add all assertions to the path condition and 2. we cannot use multiple 'assert false' conditions to find covering tests cases.

Copy link
Contributor

@smowton smowton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the use case? When would you want the cfg pass to do this rather than running expression simplifcation yourself prior to the CFG pass?

private:
friend cfg_baset<T, P, I, simp>;

grapht< cfg_base_nodet<T, I> > &container;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be intentional, but note the spacing avoiding >> is not necessary with C++11 onwards

@@ -399,12 +446,13 @@ void cfg_baset<T, P, I>::compute_edges(
break;

case ASSUME:
case ASSERT:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general "you can't get past 'assert(false)'" seems reasonable, but I note in --cover mode we intentionally sprinkle ASSERT(FALSE) all over the place and then selectively lift the assertions to try to achieve further coverage. In this case stopping at the first one isn't correct -- but I don't know if cfgt would ever be used on the GOTO program that had the coverage assertions added.

@smowton
Copy link
Contributor

smowton commented May 15, 2018

@tautschnig pointed out in #2125 that asserts are currently possible to pass even if they fail, as @martin-cs mentioned above, so using assert_exprt in the Java front-end to guard null pointer derefs is broken.

Thinking about this more, I think the current semantics of ASSERT are bananas. In C, in Java, in basically any other language these kinds of checks abort if they fail. In particular if I wrote

assert(i == 1);
assert(i == 1);

Then it would be nuts to say it's possible to violate the second assertion, as in that case the first one should have aborted already.

I think the only reason why ASSERT is allowed to continue at the moment is to allow us to sprinkle ASSERT FALSE around to produce coverage goals, and to permit us to hit an ASSERT FALSE target that is dominated by other ones.

I suggest one of the following:

  1. Introduce CHECK, which works like ASSERT does at present (i.e. check and continue). Define ASSERT(x) == CHECK(x); ASSUME(x);
  2. Use ASSERT(NONDET(bool)) instead of ASSERT(FALSE) for coverage goals, which should be equivalent to IF NONDET(bool) THEN ASSERT(FALSE), which itself has the same meaning as CHECK as proposed above.

Option 2 has the advantage of being backward compatible with current cbmc. We could then always treat ASSERT(x) like "target(x); ASSUME(x)."

@martin-cs @tautschnig are either of you aware of any other use of "soft assert" / check / assert-and-continue except for coverage goals?

@smowton
Copy link
Contributor

smowton commented May 15, 2018

NB. it's also possible the intent was to allow us to e.g. verify pointer constraints with --pointer-checks in the knowledge that C programs might disable assertions (-DNDEBUG etc), and indeed this is possible in Java too. However at present this allowed some but not other assertions to be enabled, which is not reasonable in any current target language. If that use case is intended I suggest a straightforward --disable-user-assertions flag with the same meaning as -DNDEBUG or not specifying -ea in Java, i.e. user-written assertions are considered no-ops.

@tautschnig
Copy link
Collaborator Author

tautschnig commented May 15, 2018

@smowton Thank you for your very clear message! I do share the opinion that ASSERT(FALSE) terminates/ought to terminate execution. @martin-cs did oppose this, so I'd be keen to hear more details from him.

For the proposal to move forward: I'd vote against introducing CHECK (or indeed any new instruction for such purpose) and I agree that the short-term solution should be the use of ASSERT(NONDET) for coverage goals. This does have the disadvantage that N+1 SAT solver calls are required for N reachable coverage goals. But then the current implementation of coverage is a surprising hack that I hadn't quite expected to find in there, having published papers that show how doing it differently is advantageous.

@tautschnig
Copy link
Collaborator Author

Actually picking up @martin-cs' earlier comment:

If assert terminates analysis then 1. we have to add all assertions to the path condition

Yes, I think we should - this would also avoid counterexample traces with multiple violated assertions the tail of which (the part one most likely looks at on long counterexamples) usually is incomprehensible until one notices that there already is an earlier failing assertion.

and 2. we cannot use multiple 'assert false' conditions to find covering tests cases.

As said above: we shouldn't be doing so anyway.

@smowton
Copy link
Contributor

smowton commented May 15, 2018

Good point re: not being able to solve multiple coverage goals in one SAT session. Suspect we'd want to benchmark with Deeptest, and if it's much worse, introduce a __CPROVER_coverage_target() intrinsic that Symex understands as goal-and-continue or something like that.

@tautschnig
Copy link
Collaborator Author

[...] that Symex understands as goal-and-continue or something like that.

Just don't do it via instrumentation. Please. Guards have all the information that you need. There is no need to add overhead to compute reachability questions that are encoded already.

@martin-cs
Copy link
Collaborator

Hmmm... I hadn't planned on causing an major discussion... perhaps we can break this down a little...

Things I think we can all agree on:
A. This needs to be clarified and unambigiously documented (perhaps in some kind of normal fotation).

B. The semantics of assume(X) are effectively if (!X) { terminate or loop forever or something }.

C. The question of terminate thread or terminate whole process is out of scope of this discussion and is trolling by the chronically unhelpful.

D. We need a way of expressing C and Java assertions.

E. We need to be able to guard against / disregard what happens after undefined / erronious behaviour such as null pointer dereferences.

F. There are two possible semantics for ASSERT:
Check semantics : the assert denotes something the verifier needs to check and nothing more.
Do-or-Die semantics : this assert denotes something the verifier needs to check and if it fails, the process is terminated.

G. The current semantics are Check and deviations from this were historically bugs.

H. @tautschnig and @smowton are arguing that we should switch to Do-or-Die semantics.

I. Do-or-Die can be modelled in terms of Check semantics by:
ASSERT(X) // Check style!
ASSUME(X)

J. Check semantics is harder to model with Do-or-Die but something like:
ASSERT(X || nondet()) // Do-or-Die style

H. In underapproximate analysis, Do-or-die style assertions should be added to the path condition but Check style ones should not be.

I guess my argument in favour of Check style semantics has a number of parts:

  1. The current semantics work and changing them would be more hassle than it is worth.

  2. It is trivial and natural to represent Do-or-Die semantics using Check (the nondet in an assert seems hackish to my eyes).

  3. Having instructions be as orthogonal as possible is A Good Thing and Check semantics is more orthogonal.

  4. For use cases D and E the Do-or-Die semantics are sufficient, however there are lots of questions we may want to ask about the code that are not these. There are questions that concern the behaviour after the check has failed or require multiple failures. Critically these can be handled in a general way, by the existing infrastructure and the some simple user modelling. Examples include asking for particular parts of the code to be reached, particular values obtained (say ones that differential analysis tells you are possible in the new but not the old), API functions called with particular values, use of resources.

  5. I'm coming to believe that not being forced to add things to the path condition is A Good Thing because it reduces the number of live variables and removes control dependencies which make slicing way more effective.

Specific use cases where it matters:

  • Reachability / MC/DC test-case generation, here check semantics can cover multiple goals with a single SAT call and no program modification or assumptions. In Do-or-Die semantics we would need one call per goal and some changes to the program. It also requires simulating what happens after the assertion to recover additional goals found. This is big and will seriously affect testgen.

  • Maximising the number of assertion failures. This can be done for Check but not easily for Do-or-Die. Use cases include test-suite minimisation, improving the quality of counter examples (for example finding C/E that burn resources fastest).

  • Do the synthesis people want counter-examples that meet any of a number of criteria, and we'd like to know all of the ones that fail, and maybe fail as many as possible?

  • I think some of the information-flow work that was being done at Queen Mary and the probabilistic work being done by Vladimir want to reason about traces that may have one or more of a set of properties.

So... yeah, I think we should stick with Check / fix things that are Do-or-Die to be check.

@martin-cs
Copy link
Collaborator

Another "soft assert" use-case. --float-overflow-check and --nan-check. Generating infinity (or, with directed rounding modes, overflowing but not generating infinities) does not (on real hardware) and should not (IMHO) terminate execution.

If we are checking for NaN as well, terminating could give a false sense of security as many ways of generating NaN work by generating an infinity first. Thus Do-or-Die semantics mean that adding flags / checks may reduce the number of verification failures which ... I find a bit counter-intuitive.

Another one is if we are using a processor / compiler specific model of undefined behaviour (currently not done but would be desireable for security) as without Check style asserts we can't "flag an error and continue".

@martin-cs
Copy link
Collaborator

More soft asserts:

  1. If you want to know which summarised functions actually got called (or even, if you have an under as well as an over-approximation, in which cases did the approximation matter), add a "soft" assert in each one along with the non-det and the assume and each of your counter-examples already has the information in.

  2. I want to add an "annotate" task to goto-analyze (I think Dan even wrote the code for this somewhere) which takes the computed domain at each point and turns it into a "soft" assert so that CBMC can then automatically check the abstract domains for unsound approximations.

I'm beginning to think this is an issue I care about...

@peterschrammel
Copy link
Member

Yes, I think we should - this would also avoid counterexample traces with multiple violated assertions the tail of which (the part one most likely looks at on long counterexamples) usually is incomprehensible until one notices that there already is an earlier failing assertion.

The goto-trace is currently cut at the first failing assertion. That's why you have never observed that situation.

@smowton
Copy link
Contributor

smowton commented May 16, 2018

I guess the scenario @tautschnig has seen is not a single trace, but rather a summary cbmc report saying various conditions can be violated, then after some investigation it becomes clear some are spurious because they require you to somehow fail a (source level) assert then continue to fail another one.

@smowton
Copy link
Contributor

smowton commented May 16, 2018

@martin-cs good points all. In this case I will submit a PR to turn all Java pseudo-exceptions into assert; assume, and I suggest a similar patch would be appropriate for the C frontend (but as Martin points out, not for FP checks introduced by goto-check for example)

I think it would also be good to do a big cleanup that uses "assert" to mean do-or-die and "check" to mean check-and-continue so this sort of confusion is less likely to reoccur... But I don't expect that one to actually get in :)

@tautschnig
Copy link
Collaborator Author

@martin-cs Thank you very much for the excellent elaboration (as always!) - I'm with @smowton that there should be a PR to introduce assumes after each assert that is meant to be fatal. For C I would argue that this applies to user-defined assertions as well as anything guarding undefined behaviour.

@martin-cs
Copy link
Collaborator

Just or completeness...

Other "soft" asserts:
Unsigned integer over/underflow in C - something we check for but is not undefined behaviour and cutting after it means we may miss a more serious issue such as an array bounds check. Over / under flow of an array index is a common start of a security exploit.
Error locations - not clear we should terminate after reaching these.
Exceptions - you may want VCs to check whether exceptions are thrown but also want to check the exception handlers.

Also, a general principle : adding and/or removing asserts / VCs should not alter the semantics of the program. Check doesn't, Do-or-Die does. This becomes an issue when you do multi-stage analysis, such as goto-analyze --simplify first or if you slice as you can't slice away Do-or-Die assertions.

At the risk of pushing my luck; can I suggest that the patches to fix this are a bit more involved than @smowton and @tautschnig suggest:

  • We need to document this somewhere.
  • All language asserts, both Java and C need assumes after them, just like the old 'if' statement. IMHO __CPROVER_assert should just assert.
  • Java needs assumes to guard against erronious behaviour (or to throw the JVM level exceptions)
  • C needs assumes to guard against undefined behaviour, independently of whether it is asserted for.
  • For C it might be good to have a per-architecture / configurable option to model the undefined behaviour rather than assume it doesn't happen. For example, we can model (reasonably accurately) what happens with signed overflow or right shifts.

@smowton
Copy link
Contributor

smowton commented May 17, 2018

Here's the Java frontend part of this: #2191

martin-cs pushed a commit to martin-cs/cbmc that referenced this pull request May 17, 2018
After a discussion on issue diffblue#2031, this was agreed to be the correct
semantics for the following reasons:

1. We want to check for conditions that do not result in termination.
For example, generation of Inf and NaN, reachability of locations and
unsigned overflow / underflow, error locations, exceptions or the use of
program approximations.

2. Allowing traces with multiple assertion failures reduces the number of
solver calls and test cases generated, allowing minimisation without
complex program manipulation.

3. If ASSERT alters control flow then the truth of assertions can change
as more assertions are added, as the failures may get blocked.  Adding
more checks can result in less failures.

4. If ASSERT alters control flow, then it must be taken into account when
slicing, meaning that every variable in any assert must be part of the
slice.

5. ASSUME allows arbitrary blocking of execution, it is better than ASSERT
is orthogonal and:
ASSERT(X)
ASSUME(X)
is used for cases where assertion failure results in termination (such as
null pointer dereference or array bounds failures).
@martin-cs
Copy link
Collaborator

Here's the documentation fix #2193 .

@peterschrammel : Does it cut after the first assert failure or at the assertion it is supposed to be checking?

martin-cs pushed a commit to martin-cs/cbmc that referenced this pull request May 18, 2018
After a discussion on issue diffblue#2031, this was agreed to be the correct
semantics for the following reasons:

1. We want to check for conditions that do not result in termination.
For example, generation of Inf and NaN, reachability of locations and
unsigned overflow / underflow, error locations, exceptions or the use of
program approximations.  This is particularly important for binary
analysis and security.

2. Allowing traces with multiple assertion failures reduces the number of
solver calls and test cases generated, allowing minimisation without
complex program manipulation.

3. If ASSERT alters control flow then the truth of assertions can change
as more assertions are added, as the failures may get blocked.  Adding
more checks can result in less failures.

4. If ASSERT alters control flow, then it must be taken into account when
slicing, meaning that every variable in any assert must be part of the
slice.

5. ASSUME allows arbitrary blocking of execution, it is better than ASSERT
is orthogonal and:
ASSERT(X)
ASSUME(X)
is used for cases where assertion failure results in termination (such as
null pointer dereference or array bounds failures).
martin-cs pushed a commit to martin-cs/cbmc that referenced this pull request May 18, 2018
After a discussion on issue diffblue#2031, this was agreed to be the correct
semantics for the following reasons:

1. We want to check for conditions that do not result in termination.
For example, generation of Inf and NaN, reachability of locations and
unsigned overflow / underflow, error locations, exceptions or the use of
program approximations.  This is particularly important for binary
analysis and security.

2. Allowing traces with multiple assertion failures reduces the number of
solver calls and test cases generated, allowing minimisation without
complex program manipulation.

3. If ASSERT alters control flow then the truth of assertions can change
as more assertions are added, as the failures may get blocked.  Adding
more checks can result in less failures.

4. If ASSERT alters control flow, then it must be taken into account when
slicing, meaning that every variable in any assert must be part of the
slice.

5. ASSUME allows arbitrary blocking of execution, it is better than ASSERT
is orthogonal and:
ASSERT(X)
ASSUME(X)
is used for cases where assertion failure results in termination (such as
null pointer dereference or array bounds failures).
kroening pushed a commit that referenced this pull request May 19, 2018
After a discussion on issue #2031, this was agreed to be the correct
semantics for the following reasons:

1. We want to check for conditions that do not result in termination.
For example, generation of Inf and NaN, reachability of locations and
unsigned overflow / underflow, error locations, exceptions or the use of
program approximations.  This is particularly important for binary
analysis and security.

2. Allowing traces with multiple assertion failures reduces the number of
solver calls and test cases generated, allowing minimisation without
complex program manipulation.

3. If ASSERT alters control flow then the truth of assertions can change
as more assertions are added, as the failures may get blocked.  Adding
more checks can result in less failures.

4. If ASSERT alters control flow, then it must be taken into account when
slicing, meaning that every variable in any assert must be part of the
slice.

5. ASSUME allows arbitrary blocking of execution, it is better than ASSERT
is orthogonal and:
ASSERT(X)
ASSUME(X)
is used for cases where assertion failure results in termination (such as
null pointer dereference or array bounds failures).
@TGWDB
Copy link
Contributor

TGWDB commented Feb 23, 2021

Closing this PR as from the discussion it is clear that there is not a resolution on how this PR should proceed, or even if it should proceed.

If you believe this is erroneous, please reopen this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants