Remove invariant on `full_lhs_value` in printing `goto_tracet` #4153

thomasspriggs · 2019-02-11T10:54:40Z

This invariant in the printing is preventing other, otherwise usefull
output from being printed, in the case where a goto trace has been
produced which contains a full_lhs_value set to nil in an assignment
step.

Each commit message has a non-empty body, explaining why the change was made.
Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
My commit message includes data points confirming performance improvements (if claimed).
My PR is restricted to a single feature or bugfix.
White-space or formatting changes outside the feature-related changed lines are in commits of their own.

This invariant in the printing is preventing other, otherwise usefull output from being printed, in the case where a goto trace has been produced which contains a `full_lhs_value` set to `nil` in an assignment step.

kroening · 2019-02-11T10:58:08Z

Do we understand under which conditions this is nil?
If these are desirable cases, that member could be an optionalt?

thomasspriggs · 2019-02-11T11:13:03Z

No. I don't understand exactly why we have a nil in the output in this particular case. I encountered this issue due to failed regression tests after, adding the trace to diagnostic information printed from test-gen. If this PR is rejected as is, my next step will be to revert printing the trace instead.

NlightNFotis · 2019-02-11T11:21:55Z

It's generally discouraged to just blindly remove INVARIANTs like this. The invariants are part executable documentation of the code. It would be much better if this was debugged and the root issue was fixed rather than just remove the INVARIANT. Unless you're certain that the INVARIANT was incorrect in the first case.

thomasspriggs · 2019-02-11T11:28:14Z

Is printing really the appropriate place for these invariants at all? By this point the bulk of the time consuming evaluation has been completed and I think it is useful to be able to get the output even if it isn't necessarily 100% valid. Could the invariants in trace printing be moved into a separate "validate trace" process instead?

tautschnig

Please do dig deeper as others have already suggested.

tautschnig · 2019-02-11T11:28:25Z

src/goto-programs/json_goto_trace.cpp

@@ -132,9 +132,6 @@ void convert_decl(
  }
  else
  {
-    DATA_INVARIANT(
-      step.full_lhs_value.is_not_nil(),
-      "full_lhs_value in assignment must not be nil");
    full_lhs_value = json(step.full_lhs_value, ns, ID_unknown);


It's already worth investigating why we even end up in this branch. That means that the symbol on the LHS is not in the namespace, which should only be happening if it is a symbol that symex has introduced. Even then that's got a justification for why the full_lhs_value should be nil. We've probably got some bug in goto-symex in that case, but I have no idea which one that is.

Am I correct in my understanding of this comment that this invariant should hold for both the branch where the symbol is found and when it is not?

I think that should be the case, yes - and that observation of yours might be the key to resolving this: either it is an invariant, or it should go away completely. For debugging purposes I'd suggest copying this into symex_target_equationt to figure out where the invariant is first violated.

I've added in symex_target_equationt::assignment the following invariants:

DATA_INVARIANT( ssa_full_lhs.is_not_nil(), "full_lhs_value in assignment must not be nil"); DATA_INVARIANT( original_full_lhs.is_not_nil(), "full_lhs_value in assignment must not be nil"); DATA_INVARIANT( ssa_rhs.is_not_nil(), "full_lhs_value in assignment must not be nil");

And none of them trigger on the failing case (i.e. still see unknown in the trace). Is that what you meant? I can look at the assignment calls to see what these values are, but don't really know what I'd be looking for.

@thk123 Many thanks, that's good to know. Thus things must be going wrong in goto-symex/build_goto_trace.cpp line 344 I'd say. Would you mind inserting an appropriate invariant in there? I think we end up in boolbvt::bv_get_unbounded_array, which actually does return nil_exprt on a couple of branches (but I'm not sure that's a good idea). Actually there are quite a few occurrences of nil_exprt in solvers/flattening/boolbv_get.cpp. I'm not sure anyone of them is a good idea (and the hard-coded 100 in line 361 isn't good either). Could you work with @kroening to resolve this?

So it comes down to:

boolbvt::bv_get_unbounded_array

Which is failing to evaluate the size expr. The reason for this failure is the expression is (rewritten for readability):

if (&symex_dynamic::dynamic_object1 == constant(18000000000000)) 4; else symex::invalid_object.length;

Note the constant exprt on the rhs of the if is:

constant * type: pointer * width: 64 * value: 18000000000000 0: address_of 0: symbol * identifier: symex_dynamic::dynamic_object1 * expression: symbol * identifier: symex_dynamic::dynamic_object1 * #SSA_symbol: 1 * L1_object_identifier: symex_dynamic::dynamic_object1

So fixing the underling problem seems hard - the safe_pointers is not able to see through the GOTO:

ASSUME !((struct java::array[reference] *)a == null) // Null pointer check ASSERT ((struct java::array[reference] *)a)->length >= 0 // Array size should be >= 0 ASSUME ((struct java::array[reference] *)a)->length >= 0 // Array size should be >= 0 newarray_tmp1 = ALLOCATE(struct java::array[float], 16ul, false); *newarray_tmp1 = { [email protected]={ .@class_identifier="java::array[float]" }, .length=0, .data=null }; newarray_tmp1->length = ((struct java::array[reference] *)a)->length;

Obviously this can be fixed by maybe making safe pointers smarter or the front end put the null check nearer, but shouldn't fundamentally a trace still be produced without a fix here? Should bv_get_unbounded_array stumble on, looking up the array and disregarding the size? I don't think I have enough insight into the expected behaviour to realistically do this unfortunately.

I have created #4197 to address the limitations in bv_get_unbounded_array. Input, and in particular tests would be very much appreciated.

symbol on the LHS is not in the namespace, which should only be happening if it is a symbol that symex has introduced.

Even that shouldn't happen if the correct namespace was used (which at least the new goto-checker implementations should be doing).

tautschnig · 2019-02-11T11:38:46Z

Is printing really the appropriate place for these invariants at all?

No, this should really be checked by SSA validation (the code triggered by --validate-ssa-equation).

thk123 · 2019-02-11T11:53:03Z

I'm going to have a dig see if I can produce a minimal example here. It is entirely possible that turning --validate-ssa-equation would also cause this test to fail...

allredj

✔️
Passed Diffblue compatibility checks (cbmc commit: eb5b1c4).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/100433562

thk123 · 2019-02-12T14:33:49Z

So the essence of the problem is the symbol doesn't exist, because it is a symbol introduced by symex, and we print this trace using original goto model, rather than bmc.ns (which is used for regular trace printing).

This looks a little fiddly to fix on our end, so let me make an moral argument for why this fix is ok 😆

Printing a trace is a useful operation both for providing users how there program is broken, but also in debugging code that uses traces. In the latter case, you always want it to print, even (And indeed especially) if the trace is nonsense. Therefore I believe it should always have a bash at printing it. A validate trace step before printing might make sense for the first case.

As an aside, if I move this invariant to above the lookup:

--- src/goto-programs/json_goto_trace.cpp	(date 1549634233000)
+++ src/goto-programs/json_goto_trace.cpp	(date 1549981900000)
@@ -118,6 +118,10 @@
   const symbolt *symbol;
   irep_idt base_name, display_name;
 
+  DATA_INVARIANT(
+    step.full_lhs_value.is_not_nil(),
+    "full_lhs_value in assignment must not be nil");
+
   if(!ns.lookup(identifier, symbol))
   {
     base_name = symbol->base_name;
@@ -132,9 +136,6 @@
   }
   else
   {
-    DATA_INVARIANT(
-      step.full_lhs_value.is_not_nil(),
-      "full_lhs_value in assignment must not be nil");
     full_lhs_value = json(step.full_lhs_value, ns, ID_unknown);
   }

Then I can reproduce this invariant on JBMC. Notice the only transformation this nil expr goes through is simplify_expr so if this really should never be true, I think it is true in some cases.

thk123 · 2019-02-12T17:58:46Z

For example:

public static void minimal(Object[] a) {
    CProver.assume(a != null && a.length > 1);
    boolean[] assigned = new boolean[a.length];
    assigned[0] = true;

    // Cause NPE
    Object o = null;
    o.toString();
}

(Null Pointer Exception added to the end to ensure a trace is produced, there's probably a nicer way - it isn't required).

Running this through jbmc:

$ jbmc TestClass.class --function "TestClass.minimal:([Ljava/lang/Object;)V" --trace --json-ui

Produces a trace that contains:

{
  "assignmentType": "variable",
  "hidden": false,
  "internal": true,
  "lhs": "byte_extract_little_endian(dynamic_9_array[0l], 0l, signed char)",
  "sourceLocation": {
    "bytecodeIndex": "17",
    "file": "TestClass.java",
    "function": "java::TestClass.minimal:([Ljava/lang/Object;)V",
    "line": "31"
  },
  "stepType": "assignment",
  "thread": 0,
  "value": {
    "name": "unknown"
  }
},

Note the unknown value is exactly the case where the full_lhs_value is nil (and if the invariant is moved above the if, then it will crash).

I also note in this example, the boolean array is cast to a byte array (becuase java bytecode using bastore (byte array store) to store values in the array. The cast is wrong, but I think this bug lies in the Java bytecode front end

tautschnig · 2019-02-12T18:33:22Z

Thank you @thk123 for further debugging. There may be some Java-related problem, but I don't think it's restricted to the Java front-end. Two observations:

Omitting --json-ui you get something like

byte_extract_little_endian(dynamic_9_array, 0l + (signed long int)0l, signed char)=byte_extract_little_endian(irep("(\"nil\" \"type\" (\"\"))"), 0l, signed char) (?)

in the trace. This isn't appropriate output as it humans wouldn't be able to make sense of this. To me this confirms that the invariant is appropriate, and that we are failing it because of a bug, not because the invariant shouldn't be there.
2) I have previously seen this on C code as well, for example regression/cbmc/trace_address_arithmetic1/main.c.

thk123 · 2019-02-13T11:55:21Z

PS the cast to byte array, though probably wrong, is not ultimately responsible here - changing the array to a float array does not resolve the issue.

tautschnig · 2019-03-30T00:29:19Z

I would prefer #4372 over this one.

tautschnig · 2019-04-03T12:58:48Z

@thomasspriggs Should this be closed now that #4372 has been merged?

thomasspriggs · 2019-04-03T13:00:51Z

I am happy to close this. @thk123 Can you confirm that you are also happy for this PR to be closed?

thk123 · 2019-04-03T14:34:58Z

Yup - we now need to flip the printing back on since the invariant should have gone away...

Remove invariant on full_lhs_value in printing goto_tracet

eb5b1c4

This invariant in the printing is preventing other, otherwise usefull output from being printed, in the case where a goto trace has been produced which contains a `full_lhs_value` set to `nil` in an assignment step.

thomasspriggs requested review from kroening, peterschrammel, pkesseli, smowton and tautschnig as code owners February 11, 2019 10:54

thomasspriggs requested review from romainbrenguier and thk123 February 11, 2019 10:55

tautschnig requested changes Feb 11, 2019

View reviewed changes

allredj reviewed Feb 11, 2019

View reviewed changes

tautschnig mentioned this pull request Feb 15, 2019

Do not generate nil_exprt in bv_get #4197

Merged

3 tasks

tautschnig self-assigned this Feb 26, 2019

thk123 mentioned this pull request Mar 12, 2019

Add test verifying no invalid trace with dynamic array #4372

Merged

5 tasks

tautschnig assigned thk123 and thomasspriggs and unassigned tautschnig Mar 30, 2019

thk123 closed this Apr 3, 2019

thomasspriggs deleted the tas/trace_printing_error branch November 3, 2019 09:02

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Remove invariant on `full_lhs_value` in printing `goto_tracet` #4153

Remove invariant on `full_lhs_value` in printing `goto_tracet` #4153

thomasspriggs commented Feb 11, 2019 •

edited

Loading

kroening commented Feb 11, 2019

thomasspriggs commented Feb 11, 2019

NlightNFotis commented Feb 11, 2019

thomasspriggs commented Feb 11, 2019

tautschnig left a comment

tautschnig Feb 11, 2019

thk123 Feb 12, 2019

tautschnig Feb 12, 2019

thk123 Feb 13, 2019

tautschnig Feb 13, 2019

thk123 Feb 13, 2019

thk123 Feb 13, 2019

tautschnig Feb 15, 2019

peterschrammel Feb 16, 2019

tautschnig commented Feb 11, 2019

thk123 commented Feb 11, 2019

allredj left a comment

thk123 commented Feb 12, 2019

thk123 commented Feb 12, 2019 •

edited

Loading

tautschnig commented Feb 12, 2019

thk123 commented Feb 13, 2019

tautschnig commented Mar 30, 2019

tautschnig commented Apr 3, 2019

thomasspriggs commented Apr 3, 2019

thk123 commented Apr 3, 2019

Remove invariant on full_lhs_value in printing goto_tracet #4153

Remove invariant on full_lhs_value in printing goto_tracet #4153

Conversation

thomasspriggs commented Feb 11, 2019 • edited Loading

kroening commented Feb 11, 2019

thomasspriggs commented Feb 11, 2019

NlightNFotis commented Feb 11, 2019

thomasspriggs commented Feb 11, 2019

tautschnig left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tautschnig commented Feb 11, 2019

thk123 commented Feb 11, 2019

allredj left a comment

Choose a reason for hiding this comment

thk123 commented Feb 12, 2019

thk123 commented Feb 12, 2019 • edited Loading

tautschnig commented Feb 12, 2019

thk123 commented Feb 13, 2019

tautschnig commented Mar 30, 2019

tautschnig commented Apr 3, 2019

thomasspriggs commented Apr 3, 2019

thk123 commented Apr 3, 2019

Remove invariant on `full_lhs_value` in printing `goto_tracet` #4153

Remove invariant on `full_lhs_value` in printing `goto_tracet` #4153

thomasspriggs commented Feb 11, 2019 •

edited

Loading

thk123 commented Feb 12, 2019 •

edited

Loading