Parsing and type-checking loop variants (aka ranking functions) #6183

LongPham7 · 2021-06-16T14:33:21Z

In this pull request, I have implemented the parsing and type-checking of loop variants (also known as ranking functions). A loop variant is an "integer expression" that (i) is bounded below and (ii) strictly decreases after each iteration of a loop. We use loop variants to prove termination of loops. This is the feature I am currently working on during my research internship at AWS.

Each commit message has a non-empty body, explaining why the change was made.
n/a ~~Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.~~
The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
n/a ~~My commit message includes data points confirming performance improvements (if claimed).~~
My PR is restricted to a single feature or bugfix.
n/a ~~White-space or formatting changes outside the feature-related changed lines are in commits of their own.~~

codecov · 2021-06-16T16:46:23Z

Codecov Report

Merging #6183 (5f6b8a5) into develop (a9ad2b8) will decrease coverage by 0.50%.
The diff coverage is 43.47%.

❗ Current head 5f6b8a5 differs from pull request most recent head 72f9431. Consider uploading reports for the commit 72f9431 to get more accurate results

@@             Coverage Diff             @@
##           develop    #6183      +/-   ##
===========================================
- Coverage    75.39%   74.88%   -0.51%     
===========================================
  Files         1456     1454       -2     
  Lines       161080   160816     -264     
===========================================
- Hits        121440   120435    -1005     
- Misses       39640    40381     +741

Impacted Files	Coverage Δ
src/ansi-c/c_typecheck_base.h	`82.97% <0.00%> (-3.69%)`	⬇️
src/ansi-c/c_typecheck_expr.cpp	`60.13% <ø> (-0.43%)`	⬇️
src/ansi-c/scanner.l	`40.67% <0.00%> (-21.05%)`	⬇️
src/util/std_types.h	`94.44% <0.00%> (-0.67%)`	⬇️
src/ansi-c/c_typecheck_code.cpp	`66.02% <42.85%> (-11.89%)`	⬇️
src/ansi-c/parser.y	`56.50% <60.00%> (-22.59%)`	⬇️
src/goto-instrument/loop_utils.cpp	`94.64% <100.00%> (+0.09%)`	⬆️
src/ansi-c/gcc_types.cpp	`41.86% <0.00%> (-46.52%)`	⬇️
src/ansi-c/ansi_c_entry_point.cpp	`66.04% <0.00%> (-21.40%)`	⬇️
... and 107 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f9be0eb...72f9431. Read the comment docs.

martin-cs

Looks good apart from a few small things. Please can we have some tests and some documentation!

martin-cs · 2021-06-18T12:18:37Z

src/util/std_types.h

+/// The int type. 
+/// This is necessary for the type checking of loop variants, which should have
+/// some kind of integer type.
+class int_typet:public typet


See

cbmc/src/util/mathematical_types.h

Line 20 in 0ae6010

/// Unbounded, signed integers (mathematical integers, not bitvectors)

Thank you so much for your suggestion! Ideally, I want to use a "bounded" int type (as opposed to an unbounded int type) because every loop variant/ranking function must be bounded below. Although ID_signed_int is used in CBMC's code base, it doesn't seem to be a valid type of GOTO programs. That is, the function implicit_typecast_followed in

cbmc/src/ansi-c/c_typecast.cpp

Line 489 in 0ae6010

void c_typecastt::implicit_typecast_followed(

doesn't accept ID_signed_int as the target of type-casting. Please let me know if you have a suggestion!

In CPROVER, Integer generally means mathematically bounded integer and C int is a ID_signedbv or a ID_unsignedbv. The choice between the two has significant impact in terms of decidability and performance.

Do you need to fix the type of this annotation? I am /guessing/ what you are going to do is take the expression dec(x,y,z,...) and generate two assertions : dec(x,y,z,...) < dec(__CPROVER_old(x), __CPROVER_old(y), ... ) and 0 <= dec(x,y,z,...)? If so then you can do this with any ordered type that has a 0.

Thanks a lot for your swift response! As you remarked, one idea is to bound loop variants from below; i.e. 0 <= dec(x,y,z,...). Another idea is to use an integer type that is already bounded below (e.g. ID_signedbv and ID_unsignedbv). I was initially considering the first idea, and then switched to the second idea. However, maybe the first idea is better.

By the way, I finished implementing a prototype of loop variants. If I use the type integer_typet from

cbmc/src/util/mathematical_types.h

Line 20 in 0ae6010

/// Unbounded, signed integers (mathematical integers, not bitvectors)

, I can successfully run goto-cc toy.c -o toy.goto and goto-instrument --enforce-all-contracts toy.goto toy.2.goto. However, cbmc toy.2.goto fails - it returns an error message of UNIMPLEMENTED. Instead, if I use ID_signedbv, cbmc toy.2.goto successfully runs. So I probably should switch from integer_typet to a different integer type.

From a decision procedures point of view, bit-vectors (ID_signedbv, ID_unsignedbv, ID_bv) and mathematical integers (ID_integer) are very very different things. For a fixed bit-width, bit-vector decision problems are in NP. There are even some sub-theories that are in P. Mathematical integers are in undecidable in the general case and NP in the restricted linear case. The decision procedures you use for them are very different and converting between them in an expression is generally very expensive because it is hard for the decision procedures to exchange information in a useful way (because they work so very differently). The native back-end does not support more than trivial reasoning over mathematical integers.

My suggestion would be to avoid fixing which you are using unless you really have to. What you are doing should work just fine for both. If you just have it as an expression then the user can give a ranking function in ID_signedbv, ID_integer, ID_float, whatever. You only have to block the case of ID_real or add something like 2*(dec(x'',y'',z'') - dec(x',y',z')) < (dec(x',y',z') - dec(x,y,z)) where each ' denotes an previous iteration.

It is feasible to specify/construct ranking functions using bitvector types. 2LS does that, for instance. However, you need to be careful to use sufficiently wide types so as to avoid any overflows in the arithmetic occurring in the expressions involved.

Thanks for the pointers, Martin, Peter. For now we implemented implicit_typecast_arithmetic check on the ranking function. This check is quite liberal and avoids fixing the expected type for the ranking function.

In the next PR, when we instrument the code to generate the monotonic decrement assertions from ranking functions, we would carefully explore the various corner cases that you have pointed out.

martin-cs · 2021-06-18T12:20:49Z

src/ansi-c/scanner.l

@@ -1281,6 +1281,7 @@ __decltype          { if(PARSER.cpp98 &&
 {CPROVER_PREFIX}"finally"      { loc(); return TOK_CPROVER_FINALLY; }
 {CPROVER_PREFIX}"ID"           { loc(); return TOK_CPROVER_ID; }
 {CPROVER_PREFIX}"loop_invariant" { loc(); return TOK_CPROVER_LOOP_INVARIANT; }
+{CPROVER_PREFIX}"loop_variant" { loc(); return TOK_CPROVER_LOOP_VARIANT; }


Are you committed to the name? My concern is that if we have loop_invariant and loop_variant it is natural to think that they are duals / inverses which is not really the intuition we want. Might it be better to call them ranking functions or similar?

Thanks, this is a good point.
There are several options (in addition to loop_variant and ranking_function) here. For example, Dafny calls it decreases (and I think ACSL too). Coq calls it measures.
I personally think __CPROVER_decreases would be the clearest.

What do you think?

cc: @kroening @tautschnig

For the syntax of loop variants, I've swithced from __CPROVER_loop_variant(...) to __CPROVER_decreases(...). If we change our mind later, we can easily change the syntax.

SaswatPadhi

Some initial comments.

As Martin suggested, let's add some regression tests. I'll take a closer look again later.

src/goto-instrument/loop_utils.cpp

src/ansi-c/parser.y

src/ansi-c/c_typecheck_code.cpp

src/ansi-c/scanner.l

SaswatPadhi · 2021-06-22T01:00:03Z

src/util/std_types.h

+/// The int type. 
+/// This is necessary for the type checking of loop variants, which should have
+/// some kind of integer type.
+class int_typet:public typet


Thanks for the pointers, Martin, Peter. For now we implemented implicit_typecast_arithmetic check on the ranking function. This check is quite liberal and avoids fixing the expected type for the ranking function.

In the next PR, when we instrument the code to generate the monotonic decrement assertions from ranking functions, we would carefully explore the various corner cases that you have pointed out.

SaswatPadhi

Approving with a couple of nitpicks.

SaswatPadhi · 2021-06-22T18:32:37Z

Hi @martin-cs,

Could you please take another look at this PR?

martin-cs · 2021-06-23T10:50:22Z

regression/contracts/variant_parsing_statement_fail/main.c

+  int N = 10;
+  while (i != N)
+    __CPROVER_loop_invariant(0 <= i && i <= N)
+    __CPROVER_decreases(int x = 0)


Is it possible to have a __CPROVER_decreases without a loop invariant?

Yes, it is possible. The default loop invariant is true. The code for that is already in my local repository, and I will file a second PR soon.

martin-cs · 2021-06-23T10:50:52Z

regression/contracts/variant_parsing_tuple_fail/test.desc

+^main.c: In function 'main':$
+^main.c:7:1: error: syntax error before ','$
+^     __CPROVER_decreases\(N - i, 42\)$
+^PARSING ERROR$


Nice to see some negative tests!

feliperodri

LGTM, great work!

feliperodri · 2021-06-23T16:59:25Z

src/ansi-c/c_typecheck_code.cpp

  {
-    exprt &constraint = static_cast<exprt &>(code.add(spec));
+    exprt &invariant = static_cast<exprt &>(code.add(ID_C_spec_loop_invariant));


Is there anything else we could check at this point? For instance, the number of operands? @SaswatPadhi

// this comes with 1 operand, which is a Boolean statement if(code.operands().size()!=1) { error().source_location = code.source_location(); error() << "decreases expected to have 1 operand" << eom; throw 0; }

This is a good idea, but that's already taken care of by the parsing rule, so we don't need to add it to the type checker.

For instance, take a look at regression/contracts/variant_parsing_tuple_fail regression test added in this PR.

Also note that in future this restriction would be relaxed to allow multi-dimensional ranking functions. I'm not sure if we'll make __CPROVER_decreases variadic, or allow some other syntax for tuples.

SaswatPadhi

Please fix the test.desc files (which are failing on Windows)
Fix the formatting of main.c files.

regression/contracts/variant_parsing_statement_fail/test.desc

LongPham7 force-pushed the Long_internship branch from 487973b to 08060b4 Compare June 16, 2021 20:08

martin-cs requested changes Jun 18, 2021

View reviewed changes

SaswatPadhi reviewed Jun 18, 2021

View reviewed changes

src/goto-instrument/loop_utils.cpp Outdated Show resolved Hide resolved

src/ansi-c/parser.y Outdated Show resolved Hide resolved

src/ansi-c/parser.y Outdated Show resolved Hide resolved

LongPham7 force-pushed the Long_internship branch 2 times, most recently from 92402a9 to e788e47 Compare June 18, 2021 21:15

SaswatPadhi assigned LongPham7 Jun 18, 2021

SaswatPadhi added aws Bugs or features of importance to AWS CBMC users Code Contracts Function and loop contracts enhancement labels Jun 18, 2021

LongPham7 force-pushed the Long_internship branch 4 times, most recently from 8d06b14 to c4b2bbb Compare June 21, 2021 23:47

SaswatPadhi marked this pull request as ready for review June 22, 2021 00:45

SaswatPadhi requested review from chrisr-diffblue, kroening and tautschnig as code owners June 22, 2021 00:45

SaswatPadhi reviewed Jun 22, 2021

View reviewed changes

SaswatPadhi approved these changes Jun 22, 2021

View reviewed changes

SaswatPadhi requested a review from martin-cs June 22, 2021 01:00

LongPham7 force-pushed the Long_internship branch from c4b2bbb to cd5decf Compare June 22, 2021 14:01

martin-cs approved these changes Jun 23, 2021

View reviewed changes

feliperodri approved these changes Jun 23, 2021

View reviewed changes

kroening approved these changes Jun 23, 2021

View reviewed changes

SaswatPadhi suggested changes Jun 23, 2021

View reviewed changes

LongPham7 force-pushed the Long_internship branch 3 times, most recently from a0c108c to d933897 Compare June 23, 2021 20:40

LongPham7 force-pushed the Long_internship branch from d933897 to 521bfa5 Compare June 23, 2021 20:44

Implemented parsing and type checking for loop variants

72f9431

LongPham7 force-pushed the Long_internship branch from 521bfa5 to 72f9431 Compare June 23, 2021 21:44

SaswatPadhi approved these changes Jun 23, 2021

View reviewed changes

SaswatPadhi merged commit 99ccbcd into diffblue:develop Jun 23, 2021

Parsing and type-checking loop variants (aka ranking functions) #6183

Parsing and type-checking loop variants (aka ranking functions) #6183

Uh oh!

Conversation

LongPham7 commented Jun 16, 2021 • edited by SaswatPadhi Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

codecov bot commented Jun 16, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

martin-cs left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

LongPham7 Jun 18, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

SaswatPadhi Jun 18, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

SaswatPadhi left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

SaswatPadhi left a comment

Choose a reason for hiding this comment

Uh oh!

SaswatPadhi commented Jun 22, 2021

Uh oh!

Choose a reason for hiding this comment

Uh oh!

LongPham7 Jun 23, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

feliperodri left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

SaswatPadhi Jun 23, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

SaswatPadhi left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

LongPham7 commented Jun 16, 2021 •

edited by SaswatPadhi

Loading

codecov bot commented Jun 16, 2021 •

edited

Loading

LongPham7 Jun 18, 2021 •

edited

Loading

SaswatPadhi Jun 18, 2021 •

edited

Loading

LongPham7 Jun 23, 2021 •

edited

Loading

SaswatPadhi Jun 23, 2021 •

edited

Loading