Skip to content

Merge in feature branch that adds gnat2goto compatibility #3112

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

NlightNFotis
Copy link
Contributor

@NlightNFotis NlightNFotis commented Oct 7, 2018

@tautschnig Hi, this is a big PR, effectively a merge of a feature branch we had that adds support to CBMC for it to work with another tool we have, gnat2goto (effectively an ADA to GOTO compiler).

I'm not entirely sure how to handle this PR. It's huge and an pain in the rear as far as I can understand to review, but on the other hand I have already squashed commits as much as I can (well, I can still do more, but it depends on what point you stop and say, okay, this commit changes code that has nothing to do with this other commit, or has a commit message that shouldn't be lost, etc).

What can I do to make this easier to review, push it through? There's some technical debt that I have identified in some places, where because the code is old it's doing things like throwing a string directly.

There's already some work I have done towards that direction, adding some tests and fixing some of the technical debt introduced in this PR, but the test suite I feel can take some more work to be done, and I'd rather it went on a separate PR, so that this can go in, adding the preliminary support we need that's blocking some other stuff. But I don't know how you might want it to be handled, or if you want it all together at once (I can do that, but the tests are not there yet).

@tautschnig
Copy link
Collaborator

A couple of general comments:

  1. I think all the "Document ..." commits should be squashed with their corresponding commits that introduce the functions in the first place.
  2. It would be really helpful if each and every commit could have a commit message with a non-empty body. Coming up with this text will likely make it very obvious where further squashing should occur. The reason I am emphasising this is, because this is code that is being added to prepare for work that sits outside the tree. Hence the rationale for the change may often be basically impossible to infer from the code itself, but is really tied to what's happening externally.

Once the above are doing it might even become fairly straightforward to review if done commit-by-commit.

@kroening
Copy link
Member

kroening commented Oct 7, 2018

Please do not reintroduce ID_symbol for symbol types!

static bool range_check_flag_set(const exprt &e)
{
return e.get_bool("range_check");
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would introduce a separate expression for that, as opposed to attaching a non-comment attribute to arbitrary expressions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure I understand how to do it. There's a variety of different expressions that can use a range check, and to create a new expression for them would mean that we would need to subclass them right? The problem is that these expressions are different classes, with sometimes different parents, which would mean that an inheritance based solution would be messy.

Would it be better if we just made both the range_check and the overflow_check comments for now, so that they don't affect structural equality checks on the ireps at least? What do you think @kroening ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@martin-cs - Do you also have any thoughts on the best approach to take here?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I see it there are two issues here:

A. What to do with run-time checks that are inserted by the Ada front-end?
B. If and how we transfer information about ranged types from the Ada front-end to CPROVER.

For A. I think the answer is relatively clear:
A.1. gnat2goto should produce calls to internal functions (__CPROVER_Ada_range_check or something ) for all of the Ada run-time checks.
A.2. These should pass through the JSON Symtab code and wind up in the goto-program.
A.3. We should provide a "default" implementation of these so that things just work in the expected way.
A.4. We should also provide an instrumentation pass that recognises these specific functions calls and converts them into assert, assert then assume, assume, assert then throw if false or just throw if false.

Why : the actual semantics are throw if false but depending on what you are trying to do (find runtime errors, prove things about the "correct" behaviour of the code, etc.) and what run-time you are using (is there anything other than the default catch-and-terminate?) the other behaviours are more desireable.

I'm a bit more undecided on B. In some ways it would be nice to have the Ada typing carried over into CPROVER. However I'm not sure if the benefits are worth the hassle of doing it just yet. To me the key question is what we would just that information for. GNAT should have already inserted the right checks to capture overflow and underflow. So what would we use it for? Better nondets? That said, just because we can't come up with anything at this stage doesn't mean we should write this off for ever and we may have to come back and revisit this.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@martin-cs I'd personally prefer just generating the right thing from gnat2goto controllable via a switch rather than producing a bunch of ada specific intrinsics that need to be supported in cbmc.

As to B: I'd also prefer Ada typing to stay within Ada-land for the most part, especially as some Ada types are dynamic (which'd be a bit weird from the cbmc perspective and needs to be implemented on the gnat2goto side anyway). I'd suggest we generate globals that have the right range values and things like that for static types, and just do what the ada runtime would have to do to support dynamic types, i.e. pass them on as additional parameters.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hannes-steffenhagen-diffblue
A. There isn't a one right thing to do; it depends on the analysis you are doing and thus should not be part of the front-end IMHO. I would prefer to keep the compile process as simple and as easily integrated with various build systems as possible; additional flags will not help that.

B. CPROVER already supports RTTI for C++ (and I suspect Java) so tagged types should be handled like that. I guess my main concern is with ranged types and Ada 2012 type predicates. The current "base type + range check" solution is probably fine for now but it's something we may need to keep in mind (and one of the reasons I want the range checks clearly marked so that the abstract interpreter can pick up information from them if needs be).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a private discussion with @martin-cs, we agreed that it's better for this PR to go in as it is right now, and for the ada intrinsics to be added to gnat2goto at a later PR. Quoting the relevant parts of the discussion:

Me:

In your latest comment there (#3112 (comment)) do you mean that it's okay for this PR to go in with the range and overflow checks as they are right now, and for us to make note of adding intrinsics to GNAT2GOTO like __CPROVER_Ada_range_check at a later stage?

Martin:

Hmmm... I'd rather thought of it as a separate PR
I won't be offended either way around.

So now this is only blocked on @kroening

const typet &real_type = ns.follow(expr.type());
if(real_type.id() != ID_signedbv)
return;
const exprt &lower_bound =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We actually have a ranged type! Look up range_typet.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The range_typet is not appropriate to use at this point in time, because the frontend gnat2goto is producing signed_bv values that also have a bound value bolted on (see https://github.com/diffblue/gnat2goto/blob/master/gnat2goto/driver/tree_walk.adb#L2804). This will mean that for this to change, we have to also make a significant change to gnat2goto. As this work is for getting preliminary support for gnat2goto in cbmc I suggest this stays as it is and gets updated later. What do you think about this @kroening ?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes more sense to keep these as signedbv, as in Ada intermediary expressions are allowed to escape range boundaries as long as they stay within the boundaries of the underlying implementation (i.e. no under/overflow) and end up within the right range during the final assignment.

dest+=")";
return dest;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why put into the C language converter?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am assuming because up until this point we didn't have a dedicated ada mode, or ada subfolder. Though @smowton might be more appropriate to answer this question.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I second the wtf

@@ -1038,9 +1038,11 @@ void goto_convertt::convert_while(
z->make_skip();
z->source_location=source_location;

exprt followed_cond = cond;
followed_cond.type() = ns.follow(followed_cond.type());
goto_programt tmp_branch;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are really trying to get rid of these 'follows'. May I suggest you expand while reading in.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd agree with @kroening - where possible we should expand when reading in the json_symtab, or maybe have a cleanup pass immediately after reading in.

@@ -39,6 +39,9 @@ SRC = arith_tools.cpp \
json_expr.cpp \
json_irep.cpp \
json_stream.cpp \
json_symbol.cpp \
json_symbol_table.cpp \
json_symbol.cpp \
lispexpr.cpp \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These three files should be in src/json-symtab-language.

@NlightNFotis NlightNFotis force-pushed the external_parser_support_rebased branch from 146a948 to b776e3f Compare October 11, 2018 09:16
#ifndef CPROVER_UTIL_JSON_SYMBOL_H
#define CPROVER_UTIL_JSON_SYMBOL_H

#include "json.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@NlightNFotis Shouldn't it be #include <util/json.h>? At least I can't compile it locally like this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, that's fixed already, sorry I hadn't pushed.

@@ -111,6 +108,7 @@ CPROVER_LIBS =../src/java_bytecode/java_bytecode$(LIBEXT) \
$(CPROVER_DIR)/src/ansi-c/ansi-c$(LIBEXT) \
$(CPROVER_DIR)/src/cpp/cpp$(LIBEXT) \
$(CPROVER_DIR)/src/json/json$(LIBEXT) \
$(CPROVER_DIR)/src/json_symtab_language/json_symtab_language$(LIBEXT) \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs '-' rather than '_' in the directory name :-) but should be working once that's changed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, my bad.

include ../config.inc
include ../common

CLEANFILES = json_symtab_language$(LIBEXT)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency with other directories/libs and the directory name, this should probably be json-symlab-language$(LIBEXT) - but note you'll need to change that in all the makefiles/targets where it's used, of course.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And note that it's just the library name that needs to change, not any source file names.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right. This bit me in more than a couple of occasions, because some Makefiles refer to it as json_symtab_language and some others as json-symtab-language. I will try to change it to the second in all of them.

@NlightNFotis NlightNFotis force-pushed the external_parser_support_rebased branch 7 times, most recently from 6041d4c to 141fb31 Compare October 29, 2018 12:16
Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: a7d6b3a).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/91914658

@NlightNFotis NlightNFotis force-pushed the external_parser_support_rebased branch from a7d6b3a to 3df89c8 Compare November 20, 2018 11:16
Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: 3df89c8).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/91999047

@NlightNFotis NlightNFotis force-pushed the external_parser_support_rebased branch from 3df89c8 to 891c71c Compare November 20, 2018 14:11
@NlightNFotis
Copy link
Contributor Author

@kroening All of the comments outstanding have now been addressed. Please re-review at your earliest convenience.

@NlightNFotis NlightNFotis assigned kroening and unassigned kroening Nov 20, 2018
Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: 891c71c).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/92027581

@@ -1291,6 +1292,8 @@ void goto_checkt::add_guarded_claim(

void goto_checkt::check_rec(const exprt &expr, guardt &guard, bool address)
{
const auto &type = expr.type();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes to goto_checkt::check_rec are unrelated to the commit they are in.

throw "`"+expr.id_string()+"' takes Boolean operands only, but got "+
op.pretty();
if(op.type().id() != ID_bool)
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change seems not particularly useful; if you want to really have it, then please put into separate PR.

dest += convert_with_precedence(let_expr.where(), precedence);
dest += ")";
return dest;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As already pointed out by Michael, I don't think that the pretty printer for LET expressions belongs into the C front-end. Note that format_expr can already pretty print LET expressions.

exprt &expr,
goto_programt &dest,
const irep_idt &mode)
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that needed at all? We do support let expressions.

class symbol_tablet;

void symbol_table_from_json(const jsont &, symbol_tablet &);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We now have rvalues! Return that symbol_table!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you suggest to manually construct and populate the symbol table from within the function? That does make sense, but it's not very convenient, because the symbol table is constructed much earlier in the call stack, and it's in the interface of languaget::typecheck, and passed around to end up in this function that populates it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the symbol table that gets passed to json_symtab_languaget::typecheck is expected to be empty; so you can simply overwrite.

catch(const std::string &str)
{
error() << "json_symtab_languaget::typecheck: " << str << eom;
return true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't announce the class where errors happen elsewhere, so I'd remove here for consistency.

@kroening
Copy link
Member

This is getting there, but a few of Michael's comments got overlooked.

Implement JSON symtab language, a new, more generic
language frontend, able to read GOTO programs in JSON
form, from any frontend that generates them. Currently
this stands as the support frontend for our ADA
conversion tool, GNAT2GOTO.
@NlightNFotis
Copy link
Contributor Author

@kroening Hi Daniel, we decided to drop a lot of the bolted on stuff and re-implement them in gnat2goto, where we believe they are more appropriately located, so this PR is now just the json-symtab-language frontend support. Can you re-review it at your earliest convenience please?

@NlightNFotis NlightNFotis assigned kroening and unassigned kroening Nov 21, 2018
Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: 86e8356).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/92194795

@chrisr-diffblue
Copy link
Contributor

As this has now been approved by @kroening I'm going to merge this while CI + JoelBot are both green. Any remaining minor cleanup can be implemented as new, smaller, less noisy to review PRs. Thanks everyone for the really constructive review comments!

@chrisr-diffblue chrisr-diffblue merged commit 58b9d93 into diffblue:develop Nov 21, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants