-
Notifications
You must be signed in to change notification settings - Fork 274
Add CSmith GitHub action #5700
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add CSmith GitHub action #5700
Conversation
Codecov Report
@@ Coverage Diff @@
## develop #5700 +/- ##
========================================
Coverage 69.54% 69.54%
========================================
Files 1243 1243
Lines 100700 100700
========================================
Hits 70036 70036
Misses 30664 30664
Flags with carried forward coverage won't be shown. Click here to find out more. Continue to review full report at Codecov.
|
This seems like a good idea but the script could use some more explanation for what is going on there. |
Indeed. I've added comments! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I /really/ like the concept but ... non-deterministic things make me nervous. I would be happier if the scripts/csmith.sh
took a random seed and csmith.yml
set it to the hash of the diff or something similar that could be locally reproduced.
I'm also a little unsure about what you feel should happen if you find issues of this form. Is it the job of the person who submitted the PR to fix them? Raise issues?
@martin-cs I'd say failure for these is not PR raiser responsibility, but if something does come up we should definitely take note. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO "cbmc matches csmith test behaviour" and "goto-cc --> dump-c should yield program with same behaviour" are two different tests and probably shouldn't be in the same script, but otherwise I think this is a good addition for finding potential problems hopefully before users run into them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great!
Run 10 randomly generated CSmith tests as a GitHub action to perform an integration test involving CBMC and goto-instrument --dump-c. Each test is first compiled and run using GCC to obtain a checksum, which is then included as an assertion for CBMC to verify.
Very good point - I've now modified the script to:
I've done multiple thousands of runs of CSmith in recent weeks, and am thus fairly confident that the likelihood of a PR tripping this up by a bug outside the changes introduced in said PR are fairly low. Thus I'd argue it is the job of the person who submitted the PR to get this back in working order. |
Yes, having two separate jobs might be nicer, but I'm a bit hesitant to have almost-the-same-script twice in the repository. So I'd for now stick with the current solution. If we find that we regularly only run into bugs in one or the other part then that'll provide an incentive to improve this. |
@tautschnig : sounds great; let's do it! |
Run 10 randomly generated CSmith tests as a GitHub action to perform an
integration test involving CBMC and goto-instrument --dump-c. Each test
is first compiled and run using GCC to obtain a checksum, which is then
included as an assertion for CBMC to verify.