Skip to content

Set max-nondet-string-length to 10000 by default #4643

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

majakusber
Copy link

@majakusber majakusber commented May 10, 2019

Unless no-refine-strings is defined. If no max-nondet-string-length is set and a long string is generated, it gets discarded in the trace which causes problems in test-gen trace interpreter.

  • Each commit message has a non-empty body, explaining why the change was made.
  • Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • White-space or formatting changes outside the feature-related changed lines are in commits of their own.

@majakusber majakusber self-assigned this May 10, 2019
@majakusber majakusber force-pushed the max-nondet-string-length-default branch 2 times, most recently from 9b2b116 to e63b6ed Compare May 10, 2019 10:36
@@ -1,6 +1,6 @@
CORE
Test.class
--function Test.checkMinLength --string-non-empty
--function Test.checkMinLength --string-non-empty --max-nondet-string-length 1000000000
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not big enough to cause an overflow (which is what one of the assertion is checking here), the maximum java integer is 2147483647 so you should use at least half that, 2000000000 should work

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @romainbrenguier about the overflow. However, if this is about finding overflows / checking if they occur, please could it say that somewhere in the test or the commit message.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, I adjusted the value

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The java file already say:

 // For this assertion to be valid, also need to limit the length of
 // strings because overflows could cause the sum to be negative.
 assert arg1.length() + arg2.length() > 1;

maybe the desc file should mention that we explicitly set the length limit high to be able to detect overflow behaviors.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added explanation to the .desc files about the overflow

@majakusber majakusber force-pushed the max-nondet-string-length-default branch from e63b6ed to dd973d1 Compare May 10, 2019 13:18
Copy link
Collaborator

@martin-cs martin-cs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable in principle but if there are overflow issues, I would rather they were actually documented / discussed.

@@ -1,6 +1,6 @@
CORE
Test.class
--function Test.checkMinLength --string-non-empty
--function Test.checkMinLength --string-non-empty --max-nondet-string-length 1000000000
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @romainbrenguier about the overflow. However, if this is about finding overflows / checking if they occur, please could it say that somewhere in the test or the commit message.

@@ -74,6 +74,10 @@ void parse_object_factory_options(const cmdlinet &cmdline, optionst &options)
"max-nondet-string-length",
cmdline.get_value("max-nondet-string-length"));
}
else if(!cmdline.isset("no-refine-strings"))
{
options.set_option("max-nondet-string-length", 10000);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why 10,000? Does this affect performance? If so is this a good choice? Creating an array of 10,000 16-bit characters is a non-trivial amount for the SAT solver to handle.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As documented here: https://github.com/diffblue/test-gen/blob/develop/doc/cl-options.md
The performance should not be affected in any significant way. The value is anyway recommended to be set, so setting the default value is only to make sure that we do not trip over it if we forget.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has the option any effect if no-refine-strings is specified?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the two options cannot be specified together:

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@svorenova : I don't have access to that documentation.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@martin-cs Here is screenshot:
image

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds like an argument for it to be 100 rather than 10000.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's because the doc is more concerned with test-generation than verification. If you are interested in the trace generated by jbmc it's better to use a reasonable value like 100, otherwise you can set it as high as you want.

^EXIT=10$
^SIGNAL=0$
^VERIFICATION FAILED$
file Test.java line 20 .*: FAILURE$
--
--
--max-nondet-string-length is not used on purpose, because this tests the behaviour
of string refinement on very large strings.
>his tests the behaviour of string refinement on very large strings.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: This

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Corrected

@@ -74,6 +74,10 @@ void parse_object_factory_options(const cmdlinet &cmdline, optionst &options)
"max-nondet-string-length",
cmdline.get_value("max-nondet-string-length"));
}
else if(!cmdline.isset("no-refine-strings"))
{
options.set_option("max-nondet-string-length", 10000);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has the option any effect if no-refine-strings is specified?

@majakusber majakusber force-pushed the max-nondet-string-length-default branch from dd973d1 to 364af8d Compare May 10, 2019 14:18
@majakusber
Copy link
Author

majakusber commented May 10, 2019

@peterschrammel @martin-cs @romainbrenguier Ready for re-review. Please note that this is high priority as it's needed for a long-overdue test-gen bump.

@majakusber majakusber force-pushed the max-nondet-string-length-default branch from 364af8d to 2296946 Compare May 10, 2019 14:40
@@ -8,6 +8,7 @@ Author: Diffblue Ltd

#include "object_factory_parameters.h"

#include <solvers/strings/max_nondet_string_length_default.h>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

solvers is not the right module for this file (util shouldn't depend on solvers).
Consider moving the definition to util/magic.h

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, moved

@majakusber majakusber force-pushed the max-nondet-string-length-default branch from 2296946 to 72be726 Compare May 10, 2019 14:57
Copy link
Contributor

@smowton smowton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, one formatting suggestion

@@ -74,6 +74,11 @@ void parse_object_factory_options(const cmdlinet &cmdline, optionst &options)
"max-nondet-string-length",
cmdline.get_value("max-nondet-string-length"));
}
else if(!cmdline.isset("no-refine-strings"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest add blank lines to make clear we've hidden an if/else amongst this sequence of ifs

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get your point, although I would really like to merge this PR and the test-gen bump today (to avoid even more accumulation for the bump over the weekend) so I will not update this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
else if(!cmdline.isset("no-refine-strings"))
else

I don't think the if part is needed.

Copy link
Author

@majakusber majakusber May 10, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would cause the max length to be bounded if --no-refine-strings is given which I think is not desired. And it would be bounded by a reasonably small value. If anyone would like to increase it, while keeping --no-refine-strings, they would get an error as one cannot define both options at the same time.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smowton I had to restart the CI anyway so I applied your suggestion

@@ -74,6 +74,11 @@ void parse_object_factory_options(const cmdlinet &cmdline, optionst &options)
"max-nondet-string-length",
cmdline.get_value("max-nondet-string-length"));
}
else if(!cmdline.isset("no-refine-strings"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
else if(!cmdline.isset("no-refine-strings"))
else

I don't think the if part is needed.

@majakusber majakusber force-pushed the max-nondet-string-length-default branch from 72be726 to 15e5493 Compare May 10, 2019 16:02
Copy link
Collaborator

@martin-cs martin-cs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docs seem to indicate that a lower value would be better but ... that's between you and the major users of this, which is not me so approve.

@majakusber majakusber merged commit 9fd7dd4 into diffblue:develop May 10, 2019
@majakusber majakusber deleted the max-nondet-string-length-default branch May 10, 2019 17:52
@tautschnig
Copy link
Collaborator

I'm obviously late to the party, but I'm not particularly pleased with this fixed bound. My concern, in line with what @martin-cs and @romainbrenguier said is that there does not seem to be any rationale provided as to why it is the most appropriate number. Now we seem to be in a situation where the string length cannot be set to be unbounded anymore?

@majakusber
Copy link
Author

@tautschnig To use unbounded length, you could use either --no-refine-strings (which however means that no other string refinement options can be used either, namely --string-printable and --string-input-value) or you can set the bound high enough to allow overflow, see @romainbrenguier's comment here: #4643 (comment)

In any case, I'm happy to change the value to what makes the most sense to both CBMC and test-gen users. I do like the idea of having a default value though - I have wasted quite some time trying to debug a failing test before realizing (with the help of Romain) that the problem was the overflow.

The reason why I chose the bound 10000 rather than 100 is that I wanted to restrict the length as little as possible comparing to the unbound length (previous default). And according to the doc, this is the largest number that should still have little effect on performance.

@romainbrenguier
Copy link
Contributor

In any case, I'm happy to change the value to what makes the most sense to both CBMC and test-gen users.

You could have different ones for test-gen and CBMC I guess, in particular not having one for CBMC is maybe better.

@majakusber
Copy link
Author

@romainbrenguier Good idea, I will shortly open a PR for revert and a test-gen PR introducing the bound.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants