-
Notifications
You must be signed in to change notification settings - Fork 273
Do not escape ' in java strings #5038
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Do not escape ' in java strings #5038
Conversation
d72ad34
to
9196782
Compare
Although it saddens me greatly that you've added this |
src/util/unicode.cpp
Outdated
@@ -257,11 +257,13 @@ std::wstring utf8_to_utf16_native_endian(const std::string &in) | |||
return result; | |||
} | |||
|
|||
/// Does not escape the '\'' character, as this is not required to be escaped in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Say what it does do instead. Probably something like "escapes non-printable characters, whitespace except for spaces, double- and single-quotes and backslashes. This should yield a valid Java string literal"
src/util/unicode.cpp
Outdated
std::ostringstream &result, | ||
const std::locale &loc) | ||
{ | ||
if(ch <= 255 && isprint(ch, loc) && static_cast<unsigned char>(ch) == '\'') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace this whole line with if(ch == (wchar_t)'\'')
src/util/unicode.cpp
Outdated
@@ -298,6 +301,28 @@ static void utf16_native_endian_to_java( | |||
} | |||
} | |||
|
|||
/// Additionally escapes the '\'' character for java characters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Say what it does / what it's for, probably something like "escapes non-printable characters, whitespace except for spaces, double- and single-quotes and backslashes. This should yield a valid Java identifier"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR failed Diffblue compatibility checks (cbmc commit: 9196782).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/123868043
Status will be re-evaluated on next push.
Common spurious failures include: the cbmc commit has disappeared in the mean time (e.g. in a force-push); the author is not in the list of contributors (e.g. first-time contributors); compatibility was already broken by an earlier merge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, apart from Chris' comments
683cc08
to
3af91bc
Compare
src/util/unicode.cpp
Outdated
@@ -310,12 +338,13 @@ std::string utf16_native_endian_to_java(const char16_t ch) | |||
|
|||
/// \param in: String in UTF-16 (native endianness) format | |||
/// \return String in US-ASCII format, with \\uxxxx escapes for other characters | |||
std::string utf16_native_endian_to_java(const std::wstring &in) | |||
/// except the ' character which does not need to be escaped for strings |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, document what it does
When printing a character, it is required to escape the ' character, but when printing it as part of a string, it is not. Therefore, this has been separated into two separate functions.
This has been replaced by utf16_native_endian_to_java_string(std::wstring)
utf16_native_endian_to_java_string should not escape the ' character.
3af91bc
to
93004ed
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR failed Diffblue compatibility checks (cbmc commit: 683cc08).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/123924010
Status will be re-evaluated on next push.
Common spurious failures include: the cbmc commit has disappeared in the mean time (e.g. in a force-push); the author is not in the list of contributors (e.g. first-time contributors); compatibility was already broken by an earlier merge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR failed Diffblue compatibility checks (cbmc commit: 3af91bc).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/123926174
Status will be re-evaluated on next push.
Common spurious failures include: the cbmc commit has disappeared in the mean time (e.g. in a force-push); the author is not in the list of contributors (e.g. first-time contributors); compatibility was already broken by an earlier merge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR failed Diffblue compatibility checks (cbmc commit: 93004ed).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/123927073
Status will be re-evaluated on next push.
Common spurious failures include: the cbmc commit has disappeared in the mean time (e.g. in a force-push); the author is not in the list of contributors (e.g. first-time contributors); compatibility was already broken by an earlier merge.
Java strings do not need to escape the ' character (e.g.
"That's cool"
is fine), but java characters do (e.g.'''
is not fine, it must be'\''
). This work creates a new conversion function from utf16 to java for characters contained in java strings that does not escape'
, and then uses that to build a java string from the characters.