Skip to content

Require space after -- to start single line comment in MySQL #1705

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Feb 5, 2025

Conversation

hansott
Copy link
Contributor

@hansott hansott commented Feb 4, 2025

static constexpr uint8_t MY_CHAR_CTR =  040; /* Control character */

inline bool my_iscntrl(const CHARSET_INFO *cs, char ch) {
  return ((cs->ctype + 1)[static_cast<uint8_t>(ch)] & MY_CHAR_CTR) != 0;
}

static constexpr uint8_t MY_CHAR_SPC =  010; /* Spacing character */

inline bool my_isspace(const CHARSET_INFO *cs, char ch) {
  return ((cs->ctype + 1)[static_cast<uint8_t>(ch)] & MY_CHAR_SPC) != 0;
}

It's not super clear which characters belong to my_iscntrl and my_isspace but we cannot use char.is_whitespace() as it includes '\n'?

/// Otherwise it's interpreted as a double minus operator
///
/// MySQL: <https://dev.mysql.com/doc/refman/8.4/en/ansi-diff-comments.html>
fn requires_whitespace_to_start_comment(&self) -> bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fn requires_whitespace_to_start_comment(&self) -> bool {
fn requires_single_line_comment_whitespace(&self) -> bool {

thinking something like this to flag that this is only for the -- comment style

@@ -880,6 +880,15 @@ pub trait Dialect: Debug + Any {
fn supports_table_hints(&self) -> bool {
false
}

/// Returns whether it's the start of a single line comment
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Returns whether it's the start of a single line comment
/// Returns true if this dialect requires a whitespace after `--` for single line comments.

Comment on lines 129 to 132
/// Returns whether it's the start of a single line comment
/// e.g. MySQL requires a space after `--` to be a single line comment
/// Otherwise it's interpreted as a double minus operator
///
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is already duplicated at the definition of the dialect method, we could consider not including it (The link to the doc on the other hand is useful in any case)

src/tokenizer.rs Outdated
Comment on lines 1236 to 1239
// MySQL requires a space after the -- for a single-line comment
// Otherwise it's interpreted as two minus signs
// e.g. UPDATE account SET balance=balance--1
// WHERE account_id=5752;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move the example to the doc string in the dialect method? That would help clarify the behavior to folks without having to dig deeper in the docs or the parser code.
Then for the rest of the comment here we can probably skip since we've already documented as much in the dialect method

src/tokenizer.rs Outdated
Comment on lines 1240 to 1243
match chars.peekable.clone().nth(1) {
Some(' ') => (),
_ => is_comment = false,
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
match chars.peekable.clone().nth(1) {
Some(' ') => (),
_ => is_comment = false,
}
is_comment = Some(' ') == chars.peekable.clone().nth(1);

looks like this can be simplified?

src/tokenizer.rs Outdated
Some('-') => {
chars.next(); // consume the second '-', starting a single-line comment

if let Some('-') = chars.peek() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it be possible to have this remain as a match statement, calling chars.peek() only once? i.e.

match chars.peek() {
    Some('-') => { ... }
    Some('>') => { ... }
 }

Comment on lines 3251 to 3255
match mysql().parse_sql_statements(
r#"
UPDATE account SET balance=balance--1
WHERE account_id=5752
"#,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we use mysql().verified_stmt() for the tests?

@hansott
Copy link
Contributor Author

hansott commented Feb 5, 2025

@iffyio Addressed all your comments, great feedback.

@hansott
Copy link
Contributor Author

hansott commented Feb 5, 2025

@iffyio doing a check for which characters are allowed to start comment (using a script against mysql)

Copy link
Contributor

@iffyio iffyio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks @hansott!
cc @alamb

@iffyio iffyio merged commit 443f492 into apache:main Feb 5, 2025
9 checks passed
ayman-sigma pushed a commit to sigmacomputing/sqlparser-rs that referenced this pull request Apr 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants