Skip to content

fix: Handle double quotes inside quoted identifiers correctly #411

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Feb 7, 2022

Conversation

Marwes
Copy link
Contributor

@Marwes Marwes commented Feb 4, 2022

This fixes #410 for standard SQL, however I don't know enough about other dialects to know if they
handle this differently. May need more extensive testing as well.

This fixes apache#410 for standard SQL, however I don't know enough about other dialects to know if they
handle this differently. May need more extensive testing as well.
@Marwes Marwes force-pushed the quoted_identifiers branch from 4ca1db4 to 113722e Compare February 4, 2022 16:57
@coveralls
Copy link

coveralls commented Feb 5, 2022

Pull Request Test Coverage Report for Build 1807104062

  • 60 of 60 (100.0%) changed or added relevant lines in 4 files are covered.
  • 436 unchanged lines in 3 files lost coverage.
  • Overall coverage increased (+0.3%) to 90.484%

Files with Coverage Reduction New Missed Lines %
tests/sqlparser_common.rs 17 98.97%
src/ast/mod.rs 140 77.18%
src/parser.rs 279 84.32%
Totals Coverage Status
Change from base Build 1615803810: 0.3%
Covered Lines: 6865
Relevant Lines: 7587

💛 - Coveralls

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Marwes -- this is looking good. I think with a few more tests this would be good to go

#[test]
fn parse_quoted_identifier() {
pg_and_generic().verified_stmt(r#"SELECT "quoted "" ident""#);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is definitely how postgres handles things

alamb=# SELECT "quoted "" ident" from t1;
ERROR:  column "quoted " ident" does not exist
LINE 1: SELECT "quoted "" ident" from t1;
             

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, msql treats this differently

Database changed
mysql> SELECT "quoted "" ident" from t1;
+----------------+
| quoted " ident |
+----------------+
| quoted " ident |
| quoted " ident |
| quoted " ident |
| quoted " ident |
| quoted " ident |
+----------------+
5 rows in set (0.00 sec)

mysql> SELECT "quoted  ident" from t1;
+---------------+
| quoted  ident |
+---------------+
| quoted  ident |
| quoted  ident |
| quoted  ident |
| quoted  ident |
| quoted  ident |
+---------------+
5 rows in set (0.00 sec)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

" is used for strings in mysql unless ANSI_QUOTES is enabled https://dev.mysql.com/doc/refman/8.0/en/identifiers.html . Neither double quoted strings or ANSI_QUOTES seems implemented in this crate and I don't think it is up to this PR to fix that.

@@ -834,6 +834,11 @@ fn parse_comments() {
}
}

#[test]
fn parse_quoted_identifier() {
pg_and_generic().verified_stmt(r#"SELECT "quoted "" ident""#);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps we can add a test for dialects (e.g. mysql) that this isn't parsed as an identifier?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a test for ` quoted strings in mysql

src/tokenizer.rs Outdated
@@ -418,8 +418,26 @@ impl<'a> Tokenizer<'a> {
quote_start if self.dialect.is_delimited_identifier_start(quote_start) => {
chars.next(); // consume the opening quote
let quote_end = Word::matching_end_quote(quote_start);
let s = peeking_take_while(chars, |ch| ch != quote_end);
if chars.next() == Some(quote_end) {
let mut last_char = None;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this code would be easier to understand if it were pulled out into its own function (peeking_take_while_ident?) which could be unit tested, to ensure it handles cases such as

foo "" bar "" 
foo bar""

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @Marwes

@alamb alamb merged commit 34fedf3 into apache:main Feb 7, 2022
@alamb
Copy link
Contributor

alamb commented Feb 7, 2022

I hope to work through some various other PRs that are open this week and make a release later in the week

@coveralls
Copy link

coveralls commented Mar 14, 2025

Pull Request Test Coverage Report for Build 1806179475

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 60 of 60 (100.0%) changed or added relevant lines in 4 files are covered.
  • 436 unchanged lines in 3 files lost coverage.
  • Overall coverage increased (+0.3%) to 90.484%

Files with Coverage Reduction New Missed Lines %
tests/sqlparser_common.rs 17 98.97%
src/ast/mod.rs 140 77.18%
src/parser.rs 279 84.32%
Totals Coverage Status
Change from base Build 1615803810: 0.3%
Covered Lines: 6865
Relevant Lines: 7587

💛 - Coveralls

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Quoted identifiers do not handle escaped quotes
3 participants