Skip to content

Enhancing Trailing Comma Option #1212

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Jun 7, 2024
4 changes: 4 additions & 0 deletions src/dialect/bigquery.rs
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,10 @@ impl Dialect for BigQueryDialect {
ch == '`'
}

fn supports_projection_trailing_commas(&self) -> bool {
true
}

fn is_identifier_start(&self, ch: char) -> bool {
ch.is_ascii_lowercase() || ch.is_ascii_uppercase() || ch == '_'
}
Expand Down
4 changes: 4 additions & 0 deletions src/dialect/duckdb.rs
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,10 @@ pub struct DuckDbDialect;

// In most cases the redshift dialect is identical to [`PostgresSqlDialect`].
impl Dialect for DuckDbDialect {
fn supports_trailing_commas(&self) -> bool {
true
}

fn is_identifier_start(&self, ch: char) -> bool {
ch.is_alphabetic() || ch == '_'
}
Expand Down
8 changes: 8 additions & 0 deletions src/dialect/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -157,6 +157,14 @@ pub trait Dialect: Debug + Any {
// return None to fall back to the default behavior
None
}
/// Does the dialect support trailing commas around the query?
fn supports_trailing_commas(&self) -> bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

supports_trailing_commas suggests to me that it is supporting trailing commas everywhere, and supports_projection_trailing_commas suggests just for projection

If supports_trailing_commas means just for queries, can you please name it to something more specific?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This option should try to allow trailing commas everywhere. I'm not really following what you mean by "just for queries"

Copy link
Contributor Author

@MohamedAbdeen21 MohamedAbdeen21 Apr 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed something while going over the tests, Create table by default allows trailing commas, because it doesn't use the parse_comma_separated function, which was added later.

Edit: made a temporary change to create table to remove default trailing commas. I say temporary because we probably can rewrite it using parse_comma_separated, but that should be in a follow-up

false
}
/// Does the dialect support trailing commas in the projection list?
fn supports_projection_trailing_commas(&self) -> bool {
self.supports_trailing_commas()
}
/// Dialect-specific infix parser override
fn parse_infix(
&self,
Expand Down
4 changes: 4 additions & 0 deletions src/dialect/snowflake.rs
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,10 @@ impl Dialect for SnowflakeDialect {
ch.is_ascii_lowercase() || ch.is_ascii_uppercase() || ch == '_'
}

fn supports_projection_trailing_commas(&self) -> bool {
true
}

fn is_identifier_part(&self, ch: char) -> bool {
ch.is_ascii_lowercase()
|| ch.is_ascii_uppercase()
Expand Down
28 changes: 22 additions & 6 deletions src/parser/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -292,7 +292,7 @@ impl<'a> Parser<'a> {
index: 0,
dialect,
recursion_counter: RecursionCounter::new(DEFAULT_REMAINING_DEPTH),
options: ParserOptions::default(),
options: ParserOptions::new().with_trailing_commas(dialect.supports_trailing_commas()),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems like a good change

}
}

Expand Down Expand Up @@ -3066,7 +3066,7 @@ impl<'a> Parser<'a> {
// This pattern could be captured better with RAII type semantics, but it's quite a bit of
// code to add for just one case, so we'll just do it manually here.
let old_value = self.options.trailing_commas;
self.options.trailing_commas |= dialect_of!(self is BigQueryDialect | SnowflakeDialect);
self.options.trailing_commas |= self.dialect.supports_projection_trailing_commas();

let ret = self.parse_comma_separated(|p| p.parse_select_item());
self.options.trailing_commas = old_value;
Expand Down Expand Up @@ -5014,12 +5014,17 @@ impl<'a> Parser<'a> {
} else {
return self.expected("column name or constraint definition", self.peek_token());
}

let comma = self.consume_token(&Token::Comma);
if self.consume_token(&Token::RParen) {
// allow a trailing comma, even though it's not in standard
break;
} else if !comma {
let rparen = self.peek_token().token == Token::RParen;

if !comma && !rparen {
return self.expected("',' or ')' after column definition", self.peek_token());
};

if rparen && (!comma || self.options.trailing_commas) {
let _ = self.consume_token(&Token::RParen);
break;
}
}

Expand Down Expand Up @@ -8530,6 +8535,9 @@ impl<'a> Parser<'a> {
with_privileges_keyword: self.parse_keyword(Keyword::PRIVILEGES),
}
} else {
let old_value = self.options.trailing_commas;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a change we should discuss in a separate PR as I think it will cause some statements that used to parse to stop working (even if though it is more consistent)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how that will break any existing statements. The original behavior was that if trailing commas were enabled and a DCL used 'SELECT', the parsing would fail.

This change temporarily ignores the trailing commas option when parsing DCL.

This is definitely not an ideal solution and should have its own PR (and therefore the corresponding issue should not be closed, even tho this was supposed to be the main target issue), but this has to be pushed with the new dialect trait method to pass the tests.

self.options.trailing_commas = false;

let (actions, err): (Vec<_>, Vec<_>) = self
.parse_comma_separated(Parser::parse_grant_permission)?
.into_iter()
Expand All @@ -8553,6 +8561,8 @@ impl<'a> Parser<'a> {
})
.partition(Result::is_ok);

self.options.trailing_commas = old_value;

if !err.is_empty() {
let errors: Vec<Keyword> = err.into_iter().filter_map(|x| x.err()).collect();
return Err(ParserError::ParserError(format!(
Expand Down Expand Up @@ -9007,6 +9017,12 @@ impl<'a> Parser<'a> {
Expr::Wildcard => Ok(SelectItem::Wildcard(
self.parse_wildcard_additional_options()?,
)),
Expr::Identifier(v) if v.value.to_lowercase() == "from" => {
parser_err!(
format!("Expected an expression, found: {}", v),
self.peek_token().location
)
}
expr => {
let expr: Expr = if self.dialect.supports_filter_during_aggregation()
&& self.parse_keyword(Keyword::FILTER)
Expand Down
78 changes: 74 additions & 4 deletions tests/sqlparser_common.rs
Original file line number Diff line number Diff line change
Expand Up @@ -3354,8 +3354,13 @@ fn parse_create_table_clone() {

#[test]
fn parse_create_table_trailing_comma() {
let sql = "CREATE TABLE foo (bar int,)";
all_dialects().one_statement_parses_to(sql, "CREATE TABLE foo (bar INT)");
let dialect = TestedDialects {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am worried about this change -- I think it means that the trailing comma will no longer be supported by all dialects

Copy link
Contributor Author

@MohamedAbdeen21 MohamedAbdeen21 May 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that all dialects allowed trailing commas, only in create statements, because create statements are parsed by a special function and not by the parse_comma_separated that actually considers the trailing_commas option. So that was a bug not a feature; however, as mentioned before, that behavior can be restored using the parsingOptions struct.

PS: I pointed that out in our previous conversation and recommended a follow-up to migrate the create statement to use the general parse_comma_separated function.

dialects: vec![Box::new(DuckDbDialect {})],
options: None,
};

let sql = "CREATE TABLE foo (bar int,);";
dialect.one_statement_parses_to(sql, "CREATE TABLE foo (bar INT)");
}

#[test]
Expand Down Expand Up @@ -8394,9 +8399,11 @@ fn parse_non_latin_identifiers() {

#[test]
fn parse_trailing_comma() {
// At the moment, Duck DB is the only dialect that allows
// trailing commas anywhere in the query
let trailing_commas = TestedDialects {
dialects: vec![Box::new(GenericDialect {})],
options: Some(ParserOptions::new().with_trailing_commas(true)),
dialects: vec![Box::new(DuckDbDialect {})],
options: None,
};

trailing_commas.one_statement_parses_to(
Expand All @@ -8414,11 +8421,74 @@ fn parse_trailing_comma() {
"SELECT DISTINCT ON (album_id) name FROM track",
);

trailing_commas.one_statement_parses_to(
"CREATE TABLE employees (name text, age int,)",
"CREATE TABLE employees (name TEXT, age INT)",
);

trailing_commas.verified_stmt("SELECT album_id, name FROM track");

trailing_commas.verified_stmt("SELECT * FROM track ORDER BY milliseconds");

trailing_commas.verified_stmt("SELECT DISTINCT ON (album_id) name FROM track");

// doesn't allow any trailing commas
let trailing_commas = TestedDialects {
dialects: vec![Box::new(GenericDialect {})],
options: None,
};

assert_eq!(
trailing_commas
.parse_sql_statements("SELECT name, age, from employees;")
.unwrap_err(),
ParserError::ParserError("Expected an expression, found: from".to_string())
);

assert_eq!(
trailing_commas
.parse_sql_statements("CREATE TABLE employees (name text, age int,)")
.unwrap_err(),
ParserError::ParserError(
"Expected column name or constraint definition, found: )".to_string()
)
);
}

#[test]
fn parse_projection_trailing_comma() {
// Some dialects allow trailing commas only in the projection
let trailing_commas = TestedDialects {
dialects: vec![Box::new(SnowflakeDialect {}), Box::new(BigQueryDialect {})],
options: None,
};

trailing_commas.one_statement_parses_to(
"SELECT album_id, name, FROM track",
"SELECT album_id, name FROM track",
);

trailing_commas.verified_stmt("SELECT album_id, name FROM track");

trailing_commas.verified_stmt("SELECT * FROM track ORDER BY milliseconds");

trailing_commas.verified_stmt("SELECT DISTINCT ON (album_id) name FROM track");

assert_eq!(
trailing_commas
.parse_sql_statements("SELECT * FROM track ORDER BY milliseconds,")
.unwrap_err(),
ParserError::ParserError("Expected an expression:, found: EOF".to_string())
);

assert_eq!(
trailing_commas
.parse_sql_statements("CREATE TABLE employees (name text, age int,)")
.unwrap_err(),
ParserError::ParserError(
"Expected column name or constraint definition, found: )".to_string()
),
);
}

#[test]
Expand Down
2 changes: 1 addition & 1 deletion tests/sqlparser_postgres.rs
Original file line number Diff line number Diff line change
Expand Up @@ -3572,7 +3572,7 @@ fn parse_create_table_with_alias() {
int2_col INT2,
float8_col FLOAT8,
float4_col FLOAT4,
bool_col BOOL,
bool_col BOOL
);";
match pg_and_generic().one_statement_parses_to(sql, "") {
Statement::CreateTable {
Expand Down
Loading