-
Notifications
You must be signed in to change notification settings - Fork 612
Support DISTINCT
for SetOperator
#689
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 3 commits
55b1d2d
e424471
0ac391c
f6e8305
ac41a8e
0337257
826dcc5
03e4950
e52e567
ae6fb81
f336342
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -64,6 +64,23 @@ impl fmt::Display for Query { | |
} | ||
} | ||
|
||
/// Options for SetOperator | ||
#[derive(Debug, Clone, PartialEq, Eq, Hash)] | ||
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))] | ||
pub enum SetOperatorOption { | ||
All, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In general it should be possible to round trip structures in sqlparser -- so it should be possible to distinguish between There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'll add the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I fixed this at ac41a8e |
||
Distinct, | ||
} | ||
|
||
impl fmt::Display for SetOperatorOption { | ||
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { | ||
match self { | ||
SetOperatorOption::All => write!(f, " ALL"), | ||
SetOperatorOption::Distinct => write!(f, " DISTINCT"), | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Printing spaces inside the enum structure doesn't make much sense. If there's a space or not, the structure that holds it should handle it (in this case, the SetExpr). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I fixed this at f6e8305 |
||
} | ||
} | ||
} | ||
|
||
/// A node in a tree, representing a "query body" expression, roughly: | ||
/// `SELECT ... [ {UNION|EXCEPT|INTERSECT} SELECT ...]` | ||
#[allow(clippy::large_enum_variant)] | ||
|
@@ -78,7 +95,7 @@ pub enum SetExpr { | |
/// UNION/EXCEPT/INTERSECT of two queries | ||
SetOperation { | ||
op: SetOperator, | ||
all: bool, | ||
op_option: Option<SetOperatorOption>, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This change makes users change their code to use There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @unvalley I'd name it set_quantifier, as it is the name used in ANSI (1). [1] : https://jakewheat.github.io/sql-overview/sql-2016-foundation-grammar.html#set-quantifier There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree set_quantifier sounds good (along with There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. |
||
left: Box<SetExpr>, | ||
right: Box<SetExpr>, | ||
}, | ||
|
@@ -98,10 +115,14 @@ impl fmt::Display for SetExpr { | |
left, | ||
right, | ||
op, | ||
all, | ||
op_option, | ||
} => { | ||
let all_str = if *all { " ALL" } else { "" }; | ||
write!(f, "{} {}{} {}", left, op, all_str, right) | ||
write!(f, "{} {}", left, op)?; | ||
if let Some(ref o) = op_option { | ||
write!(f, "{}", o)?; | ||
} | ||
write!(f, " {}", right)?; | ||
Ok(()) | ||
} | ||
} | ||
} | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4173,10 +4173,11 @@ impl<'a> Parser<'a> { | |
break; | ||
} | ||
self.next_token(); // skip past the set operator | ||
let op_option = self.parse_set_operator_option(&op); | ||
expr = SetExpr::SetOperation { | ||
left: Box::new(expr), | ||
op: op.unwrap(), | ||
all: self.parse_keyword(Keyword::ALL), | ||
op_option, | ||
right: Box::new(self.parse_query_body(next_precedence)?), | ||
}; | ||
} | ||
|
@@ -4193,6 +4194,40 @@ impl<'a> Parser<'a> { | |
} | ||
} | ||
|
||
pub fn parse_set_operator_option( | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'd be cautious about 2 things:
I found that at least There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think ALL / DISTINCT are ANSI SQL and thus should be handled by all dialects (unless we have specific evidence that some dialects don't support them there is no reason to prevent them being parsed) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thank you for commenting.
sorry, could you elaborate this?
OK, I'll remove the specific dialects handling (bq, mysql) and make it parse by GenericDialect. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @unvalley you're parsing them differently. Is there any reason why you can't do something like:
Also, FYI (not saying you must do it in this PR, besides the second one):
[1] : https://jakewheat.github.io/sql-overview/sql-2016-foundation-grammar.html#query-expression-body There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @AugustoFKL Thanks, I got it. I should implement I thought the BigQuery dialect should not allow There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @unvalley one option was to parse it conditionally for all of them, the same you do for Your PR is to support If you want the EXACT same support for a dialect (being it permissive or restrictive), I'd say to do it separately, working from a generic scope to a restrict one. Trying to do all at once make the code confusing, and requires reading all documentations at once. |
||
&mut self, | ||
op: &Option<SetOperator>, | ||
) -> Option<SetOperatorOption> { | ||
match op { | ||
Some(SetOperator::Union) => { | ||
if self.parse_keyword(Keyword::ALL) { | ||
Some(SetOperatorOption::All) | ||
} else if self.parse_keyword(Keyword::DISTINCT) | ||
&& dialect_of!(self is MySqlDialect | BigQueryDialect) | ||
{ | ||
Some(SetOperatorOption::Distinct) | ||
} else { | ||
None | ||
} | ||
} | ||
Some(SetOperator::Except) => { | ||
if self.parse_keyword(Keyword::ALL) { | ||
Some(SetOperatorOption::All) | ||
} else { | ||
None | ||
} | ||
} | ||
Some(SetOperator::Intersect) => { | ||
if self.parse_keyword(Keyword::ALL) { | ||
Some(SetOperatorOption::All) | ||
} else { | ||
None | ||
} | ||
} | ||
_ => None, | ||
} | ||
} | ||
|
||
/// Parse a restricted `SELECT` statement (no CTEs / `UNION` / `ORDER BY`), | ||
/// assuming the initial `SELECT` was already consumed | ||
pub fn parse_select(&mut self) -> Result<Select, ParserError> { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be nice having more description about the structure. Maybe a reference of which big dialects use it...