Add support of parsing CLUSTERED BY clause for Hive #1397

Merged
merged 1 commit into from
Sep 1, 2024

git-hulk
Member

This PR adds support for parsing the CLUSTERED BY clause in CREATE TABLE for the Hive dialect. The clause groups a table's data into buckets by the CLUSTERED BY columns. For more information, please refer to:

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateTable

It also introduces the following keywords:

  • CLUSTERED
  • SORTED
  • BUCKETS

This resolves part of issue #1395.
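
For readers unfamiliar with the clause, here is a minimal sketch of the shape it takes in Hive: `CLUSTERED BY (cols) [SORTED BY (col [ASC|DESC], ...)] INTO n BUCKETS`. The `ClusteredBy` struct and `SortOrder` enum below are simplified, hypothetical stand-ins written for illustration only; they are not the types this PR adds to the crate, and the column names are made up.

```rust
use std::fmt;

/// Sort direction for a SORTED BY column (hypothetical helper type).
#[derive(Debug, Clone, PartialEq)]
enum SortOrder {
    Asc,
    Desc,
}

/// Simplified, hypothetical model of a Hive CLUSTERED BY clause.
/// Assumption: columns are plain strings and the bucket count is a u64.
#[derive(Debug, Clone, PartialEq)]
struct ClusteredBy {
    columns: Vec<String>,
    sorted_by: Option<Vec<(String, SortOrder)>>,
    num_buckets: u64,
}

impl fmt::Display for ClusteredBy {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // CLUSTERED BY (col, ...)
        write!(f, "CLUSTERED BY ({})", self.columns.join(", "))?;
        // Optional SORTED BY (col ASC|DESC, ...)
        if let Some(sorted_by) = &self.sorted_by {
            let cols: Vec<String> = sorted_by
                .iter()
                .map(|(name, order)| {
                    let dir = match order {
                        SortOrder::Asc => "ASC",
                        SortOrder::Desc => "DESC",
                    };
                    format!("{name} {dir}")
                })
                .collect();
            write!(f, " SORTED BY ({})", cols.join(", "))?;
        }
        // INTO n BUCKETS is required once CLUSTERED BY is present.
        write!(f, " INTO {} BUCKETS", self.num_buckets)
    }
}

fn main() {
    let clause = ClusteredBy {
        columns: vec!["user_id".to_string()],
        sorted_by: Some(vec![("ts".to_string(), SortOrder::Desc)]),
        num_buckets: 32,
    };
    // prints "CLUSTERED BY (user_id) SORTED BY (ts DESC) INTO 32 BUCKETS"
    println!("{clause}");
}
```

The three new keywords listed above (CLUSTERED, SORTED, BUCKETS) each appear once in the rendered clause.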

@@ -236,19 +239,6 @@ impl Display for CreateTable {
HiveDistributionStyle::PARTITIONED { columns } => {
write!(f, " PARTITIONED BY ({})", display_comma_separated(columns))?;
}
HiveDistributionStyle::CLUSTERED {
Member Author

PARTITIONED BY and CLUSTERED BY can appear at the same time, so we cannot use a single enum to hold both. I removed CLUSTERED from HiveDistributionStyle here, but it won't affect users since we didn't support parsing CLUSTERED BY before this PR. cc @iffyio
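
To illustrate the point in this comment, here is a rough sketch of why one enum cannot model a table that is both partitioned and bucketed, while two independent optional fields can. The `TableLayout` struct and its field names are hypothetical and simplified for illustration; they are not the crate's actual definitions.

```rust
use std::fmt;

/// Hypothetical, simplified layout options for a Hive table.
/// An enum value can be only one variant at a time, so a type like
/// `enum Style { Partitioned(..), Clustered(..) }` could never describe
/// a table that uses both clauses. Two Option fields can.
#[derive(Debug, Default)]
struct TableLayout {
    /// Partition column definitions, e.g. "dt STRING".
    partitioned_by: Option<Vec<String>>,
    /// Bucketing columns plus the number of buckets.
    clustered_by: Option<(Vec<String>, u64)>,
}

impl fmt::Display for TableLayout {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Each clause is emitted independently of the other.
        if let Some(columns) = &self.partitioned_by {
            write!(f, " PARTITIONED BY ({})", columns.join(", "))?;
        }
        if let Some((columns, buckets)) = &self.clustered_by {
            write!(
                f,
                " CLUSTERED BY ({}) INTO {} BUCKETS",
                columns.join(", "),
                buckets
            )?;
        }
        Ok(())
    }
}

fn main() {
    let layout = TableLayout {
        partitioned_by: Some(vec!["dt STRING".to_string()]),
        clustered_by: Some((vec!["user_id".to_string()], 8)),
    };
    // Both clauses appear in the same statement.
    println!("CREATE TABLE events (user_id INT){layout}");
}
```

With both fields set, the rendered statement carries PARTITIONED BY followed by CLUSTERED BY, which is the combination the comment says an enum could not express.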

@git-hulk git-hulk force-pushed the feature/clustered-by-for-hive branch from b990684 to e7ae11e Compare August 24, 2024 12:57
@coveralls

Pull Request Test Coverage Report for Build 10538676696

Details

  • 70 of 74 (94.59%) changed or added relevant lines in 7 files are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage increased (+0.04%) to 89.253%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| src/ast/helpers/stmt_create_table.rs | 5 | 6 | 83.33% |
| src/parser/mod.rs | 23 | 24 | 95.83% |
| src/ast/ddl.rs | 6 | 8 | 75.0% |

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| tests/sqlparser_common.rs | 1 | 89.57% |

Totals (Coverage Status)

  • Change from base Build 10528509561: 0.04%
  • Covered Lines: 28620
  • Relevant Lines: 32066

💛 - Coveralls

Contributor

@iffyio iffyio left a comment

LGTM! cc @alamb

Contributor

@alamb alamb left a comment

Thanks @iffyio and @git-hulk !

@alamb alamb merged commit 7b4ac7c into apache:main Sep 1, 2024
10 checks passed
ayman-sigma pushed a commit to sigmacomputing/sqlparser-rs that referenced this pull request Nov 19, 2024