Skip to content

Remove "SQL" from types (and other renames) #105

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jun 25, 2019
Merged

Conversation

benesch
Copy link
Contributor

@benesch benesch commented Jun 5, 2019

Please view this PR as a first-draft proposal, not a ready-to-merge patch! I'm most interested in sparking a discussion.


It has long bothered me that some of the types in this package start with "SQL", like SQLObjectName, while others do not, like Value and Join. The module naming has the same inconsistency: we have sqltokenizer and a sql_operator modules, but also dialect and value modules.

Digging through the repository history, it seems this is an artifact of extracting this package from DataFusion. In DataFusion it was natural to have a "sqltokenizer" module, since you couldn't assume that a "tokenizer" was a SQL tokenizer, while for us, where our only concern is SQL, it's rather redundant to stick "SQL" in front of everything. So this PR takes the approach of stripping the "SQL" prefix from all types and modules. It's also possible that we'll want to go the other direction, adding "SQL" where it doesn't exist. I'm open to that as well, and would be happy to update this PR accordingly. My biggest concern is consistency. Note that if we do stick with the "SQL" prefix, we should spell it "Sql" to be consistent with Rust naming conventions—and in fact some of the types do already use this alternative capitalization.

I also took the opportunity to correct a few inconsistencies around ASTNode::BinaryExpr and ASTNode::Unary, and renamed ASTNode to Expr; I think I've left rationale for all of these changes in the individual commits.

@benesch benesch requested a review from nickolay June 5, 2019 16:41
@benesch
Copy link
Contributor Author

benesch commented Jun 5, 2019

I forgot to mention: it's worth being sensitive to the effect that renames will have on downstream consumers. I'm hopeful that it's early enough in the lifecycle of this package that we can make a sweeping change like this—it's certainly not going to get any easier down the road.

@coveralls
Copy link

coveralls commented Jun 5, 2019

Pull Request Test Coverage Report for Build 377

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at ?%

Totals Coverage Status
Change from base Build 375: 0.0%
Covered Lines:
Relevant Lines: 0

💛 - Coveralls

@nickolay
Copy link
Contributor

nickolay commented Jun 6, 2019

I think that the "Standardize BinaryOp and UnaryOp nodes" and "Split operators by arity" commits deserve to be split into a separate PR and landed, as it's a clear improvement for a small change.

I didn't make a PR with the ASTNode -> Expr rename when I split non-expression variants from ASTNode, because of the effect on everyone, which you noted. I feel that with the amount of changes made to the AST in this cycle it's probably better to do it before the next release, rather than later.

I understand the desire to remove the "SQL" prefix, be consistent internally and with the Rust naming guidelines, and I agree the status quo is weird with things like SQLStatement::SQLAlterTable, but:

  • we can't consistently avoid the SQL prefix (as you kept SqlType and SqlOption)
  • names like "Drop" are less greppable than SQLDrop, and I don't have trust in IDE-like features like "find all references" for Rust yet
  • with names like "Word", "Operator", "Ident" it's not as obvious this is a custom type, not a stdlib thing. (But maybe it just takes getting used to.)
  • "Sql" makes me cringe - naming conventions be damned:) - so I wouldn't insist on using it in every name.

On the other hand, objecting to having a naming scheme when you do the hard work of converting everything to conform to it would be unwise of me. So I'd like to abstain.

@benesch
Copy link
Contributor Author

benesch commented Jun 6, 2019

On the other hand, objecting to having a naming scheme when you do the hard work of converting everything to conform to it would be unwise of me. So I'd like to abstain.

No no, I'm very interested in your opinion. I use JetBrains for stuff like this, because its support for large-scale refactors blows absolutely everything else out of the water. (Not only does it work, it's nearly instant.) I spent about 20m in total crafting this PR, so don't feel like I've made a ton of investment. I like to see how the refactor feels, too; it's tough for me to say whether a given naming scheme is good or bad without actually seeing it in use.

@nickolay
Copy link
Contributor

nickolay commented Jun 9, 2019

Well I shared my opinion, but I do think it'd be a shame if we preserved all the inconsistencies as the result. I can't come up with a solution that looks satisfactory to me, so I'm ready to live with whatever you or Andy come up with.

To expand on my previous points:

names like "Drop" are less greppable than SQLDrop, and I don't have trust in IDE-like features like "find all references" for Rust yet

I use RLS and trying to "find all references" on SQLDrop by clicking on it in this line just doesn't find anything. Given that I didn't find an issue about that quickly, I probably ought to investigate this further... But unless it's a problem on my end, this seems to be a somewhat valid concern - I'm not sure.

"Sql" makes me cringe

This on the other hand was tongue-in-cheek. I'm not against "Sql" itself, but I'd rather not do a mechanical SQL->Sql rename until the naming scheme is figured out.

@andygrove
Copy link
Member

Quick thoughts on this:

I agree it is redundant prefixing names with SQL in this project, and yes, this partly came about because this crate was extracted from a larger project.

Sql is annoying (compared to SQL) but I could live with it.

The changes made in this repo compared to the version that people are actually using (last time I checked at least) are already big enough that consumer have considerable refactoring to do so I wouldn't worry too much about breaking changes at this point. Maybe it is time to do a check though and see what projects have dependencies on the published crate and maybe reach out to people using this to get some opinions.

We should also talk a release schedule at some point. The last release was 2 months ago apparently.

@nickolay
Copy link
Contributor

Maybe it is time to do a check though and see what projects have dependencies on the published crate and maybe reach out to people using this to get some opinions.

There's still the same two published dependent crates - locustdb and datafusion, both still on 0.2.x...

I think we should close this before the next release (#117).

@benesch, would you like to update and land the ASTNode -> Expr and the module renaming?

Regarding types,

  • I withdraw my concerns about "Word", "Operator" (now Unary/BinaryOperator), "Ident". Indeed you get used to it.
  • SQLType could be renamed to DataType instead
  • For SQLStatement::SQLDrop/SQLCopy and others - it turns out grepping for Drop/Copy is not actually a problem, as there are few hits. If you still are in favor of Statement::Drop naming - let's go with it!

@benesch
Copy link
Contributor Author

benesch commented Jun 18, 2019

Sounds good! I'm going to tackle this later today (once I land #118).

@benesch benesch mentioned this pull request Jun 18, 2019
@benesch
Copy link
Contributor Author

benesch commented Jun 18, 2019

Ok, I'm going to do this in chunks, to try to keep the reviews manageable. The first chunk, the ASTNode -> Expr rename, is ready for review! See #119.

Since this crate only deals with SQL parsing, the modules are understood
to refer to SQL and don't need to restate that explicitly.
@benesch
Copy link
Contributor Author

benesch commented Jun 24, 2019

Ok, I've rebased this on the latest master! @nickolay please take a look when you get a chance.

Copy link
Contributor

@nickolay nickolay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing jumps out at me, looks good, thanks again!

pub mod keywords;
mod mssql;
mod postgresql;

use std::fmt::Debug;

pub use self::ansi_sql::AnsiSqlDialect;
pub use self::generic_sql::GenericSqlDialect;
pub use self::ansi::AnsiSqlDialect;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was about to suggest that ansi::AnsiDialect and generic::GenericDialect would be more consistent, only to see you've renamed them in the following commit!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 yeah, I've learned not to try to do renames in the same commit as anything else. Makes it really hard to tell what's changed if Git fails to detect the rename.

src/parser.rs Outdated
"TRUE" | "FALSE" | "NULL" => {
self.prev_token();
self.parse_sql_value()
self.parse_value_expr()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new name is a bit confusing, perhaps just use Ok(Expr::SQLValue(self.parse_value()?)) here and in the other 2 callsites?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Done.

src/parser.rs Outdated
@@ -1935,7 +1933,7 @@ impl Parser {
let (quantity, percent) = if self.parse_one_of_keywords(&["ROW", "ROWS"]).is_some() {
(None, false)
} else {
let quantity = self.parse_sql_value()?;
let quantity = self.parse_value_expr()?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to get back to the original PR to remember why we store quantity as an Expr(Value()) rather than just Value: at least Postgres claims to support more complex expressions - while we don't parse them, we "future-proofed" the AST in case we do.

Copy link
Contributor Author

@benesch benesch Jun 25, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense.

The rationale here is the same as the last commit: since this crate
exclusively parses SQL, there's no need to restate that in every type
name. (The prefix seems to be an artifact of this crate's history as a
submodule of Datafusion, where it was useful to explicitly call out
which types were related to SQL parsing.)

This commit has the additional benefit of making all type names
consistent; over type we'd added some types which were not prefixed with
"SQL".
@benesch
Copy link
Contributor Author

benesch commented Jun 25, 2019

Awesome, thanks for the review, @nickolay! Merging on green.

@benesch benesch merged commit bafb207 into apache:master Jun 25, 2019
@benesch benesch deleted the renames branch June 25, 2019 17:16
nickolay added a commit to nickolay/sqlparser-rs that referenced this pull request Jul 1, 2019
nickolay added a commit that referenced this pull request Jul 1, 2019
Update comments after the renaming done in PR #105
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants