Skip to content

Support CREATE MATERIALIZED VIEW #1

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 8, 2019

Conversation

benesch
Copy link

@benesch benesch commented Mar 6, 2019

Opening this here since upstream hasn't been responsive lately, it seems. Speaking of which—thanks for all the great fixes on top of upstream!

Copy link
Owner

@nickolay nickolay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @benesch, and thanks for the interest in my fork!

I can't merge this as is, because it would conflict with my private changes to accept CREATE [UNIQUE] INDEX:

     } else if self.parse_keyword("VIEW") {
         self.parse_create_view()
   + } else if self.parse_keyword("UNIQUE") || self.parse_keyword("INDEX") {
   +     // CREATE [UNIQUE] [CLUSTERED] INDEX name ON tbl (cols) ON [PRIMARY]
   +     self.prev_token();
      ...

Would you be willing to use a similar approach (as already done in parse_query) or maybe suggest a better one?


Also, given andygrove/sqlparser-rs#22 and pending discussion on andygrove/sqlparser-rs#7 I avoided non-ANSI syntax in this branch as much as possible (I have a number of private commits on top of this fork that enable parsing of the MSSQL I need to handle).

I personally think we could start by including the dialect-specific code in the common parser, as long as we clearly note which parts are dialect-specific (I believe the CREATE MATERIALIZED VIEW is from Postgres and Oracle?)...

What are your thoughts on this?

@benesch
Copy link
Author

benesch commented Mar 6, 2019

Oof, that's thorny. Happy to adjust the patch to avoid conflict with your changes.

I personally think we could start by including the dialect-specific code in the common parser, as long as we clearly note which parts are dialect-specific (I believe the CREATE MATERIALIZED VIEW is from Postgres and Oracle?)...

Yep, that's right!

What are your thoughts on this?

I am wholeheartedly in favor of including dialect-specific code in the common parser. I'd love to be proven wrong, but I just don't see how you could modularize the parsing of dialects without vastly complicating the parser.

On the bright side, SQL dialects tend to be additive rather than conflicting, so it actually seems tractable to construct a parser that accepts a superset of PostgreSQL/Oracle/MySQL/MSSQL/whatever syntax. If you do for some reason need to reject queries that don't exactly match a dialect, you could simply traverse the parse tree looking for, say, non-MySQL-isms. In practice I suspect that most users of this library are already doing such a traversal to check for unsupported features, since if you're building your own SQL engine you almost certainly don't support all features of your chosen dialect. The only downside of this approach is increased memory usage—e.g., users who only care about MySQL will have an extra materialized: bool in their CreateViewNodes even though the MySQL dialect doesn't support materialized views—but if you care that much about memory usage you'll probably need to write a custom parser anyway.

Understood if you and @andygrove come to a different conclusion, in which case I'll maintain a fork somewhere!

@nickolay
Copy link
Owner

nickolay commented Mar 7, 2019

Thanks! I was thinking along the same lines, though not as clearly.

Though I too don't see a sensible approach to the modularization of the various dialects, I was hoping there is a workable solution (hinted at here): I fear the polyglot parser has to eventually grow into quite a beast, where most of the code is uninteresting to any given contributor, who probably doesn't care/know about most of the dialects... But to get to the point where that becomes a major problem requires good coverage of multiple dialects, so I'm not worried too much at the moment!

I'll be happy to merge this after you adjust the PR as discussed.

@benesch
Copy link
Author

benesch commented Mar 7, 2019

Sounds great! The PR is adjusted per your recommendation now.

Though I too don't see a sensible approach to the modularization of the various dialects, I was hoping there is a workable solution (hinted at here)

Yeah, I took a look at the custom parser in datafusion (https://github.com/andygrove/datafusion/blob/master/src/dfparser.rs), which seems like a reasonable way to add entirely new statements to a dialect. But I'm skeptical that the approach would work for adding something deeper in the grammar. Will supporting CREATE TABLE IF NOT EXISTS require reimplementing all of CREATE TABLE? Plus then you'd have two AST nodes for each variant, when you really want a if_not_exists: bool on one create table node. (But, again, I'd be very happy if someone proved me wrong here.)

@nickolay nickolay merged commit 23a0fc7 into nickolay:master Mar 8, 2019
nickolay added a commit that referenced this pull request Mar 8, 2019
@nickolay
Copy link
Owner

nickolay commented Mar 8, 2019

Thanks again! (Forgot to mention that since upstream started using cargo fmt last year, I try to keep my fork formatted with the default settings too, even though it looks a bit weird sometimes.)

@andygrove
Copy link

Thanks for the great work on this. I have been hugely distracted with my day job and Apache Arrow lately and have been neglecting this parser. It's great that people can fork the repo and not be blocked. I would like to solve this design for supporting different dialects. It is a challenging problem for sure. I hope to have time to think about this in the next week or two. I did have a plan for this originally.

@benesch benesch deleted the materialized branch June 21, 2019 22:35
DavidBM pushed a commit to Couragium/mysqldump-mutator-rs that referenced this pull request Jan 25, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants