-
Notifications
You must be signed in to change notification settings - Fork 1
Support CREATE MATERIALIZED VIEW #1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @benesch, and thanks for the interest in my fork!
I can't merge this as is, because it would conflict with my private changes to accept CREATE [UNIQUE] INDEX
:
} else if self.parse_keyword("VIEW") {
self.parse_create_view()
+ } else if self.parse_keyword("UNIQUE") || self.parse_keyword("INDEX") {
+ // CREATE [UNIQUE] [CLUSTERED] INDEX name ON tbl (cols) ON [PRIMARY]
+ self.prev_token();
...
Would you be willing to use a similar approach (as already done in parse_query
) or maybe suggest a better one?
Also, given andygrove/sqlparser-rs#22 and pending discussion on andygrove/sqlparser-rs#7 I avoided non-ANSI syntax in this branch as much as possible (I have a number of private commits on top of this fork that enable parsing of the MSSQL I need to handle).
I personally think we could start by including the dialect-specific code in the common parser, as long as we clearly note which parts are dialect-specific (I believe the CREATE MATERIALIZED VIEW
is from Postgres and Oracle?)...
What are your thoughts on this?
Oof, that's thorny. Happy to adjust the patch to avoid conflict with your changes.
Yep, that's right!
I am wholeheartedly in favor of including dialect-specific code in the common parser. I'd love to be proven wrong, but I just don't see how you could modularize the parsing of dialects without vastly complicating the parser. On the bright side, SQL dialects tend to be additive rather than conflicting, so it actually seems tractable to construct a parser that accepts a superset of PostgreSQL/Oracle/MySQL/MSSQL/whatever syntax. If you do for some reason need to reject queries that don't exactly match a dialect, you could simply traverse the parse tree looking for, say, non-MySQL-isms. In practice I suspect that most users of this library are already doing such a traversal to check for unsupported features, since if you're building your own SQL engine you almost certainly don't support all features of your chosen dialect. The only downside of this approach is increased memory usage—e.g., users who only care about MySQL will have an extra Understood if you and @andygrove come to a different conclusion, in which case I'll maintain a fork somewhere! |
Thanks! I was thinking along the same lines, though not as clearly. Though I too don't see a sensible approach to the modularization of the various dialects, I was hoping there is a workable solution (hinted at here): I fear the polyglot parser has to eventually grow into quite a beast, where most of the code is uninteresting to any given contributor, who probably doesn't care/know about most of the dialects... But to get to the point where that becomes a major problem requires good coverage of multiple dialects, so I'm not worried too much at the moment! I'll be happy to merge this after you adjust the PR as discussed. |
Sounds great! The PR is adjusted per your recommendation now.
Yeah, I took a look at the custom parser in datafusion (https://github.com/andygrove/datafusion/blob/master/src/dfparser.rs), which seems like a reasonable way to add entirely new statements to a dialect. But I'm skeptical that the approach would work for adding something deeper in the grammar. Will supporting |
Thanks again! (Forgot to mention that since upstream started using |
Thanks for the great work on this. I have been hugely distracted with my day job and Apache Arrow lately and have been neglecting this parser. It's great that people can fork the repo and not be blocked. I would like to solve this design for supporting different dialects. It is a challenging problem for sure. I hope to have time to think about this in the next week or two. I did have a plan for this originally. |
Opening this here since upstream hasn't been responsive lately, it seems. Speaking of which—thanks for all the great fixes on top of upstream!