-
Notifications
You must be signed in to change notification settings - Fork 602
Extend support for INDEX parsing #1707
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
`Copy` trait derive was previously removed as a variant `Custom` with `Ident` data type was added to the enum. This variant has afterwards been removed as deemed, while necessary to fully capture custom indices, extremely rare (the Postgres documentation itself says that creating custom indices is a rather hard task) and it would be very hard to fully capture in a good manner all possible custom variants.
Co-authored-by: Ifeanyi Ubah <[email protected]>
Marking this as draft in the meantime as its no longer awaiting review, @LucaCappelletti94 please feel free to undraft and ping when ready! |
At this time I am somewhat stuck as I would not know how to parse the class operator names if not as a keyword - do you have any examples that could be applied to this use case? |
@LucaCappelletti94 (not sure I followed entirely) did you mean to parse and represent an operator class like |
While things may change in the future, to the best of my knowledge some indices can have operator classes and some cannot. Since we model the different indices, is it preferable if I continue to model this semantical difference between them? Or should I remove it? I believe that different indices which accept different parametrizations are more similar to syntax than semantics. Do let me know what is your opinion on it. |
Sorry im not sure I understood the question, could you clarify with a sql example if that's possible? |
Sure, so: In a GIN index, there MAY be an operator class from a provided limited set, in the following example CREATE INDEX documents_content_trgm_idx ON documents USING GIN (content gin_trgm_ops); In a BRIN index, instead, to the best of my knowledge there are no such operator classes. I base this claim on having read the actual postgres source code, as unfortunately the documentation is rather scarse in regard of which operator classes are available. CREATE INDEX logs_event_time_brin_idx ON logs USING BRIN (event_time); One more important thing is that the actual index type, in these cases respectively If the Operator classes are to be treated as generic strings, it may be the case that also index types should be treated as such, if the goal is to make the library generically future-proofed. I am not certain it is necessarily the best way forward, as it may become exceedingly generic, but most likely this is my need to validate the semantics of the SQL that sneaks in. Summarizing, let me know your opinion of these three questions:
|
Ah I see thanks for clarifying!
|
I have tried to replace the parsing of keywords with the generic String Ident as proposed, but it leads to much more complex parsing. Primarily, I don't understand how I could distinguish an operator class from a keyword, such as: |
I have managed to do it, but I really dislike the approach I used, which peeks whether the parser will next encounter one of the expected keywords and if not, assumes that the next token is an operator class. I much preferred the previous approach that distinctively treated operator classes as keywords which was much stricter, but at some point it boils down to personal opinion. Let me know your opinion on it. |
I tried to do the merge using the GitHub tool but it was lagging immensely, I will now fix it offline. |
tests/sqlparser_postgres.rs
Outdated
|
||
#[test] | ||
fn parse_create_projects_name_description_trgm_index() { | ||
let sql = "CREATE INDEX projects_name_description_trgm_idx ON projects USING GIN (concat_projects_name_description(name, description) gin_trgm_ops)"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this test scenario looks identical to parse_create_users_name_trgm_index
in terms of coverage?
Im thinking generally for the tests we can group them into the same test function, it would be sufficient with the AST assertion on only one of the test scenarios then for the rest we can rely only on verified_stmt()
, that would keep the tests code smaller and easier to track what part of the syntax is covered.
Then could we add scenarios for all the introduced index types (i.e. BRIN, BLOOM etc)? As well as the custom type which probably would live in common I imagine?
Co-authored-by: Ifeanyi Ubah <[email protected]>
Co-authored-by: Ifeanyi Ubah <[email protected]>
Co-authored-by: Ifeanyi Ubah <[email protected]>
Co-authored-by: Ifeanyi Ubah <[email protected]>
Co-authored-by: Ifeanyi Ubah <[email protected]>
Co-authored-by: Ifeanyi Ubah <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks @LucaCappelletti94!
cc @alamb
Co-authored-by: Ifeanyi Ubah <[email protected]>
Co-authored-by: Ifeanyi Ubah <[email protected]>
This pull request addresses issue #1706.
Summary of changes:
operator_classes.rs
containing the classesddl.rs
IndexColumn