Improve parsing of JSON accesses on Postgres and Snowflake #1215
I'm wondering whether it would be possible to reuse the existing `MapAccess` expr variant to represent this scenario? I'm thinking that, e.g. with an example like below, the only difference would be an extension to the existing syntax enum.
Thanks for this comment, I appreciate your review! I did consider that, but my interpretation of this syntax is that `:` is an operator that takes a `VARIANT` expression and a JSON path expression as a single syntactic element (`VARIANT : JSONPATH`).
Another SQL dialect with the same syntax explicitly says this is the case:
https://docs.databricks.com/en/sql/language-manual/functions/colonsign.html
https://docs.databricks.com/en/sql/language-manual/sql-ref-json-path-expression.html
(If we do keep this, I should actually rename it back to `JsonAccess`, since I do plan to add Databricks as a dialect here in the future...)
Tangentially, there are also some things I'm not happy with in `MapAccess`, particularly `MapAccessSyntax::Dot`:
- `MapAccessSyntax::Dot` overlaps with `CompositeAccess`.
- `MapAccessKey` contains an `Expr` even when using dot syntax, which can only legally contain `Identifier` or `DoubleQuotedString` in that scenario. It would make more sense to use an `Ident` in that case.
So my broader preference here is:
- `CompositeAccess` for all non-jsonpath dot accesses.
- `MapAccess` for all non-jsonpath bracketed accesses (i.e. eliminate `MapAccessSyntax`).
- `JsonAccess` for all jsonpath accesses.
What do you think?
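To make the three-way split proposed above concrete, here is a minimal sketch of how such an AST could be shaped and rendered. All types are simplified, hypothetical stand-ins written for illustration (including `JsonPathElem` and the `Number` variant); they are not the crate's actual definitions.

```rust
use std::fmt;

// Hypothetical, simplified stand-ins for the AST types discussed in the
// thread. These are NOT the crate's real definitions.
#[derive(Debug, Clone, PartialEq)]
pub struct Ident(pub String);

#[derive(Debug, Clone, PartialEq)]
pub enum JsonPathElem {
    /// A key segment: rendered as `:key` when first, `.key` otherwise.
    Dot(Ident),
    /// A bracketed segment: `[expr]`.
    Bracket(Expr),
}

#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Identifier(Ident),
    Number(i64),
    /// Non-jsonpath dot access: `expr.field`. The key is an `Ident`, not an `Expr`.
    CompositeAccess { expr: Box<Expr>, key: Ident },
    /// Non-jsonpath bracketed access: `expr[key]`. No syntax enum is needed.
    MapAccess { expr: Box<Expr>, key: Box<Expr> },
    /// JSON path access: `value:a.b[0]`.
    JsonAccess { value: Box<Expr>, path: Vec<JsonPathElem> },
}

impl fmt::Display for Expr {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Expr::Identifier(id) => write!(f, "{}", id.0),
            Expr::Number(n) => write!(f, "{n}"),
            Expr::CompositeAccess { expr, key } => write!(f, "{expr}.{}", key.0),
            Expr::MapAccess { expr, key } => write!(f, "{expr}[{key}]"),
            Expr::JsonAccess { value, path } => {
                write!(f, "{value}")?;
                for (i, elem) in path.iter().enumerate() {
                    match elem {
                        // Only the first key segment uses the `:` operator.
                        JsonPathElem::Dot(id) if i == 0 => write!(f, ":{}", id.0)?,
                        JsonPathElem::Dot(id) => write!(f, ".{}", id.0)?,
                        JsonPathElem::Bracket(e) => write!(f, "[{e}]")?,
                    }
                }
                Ok(())
            }
        }
    }
}

/// Builds the AST for the expression `src:a.b[0]`.
pub fn example_json_access() -> Expr {
    Expr::JsonAccess {
        value: Box::new(Expr::Identifier(Ident("src".into()))),
        path: vec![
            JsonPathElem::Dot(Ident("a".into())),
            JsonPathElem::Dot(Ident("b".into())),
            JsonPathElem::Bracket(Expr::Number(0)),
        ],
    }
}
```

In this shape, the dot-versus-bracket distinction lives in which variant is used rather than in a separate syntax enum, which is the simplification the comment argues for.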
Hmm, I'm not sure that matters a lot from a parsing/AST perspective, in that we're not required to represent that literally, so we have some leeway to be flexible if it lets us simplify the code/representation?
I think a reasonable goal to aim for could be to reuse the same solution for both the map and the variant expressions, since beyond the leading `:` special case they are identical? I think this probably means merging both `MapAccess` and `JsonAccess` based on your proposal? If that seems reasonable, then feel free to make changes to that effect if it means revamping the current `MapAccess` setup.
If the variant remains, it might be good to instead impl `Display` on `VariantPathElem` and use the `display_separated` function here, so that we avoid looping and the same logic can be reused in other references to `VariantPathElem` in the future.
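As a rough illustration of this suggestion, the sketch below implements `Display` on a path element and joins a slice with a separator helper. The `display_separated` function here is a local stand-in written for the example and only assumed to behave like the crate's helper of the same name; the shape of `VariantPathElem` (two variants holding plain strings) is likewise an assumption.

```rust
use std::fmt;

/// Local stand-in for a `display_separated`-style helper: renders each item
/// of a slice, inserting `sep` between consecutive items.
pub struct DisplaySeparated<'a, T: fmt::Display> {
    slice: &'a [T],
    sep: &'static str,
}

impl<'a, T: fmt::Display> fmt::Display for DisplaySeparated<'a, T> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let mut delim = "";
        for item in self.slice {
            write!(f, "{delim}")?;
            delim = self.sep;
            write!(f, "{item}")?;
        }
        Ok(())
    }
}

pub fn display_separated<'a, T: fmt::Display>(
    slice: &'a [T],
    sep: &'static str,
) -> DisplaySeparated<'a, T> {
    DisplaySeparated { slice, sep }
}

/// Hypothetical, simplified path element (strings instead of AST nodes).
pub enum VariantPathElem {
    Key(String),
    Bracket(String),
}

impl fmt::Display for VariantPathElem {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // Each element emits its own leading punctuation, so the
            // elements can be joined with an empty separator.
            VariantPathElem::Key(k) => write!(f, ".{k}"),
            VariantPathElem::Bracket(e) => write!(f, "[{e}]"),
        }
    }
}

/// Renders a whole path without an explicit loop at the call site.
pub fn render_path(path: &[VariantPathElem]) -> String {
    display_separated(path, "").to_string()
}
```

Note that with a position-independent `Display`, the first key comes out as `.a`, so whatever renders the surrounding expression would still need to handle the leading `:` itself.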
The trouble with that is that an individual `VariantPathElem` needs to know whether it's the initial path segment, to render as a `:` or a `.`. I don't want to add `VariantPathElem::Colon` because (a) the two current variants map to the two documented Snowflake syntaxes and (b) it makes more invalid AST states representable (e.g. if a user manually constructs an AST to render, they have to remember this detail). Another option, if we wanted to factor this out, would be to wrap the path elements in a `VariantPath` type, though that doesn't really seem worth it to me.
I went ahead and made this change.
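For reference, the `VariantPath` wrapper mentioned above could look roughly like the following. This is a minimal sketch under assumed, simplified types (string payloads instead of AST nodes), not the code that was merged: the wrapper owns the position-dependent rendering, so no `Colon` variant is needed on the element enum.

```rust
use std::fmt;

/// Hypothetical, simplified path element with the two documented forms.
pub enum VariantPathElem {
    Key(String),
    Bracket(String),
}

/// Hypothetical wrapper that knows each element's position in the path.
pub struct VariantPath(pub Vec<VariantPathElem>);

impl fmt::Display for VariantPath {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        for (i, elem) in self.0.iter().enumerate() {
            match elem {
                VariantPathElem::Key(k) => {
                    // The initial key follows the `:` operator; later keys use `.`.
                    let sep = if i == 0 { ':' } else { '.' };
                    write!(f, "{sep}{k}")?;
                }
                VariantPathElem::Bracket(e) => write!(f, "[{e}]")?,
            }
        }
        Ok(())
    }
}
```

Because the colon is derived from the index rather than stored in the AST, a manually constructed path cannot place it in the wrong position.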