Skip to content

Add FETCH/OFFSET and JOIN LATERAL (subquery) #69

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jun 2, 2019

Conversation

thomas-jeepe
Copy link
Contributor

I've added code and passing tests for this, just two points which should be gone over.

First is the ambiguity of the sql spec. Per the ANSI SQL:2011 standard, there is no limit clause, and the order seems to be set in stone.

Obviously, many databases and currently sqlparser support the LIMIT clause.

POSTGRES seems to support OFFSET without {ROW | ROWS}. LIMIT, OFFSET, FETCH can be reordered in POSTGRES. Having all of LIMIT, OFFSET and FETCH causes a parsing error.

I've implemented it in such a fashion that you can't reorder those clauses and so that you can have all of LIMIT, OFFSET and FETCH. I don't support OFFSET without {ROW | ROWS}

Another point is keywords, in sqlparser, no distinction is made between reserved and non-reserved words. I added a couple new keywords to the keyword list, I hope this should be fine.

@coveralls
Copy link

coveralls commented May 21, 2019

Pull Request Test Coverage Report for Build 223

  • 177 of 188 (94.15%) changed or added relevant lines in 3 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.3%) to 89.761%

Changes Missing Coverage Covered Lines Changed/Added Lines %
tests/sqlparser_common.rs 93 98 94.9%
src/sqlparser.rs 56 62 90.32%
Totals Coverage Status
Change from base Build 222: 0.3%
Covered Lines: 3226
Relevant Lines: 3594

💛 - Coveralls

@nickolay
Copy link
Contributor

Thanks for the PR! I'm sorry it took so long to get back to you - I wasn't familiar with this syntax and got distracted by the other PRs...

I only looked into the FETCH/OFFSET patch, not the LATERAL DERIVED bit. My comment grew larger than I intended, but the gist is: I would update the syntax comments (1) and settle the FetchQuantity question (3) one way or the other before merging, and file a separate issue for other possible improvements (2).

I'd also like to coordinate merges with @benesch, per discussion in #84.


  1. With this patch we support parsing (I'd update the code comments to use this format):
  • LIMIT { <N> | ALL } (as before)
  • OFFSET <N> { ROW | ROWS }
  • FETCH { FIRST | NEXT } <N> [ PERCENT ] { ROW | ROWS } { ONLY | WITH TIES }
  1. I looked in Postgres docs on LIMIT, assuming other DBs implement more or less the same semantics, and googled a bit to compare the implementation with what PG supports and the standard mandates:
  • In FETCH, FIRST/NEXT are synonyms as are ROW/ROWS - because SQL likes to think of itself as English.
    • Accordingly, we don't store this distinction in AST. The PR adds parse_one_of_keywords/expect_one_of_keywords to handle these.
  • The standard defines <N> to be a <simple value specification>, which means a literal (not necessarily an unsigned number!) or exotic things that I think resolve to a literal (<host parameter name> | <SQL parameter reference> | <embedded variable name>). In Postgres "other expressions are allowed, but will generally need to be enclosed in parentheses to avoid ambiguity".
    • Whereas we parse <N> as an unsigned integer for LIMIT and OFFSET (which I make clearer in andygrove@1492f42 ), or any kind of literal for FETCH. The AST stores an ASTNode::SQLValue.
  • <N> in FETCH is optional, both per the standard and in Postgres.
    • We don't implement this yet.
  • LIMIT ALL is equivalent to not having a LIMIT clause at all.
    • We parse it and represent the result as limit: None (just like the lack of the LIMIT clause)
  • LIMIT <N> is equivalent to FETCH FIRST <N> ROWS ONLY.
    • We have a separate fields in the AST for LIMIT and FETCH and, as you mentioned, we parse a query with all of LIMIT, OFFSET and FETCH without an error.
  • As you mentioned, we don't support Postgresql-isms, like reordering of the clauses and OFFSET <N>, yet.
  1. The FETCH clause is represented as Option<Fetch>, where Fetch is a struct { with_ties: bool, quantity: FetchQuantity }, where FetchQuantity is an enum { RowCount(ASTNode), Percentage(ASTNode) }. I wonder why not Fetch { quantity: ASTNode, percent: bool, with_ties: bool }?

@nickolay
Copy link
Contributor

Another point is keywords, in sqlparser, no distinction is made between reserved and non-reserved words. I added a couple new keywords to the keyword list, I hope this should be fine.

BTW: Yes, it's fine. I was thinking of removing the global keywords list altogether.

@thomas-jeepe
Copy link
Contributor Author

I've made the two changes you suggested, I find having percent as a boolean much simpler to use and having more specific comments is good.

I added support for parsing fetch without a quantity, I represent this by having quantity be an Option.

Other improvements I agree should be discussed in a new issue.

On the lateral derived, I apologize for throwing this into the PR. I needed lateral joins for the project I am using sqlparser for, so I built it in my fork of the repository. I am still relatively new to open source workflows, but I imagine I am supposed to make a branch for the lateral derived feature?

LATERAL is actually very simple to parse (just check for LATERAL then parse a subquery). I think the AST for LateralDerived is simple enough and I have a test passing for parsing a lateral subquery.

All in all the branch is ready to merge and we can create an issue to discuss what to do with FETCH, LIMIT and OFFSET improvements.

@benesch
Copy link
Contributor

benesch commented May 31, 2019

This looks great to me! I think the support for LATERAL derived queries here is sufficient to parse all LATERAL-ness present in the SQL spec, but Postgres, if I'm reading things right, allows LATERAL to apply to any JOIN factor, not just derived factors: https://www.postgresql.org/docs/current/queries-table-expressions.html#QUERIES-LATERAL. Let's punt that to later work, though. I think we'll need to tweak the structure of TableFactor a bit, and there's no point holding you up in the meantime.

I'll leave it to @nickolay to double-check, but I do believe you've addressed everything he requested!

@benesch benesch force-pushed the master branch 2 times, most recently from d169569 to 1f6da84 Compare May 31, 2019 21:58
@benesch
Copy link
Contributor

benesch commented May 31, 2019

@thomas-jeepe, just a heads up that I force-pushed a rebase that mostly cleans up the Git history. The FETCH and OFFSET support is isolated to the first commit now, and LATERAL support to the second. I didn't make any code changes to FETCH and OFFSET, but I did adjust the LATERAL support to use a lateral bool instead of a new enum variant, which I think is a bit cleaner. What do you think?

@thomas-jeepe
Copy link
Contributor Author

@benesch Thanks for the comments and review! I agree on the lateral boolean field, it definitely looks more concise and cleaner. Thanks for the rebase as well!

@nickolay nickolay merged commit 721c241 into apache:master Jun 2, 2019
@nickolay
Copy link
Contributor

nickolay commented Jun 2, 2019

Merged, thanks!

@nickolay nickolay changed the title Add FETCH and OFFSET support Add FETCH/OFFSET and JOIN LATERAL (subquery) Jun 2, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants