Skip to content

Consolidate representation of function calls, remove AggregateExpressionWithFilter, ArraySubquery, ListAgg and ArrayAgg #1247

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
May 3, 2024

Conversation

jmhain
Copy link
Contributor

@jmhain jmhain commented May 1, 2024

This addresses a handful of issues and redundancies related to our representation of function calls.

  • I split off a new FunctionArguments enum with three options:
    • A regular argument list with a list of additional clauses
    • No argument list - for representing things like CURRENT_TIMESTAMP. This is better than the special bit we had because the latter allows for many invalid AST states.
    • A subquery. This is used to losslessly represent Snowflake's support for unparenthesized subqueries as function arguments, and allowed removal of the special ArraySubquery expression variant.
  • Reimplemented the fix from Support BigQuery window function null treatment #1239 to take advantage of the new structure.
  • I added WITHIN GROUP support to Function, which makes functions like percentile_cont(...) within group (...) on Postgres work.
  • I removed the special ListAgg and ArrayAgg expression variants and subsumed them into Function
  • Removed the redundant AggregateExpressionWithFilter expression variant. This was only needed before because of ArrayAgg / ListAgg being separate expression nodes.
  • On BigQuery, ARRAY_CONCAT_AGG and STRING_AGG support the same set of options as ARRAY_AGG, and we can parse those now.
  • The ALL in SELECT COUNT(ALL column) is now preserved in roundtrips.

This lays the groundwork for adding additional special function argument clauses like bigquery's ANY_VALUE(foo HAVING MIN value).

@coveralls
Copy link

coveralls commented May 1, 2024

Pull Request Test Coverage Report for Build 8943406713

Details

  • 471 of 506 (93.08%) changed or added relevant lines in 14 files are covered.
  • 25 unchanged lines in 6 files lost coverage.
  • Overall coverage decreased (-0.03%) to 89.172%

Changes Missing Coverage Covered Lines Changed/Added Lines %
tests/sqlparser_bigquery.rs 9 10 90.0%
tests/sqlparser_mssql.rs 5 6 83.33%
src/ast/mod.rs 37 39 94.87%
src/parser/mod.rs 87 91 95.6%
tests/sqlparser_common.rs 192 219 87.67%
Files with Coverage Reduction New Missed Lines %
src/dialect/snowflake.rs 1 91.79%
tests/sqlparser_mssql.rs 1 92.76%
src/ast/mod.rs 2 80.78%
src/dialect/mod.rs 3 76.98%
src/parser/mod.rs 4 92.81%
tests/sqlparser_common.rs 14 89.46%
Totals Coverage Status
Change from base Build 8943225733: -0.03%
Covered Lines: 24178
Relevant Lines: 27114

💛 - Coveralls

@jmhain jmhain changed the title Functions rework Improve representation of function calls May 1, 2024
Copy link
Contributor

@iffyio iffyio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Very nice with the new representation, thanks @jmhain!

cc @alamb

@alamb alamb changed the title Improve representation of function calls Consolidate representation of function calls, remove AggregateExpressionWithFilter, ArraySubquery, ListAgg and ArrayAgg May 3, 2024
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👏 -- thank you @jmhain -- this is really nice. While it will be a non trivial change for some downstream crates I think it will be for the better.

Also, thank you for the reviews @iffyio and @universalmind303

🚀

special: false,
order_by: vec![],
}),
call("safe_offset", [Expr::Value(number("2"))]),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️


#[derive(Debug, Clone, PartialEq, PartialOrd, Eq, Ord, Hash)]
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
#[cfg_attr(feature = "visitor", derive(Visit, VisitMut))]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These comments are really lovely. Thank you @jmhain

@alamb alamb merged commit a14faa3 into apache:main May 3, 2024
10 checks passed
@jmhain jmhain deleted the functions-rework branch May 3, 2024 19:07
pub over: Option<WindowType>,
/// aggregate functions may specify eg `COUNT(DISTINCT x)`
pub distinct: bool,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do I get this property now then?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see DuplicateTreatment now. But it can cause downstream to restructure the handling code to adjust these structural changes.

Trying it out ...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let us know if there is something wrong that can't be handled -- I can make another release with a fix if necessary

Copy link
Member

@tisonkun tisonkun May 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to upgrade sqlparser to 0.46.0 in datafusion this morning but found that except my breaking changes on Insert and Delete struct, it includes others like this one and the JsonAccess refactor. So I stop the work until I can have time to understand the changes :P

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me know if there's any way I can help...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jmhain Thanks for offering the help. I just created apache/datafusion#10392 that you can pull locally and run cargo build to see the compilation failure and see if you have the idea to fix it up.

Welcome to leave a comment on that PR or create a PR target to that PR. I'm actively responding.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what I've got so far that I'm reasonably sure is correct without more context on DataFusion: apache/datafusion@261f346

Out of time for now but I'll try to take a second look later tonight at the remaining JSON access changes needed

JichaoS pushed a commit to luabase/sqlparser-rs that referenced this pull request May 7, 2024
…sionWithFilter`, `ArraySubquery`, `ListAgg` and `ArrayAgg` (apache#1247)

Co-authored-by: Andrew Lamb <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants