Consolidate MapAccess, and Subscript into CompoundExpr to handle the complex field access chain #1551

Merged
merged 34 commits into from
Dec 22, 2024
Changes from 15 commits
Commits
34 commits
192cab4  v1 tmp (goldmedal, Nov 20, 2024)
6be3c35  remove MapAccess (goldmedal, Nov 25, 2024)
0e916dd  fix fmt (goldmedal, Nov 25, 2024)
767b531  remove debug message (goldmedal, Nov 25, 2024)
22f4e67  Merge branch 'main' into feature/1533-dereference-expr-v3 (goldmedal, Nov 27, 2024)
dee8b40  fix span test (goldmedal, Nov 27, 2024)
fc1cd59  introduce CompoundExpr (goldmedal, Dec 4, 2024)
8590896  Merge branch 'main' into feature/1533-dereference-expr-v3 (goldmedal, Dec 4, 2024)
4ad37d8  fix merge conflict (goldmedal, Dec 4, 2024)
31a1e74  replace subscript with compound expr (goldmedal, Dec 4, 2024)
0355290  fix snowflake syntax (goldmedal, Dec 4, 2024)
1de9b21  limit the access chain supported dialect (goldmedal, Dec 4, 2024)
2a32b9f  fmt (goldmedal, Dec 4, 2024)
495d1b3  enhance doc and fix the name (goldmedal, Dec 4, 2024)
e7b55be  fix typo (goldmedal, Dec 4, 2024)
6652905  Merge branch 'main' into feature/1533-dereference-expr-v3 (goldmedal, Dec 9, 2024)
47a5da1  update doc (goldmedal, Dec 9, 2024)
b58e50c  update doc and rename AccessExpr (goldmedal, Dec 9, 2024)
7cb2e00  remove unused crate (goldmedal, Dec 9, 2024)
397335a  update the out date doc (goldmedal, Dec 9, 2024)
ac25e5d  remove unused parsing (goldmedal, Dec 9, 2024)
a08e5c2  rename to `CompoundFieldAccess` (goldmedal, Dec 9, 2024)
09b39eb  rename chain and display AccessExpr by itself (goldmedal, Dec 9, 2024)
8968fcc  rename `parse_compound_expr` (goldmedal, Dec 9, 2024)
1328274  fmt and clippy (goldmedal, Dec 9, 2024)
d6743e9  fix doc (goldmedal, Dec 9, 2024)
7d030c1  remove unnecessary check (goldmedal, Dec 16, 2024)
90e03eb  improve the doc (goldmedal, Dec 16, 2024)
5c54d1b  remove the unused method `parse_map_access` (goldmedal, Dec 16, 2024)
57830e2  avoid the unnecessary cloning (goldmedal, Dec 16, 2024)
4b3818c  extract parse outer_join_expr (goldmedal, Dec 16, 2024)
67cd877  consume LBarcket by `parse_multi_dim_subscript` (goldmedal, Dec 16, 2024)
23aea03  Merge branch 'main' into feature/1533-dereference-expr-v3 (goldmedal, Dec 16, 2024)
94847d7  fix compile (goldmedal, Dec 16, 2024)
91 changes: 35 additions & 56 deletions src/ast/mod.rs
@@ -454,40 +454,6 @@ pub enum CastFormat {
ValueAtTimeZone(Value, Value),
}

/// Represents the syntax/style used in a map access.
#[derive(Debug, Clone, PartialEq, PartialOrd, Eq, Ord, Hash)]
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
#[cfg_attr(feature = "visitor", derive(Visit, VisitMut))]
pub enum MapAccessSyntax {
/// Access using bracket notation. `mymap[mykey]`
Bracket,
/// Access using period notation. `mymap.mykey`
Period,
}

/// Expression used to access a value in a nested structure.
///
/// Example: `SAFE_OFFSET(0)` in
/// ```sql
/// SELECT mymap[SAFE_OFFSET(0)];
/// ```
#[derive(Debug, Clone, PartialEq, PartialOrd, Eq, Ord, Hash)]
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
#[cfg_attr(feature = "visitor", derive(Visit, VisitMut))]
pub struct MapAccessKey {
pub key: Expr,
pub syntax: MapAccessSyntax,
}

impl fmt::Display for MapAccessKey {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
match self.syntax {
MapAccessSyntax::Bracket => write!(f, "[{}]", self.key),
MapAccessSyntax::Period => write!(f, ".{}", self.key),
}
}
}

/// An element of a JSON path.
#[derive(Debug, Clone, PartialEq, PartialOrd, Eq, Ord, Hash)]
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
@@ -624,6 +590,12 @@ pub enum Expr {
Identifier(Ident),
/// Multi-part identifier, e.g. `table_alias.column` or `schema.table.col`
CompoundIdentifier(Vec<Ident>),
/// Multi-part expression access. It represents an access chain starting from a root expression,
/// e.g. `expr[0]`, `expr[0][0]`, or `expr.field1.field2[1].field3`, ...
CompoundExpr {
root: Box<Expr>,
chain: Vec<AccessField>,
},
/// Access data nested in a value containing semi-structured data, such as
/// the `VARIANT` type on Snowflake. for example `src:customer[0].name`.
///
@@ -877,14 +849,6 @@ pub enum Expr {
data_type: DataType,
value: String,
},
/// Access a map-like object by field (e.g. `column['field']` or `column[4]`
/// Note that depending on the dialect, struct like accesses may be
/// parsed as [`Subscript`](Self::Subscript) or [`MapAccess`](Self::MapAccess)
/// <https://clickhouse.com/docs/en/sql-reference/data-types/map/>
MapAccess {
column: Box<Expr>,
keys: Vec<MapAccessKey>,
},
/// Scalar function call e.g. `LEFT(foo, 5)`
Function(Function),
/// Arbitrary expr method call
@@ -973,11 +937,6 @@ pub enum Expr {
/// ```
/// [1]: https://duckdb.org/docs/sql/data_types/map#creating-maps
Map(Map),
/// An access of nested data using subscript syntax, for example `array[2]`.
Subscript {
expr: Box<Expr>,
subscript: Box<Subscript>,
},
/// An array expression e.g. `ARRAY[1, 2]`
Array(Array),
/// An interval expression e.g. `INTERVAL '1' YEAR`
@@ -1094,6 +1053,25 @@ impl fmt::Display for Subscript {
}
}

/// The contents inside the `.` in an access chain.
/// It can be an expression or a subscript.
#[derive(Debug, Clone, PartialEq, PartialOrd, Eq, Ord, Hash)]
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
#[cfg_attr(feature = "visitor", derive(Visit, VisitMut))]
pub enum AccessField {
Expr(Expr),
Subscript(Subscript),
}

impl fmt::Display for AccessField {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
match self {
AccessField::Expr(expr) => write!(f, "{}", expr),
AccessField::Subscript(subscript) => write!(f, "{}", subscript),
}
}
}

/// A lambda function.
#[derive(Debug, Clone, PartialEq, PartialOrd, Eq, Ord, Hash)]
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
@@ -1289,12 +1267,19 @@ impl fmt::Display for Expr {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
match self {
Expr::Identifier(s) => write!(f, "{s}"),
Expr::MapAccess { column, keys } => {
write!(f, "{column}{}", display_separated(keys, ""))
}
Expr::Wildcard(_) => f.write_str("*"),
Expr::QualifiedWildcard(prefix, _) => write!(f, "{}.*", prefix),
Expr::CompoundIdentifier(s) => write!(f, "{}", display_separated(s, ".")),
Expr::CompoundExpr { root, chain } => {
write!(f, "{}", root)?;
for field in chain {
match field {
AccessField::Expr(expr) => write!(f, ".{}", expr)?,
AccessField::Subscript(subscript) => write!(f, "[{}]", subscript)?,
}
}
Ok(())
}
Expr::IsTrue(ast) => write!(f, "{ast} IS TRUE"),
Expr::IsNotTrue(ast) => write!(f, "{ast} IS NOT TRUE"),
Expr::IsFalse(ast) => write!(f, "{ast} IS FALSE"),
@@ -1714,12 +1699,6 @@ impl fmt::Display for Expr {
Expr::Map(map) => {
write!(f, "{map}")
}
Expr::Subscript {
expr,
subscript: key,
} => {
write!(f, "{expr}[{key}]")
}
Expr::Array(set) => {
write!(f, "{set}")
}
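
To make the new shape concrete, here is a minimal, self-contained sketch of how a `CompoundExpr` holds a root plus a chain of `AccessField` entries and how the `Display` logic in this diff renders it. The types below are simplified stand-ins written for illustration, not the crate's real definitions: the real `Expr` and `Subscript` have many more variants, and `sample_chain` is a hypothetical helper.

```rust
use std::fmt;

// Simplified stand-ins for the AST types touched by this diff (assumption:
// only the variants needed for the illustration are modelled here).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Identifier(String),
    CompoundExpr { root: Box<Expr>, chain: Vec<AccessField> },
}

#[derive(Debug, Clone, PartialEq)]
enum Subscript {
    Index(i64),
}

#[derive(Debug, Clone, PartialEq)]
enum AccessField {
    Expr(Expr),
    Subscript(Subscript),
}

impl fmt::Display for Subscript {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Subscript::Index(i) => write!(f, "{i}"),
        }
    }
}

impl fmt::Display for Expr {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Expr::Identifier(name) => write!(f, "{name}"),
            // Same structure as the `CompoundExpr` arm in the diff: print the
            // root, then `.field` or `[subscript]` for each chain element.
            Expr::CompoundExpr { root, chain } => {
                write!(f, "{root}")?;
                for field in chain {
                    match field {
                        AccessField::Expr(expr) => write!(f, ".{expr}")?,
                        AccessField::Subscript(s) => write!(f, "[{s}]")?,
                    }
                }
                Ok(())
            }
        }
    }
}

/// Builds the chain for `a.b[1].c` (hypothetical helper for illustration).
fn sample_chain() -> Expr {
    Expr::CompoundExpr {
        root: Box::new(Expr::Identifier("a".to_string())),
        chain: vec![
            AccessField::Expr(Expr::Identifier("b".to_string())),
            AccessField::Subscript(Subscript::Index(1)),
            AccessField::Expr(Expr::Identifier("c".to_string())),
        ],
    }
}
```

Calling `sample_chain().to_string()` in this sketch yields `a.b[1].c`. A single flat chain like this covers what previously took nested `MapAccess` and `Subscript` nodes.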
18 changes: 13 additions & 5 deletions src/ast/spans.rs
@@ -3,7 +3,7 @@ use core::iter;
use crate::tokenizer::Span;

use super::{
AlterColumnOperation, AlterIndexOperation, AlterTableOperation, Array, Assignment,
AccessField, AlterColumnOperation, AlterIndexOperation, AlterTableOperation, Array, Assignment,
AssignmentTarget, CloseCursor, ClusteredIndex, ColumnDef, ColumnOption, ColumnOptionDef,
ConflictTarget, ConnectBy, ConstraintCharacteristics, CopySource, CreateIndex, CreateTable,
CreateTableOptions, Cte, Delete, DoUpdate, ExceptSelectItem, ExcludeSelectItem, Expr,
@@ -1233,6 +1233,9 @@ impl Spanned for Expr {
Expr::Identifier(ident) => ident.span,
Expr::CompoundIdentifier(vec) => union_spans(vec.iter().map(|i| i.span)),
Expr::CompositeAccess { expr, key } => expr.span().union(&key.span),
Expr::CompoundExpr { root, chain } => {
union_spans(iter::once(root.span()).chain(chain.iter().map(|i| i.span())))
}
Expr::IsFalse(expr) => expr.span(),
Expr::IsNotFalse(expr) => expr.span(),
Expr::IsTrue(expr) => expr.span(),
@@ -1307,9 +1310,6 @@ impl Spanned for Expr {
Expr::Nested(expr) => expr.span(),
Expr::Value(value) => value.span(),
Expr::TypedString { .. } => Span::empty(),
Expr::MapAccess { column, keys } => column
.span()
.union(&union_spans(keys.iter().map(|i| i.key.span()))),
Expr::Function(function) => function.span(),
Expr::GroupingSets(vec) => {
union_spans(vec.iter().flat_map(|i| i.iter().map(|k| k.span())))
@@ -1405,7 +1405,6 @@ impl Spanned for Expr {
Expr::Named { .. } => Span::empty(),
Expr::Dictionary(_) => Span::empty(),
Expr::Map(_) => Span::empty(),
Expr::Subscript { expr, subscript } => expr.span().union(&subscript.span()),
Expr::Interval(interval) => interval.value.span(),
Expr::Wildcard(token) => token.0.span,
Expr::QualifiedWildcard(object_name, token) => union_spans(
@@ -1444,6 +1443,15 @@ impl Spanned for Subscript {
}
}

impl Spanned for AccessField {
fn span(&self) -> Span {
match self {
AccessField::Expr(ident) => ident.span(),
AccessField::Subscript(subscript) => subscript.span(),
}
}
}

impl Spanned for ObjectName {
fn span(&self) -> Span {
let ObjectName(segments) = self;
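
The span of a `CompoundExpr` in the hunks above is the union of the root's span and the span of every chain element. The following is a rough sketch of that fold, under loud assumptions: `Span` here is a simplified pair of character offsets rather than the tokenizer's real type, this `union_spans` returns an `Option` instead of falling back to an empty span, and `compound_span` is a hypothetical name.

```rust
use std::iter;

/// Simplified stand-in for the tokenizer's `Span` (assumption: plain
/// character offsets instead of line/column locations).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Span {
    start: usize,
    end: usize,
}

impl Span {
    /// Smallest span covering both `self` and `other`.
    fn union(&self, other: &Span) -> Span {
        Span {
            start: self.start.min(other.start),
            end: self.end.max(other.end),
        }
    }
}

/// Folds any number of spans into one; `None` when the iterator is empty.
fn union_spans<I: Iterator<Item = Span>>(iter: I) -> Option<Span> {
    iter.reduce(|acc, item| acc.union(&item))
}

/// Span of a root expression plus every element of its access chain,
/// mirroring the `iter::once(root).chain(chain)` shape used in the diff.
fn compound_span(root: Span, chain: &[Span]) -> Option<Span> {
    union_spans(iter::once(root).chain(chain.iter().copied()))
}
```

Because the root is always present, `compound_span` never sees an empty iterator in this sketch, so the result covers the whole access chain from the start of the root to the end of the last element.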
4 changes: 4 additions & 0 deletions src/dialect/snowflake.rs
@@ -234,6 +234,10 @@ impl Dialect for SnowflakeDialect {
RESERVED_FOR_IDENTIFIER.contains(&kw)
}
}

fn supports_partiql(&self) -> bool {
true
}
Comment on lines +238 to +240
Contributor

Ah, I don't think this is necessarily correct, since PartiQL is a Redshift feature. Was this required somewhere?

Contributor Author

I tried to integrate the conditions at

            } else if dialect_of!(self is SnowflakeDialect) || self.dialect.supports_partiql() {
                self.prev_token();
                self.parse_json_access(expr)

Then we only need to check supports_partiql in parse_compound_expr. 🤔

                    if self.consume_token(&Token::LBracket) {
                        if self.dialect.supports_partiql() {
                            ending_lbracket = true;
                            break;
                        } else {
                            self.parse_multi_dim_subscript(&mut chain)?
                        }
                    }

Indeed, the name is a little weird for Snowflake, but I think they mean the same thing 🤔
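
The control flow discussed in this thread can be sketched in isolation. The following is a heavily simplified model, not the crate's parser: `Token`, `Access`, and `parse_chain` are hypothetical names, subscripts are limited to a single number, and, unlike the quoted snippet (which consumes the `[` and lets the caller rewind with `prev_token`), this version simply leaves the `[` unconsumed when the dialect flag is set.

```rust
/// Hypothetical, heavily simplified token type (not the crate's `Token`).
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Period,
    LBracket,
    RBracket,
    Word(String),
    Number(i64),
}

/// Simplified chain element (stand-in for the PR's `AccessField`).
#[derive(Debug, Clone, PartialEq)]
enum Access {
    Field(String),
    Index(i64),
}

/// Consumes `.field` and `[n]` accesses starting at `pos`.
/// Returns the chain, the new position, and whether parsing stopped at a `[`
/// that should be handed to a separate JSON-path parser.
fn parse_chain(
    tokens: &[Token],
    mut pos: usize,
    supports_partiql: bool,
) -> Result<(Vec<Access>, usize, bool), String> {
    let mut chain = Vec::new();
    let mut ending_lbracket = false;
    while pos < tokens.len() {
        match &tokens[pos] {
            Token::Period => match tokens.get(pos + 1) {
                Some(Token::Word(w)) => {
                    chain.push(Access::Field(w.clone()));
                    pos += 2;
                }
                other => return Err(format!("expected field name, found {other:?}")),
            },
            // Dialects with the flag set stop here and leave `[` for the caller.
            Token::LBracket if supports_partiql => {
                ending_lbracket = true;
                break;
            }
            // Other dialects parse the subscript as part of the chain.
            Token::LBracket => match (tokens.get(pos + 1), tokens.get(pos + 2)) {
                (Some(Token::Number(n)), Some(Token::RBracket)) => {
                    chain.push(Access::Index(*n));
                    pos += 3;
                }
                _ => return Err("expected `[<number>]`".to_string()),
            },
            _ => break,
        }
    }
    Ok((chain, pos, ending_lbracket))
}
```

With the flag off, `.b[1]` becomes a two-element chain. With it on, the chain stops after `.b` and the returned flag tells the caller to continue with its JSON-access path, which is why a single dialect check inside the chain parser is enough in this model.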

}

/// Parse snowflake create table statement.
1 change: 1 addition & 0 deletions src/lib.rs
@@ -138,6 +138,7 @@ extern crate alloc;
#[macro_use]
#[cfg(test)]
extern crate pretty_assertions;
extern crate core;

pub mod ast;
#[macro_use]