Skip to content

Properly handle mixed implicit and explicit joins #109

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jun 10, 2019

Conversation

benesch
Copy link
Contributor

@benesch benesch commented Jun 8, 2019

Parse a query like

SELECT * FROM a NATURAL JOIN b, c NATURAL JOIN d

as the SQL specification requires, i.e.:

from: [
    TableReference {
        relation: TableFactor::Table("a"),
        joins: [Join {
            relation: TableFactor::Table("b"),
            join_operator: JoinOperator::Natural,
        }]
    },
    TableReference {
        relation: TableFactor::Table("c"),
        joins: [Join {
            relation: TableFactor::Table("d"),
            join_operator: JoinOperator::Natural,
        }]
    }
]

Previously we were parsing such queries as

relation: TableFactor::Table("a"),
joins: [
    Join {
        relation: TableFactor::Table("b"),
        join_operator: JoinOperator::Natural,
    },
    Join {
        relation: TableFactor::Table("c"),
        join_operator: JoinOperator::Implicit,
    },
    Join {
        relation: TableFactor::Table("d"),
        join_operator: JoinOperator::Natural,
    },
]

which did not make the join hierarchy clear.

@benesch benesch requested a review from nickolay June 8, 2019 03:50
@benesch benesch mentioned this pull request Jun 8, 2019
@coveralls
Copy link

coveralls commented Jun 8, 2019

Pull Request Test Coverage Report for Build 330

  • 163 of 174 (93.68%) changed or added relevant lines in 5 files are covered.
  • 3 unchanged lines in 2 files lost coverage.
  • Overall coverage decreased (-0.1%) to 91.892%

Changes Missing Coverage Covered Lines Changed/Added Lines %
src/sqlparser.rs 54 56 96.43%
tests/sqlparser_common.rs 88 97 90.72%
Files with Coverage Reduction New Missed Lines %
src/sqlparser.rs 1 92.76%
tests/sqlparser_common.rs 2 91.18%
Totals Coverage Status
Change from base Build 327: -0.1%
Covered Lines: 4250
Relevant Lines: 4625

💛 - Coveralls

@benesch
Copy link
Contributor Author

benesch commented Jun 8, 2019

@nickolay I realize this is likely to be a bit prickly, as it is a substantial refactoring of a core AST. I wanted to flesh out what it looked like, though, and it didn't turn out to be too bad.

Note that because we now support nested joins, we could alternatively parse FROM a NATURAL JOIN b, c NATURAL JOIN d as:

relation: TableFactor::Table("a"),
joins: [
    Join { relation: TableFactor::Table("b"), operator: Natural },
    Join {
        relation: TableFactor::NestedJoin {
            base: TableFactor::Table("c"),
            joins: [Join { relation: TableFactor::("d"), operator: Natural }]
        },
        operator: Cross
    }
]

But perhaps downstream consumers will want to distinguish between implicit and explicit cross joins.

Copy link
Contributor

@nickolay nickolay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent, thank you! I had some ideas about naming / organization, but feel free to merge in any case.

@@ -197,6 +192,23 @@ impl ToString for SQLSelectItem {
}
}

/// A table reference is one table factor followed by any number of joins.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get that this follows the names of the productions in ANSI SQL syntax, but maybe TableWithJoins?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I find them hard to remember too. I've taken your suggestion for TableWithJoins.

base: Box<TableFactor>,
joins: Vec<Join>,
},
NestedJoin(Box<TableReference>),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something about this whole design wasn't sitting right with me, so I decided to address the underlying issue you brought up in #108, rather than renaming this variant. I've pushed two commits that make it so that NestedJoin always contains a table reference with at least one join. A nice side effect of this work was supporting derived tables with set operations, like SELECT * FROM (((SELECT 1) UNION (SELECT 2)) t1 NATURAL JOIN t2).

src/sqlparser.rs Outdated
group_by,
having,
})
}

pub fn parse_table_reference(&mut self) -> Result<TableReference, ParserError> {
let relation = self.parse_table_factor()?;
let joins = self.parse_joins()?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would parse_joins be useful elsewhere? Perhaps inline it here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, not that I can see. Inlined!

@nickolay
Copy link
Contributor

nickolay commented Jun 9, 2019

Could you please extract the new parse_top_level() test (with the required changes in parse_statement) into its own commit and land it along with the already reviewed changes in this PR? Or land all of it and let me think about it later?

Making two parse attempts bothers me, plus the names from the spec (parse_derived_table_factor/parse_table_reference) are hard to remember; I'll need some time to let it sink in.

@benesch benesch force-pushed the implicit-join-fix branch 2 times, most recently from 7210715 to 94c23f3 Compare June 10, 2019 15:05
@benesch
Copy link
Contributor Author

benesch commented Jun 10, 2019

Could you please extract the new parse_top_level() test (with the required changes in parse_statement) into its own commit and land it along with the already reviewed changes in this PR?

Yep, working on that now. I'll open a new PR for the remaining changes as soon as this lands.

benesch added 2 commits June 10, 2019 11:11
Parse a query like

    SELECT * FROM a NATURAL JOIN b, c NATURAL JOIN d

as the SQL specification requires, i.e.:

    from: [
        TableReference {
            relation: TableFactor::Table("a"),
            joins: [Join {
                relation: TableFactor::Table("b"),
                join_operator: JoinOperator::Natural,
            }]
        },
        TableReference {
            relation: TableFactor::Table("c"),
            joins: [Join {
                relation: TableFactor::Table("d"),
                join_operator: JoinOperator::Natural,
            }]
        }
    ]

Previously we were parsing such queries as

    relation: TableFactor::Table("a"),
    joins: [
        Join {
            relation: TableFactor::Table("b"),
            join_operator: JoinOperator::Natural,
        },
        Join {
            relation: TableFactor::Table("c"),
            join_operator: JoinOperator::Implicit,
        },
        Join {
            relation: TableFactor::Table("d"),
            join_operator: JoinOperator::Natural,
        },
    ]

which did not make the join hierarchy clear.
@benesch benesch force-pushed the implicit-join-fix branch from 94c23f3 to ce171c2 Compare June 10, 2019 15:11
@benesch benesch merged commit 5896f01 into apache:master Jun 10, 2019
@benesch benesch deleted the implicit-join-fix branch June 10, 2019 15:17
@benesch benesch mentioned this pull request Jun 10, 2019
nickolay added a commit to nickolay/sqlparser-rs that referenced this pull request Jun 13, 2019
nickolay added a commit to nickolay/sqlparser-rs that referenced this pull request Jun 13, 2019
This is a follow-up to apache#109 which moved the handling of Token::Comma
from parse_table_and_joins to the `FROM` parser.
@nickolay nickolay mentioned this pull request Apr 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants