Skip to content

Make Parser & Tokenizer generic over Dialect, remove test_utils module #407

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

not-my-profile
Copy link

  • rustc monomorphizes generics which is more efficient than
    trait object methods which are dispatched dynamically

  • the Parser struct no longer has a lifetime, making it overall
    more convenient (e.g it now is Send, Sync & UnwindSafe)

For details see the commit messages.

If you like these changes, please merge them without squashing.

Previously the test_utils were part of the public API (albeit their
documentation was hidden), so that they could be used by a single unit
test. Since the unit test only used public APIs, we can just move it
to the integration tests, allowing us to outsource the test_utils.
This is a preparation step for the introduction
of generics in the next commit.
@alamb alamb changed the title Make Parser & Tokenizer generic over Dialect Make Parser & Tokenizer generic over Dialect, remove test_utils module Feb 5, 2022
@coveralls
Copy link

coveralls commented Feb 5, 2022

Pull Request Test Coverage Report for Build 1799792471

  • 140 of 143 (97.9%) changed or added relevant lines in 22 files are covered.
  • 6 unchanged lines in 2 files lost coverage.
  • Overall coverage decreased (-0.02%) to 90.301%

Changes Missing Coverage Covered Lines Changed/Added Lines %
tests/sqlparser_common.rs 24 25 96.0%
tests/test_utils.rs 20 22 90.91%
Files with Coverage Reduction New Missed Lines %
src/tokenizer.rs 1 90.28%
src/parser.rs 5 84.19%
Totals Coverage Status
Change from base Build 1799434596: -0.02%
Covered Lines: 6750
Relevant Lines: 7475

💛 - Coveralls

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @not-my-profile -- this is a very nice contribution

I had a few questions related to test changes and the use of PhantomData but otherwise this looks like a nice change to me, even though it is backwards incompatible.

cc @Dandandan @houqp @andygrove and @nickolay

// This is required to make utilities accessible by both the crate-internal
// unit-tests and by the integration tests <https://stackoverflow.com/a/44541071/1026>
// External users are not supposed to rely on this module.
pub mod test_utils;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a reasonable change, though it may cause other users of this crate upgrade pain 🤔

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was #[doc(hidden)] and documented that "External users are not supposed to rely on this module." So I really don't see the problem.

//! let sql = "SELECT a, b, 123, myfunc(b) \
//! FROM table_1 \
//! WHERE a > b AND b < 100 \
//! ORDER BY a DESC, b";
//!
//! let ast = Parser::parse_sql(&dialect, sql).unwrap();
//! let ast = Parser::<GenericDialect>::parse_sql(sql).unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
//! let ast = Parser::<GenericDialect>::parse_sql(sql).unwrap();
//! // Parse using a generic dialect of SQL.
//! // Other dialects such as `AnsiDialect` are available
//! let ast = Parser::<GenericDialect>::parse_sql(sql).unwrap();

Copy link
Author

@not-my-profile not-my-profile Feb 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to improve the documentation in a followup PR. This PR is already big enough IMO.

tokens: Vec<Token>,
/// The index of the first unprocessed token in `self.tokens`
index: usize,
dialect: &'a dyn Dialect,
dialect: PhantomData<D>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that the parser is generic over Dialect do we need a dialect field at all ?

It seems like we could remove dialect entirely

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rustc requires that generic type parameters of structs are actually used or it fails with a compile error telling you to use PhantomData. Note that PhantomData consumes no space it's just used by the compiler for the static analysis, see core::marker::PhantomData and PhantomData in the Nomicon.

@@ -3629,30 +3630,3 @@ impl Word {
}
}
}

#[cfg(test)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was not clear to me initially, but this test was moved below

};
}

pub trait Parse {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are using this trait so you can have a dyn Parse and dispatch dynamically at runtime what dialect is being used. This is likely something other users of the crate might want / might be using (dynamically pick a dialect).

What would you think about putting this trait in the main crate (or perhaps even calling it Parser and changing the struct to ParserImpl)?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I highly doubt that any API users actually need Parser trait objects. Note that with generics it's still possible to "dynamically" choose which dialect is being used at runtime, you don't need trait objects for that. As an example for that see the updated examples/cli.rs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an example for that see the updated examples/cli.rs.

Indeed -- I think in the case of examples/cli.rs the distinction is that the call to parse<D> is now also generic (the generics perolate back up)

@alamb
Copy link
Contributor

alamb commented Feb 5, 2022

Looks like there are some CI failures as well

* rustc monomorphizes generics which is more efficient than
  trait object methods which are dispatched dynamically

* the Parser struct no longer has a lifetime, making it overall
  more convenient (e.g it now is Send, Sync & UnwindSafe)

* to keep the tests working a Parse trait is introduced
  (however only within the tests, it's not part of the public API)
@not-my-profile
Copy link
Author

Hi Andrew, thanks for the review! I fixed the CI failure (it was just caused by std::marker::PhantomData not being available with no_std, I changed it to core::marker::PhantomData) and answered your comments :)

@alamb
Copy link
Contributor

alamb commented Feb 8, 2022

Sadly I don't have the time to help drive projects such as this. Unless one of the other maintainers wants to do so, PRs like this are not likely to be merged. I am sorry I don't have the time and I am sorry if your time was wasted.

As the current state of affairs might not have been clear in the project's README. I have added #416 to try and help

@alamb
Copy link
Contributor

alamb commented Apr 29, 2022

Thanks again for this PR. I am closing it for now to clean up the PR list. Please reopen if that was a mistake.

@alamb alamb closed this Apr 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants