ENH: Pluggable SQL performance via new SQL engine keyword #40556
I already suggested using Python's entrypoint mechanism for registering additional engines, and I think this would be a good place to leverage it. The idea is that you can implement an engine without contributing anything to pandas, which makes developing an engine much more flexible / fast-paced.

In the library implementing the engine, you would add the following to the setup.py:
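A minimal sketch of what that registration could look like; the entry-point group name "pandas.sql_engine" and the turbodbc_pandas module/class are placeholder assumptions, not names taken from this PR:

```python
# setup.py of the third-party engine library (hypothetical names throughout).
from setuptools import setup

setup(
    name="turbodbc-pandas-engine",
    packages=["turbodbc_pandas"],
    entry_points={
        # "<engine keyword> = <module>:<class implementing the engine>"
        "pandas.sql_engine": [
            "turbodbc = turbodbc_pandas.engine:TurbodbcEngine",
        ],
    },
)
```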
Here you can then load the engines using the entry points mechanism (without importing anything!):
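A sketch of the discovery side, again assuming the hypothetical "pandas.sql_engine" group; discovery only reads installed package metadata, so no engine module is imported until .load() is called:

```python
from importlib.metadata import entry_points  # Python 3.10+ selection API

def discover_sql_engines():
    """Map engine keyword -> entry point for every registered SQL engine."""
    # Iterating the entry points does not import the engine packages.
    return {ep.name: ep for ep in entry_points(group="pandas.sql_engine")}

# Usage: import the selected engine only when it is actually requested.
# engine_cls = discover_sql_engines()["turbodbc"].load()
```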
I don't know... I've never used entrypoints before, and it makes me a little uneasy to use them without understanding them; I'd need to learn some more about it. But we don't do this for any other I/O engines like Parquet or Excel, so why not keep the same approach here? @jreback what do you think?
There could be quite a few engines beyond the built-in ones, e.g. snowflake, turbodbc and postgres; testing all of them in the pandas CI would probably be quite heavy at the end.

ExtensionArrays extend pandas functionality with additions that are experimental and wouldn't be merged into core pandas. They don't need a technique like entrypoints, as you instantiate them explicitly by using the class constructor of the ExtensionArray. This isn't the case for the database engines, as you only specify the engine via a string.

Entrypoint registration also works without importing any pandas code, i.e. you don't need to ensure that you have imported the engine code before any other code uses pandas.
I think we could certainly add this (we already use entry points for plotting), but let's do it as a follow-up (pls create an issue).
Here you could also use the entrypoint mechanism to do the actual engine load.
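For illustration, a hedged sketch of how an engine-lookup helper could fall back to entry points; the get_engine name, the built-in registry, and the "pandas.sql_engine" group are assumptions rather than the PR's actual implementation:

```python
from importlib.metadata import entry_points

# Hypothetical registry of built-in engines, e.g. the SQLAlchemy-based one.
_BUILTIN_ENGINES: dict[str, type] = {}

def get_engine(name: str):
    """Resolve an engine keyword to an engine instance."""
    if name in _BUILTIN_ENGINES:
        return _BUILTIN_ENGINES[name]()
    # Fall back to third-party engines registered via entry points.
    for ep in entry_points(group="pandas.sql_engine"):
        if ep.name == name:
            return ep.load()()
    raise ValueError(f"Unknown SQL engine: {name!r}")
```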