-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
ENH: Pluggable SQL performance via new SQL engine
keyword
#40556
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
update from master
weird, looks like CI only has one error but it's totally unrelated - something about IPython code blocks in the docs, but it's in the Parquet section, nothing to do with this PR. I can't reproduce the error locally. (base) root@eda624c2c1ba:/workspaces/pandas-josh/doc# grep -B10 \"^<<<-------------------------------------------------------------------------$\" sphinx.log
(base) root@eda624c2c1ba:/workspaces/pandas-josh/doc# |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks reasonable. some changes, would be ok with merging this. then if you could refactor pandas/io/sqlpy to make a sub-module e.g.
pandas/io/sql/engine, .....etc (similar in concept to how we do pandas/io/excel).
could also do that as a pre-cursor here.
@pandas-dev/pandas-core if any comments on the structure. |
Made all requested changes except the test, as noted above |
Weird CI failure, unrelated to my changes - something to do with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
couple of comments, pls merge master.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm ping on green
ok will do, thanks! |
IIRC we had an xfail in place for this partcular test. go ahead an dput it back and can be tackled there. |
@jreback This is not a flaky test. This is failing because we are not raising the correct error message. See #34431. Xfailed case is also xfailed here I think and is not the problem. EDIT: Please disregard second half of comment. Accidentally copy pasted wrong thing. |
Great. So I catch these 2?
Basically, just add the prefix |
@yehoshuadimarsky The last one to catch is OperationalError: (1054, "Unknown column 'inf' in 'field list'") Sorry if I was unclear before. You can also test locally by running the first two cases in the OP in #34431.
Both cases need to raise Lines 1595 to 1599 in 49c7791
EDIT: Was actually sqlalchemy.exc.ProgrammingError: (pymysql.err.ProgrammingError) inf can not be used with MySQL and I was wrong here. See my later review/suggestion. |
For the life of me can't figure this out: https://stackoverflow.com/questions/67358166 |
@lithomas1 shoot I still can't get the CI to pass, any suggestions? |
Co-authored-by: Thomas Li <[email protected]>
@lithomas1 nope still failed CI |
Co-authored-by: Thomas Li <[email protected]>
@yehoshuadimarsky Yay, looks like the CI fix worked finally. I think the last thing to do is to evaluate @xhochy 's proposal. I'm not a core team member so I'll leave this to @jreback and @charlesdong1991. |
Thanks for all your help! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @yehoshuadimarsky very nice (comment for followup)
|
||
with cf.config_prefix("io.sql"): | ||
cf.register_option( | ||
"engine", | ||
"auto", | ||
sql_engine_doc, | ||
validator=is_one_of_factory(["auto", "sqlalchemy"]), | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could certainly add this (we already use entry points for plotting), but let's do as a followup (pls create an issue)
thanks @yehoshuadimarsky very nice! |
@yehoshuadimarsky can you do a followup to add a whatsnew note / update io.rst if needed anywhere. |
Yes, just for this change? There's not much here that changed to the external user, it's just refactoring to support more future engines. |
yeah just a quick one for the addition of the option (even though it doesn't do anything different yet) |
This is the first step in implementing an
engine
backend to the SQL I/O functions, as discussed in #36893. This PR is simply to refactor the existing SQLAlchemy parts to make them extensible for an external engine.No new engines were added at this stage. No new tests were added, only verified that existing tests fail ("test behavior, not implementation" suggests that we don't need to test internal refactoring)
Lots of the code was borrowed from the Parquet I/O implementation.