-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
ENH: support both SQLAlchemy engines and connections #7877
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Is it possible to provide this engine yourself by doing the following in the Because then it is the question: do we ask the user to do this themselves, or do we provide the convenience (but also clutter in interface) to also accept a connection? |
Can this issue be closed? I agree w/ jorisvandenbossche that a change to pandas isn't necessary--you can easily access the .engine property from a SQLAlchemy connection object:
|
You can indeed easily access the engine, but I don't really know how then the atomicity is handled (the reason to use a transaction in this case). Accessing the engine and using that, within the |
It's already done as a transaction: https://github.com/pydata/pandas/blob/master/pandas/io/sql.py#L633-L644 |
yes, but that is for one |
I tested it and running a statement as conn.engine.execute(sql) does not run it within the transaction (in other words, the statement would not be rolledback as needed) One possibility would be to a data.to_sql(table.name, engine, connection=connection, if_exists='append')
|
That would work. Still feels like it might be easier to pass either a |
@artemyk Are you sure that providing a |
@jorisvandenbossche The connection contextmanager (i.e. |
@artemyk, what is the advantage of using a sqlalchemy engine instead of a connection? Going through the io/sql.py code, we really only use In fact, it seems to make more sense for the scope of pandas to operate at the connection level only. That way the user can have full control over sqlalchemy connection issues such as isolation levels, nested transactions/savepoints, etc (see sqlalchemy engines and connections). Maybe we could default to use connections and create our own connection in case the user passes an engine? That way this would backwards compatible, but still allow for more advanced usage. |
@tr11 that sounds like a good idea to me. @jorisvandenbossche thoughts? |
OK, I really needed this right now, so I just hacked a quick version (https://github.com/tr11/pandas/commit/fe6fee394f78a908e74cf3675668a4475db3c616) that passes all the tests with engines or connections. I will add standard tests that use connection parameters and a couple other ones that show the new things that can be done with transaction rollbacks, using temporary tables, etc. If there is any interest in merging this, I can also update the documentation before submitting a pull request. |
I've updated the PR with these changes, and added tests |
update pymysql to 0.6.3 to avoid cursor bugs Add documentation and tests for SQLAlchemy connectables explicit reference to connection/engine in docs Temporary table test pass compile the connectable
Feature request for supporting also Connection objects instead of only Engines.
Also reported in #8533, #10104
My use case - I often add a week's worth of data (stored in a pandas dataframe) to an existing table. Before I add the data, I want to be able to delete any existing data for this week from the table, but I want the whole thing to be wrapped in a transaction so I'm not left in a mess halfway through.
Here's what I therefore want to be able to do
i.e. my change is that I want to be able to pass a SQLAlchemy
Connection
object intoDataFrame.to_sql
instead of anEngine
. (The underlyingEngine
of theConnection
object can be accessed as.engine
, so we should still be able to pull metadata etc)The text was updated successfully, but these errors were encountered: