ENH: Add table prefixes to to_sql method #60409

Diadochokinetic · 2024-11-24T20:45:03Z

closes ENH: Add table prefixes to to_sql method #60422
Tests added and passed if fixing a bug or adding a new feature
All code checks passed.
Added type annotations to new arguments/methods/functions.
Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

This PR adds a prefixes parameter to the to_sql method and passes it down to the underlying sqlalchemy.Table class. Because temporary tables are not stored in a database's metadata, it also adds the temporary-table-specific methods _exists_temporary and _drop_temporary_table.

Unfortunately, temporary tables don't work with null pooling, so I had to create additional sqlalchemy fixtures with the default pooling.
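The two observations above (temporary tables are missing from the regular catalog, and they do not survive a switch to a different connection) can be reproduced with a small sketch. It uses only the standard-library sqlite3 module as a stand-in, so it shows SQLite's behavior only; other DBMS may differ, and the function name temp_table_visibility is made up for this illustration. The second connection plays the role of the fresh connection that an unpooled setup hands out on each checkout.

```python
import os
import sqlite3
import tempfile


def temp_table_visibility():
    """Report where a SQLite temporary table can and cannot be seen."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "demo.db")
        first = sqlite3.connect(path)
        # Stands in for a fresh connection, as handed out without pooling.
        second = sqlite3.connect(path)
        try:
            first.execute("CREATE TEMPORARY TABLE scratch (x INTEGER)")
            count = "SELECT count(*) FROM {} WHERE name = 'scratch'"
            # The regular catalog does not list the temporary table ...
            in_main_catalog = first.execute(count.format("sqlite_master")).fetchone()[0]
            # ... but the connection-local temporary catalog does.
            in_temp_catalog = first.execute(count.format("sqlite_temp_master")).fetchone()[0]
            try:
                second.execute("SELECT * FROM scratch")
                visible_to_second = True
            except sqlite3.OperationalError:
                visible_to_second = False
        finally:
            first.close()
            second.close()
    return in_main_catalog, in_temp_catalog, visible_to_second


result = temp_table_visibility()
print(result)
```

If every statement ran on a new connection, the table created by one call would already be gone by the next, which is consistent with the need for fixtures that keep a connection alive.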

TODO:

case sensitivity check always warns: SQLDatabase.check_case_sensitive checks the database's metadata. It will never find a temporary table and will always emit a warning. - I disabled case sensitivity checks for temporary tables.

WillAyd · 2024-11-27T00:41:53Z

Thanks for the PR, but this seems to only work with SQLAlchemy, meaning we won't get the same behavior for our ADBC drivers. I don't think this is worth pursuing unless there is a way to avoid fragmenting features across those two different areas

Diadochokinetic · 2024-11-27T06:33:40Z

I took a look at the ADBC implementation. to_sql uses adbc_ingest, which supports an (experimental) parameter temporary. I'll give it a shot, although this would only support the use case of temporary tables and not table prefixes in general. A more general solution could be to insert the prefixes into table_name. I will report back once I have experimented with it.

Diadochokinetic · 2024-11-27T15:19:06Z

I did some testing and was able to implement the feature partially for adbc drivers, using the temporary parameter of the adbc_ingest method. On the one hand, I don't like that it doesn't implement the full capability of the prefixes parameter as it's done in sqlalchemy. On the other hand, it enables creating and appending to temporary tables with the to_sql method for both sqlalchemy and adbc drivers, which is the main reason I started working on this feature in the first place. Furthermore, I assume temporary tables will be >99% of the use cases for prefixes anyway.

Now the question is:

Should the prefixes parameter stay as proposed? Then the documentation needs to be expanded to state explicitly that only ["TEMPORARY"] has an effect for adbc drivers, and other values should perhaps raise a NotImplementedError.
Or should the parameter be reduced to a boolean temporary, so that it works exactly the same for both drivers but loses functionality for sqlalchemy?

I personally prefer the first option, because I connect to db2 databases via sqlalchemy and sometimes need to pass prefixes=["GLOBAL", "TEMPORARY"].
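To make the role of the prefixes concrete, here is a minimal sketch of how a list such as ["GLOBAL", "TEMPORARY"] ends up between CREATE and TABLE in the generated DDL. The helper render_create_table is hypothetical and only mimics the idea; it is not the code SQLAlchemy or this PR uses, and it does no identifier quoting. The plain TEMPORARY variant is executed against an in-memory SQLite database to check that the statement is accepted there.

```python
import sqlite3


def render_create_table(name, columns, prefixes=None):
    """Hypothetical helper: place the prefixes between CREATE and TABLE."""
    words = ["CREATE", *(prefixes or []), "TABLE", name]
    body = ", ".join(f"{column} {sql_type}" for column, sql_type in columns.items())
    return f"{' '.join(words)} ({body})"


# The db2-style prefix list mentioned above; only rendered, not executed.
ddl_global = render_create_table("scratch", {"id": "INTEGER"}, ["GLOBAL", "TEMPORARY"])

# The plain temporary case, which SQLite accepts.
ddl_temp = render_create_table("scratch", {"id": "INTEGER", "label": "TEXT"}, ["TEMPORARY"])
conn = sqlite3.connect(":memory:")
conn.execute(ddl_temp)
conn.execute("INSERT INTO scratch VALUES (1, 'a')")
row_count = conn.execute("SELECT count(*) FROM scratch").fetchone()[0]
conn.close()

print(ddl_global)
print(ddl_temp)
```

A boolean temporary flag can only express the second statement, which is the functionality the second option would give up.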

WillAyd · 2024-12-02T19:05:00Z

Can you check if there is an open issue for this upstream in the arrow-adbc repository? I think it may be of interest to them to handle more than just the temporary case.

ADBC releases are pretty quick, so if that is implemented upstream it benefits the entire ecosystem, and likely wouldn't be a long turnaround to get into pandas

Diadochokinetic · 2024-12-02T20:07:44Z

That is a very good point. It would be very beneficial if both sqlalchemy and adbc supported prefixes. So far, there is no open issue for that, so I opened one: apache/arrow-adbc#2343. Let's see what the adbc team thinks about this.

Diadochokinetic · 2024-12-14T10:55:29Z

It looks like the adbc team is hesitant to implement an abstract parameter prefixes as it is available in sqlalchemy. So the option to wait for feature parity in both back ends seems to be unavailable. I prefer to implement prefixes in to_sql() to:

have the full feature for sqlalchemy back-ends
only use the TEMPORARY keyword for adbc back-ends and throw a NotImplementedError for other keywords
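The proposed split could be sketched roughly as follows. The function adbc_temporary_flag is hypothetical and is not taken from the PR; it assumes the ADBC path can only forward a boolean to the temporary parameter of adbc_ingest, and the case-insensitive comparison is an extra assumption made for this illustration.

```python
def adbc_temporary_flag(prefixes):
    """Hypothetical: map a prefixes list onto the boolean an ADBC ingest accepts."""
    if not prefixes:
        return False
    # Assumption for this sketch: keywords are compared case-insensitively.
    normalized = [prefix.upper() for prefix in prefixes]
    if normalized == ["TEMPORARY"]:
        return True
    raise NotImplementedError(
        f"ADBC back ends only support prefixes=['TEMPORARY'], got {prefixes!r}"
    )


print(adbc_temporary_flag(None))
print(adbc_temporary_flag(["TEMPORARY"]))
try:
    adbc_temporary_flag(["GLOBAL", "TEMPORARY"])
except NotImplementedError as error:
    print(error)
```

The sketch also shows the cost of this option: the same call succeeds or raises depending on which back end the connection belongs to.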

@WillAyd What are your thoughts on this?

WillAyd · 2024-12-16T16:45:33Z

I don't think it's worth adding to pandas if the various backends cannot both fully support it. We may instead have to document how you could do it with lower-level SQLAlchemy operations, but having the pandas API offer this only for SQLAlchemy and not for ADBC makes the abstraction of the pandas API more complicated.

Diadochokinetic · 2024-12-16T17:15:07Z

And what about a reduced parameter temporary, that is supported by both back ends? That wouldn't solve all my use cases, but at least ~ 90% of it.

github-actions · 2025-01-22T00:07:11Z

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

Diadochokinetic · 2025-01-22T04:37:24Z

I'm still interested in working on this.

Diadochokinetic · 2025-01-27T16:55:43Z

And what about a reduced parameter temporary, that is supported by both back ends? That wouldn't solve all my use cases, but at least ~ 90% of it.

@WillAyd, what's your thought on this? Having a parameter temporary, that's supported by both back ends, would be a solution that solves 90% of prefix usage and does satisfy feature parity.

WillAyd · 2025-01-28T19:21:43Z

I think that would be a reasonable solution across the implementations

WillAyd · 2025-02-10T14:32:47Z

pandas/io/sql.py

+        try:
+            _ = self.pd_sql.read_query(query)
+            return True
+        except ProgrammingError:


Does sqlalchemy not provide a higher level abstraction than ProgrammingError or OperationalError? I'm not sure I understand the distinction between those, and I am not sure if they would catch all possible errors thrown by sqlalchemy either

Diadochokinetic · 2025-02-15

There is a higher-level abstraction available: DatabaseError.
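For context, the DB-API (PEP 249) exception hierarchy places both ProgrammingError and OperationalError under DatabaseError, so catching the parent covers both. The sketch below shows the duck-typed existence check with the standard-library sqlite3 module rather than SQLAlchemy or ADBC. The function name table_exists is made up for this illustration and is not the PR's _exists_temporary.

```python
import sqlite3


def table_exists(conn, name):
    """Duck-test a table's existence by selecting from it."""
    quoted = '"' + name.replace('"', '""') + '"'  # basic identifier quoting
    try:
        conn.execute(f"SELECT 1 FROM {quoted} LIMIT 0")
        return True
    except sqlite3.DatabaseError:
        # Parent of OperationalError and ProgrammingError in the DB-API hierarchy.
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TEMPORARY TABLE scratch (x INTEGER)")
found = table_exists(conn, "scratch")
missing = table_exists(conn, "no_such_table")
conn.close()
print(found, missing)
```

One caveat of catching the broad parent class is that unrelated database failures would also be reported as "table does not exist".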

WillAyd · 2025-02-10T14:33:26Z

pandas/io/sql.py

+            _ = self.pd_sql.read_query(query)
+            return True
+        except ProgrammingError:
+            # Some DBMS (e.g. postgres) require a rollback after a caught exception


The DBMS-specific features are something we want to avoid in pandas, as maintaining compatibility on those is not a core specialty of our team. Does sqlalchemy not handle this natively?

WillAyd · 2025-02-10T14:34:04Z

pandas/io/sql.py

+        meta data. The existence is duck tested by a SELECT statement."""
+        from adbc_driver_manager import ProgrammingError
+
+        # sqlite doesn't allow a rollback at this point


Similar comment as before - we really want to avoid putting DBMS-specific logic into our implementation

mroeschke · 2025-04-02T16:20:56Z

Thanks for the pull request, but it appears to have gone stale. If interested in continuing, please merge in the main branch, address any review comments and/or failing tests, and we can reopen.

Diadochokinetic and others added 9 commits October 29, 2024 20:22

Implement prefixes parameter.

03c8183

[WIP] implent special cases for temporary tables.

f09fd54

Specify Exception in _exists_temporary.

03b8642

remove print

ccb9eac

Finalize working mysql implementation.

d792f27

[WIP] Add rollback for postgres.

2582834

Add support for sqlite.

b138532

Add connectables with default pool and add test for if_exists=append.

f696257

Merge branch 'pandas-dev:main' into table_prefixes

33b69d9

Diadochokinetic changed the title ~~ENH: Add table prefixes~~ ENH: Add table prefixes to to_sql method Nov 24, 2024

Diadochokinetic added 6 commits November 25, 2024 20:44

Fix typo in prefixes docstring.

0bc6504

Undo experimental import changes.

8a94c2b

Merge branch 'table_prefixes' of https://github.com/Diadochokinetic/pandas into table_prefixes

ec32a70

Add some documentation.

be305ff

Fix typo in NDFrame.to_sql docstring.

35a6394

Add prefixes parameter in to_sql suberclass.

145d18c

Diadochokinetic mentioned this pull request Nov 26, 2024

ENH: Add table prefixes to to_sql method #60422

Open

3 tasks

Diadochokinetic added 3 commits November 26, 2024 15:20

Add prefixes parameter to ADBC subclass method to_sql.

eddf687

Disable case sensitivity check for temporary tables.

3190142

Merge remote-tracking branch 'upstream/main' into table_prefixes

154c208

Diadochokinetic marked this pull request as ready for review November 26, 2024 15:16

Merge branch 'main' into table_prefixes

98a153f

Diadochokinetic added 3 commits November 27, 2024 12:56

[WIP] Add support for adbc driver.

8a0611d

Merge branch 'table_prefixes' of https://github.com/Diadochokinetic/pandas into table_prefixes

eb6e9f0

Fix mypy unsupported operand types error.

522f842

Merge branch 'main' into table_prefixes

ca0551d

Diadochokinetic mentioned this pull request Dec 2, 2024

Upgrade adbc_ingest parameter temporary to prefixes sqlalchemy/sqlalchemy#12149

Closed

Diadochokinetic mentioned this pull request Dec 2, 2024

Upgrade adbc_ingest parameter temporary to prefixes apache/arrow-adbc#2343

Open

Diadochokinetic added 2 commits December 2, 2024 21:13

Merge branch 'main' into table_prefixes

b595021

Merge branch 'main' into table_prefixes

3c8a12e

Merge branch 'main' into table_prefixes

1ad55bb

github-actions bot added the Stale label Jan 22, 2025

Merge branch 'main' into table_prefixes

25ecbc3

Merge branch 'main' into table_prefixes

775a05a

Diadochokinetic and others added 8 commits January 28, 2025 21:21

Replace prefixes parameter with temporary.

efad6b4

Merge branch 'main' into table_prefixes

0543666

Merge branch 'main' into table_prefixes

90d1f1f

Add whatsnew entry.

4da49dd

Add issue number to whatsnew

b0db943

Merge branch 'main' into table_prefixes

c52c71a

Merge branch 'main' into table_prefixes

cd2fff4

Merge branch 'main' into table_prefixes

d1233ce

WillAyd requested changes Feb 10, 2025

View reviewed changes

Diadochokinetic added 3 commits February 15, 2025 20:02

Change error handling of _exists_temporary to a higher abstraction level.

cd51d7c

Use nested try except to handle rollbacks.

5d306c2

Nested try except blocks don't work with adbc.

8cafc44

mroeschke closed this Apr 2, 2025
