Fixing multi method for to_sql for non-oracle databases #57311

Merged (12 commits) on Feb 17, 2024
1 change: 1 addition & 0 deletions doc/source/whatsnew/v2.2.1.rst
@@ -31,6 +31,7 @@ Fixed regressions
- Fixed regression in :meth:`DataFrame.sort_index` not producing a stable sort for a index with duplicates (:issue:`57151`)
- Fixed regression in :meth:`DataFrame.to_dict` with ``orient='list'`` and datetime or timedelta types returning integers (:issue:`54824`)
- Fixed regression in :meth:`DataFrame.to_json` converting nullable integers to floats (:issue:`57224`)
- Fixed regression in :meth:`DataFrame.to_sql` when ``method="multi"`` is passed and the dialect type is not Oracle (:issue:`57310`)
- Fixed regression in :meth:`DataFrameGroupBy.idxmin`, :meth:`DataFrameGroupBy.idxmax`, :meth:`SeriesGroupBy.idxmin`, :meth:`SeriesGroupBy.idxmax` ignoring the ``skipna`` argument (:issue:`57040`)
- Fixed regression in :meth:`DataFrameGroupBy.idxmin`, :meth:`DataFrameGroupBy.idxmax`, :meth:`SeriesGroupBy.idxmin`, :meth:`SeriesGroupBy.idxmax` where values containing the minimum or maximum value for the dtype could produce incorrect results (:issue:`57040`)
- Fixed regression in :meth:`ExtensionArray.to_numpy` raising for non-numeric masked dtypes (:issue:`56991`)
3 changes: 3 additions & 0 deletions pandas/core/generic.py
@@ -2808,6 +2808,9 @@ def to_sql(
database. Otherwise, the datetimes will be stored as timezone unaware
timestamps local to the original timezone.

Not all datastores support ``method="multi"``. Oracle, for example,
does not support multi-value insert.

References
----------
.. [1] https://docs.sqlalchemy.org
11 changes: 4 additions & 7 deletions pandas/io/sql.py
@@ -996,22 +996,19 @@ def _execute_insert(self, conn, keys: list[str], data_iter) -> int:

def _execute_insert_multi(self, conn, keys: list[str], data_iter) -> int:
"""
Alternative to _execute_insert for DBs support multivalue INSERT.
Alternative to _execute_insert for DBs support multi-value INSERT.

Note: multi-value insert is usually faster for analytics DBs
and tables containing a few columns
but performance degrades quickly with increase of columns.

"""

from sqlalchemy import insert

data = [dict(zip(keys, row)) for row in data_iter]
stmt = insert(self.table)
# conn.execute is used here to ensure compatibility with Oracle.
# Using stmt.values(data) would produce a multi row insert that
# isn't supported by Oracle.
# see: https://docs.sqlalchemy.org/en/20/core/dml.html#sqlalchemy.sql.expression.Insert.values
result = conn.execute(stmt, data)
stmt = insert(self.table).values(data)
result = conn.execute(stmt)
return result.rowcount

def insert_data(self) -> tuple[list[str], list[np.ndarray]]:
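
The change to `_execute_insert_multi` swaps `conn.execute(stmt, data)` (one single-row INSERT executed once per parameter set) for `insert(self.table).values(data)` (one INSERT carrying every row in its VALUES clause). The sketch below illustrates that difference with the standard-library `sqlite3` module rather than SQLAlchemy, so it is not the pandas code path. The table `t`, its columns, and the helper names are hypothetical and exist only for this illustration.

```python
import sqlite3

# Hypothetical sample data: (id, name) tuples.
rows = [(1, "a"), (2, "b"), (3, "c")]


def make_conn():
    """Create an in-memory database with a small illustrative table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")
    return conn


def insert_executemany(conn, rows):
    """Run a single-row INSERT once per parameter set (executemany style)."""
    cur = conn.executemany("INSERT INTO t (id, name) VALUES (?, ?)", rows)
    return cur.rowcount


def insert_multi_values(conn, rows):
    """Run one INSERT whose VALUES clause lists every row (multi-value style)."""
    placeholders = ", ".join(["(?, ?)"] * len(rows))
    # Flatten the row tuples so they line up with the positional placeholders.
    flat = [value for row in rows for value in row]
    cur = conn.execute(f"INSERT INTO t (id, name) VALUES {placeholders}", flat)
    return cur.rowcount
```

Both helpers leave the same rows in the table; only the shape of the SQL sent to the database differs. The second form is the one that, per the docstring note added in this PR, Oracle does not support.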