
QST: Deadlock when using df.to_sql for unrelated tables #36542


Closed
yuv-c opened this issue Sep 22, 2020 · 7 comments
Labels
IO SQL to_sql, read_sql, read_sql_query Usage Question

Comments


yuv-c commented Sep 22, 2020

I have a multithreaded ETL that occasionally fails with an error I can't explain, and I would like to understand it.
I have asked about this before but did not get a clear answer.
My ETL uses the load method defined below in several instances, each instance running on its own thread, and occasionally I get a deadlock error.

import abc

import pandas as pd
import sqlalchemy


class Query(abc.ABC):
    def __init__(self):
        self.connection = sqlalchemy.create_engine(MYSQL_CONNECTION_STR)

    def load(self, df: pd.DataFrame) -> None:
        df.to_sql(
            name=self.table, con=self.connection, if_exists="replace", index=False,
        )

    @abc.abstractmethod
    def transform(self, data: object) -> pd.DataFrame:
        pass

    @abc.abstractmethod
    def extract(self) -> object:
        pass

    #  other methods...


class ComplianceInfluxQuery(Query):
    # Implements abstract methods... load method is the same as Query class


ALL_QUERIES = [ComplianceInfluxQuery("cc_influx_share_count"), ComplianceInfluxQuery("cc_influx_we_count"), ....]


while True:
    with ThreadPoolExecutor(max_workers=8) as pool:
        for query in ALL_QUERIES:
            pool.submit(execute_etl, query) # execute_etl function calls extract, transform and load
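For context, execute_etl is referenced but not shown; per the comment above it just chains the three stages. A minimal sketch (the real code's logging and error handling are omitted):

```python
def execute_etl(query):
    # Run one query object's full pipeline on a worker thread:
    # extract raw data, transform it into a DataFrame, load it into MySQL.
    data = query.extract()
    df = query.transform(data)
    query.load(df)
```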

This is the error log:

2020-09-17 09:48:28,138 | INFO | root | query | 44 | 1 | ComplianceInfluxQuery.load()
2020-09-17 09:48:28,160 | INFO | root | query | 44 | 1 | ComplianceInfluxQuery.load()
2020-09-17 09:48:28,241 | ERROR | root | calculate | 124 | 1 | Failed to execute query ComplianceInfluxQuery
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1276, in _execute_context
    self.dialect.do_execute(
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 593, in do_execute
    cursor.execute(statement, parameters)
  File "/usr/local/lib/python3.8/site-packages/pymysql/cursors.py", line 163, in execute
    result = self._query(query)
  File "/usr/local/lib/python3.8/site-packages/pymysql/cursors.py", line 321, in _query
    conn.query(q)
  File "/usr/local/lib/python3.8/site-packages/pymysql/connections.py", line 505, in query
    self._affected_rows = self._read_query_result(unbuffered=unbuffered)
  File "/usr/local/lib/python3.8/site-packages/pymysql/connections.py", line 724, in _read_query_result
    result.read()
  File "/usr/local/lib/python3.8/site-packages/pymysql/connections.py", line 1069, in read
    first_packet = self.connection._read_packet()
  File "/usr/local/lib/python3.8/site-packages/pymysql/connections.py", line 676, in _read_packet
    packet.raise_for_error()
  File "/usr/local/lib/python3.8/site-packages/pymysql/protocol.py", line 223, in raise_for_error
    err.raise_mysql_exception(self._data)
  File "/usr/local/lib/python3.8/site-packages/pymysql/err.py", line 107, in raise_mysql_exception
    raise errorclass(errno, errval)
pymysql.err.OperationalError: (1213, 'Deadlock found when trying to get lock; try restarting transaction')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "calculate.py", line 119, in execute_etl
    query.run()
  File "/chil_etl/query.py", line 45, in run
    self.load(df)
  File "/chil_etl/query.py", line 22, in load
    df.to_sql(
  File "/usr/local/lib/python3.8/site-packages/pandas/core/generic.py", line 2653, in to_sql
    sql.to_sql(
  File "/usr/local/lib/python3.8/site-packages/pandas/io/sql.py", line 512, in to_sql
    pandas_sql.to_sql(
  File "/usr/local/lib/python3.8/site-packages/pandas/io/sql.py", line 1316, in to_sql
    table.create()
  File "/usr/local/lib/python3.8/site-packages/pandas/io/sql.py", line 649, in create
    self._execute_create()
  File "/usr/local/lib/python3.8/site-packages/pandas/io/sql.py", line 641, in _execute_create
    self.table.create()
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/sql/schema.py", line 927, in create
    bind._run_visitor(ddl.SchemaGenerator, self, checkfirst=checkfirst)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 2097, in _run_visitor
    conn._run_visitor(visitorcallable, element, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1656, in _run_visitor
    visitorcallable(self.dialect, self, **kwargs).traverse_single(element)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/sql/visitors.py", line 145, in traverse_single
    return meth(obj, **kw)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/sql/ddl.py", line 827, in visit_table
    self.connection.execute(
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1011, in execute
    return meth(self, multiparams, params)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/sql/ddl.py", line 72, in _execute_on_connection
    return connection._execute_ddl(self, multiparams, params)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1068, in _execute_ddl
    ret = self._execute_context(
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1316, in _execute_context
    self._handle_dbapi_exception(
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1510, in _handle_dbapi_exception
    util.raise_(
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
    raise exception
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1276, in _execute_context
    self.dialect.do_execute(
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 593, in do_execute
    cursor.execute(statement, parameters)
  File "/usr/local/lib/python3.8/site-packages/pymysql/cursors.py", line 163, in execute
    result = self._query(query)
  File "/usr/local/lib/python3.8/site-packages/pymysql/cursors.py", line 321, in _query
    conn.query(q)
  File "/usr/local/lib/python3.8/site-packages/pymysql/connections.py", line 505, in query
    self._affected_rows = self._read_query_result(unbuffered=unbuffered)
  File "/usr/local/lib/python3.8/site-packages/pymysql/connections.py", line 724, in _read_query_result
    result.read()
  File "/usr/local/lib/python3.8/site-packages/pymysql/connections.py", line 1069, in read
    first_packet = self.connection._read_packet()
  File "/usr/local/lib/python3.8/site-packages/pymysql/connections.py", line 676, in _read_packet
    packet.raise_for_error()
  File "/usr/local/lib/python3.8/site-packages/pymysql/protocol.py", line 223, in raise_for_error
    err.raise_mysql_exception(self._data)
  File "/usr/local/lib/python3.8/site-packages/pymysql/err.py", line 107, in raise_mysql_exception
    raise errorclass(errno, errval)
sqlalchemy.exc.OperationalError: (pymysql.err.OperationalError) (1213, 'Deadlock found when trying to get lock; try restarting transaction')
[SQL: 
CREATE TABLE cc_influx_share_count (
    unique_identifier TEXT, 
    nfs_share_count FLOAT(53), 
    smb_share_count FLOAT(53), 
    s3_bucket_count FLOAT(53)
)

]
(Background on this error at: http://sqlalche.me/e/13/e3q8)

As you can see, when two threads call DataFrame.to_sql() at roughly the same time, a deadlock can occur.
The deadlock can occur for any of the classes.
I know I can call load again until it succeeds, but I want to understand why this happens in the first place.
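Since MySQL's own advice for error 1213 is "try restarting transaction", the interim fix can at least be systematic. A sketch of a generic retry helper (the names are hypothetical; with SQLAlchemy you would check `exc.orig.args[0] == 1213` rather than matching the message text):

```python
import time

def retry_on_deadlock(fn, attempts=3, delay=1.0,
                      is_deadlock=lambda exc: "Deadlock" in str(exc)):
    # Call fn(); on a deadlock error, back off and try again. Anything else
    # (or the deadlock itself once attempts are exhausted) is re-raised.
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception as exc:
            if attempt == attempts or not is_deadlock(exc):
                raise
            time.sleep(delay * attempt)

# usage: retry_on_deadlock(lambda: query.load(df))
```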

I tried debugging this through the MySQL DB first, but when I run SHOW ENGINE INNODB STATUS no deadlocks show up, which leads me here.
My question is: why would two unrelated tables deadlock?
What does pandas' df.to_sql() do that could cause this?

I assumed df.to_sql() simply executes an INSERT statement, but when I checked the logs I saw it does many more things that I do not understand:
SELECT * FROM mysql.general_log WHERE event_time >= "2020-09-17 09:48:27" AND event_time <= "2020-09-17 09:48:29" ORDER BY event_time ASC;

Result:

Time: 2020-09-17 09:48:27.010747. Event: COMMIT


Time: 2020-09-17 09:48:27.011075. Event: ROLLBACK


Time: 2020-09-17 09:48:27.012042. Event: CREATE TABLE cc_influx_last_metric_time (
    unique_identifier TEXT, 
    timestamp TIMESTAMP NULL, 
    uptime BIGINT
)


Time: 2020-09-17 09:48:27.033973. Event: COMMIT


Time: 2020-09-17 09:48:27.034327. Event: ROLLBACK


Time: 2020-09-17 09:48:27.053837. Event: INSERT INTO cc_influx_last_metric_time (unique_identifier, timestamp, uptime) VALUES (...)


Time: 2020-09-17 09:48:27.066930. Event: COMMIT


Time: 2020-09-17 09:48:27.068657. Event: ROLLBACK


Time: 2020-09-17 09:48:27.887579. Event: DESCRIBE `container_post_deployments`


Time: 2020-09-17 09:48:27.889705. Event: ROLLBACK


Time: 2020-09-17 09:48:27.890186. Event: DESCRIBE `container_post_deployments`


Time: 2020-09-17 09:48:27.892125. Event: ROLLBACK


Time: 2020-09-17 09:48:27.892619. Event: SHOW FULL TABLES FROM `chil_etl`


Time: 2020-09-17 09:48:27.894964. Event: SHOW CREATE TABLE `container_post_deployments`


Time: 2020-09-17 09:48:27.896491. Event: ROLLBACK


Time: 2020-09-17 09:48:27.897097. Event: DROP TABLE container_post_deployments


Time: 2020-09-17 09:48:27.907816. Event: COMMIT


Time: 2020-09-17 09:48:27.908322. Event: ROLLBACK


Time: 2020-09-17 09:48:27.909890. Event: CREATE TABLE container_post_deployments (
    image TEXT, 
    `clientId` TEXT, 
    message TEXT, 
    timestamp TIMESTAMP NULL, 
    status_code BIGINT, 
    something TEXT, 
    user_agent TEXT
)


Time: 2020-09-17 09:48:27.928665. Event: COMMIT


Time: 2020-09-17 09:48:27.929089. Event: ROLLBACK


Time: 2020-09-17 09:48:27.932310. Event: INSERT INTO container_post_deployments (image, `clientId`, message, timestamp, status_code, something, user_agent) VALUES (...)


Time: 2020-09-17 09:48:27.934410. Event: COMMIT


Time: 2020-09-17 09:48:27.936774. Event: ROLLBACK


Time: 2020-09-17 09:48:28.140219. Event: DESCRIBE `cc_influx_share_count`


Time: 2020-09-17 09:48:28.163517. Event: DESCRIBE `cc_influx_we_count`


Time: 2020-09-17 09:48:28.166070. Event: ROLLBACK


Time: 2020-09-17 09:48:28.168159. Event: DESCRIBE `cc_influx_share_count`


Time: 2020-09-17 09:48:28.169895. Event: ROLLBACK


Time: 2020-09-17 09:48:28.170583. Event: DESCRIBE `cc_influx_we_count`


Time: 2020-09-17 09:48:28.174444. Event: ROLLBACK


Time: 2020-09-17 09:48:28.176339. Event: SHOW FULL TABLES FROM `chil_etl`


Time: 2020-09-17 09:48:28.177915. Event: ROLLBACK


Time: 2020-09-17 09:48:28.179331. Event: SHOW FULL TABLES FROM `chil_etl`


Time: 2020-09-17 09:48:28.182284. Event: SHOW CREATE TABLE `cc_influx_share_count`


Time: 2020-09-17 09:48:28.185154. Event: ROLLBACK


Time: 2020-09-17 09:48:28.192493. Event: SHOW CREATE TABLE `cc_influx_we_count`


Time: 2020-09-17 09:48:28.192887. Event: DROP TABLE cc_influx_share_count


Time: 2020-09-17 09:48:28.194530. Event: ROLLBACK


Time: 2020-09-17 09:48:28.195707. Event: DROP TABLE cc_influx_we_count


Time: 2020-09-17 09:48:28.207712. Event: COMMIT


Time: 2020-09-17 09:48:28.208141. Event: ROLLBACK


Time: 2020-09-17 09:48:28.210087. Event: CREATE TABLE cc_influx_share_count (
    unique_identifier TEXT, 
    nfs_share_count FLOAT(53), 
    smb_share_count FLOAT(53), 
    s3_bucket_count FLOAT(53)
)


Time: 2020-09-17 09:48:28.215350. Event: COMMIT


Time: 2020-09-17 09:48:28.216115. Event: ROLLBACK


Time: 2020-09-17 09:48:28.217996. Event: CREATE TABLE cc_influx_we_count (
    unique_identifier TEXT, 
    timestamp TIMESTAMP NULL, 
    `ANF` FLOAT(53), 
    `S3` FLOAT(53), 
    `CVO` FLOAT(53)
)


Time: 2020-09-17 09:48:28.240455. Event: ROLLBACK


Time: 2020-09-17 09:48:28.240908. Event: ROLLBACK


Time: 2020-09-17 09:48:28.244425. Event: COMMIT


Time: 2020-09-17 09:48:28.244965. Event: ROLLBACK


Time: 2020-09-17 09:48:28.249009. Event: INSERT INTO cc_influx_we_count (unique_identifier, timestamp, `ANF`, `S3`, `CVO`) VALUES (...)


Time: 2020-09-17 09:48:28.253638. Event: COMMIT


Time: 2020-09-17 09:48:28.256299. Event: ROLLBACK


Time: 2020-09-17 09:48:28.525814. Event: DESCRIBE `cc_influx_disk_usage`


Time: 2020-09-17 09:48:28.530211. Event: ROLLBACK


Time: 2020-09-17 09:48:28.532392. Event: DESCRIBE `cc_influx_disk_usage`


Time: 2020-09-17 09:48:28.539685. Event: ROLLBACK


Time: 2020-09-17 09:48:28.541868. Event: SHOW FULL TABLES FROM `chil_etl`


Time: 2020-09-17 09:48:28.560271. Event: SHOW CREATE TABLE `cc_influx_disk_usage`


Time: 2020-09-17 09:48:28.565451. Event: ROLLBACK


Time: 2020-09-17 09:48:28.569257. Event: DROP TABLE cc_influx_disk_usage


Time: 2020-09-17 09:48:28.585562. Event: COMMIT


Time: 2020-09-17 09:48:28.595193. Event: ROLLBACK


Time: 2020-09-17 09:48:28.598230. Event: CREATE TABLE cc_influx_disk_usage (
    unique_identifier TEXT, 
    timestamp TIMESTAMP NULL, 
    total_gb FLOAT(53), 
    used_gb FLOAT(53)
)


Time: 2020-09-17 09:48:28.619580. Event: COMMIT


Time: 2020-09-17 09:48:28.620411. Event: ROLLBACK


Time: 2020-09-17 09:48:28.625385. Event: INSERT INTO cc_influx_disk_usage (unique_identifier, timestamp, total_gb, used_gb) VALUES (....)


Time: 2020-09-17 09:48:28.628706. Event: COMMIT


Time: 2020-09-17 09:48:28.631955. Event: ROLLBACK


Time: 2020-09-17 09:48:28.840143. Event: DESCRIBE `cc_influx_aws_subscription`


Time: 2020-09-17 09:48:28.844303. Event: ROLLBACK


Time: 2020-09-17 09:48:28.845637. Event: DESCRIBE `cc_influx_aws_subscription`


Time: 2020-09-17 09:48:28.848076. Event: ROLLBACK


Time: 2020-09-17 09:48:28.848646. Event: SHOW FULL TABLES FROM `chil_etl`


Time: 2020-09-17 09:48:28.851165. Event: SHOW CREATE TABLE `cc_influx_aws_subscription`


Time: 2020-09-17 09:48:28.852202. Event: ROLLBACK


Time: 2020-09-17 09:48:28.852691. Event: DROP TABLE cc_influx_aws_subscription


Time: 2020-09-17 09:48:28.861657. Event: COMMIT


Time: 2020-09-17 09:48:28.862099. Event: ROLLBACK


Time: 2020-09-17 09:48:28.863288. Event: CREATE TABLE cc_influx_aws_subscription (
    unique_identifier TEXT, 
    timestamp TIMESTAMP NULL, 
    is_subscribed BIGINT
)


Time: 2020-09-17 09:48:28.878554. Event: COMMIT


Time: 2020-09-17 09:48:28.879113. Event: ROLLBACK


Time: 2020-09-17 09:48:28.881054. Event: INSERT INTO cc_influx_aws_subscription (unique_identifier, timestamp, is_subscribed) VALUES (....)


Time: 2020-09-17 09:48:28.882642. Event: COMMIT


Time: 2020-09-17 09:48:28.884614. Event: ROLLBACK


Time: 2020-09-17 09:48:28.918677. Event: DESCRIBE `hubspot_data`


Time: 2020-09-17 09:48:28.922938. Event: ROLLBACK


Time: 2020-09-17 09:48:28.923993. Event: DESCRIBE `hubspot_data`


Time: 2020-09-17 09:48:28.928181. Event: ROLLBACK


Time: 2020-09-17 09:48:28.928808. Event: SHOW FULL TABLES FROM `chil_etl`


Time: 2020-09-17 09:48:28.931225. Event: SHOW CREATE TABLE `hubspot_data`


Time: 2020-09-17 09:48:28.934269. Event: ROLLBACK


Time: 2020-09-17 09:48:28.934851. Event: DROP TABLE hubspot_data


Time: 2020-09-17 09:48:28.949309. Event: COMMIT


Time: 2020-09-17 09:48:28.949778. Event: ROLLBACK


Time: 2020-09-17 09:48:28.953829. Event: CREATE TABLE hubspot_data (...)


Time: 2020-09-17 09:48:28.973177. Event: COMMIT


Time: 2020-09-17 09:48:28.973652. Event: ROLLBACK
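For what it's worth, the log above shows that if_exists="replace" issues the DESCRIBE/SHOW probes, DROP TABLE, CREATE TABLE, and the INSERT as separate statements, so two threads can interleave DDL on the server even when their target tables are unrelated. One blunt mitigation (a sketch; locked_to_sql and the module-level lock are hypothetical, not part of pandas) is to serialize the whole call within the process:

```python
import threading

ddl_lock = threading.Lock()  # one lock shared by all ETL threads

def locked_to_sql(df, table, con):
    # to_sql(if_exists="replace") runs DROP TABLE, CREATE TABLE and INSERT as
    # separate statements; holding a process-wide lock around the whole call
    # stops two threads from interleaving that DDL on the server.
    with ddl_lock:
        df.to_sql(name=table, con=con, if_exists="replace", index=False)
```

This trades throughput for safety, and of course does nothing about DDL issued by other processes.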
yuv-c added the Needs Triage and Usage Question labels on Sep 22, 2020
jreback commented Sep 22, 2020

these functions are just wrappers around the db itself

w/o a full reproducible example your best resource is SO

jreback added the IO SQL label and removed Needs Triage on Sep 22, 2020
@jreback jreback added this to the No action milestone Sep 22, 2020
@jreback jreback closed this as completed Sep 22, 2020
@keithkmyers

FWIW, it appears the to_sql operation (via sqlalchemy?) is requesting information from the information schema for table objects.

Specifically for an MSSQL server I see this firing:

SELECT [INFORMATION_SCHEMA].[TABLES].[TABLE_NAME]
FROM [INFORMATION_SCHEMA].[TABLES]
WHERE [INFORMATION_SCHEMA].[TABLES].[TABLE_SCHEMA] = CAST(@p1 AS NVARCHAR(max))
  AND [INFORMATION_SCHEMA].[TABLES].[TABLE_TYPE] = CAST(@p2 AS NVARCHAR(max))
ORDER BY [INFORMATION_SCHEMA].[TABLES].[TABLE_NAME]

This query, for whatever reason, shows as blocked with an "LCK_M_S" (shared) lock, waiting on an ETL process I have running right now against totally unrelated tables.


match-gabeflores commented Mar 20, 2023

@keithkmyers, FYI there is more info about this for SQL Server here:

I'm running into the same issue.
A separate process in our prod environment creates a columnstore index on an unrelated table1, which takes a long time. While it runs, it locks the information_schema tables.
When my process, which uses pandas to_sql to load table2, starts to run, it is blocked because it can't access the information_schema tables.

It seems like this should be worked around on the pandas side.
Maybe if_exists could accept a "skip check" value that skips checking whether the table exists (that is, skips querying the information_schema system tables).
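Until such an option exists, one way to approximate it (a sketch; append_without_probe is a hypothetical helper that assumes the table already exists, with simple column names matching the DataFrame) is to bypass to_sql for the load and issue the INSERT yourself, so nothing queries INFORMATION_SCHEMA:

```python
import sqlalchemy as sa

def append_without_probe(df, table, engine):
    # Build one parameterized INSERT and executemany it. Unlike to_sql, this
    # never probes for the table's existence, so no INFORMATION_SCHEMA reads.
    cols = ", ".join(df.columns)
    placeholders = ", ".join(f":{c}" for c in df.columns)
    stmt = sa.text(f"INSERT INTO {table} ({cols}) VALUES ({placeholders})")
    with engine.begin() as conn:
        conn.execute(stmt, df.to_dict(orient="records"))
```

The table name and columns are interpolated into the SQL string, so this is only safe for trusted, hard-coded identifiers.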


romiof commented Mar 24, 2023

I'm in the same boat.

I'm trying to implement a new ETL process using Python/pandas, but my destination is MS SQL (on-prem 2014).

When inserting one table, the whole database gets locked, with queries on INFORMATION_SCHEMA :(
When the insert step is done, the database gets unlocked.

I don't know if this happens only with MS SQL or is something by design in pandas...

Any news, @match-gabeflores @keithkmyers?


malow106 commented Jul 16, 2023

Hi, same issue in an ELT process (in the Load part). To load, I use to_sql with pandas 2.x against an MSSQL SERVER 17 DB. This is a multithreaded process filling different tables.

The only workaround is a retry.


match-gabeflores commented Aug 31, 2023

I posted a workaround here, as well as more details on the root cause.

Use a table name that is all lower case (use mytable instead of MyTable), OR

Use READ UNCOMMITTED in the engine. Note: this doesn't work for Dask's implementation, since that requires a URI string.

# https://docs.sqlalchemy.org/en/20/dialects/mssql.html#transaction-isolation-level
import sqlalchemy as sa

engine = sa.create_engine(
    "mssql+pyodbc:///?odbc_connect={}".format(params),  # params: URL-encoded ODBC connection string
    fast_executemany=True,
    isolation_level="READ UNCOMMITTED",
)

@rohitpharande

Specifying the proper version of the ODBC driver in the connection string solved this for me. I don't know how it solved it, but the error is gone.
Sample code:

import pandas as pd
from sqlalchemy import create_engine

db_connection_str = 'mssql+pyodbc://test_user:test_password@localhost/test_database?driver=ODBC+Driver+17+for+SQL+Server&TrustServerCertificate=yes'
engine = create_engine(db_connection_str)

df = pd.DataFrame({'id': [1, 2, 3], 'name': ['John', 'Mary', 'Peter']})
df.to_sql("my_empty_table", engine, if_exists="append", index=False)
