BUG: SQL connection #59224

vladiscripts · 2024-07-11T02:12:11Z

Pandas version checks

I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

from sqlalchemy import create_engine
import pandas as pd

engine = create_engine('mysql+pymysql://user:pwd@host/Data', echo=True)
with engine.begin() as conn:
    # conn.connection is the raw DBAPI connection, not a SQLAlchemy Connectable
    df = pd.read_sql_table('LocationIDs', conn.connection)

Traceback (most recent call last):
  File "example.py", line 148, in <module>
    df = pd.read_sql_table('LocationIDs', conn.connection)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/miniforge3/envs/web/lib/python3.11/site-packages/pandas/io/sql.py", line 385, in read_sql_table
    if not pandas_sql.has_table(table_name):
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/miniforge3/envs/web/lib/python3.11/site-packages/pandas/io/sql.py", line 2865, in has_table
    return len(self.execute(query, [name]).fetchall()) > 0
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/miniforge3/envs/web/lib/python3.11/site-packages/pandas/io/sql.py", line 2686, in execute
    raise ex from exc
pandas.errors.DatabaseError: Execution failed on sql '
        SELECT
            name
        FROM
            sqlite_master
        WHERE
            type IN ('table', 'view')
            AND name=?;
        ': not all arguments converted during string formatting

Issue Description

The connection to the database does not work. The same error occurs in every SQL function that goes through the internal pandasSQL_builder() function.

pandas/pandas/io/sql.py

Line 870 in d966462

def pandasSQL_builder(
Although I explicitly pass a working MySQL connection, the internal pandasSQL_builder() function treats it as an SQLite connection without any notification. Only after several hours of research and searching did I find a Stack Overflow post about this problem and guess that it was related to SQLite.
pandasSQL_builder() requires con to be a sqlalchemy.engine.Connectable instance. In SQLAlchemy, such instances are the engine and conn objects (with engine.begin() as conn:), but they do not have a cursor object, which is required. conn.connection does have a cursor object, but it is rejected because it is not a Connectable instance. As a result, connecting via SQLAlchemy is impossible.

pandas/pandas/io/sql.py

Line 890 in d966462

if sqlalchemy is not None and isinstance(con, (str, sqlalchemy.engine.Connectable)):
pandasSQL_builder() also does not detect the sqlalchemy module I have installed. This detection is needed both for the check above and for passing the database URL as a string literal (for example, pd.read_sql_table(tablename, con='mysql+pymysql://user:pwd@host/Data')).

pandas/pandas/io/sql.py

Line 885 in d966462

sqlalchemy = import_optional_dependency("sqlalchemy", errors="ignore")

As a result, no connection via sqlalchemy is possible.
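
The dispatch described above can be modeled with a small standalone sketch. It is a simplified illustration based only on the lines quoted in this issue, not the actual pandas source: choose_backend, FakeConnectable, and RawDBAPIConnection are hypothetical names, and the real pandasSQL_builder() has more branches than shown here.

```python
class FakeConnectable:
    """Hypothetical stand-in for sqlalchemy.engine.Connectable (Engine, Connection)."""


class RawDBAPIConnection:
    """Hypothetical stand-in for conn.connection, the raw driver-level connection."""

    def cursor(self):
        return None


def choose_backend(con, sqlalchemy_detected):
    """Return the name of the wrapper class this simplified model would pick.

    Mirrors the quoted check: the SQLAlchemy path is taken only when the
    module was detected AND con is a URL string or a Connectable.
    """
    if sqlalchemy_detected and isinstance(con, (str, FakeConnectable)):
        return "SQLDatabase"
    # Assumption taken from the report: everything else falls through to SQLite.
    return "SQLiteDatabase"


# A Connectable with SQLAlchemy detected takes the SQLAlchemy path.
print(choose_backend(FakeConnectable(), sqlalchemy_detected=True))
# The same object falls through once detection returns None.
print(choose_backend(FakeConnectable(), sqlalchemy_detected=False))
# A raw DBAPI connection falls through regardless of detection.
print(choose_backend(RawDBAPIConnection(), sqlalchemy_detected=True))
```

In this model, either condition failing is enough to end up on the SQLite path, which matches the two separate causes described in the report: an undetected SQLAlchemy module, or a con object that is not a Connectable.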

I suspect the problem is partly linked to my using SQLAlchemy v1.4.49, which some of my code's dependencies require, since many Python modules still do not support SQLAlchemy v2. In that case, I would like pandas either to support the previous version of SQLAlchemy or to pin the dependency to the version range that pandas supports.

I also found a problem in the function that looks for an installed SQLAlchemy: it does detect the installed module, but the minimum version there is set to v2 (VERSIONS = {"sqlalchemy": "2.0.0",}), and a message is prepared saying that pandas requires at least that version (msg at line 175). Because the detection is called with errors="ignore" (sqlalchemy = import_optional_dependency("sqlalchemy", errors="ignore"), and mod = import_optional_dependency(modname, errors="ignore") in pd.show_versions()), the message is never shown and the result is None, as if SQLAlchemy were not installed.

pandas/pandas/compat/_optional.py

Line 171 in d966462

minimum_version = min_version if min_version is not None else VERSIONS.get(parent)
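
The silent None described above can be reproduced with a minimal standalone sketch. This is not the pandas implementation: import_optional and parse_version are hypothetical helpers, and fake_sqlalchemy is a fake module registered only so the example runs without SQLAlchemy installed. It assumes the three error modes ("raise", "warn", "ignore") behave as the report describes.

```python
import importlib
import sys
import types
import warnings


def parse_version(text):
    """Turn '1.4.49' into (1, 4, 49); assumes plain numeric components."""
    return tuple(int(part) for part in text.split(".")[:3])


def import_optional(name, min_version, errors="raise"):
    """Hypothetical helper: import a module and enforce a minimum version."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        if errors == "raise":
            raise
        return None
    found = getattr(module, "__version__", None)
    if found is not None and parse_version(found) < parse_version(min_version):
        msg = f"{name} {found} is installed, but {min_version} or newer is required"
        if errors == "raise":
            raise ImportError(msg)
        if errors == "warn":
            warnings.warn(msg, UserWarning)
        # With "ignore" the message is dropped and the caller only sees None.
        return None
    return module


# Register a fake module that reports the version mentioned in the issue.
fake = types.ModuleType("fake_sqlalchemy")
fake.__version__ = "1.4.49"
sys.modules["fake_sqlalchemy"] = fake

silent = import_optional("fake_sqlalchemy", "2.0.0", errors="ignore")
print(silent)  # None, indistinguishable from "not installed"
```

With errors="ignore", a caller cannot tell "too old" apart from "missing", which would explain why the version listing shows sqlalchemy : None for an installed 1.4.49.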

Expected Behavior

Should work:

Connecting via an SQLAlchemy connection.
Connecting via a literal database URL.
There should be no implicit, silent fallback to a non-existent SQLite database when a MySQL (or other) connection is explicitly specified.
pd.show_versions() should report the installed SQLAlchemy; currently it shows None even though SQLAlchemy is installed.
If pandas requires SQLAlchemy 2.0.0 as a minimum version, this must be reported.

Installed Versions

pandas_datareader : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.12.3
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 16.1.0
pyreadstat : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : 2.0.1
zstandard : None
tzdata : 2024.1
qtpy : None
pyqt5 : None


vladiscripts · 2024-07-11T02:49:36Z

Curiously, another function works: df = pd.read_sql('SELECT * FROM LocationIDs', conn.connection). It also identifies the connection incorrectly, though: pandasSQL_builder() wraps the MySQL connection in the SQLite class SQLiteDatabase.

pandas/pandas/io/sql.py

Line 690 in d966462

if isinstance(pandas_sql, SQLiteDatabase):
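
One possible reading of why read_sql succeeds while read_sql_table fails, offered as an assumption rather than a confirmed diagnosis: the table-existence query in the traceback uses the SQLite-style ? placeholder and passes one argument, whereas a plain SELECT passes no arguments at all. The sketch below assumes the driver fills placeholders with Python's % operator and skips that step when there are no arguments; interpolate_format_style and try_interpolate are hypothetical helpers, not driver functions.

```python
# The query text from the traceback, collapsed onto one line.
HAS_TABLE_QUERY = (
    "SELECT name FROM sqlite_master "
    "WHERE type IN ('table', 'view') AND name=?;"
)


def interpolate_format_style(query, args):
    """Mimic a driver that fills %s placeholders with the % operator.

    Assumption: interpolation is skipped when no arguments are given.
    """
    if not args:
        return query
    return query % tuple(repr(arg) for arg in args)


def try_interpolate(query, args):
    """Return the interpolated query, or the error text if formatting fails."""
    try:
        return interpolate_format_style(query, args)
    except TypeError as exc:
        return f"TypeError: {exc}"


# One argument, but no %s in the text: the argument has nowhere to go.
print(try_interpolate(HAS_TABLE_QUERY, ["LocationIDs"]))
# → TypeError: not all arguments converted during string formatting

# A plain SELECT without arguments is passed through untouched.
print(try_interpolate("SELECT * FROM LocationIDs", None))
# → SELECT * FROM LocationIDs
```

Under these assumptions, the error message matches the one in the traceback, and the SQLite wrapper only appears to work with read_sql because that call never sends a parameterized SQLite-specific query to the MySQL driver.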

Animesh-Shukla · 2024-07-18T14:44:37Z

take

vladiscripts added the Bug and Needs Triage labels Jul 11, 2024

github-actions bot assigned Animesh-Shukla Jul 18, 2024
