Skip to content

BUG: SQL connection #59224

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
3 tasks done
vladiscripts opened this issue Jul 11, 2024 · 2 comments
Open
3 tasks done

BUG: SQL connection #59224

vladiscripts opened this issue Jul 11, 2024 · 2 comments
Assignees
Labels
Bug Needs Triage Issue that has not been reviewed by a pandas team member

Comments

@vladiscripts
Copy link

vladiscripts commented Jul 11, 2024

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

from sqlalchemy import create_engine
import pandas as pd

engine = create_engine('mysql+pymysql://user:pwd@host/Data', echo=True)
with db.engine.begin() as conn:
    df = pd.read_sql_table('LocationIDs', conn.conection)
Traceback (most recent call last):
  File "example.py", line 148, in <module>
    df = pd.read_sql_table('LocationIDs', conn.connection)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/miniforge3/envs/web/lib/python3.11/site-packages/pandas/io/sql.py", line 385, in read_sql_table
    if not pandas_sql.has_table(table_name):
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/miniforge3/envs/web/lib/python3.11/site-packages/pandas/io/sql.py", line 2865, in has_table
    return len(self.execute(query, [name]).fetchall()) > 0
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/miniforge3/envs/web/lib/python3.11/site-packages/pandas/io/sql.py", line 2686, in execute
    raise ex from exc
pandas.errors.DatabaseError: Execution failed on sql '
        SELECT
            name
        FROM
            sqlite_master
        WHERE
            type IN ('table', 'view')
            AND name=?;
        ': not all arguments converted during string formatting
python-BaseException

Issue Description

  • The connection to the database does not work. Moreover, this error is similar for all SQL methods that use the internal method pandasSQL_builder().

    def pandasSQL_builder(

  • Although I explicitly specify a working MySQL connection, the internal pandasSQL_builder() method changes it to an SQLite connection without any notification. Only after several hours of research and googling I noticed on stackoverflow a post about this problem and guessed that the problem refers with SQLite.

  • pandasSQL_builder() requires con to be a sqlalchemy.engine.Connectable instance . Such instances in sqlalchemy are the engine and conn objects (with db.engine.begin() as conn:), but they do not have a cursor object, which is required. Whereas conn.connection has a cursor object, but it is rejected because it is not a Connectable instance. As a result, connecting via sqlalchemy is impossible.

    if sqlalchemy is not None and isinstance(con, (str, sqlalchemy.engine.Connectable)):

  • Also pandasSQL_builder() does not detect the sqlalchemy module I have installed. This detection is necessary for the above check and for the ability to specify the database URL in the form of a literal (like pd.read_sql_table(tablename, con='mysql+pymysql://user:pwd@host/Data' )).

    sqlalchemy = import_optional_dependency("sqlalchemy", errors="ignore")

As a result, no connection via sqlalchemy is possible.

I have a suspicion that the problem may be partitial linked that I'm using SQLAlchemy v1.4.49, as required by some of my code's dependencies, since many Python modules stlll not support SQLAlchemy v2. But then I would like Pandas support for the previous version of Sqlalchemy, or locking it as a dependency on a specific version up to which Pandas has support.

  • Also I found issue in the method that looks for installed SQLAlchemy, but it doesn't find it. This detects the SQLAlchemy installed. But while the minimum version is v2 set in it (VERSIONS = {"sqlalchemy": "2.0.0",}) and a notification is assigned there that Pandas requires a version no lower than this (msg at line 175). But since the detection is made with the ignore argument (sqlalchemy = import_optional_dependency("sqlalchemy", errors="ignore") and mod = import_optional_dependency(modname, errors="ignore") in pd.show_versions()), the message is not shown and is returned the result is None, as if no installed SQLAlchemy was found.
    minimum_version = min_version if min_version is not None else VERSIONS.get(parent)

Expected Behavior

Should work:

  • Connect via connection SQLAlchemy.
  • Connect via literal database URL.
  • There should be no implicit silent connection to non-existent SQLite, when a MySQL (or other) connection is explicitly specified.
  • Should show SQLAlchemy installed, whereas currently it shows no installation when SQLAlchemy is installed.
  • If Pandas requires SQLAlchemy to be a "minimum version" of 2.0.0, this must be reported.

Installed Versions

pandas_datareader : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.12.3
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 16.1.0
pyreadstat : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : 2.0.1
zstandard : None
tzdata : 2024.1
qtpy : None
pyqt5 : None

@vladiscripts vladiscripts added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Jul 11, 2024
@vladiscripts
Copy link
Author

vladiscripts commented Jul 11, 2024

Curiously, another method works: df = pd.read_sql('SELECT * FROM LocationIDs', conn.connection). Although it also does not define the connection correctly, PandasSQL_builder() gives the MySQL connection the wrong sqlite class SQLiteDatabase.

if isinstance(pandas_sql, SQLiteDatabase):

@Animesh-Shukla
Copy link

take

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Needs Triage Issue that has not been reviewed by a pandas team member
Projects
None yet
Development

No branches or pull requests

2 participants