Skip to content

BUG: Executing SQL containing non-escaped percent sign fails without parameters #34211

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
3 tasks done
john-bodley opened this issue May 16, 2020 · 3 comments
Closed
3 tasks done
Labels
Bug IO SQL to_sql, read_sql, read_sql_query
Milestone

Comments

@john-bodley
Copy link
Contributor

john-bodley commented May 16, 2020

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.


Note: Please read this guide detailing how to provide the necessary information for us to reproduce your bug.

Code Sample, a copy-pastable example

import pandas as pd 

pd.read_sql("SELECT 100 AS '100%'", con="mysql://root@localhost:3306") 

raises the follow exception:

ProgrammingError: (MySQLdb._exceptions.ProgrammingError) not enough arguments for format string

Problem description

Many DB-APIs (MySQL, PyHive, etc.) when executing a statement explicitly check whether the optional parameters are/are not explicitly None (as opposed to falsy) prior to substitution. The SQL statements in Pandas are executed via the SQLAlchemy connection (per here) resulting in parameters of type NoneType being translated into an empty dictionary which can be problematic for the checks.

Many of these DB-APIs use the % operator for applying the parameters which is problematic if no parameters are specified and the SQL statement contains the % character (common in LIKE operator). One could modify their SQL by escaping the % percent character however the preferred fix is merely to leverage SQLAlchemy's execution_options setting no_parameters=True per,

no_parameters – When True, if the final parameter list or dictionary is totally empty, will invoke the statement on the cursor as cursor.execute(statement), not passing the parameter collection at all. Some DBAPIs such as psycopg2 and mysql-python consider percent signs as significant only when parameters are present; this option allows code to generate SQL containing percent signs (and possibly other characters) that is neutral regarding whether it’s executed by the DBAPI or piped into a script that’s later invoked by command line tools.

which ensures that falsy parameters will not pass the parameters to the DB-API resulting in parameters being None and thus no parameter formatting will be attempted.

Expected Output

   100%
0   100

Output of pd.show_versions()

[paste the output of pd.show_versions() here leaving a blank line after the details tag]

@john-bodley john-bodley added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels May 16, 2020
john-bodley pushed a commit to john-bodley/pandas that referenced this issue May 16, 2020
john-bodley pushed a commit to john-bodley/pandas that referenced this issue May 16, 2020
@mroeschke
Copy link
Member

#11896 may be related

@mroeschke mroeschke added IO SQL to_sql, read_sql, read_sql_query and removed Needs Triage Issue that has not been reviewed by a pandas team member labels May 16, 2020
@john-bodley
Copy link
Contributor Author

@mroeschke I'm uncertain whether #11896 is the same issue and I'm unable to repo the MySQL example using the mysqlclient driver. The Pandas SQL logic seems quite complex, i.e., there seems to be different cursor logic for read vs. write and it seems like this logic (which is modified in #34212) isn't executed during the write phase.

john-bodley pushed a commit to john-bodley/pandas that referenced this issue May 17, 2020
john-bodley pushed a commit to john-bodley/pandas that referenced this issue May 17, 2020
john-bodley pushed a commit to john-bodley/pandas that referenced this issue May 19, 2020
john-bodley pushed a commit to john-bodley/pandas that referenced this issue May 19, 2020
john-bodley pushed a commit to john-bodley/pandas that referenced this issue May 19, 2020
john-bodley pushed a commit to john-bodley/pandas that referenced this issue May 20, 2020
WillAyd pushed a commit that referenced this issue May 28, 2020
@jreback jreback added this to the 1.1 milestone Dec 22, 2020
@lithomas1
Copy link
Member

fixed by #34212

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug IO SQL to_sql, read_sql, read_sql_query
Projects
None yet
Development

No branches or pull requests

4 participants