BUG: pd.read_sql_table() raises unknown column error when column name of a table contains `%` #37157

TrigunaBN · 2020-10-16T08:27:57Z

I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
(optional) I have confirmed this bug exists on the master branch of pandas.

Code Sample, a copy-pastable example

Mysql Create Table

DROP TABLE IF EXISTS test.my_table;
CREATE TABLE test.my_table 
(
    `id` VARCHAR(3),
    PRIMARY KEY(id),
    `price` DOUBLE NULL,
    `%_variation` DOUBLE NULL
);

INSERT INTO test.my_table VALUES('101', 25.2, 0.1);
INSERT INTO test.my_table VALUES('102', 40.2, -0.5);
INSERT INTO test.my_table VALUES('103', 55.2, 0.9);

import pandas as pd
from sqlalchemy import create_engine

DB_PW = 'My_password'
DB_PORT = '3306'
DB_NAME = 'Test'

db_uri = f'mysql+mysqldb://root:{DB_PW}@localhost:{DB_PORT}'
engine = create_engine(f'{db_uri}/{DB_NAME}')

df = pd.read_sql_table(table_name='my_table', con=engine, schema=DB_NAME)

Problem description

In the SELECT statement that pandas issues, the `%_variation` column appears as `%%_variation`.

No `%%_variation` column exists in the table, so reading it from the database fails with an unknown column error.
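One plausible mechanism for the doubled percent sign, which this thread does not confirm: drivers that use the `%s` parameter style treat `%` as a format character, so a literal `%` in an identifier is escaped as `%%` and only collapses back to `%` if the driver actually runs its formatting step. The sketch below uses plain-Python stand-ins to show that effect. The helper names `escape_identifier`, `render_with_params` and `render_without_params` are hypothetical and are not pandas, SQLAlchemy or mysqlclient functions.

```python
# Illustration only: plain-Python stand-ins, not pandas/SQLAlchemy/driver internals.


def escape_identifier(name: str) -> str:
    """Backtick-quote a MySQL identifier and double '%' for a %-format driver."""
    return "`" + name.replace("`", "``").replace("%", "%%") + "`"


def render_with_params(statement: str, params: tuple) -> str:
    """Mimic a driver that applies %-formatting before sending the statement."""
    return statement % params


def render_without_params(statement: str) -> str:
    """Mimic a driver that sends the statement verbatim, skipping formatting."""
    return statement


statement = "SELECT " + escape_identifier("%_variation") + " FROM my_table"

formatted = render_with_params(statement, ())  # '%%' collapses back to '%'
verbatim = render_without_params(statement)    # '%%' reaches the server as-is

print(formatted)
print(verbatim)
```

If the formatting step is skipped, the server sees a column literally named `%%_variation`, which matches the unknown column error described above.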

Expected Output

I expect pandas to read the table from the database.

I tried reading with pandas version 1.0.5 and I was able to read the table from the database without any problem.

Output of `pd.show_versions()`

INSTALLED VERSIONS

commit : db08276
python : 3.7.7.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.18362
machine : AMD64
processor : Intel64 Family 6 Model 78 Stepping 3, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.None

pandas : 1.1.3
numpy : 1.19.0
pytz : 2020.1
dateutil : 2.8.1
pip : 20.2.3
setuptools : 50.3.0
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : 1.2.9
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : 1.3.19
tables : None
tabulate : None
xarray : None
xlrd : 1.2.0
xlwt : None
numba : None

erfannariman · 2020-10-17T11:16:32Z

Using raw sqlalchemy to fetch the results does not raise.

import pandas as pd
from sqlalchemy import create_engine

print(pd.__version__)

connection_string = "mysql+mysqldb://root:pw@localhost:3306/test"
con = create_engine(connection_string).connect()
with con:
    result = con.execute("select * from my_table;")
    values = [dict(row) for row in result]
    df = pd.DataFrame(values)
    print(df)
con.close()

Output:

1.2.0.dev0+808.g40b65da08
    id  price  %_variation
0  101   25.2          0.1
1  102   40.2         -0.5
2  103   55.2          0.9

Process finished with exit code 0

Same for fetchall

connection_string = "mysql+mysqldb://root:pw@localhost:3306/test"
con = create_engine(connection_string).connect()
with con:
    result = con.execute("select * from my_table;")
    df = pd.DataFrame(result.fetchall())
    print(df)
con.close()

Output

     0     1    2
0  101  25.2  0.1
1  102  40.2 -0.5
2  103  55.2  0.9
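The `fetchall` output above has lost the column names (`0`, `1`, `2`). As a minimal sketch of how to keep them, the example below reads the names from the DB-API cursor and pairs them with each row. It uses the standard-library `sqlite3` module as a stand-in for the MySQL database in this report, purely so that it runs without a server. The table definition is an assumption adapted from the original one.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table from the report.
con = sqlite3.connect(":memory:")
con.execute(
    'CREATE TABLE my_table (id TEXT PRIMARY KEY, price REAL, "%_variation" REAL)'
)
con.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [("101", 25.2, 0.1), ("102", 40.2, -0.5), ("103", 55.2, 0.9)],
)

cur = con.execute("SELECT * FROM my_table ORDER BY id")

# cursor.description holds one 7-tuple per column; the first item is the name.
columns = [col[0] for col in cur.description]

# Pair each row with the column names so that '%_variation' survives as a key.
records = [dict(zip(columns, row)) for row in cur.fetchall()]
con.close()

print(columns)
print(records[0])
```

A list of dicts like `records` can be passed to `pd.DataFrame` in the same way as `values` in the first snippet.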

erfannariman · 2020-10-31T13:39:12Z

take

erfannariman · 2020-10-31T13:39:20Z

git bisect gave:

fc41f92cab49050b74cdd3dae37a8186b923c502 is the first bad commit
commit fc41f92cab49050b74cdd3dae37a8186b923c502
Author: John Bodley <[email protected]>
Date:   Wed May 27 19:10:03 2020 -0700

    BUG: #34211 (#34212)

 doc/source/whatsnew/v1.1.0.rst |  1 +
 pandas/io/sql.py               |  4 +++-
 pandas/tests/io/test_sql.py    | 10 ++++++++++
 3 files changed, 14 insertions(+), 1 deletion(-)

jreback · 2020-12-22T00:05:28Z

this was deliberately changed in #34212 so not sure why you think this is a regression.

TrigunaBN added the Bug and Needs Triage labels Oct 16, 2020

erfannariman mentioned this issue Oct 17, 2020

column with % sign causes error in pandas read_sql_table sqlalchemy/sqlalchemy#5652

Closed

jbrockmendel added the IO SQL to_sql, read_sql, read_sql_query label Oct 25, 2020

github-actions bot assigned erfannariman Oct 31, 2020

erfannariman mentioned this issue Oct 31, 2020

BUG: error raise when column contains percentage #37534

Merged

jreback added this to the 1.2 milestone Dec 22, 2020

simonjayhawkins added the Regression label and removed the Needs Triage label Dec 22, 2020

jreback closed this as completed in #37534 Dec 22, 2020
