Skip to content

BUG: to_sql errors with numeric index name - needs conversion to string #15404

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
redbullpeter opened this issue Feb 14, 2017 · 4 comments · Fixed by #15423
Closed

BUG: to_sql errors with numeric index name - needs conversion to string #15404

redbullpeter opened this issue Feb 14, 2017 · 4 comments · Fixed by #15423
Labels
Bug IO SQL to_sql, read_sql, read_sql_query
Milestone

Comments

@redbullpeter
Copy link
Contributor

redbullpeter commented Feb 14, 2017

Code Sample, a copy-pastable example if possible

import pandas as pd
import sqlalchemy as sa


d = {'2000': 1, '2001': 2, '2002': 3, '2003': 4}

df = pd.DataFrame(list(d.items()))
df = df.set_index([0])

db = sa.create_engine("sqlite:///test.db")
df.to_sql("test_tbl", db, if_exists="append")

Problem description

This errors as follows:

  File "sql.py", line 12, in <module>
    df.to_sql("test_tbl", db, if_exists="append")
  File "/home/zzz/.virtualenvs/sqlalchemy/lib/python3.5/site-packages/pandas/core/generic.py", line 1201, in to_sql
    chunksize=chunksize, dtype=dtype)
  File "/home/zzz/.virtualenvs/sqlalchemy/lib/python3.5/site-packages/pandas/io/sql.py", line 470, in to_sql
    chunksize=chunksize, dtype=dtype)
  File "/home/zzz/.virtualenvs/sqlalchemy/lib/python3.5/site-packages/pandas/io/sql.py", line 1146, in to_sql
    schema=schema, dtype=dtype)
  File "/home/zzz/.virtualenvs/sqlalchemy/lib/python3.5/site-packages/pandas/io/sql.py", line 563, in __init__
    self.table = self._create_table_setup()
  File "/home/zzz/.virtualenvs/sqlalchemy/lib/python3.5/site-packages/pandas/io/sql.py", line 770, in _create_table_setup
    for name, typ, is_index in column_names_and_types]
  File "/home/zzz/.virtualenvs/sqlalchemy/lib/python3.5/site-packages/pandas/io/sql.py", line 770, in <listcomp>
    for name, typ, is_index in column_names_and_types]
  File "/home/zzz/.virtualenvs/sqlalchemy/lib/python3.5/site-packages/sqlalchemy/sql/schema.py", line 1245, in __init__
    self._init_items(*args)
  File "/home/zzz/.virtualenvs/sqlalchemy/lib/python3.5/site-packages/sqlalchemy/sql/schema.py", line 79, in _init_items
    item._set_parent_with_dispatch(self)
AttributeError: 'int' object has no attribute '_set_parent_with_dispatch'

Commenting out the df.set_index line results in expected behaviour though with a simple integer index. Using sqlite3 as the library also results in errors.

Expected Output

Output of pd.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 3.5.2.final.0 python-bits: 64 OS: Linux OS-release: 4.4.0-62-generic machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_GB.UTF-8 LOCALE: en_GB.UTF-8

pandas: 0.19.2
nose: None
pip: 9.0.1
setuptools: 34.1.1
Cython: None
numpy: 1.12.0
scipy: None
statsmodels: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.1.5
pymysql: None
psycopg2: None
jinja2: None
boto: None
pandas_datareader: None

@jorisvandenbossche jorisvandenbossche added Bug Difficulty Novice IO SQL to_sql, read_sql, read_sql_query labels Feb 16, 2017
@jorisvandenbossche jorisvandenbossche added this to the Next Major Release milestone Feb 16, 2017
@jorisvandenbossche
Copy link
Member

@redbullpeter Thanks for the report

The reason for the error is the integer index name. If you set this to a string, it will work

In [25]: df.to_sql("test_tbl", db, if_exists="append")
...
AttributeError: 'int' object has no attribute '_set_parent_with_dispatch'

In [26]: df.index.name = 'year'

In [27]: df.to_sql("test_tbl", db, if_exists="append")

In [28]: pd.read_sql("test_tbl", db)
Out[28]: 
   year  1
0  2002  3
1  2000  1
2  2003  4
3  2001  2

So integer column names are not allowed for writing to sql. For columns, this is checked for, and the integers are converted to a string. This check apparently does not happen for the index name (which also becomes a column label in the sql table), and by keeping it as an int, an error is raised in the underlying sqlalchemy code.

Fixes very welcome!

@jorisvandenbossche jorisvandenbossche changed the title to_sql errors with datetimeindex with sqlalchemy and legacy fallback BUG: to_sql errors with numeric index name - needs conversion to string Feb 16, 2017
@redbullpeter
Copy link
Contributor Author

I can probably do the fix for that. Will it be sufficient to naively convert an int to a string for the index column?

@jorisvandenbossche
Copy link
Member

Yes, I think it is just this line:

column_names_and_types.append((idx_label, idx_type, True))
(note that a few lines below, the column name is wrapped in text_type(..), this is not done for the index label.
You should confirm with a test that this indeed solves the issue.

redbullpeter added a commit to redbullpeter/pandas that referenced this issue Feb 16, 2017
…o_sql errors with numeric index name - needs conversion to string
@redbullpeter
Copy link
Contributor Author

Reran the code sample above and got the expected outcomes. This worked for both using the sqlalchmey and sqlite3 packages.

jorisvandenbossche pushed a commit that referenced this issue Feb 17, 2017
* Converted index name to string to fix issue #15404 - BUG: to_sql errors with numeric index name - needs conversion to string

* Additional int to string conversion added. Associated test cases added.

* PEP 8 compliance edits

* Removed extraneous brackets
@jorisvandenbossche jorisvandenbossche modified the milestones: 0.20.0, Next Major Release Feb 17, 2017
AnkurDedania pushed a commit to AnkurDedania/pandas that referenced this issue Mar 21, 2017
…ev#15423)

* Converted index name to string to fix issue pandas-dev#15404 - BUG: to_sql errors with numeric index name - needs conversion to string

* Additional int to string conversion added. Associated test cases added.

* PEP 8 compliance edits

* Removed extraneous brackets
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug IO SQL to_sql, read_sql, read_sql_query
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants