Bug in generate CREATE TABLE SQL for DataFrame with MultiIndex with different types #8021

artemyk · 2014-08-13T19:06:19Z

When using SQLAlchemy to create a table from a dataframe, Pandas messes up the types of the index columns. For example:

import sqlalchemy
engine = sqlalchemy.create_engine('sqlite:///:memory:')

import pandas as pd
from pandas.io.sql import PandasSQLTable, PandasSQLAlchemy
df = pd.DataFrame.from_records([{'a':1,'b':2.1,'c':'txt1','a':2,'b':1.5,'c':'txt2'}], index=['a','b'])
o = PandasSQLTable('test_df',PandasSQLAlchemy(engine),frame=df)
print o.sql_schema()

outputs

CREATE TABLE test_df (
    a FLOAT, 
    b BIGINT, 
    c TEXT
)

when a should be BIGINT and b should be FLOAT.

This occurs because the index columns are enumerated backwards in PandasSQLTable._create_table_statement in pandas.io.sql but the types don't take into account this backward iteration:

        if self.index is not None:
            for i, idx_label in enumerate(self.index[::-1]):
                idx_type = self._sqlalchemy_type(
                    self.frame.index.get_level_values(i))
                columns.insert(0, Column(idx_label, idx_type, index=True))

I will make a pull request with a fix for this issue. Versions appended below:

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.6.final.0
python-bits: 64
OS: Linux
OS-release: 2.6.32-358.6.2.el6.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.14.1-200-g534784b
nose: 1.3.0
Cython: 0.20.1
numpy: 1.8.1
scipy: 0.14.0
statsmodels: 0.5.0
IPython: 2.1.0
sphinx: 1.2.2
patsy: 0.2.1
scikits.timeseries: None
dateutil: 2.2
pytz: 2014.4
bottleneck: None
tables: None
numexpr: 2.4
matplotlib: 1.3.1
openpyxl: None
xlrd: 0.9.3
xlwt: 0.7.5
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
rpy2: 2.4.2
sqlalchemy: 0.9.7
pymysql: None
psycopg2: None

The text was updated successfully, but these errors were encountered:

jorisvandenbossche · 2014-08-13T19:31:07Z

Thanks for the report, and great you can do a PR!

artemyk · 2014-08-14T08:33:54Z

Pull request submitted at #8022 .

jorisvandenbossche added the SQL label Aug 13, 2014

jorisvandenbossche added this to the 0.15.0 milestone Aug 13, 2014

artemyk mentioned this issue Aug 14, 2014

Fixed incorrect datatype conversion on multi-indexes #8022

Merged

jorisvandenbossche closed this as completed in #8022 Aug 15, 2014

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bug in generate CREATE TABLE SQL for DataFrame with MultiIndex with different types #8021

Bug in generate CREATE TABLE SQL for DataFrame with MultiIndex with different types #8021

artemyk commented Aug 13, 2014

jorisvandenbossche commented Aug 13, 2014

artemyk commented Aug 14, 2014

Bug in generate CREATE TABLE SQL for DataFrame with MultiIndex with different types #8021

Bug in generate CREATE TABLE SQL for DataFrame with MultiIndex with different types #8021

Comments

artemyk commented Aug 13, 2014

jorisvandenbossche commented Aug 13, 2014

artemyk commented Aug 14, 2014