read_sql: TypeError: Argument 'rows' has incorrect type #11522

PetitLepton · 2015-11-05T07:49:19Z

Hi,
I encountered a problem when using read_sql with a remote mySQL database.

After loading the modules,

import pymysql
import pandas as pd
import sqlalchemy

I create both a pymysql connector and a sqlalchemy engine with pymysql for the same database.

connector = pymysql.connect(...)
engine = sqlalchemy.create_engine('mysql+pymysql://...')

Everything is fine with the engine

df = pd.DataFrame()
query = " SELECT id, created_at FROM themes"
for chunk in pd.read_sql(query, engine, chunksize=100):
    df = df.append(chunk)

but fails with the connector

df = pd.DataFrame()
query = " SELECT id, created_at FROM themes"
for chunk in pd.read_sql(query, connector, chunksize=100):
    df = df.append(chunk)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-7-9a0ee2188a78> in <module>()
      1 df = pd.DataFrame()
      2 query = " SELECT id, created_at FROM themes"
----> 3 for chunk in pd.read_sql(query, connector_mySQL, chunksize=100):
      4     df = df.append(chunk)

/usr/local/lib/python3.5/site-packages/pandas/io/sql.py in _query_iterator(cursor, chunksize, columns, index_col, coerce_float, parse_dates)
   1563                 yield _wrap_result(data, columns, index_col=index_col,
   1564                                    coerce_float=coerce_float,
-> 1565                                    parse_dates=parse_dates)
   1566 
   1567     def read_query(self, sql, index_col=None, coerce_float=True, params=None,

/usr/local/lib/python3.5/site-packages/pandas/io/sql.py in _wrap_result(data, columns, index_col, coerce_float, parse_dates)
    135 
    136     frame = DataFrame.from_records(data, columns=columns,
--> 137                                    coerce_float=coerce_float)
    138 
    139     _parse_date_columns(frame, parse_dates)

/usr/local/lib/python3.5/site-packages/pandas/core/frame.py in from_records(cls, data, index, exclude, columns, coerce_float, nrows)
    928         else:
    929             arrays, arr_columns = _to_arrays(data, columns,
--> 930                                              coerce_float=coerce_float)
    931 
    932             arr_columns = _ensure_index(arr_columns)

/usr/local/lib/python3.5/site-packages/pandas/core/frame.py in _to_arrays(data, columns, coerce_float, dtype)
   5211     if isinstance(data[0], (list, tuple)):
   5212         return _list_to_arrays(data, columns, coerce_float=coerce_float,
-> 5213                                dtype=dtype)
   5214     elif isinstance(data[0], collections.Mapping):
   5215         return _list_of_dict_to_arrays(data, columns,

/usr/local/lib/python3.5/site-packages/pandas/core/frame.py in _list_to_arrays(data, columns, coerce_float, dtype)
   5289 def _list_to_arrays(data, columns, coerce_float=False, dtype=None):
   5290     if len(data) > 0 and isinstance(data[0], tuple):
-> 5291         content = list(lib.to_object_array_tuples(data).T)
   5292     else:
   5293         # list of lists

TypeError: Argument 'rows' has incorrect type (expected list, got tuple)

Here are the versions of the modules

print(pymysql.__version__)
print(sqlalchemy.__version__)
print(pd.__version__)

0.6.7.None
1.0.9
0.17.0

Best regards,
Flavien.

The text was updated successfully, but these errors were encountered:

jorisvandenbossche · 2015-11-05T08:52:56Z

@PetitLepton Thanks for the report.

Can you show the output of?

cur = connector.cursor()
cur.execute("SELECT id, created_at FROM themes").fetchall()

PetitLepton · 2015-11-06T02:12:18Z

@jorisvandenbossche Hi, this one works fine, I was using it so far

cur = connector.cursor()
cur.execute("SELECT id, created_at FROM themes")
cur.fetchall()
((1, datetime.datetime(2012, 12, 2, 4, 6, 17)),
 (2, datetime.datetime(2012, 12, 2, 4, 20, 56)),
 (4, datetime.datetime(2012, 12, 2, 4, 57, 39)),
 (5, datetime.datetime(2012, 12, 2, 4, 59, 31)),
 (6, datetime.datetime(2012, 12, 2, 5, 3, 7)),
 ...

jorisvandenbossche · 2015-11-06T09:12:01Z

Thanks! As I suspected, the problem is that it returns a tuple of tuples instead of a list of tuples, and DataFrame.from_records fails on this:

In [1]: pd.DataFrame.from_records([(1,2), (3,4)])
Out[1]:
   0  1
0  1  2
1  3  4

In [2]: pd.DataFrame.from_records(((1,2), (3,4)))
...
c:\users\vdbosscj\scipy\pandas-joris\pandas\core\frame.pyc in _list_to_arrays(da
ta, columns, coerce_float, dtype)
   5300 def _list_to_arrays(data, columns, coerce_float=False, dtype=None):
   5301     if len(data) > 0 and isinstance(data[0], tuple):
-> 5302         content = list(lib.to_object_array_tuples(data).T)
   5303     else:
   5304         # list of lists

TypeError: Argument 'rows' has incorrect type (expected list, got tuple)

This has always been the case in pandas, so I am wondering if this is a change in pymysql

@jreback This is easily solved in the sql code by ensuring it is a list before passing to DataFrame.from_records, or do you think this is something from_records should handle itself?

jreback · 2015-11-07T15:22:22Z

@jorisvandenbossche no, list-of-tuples is the specified type, tuple-of-tuple is not allowed as I think it can signify nested types that would require more parsing (its not allowed in the DataFrame constructor either).

I suppose these might be able to be relaxed but would be a separate issue

In [3]: DataFrame([('a','b')])
Out[3]: 
   0  1
0  a  b

In [4]: DataFrame((('a','b')))
PandasError: DataFrame constructor not properly called!

In [5]: DataFrame(np.array([('a','b')]))
Out[5]: 
   0  1
0  a  b

jorisvandenbossche · 2015-11-10T12:33:46Z

OK, no problem. It is easy to ensure in the SQL code that it is a list of tuples and not a tuple of tuples

BUG: force list type for tuples from chunked sql table reads #11522

jorisvandenbossche added the IO SQL to_sql, read_sql, read_sql_query label Nov 5, 2015

jorisvandenbossche added this to the 0.17.1 milestone Nov 6, 2015

jorisvandenbossche added Difficulty Novice labels Nov 10, 2015

jreback modified the milestones: Next Major Release, 0.17.1 Nov 15, 2015

grahamjeffries mentioned this issue Dec 17, 2015

BUG: force list type for tuples from chunked sql table reads #11522 #11861

Merged

jorisvandenbossche closed this as completed in #11861 Dec 19, 2015

jorisvandenbossche added a commit that referenced this issue Dec 19, 2015

Merge pull request #11861 from grahamjeffries/bugfix-11522

1357321

BUG: force list type for tuples from chunked sql table reads #11522

jorisvandenbossche modified the milestones: 0.18.0, Next Major Release Dec 19, 2015

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

read_sql: TypeError: Argument 'rows' has incorrect type #11522

read_sql: TypeError: Argument 'rows' has incorrect type #11522

PetitLepton commented Nov 5, 2015

jorisvandenbossche commented Nov 5, 2015

PetitLepton commented Nov 6, 2015

jorisvandenbossche commented Nov 6, 2015

jreback commented Nov 7, 2015

jorisvandenbossche commented Nov 10, 2015

read_sql: TypeError: Argument 'rows' has incorrect type #11522

read_sql: TypeError: Argument 'rows' has incorrect type #11522

Comments

PetitLepton commented Nov 5, 2015

jorisvandenbossche commented Nov 5, 2015

PetitLepton commented Nov 6, 2015

jorisvandenbossche commented Nov 6, 2015

jreback commented Nov 7, 2015

jorisvandenbossche commented Nov 10, 2015