BUG: dt.strftime causes TypeError when there is only one row in DataFrame #15494

Closed
yp514 opened this issue Feb 24, 2017 · 5 comments · Fixed by #19127
Labels: Bug; Dtype Conversions (unexpected or buggy dtype conversions); Indexing (related to indexing on series/frames, not to indexes themselves)

Comments

@yp514

yp514 commented Feb 24, 2017

The code below demonstrates the issue. If I have a DataFrame with a date column and convert the datetimes to strings using df.loc[:,'date'] = df['date'].dt.strftime('%Y-%m-%d'), it works fine as long as there is more than one row in the DataFrame. If there is only one row, I get a TypeError (see below for the full traceback). If, on the other hand, I use df['date'] = df['date'].dt.strftime('%Y-%m-%d'), it works fine for one or more rows. This was observed on both 0.17.1 and 0.19.2.

----- Error Generated -----
Traceback (most recent call last):
File "/usr/local/lib/python3.4/site-packages/pandas/core/internals.py", line 2265, in _try_coerce_args
other = other.astype('i8', copy=False).view('i8')
ValueError: invalid literal for int() with base 10: '2017-02-23'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "./testpd.py", line 35, in <module>
do_stuff(d1)
File "./testpd.py", line 17, in do_stuff
df.loc[:,'date'] = df['date'].dt.strftime('%Y-%m-%d')
File "/usr/local/lib/python3.4/site-packages/pandas/core/indexing.py", line 141, in __setitem__
self._setitem_with_indexer(indexer, value)
File "/usr/local/lib/python3.4/site-packages/pandas/core/indexing.py", line 533, in _setitem_with_indexer
setter(labels[0], value)
File "/usr/local/lib/python3.4/site-packages/pandas/core/indexing.py", line 473, in setter
s._data = s._data.setitem(indexer=pi, value=v)
File "/usr/local/lib/python3.4/site-packages/pandas/core/internals.py", line 3168, in setitem
return self.apply('setitem', **kwargs)
File "/usr/local/lib/python3.4/site-packages/pandas/core/internals.py", line 3056, in apply
applied = getattr(b, f)(**kwargs)
File "/usr/local/lib/python3.4/site-packages/pandas/core/internals.py", line 668, in setitem
values, _, value, _ = self._try_coerce_args(self.values, value)
File "/usr/local/lib/python3.4/site-packages/pandas/core/internals.py", line 2270, in _try_coerce_args
raise TypeError
TypeError

----- Code to Reproduce Issue -----

#!/usr/bin/env python3
""" Demonstrate Pandas issue """

import datetime as dt
import pandas as pd

def do_stuff(d):
    df = pd.DataFrame(d)
    df['date'] = df['date'].dt.strftime('%Y-%m-%d')
    df['id'] = df['id'].astype(str)
    df['fl'] = df['fl'].astype(str)
    print("df[df]: " + str(df))

    df = pd.DataFrame(d)
    df.loc[:,'id'] = df['id'].astype(str)
    df.loc[:,'fl'] = df['fl'].astype(str)
    df.loc[:,'date'] = df['date'].dt.strftime('%Y-%m-%d')
    print("df.loc[df]: " + str(df))


""" With 2 rows in Data Frame, works fine """
d2 = { 'date': pd.Series([dt.datetime.today(), dt.datetime.today()]),
      'bsym': pd.Series(['blah', 'blah2']),
      'id': pd.Series([1, 2]),
      'fl': pd.Series([1.5, 3]) }
print("D2...")
do_stuff(d2)

""" With 1 row in Data Frame, exception occurs on date column """
d1 = { 'date': pd.Series([dt.datetime.today()]),
      'bsym': pd.Series(['blah']),
      'id': pd.Series([1]),
      'fl': pd.Series([1.5]) }
print("D1...")
do_stuff(d1)
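
For reference, the workaround described in the report (plain column assignment instead of .loc) can be sketched in isolation for the one-row case. This is only an illustration: the helper name format_dates and the fixed timestamp are assumptions made here so the result is reproducible, and are not part of the original script.

```python
import datetime as dt

import pandas as pd


def format_dates(frame, column="date", fmt="%Y-%m-%d"):
    """Return a copy of `frame` with `column` rendered as strings.

    Hypothetical helper. It uses plain column assignment
    (frame[column] = ...), which the report says works for any number
    of rows, rather than frame.loc[:, column] = ...
    """
    out = frame.copy()
    out[column] = out[column].dt.strftime(fmt)
    return out


# One-row frame, the case that failed with .loc in the report.
# The timestamp is fixed (an assumption) instead of datetime.today().
one_row = pd.DataFrame({
    "date": pd.Series([dt.datetime(2017, 2, 23, 10, 41)]),
    "id": pd.Series([1]),
})

result = format_dates(one_row)
print(result["date"].tolist())
```

Because the helper works on a copy, the original frame keeps its datetime column.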
@jorisvandenbossche
Member

jorisvandenbossche commented Feb 24, 2017

A more minimal reproducible example:

In [14]: df1 = pd.DataFrame({'date': pd.Series([dt.datetime.today()])})

In [15]: df1
Out[15]: 
                        date
0 2017-02-24 10:41:07.209562

In [18]: df1['date'] = 'string'

In [19]: df1
Out[19]: 
     date
0  string

In [20]: df1 = pd.DataFrame({'date': pd.Series([dt.datetime.today()])})

In [21]: df1.loc[:,'date'] = 'string'
...
TypeError: 

And it does indeed work when there is more than one row.

jorisvandenbossche added the Bug and Indexing labels on Feb 24, 2017
@jreback
Contributor

jreback commented Feb 24, 2017

Yeah, .loc does inference on the resulting column to potentially change the dtype; this looks like an uncaught code path.

jreback added the Difficulty Intermediate and Dtype Conversions labels on Feb 24, 2017
jreback added this to the Next Major Release milestone on Feb 24, 2017
jreback changed the title from "dt.strftime causes TypeError when there is only one row in DataFrame" to "BUG: dt.strftime causes TypeError when there is only one row in DataFrame" on Feb 24, 2017
@yp514
Author

yp514 commented Feb 24, 2017 via email

@jorisvandenbossche
Member

Yes, df['date'] and df.loc[:, 'date'] should be equivalent

@jschendel
Member

This appears to be fixed on 0.22.0 (so probably on 0.21.x too):

In [1]: import pandas as pd; import datetime as dt

In [2]: pd.__version__
Out[2]: '0.22.0'

In [3]: df1 = pd.DataFrame({'date': pd.Series([dt.datetime.today()])})

In [4]: df1.loc[:,'date'] = 'string'

In [5]: df1
Out[5]:
     date
0  string

Could still use a test to ensure there's not a regression.
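
A rough sketch of what such a regression check might look like is below. It is not the test that was eventually added to pandas; the function names are hypothetical. It also assumes that the exact result of a dtype-changing .loc assignment may differ between pandas releases (some may warn or raise), so it only checks the property this issue is about: a one-row frame must behave the same way as a two-row frame.

```python
import datetime as dt
import warnings

import pandas as pd


def loc_assign_outcome(n_rows):
    """Assign a string to a datetime column via .loc and report the outcome.

    Hypothetical helper: returns "ok" if the assignment succeeded,
    otherwise the name of the exception class that was raised.
    """
    stamps = [dt.datetime(2017, 2, 24, 10, 41, 7)] * n_rows
    df = pd.DataFrame({"date": pd.Series(stamps)})
    with warnings.catch_warnings():
        # Assumption: some releases warn about dtype-changing assignment;
        # warnings are silenced so only success or failure is compared.
        warnings.simplefilter("ignore")
        try:
            df.loc[:, "date"] = "string"
        except Exception as exc:
            return type(exc).__name__
    return "ok"


def test_loc_assign_is_row_count_independent():
    # The bug was that one row raised TypeError while two rows worked.
    assert loc_assign_outcome(1) == loc_assign_outcome(2)
```

On the affected versions (0.17.1, 0.19.2) the two outcomes would differ according to the report above, so the assertion would fail there.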

jreback changed the milestone from Next Major Release to 0.23.0 on Jan 9, 2018