BUG: boxing Timedeltas on .apply #11349

amelio-vazquez-reina · 2015-10-16T23:35:39Z

Consider the following Series:

object_id
0CKVYKjyFn    76 days
0CrPL2QKH3   -15 days
0CrVStlVrg    23 days
0Cc5ZvS67u    76 days
0CTOk5OdtI    76 days
0CTSWtTzBa    76 days
0CwBqVeNCX    76 days
0CIRJFIOcD    58 days
0CRQPCxzQe   350 days
0CAq4m9Nru    15 days
0C617yvXBj    76 days
0CzUUJNKX9   -16 days
Name: days_left, dtype: timedelta64[ns]

I am hoping to convert the above to hours.

If I do:

my_series.dt.hours

I get:

AttributeError: 'Series' object has no attribute 'hours'

What's even stranger is that if I do:

> my_series[0].total_seconds()/3600
1824.0

it works for one element, but if I do:

> my_series.apply(lambda x: x.total_seconds())

I get:

AttributeError: 'numpy.timedelta64' object has no attribute 'total_seconds'

I thought apply would run the function I pass it item by item in the series. Why does total_seconds() work for a single item, but not with apply?


chris-b1 · 2015-10-17T00:01:39Z

As outlined in the docs, the way to do conversions is via astype (which truncates to the target unit) or by dividing by the appropriate delta (which doesn't truncate).

In [8]: s.astype('m8[h]')
Out[8]: 
0     1824
1     -360
2      552
3     1824
4     1824
5     1824
6     1824
7     1392
8     8400
9      360
10    1824
11    -384
Name: 1, dtype: float64

In [9]: s / np.timedelta64(1, 'h')
Out[9]: 
0     1824
1     -360
2      552
3     1824
4     1824
5     1824
6     1824
7     1392
8     8400
9      360
10    1824
11    -384
Name: 1, dtype: float64
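The difference between truncating and dividing only shows up when a value is not a whole number of hours. A minimal sketch of that difference, assuming plain NumPy arrays rather than a pandas Series (Series.astype behaviour for timedelta units may differ between pandas versions); the sample values are made up for illustration:

```python
import numpy as np

# Hypothetical durations: 90 and 150 minutes.
minutes = np.array([90, 150], dtype="m8[m]")

# astype to a coarser unit drops the fractional part of each hour.
truncated = minutes.astype("m8[h]")

# Dividing by a one-hour delta yields floats and keeps the fraction.
divided = minutes / np.timedelta64(1, "h")

print(truncated.astype("int64"), divided)
```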

You're seeing that result with apply because a single element is boxed in a Timedelta when accessed (which has extra properties and methods such as total_seconds), but the underlying storage is a np.timedelta64 array, whose elements don't have them.
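The boxing described above can be observed directly. This is a small sketch with made-up sample data (the variable names are hypothetical, not from the report); it compares an element taken from the Series with one taken from the underlying NumPy array:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the reporter's Series.
s = pd.Series(pd.to_timedelta(["76 days", "-15 days", "23 days"]), name="days_left")

# Element access on the Series returns a boxed pandas Timedelta.
boxed = s.iloc[0]

# Element access on the raw array returns a bare numpy.timedelta64.
raw = s.values[0]

print(type(boxed).__name__, hasattr(boxed, "total_seconds"))
print(type(raw).__name__, hasattr(raw, "total_seconds"))

# The boxed scalar supports total_seconds(), so this conversion works.
hours = boxed.total_seconds() / 3600
```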

jreback · 2015-10-17T15:21:14Z

In [5]: s = Series(pd.timedelta_range('1 day 1 s',periods=5,freq='h'))

In [6]: s
Out[6]: 
0   1 days 00:00:01
1   1 days 01:00:01
2   1 days 02:00:01
3   1 days 03:00:01
4   1 days 04:00:01
dtype: timedelta64[ns]

In [7]: s.dt.components
Out[7]: 
   days  hours  minutes  seconds  milliseconds  microseconds  nanoseconds
0     1      0        0        1             0             0            0
1     1      1        0        1             0             0            0
2     1      2        0        1             0             0            0
3     1      3        0        1             0             0            0
4     1      4        0        1             0             0            0

In [8]: s.dt.
s.dt.components      s.dt.days            s.dt.freq            s.dt.microseconds    s.dt.nanoseconds     s.dt.seconds         s.dt.to_pytimedelta  s.dt.total_seconds

@amelio-vazquez-reina the reason we don't support hours/minutes is for compatibility with datetime.timedelta and to make it slightly less confusing.

datetime.timedelta gives you days, seconds, and microseconds, which are the TOTAL amounts (which IMHO is actually confusing, but that is what the API is).

.components will give you the 'displayed' values (i.e. the components of the timedeltas), which you can then access.

So s.apply(...) should actually box these into Timedelta objects (and not just leave them as np.timedelta64), as we already do for .apply on a datetime64[ns] Series:
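To make the distinction concrete, here is a short sketch comparing the standard-library attributes with the .dt accessors listed above. The sample duration is made up for illustration:

```python
import datetime as dt
import pandas as pd

# Hypothetical duration: 1 day, 2 hours, 1 second.
td = dt.timedelta(days=1, hours=2, seconds=1)

# The stdlib exposes only days, seconds and microseconds; the hours are
# folded into .seconds (2 * 3600 + 1).
stdlib_days, stdlib_seconds = td.days, td.seconds
stdlib_total = td.total_seconds()

s = pd.Series(pd.to_timedelta(["1 days 02:00:01"]))

# The .dt accessor mirrors the stdlib attributes...
pandas_days = int(s.dt.days.iloc[0])
pandas_seconds = int(s.dt.seconds.iloc[0])

# ...while .dt.components splits out the displayed fields.
displayed_hours = int(s.dt.components["hours"].iloc[0])
displayed_seconds = int(s.dt.components["seconds"].iloc[0])
```

If hours are what you need, either read them from .components or divide the total by an hour-sized delta.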

In [9]: s.apply(lambda x: type(x))
Out[9]: 
0    <type 'numpy.timedelta64'>
1    <type 'numpy.timedelta64'>
2    <type 'numpy.timedelta64'>
3    <type 'numpy.timedelta64'>
4    <type 'numpy.timedelta64'>
dtype: object

In [10]: Series(pd.date_range('20130101',periods=3)).apply(lambda x: type(x))
Out[10]: 
0    <class 'pandas.tslib.Timestamp'>
1    <class 'pandas.tslib.Timestamp'>
2    <class 'pandas.tslib.Timestamp'>
dtype: object

So this is a bug. The fix should be something like what happens in __iter__, where needs_i8_conversion and i8_boxer are called. I am going to repurpose this issue.

pull-requests welcome!
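Until apply boxes the values itself, one possible workaround is to box each element explicitly inside the function. This is only a sketch, not the fix proposed above; the helper name to_seconds is hypothetical. It relies on pd.Timedelta accepting both np.timedelta64 and Timedelta inputs, so it should behave the same whether or not apply has already boxed the value:

```python
import pandas as pd

s = pd.Series(pd.timedelta_range("1 day 1 s", periods=5, freq="h"))

def to_seconds(x):
    # Box explicitly so total_seconds() is available regardless of
    # whether apply passed a np.timedelta64 or a Timedelta.
    return pd.Timedelta(x).total_seconds()

seconds = s.apply(to_seconds)
print(seconds)
```

For this particular conversion the vectorized s.dt.total_seconds() avoids apply entirely.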

jreback added the Bug, Reshaping, and Timedelta labels Oct 17, 2015

jreback added this to the 0.17.1 milestone Oct 17, 2015

jreback changed the title ~~Operations with Series holding Timedeltas~~ BUG: boxing Timedeltas on .apply Oct 17, 2015

kawochen mentioned this issue Nov 10, 2015

BUG: GH11349 where Series.apply and Series.map did not box timedelta64 #11564

Merged

jreback modified the milestones: Next Major Release, 0.17.1 Nov 13, 2015

jreback modified the milestones: 0.18.0, Next Major Release Dec 30, 2015

jreback closed this as completed in #11564 Dec 31, 2015
