BUG/API: Fix operating with timedelta64/pd.offsets on rhs of a datelike series/index #4534

Merged
merged 4 commits into from
Aug 13, 2013
9 changes: 9 additions & 0 deletions doc/source/release.rst
@@ -53,6 +53,11 @@ pandas 0.13
- Add ``rename`` and ``set_names`` methods to ``Index`` as well as
``set_names``, ``set_levels``, ``set_labels`` to ``MultiIndex``.
(:issue:`4039`)
- A Series of dtype ``timedelta64[ns]`` can now be divided/multiplied
by an integer series (:issue:`4521`)
- A Series of dtype ``timedelta64[ns]`` can now be divided by another
``timedelta64[ns]`` object to yield a ``float64`` dtyped Series. This
is frequency conversion.

**API Changes**

@@ -166,6 +171,10 @@ pandas 0.13
- Fixed issue where individual ``names``, ``levels`` and ``labels`` could be
set on ``MultiIndex`` without validation (:issue:`3714`, :issue:`4039`)
- Fixed (:issue:`3334`) in pivot_table. Margins did not compute if values is the index.
- Fix bug in having a rhs of ``np.timedelta64`` or ``pd.offsets.DateOffset`` when operating
with datetimes (:issue:`4532`)
- Fix arithmetic with series/datetimeindex and ``np.timedelta64`` not working the same (:issue:`4134`)
and buggy timedelta in numpy 1.6 (:issue:`4135`)

pandas 0.12
===========
88 changes: 58 additions & 30 deletions doc/source/timeseries.rst
@@ -170,7 +170,7 @@ Take care, ``to_datetime`` may not act as you expect on mixed data:

.. ipython:: python

pd.to_datetime([1, '1'])
to_datetime([1, '1'])

.. _timeseries.daterange:

@@ -297,7 +297,7 @@ the year or year and month as strings:

ts['2011-6']

This type of slicing will work on a DataFrame with a ``DateTimeIndex`` as well. Since the
This type of slicing will work on a DataFrame with a ``DateTimeIndex`` as well. Since the
partial string selection is a form of label slicing, the endpoints **will be** included. This
would include matching times on an included date. Here's an example:

@@ -1112,7 +1112,8 @@ Time Deltas
-----------

Timedeltas are differences in times, expressed in different units, e.g. days, hours, minutes, seconds.
They can be both positive and negative.
They can be both positive and negative. :ref:`DateOffsets<timeseries.offsets>` that are absolute in nature
(``Day, Hour, Minute, Second, Milli, Micro, Nano``) can be used as ``timedeltas``.

.. ipython:: python

@@ -1128,41 +1129,16 @@ They can be both positive and negative.
s - s.max()
s - datetime(2011,1,1,3,5)
s + timedelta(minutes=5)
s + Minute(5)
s + Minute(5) + Milli(5)

Getting scalar results from a ``timedelta64[ns]`` series

.. ipython:: python
:suppress:

from distutils.version import LooseVersion

.. ipython:: python

y = s - s[0]
y

.. code-block:: python

if LooseVersion(np.__version__) <= '1.6.2':
y.apply(lambda x: x.item().total_seconds())
y.apply(lambda x: x.item().days)
else:
y.apply(lambda x: x / np.timedelta64(1, 's'))
y.apply(lambda x: x / np.timedelta64(1, 'D'))

.. note::

As you can see from the conditional statement above, these operations are
different in numpy 1.6.2 and in numpy >= 1.7. The ``timedelta64[ns]`` scalar
type in 1.6.2 is much like a ``datetime.timedelta``, while in 1.7 it is a
nanosecond based integer. A future version of pandas will make this
transparent.

.. note::

In numpy >= 1.7 dividing a ``timedelta64`` array by another ``timedelta64``
array will yield an array with dtype ``np.float64``.

Series of timedeltas with ``NaT`` values are supported

.. ipython:: python
@@ -1218,3 +1194,55 @@ issues). ``idxmin, idxmax`` are supported as well.

df.min().idxmax()
df.min(axis=1).idxmin()

.. _timeseries.timedeltas_convert:

Time Deltas & Conversions
-------------------------

.. versionadded:: 0.13

Timedeltas can be converted to other 'frequencies' by dividing by another timedelta.
These operations yield ``float64`` dtyped Series.

.. ipython:: python

td = Series(date_range('20130101',periods=4))-Series(date_range('20121201',periods=4))
td[2] += np.timedelta64(timedelta(minutes=5,seconds=3))
td[3] = np.nan
td

# to days
td / np.timedelta64(1,'D')

# to seconds
td / np.timedelta64(1,'s')

Dividing or multiplying a ``timedelta64[ns]`` Series by an integer or integer Series
yields another ``timedelta64[ns]`` dtyped Series.

.. ipython:: python

td * -1
td * Series([1,2,3,4])
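
The conversions described above can be sketched as a standalone script. This is a minimal illustration that assumes a recent pandas and numpy (not the 0.13-era namespace used in the docs), so the names are imported explicitly and the NaT/offset entries from the docs example are left out to keep the results easy to check.

```python
import numpy as np
import pandas as pd

# Build a small timedelta Series: every entry is 31 days
# (2013-01-0x minus 2012-12-0x).
td = pd.Series(pd.date_range("2013-01-01", periods=4)) - pd.Series(
    pd.date_range("2012-12-01", periods=4)
)

# Dividing by a timedelta64 scalar converts the "frequency" and yields floats.
days = td / np.timedelta64(1, "D")
seconds = td / np.timedelta64(1, "s")

# Multiplying by an integer Series keeps a timedelta dtype.
scaled = td * pd.Series([1, 2, 3, 4])
```

Here ``days`` and ``seconds`` hold plain floats, while ``scaled`` is still a Series of timedeltas.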

Numpy < 1.7 Compatibility
~~~~~~~~~~~~~~~~~~~~~~~~~

Numpy < 1.7 has a broken ``timedelta64`` type that does not work correctly
for arithmetic. Pandas works around this, but for frequency conversion as above,
you need to create the divisor yourself. In these versions the ``np.timedelta64``
constructor takes only 1 argument, the number of **micro** seconds.

The following are equivalent statements in the two versions of numpy.

.. code-block:: python

from distutils.version import LooseVersion
if LooseVersion(np.__version__) <= '1.6.2':
    # numpy < 1.7: the single argument is a number of microseconds
    y / np.timedelta64(86400*int(1e6))
    y / np.timedelta64(int(1e6))
else:
    y / np.timedelta64(1,'D')
    y / np.timedelta64(1,'s')
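
To see why the microsecond counts in the first branch correspond to one day and one second, the arithmetic can be checked on numpy >= 1.7, where the unit has to be passed explicitly. This is a sketch only: the ``"us"`` unit and the sample array ``y`` are assumptions made for illustration, and the old single-argument constructor is not reproduced here.

```python
import numpy as np

# 86400 seconds * 1e6 microseconds per second = one day in microseconds.
one_day_us = np.timedelta64(86400 * int(1e6), "us")
one_sec_us = np.timedelta64(int(1e6), "us")

# Hypothetical sample data: one day and half a day, stored in seconds.
y = np.array([86400, 43200], dtype="timedelta64[s]")

# Dividing by either spelling of "one day" gives the same float result.
in_days_from_us = y / one_day_us
in_days_from_unit = y / np.timedelta64(1, "D")
```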

34 changes: 34 additions & 0 deletions doc/source/v0.13.0.txt
@@ -100,6 +100,40 @@ Enhancements
- Added a more informative error message when plot arguments contain
overlapping color and style arguments (:issue:`4402`)

- ``timedelta64[ns]`` operations

- A Series of dtype ``timedelta64[ns]`` can now be divided by another
``timedelta64[ns]`` object to yield a ``float64`` dtyped Series. This
is frequency conversion. See :ref:`here<timeseries.timedeltas_convert>` for the docs.

.. ipython:: python

from datetime import timedelta
td = Series(date_range('20130101',periods=4))-Series(date_range('20121201',periods=4))
td[2] += np.timedelta64(timedelta(minutes=5,seconds=3))
td[3] = np.nan
td

# to days
td / np.timedelta64(1,'D')

# to seconds
td / np.timedelta64(1,'s')

- Dividing or multiplying a ``timedelta64[ns]`` Series by an integer or integer Series
yields another ``timedelta64[ns]`` dtyped Series.

.. ipython:: python

td * -1
td * Series([1,2,3,4])

- Absolute ``DateOffset`` objects can act equivalently to ``timedeltas``

.. ipython:: python

from pandas import offsets
td + offsets.Minute(5) + offsets.Milli(5)
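
The equivalence between absolute offsets and timedeltas can be checked directly. The following is a minimal sketch against a recent pandas; the sample Series and the use of ``pd.Timedelta`` as the reference value are assumptions for illustration, not part of the change itself.

```python
import pandas as pd
from pandas import offsets

# A small timedelta Series: 1, 2 and 3 days.
td = pd.Series(pd.to_timedelta([1, 2, 3], unit="D"))

# Adding absolute offsets shifts every element by a fixed amount ...
shifted = td + offsets.Minute(5) + offsets.Milli(5)

# ... which should match adding the equivalent timedelta.
expected = td + pd.Timedelta(minutes=5, milliseconds=5)
```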

Bug Fixes
~~~~~~~~~

49 changes: 47 additions & 2 deletions pandas/core/common.py
@@ -11,9 +11,11 @@
import pandas.algos as algos
import pandas.lib as lib
import pandas.tslib as tslib

from distutils.version import LooseVersion
from pandas import compat
from pandas.compat import StringIO, BytesIO, range, long, u, zip, map
from datetime import timedelta

from pandas.core.config import get_option
from pandas.core import array as pa

@@ -33,6 +35,10 @@ class PandasError(Exception):
class AmbiguousIndexError(PandasError, KeyError):
pass

# versioning
_np_version = np.version.short_version
_np_version_under1p6 = LooseVersion(_np_version) < '1.6'
_np_version_under1p7 = LooseVersion(_np_version) < '1.7'

_POSSIBLY_CAST_DTYPES = set([ np.dtype(t) for t in ['M8[ns]','m8[ns]','O','int8','uint8','int16','uint16','int32','uint32','int64','uint64'] ])
_NS_DTYPE = np.dtype('M8[ns]')
@@ -1144,7 +1150,45 @@ def _possibly_convert_platform(values):
def _possibly_cast_to_timedelta(value, coerce=True):
""" try to cast to timedelta64, if already a timedeltalike, then make
sure that we are [ns] (as numpy 1.6.2 is very buggy in this regard);
don't force the conversion unless coerce is True """
don't force the conversion unless coerce is True

if coerce='compat' force a compatibility coercion (to timedeltas) if needed
"""

# coercion compatibility
if coerce == 'compat' and _np_version_under1p7:

def convert(td, dtype):

# we have an array with a non-object dtype
if hasattr(td,'item'):
td = td.astype(np.int64).item()
if td == tslib.iNaT:
return td
if dtype == 'm8[us]':
td *= 1000
return td

if td == tslib.compat_NaT:
return tslib.iNaT

# convert td value to a nanosecond value
d = td.days
s = td.seconds
us = td.microseconds

if dtype == 'object' or dtype == 'm8[ns]':
td = 1000*us + (s + d * 24 * 3600) * 10 ** 9
else:
raise ValueError("invalid conversion of dtype in np < 1.7 [%s]" % dtype)

return td

# < 1.7 coercion
if not is_list_like(value):
value = np.array([ value ])
dtype = value.dtype
return np.array([ convert(v,dtype) for v in value ], dtype='m8[ns]')

# deal with numpy not being able to handle certain timedelta operations
if isinstance(value,np.ndarray) and value.dtype.kind == 'm':
@@ -1154,6 +1198,7 @@ def _possibly_cast_to_timedelta(value, coerce=True):

# we don't have a timedelta, but we want to try to convert to one (but don't force it)
if coerce:

new_value = tslib.array_to_timedelta64(value.astype(object), coerce=False)
if new_value.dtype == 'i8':
value = np.array(new_value,dtype='timedelta64[ns]')
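
The nanosecond conversion used in the compat path above (``1000*us + (s + d * 24 * 3600) * 10 ** 9``) can be tried in isolation. This sketch applies the same formula to a ``datetime.timedelta``; the function name ``timedelta_to_ns`` is hypothetical and not part of pandas.

```python
from datetime import timedelta


def timedelta_to_ns(td):
    """Hypothetical helper: convert a datetime.timedelta to integer nanoseconds.

    Mirrors the formula in the diff: microseconds are scaled by 1000,
    while days and seconds are first combined into seconds and then
    scaled by 10**9.
    """
    return 1000 * td.microseconds + (td.seconds + td.days * 24 * 3600) * 10 ** 9


# 5 minutes and 3 seconds is 303 seconds.
five_min_three_sec = timedelta_to_ns(timedelta(minutes=5, seconds=3))
```

Because ``datetime.timedelta`` normalizes negative values into a negative ``days`` field with non-negative ``seconds`` and ``microseconds``, the same formula also handles negative durations.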