API: PeriodIndex.values now returns array of Period objects #13988

Closed
26 changes: 22 additions & 4 deletions doc/source/whatsnew/v0.19.0.txt
@@ -16,7 +16,7 @@ Highlights include:
- :func:`merge_asof` for asof-style time-series joining, see :ref:`here <whatsnew_0190.enhancements.asof_merge>`
- ``.rolling()`` is now time-series aware, see :ref:`here <whatsnew_0190.enhancements.rolling_ts>`
- pandas development api, see :ref:`here <whatsnew_0190.dev_api>`
- ``PeriodIndex`` now has its own ``period`` dtype. see ref:`here <whatsnew_0190.api.perioddtype>`
- ``PeriodIndex`` now has its own ``period`` dtype, and has been changed to be more consistent with other ``Index`` classes. See :ref:`here <whatsnew_0190.api.period>`

.. contents:: What's new in v0.19.0
:local:
@@ -643,10 +643,13 @@ Furthermore:
- Passing duplicated ``percentiles`` will now raise a ``ValueError``.
- Bug in ``.describe()`` on a DataFrame with a mixed-dtype column index, which would previously raise a ``TypeError`` (:issue:`13288`)

.. _whatsnew_0190.api.perioddtype:
.. _whatsnew_0190.api.period:

``Period`` changes
^^^^^^^^^^^^^^^^^^

``PeriodIndex`` now has ``period`` dtype
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
""""""""""""""""""""""""""""""""""""""""

``PeriodIndex`` now has its own ``period`` dtype. The ``period`` dtype is a
pandas extension dtype like ``category`` or :ref:`timezone aware dtype <timeseries.timezone_series>` (``datetime64[ns, tz]``). (:issue:`13941`).
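
A minimal illustration of the new dtype:

.. ipython:: python

   pd.PeriodIndex(['2011-01', '2011-02'], freq='M').dtype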
@@ -681,7 +684,7 @@ New Behavior:
.. _whatsnew_0190.api.periodnat:

``Period('NaT')`` now returns ``pd.NaT``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
""""""""""""""""""""""""""""""""""""""""

Previously, ``Period`` had its own ``Period('NaT')`` representation, different from ``pd.NaT``. ``Period('NaT')`` has now been changed to return ``pd.NaT``. (:issue:`12759`, :issue:`13582`)
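
Since ``pd.NaT`` is a singleton, the change can be verified by identity (a minimal check):

.. ipython:: python

   pd.Period('NaT') is pd.NaT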

@@ -719,6 +722,21 @@ New Behavior:
pd.NaT + 1
pd.NaT - 1

``PeriodIndex.values`` now returns array of ``Period`` objects
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""

``.values`` is changed to return an array of ``Period`` objects, rather than
an array of ``int64`` (:issue:`13988`)

Previous Behavior:

.. code-block:: ipython

   In [6]: pi = pd.PeriodIndex(['2011-01', '2011-02'], freq='M')

   In [7]: pi.values
   Out[7]: array([492, 493])

New Behavior:

.. ipython:: python

   pi = pd.PeriodIndex(['2011-01', '2011-02'], freq='M')
   pi.values
Member:

@sinhrks is it correct to say that to get the integer values as before (if you for some reason were using it), you can access them through .asi8?

Contributor:

.asi8 is not really public.

.astype is the single method for converting types.

Contributor:

Further, the codes are a private implementation detail.
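
To make the migration concrete, a minimal sketch of the options discussed above (assuming the 0.19 semantics introduced by this PR):

.. code-block:: python

   import pandas as pd

   pi = pd.PeriodIndex(['2011-01', '2011-02'], freq='M')

   pi.values        # new behavior: object ndarray of Period
   pi.astype('i8')  # public route to the int64 ordinals (an Int64Index)
   pi.asi8          # the same integers, but effectively private API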


.. _whatsnew_0190.api.difference:

27 changes: 13 additions & 14 deletions pandas/indexes/base.py
@@ -1251,7 +1251,7 @@ def _constructor(self):
@cache_readonly
def _engine(self):
# property, for now, slow to look up
return self._engine_type(lambda: self.values, len(self))
return self._engine_type(lambda: self._values, len(self))

def _validate_index_level(self, level):
"""
@@ -1823,13 +1823,13 @@ def union(self, other):

if self.is_monotonic and other.is_monotonic:
try:
result = self._outer_indexer(self.values, other._values)[0]
result = self._outer_indexer(self._values, other._values)[0]
except TypeError:
# incomparable objects
result = list(self.values)
result = list(self._values)

# worth making this faster? a very unusual case
value_set = set(self.values)
value_set = set(self._values)
result.extend([x for x in other._values if x not in value_set])
else:
indexer = self.get_indexer(other)
@@ -1838,10 +1838,10 @@
if len(indexer) > 0:
other_diff = algos.take_nd(other._values, indexer,
allow_fill=False)
result = _concat._concat_compat((self.values, other_diff))
result = _concat._concat_compat((self._values, other_diff))

try:
self.values[0] < other_diff[0]
self._values[0] < other_diff[0]
except TypeError as e:
warnings.warn("%s, sort order is undefined for "
"incomparable objects" % e, RuntimeWarning,
@@ -1853,7 +1853,7 @@
result.sort()

else:
result = self.values
result = self._values

try:
result = np.sort(result)
@@ -1906,17 +1906,17 @@ def intersection(self, other):

if self.is_monotonic and other.is_monotonic:
try:
result = self._inner_indexer(self.values, other._values)[0]
result = self._inner_indexer(self._values, other._values)[0]
return self._wrap_union_result(other, result)
except TypeError:
pass

try:
indexer = Index(self.values).get_indexer(other._values)
indexer = Index(self._values).get_indexer(other._values)
indexer = indexer.take((indexer != -1).nonzero()[0])
except:
# duplicates
indexer = Index(self.values).get_indexer_non_unique(
indexer = Index(self._values).get_indexer_non_unique(
other._values)[0].unique()
indexer = indexer[indexer != -1]

@@ -2536,7 +2536,7 @@ def _reindex_non_unique(self, target):
missing = _ensure_platform_int(missing)
missing_labels = target.take(missing)
missing_indexer = _ensure_int64(l[~check])
cur_labels = self.take(indexer[check])._values
cur_labels = self.take(indexer[check]).values
cur_indexer = _ensure_int64(l[check])

new_labels = np.empty(tuple([len(indexer)]), dtype=object)
@@ -2556,7 +2556,7 @@
else:

# need to retake to have the same size as the indexer
indexer = indexer._values
indexer = indexer.values
indexer[~check] = 0

# reset the new indexer to account for the new size
@@ -2879,7 +2879,7 @@ def _join_monotonic(self, other, how='left', return_indexers=False):
else:
return ret_index

sv = self.values
sv = self._values
ov = other._values

if self.is_unique and other.is_unique:
@@ -3185,7 +3185,6 @@ def insert(self, loc, item):
"""
_self = np.asarray(self)
item = self._coerce_scalar_to_index(item)._values

idx = np.concatenate((_self[:loc], item, _self[loc:]))
return self._shallow_copy_with_infer(idx)

24 changes: 16 additions & 8 deletions pandas/io/pytables.py
@@ -2349,6 +2349,11 @@ def f(values, freq=None, tz=None):
return DatetimeIndex._simple_new(values, None, freq=freq,
tz=tz)
return f
elif klass == PeriodIndex:
def f(values, freq=None, tz=None):
return PeriodIndex._simple_new(values, None, freq=freq)
return f

return klass

def validate_read(self, kwargs):
@@ -2450,7 +2455,9 @@ def write_index(self, key, index):
setattr(self.attrs, '%s_variety' % key, 'regular')
converted = _convert_index(index, self.encoding,
self.format_type).set_name('index')

self.write_array(key, converted.values)

node = getattr(self.group, key)
node._v_attrs.kind = converted.kind
node._v_attrs.name = index.name
@@ -2552,12 +2559,12 @@ def read_index_node(self, node, start=None, stop=None):
kwargs['tz'] = node._v_attrs['tz']

if kind in (u('date'), u('datetime')):
index = factory(
_unconvert_index(data, kind, encoding=self.encoding),
dtype=object, **kwargs)
index = factory(_unconvert_index(data, kind,
encoding=self.encoding),
dtype=object, **kwargs)
else:
index = factory(
_unconvert_index(data, kind, encoding=self.encoding), **kwargs)
index = factory(_unconvert_index(data, kind,
encoding=self.encoding), **kwargs)

index.name = name

@@ -4377,9 +4384,10 @@ def _convert_index(index, encoding=None, format_type=None):
index_name=index_name)
elif isinstance(index, (Int64Index, PeriodIndex)):
atom = _tables().Int64Col()
return IndexCol(
index.values, 'integer', atom, freq=getattr(index, 'freq', None),
index_name=index_name)
# avoid storing an ndarray of Period objects
return IndexCol(index._values, 'integer', atom,
freq=getattr(index, 'freq', None),
index_name=index_name)

if isinstance(index, MultiIndex):
raise TypeError('MultiIndex not supported here!')
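
A hedged sketch of what the ``_convert_index`` change above means in practice (``_values`` is internal and shown for illustration only):

.. code-block:: python

   import pandas as pd

   idx = pd.period_range('2011-01', periods=2, freq='M')
   idx.values    # object ndarray of Period -- unsuitable for an Int64Col
   idx._values   # int64 ordinals -- what _convert_index now hands to IndexCol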
15 changes: 12 additions & 3 deletions pandas/tests/indexes/common.py
@@ -245,9 +245,18 @@ def test_ensure_copied_data(self):
tm.assert_numpy_array_equal(index.values, result.values,
check_same='copy')

result = index_type(index.values, copy=False, **init_kwargs)
tm.assert_numpy_array_equal(index.values, result.values,
check_same='same')
if not isinstance(index, PeriodIndex):
result = index_type(index.values, copy=False, **init_kwargs)
tm.assert_numpy_array_equal(index.values, result.values,
check_same='same')
tm.assert_numpy_array_equal(index._values, result._values,
check_same='same')
else:
# .values is an object array of Period, thus copied
result = index_type(ordinal=index.asi8, copy=False,
**init_kwargs)
tm.assert_numpy_array_equal(index._values, result._values,
check_same='same')

def test_copy_and_deepcopy(self):
from copy import copy, deepcopy
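
For context, the ``PeriodIndex`` branch above reconstructs from ordinals because ``.values`` now materializes a fresh object array on each access; a minimal sketch mirroring the test (assuming this PR's semantics):

.. code-block:: python

   import pandas as pd

   pi = pd.period_range('2011-01', periods=3, freq='M')
   # shares the underlying int64 ordinals, so a same-memory check can pass
   pi2 = pd.PeriodIndex(ordinal=pi.asi8, copy=False, freq='M')
   # pi.values builds a new object ndarray of Period each time, so
   # identity checks on .values cannot succeed for PeriodIndex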
2 changes: 1 addition & 1 deletion pandas/tests/indexes/test_datetimelike.py
@@ -781,7 +781,7 @@ def test_astype(self):
idx = period_range('1990', '2009', freq='A')
result = idx.astype('i8')
self.assert_index_equal(result, Index(idx.asi8))
self.assert_numpy_array_equal(result.values, idx.values)
self.assert_numpy_array_equal(result.values, idx.asi8)

def test_astype_raises(self):
# GH 13149, GH 13209
28 changes: 21 additions & 7 deletions pandas/tests/indexing/test_coercion.py
@@ -490,16 +490,30 @@ def test_insert_index_period(self):
self._assert_insert_conversion(obj, pd.Period('2012-01', freq='M'),
exp, 'period[M]')

# ToDo: must coerce to object?
exp = pd.PeriodIndex(['2011-01', '2012-01', '2011-02',
'2011-03', '2011-04'], freq='M')
# period + datetime64 => object
exp = pd.Index([pd.Period('2011-01', freq='M'),
pd.Timestamp('2012-01-01'),
pd.Period('2011-02', freq='M'),
pd.Period('2011-03', freq='M'),
pd.Period('2011-04', freq='M')], freq='M')
self._assert_insert_conversion(obj, pd.Timestamp('2012-01-01'),
exp, 'period[M]')
exp, np.object)

# period + int => object
msg = "Given date string not likely a datetime."
with tm.assertRaisesRegexp(ValueError, msg):
print(obj.insert(1, 1))
exp = pd.Index([pd.Period('2011-01', freq='M'),
1,
pd.Period('2011-02', freq='M'),
pd.Period('2011-03', freq='M'),
pd.Period('2011-04', freq='M')], freq='M')
self._assert_insert_conversion(obj, 1, exp, np.object)

# period + object => object
exp = pd.Index([pd.Period('2011-01', freq='M'),
'x',
pd.Period('2011-02', freq='M'),
pd.Period('2011-03', freq='M'),
pd.Period('2011-04', freq='M')], freq='M')
self._assert_insert_conversion(obj, 'x', exp, np.object)
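
In plain terms, a sketch summarizing the coercion rules these cases pin down (derived from the tests above):

.. code-block:: python

   import pandas as pd

   obj = pd.PeriodIndex(['2011-01', '2011-02', '2011-03', '2011-04'],
                        freq='M')

   obj.insert(1, pd.Period('2012-01', freq='M')).dtype  # period[M] is kept
   obj.insert(1, pd.Timestamp('2012-01-01')).dtype      # coerced to object
   obj.insert(1, 1).dtype                               # coerced to object
   obj.insert(1, 'x').dtype                             # coerced to object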


class TestWhereCoercion(CoercionBase, tm.TestCase):
4 changes: 2 additions & 2 deletions pandas/tests/indexing/test_indexing.py
@@ -4137,8 +4137,8 @@ def test_series_partial_set_period(self):
idx = pd.period_range('2011-01-01', '2011-01-02', freq='D', name='idx')
ser = Series([0.1, 0.2], index=idx, name='s')

result = ser.loc[[pd.Period('2011-01-01', freq='D'), pd.Period(
'2011-01-02', freq='D')]]
result = ser.loc[[pd.Period('2011-01-01', freq='D'),
pd.Period('2011-01-02', freq='D')]]
exp = Series([0.1, 0.2], index=idx, name='s')
tm.assert_series_equal(result, exp, check_index_type=True)

2 changes: 1 addition & 1 deletion pandas/tests/test_base.py
@@ -393,7 +393,7 @@ def test_ops(self):
if not isinstance(o, PeriodIndex):
expected = getattr(o.values, op)()
else:
expected = pd.Period(ordinal=getattr(o.values, op)(),
expected = pd.Period(ordinal=getattr(o._values, op)(),
freq=o.freq)
try:
self.assertEqual(result, expected)
2 changes: 1 addition & 1 deletion pandas/tseries/base.py
@@ -323,7 +323,7 @@ def sort_values(self, return_indexer=False, ascending=True):
sorted_index = self.take(_as)
return sorted_index, _as
else:
sorted_values = np.sort(self.values)
sorted_values = np.sort(self._values)
attribs = self._get_attributes_dict()
freq = attribs['freq']

6 changes: 3 additions & 3 deletions pandas/tseries/converter.py
@@ -141,11 +141,11 @@ def convert(values, units, axis):
is_float(values)):
return get_datevalue(values, axis.freq)
if isinstance(values, PeriodIndex):
return values.asfreq(axis.freq).values
return values.asfreq(axis.freq)._values
if isinstance(values, Index):
return values.map(lambda x: get_datevalue(x, axis.freq))
if is_period_arraylike(values):
return PeriodIndex(values, freq=axis.freq).values
return PeriodIndex(values, freq=axis.freq)._values
if isinstance(values, (list, tuple, np.ndarray, Index)):
return [get_datevalue(x, axis.freq) for x in values]
return values
@@ -518,7 +518,7 @@ def _daily_finder(vmin, vmax, freq):
info = np.zeros(span,
dtype=[('val', np.int64), ('maj', bool),
('min', bool), ('fmt', '|S20')])
info['val'][:] = dates_.values
info['val'][:] = dates_._values
info['fmt'][:] = ''
info['maj'][[0, -1]] = True
# .. and set some shortcuts