-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: Period resample with length=0 doesn't set freq #13067
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
related to any of: #12871 ? |
Related to #12868. It must change CC: @MaximilianR |
Thanks for spotting this @sinhrks. This was my change so mea culpa. Had a quick look and there are a few odd things going on:
In [34]: pd.Series(range(4)).reindex(pd.PeriodIndex([],freq='M'))
Out[34]: Series([], Freq: M, dtype: int64)
# works
In [43]: pd.Series([], index=pd.PeriodIndex([], freq='D')).reindex(pd.PeriodIndex([], freq='M'))
Out[43]: Series([], Freq: D, dtype: float64)
# freq is still 'D' - doesn't work ...if this worked, the current resampler implementation would work as expected
In [6]: s = pd.Series(index=pd.PeriodIndex([], freq='D'))
In [9]: resampler=s.resample('M')
In [13]: resampler._get_new_index()
Out[13]: PeriodIndex([], dtype='float64', freq='M') ...not sure if this is having any real consequences though Would be better to have these root issues solved rather than add a fix for the symptoms here, but a fix for the symptoms is better than nothing. |
Current coverage is 84.14%@@ master #13067 diff @@
==========================================
Files 137 137
Lines 50227 50230 +3
Methods 0 0
Messages 0 0
Branches 0 0
==========================================
+ Hits 42256 42264 +8
+ Misses 7971 7966 -5
Partials 0 0
|
@MaximilianR I think
|
Agreed |
I removed a bunch of the if/thens and generally cleared some stuff up, including solving the issues above. The thrust is to solve the root causes of why empty indexes need to be handled differently, rather than adding more special casing @sinhrks do you want to use this PR for this, or do you want me to start another? Assuming you think it's an improvement From f1e9a6eaa8608cc21a6ca52d5ce6888ced7f4181 Mon Sep 17 00:00:00 2001
From: Maximilian Roos
Date: Tue, 3 May 2016 21:18:21 -0400
Subject: [PATCH] Lots of clearing up PeriodIndex resampling behaviour
---
pandas/core/groupby.py | 2 +-
pandas/tseries/period.py | 2 +-
pandas/tseries/resample.py | 18 ++++++------------
pandas/tseries/tests/test_resample.py | 4 +++-
4 files changed, 11 insertions(+), 15 deletions(-)
diff --git a/pandas/core/groupby.py b/pandas/core/groupby.py
index 7a47911..3d197ac 100644
--- a/pandas/core/groupby.py
+++ b/pandas/core/groupby.py
@@ -2669,7 +2669,7 @@ class SeriesGroupBy(GroupBy):
def _wrap_applied_output(self, keys, values, not_indexed_same=False):
if len(keys) == 0:
# GH #6265
- return Series([], name=self.name)
+ return Series([], name=self.name, index=Index(keys))
def _get_index():
if self.grouper.nkeys > 1:
diff --git a/pandas/tseries/period.py b/pandas/tseries/period.py
index 478b255..61e0e85 100644
--- a/pandas/tseries/period.py
+++ b/pandas/tseries/period.py
@@ -271,7 +271,7 @@ class PeriodIndex(DatelikeOps, DatetimeIndexOpsMixin, Int64Index):
@classmethod
def _simple_new(cls, values, name=None, freq=None, **kwargs):
if not getattr(values, 'dtype', None):
- values = np.array(values, copy=False)
+ values = np.array(values, copy=False, dtype='int64')
if is_object_dtype(values):
return PeriodIndex(values, name=name, freq=freq, **kwargs)
diff --git a/pandas/tseries/resample.py b/pandas/tseries/resample.py
index 167c6c1..2240ab6 100644
--- a/pandas/tseries/resample.py
+++ b/pandas/tseries/resample.py
@@ -789,19 +789,17 @@ class PeriodIndexResampler(DatetimeIndexResampler):
ax = self.ax
new_index = self._get_new_index()
- if len(new_index) == 0:
- result = self._selected_obj
- if isinstance(self._selected_obj.index, PeriodIndex):
- result = result.asfreq(self.freq, how=self.convention)
- return self._wrap_result(result.reindex(new_index))
# Start vs. end of period
memb = ax.asfreq(self.freq, how=self.convention)
if is_subperiod(ax.freq, self.freq):
# Downsampling
- rng = np.arange(memb.values[0], memb.values[-1] + 1)
- bins = memb.searchsorted(rng, side='right')
+ if len(new_index) == 0:
+ bins = []
+ else:
+ rng = np.arange(memb.values[0], memb.values[-1] + 1)
+ bins = memb.searchsorted(rng, side='right')
grouper = BinGrouper(bins, new_index)
return self._groupby_and_aggregate(how, grouper=grouper)
elif is_superperiod(ax.freq, self.freq):
@@ -811,8 +809,7 @@ class PeriodIndexResampler(DatetimeIndexResampler):
raise ValueError('Frequency {axfreq} cannot be '
'resampled to {freq}'.format(
- axfreq=ax.freq,
- freq=self.freq))
+ axfreq=ax.freq, freq=self.freq))
def _upsample(self, method, limit=None):
"""
@@ -835,9 +832,6 @@ class PeriodIndexResampler(DatetimeIndexResampler):
obj = self.obj
new_index = self._get_new_index()
- if len(new_index) == 0:
- return self._wrap_result(self._selected_obj.reindex(new_index))
-
# Start vs. end of period
memb = ax.asfreq(self.freq, how=self.convention)
diff --git a/pandas/tseries/tests/test_resample.py b/pandas/tseries/tests/test_resample.py
index c796277..346c546 100644
--- a/pandas/tseries/tests/test_resample.py
+++ b/pandas/tseries/tests/test_resample.py
@@ -2053,9 +2053,11 @@ class TestPeriodIndex(Base, tm.TestCase):
for method in resample_methods:
result = getattr(s.resample(freq), method)()
- assert_series_equal(result, expected)
+ if method!='ohlc': # returns df
+ assert_series_equal(result, expected, check_dtype=False)
self.assertEqual(result.index.freq, freq)
+
def test_resample_count(self):
# GH12774
--
2.6.3 |
If yours includes the same fix in a better way, please issue another PR. I'll close this. |
superseded by #13079 |
closes #13067 closes #13212 Author: Maximilian Roos <[email protected]> Closes #13079 from MaximilianR/period_resample_0 and squashes the following commits: 8c7b9db [Maximilian Roos] empty PeriodIndex issues
git diff upstream/master | flake8 --diff
given freq is not applied if data is empty.
Expected