Skip to content

API: Make .shift always copy (Fixes #22397) #22517

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Sep 15, 2018
1 change: 1 addition & 0 deletions doc/source/whatsnew/v0.24.0.txt
Original file line number Diff line number Diff line change
Expand Up @@ -501,6 +501,7 @@ Other API Changes
- :class:`Index` subtraction will attempt to operate element-wise instead of raising ``TypeError`` (:issue:`19369`)
- :class:`pandas.io.formats.style.Styler` supports a ``number-format`` property when using :meth:`~pandas.io.formats.style.Styler.to_excel` (:issue:`22015`)
- :meth:`DataFrame.corr` and :meth:`Series.corr` now raise a ``ValueError`` along with a helpful error message instead of a ``KeyError`` when supplied with an invalid method (:issue:`22298`)
- :meth:`shift` will now always return a copy, instead of the previous behaviour of returning self when shifting by 0 (:issue:`22397`)

.. _whatsnew_0240.deprecations:

Expand Down
2 changes: 1 addition & 1 deletion pandas/core/arrays/datetimelike.py
Original file line number Diff line number Diff line change
Expand Up @@ -548,7 +548,7 @@ def shift(self, n, freq=None):

if n == 0:
# immutable so OK
return self
return self.copy()

if self.freq is None:
raise NullFrequencyError("Cannot shift with no freq")
Expand Down
2 changes: 1 addition & 1 deletion pandas/core/generic.py
Original file line number Diff line number Diff line change
Expand Up @@ -8165,7 +8165,7 @@ def mask(self, cond, other=np.nan, inplace=False, axis=None, level=None,
@Appender(_shared_docs['shift'] % _shared_doc_kwargs)
def shift(self, periods=1, freq=None, axis=0):
if periods == 0:
return self
return self.copy()

block_axis = self._get_block_manager_axis(axis)
if freq is None:
Expand Down
19 changes: 19 additions & 0 deletions pandas/tests/generic/test_series.py
Original file line number Diff line number Diff line change
Expand Up @@ -227,3 +227,22 @@ def test_valid_deprecated(self):
# GH18800
with tm.assert_produces_warning(FutureWarning):
pd.Series([]).valid()

@pytest.mark.parametrize("s", [
Series([np.arange(5)]),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add period & timedelta here as well :<

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we dont really have a great place for these series / index tests functionailty, except for test_base which is already too big.

cc @jbrockmendel @mroeschke

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ofc, will do asap, thanks for the input

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay on this, but can you please clarify? I wasn't sure if it was expected that we can shift Period / Timedelta's or if we're supposed to shift by them, things I've tested below:

Shifting on them:

>>> pd.Timedelta('1 days 2 hours').shift
Traceback (most recent call last):
  File "/anaconda3/envs/pandas-dev/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2963, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-17-fc984c065985>", line 1, in <module>
    pd.Timedelta('1 days 2 hours').shift
AttributeError: 'Timedelta' object has no attribute 'shift'
>>> pd.Period('2011-01', 'M').shift
Traceback (most recent call last):
  File "/anaconda3/envs/pandas-dev/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2963, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-18-d3c2e47fd2e3>", line 1, in <module>
    pd.Period('2011-01', 'M').shift
AttributeError: 'Period' object has no attribute 'shift'

Shifting with them:

>>> pd.date_range('1/1/2011', periods=24, freq='H').shift(pd.Timedelta('1 days 2 hours'))
Traceback (most recent call last):
  File "/anaconda3/envs/pandas-dev/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2963, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-19-f8499141727f>", line 1, in <module>
    pd.date_range('1/1/2011', periods=24, freq='H').shift(pd.Timedelta('1 days 2 hours'))
  File "/anaconda3/envs/pandas-dev/lib/python3.6/site-packages/pandas/core/indexes/datetimelike.py", line 784, in shift
    start = self[0] + n * self.freq
  File "/anaconda3/envs/pandas-dev/lib/python3.6/site-packages/pandas/tseries/offsets.py", line 425, in __rmul__
    return self.__mul__(someInt)
  File "/anaconda3/envs/pandas-dev/lib/python3.6/site-packages/pandas/tseries/offsets.py", line 422, in __mul__
    **self.kwds)
  File "/anaconda3/envs/pandas-dev/lib/python3.6/site-packages/pandas/tseries/offsets.py", line 195, in __init__
    self.n = int(n)
TypeError: int() argument must be a string, a bytes-like object or a number, not 'Timedelta'
>>> pd.date_range('1/1/2011', periods=24, freq='H').shift(pd.Period('2011-02', 'M'))
Traceback (most recent call last):
  File "/anaconda3/envs/pandas-dev/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2963, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-24-b87cfa9cba95>", line 1, in <module>
    pd.date_range('1/1/2011', periods=24, freq='H').shift(pd.Period('2011-02', 'M'))
  File "/anaconda3/envs/pandas-dev/lib/python3.6/site-packages/pandas/core/indexes/datetimelike.py", line 784, in shift
    start = self[0] + n * self.freq
  File "/anaconda3/envs/pandas-dev/lib/python3.6/site-packages/pandas/tseries/offsets.py", line 425, in __rmul__
    return self.__mul__(someInt)
  File "/anaconda3/envs/pandas-dev/lib/python3.6/site-packages/pandas/tseries/offsets.py", line 421, in __mul__
    return self.__class__(n=someInt * self.n, normalize=self.normalize,
TypeError: unsupported operand type(s) for *: 'Period' and 'int'

Genuine apologies if I'm being stupid here!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the idea is to make a series with a datetimeindex and shift using the freq argument.

In [16]: s = pd.Series(range(5), index=pd.date_range("2017", periods=5))

In [17]: s.shift(freq=pd.Timedelta('1D'))
Out[17]:
2017-01-02    0
2017-01-03    1
2017-01-04    2
2017-01-05    3
2017-01-06    4
Freq: D, dtype: int64

it'll complicate the test a bit. let us know if you need assistance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the test for timedelta, I couldn't find any docs on the intended usage of shifting by pd.Period? I tried a few approaches but couldn't get any to work. Happy to add the test if you could provide a small example!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if period is valid there. I only tried timedelta.

pd.date_range('1/1/2011', periods=24, freq='H'),
pd.Series(range(5), index=pd.date_range("2017", periods=5))
])
@pytest.mark.parametrize("shift_size", [0, 1, 2])
def test_shift_always_copy(self, s, shift_size):
# GH22397
assert s.shift(shift_size) is not s

@pytest.mark.parametrize("move_by_freq", [
pd.Timedelta('1D'),
pd.Timedelta('1M'),
])
def test_datetime_shift_always_copy(self, move_by_freq):
# GH22397
s = pd.Series(range(5), index=pd.date_range("2017", periods=5))
assert s.shift(freq=move_by_freq) is not s