-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
API: Make .shift always copy (Fixes #22397) #22517
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
API: Make .shift always copy (Fixes #22397) #22517
Conversation
Codecov Report
@@ Coverage Diff @@
## master #22517 +/- ##
==========================================
- Coverage 92.04% 92.03% -0.01%
==========================================
Files 169 169
Lines 50787 50780 -7
==========================================
- Hits 46746 46737 -9
- Misses 4041 4043 +2
Continue to review full report at Codecov.
|
pandas/tests/generic/test_series.py
Outdated
# GH22397 | ||
s = Series([1, 2, 3]) | ||
assert s.shift(0) is not s | ||
assert s.shift(1) is not s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we parametrize this? (I know it's only two examples, but why not three for example?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems fine to be as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
We should confirm that we're OK with outright changing the behavior, rather than deprecating it first. Deprecation would require adding a keyword that would then become obsolete, which I would rather avoid.
pandas/tests/generic/test_series.py
Outdated
# GH22397 | ||
s = Series([1, 2, 3]) | ||
assert s.shift(0) is not s | ||
assert s.shift(1) is not s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems fine to be as is.
pandas/tests/generic/test_series.py
Outdated
|
||
def test_shift_always_copy(self): | ||
# GH22397 | ||
s = Series([1, 2, 3]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can u also test for datetime types
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done (assuming DatetimeIndex was meant, which required another fix) if you just meant a series of datetimes, will change - let me know
@@ -227,3 +227,12 @@ def test_valid_deprecated(self): | |||
# GH18800 | |||
with tm.assert_produces_warning(FutureWarning): | |||
pd.Series([]).valid() | |||
|
|||
@pytest.mark.parametrize("s", [ | |||
Series([np.arange(5)]), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add period & timedelta here as well :<
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we dont really have a great place for these series / index tests functionailty, except for test_base which is already too big.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ofc, will do asap, thanks for the input
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay on this, but can you please clarify? I wasn't sure if it was expected that we can shift Period / Timedelta's or if we're supposed to shift by them, things I've tested below:
Shifting on them:
>>> pd.Timedelta('1 days 2 hours').shift
Traceback (most recent call last):
File "/anaconda3/envs/pandas-dev/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2963, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-17-fc984c065985>", line 1, in <module>
pd.Timedelta('1 days 2 hours').shift
AttributeError: 'Timedelta' object has no attribute 'shift'
>>> pd.Period('2011-01', 'M').shift
Traceback (most recent call last):
File "/anaconda3/envs/pandas-dev/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2963, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-18-d3c2e47fd2e3>", line 1, in <module>
pd.Period('2011-01', 'M').shift
AttributeError: 'Period' object has no attribute 'shift'
Shifting with them:
>>> pd.date_range('1/1/2011', periods=24, freq='H').shift(pd.Timedelta('1 days 2 hours'))
Traceback (most recent call last):
File "/anaconda3/envs/pandas-dev/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2963, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-19-f8499141727f>", line 1, in <module>
pd.date_range('1/1/2011', periods=24, freq='H').shift(pd.Timedelta('1 days 2 hours'))
File "/anaconda3/envs/pandas-dev/lib/python3.6/site-packages/pandas/core/indexes/datetimelike.py", line 784, in shift
start = self[0] + n * self.freq
File "/anaconda3/envs/pandas-dev/lib/python3.6/site-packages/pandas/tseries/offsets.py", line 425, in __rmul__
return self.__mul__(someInt)
File "/anaconda3/envs/pandas-dev/lib/python3.6/site-packages/pandas/tseries/offsets.py", line 422, in __mul__
**self.kwds)
File "/anaconda3/envs/pandas-dev/lib/python3.6/site-packages/pandas/tseries/offsets.py", line 195, in __init__
self.n = int(n)
TypeError: int() argument must be a string, a bytes-like object or a number, not 'Timedelta'
>>> pd.date_range('1/1/2011', periods=24, freq='H').shift(pd.Period('2011-02', 'M'))
Traceback (most recent call last):
File "/anaconda3/envs/pandas-dev/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2963, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-24-b87cfa9cba95>", line 1, in <module>
pd.date_range('1/1/2011', periods=24, freq='H').shift(pd.Period('2011-02', 'M'))
File "/anaconda3/envs/pandas-dev/lib/python3.6/site-packages/pandas/core/indexes/datetimelike.py", line 784, in shift
start = self[0] + n * self.freq
File "/anaconda3/envs/pandas-dev/lib/python3.6/site-packages/pandas/tseries/offsets.py", line 425, in __rmul__
return self.__mul__(someInt)
File "/anaconda3/envs/pandas-dev/lib/python3.6/site-packages/pandas/tseries/offsets.py", line 421, in __mul__
return self.__class__(n=someInt * self.n, normalize=self.normalize,
TypeError: unsupported operand type(s) for *: 'Period' and 'int'
Genuine apologies if I'm being stupid here!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the idea is to make a series with a datetimeindex and shift using the freq
argument.
In [16]: s = pd.Series(range(5), index=pd.date_range("2017", periods=5))
In [17]: s.shift(freq=pd.Timedelta('1D'))
Out[17]:
2017-01-02 0
2017-01-03 1
2017-01-04 2
2017-01-05 3
2017-01-06 4
Freq: D, dtype: int64
it'll complicate the test a bit. let us know if you need assistance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added the test for timedelta, I couldn't find any docs on the intended usage of shifting by pd.Period? I tried a few approaches but couldn't get any to work. Happy to add the test if you could provide a small example!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if period is valid there. I only tried timedelta.
Hello @AaronCritchley! Thanks for updating the PR.
|
You'll need to fetch and merge master for the CI failures to be fixed. |
thanks @AaronCritchley |
git diff upstream/master -u -- "*.py" | flake8 --diff