ENH: pd.Series.shift and .diff to accept a collection of numbers #44660
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -186,6 +186,38 @@ representation of :class:`DataFrame` objects (:issue:`4889`). | |
df | ||
df.to_dict(orient='tight') | ||
|
||
.. _whatsnew_140.enhancements.shift: | ||
|
||
DataFrame.shift and Series.shift now accept an iterable for parameter ``periods`` and a new parameter ``suffix`` | |
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
||
The :meth:`DataFrame.shift` and :meth:`Series.shift` methods now accept an iterable, such as a list, for the ``periods`` parameter. When an iterable is passed, | |
the method returns a :class:`DataFrame` containing one shifted copy of the data per element of the iterable, concatenated column-wise. | |
Each column keeps its original name with a suffix of the form ``<name>_<num>``, where ``<name>`` is the original | |
column name and ``<num>`` is the corresponding element of ``periods``. The new ``suffix`` parameter replaces the default ``_<num>`` suffix | |
with a custom one (:issue:`44424`). | |
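On a pandas version without this change, the default naming described above can be approximated with existing public API. The following is a minimal sketch rather than the PR's implementation; `shift_multiple` is a hypothetical helper name, and it covers only the default ``<name>_<num>`` naming.

```python
import pandas as pd


def shift_multiple(df: pd.DataFrame, periods) -> pd.DataFrame:
    """Hypothetical helper: one shifted copy of ``df`` per entry in ``periods``."""
    frames = []
    for n in periods:
        # Suffix every column with the period so the copies stay distinguishable.
        frames.append(df.shift(n).add_suffix(f"_{n}"))
    # Place the shifted copies side by side.
    return pd.concat(frames, axis=1)


df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
result = shift_multiple(df, [0, 1, 2])
print(result.columns.tolist())
# ['a_0', 'b_0', 'a_1', 'b_1', 'a_2', 'b_2']
```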
|
||
Usage within the :class:`DataFrame` class: | ||
|
||
.. ipython:: python | ||
Review comment: this is fine, but the main source of information is the doc-strings, which need updating with examples and prose.
||
|
||
df = pd.DataFrame({ | ||
'a': [1, 2, 3], | ||
'b': [4, 5, 6] | ||
}) | ||
shifts = [0, 1, 2] | ||
df.shift(shifts) | ||
|
||
Usage within the :class:`Series` class: | ||
|
||
.. ipython:: python | ||
|
||
ser = pd.Series([1, 2, 3]) | ||
shifts = [0, 1, 2] | ||
|
||
ser.shift(shifts) | ||
|
||
.. _whatsnew_140.enhancements.other: | ||
|
||
Other enhancements | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -5326,9 +5326,38 @@ def shift( | |
freq: Frequency | None = None, | ||
axis: Axis = 0, | ||
fill_value=lib.no_default, | ||
suffix=None, | ||
) -> DataFrame: | ||
axis = self._get_axis_number(axis) | ||
|
||
# GH#44424 Handle the case of multiple shifts | ||
Review comment: this comment is not needed
Review comment: if axis==1 this should fail
||
if is_list_like(periods): | ||
|
||
new_df = DataFrame() | ||
Review comment: this is superfluous
||
|
||
from pandas.core.reshape.concat import concat | ||
|
||
new_df_list = [] | ||
Review comment: use result and not new_df
||
|
||
for i in periods: | ||
|
||
if not isinstance(i, int): | ||
|
||
raise TypeError( | ||
f"Value {i} in periods is not an integer, expected an integer" | ||
Review comment: does this have tests?
||
) | ||
|
||
new_df_list.append( | ||
super() | ||
.shift(periods=i, freq=freq, axis=axis, fill_value=fill_value) | ||
.add_suffix(f"_{i}" if suffix is None else suffix) | ||
) | ||
|
||
new_df = concat(new_df_list, axis=1) | ||
|
||
if new_df.empty: | ||
return self | ||
|
||
return new_df | ||
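The review comments on this hunk (drop the placeholder frame, name the output `result`, fail for `axis=1`) could be addressed along the following lines. This is a sketch of one possible restructuring, written as a free function so that it runs on its own; `shift_many` and both `ValueError` messages are assumptions, not part of the PR. It also appends the period to a custom suffix, because the hunk above applies the same `suffix` to every shifted copy, which repeats column names.

```python
import pandas as pd


def shift_many(df, periods, axis=0, suffix=None):
    """Hypothetical stand-in for the list-like branch of DataFrame.shift."""
    periods = list(periods)
    if not periods:
        raise ValueError("periods must contain at least one integer")
    if axis != 0:
        raise ValueError("a list-like periods is only supported for axis=0")

    result = []
    for n in periods:
        if not isinstance(n, int):
            raise TypeError(
                f"Value {n} in periods is not an integer, expected an integer"
            )
        # Keep names unique even when a custom suffix is supplied.
        label = f"_{n}" if suffix is None else f"{suffix}_{n}"
        result.append(df.shift(n).add_suffix(label))
    return pd.concat(result, axis=1)


df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
lagged = shift_many(df, [1, 2], suffix="_lag")
print(lagged.columns.tolist())
# ['a_lag_1', 'b_lag_1', 'a_lag_2', 'b_lag_2']

# Reusing one suffix for every period, as in the hunk above, repeats names.
same = pd.concat([df.shift(n).add_suffix("_lag") for n in (1, 2)], axis=1)
print(same.columns.tolist())
# ['a_lag', 'b_lag', 'a_lag', 'b_lag']
```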
|
||
ncols = len(self.columns) | ||
if axis == 1 and periods != 0 and fill_value is lib.no_default and ncols > 0: | ||
# We will infer fill_value to match the closest column | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -9,6 +9,7 @@ | |
TYPE_CHECKING, | ||
Any, | ||
Callable, | ||
Collection, | ||
Hashable, | ||
Iterable, | ||
Literal, | ||
|
@@ -4938,7 +4939,18 @@ def _replace_single(self, to_replace, method: str, inplace: bool, limit): | |
|
||
# error: Cannot determine type of 'shift' | ||
@doc(NDFrame.shift, klass=_shared_doc_kwargs["klass"]) # type: ignore[has-type] | ||
def shift(self, periods=1, freq=None, axis=0, fill_value=None) -> Series: | ||
def shift( | ||
self, periods: int | Collection[int] = 1, freq=None, axis=0, fill_value=None | ||
) -> Series | DataFrame: | ||
# Handle the case of multiple shifts | ||
if is_list_like(periods): | ||
if len(periods) == 0: | ||
Review comment: is this tested? this should be an error i think
||
return self | ||
|
||
df = self.to_frame() | ||
Review comment: why are you doing this? shouldn't you just be concatting the result series?
||
|
||
return df.shift(periods, freq=freq, axis=axis, fill_value=fill_value) | ||
|
||
return super().shift( | ||
periods=periods, freq=freq, axis=axis, fill_value=fill_value | ||
) | ||
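A reviewer asks why the Series is converted with `to_frame()` instead of concatenating the shifted Series directly. A minimal sketch of that alternative follows; `shift_series_many` is a hypothetical name, raising on empty input follows the reviewer's suggestion, and falling back to `0` for an unnamed Series is an assumption chosen to match the column labels expected by the Series test in this PR.

```python
import pandas as pd


def shift_series_many(ser: pd.Series, periods) -> pd.DataFrame:
    """Hypothetical helper: concatenate shifted copies of ``ser`` column-wise."""
    periods = list(periods)
    if not periods:
        raise ValueError("periods must contain at least one integer")
    # An unnamed Series falls back to 0, the label to_frame() would assign.
    base = 0 if ser.name is None else ser.name
    shifted = [ser.shift(n) for n in periods]
    labels = [f"{base}_{n}" for n in periods]
    # With axis=1, ``keys`` become the column labels of the result.
    return pd.concat(shifted, axis=1, keys=labels)


ser = pd.Series([1, 2, 3])
out = shift_series_many(ser, [0, 1, 2])
print(out.columns.tolist())
# ['0_0', '0_1', '0_2']
```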
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -429,3 +429,23 @@ def test_shift_axis1_categorical_columns(self): | |
columns=ci, | ||
) | ||
tm.assert_frame_equal(result, expected) | ||
|
||
def test_shift_with_iterable(self): | ||
# GH#44424 | ||
Review comment: what about error conditions
||
data = {"a": [1, 2, 3], "b": [4, 5, 6]} | ||
shifts = [0, 1, 2] | ||
|
||
df = DataFrame(data) | ||
shifted = df.shift(shifts) | ||
|
||
expected = DataFrame( | ||
{ | ||
"a_0": [1, 2, 3], | ||
"b_0": [4, 5, 6], | ||
"a_1": [np.NaN, 1.0, 2.0], | ||
"b_1": [np.NaN, 4.0, 5.0], | ||
"a_2": [np.NaN, np.NaN, 1.0], | ||
"b_2": [np.NaN, np.NaN, 4.0], | ||
} | ||
) | ||
tm.assert_frame_equal(expected, shifted) |
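The reviewer's question about error conditions could be answered with tests such as the following. Because the list-like branch exists only on the PR branch, this sketch exercises a small stand-in, `_check_periods` (hypothetical), that copies the integer check and message from the `DataFrame.shift` hunk; in the real test suite the calls would presumably be `df.shift(...)` instead.

```python
import pytest


def _check_periods(periods):
    """Hypothetical stand-in mirroring the integer check in the hunk."""
    for i in periods:
        if not isinstance(i, int):
            raise TypeError(
                f"Value {i} in periods is not an integer, expected an integer"
            )


@pytest.mark.parametrize("bad", [[0, "a"], [1.5], [None]])
def test_shift_with_iterable_non_integer_raises(bad):
    # Any non-integer entry should trigger the TypeError from the hunk.
    msg = "in periods is not an integer, expected an integer"
    with pytest.raises(TypeError, match=msg):
        _check_periods(bad)


def test_shift_with_iterable_integers_pass():
    # Valid input should not raise.
    _check_periods([0, 1, 2])
```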
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -14,6 +14,7 @@ | |
offsets, | ||
) | ||
import pandas._testing as tm | ||
from pandas.core.frame import DataFrame | ||
|
||
from pandas.tseries.offsets import BDay | ||
|
||
|
@@ -376,3 +377,15 @@ def test_shift_non_writable_array(self, input_data, output_data): | |
expected = Series(output_data, dtype="float64") | ||
|
||
tm.assert_series_equal(result, expected) | ||
|
||
def test_shift_with_iterable(self): | ||
# GH#44424 | ||
Review comment: same question
||
ser = Series([1, 2, 3]) | ||
shifts = [0, 1, 2] | ||
|
||
shifted = ser.shift(shifts) | ||
expected = DataFrame( | ||
{"0_0": [1, 2, 3], "0_1": [np.NaN, 1.0, 2.0], "0_2": [np.NaN, np.NaN, 1.0]} | ||
) | ||
|
||
tm.assert_frame_equal(expected, shifted) |
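The expected labels `0_0`, `0_1` and `0_2` follow from the `to_frame()` call in the `Series.shift` hunk: an unnamed Series becomes a frame whose single column is labelled `0`, and `add_suffix` appends to the string form of that label. The short check below uses only existing pandas API; the `name="x"` variant is added for illustration.

```python
import pandas as pd

unnamed = pd.Series([1, 2, 3])
unnamed_cols = unnamed.to_frame().columns.tolist()
print(unnamed_cols)
# [0]

suffixed_cols = unnamed.to_frame().add_suffix("_1").columns.tolist()
print(suffixed_cols)
# ['0_1']

named = pd.Series([1, 2, 3], name="x")
named_cols = named.to_frame().add_suffix("_1").columns.tolist()
print(named_cols)
# ['x_1']
```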
Review comment: you don't need this sentence to the end. you can say that the ``suffix`` parameter controls the names but don't have to fully explain here. that's the point of the doc-string.