Skip to content

ENH: pd.Series.shift and .diff to accept a collection of numbers #44660

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 14 commits into from

Conversation

skwirskj
Copy link
Contributor

@skwirskj skwirskj commented Nov 28, 2021

…e shift functions to allow iterables to be passed as the period
…ake in a list for periods and then return a concatenated DataFrame of each consecutive shift in the periods list
…llection

Merging the master branch with my changes for Enhancement at issue 44424
Created two new tests to test shifting data in Series and DataFrames with an iterable as the period parameter.
…llection

Merging after updating tests and resetting the generic.py file
Reverted some changes that were made when originally finding a solution and then reset it to the latest version on the upstream master branch.
@jreback jreback added Enhancement Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels Nov 29, 2021
Updated how we are creating the new DataFrame after shifting to improve performance. Also reverted core/generic.py to master
Attempting to fix whatsnew rst file
@skwirskj
Copy link
Contributor Author

skwirskj commented Nov 30, 2021

@phofl @jreback I noticed that I am failing a check in CI/Checks. It is due to Series.shift expecting a series as output, but the issue requested that in the case of an itterable being passed for the period parameter, that Series.shift returns a DataFrame. Is there some way to get around this?


Usage within the :class:`DataFrame` class:

.. ipython:: python
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is fine, but the main source of information is the doc-strings which need updating with examples & prose.


The :meth:`DataFrame.shift` and :meth:`Series.shift` functions can take in an iterable, such as a list, for the period parameter. When an iterable is passed
to either function it returns a :class:`DataFrame` object with all of the shifted rows or columns concatenated with one another.
The function applies a shift designated by each element in the iterable. The resulting :class:`DataFrame` object's columns will retain the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you don't need this sentence to the end. you can say that the suffix parameter controls the names but don't have to fully explain here. that's the point of the doc-string.

) -> DataFrame:
axis = self._get_axis_number(axis)

# GH#44424 Handle the case of multiple shifts
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this comment is not needed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if axis==1 this should fail

for i in periods:
if not isinstance(i, int):
raise TypeError(
f"Value {i} in periods is not an integer, expected an integer"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this have tests?

# GH#44424 Handle the case of multiple shifts
if is_list_like(periods):

new_df = DataFrame()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is superfluous


from pandas.core.reshape.concat import concat

new_df_list = []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use result and not new_df

) -> Series | DataFrame:
# Handle the case of multiple shifts
if is_list_like(periods):
if len(periods) == 0:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this tested? this should be an error i think

if len(periods) == 0:
return self

df = self.to_frame()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are you doing this? shouldn't you just be concatting the result series?

@@ -429,3 +429,23 @@ def test_shift_axis1_categorical_columns(self):
columns=ci,
)
tm.assert_frame_equal(result, expected)

def test_shift_with_iterable(self):
# GH#44424
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about error conditions

@@ -376,3 +377,15 @@ def test_shift_non_writable_array(self, input_data, output_data):
expected = Series(output_data, dtype="float64")

tm.assert_series_equal(result, expected)

def test_shift_with_iterable(self):
# GH#44424
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same question

@github-actions
Copy link
Contributor

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

@github-actions github-actions bot added the Stale label Jan 11, 2022
@jreback
Copy link
Contributor

jreback commented Jan 16, 2022

closing as stale if you want continue, pls ping

@jreback jreback closed this Jan 16, 2022
@jona-sassenhagen
Copy link
Contributor

jona-sassenhagen commented Jul 13, 2023

@jreback @skwirskj I am considering taking over this PR, would that be ok with you?

@jreback
Copy link
Contributor

jreback commented Jul 13, 2023

sure

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement Reshaping Concat, Merge/Join, Stack/Unstack, Explode Stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ENH: pd.Series.shift and .diff to accept a collection of numbers
6 participants