Series pct_change fill_method behavior #25291
Closed
31 commits
0e4e1c2
Add skipna to pct_change (#25006)
4be1bdc
Add tests
192bded
Fix PEP8 issues
bb74285
Fix PEP8 issue
4418bf1
Fix test
3670ffe
Fix test
8f36c7a
Fix tests
add18de
Fix linting
59eab18
Merge branch 'master' into fix-25006
279f433
Set default skipna=True
4072ca0
Use pytest.raises
a016d8a
Merge branch 'master' into fix-25006
80a09c9
Address requested changes
1bf00f8
Fix tests passing periods as kwarg
9208f61
Merge branch 'master' into fix-25006
fd2cdf8
Merge branch 'master' into fix-25006
66cc4a4
Add whatsnew note
14c7a05
Fix for the case axis=1
932fc66
Address requested changes
ed86a7b
Replace None with np.nan
a1ca0ca
Replace DataFrame with ABCDataFrame
efefaf6
Merge branch 'master' into fix-25006
1e854ed
Merge branch 'master' into fix-25006
84c036a
Merge remote-tracking branch 'upstream/master' into fix-25006
WillAyd 1acee7c
blackify
WillAyd 764846d
Changed whatsnew
WillAyd 7184698
Updated versionadded
WillAyd 3821857
Signature fixup
WillAyd a2be8f6
test failure fixup
WillAyd e456c6b
Merge remote-tracking branch 'upstream/master' into fix-25006
WillAyd fd18d04
docstring for skipna=False
WillAyd
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -9993,6 +9993,10 @@ def _check_percentile(self, q): | |
The number of consecutive NAs to fill before stopping. | ||
freq : DateOffset, timedelta, or offset alias string, optional | ||
Increment to use from time series API (e.g. 'M' or BDay()). | ||
skipna : bool, default True | ||
Exclude NA/null values before computing percent change. | ||
|
||
.. versionadded:: 0.25.0 | ||
**kwargs | ||
Additional keyword arguments are passed into | ||
`DataFrame.shift` or `Series.shift`. | ||
|
@@ -10009,6 +10013,11 @@ def _check_percentile(self, q): | |
Series.shift : Shift the index by some number of periods. | ||
DataFrame.shift : Shift the index by some number of periods. | ||
|
||
Notes | ||
----- | ||
The default `skipna=True` drops NAs before computing the percentage | ||
change, and the results are reindexed like the original calling object. | ||
|
||
Examples | ||
-------- | ||
**Series** | ||
|
@@ -10032,22 +10041,42 @@ def _check_percentile(self, q): | |
2 -0.055556 | ||
dtype: float64 | ||
|
||
See the percentage change in a Series where filling NAs with last | ||
valid observation forward to next valid. | ||
See how the computing of percentage change is performed in a Series | ||
with NAs. With default `skipna=True`, NAs are dropped before the | ||
computation and eventually the results are reindexed like the original | ||
object, thus keeping the original NAs. | ||
|
||
>>> s = pd.Series([90, 91, None, 85]) | ||
>>> s = pd.Series([90, 91, np.nan, 85, np.nan, 95]) | ||
>>> s | ||
0 90.0 | ||
1 91.0 | ||
2 NaN | ||
3 85.0 | ||
4 NaN | ||
5 95.0 | ||
dtype: float64 | ||
|
||
>>> s.pct_change() | ||
0 NaN | ||
1 0.011111 | ||
2 NaN | ||
3 -0.065934 | ||
4 NaN | ||
5 0.117647 | ||
dtype: float64 | ||
|
||
On the other hand, if a fill method is passed, NAs are filled before | ||
the computation. For example, before the computation of percentage | ||
change, forward fill method `ffill` first fills NAs with last valid | ||
observation forward to next valid. | ||
|
||
>>> s.pct_change(fill_method='ffill') | ||
0 NaN | ||
1 0.011111 | ||
2 0.000000 | ||
3 -0.065934 | ||
4 0.000000 | ||
5 0.117647 | ||
dtype: float64 | ||
|
||
**DataFrame** | ||
|
@@ -10089,14 +10118,75 @@ def _check_percentile(self, q): | |
2016 2015 2014 | ||
GOOG NaN -0.151997 -0.086016 | ||
APPL NaN 0.337604 0.012002 | ||
|
||
In a DataFrame with NAs, when computing the percentage change with | ||
Can you add a
@WillAyd are you sure that example is useful?
|
||
default `skipna=True`, NAs are first dropped on each column/row, and | ||
the results are eventually reindexed as originally. | ||
|
||
>>> df = pd.DataFrame({ | ||
... 'a': [90, 91, np.nan, 85, np.nan, 95], | ||
... 'b': [91, np.nan, 85, np.nan, 95, np.nan], | ||
... 'c': [np.nan, 85, np.nan, 95, np.nan, np.nan]}) | ||
>>> df | ||
a b c | ||
0 90.0 91.0 NaN | ||
1 91.0 NaN 85.0 | ||
2 NaN 85.0 NaN | ||
3 85.0 NaN 95.0 | ||
4 NaN 95.0 NaN | ||
5 95.0 NaN NaN | ||
|
||
>>> df.pct_change() | ||
a b c | ||
0 NaN NaN NaN | ||
1 0.011111 NaN NaN | ||
2 NaN -0.065934 NaN | ||
3 -0.065934 NaN 0.117647 | ||
4 NaN 0.117647 NaN | ||
5 0.117647 NaN NaN | ||
|
||
>>> df.pct_change(axis=1) | ||
a b c | ||
0 NaN 0.011111 NaN | ||
1 NaN NaN -0.065934 | ||
2 NaN NaN NaN | ||
3 NaN NaN 0.117647 | ||
4 NaN NaN NaN | ||
5 NaN NaN NaN | ||
|
||
Otherwise, if a fill method is passed, NAs are filled before the | ||
computation. | ||
|
||
>>> df.pct_change(fill_method='ffill') | ||
a b c | ||
0 NaN NaN NaN | ||
1 0.011111 0.000000 NaN | ||
2 0.000000 -0.065934 0.000000 | ||
3 -0.065934 0.000000 0.117647 | ||
4 0.000000 0.117647 0.000000 | ||
5 0.117647 0.000000 0.000000 | ||
""" | ||
|
||
@Appender(_shared_docs['pct_change'] % _shared_doc_kwargs) | ||
albertvillanova marked this conversation as resolved.
|
||
def pct_change(self, periods=1, fill_method='pad', limit=None, freq=None, | ||
**kwargs): | ||
# TODO: Not sure if above is correct - need someone to confirm. | ||
def pct_change(self, periods=1, fill_method=None, limit=None, freq=None, | ||
skipna=None, **kwargs): | ||
if fill_method is not None and skipna: | ||
jreback marked this conversation as resolved.
|
||
raise ValueError("cannot pass both fill_method and skipna") | ||
elif limit is not None and skipna: | ||
raise ValueError("cannot pass both limit and skipna") | ||
jreback marked this conversation as resolved.
|
||
if fill_method is None and limit is None and skipna is None: | ||
skipna = True | ||
axis = self._get_axis_number(kwargs.pop('axis', self._stat_axis_name)) | ||
if fill_method is None: | ||
if skipna and isinstance(self, ABCDataFrame): | ||
# If DataFrame, apply to each column/row | ||
return self.apply( | ||
lambda s: s.pct_change(periods=periods, freq=freq, | ||
skipna=skipna, **kwargs), | ||
axis=axis | ||
) | ||
if skipna: | ||
data = self.dropna() | ||
elif fill_method is None: | ||
data = self | ||
else: | ||
data = self.fillna(method=fill_method, limit=limit, axis=axis) | ||
|
@@ -10107,6 +10197,8 @@ def pct_change(self, periods=1, fill_method='pad', limit=None, freq=None, | |
if freq is None: | ||
mask = isna(com.values_from_object(data)) | ||
np.putmask(rs.values, mask, np.nan) | ||
if skipna: | ||
rs = rs.reindex_like(self) | ||
return rs | ||
|
||
def _agg_by_level(self, name, axis=0, level=0, skipna=True, **kwargs): | ||
|
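The `skipna` path in the diff above (drop NAs, compute the change, then `reindex_like` the caller) can be approximated with public pandas operations. The following is a minimal sketch under that reading, not the PR's implementation: `pct_change_skipna` is a hypothetical helper name, and it ignores `freq`, `fill_method` and `limit` entirely.

```python
import numpy as np
import pandas as pd


def pct_change_skipna(obj, periods=1, axis=0):
    """Hypothetical helper emulating the proposed skipna=True behaviour."""
    if isinstance(obj, pd.DataFrame):
        # Apply the Series logic to each column (axis=0) or each row (axis=1).
        return obj.apply(
            lambda s: pct_change_skipna(s, periods=periods), axis=axis
        )
    data = obj.dropna()                  # drop NAs before the computation
    rs = data / data.shift(periods) - 1  # percentage change between valid values
    return rs.reindex_like(obj)          # restore the original index (NAs return)


s = pd.Series([90, 91, np.nan, 85, np.nan, 95])
df = pd.DataFrame({
    'a': [90, 91, np.nan, 85, np.nan, 95],
    'b': [91, np.nan, 85, np.nan, 95, np.nan],
    'c': [np.nan, 85, np.nan, 95, np.nan, np.nan]})

print(pct_change_skipna(s))
print(pct_change_skipna(df))
print(pct_change_skipna(df, axis=1))
```

With the sample data from the docstring, this sketch should reproduce the values shown in the `s.pct_change()`, `df.pct_change()` and `df.pct_change(axis=1)` examples, since each valid value is compared with the previous valid value on the same column or row.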
this sounds like this is a change, but it's actually not, you are just adding the `skipna` arg. pls make that more clear. put this in other enhancements.
@jreback I am not just adding the `skipna` arg; I am setting `skipna=True` as default (before, the default was `fill_method='pad'`). I think this is an API breaking change. Indeed I quoted what you told me:
right, maybe also say adding the `skipna` arg (as it's not obvious that it was added in the note)
can you say that this is current behavior and the default is NO change.
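To make the disagreement about whether the default changes more concrete, the two defaults under discussion can be compared side by side. This is only an illustrative sketch: both behaviours are emulated with plain pandas arithmetic rather than by calling `pct_change` with `skipna` or `fill_method`, and the variable names are made up for the example.

```python
import numpy as np
import pandas as pd

s = pd.Series([90, 91, np.nan, 85, np.nan, 95])

# Previous default, emulated: forward-fill ('pad') first, then compute the change.
padded = s.ffill()
old_default = padded / padded.shift(1) - 1

# Proposed default, emulated: drop NAs, compute, then restore the original index.
valid = s.dropna()
new_default = (valid / valid.shift(1) - 1).reindex_like(s)

comparison = pd.DataFrame({
    "old_default": old_default,
    "new_default": new_default,
})
print(comparison)
```

For this input the two columns agree wherever the original value is valid and differ only at the NA positions, where the padded computation yields `0.0` and the drop-and-reindex computation yields `NaN`. That difference in output for a call with no arguments is what the "API breaking change" comment refers to.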