Skip to content

pct change bug issue 21200 #21235

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 37 commits into from
Dec 12, 2018
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
37 commits
Select commit Hold shift + click to select a range
ee2c034
pct change bug issue 21200
simonariddell May 28, 2018
47b5908
add whatsnew
simonariddell May 28, 2018
41beb4a
fix pep8 issues
simonariddell May 28, 2018
25efd37
Removed vectorization from groupby pct change
simonariddell May 30, 2018
849fac4
try using cache arguments to keep vectorization
simonariddell Jun 4, 2018
eaede34
BUG,TST: Remove case where vectorization fails in pct_change groupby …
simonariddell Jun 9, 2018
64097cc
BUG,TST: Remove case where vectorization fails in pct_change groupby …
simonariddell Jun 19, 2018
fcd7183
BUG,TST: Remove case where vectorization fails in pct_change groupby …
simonariddell Jun 19, 2018
e8681d6
Merge branch 'groupby_pctchange_bug' of https://github.com/SimonAleck…
simonariddell Jun 24, 2018
77c3935
BUG,TST: Remove case where vectorization fails in pct_change groupby …
simonariddell Jun 24, 2018
d037e65
Update test to address WillAyds code suggestion
simonariddell Jul 29, 2018
a4fcb33
incorporate Will CRs
simonariddell Sep 27, 2018
64d2e4f
cr update
simonariddell Sep 28, 2018
eabfe1f
merge upstream
simonariddell Sep 28, 2018
a63553a
incorporate CR and merge in master
simonariddell Sep 29, 2018
c72c112
Merge branch 'master' of https://github.com/pandas-dev/pandas
simonariddell Nov 23, 2018
a745209
merge in master
simonariddell Nov 23, 2018
5d3fc2d
BUG,TST: Remove case where vectorization fails in pct_change groupby …
simonariddell Nov 25, 2018
ac9d005
Merge remote-tracking branch 'upstream/master'
simonariddell Nov 25, 2018
1f1f705
Merge branch 'master' into groupby_pctchange_bug
simonariddell Nov 25, 2018
795d7be
BUG,TST: Remove case where vectorization fails in pct_change groupby …
simonariddell Nov 25, 2018
f0a9c6d
Merge branch 'groupby_pctchange_bug' of https://github.com/SimonAleck…
simonariddell Nov 27, 2018
a979f79
BUG,TST: Remove case where vectorization fails in pct_change groupby …
simonariddell Nov 27, 2018
5eb79c2
Merge branch 'groupby_pctchange_bug' of https://github.com/SimonAleck…
simonariddell Nov 27, 2018
b11d20f
BUG,TST: GH21234 groupby pct_change
simonariddell Nov 27, 2018
f480299
BUG,TST: GH21234 groupby pct_change
simonariddell Nov 28, 2018
a3c95cb
BUG,TST: GH21235 groupby pct_change whatsnew
simonariddell Dec 2, 2018
e5289a2
BUG, TST: GH 21235 groupby pct_change whatsnew
simonariddell Dec 9, 2018
1ccc507
Merge remote-tracking branch 'upstream/master'
simonariddell Dec 9, 2018
7f4a244
merge master
simonariddell Dec 9, 2018
c26cf60
BUG, TST: GH 21235 groupby pct_change whatsnew
simonariddell Dec 9, 2018
877b433
BUG, TST: GH 21235 groupby pct_change whatsnew
simonariddell Dec 9, 2018
c8b686e
Merge remote-tracking branch 'upstream/master'
simonariddell Dec 11, 2018
faf532e
Merge remote-tracking branch 'upstream/master'
simonariddell Dec 12, 2018
6d87075
Merge branch 'master' into groupby_pctchange_bug
simonariddell Dec 12, 2018
ad934b7
BUG, TST: GH 21235 groupby pct_change whatsnew
simonariddell Dec 12, 2018
01d705f
BUG, TST: GH 21235 groupby pct_change test parametrize
simonariddell Dec 12, 2018
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
27 changes: 27 additions & 0 deletions doc/source/whatsnew/v0.24.0.rst
Original file line number Diff line number Diff line change
Expand Up @@ -382,6 +382,33 @@ Backwards incompatible API changes
- :meth:`read_csv` will now raise a ``ValueError`` if a column with missing values is declared as having dtype ``bool`` (:issue:`20591`)
- The column order of the resultant :class:`DataFrame` from :meth:`MultiIndex.to_frame` is now guaranteed to match the :attr:`MultiIndex.names` order. (:issue:`22420`)

Percentage change on groupby changes
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Fixed a bug where calling :func:`SeriesGroupBy.pct_change` or :func:`DataFrameGroupBy.pct_change` would previously work across groups when calculating the percent change, where it now correctly works per group (:issue:`21200`, :issue:`21235`).

.. ipython:: python

df = pd.DataFrame({'grp': ['a', 'a', 'b'], 'foo': [1.0, 1.1, 2.2]})
df

Previous behavior:

.. code-block:: ipython

In [1]: df.groupby('grp').pct_change()
Out[1]:
foo
0 NaN
1 0.1
2 1.0

New behavior:

.. ipython:: python

df.groupby('grp').pct_change()

.. _whatsnew_0240.api_breaking.deps:

Dependencies have increased minimum versions
Expand Down
10 changes: 8 additions & 2 deletions pandas/core/groupby/generic.py
Original file line number Diff line number Diff line change
Expand Up @@ -1221,9 +1221,15 @@ def _apply_to_column_groupbys(self, func):
return func(self)

def pct_change(self, periods=1, fill_method='pad', limit=None, freq=None):
"""Calculate percent change of each value to previous entry in group"""
"""Calcuate pct_change of each value to previous entry in group"""
# TODO: Remove this conditional when #23918 is fixed
if freq:
return self.apply(lambda x: x.pct_change(periods=periods,
fill_method=fill_method,
limit=limit, freq=freq))
filled = getattr(self, fill_method)(limit=limit)
shifted = filled.shift(periods=periods, freq=freq)
fill_grp = filled.groupby(self.grouper.labels)
shifted = fill_grp.shift(periods=periods, freq=freq)

return (filled / shifted) - 1

Expand Down
9 changes: 4 additions & 5 deletions pandas/core/groupby/groupby.py
Original file line number Diff line number Diff line change
Expand Up @@ -2025,11 +2025,10 @@ def pct_change(self, periods=1, fill_method='pad', limit=None, freq=None,
fill_method=fill_method,
limit=limit, freq=freq,
axis=axis))

filled = getattr(self, fill_method)(limit=limit).drop(
self.grouper.names, axis=1)
shifted = filled.shift(periods=periods, freq=freq)

filled = getattr(self, fill_method)(limit=limit)
filled = filled.drop(self.grouper.names, axis=1)
fill_grp = filled.groupby(self.grouper.labels)
shifted = fill_grp.shift(periods=periods, freq=freq)
return (filled / shifted) - 1

@Substitution(name='groupby')
Expand Down
46 changes: 23 additions & 23 deletions pandas/tests/groupby/test_transform.py
Original file line number Diff line number Diff line change
Expand Up @@ -765,36 +765,36 @@ def test_pad_stable_sorting(fill_method):


@pytest.mark.parametrize("test_series", [True, False])
@pytest.mark.parametrize("freq", [
None,
pytest.param('D', marks=pytest.mark.xfail(
reason='GH#23918 before method uses freq in vectorized approach'))])
@pytest.mark.parametrize("periods,fill_method,limit", [
(1, 'ffill', None), (1, 'ffill', 1),
(1, 'bfill', None), (1, 'bfill', 1),
(-1, 'ffill', None), (-1, 'ffill', 1),
(-1, 'bfill', None), (-1, 'bfill', 1)])
def test_pct_change(test_series, periods, fill_method, limit):
vals = [np.nan, np.nan, 1, 2, 4, 10, np.nan, np.nan]
exp_vals = Series(vals).pct_change(periods=periods,
fill_method=fill_method,
limit=limit).tolist()

df = DataFrame({'key': ['a'] * len(vals) + ['b'] * len(vals),
'vals': vals * 2})
grp = df.groupby('key')

def get_result(grp_obj):
return grp_obj.pct_change(periods=periods,
fill_method=fill_method,
limit=limit)
(-1, 'bfill', None), (-1, 'bfill', 1),
])
def test_pct_change(test_series, freq, periods, fill_method, limit):
# GH 21200, 21621
vals = [3, np.nan, np.nan, np.nan, 1, 2, 4, 10, np.nan, 4]
keys = ['a', 'b']
key_v = np.repeat(keys, len(vals))
df = DataFrame({'key': key_v, 'vals': vals * 2})

df_g = getattr(df.groupby('key'), fill_method)(limit=limit)
grp = df_g.groupby('key')

expected = grp['vals'].obj / grp['vals'].shift(periods) - 1

if test_series:
exp = pd.Series(exp_vals * 2)
exp.name = 'vals'
grp = grp['vals']
result = get_result(grp)
tm.assert_series_equal(result, exp)
result = df.groupby('key')['vals'].pct_change(
periods=periods, fill_method=fill_method, limit=limit, freq=freq)
tm.assert_series_equal(result, expected)
else:
exp = DataFrame({'vals': exp_vals * 2})
result = get_result(grp)
tm.assert_frame_equal(result, exp)
result = df.groupby('key').pct_change(
periods=periods, fill_method=fill_method, limit=limit, freq=freq)
tm.assert_frame_equal(result, expected.to_frame('vals'))


@pytest.mark.parametrize("func", [np.any, np.all])
Expand Down