Fix GroupBy nth Handling with Observed=False #26419

Merged
merged 20 commits into from
Aug 20, 2019
03ee26b
Added test coverage for observed=False with ops
WillAyd May 15, 2019
ee549ed
Fixed issue with observed=False and nth
WillAyd May 15, 2019
f0a510d
Stubbed whatsnew note
WillAyd May 15, 2019
f671204
Merge remote-tracking branch 'upstream/master' into nth-na-handling
WillAyd May 16, 2019
94dda01
Merge remote-tracking branch 'upstream/master' into nth-na-handling
WillAyd May 17, 2019
e59a991
lint fixup
WillAyd May 17, 2019
3677471
Simplified test
WillAyd May 19, 2019
34c2f06
Merge remote-tracking branch 'upstream/master' into nth-na-handling
WillAyd May 19, 2019
2ca34e3
whatsnew whitespace fix
WillAyd May 19, 2019
d3e5efa
Merge remote-tracking branch 'upstream/master' into nth-na-handling
WillAyd Jun 3, 2019
f9758b8
Merge remote-tracking branch 'upstream/master' into nth-na-handling
WillAyd Jun 4, 2019
ad729c5
Merge remote-tracking branch 'upstream/master' into nth-na-handling
WillAyd Jun 27, 2019
5b7b6bc
Merge remote-tracking branch 'upstream/master' into nth-na-handling
WillAyd Jul 15, 2019
aff7327
Merge remote-tracking branch 'upstream/master' into nth-na-handling
WillAyd Jul 15, 2019
47201fb
blackify
WillAyd Jul 15, 2019
56822cc
Merge remote-tracking branch 'upstream/master' into nth-na-handling
WillAyd Jul 15, 2019
1804e27
Removed doc whitespace
WillAyd Jul 15, 2019
4c2e413
Merge remote-tracking branch 'upstream/master' into nth-na-handling
WillAyd Jul 25, 2019
a837564
moved whatsnew to 0.25.1
WillAyd Jul 25, 2019
308e569
Merge remote-tracking branch 'upstream/master' into WillAyd-nth-na-ha…
TomAugspurger Aug 19, 2019
1 change: 1 addition & 0 deletions doc/source/whatsnew/v0.25.1.rst
@@ -117,6 +117,7 @@ Groupby/resample/rolling
 ^^^^^^^^^^^^^^^^^^^^^^^^

 - Bug in :meth:`pandas.core.groupby.DataFrameGroupBy.transform` where applying a timezone conversion lambda function would drop timezone information (:issue:`27496`)
+- Bug in :meth:`pandas.core.groupby.GroupBy.nth` where ``observed=False`` was being ignored for Categorical groupers (:issue:`26385`)
 - Bug in windowing over read-only arrays (:issue:`27766`)
 - Fixed segfault in `pandas.core.groupby.DataFrameGroupBy.quantile` when an invalid quantile was passed (:issue:`27470`)
 -
6 changes: 5 additions & 1 deletion pandas/core/groupby/groupby.py
@@ -1773,7 +1773,11 @@ def nth(self, n: Union[int, List[int]], dropna: Optional[str] = None) -> DataFra
             if not self.as_index:
                 return out

-            out.index = self.grouper.result_index[ids[mask]]
+            result_index = self.grouper.result_index
+            out.index = result_index[ids[mask]]
+
+            if not self.observed and isinstance(result_index, CategoricalIndex):
+                out = out.reindex(result_index)

             return out.sort_index() if self.sort else out
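
The reindex step in this hunk can be tried in isolation. The sketch below is only an illustration, not code from the PR: `result_index`, `out`, and `observed` are hypothetical stand-ins for the locals and attributes used inside `nth`, with values made up for the example. It shows how reindexing against the full `CategoricalIndex` adds NaN rows for categories that had no matching row.

```python
import pandas as pd

# Hypothetical stand-ins for the locals inside nth(); these values are
# illustrative assumptions, not taken from the PR.
result_index = pd.CategoricalIndex(
    ["a", "b", "c"], categories=["a", "b", "c"], name="cat"
)
# Only group "a" has a row to select, so the raw result covers one label.
out = pd.Series([1], index=result_index[[0]], name="ser")

observed = False  # plays the role of self.observed
if not observed and isinstance(result_index, pd.CategoricalIndex):
    # Reindexing against every category adds NaN rows for "b" and "c".
    out = out.reindex(result_index)

print(out)
```

The `isinstance` guard limits the reindex to categorical groupers, where unobserved groups can exist at all.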

15 changes: 15 additions & 0 deletions pandas/tests/groupby/test_categorical.py
@@ -434,6 +434,21 @@ def test_observed_groups_with_nan(observed):
     tm.assert_dict_equal(result, expected)


+def test_observed_nth():
+    # GH 26385
+    cat = pd.Categorical(["a", np.nan, np.nan], categories=["a", "b", "c"])
+    ser = pd.Series([1, 2, 3])
+    df = pd.DataFrame({"cat": cat, "ser": ser})
+
+    result = df.groupby("cat", observed=False)["ser"].nth(0)
+
+    index = pd.Categorical(["a", "b", "c"], categories=["a", "b", "c"])
+    expected = pd.Series([1, np.nan, np.nan], index=index, name="ser")
+    expected.index.name = "cat"
+
+    tm.assert_series_equal(result, expected)
+
+
 def test_dataframe_categorical_with_nan(observed):
     # GH 21151
     s1 = Categorical([np.nan, "a", np.nan, "a"], categories=["a", "b", "c"])
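
For readers less familiar with `observed`, here is a minimal sketch of the behavior `test_observed_nth` relies on. It reuses the test's input data but swaps in `count()` for `nth(0)`; that substitution is an assumption made for illustration, chosen because the return shape of `nth` may differ in pandas versions other than the one this PR targets. The names `all_groups` and `seen_groups` are hypothetical.

```python
import numpy as np
import pandas as pd

# Same input as test_observed_nth: only category "a" appears in the data.
cat = pd.Categorical(["a", np.nan, np.nan], categories=["a", "b", "c"])
df = pd.DataFrame({"cat": cat, "ser": [1, 2, 3]})

# observed=False keeps a row for every category, including unused ones.
all_groups = df.groupby("cat", observed=False)["ser"].count()

# observed=True keeps only the categories that actually occur.
seen_groups = df.groupby("cat", observed=True)["ser"].count()

print(all_groups)
print(seen_groups)
```

With `observed=False` the result has one row per category, which is the shape the new test expects from `nth(0)`.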