Commit a8f9ec5

watercrossing authored and AnkurDedania committed
Bug in groupby.get_group on categoricalindex
closes pandas-dev#15155

Author: watercrossing <[email protected]>

Closes pandas-dev#15163 from watercrossing/indexgroup and squashes the following commits:

742d4a5 [watercrossing] BUG: GroupBy.get_group failing with a categorical grouper (pandas-dev#15155)
1 parent 11c1bca commit a8f9ec5

3 files changed: +24 -0 lines changed

doc/source/whatsnew/v0.20.0.txt

Lines changed: 1 addition & 0 deletions
@@ -388,6 +388,7 @@ Bug Fixes

 - Bug in compat for passing long integers to ``Timestamp.replace`` (:issue:`15030`)
 - Bug in ``.loc`` that would not return the correct dtype for scalar access for a DataFrame (:issue:`11617`)
+- Bug in ``GroupBy.get_group()`` failing with a categorical grouper (:issue:`15155`)

pandas/indexes/category.py

Lines changed: 3 additions & 0 deletions
@@ -255,6 +255,9 @@ def categories(self):
     def ordered(self):
         return self._data.ordered

+    def _reverse_indexer(self):
+        return self._data._reverse_indexer()
+
     def __contains__(self, key):
         hash(key)
         return key in self.values
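
The hunk above only shows that `CategoricalIndex._reverse_indexer` delegates to the underlying `Categorical`; it does not show what that method returns. As far as I understand it, the result is a mapping from each category to the positions at which it occurs, which is what a group lookup needs. The following is a minimal pure-Python sketch of that idea, not the pandas implementation. The function name `reverse_indexer` and the sample data are hypothetical.

```python
def reverse_indexer(codes, categories):
    """Hypothetical helper: map each category to the positions where it occurs."""
    positions = {category: [] for category in categories}
    for position, code in enumerate(codes):
        if code >= 0:  # assumption: -1 marks a missing value, so skip it
            positions[categories[code]].append(position)
    return positions


# Category codes for the values ["a", "a", "b", "a", "b"]
indexer = reverse_indexer([0, 0, 1, 0, 1], ["a", "b"])

# Selecting a group then amounts to taking the rows at the stored positions.
rows = [2, 4, 6, 8, 10]
group_a = [rows[i] for i in indexer["a"]]

print(indexer)  # {'a': [0, 1, 3], 'b': [2, 4]}
print(group_a)  # [2, 4, 8]
```

With such a mapping available on the index, a grouper backed by a `CategoricalIndex` can answer a lookup like `get_group('a')` the same way one backed by a plain `Categorical` does.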

pandas/tests/groupby/test_categorical.py

Lines changed: 20 additions & 0 deletions
@@ -67,6 +67,26 @@ def setUp(self):
                              'E': np.random.randn(11),
                              'F': np.random.randn(11)})

+    def test_level_groupby_get_group(self):
+        # GH15155
+        df = DataFrame(data=np.arange(2, 22, 2),
+                       index=MultiIndex(
+                           levels=[pd.CategoricalIndex(["a", "b"]), range(10)],
+                           labels=[[0] * 5 + [1] * 5, range(10)],
+                           names=["Index1", "Index2"]))
+        g = df.groupby(level=["Index1"])
+
+        # expected should equal test.loc[["a"]]
+        # GH15166
+        expected = DataFrame(data=np.arange(2, 12, 2),
+                             index=pd.MultiIndex(levels=[pd.CategoricalIndex(
+                                 ["a", "b"]), range(5)],
+                                 labels=[[0] * 5, range(5)],
+                                 names=["Index1", "Index2"]))
+        result = g.get_group('a')
+
+        assert_frame_equal(result, expected)
+
     def test_apply_use_categorical_name(self):
         from pandas import qcut
         cats = qcut(self.df.C, 4)
