Groupby level fails to enumerate groups #15155

watercrossing · 2017-01-18T18:36:35Z

Code Sample

import numpy as np
import pandas as pd

# Two-level index: a categorical outer level ("a"/"b") and an integer inner level.
test = pd.DataFrame(data=np.arange(2,22,2),
             index=pd.MultiIndex(levels=[pd.CategoricalIndex(["a", "b"]), range(10)],
                                 labels=[[0]*5 + [1]*5, range(10)],
                                 names=["Index1", "Index2"]))
testGroupedBy = test.groupby(level=["Index1"])
testGroupedBy.get_group("a")

Problem description

Instead of returning the rows for group "a" as expected, `get_group` raises an AttributeError:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-1466-65ba7dd9f763> in <module>()
----> 1 testGroupedBy.get_group("a")

python2.7/site-packages/pandas/core/groupby.pyc in get_group(self, name, obj)
    614             obj = self._selected_obj
    615 
--> 616         inds = self._get_index(name)
    617         if not len(inds):
    618             raise KeyError(name)

python2.7/site-packages/pandas/core/groupby.pyc in _get_index(self, name)
    461     def _get_index(self, name):
    462         """ safe get index, translate keys for datelike to underlying repr """
--> 463         return self._get_indices([name])[0]
    464 
    465     @cache_readonly

python2.7/site-packages/pandas/core/groupby.pyc in _get_indices(self, names)
    428             return []
    429 
--> 430         if len(self.indices) > 0:
    431             index_sample = next(iter(self.indices))
    432         else:

python2.7/site-packages/pandas/core/groupby.pyc in __getattr__(self, attr)
    527 
    528         raise AttributeError("%r object has no attribute %r" %
--> 529                              (type(self).__name__, attr))
    530 
    531     plot = property(GroupByPlot)

AttributeError: 'DataFrameGroupBy' object has no attribute 'indices'

Expected Output

The rows belonging to group "a". I guess an alternative way to get this output is `test.loc["a"]`, which yields the expected data.
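For readers on a newer pandas, the expected behaviour can be sketched as follows. This is an illustration rather than the reporter's original output: it assumes a release that includes the fix (the thread targets the 0.20.0 milestone) and that spells the `MultiIndex` argument `codes` instead of `labels`. The scalar `level="Index1"` and the explicit `observed=False` are choices made here to keep the call unambiguous across versions.

```python
import numpy as np
import pandas as pd

# Same frame as in the report, written with `codes` (the newer name for `labels`).
index = pd.MultiIndex(
    levels=[pd.CategoricalIndex(["a", "b"]), list(range(10))],
    codes=[[0] * 5 + [1] * 5, list(range(10))],
    names=["Index1", "Index2"],
)
test = pd.DataFrame(data=np.arange(2, 22, 2), index=index)

# Group on the categorical level and pull out the rows for "a".
grouped = test.groupby(level="Index1", observed=False)
group_a = grouped.get_group("a")

# The single unnamed column is labelled 0; group "a" holds the first five rows.
values_a = group_a[0].tolist()
inner_a = group_a.index.get_level_values("Index2").tolist()
print(group_a)
```

Here `values_a` is `[2, 4, 6, 8, 10]` and `inner_a` is `[0, 1, 2, 3, 4]`, which is what the report expected `get_group("a")` to return.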

Output of `pd.show_versions()`

INSTALLED VERSIONS

commit: None
python: 2.7.10.final.0
python-bits: 64
OS: Linux
OS-release: 2.6.32-431.11.2.el6.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_GB.utf8
LANG: en_GB.utf8
LOCALE: None.None

pandas: 0.19.2
nose: 1.3.7
pip: 9.0.1
setuptools: 32.3.1
Cython: None
numpy: 1.11.3
scipy: 0.18.1
statsmodels: None
xarray: None
IPython: 5.1.0
sphinx: 1.5.1
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.5.3
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.9999999
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.4
boto: None
pandas_datareader: None


jreback · 2017-01-18T18:46:45Z

This is a bug, which is hidden because of this missing method on CategoricalIndex. Would you like to do a PR which adds this (and tests, of course)?

diff --git a/pandas/indexes/category.py b/pandas/indexes/category.py
index 2c89f72..e3ffa40 100644
--- a/pandas/indexes/category.py
+++ b/pandas/indexes/category.py
@@ -255,6 +255,9 @@ class CategoricalIndex(Index, base.PandasDelegate):
     def ordered(self):
         return self._data.ordered
 
+    def _reverse_indexer(self):
+        return self._data._reverse_indexer()
+
     def __contains__(self, key):
         hash(key)
         return key in self.values
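One plausible reading of why the traceback blames `indices` rather than the missing method: in Python, when a property getter raises `AttributeError`, attribute lookup falls back to `__getattr__`, which then reports the outer attribute name. The sketch below reproduces that masking with toy classes. `Level`, `FixedLevel`, and `Grouper` are hypothetical stand-ins, not pandas code, and the assumption that `indices` is a property reaching `_reverse_indexer` is inferred from the traceback and the diff above.

```python
class Level(object):
    """Hypothetical stand-in for an index level lacking _reverse_indexer."""


class FixedLevel(object):
    """Same stand-in after adding the method, mirroring the diff above."""

    def _reverse_indexer(self):
        return {"a": [0, 1, 2, 3, 4], "b": [5, 6, 7, 8, 9]}


class Grouper(object):
    """Hypothetical stand-in for a groupby object with a computed `indices`."""

    def __init__(self, level):
        self._level = level

    @property
    def indices(self):
        # With Level, this raises AttributeError about '_reverse_indexer'.
        return self._level._reverse_indexer()

    def __getattr__(self, attr):
        # Runs only after normal lookup fails, which includes the case where
        # a property getter raised AttributeError; the original error is lost.
        raise AttributeError("%r object has no attribute %r"
                             % (type(self).__name__, attr))


def error_message(obj):
    """Return the message of the AttributeError raised by obj.indices, if any."""
    try:
        obj.indices
    except AttributeError as exc:
        return str(exc)
    return None


masked = error_message(Grouper(Level()))
fixed = Grouper(FixedLevel()).indices
print(masked)
```

With `Level`, the message names `indices` on `Grouper`, not `_reverse_indexer` on `Level`, which matches the shape of the error in the report. With `FixedLevel`, `indices` resolves normally.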

closes pandas-dev#15155

Author: watercrossing

Closes pandas-dev#15163 from watercrossing/indexgroup and squashes the following commits:

742d4a5 [watercrossing] BUG: GroupBy.get_group failing with a categorical grouper (pandas-dev#15155)

jreback added the Bug, Categorical, Groupby, and Difficulty Novice labels Jan 18, 2017

jreback added this to the 0.20.0 milestone Jan 18, 2017

watercrossing mentioned this issue Jan 19, 2017

Bug in groupby.get_group on categoricalindex #15163

Closed

4 tasks

watercrossing added a commit to watercrossing/pandas that referenced this issue Jan 19, 2017

BUG: GroupBy.get_group failing with a categorical grouper (pandas-dev…

742d4a5

…#15155)

jreback closed this as completed in 4c65d5f Jan 19, 2017
