Groupby level fails to enumerate groups #15155

Closed
watercrossing opened this issue Jan 18, 2017 · 1 comment
Labels
Bug Categorical Categorical Data Type Groupby

@watercrossing
Contributor

watercrossing commented Jan 18, 2017

Code Sample

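import numpy as np
import pandas as pd

# "labels" is the MultiIndex keyword in pandas 0.19.2 (renamed "codes" in 0.24)
test = pd.DataFrame(data=np.arange(2, 22, 2),
                    index=pd.MultiIndex(levels=[pd.CategoricalIndex(["a", "b"]), range(10)],
                                        labels=[[0]*5 + [1]*5, range(10)],
                                        names=["Index1", "Index2"]))
testGroupedBy = test.groupby(level=["Index1"])
testGroupedBy.get_group("a")  # raises AttributeError, see below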

Problem description

Instead of returning the rows for group "a" as expected, get_group raises an AttributeError:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-1466-65ba7dd9f763> in <module>()
----> 1 testGroupedBy.get_group("a")

python2.7/site-packages/pandas/core/groupby.pyc in get_group(self, name, obj)
    614             obj = self._selected_obj
    615 
--> 616         inds = self._get_index(name)
    617         if not len(inds):
    618             raise KeyError(name)

python2.7/site-packages/pandas/core/groupby.pyc in _get_index(self, name)
    461     def _get_index(self, name):
    462         """ safe get index, translate keys for datelike to underlying repr """
--> 463         return self._get_indices([name])[0]
    464 
    465     @cache_readonly

python2.7/site-packages/pandas/core/groupby.pyc in _get_indices(self, names)
    428             return []
    429 
--> 430         if len(self.indices) > 0:
    431             index_sample = next(iter(self.indices))
    432         else:

python2.7/site-packages/pandas/core/groupby.pyc in __getattr__(self, attr)
    527 
    528         raise AttributeError("%r object has no attribute %r" %
--> 529                              (type(self).__name__, attr))
    530 
    531     plot = property(GroupByPlot)

AttributeError: 'DataFrameGroupBy' object has no attribute 'indices'

Expected Output

An alternative way to get this output is test.loc["a"], which yields the expected output below:

         0
Index2    
0        2
1        4
2        6
3        8
4       10
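
For readers on a newer pandas, the reproduction can be sketched roughly as follows. This is an adaptation under stated assumptions, not the reporter's code: it uses the `codes` keyword (the later name for `labels`), passes `observed=True` explicitly, and groups by a scalar level name rather than a one-element list. On a version that includes the fix, both lookups should yield the values 2 through 10.

```python
import numpy as np
import pandas as pd

# Same data as the code sample, built with the newer "codes" keyword
# (assumption: pandas 0.24 or later).
index = pd.MultiIndex(
    levels=[pd.CategoricalIndex(["a", "b"]), range(10)],
    codes=[[0] * 5 + [1] * 5, list(range(10))],
    names=["Index1", "Index2"],
)
test = pd.DataFrame(data=np.arange(2, 22, 2), index=index)

# Workaround from the report: select the group directly by label.
expected = test.loc["a"]

# The call that failed in 0.19.2; observed=True is passed explicitly
# so the result does not depend on the version's default.
group = test.groupby(level="Index1", observed=True).get_group("a")

print(expected[0].tolist())
print(group[0].tolist())
```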

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 2.7.10.final.0
python-bits: 64
OS: Linux
OS-release: 2.6.32-431.11.2.el6.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_GB.utf8
LANG: en_GB.utf8
LOCALE: None.None

pandas: 0.19.2
nose: 1.3.7
pip: 9.0.1
setuptools: 32.3.1
Cython: None
numpy: 1.11.3
scipy: 0.18.1
statsmodels: None
xarray: None
IPython: 5.1.0
sphinx: 1.5.1
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.5.3
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.9999999
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.4
boto: None
pandas_datareader: None

@jreback
Contributor

jreback commented Jan 18, 2017

This is a bug; the real error is hidden, and it comes from this method missing on CategoricalIndex. If you would like to do a PR which adds it (and tests, of course), that would be welcome.

diff --git a/pandas/indexes/category.py b/pandas/indexes/category.py
index 2c89f72..e3ffa40 100644
--- a/pandas/indexes/category.py
+++ b/pandas/indexes/category.py
@@ -255,6 +255,9 @@ class CategoricalIndex(Index, base.PandasDelegate):
     def ordered(self):
         return self._data.ordered
 
+    def _reverse_indexer(self):
+        return self._data._reverse_indexer()
+
     def __contains__(self, key):
         hash(key)
         return key in self.values
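
One plausible reading of why the error is "hidden" can be sketched in plain Python. When a property raises AttributeError internally, Python falls back to `__getattr__`, which then reports the property itself as missing. The classes below (`Level`, `Grouper`) and the helper `error_message` are hypothetical stand-ins, not pandas code, and the traceback above involves `cache_readonly` rather than a plain `property`, so treat this only as an illustration of the general mechanism.

```python
class Level:
    """Hypothetical index-like object that lacks _reverse_indexer."""


class Grouper:
    """Hypothetical groupby-like object; not the pandas implementation."""

    def __init__(self, level):
        self.level = level

    @property
    def indices(self):
        # The underlying failure: Level has no _reverse_indexer method,
        # so this line raises AttributeError inside the property.
        return self.level._reverse_indexer()

    def __getattr__(self, attr):
        # Python calls this only after normal lookup raised AttributeError,
        # which includes an AttributeError raised inside a property getter.
        raise AttributeError(
            "%r object has no attribute %r" % (type(self).__name__, attr)
        )


def error_message(obj):
    """Return the message of the AttributeError raised by obj.indices."""
    try:
        obj.indices
    except AttributeError as exc:
        return str(exc)
    return None


message = error_message(Grouper(Level()))
print(message)  # 'Grouper' object has no attribute 'indices'
```

The reported message names `indices`, not `_reverse_indexer`, which matches the shape of the traceback in the report.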

jreback added this to the 0.20.0 milestone Jan 18, 2017
AnkurDedania pushed a commit to AnkurDedania/pandas that referenced this issue Mar 21, 2017
closes pandas-dev#15155

Author: watercrossing

Closes pandas-dev#15163 from watercrossing/indexgroup and squashes the following commits:

742d4a5 [watercrossing] BUG: GroupBy.get_group failing with a categorical grouper (pandas-dev#15155)