SeriesGroupby.nunique raises an IndexError on empty Series #12553

Closed
mfixman opened this issue Mar 7, 2016 · 4 comments

@mfixman

mfixman commented Mar 7, 2016

Code Sample, a copy-pastable example if possible

In [18]: b = pandas.Series()

In [19]: g = b.groupby(level = 0)

In [20]: g.nunique()
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-20-fbbfc3108eac> in <module>()
----> 1 g.nunique()

/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.pyc in nunique(self, dropna)
   2693 
   2694         out = np.add.reduceat(inc, idx).astype('int64', copy=False)
-> 2695         return Series(out if ids[0] != -1 else out[1:],
   2696                       index=self.grouper.result_index,
   2697                       name=self.name)

IndexError: index 0 is out of bounds for axis 0 with size 0

Expected Output

0

output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.6.final.0
python-bits: 64
OS: Linux
OS-release: 3.13.0-79-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.17.1
nose: 1.3.1
pip: 8.0.2
setuptools: 20.1.1
Cython: None
numpy: 1.10.4
scipy: 0.13.3
statsmodels: 0.5.0
IPython: 4.1.1
sphinx: None
patsy: 0.2.1
dateutil: 2.4.2
pytz: 2015.7
blosc: None
bottleneck: None
tables: 3.1.1
numexpr: 2.2.2
matplotlib: 1.3.1
openpyxl: 1.7.0
xlrd: 0.9.2
xlwt: 0.7.5
xlsxwriter: None
lxml: None
bs4: 4.2.1
html5lib: 0.999
httplib2: 0.8
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
Jinja2: None
@jreback
Contributor

jreback commented Mar 7, 2016

OK, pull requests to fix this are welcome.

@thanasis2028

Is this supposed to return 0 or an empty Series? I am fixing it right now, but returning 0 seems awkward when other calls to nunique() return Series objects.

@jreback
Contributor

jreback commented Mar 7, 2016

This is equivalent to the example below: an empty Series should be returned. You should only do the null check (the line where it errors) if the series has nonzero length.

In [1]: s = Series([])

In [2]: s.groupby(s.index).sum()
Out[2]: Series([], dtype: float64)
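
The length guard suggested above can be sketched in isolation. This is only an illustration, not the actual pandas patch: `drop_null_group` is a hypothetical helper, and plain lists stand in for the NumPy arrays `out` and `ids` seen in the traceback.

```python
def drop_null_group(out, ids):
    """Hypothetical helper: drop the leading count if it belongs to the null group (-1)."""
    # Check the length first: ids[0] on an empty sequence raises IndexError,
    # which is the failure reported in this issue.
    if len(ids) > 0 and ids[0] == -1:
        return out[1:]
    return out


# Empty input: nothing to drop, and no IndexError is raised.
empty_result = drop_null_group([], [])

# Sorted group ids start with -1 (the null group), so its count is dropped.
with_null = drop_null_group([2, 1, 3], [-1, -1, 0, 1, 1, 1])

# No null group present: the counts pass through unchanged.
without_null = drop_null_group([1, 3], [0, 1, 1, 1])
```

With the guard in place, the empty case falls through to returning `out` as-is, which matches the expectation that an empty Series comes back.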

@thanasis2028

Thanks, I made a pull request: #12557
Edit: Output:

>>> reload(pandas)
<module 'pandas' from 'pandas/__init__.pyc'>
>>> pandas.Series().groupby(level = 0).nunique()
Series([], dtype: int64)
>>> pandas.Series([1],[1]).groupby(level = 0).nunique()
1    1
dtype: int64
>>>

@jreback jreback modified the milestones: 0.18.1, 0.18.2 Apr 25, 2016
@jorisvandenbossche jorisvandenbossche modified the milestones: Next Major Release, 0.19.0 Sep 1, 2016
mroeschke added a commit to mroeschke/pandas that referenced this issue Nov 27, 2016
mroeschke added a commit to mroeschke/pandas that referenced this issue Nov 30, 2016
mroeschke added a commit to mroeschke/pandas that referenced this issue Nov 30, 2016
mroeschke added a commit to mroeschke/pandas that referenced this issue Dec 1, 2016
…12553)

Modified tests

simplify tests

Add whatsnew

Moved len check
@jreback jreback modified the milestones: 0.19.2, Next Major Release Dec 4, 2016
@jreback jreback closed this as completed in c0e13d1 Dec 4, 2016
jorisvandenbossche pushed a commit that referenced this issue Dec 15, 2016