ZeroDivisionError when groupby on a empty group #22519

Closed
AtomBaf opened this issue Aug 27, 2018 · 7 comments
Labels: Groupby, Missing-data, Regression

AtomBaf commented Aug 27, 2018

Code Sample, a copy-pastable example if possible

# Minimal reproduction: group A == 1 contains only NaN in column B
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [0, 1, 0], 'B': [1., np.nan, 2.]})
df.groupby('A').B.rank(pct=True)  # raises ZeroDivisionError on 0.23.4

Problem description

See the code example above. The problem occurs only when the pct parameter is True.

This is a regression between 0.22 and 0.23.4.
Before 0.23, the result was a Series with NaN for every row in a group containing only NaN values.
Now it raises a ZeroDivisionError.

Expected Output

Same output as 0.22
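
To make the expected values concrete, here is a minimal sketch that computes the percentile rank by hand, without passing pct=True. It assumes the default method="average" and na_option="keep", and that pct simply divides each rank by the number of non-NaN values in its group. That is my reading of the 0.22 behaviour, not a statement about how pandas implements it internally. The names `ranks`, `counts` and `expected` are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [0, 1, 0], "B": [1.0, np.nan, 2.0]})
grouped = df.groupby("A")["B"]

# Plain ranks within each group; NaN inputs stay NaN with the default options.
ranks = grouped.rank()

# Number of non-NaN values per group, aligned back to the original rows.
counts = grouped.transform("count")

# Assumption: pct rank = rank / non-NaN group size. For the all-NaN group this
# is NaN / 0, which evaluates to NaN in pandas rather than raising.
expected = ranks / counts
print(expected.tolist())  # [0.5, nan, 1.0]
```

This may also serve as a stopgap on an affected version, since the failure is only reported for pct=True. I have not checked it against other `method` values or against ties.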

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.3.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 45 Stepping 7, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.23.4
pytest: None
pip: 18.0
setuptools: 40.2.0
Cython: None
numpy: 1.15.0
scipy: None
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

gfyoung added the Groupby, Missing-data, and Regression labels Aug 27, 2018

gfyoung (Member) commented Aug 27, 2018

Indeed, that does look like a regression.

cc @jreback

kokes (Contributor) commented Aug 28, 2018

git bisect tells us this regression was introduced in c1068d9, which resolves #15779.

AtomBaf (Author) commented Sep 4, 2018

Pardon the interruption, but this is blocking our adoption of 0.23. Is there a schedule for resolving this regression?

gfyoung added this to the 0.23.5 milestone Sep 4, 2018

gfyoung (Member) commented Sep 4, 2018

@AtomBaf: Thank you for the ping! I have marked it for our next minor release.

jreback changed the milestone from 0.23.5 to Contributions Welcome Sep 4, 2018

jreback (Contributor) commented Sep 4, 2018

If a patch is contributed, it can go into 0.23.5, but nothing is currently available.

gfyoung added a commit to forking-repos/pandas that referenced this issue Sep 5, 2018
gfyoung changed the milestone from Contributions Welcome to 0.23.5 Sep 5, 2018
gfyoung added a commit to forking-repos/pandas that referenced this issue Sep 6, 2018
gfyoung added a commit to forking-repos/pandas that referenced this issue Sep 7, 2018
jreback pushed a commit that referenced this issue Sep 8, 2018
aeltanawy pushed a commit to aeltanawy/pandas that referenced this issue Sep 20, 2018
Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this issue Oct 1, 2018

glennlawyer commented

@gfyoung any timing on that next minor release? This bug is also causing me pain.

gfyoung (Member) commented Nov 14, 2018

@glennlawyer: Sorry to hear that! Unfortunately, I do not know when the next version will be released. If you are able to, though, you could consider installing directly from master.
