BUG: groupby.rank with non-unique index groupers #16577

Closed
jreback opened this issue Jun 1, 2017 · 2 comments · Fixed by #44245
Labels
Algos Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff good first issue Groupby Needs Tests Unit test(s) needed to prevent regressions
Comments

@jreback
Contributor

jreback commented Jun 1, 2017

groupby with rank and non-unique groupers that include NaN raises an odd error (the 2nd one below). In reality this cannot reindex properly, which is the real error (the 1st one).

```
In [103]: df = DataFrame({'A': [1., 2., 3., np.nan], 'value': 1.}, index=[pd.Timestamp('20170101', tz='US/Eastern')] * 4)

In [104]: df.groupby([df.index, 'A']).value.rank(ascending=True, pct=True)
ValueError: cannot reindex from a duplicate axis
AttributeError: 'SeriesGroupBy' object has no attribute '_aggregate_item_by_item'
```

but it works when the grouper is a column (and not the index)

```
In [105]: df.reset_index().groupby([df.index, 'A']).value.rank(ascending=True, pct=True)
Out[105]:
0    1.0
1    1.0
2    1.0
3    NaN
Name: value, dtype: float64
```

So there are 2 interrelated bugs here.

xref to #11759 
@jreback jreback added Bug Difficulty Advanced Error Reporting Incorrect or improved errors from pandas Groupby Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate labels Jun 1, 2017
@jreback jreback added this to the Interesting Issues milestone Jun 1, 2017
@jreback jreback modified the milestones: Interesting Issues, Next Major Release Nov 26, 2017
@WillAyd
Member

WillAyd commented Dec 27, 2018

Think this works on master - OK to close?

@mroeschke
Member

Could use a unit test.
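
A regression test for this might look roughly like the sketch below. It is not taken from the linked pull request. The names `build_frame` and `test_rank_with_non_unique_index_grouper` are hypothetical, and the expected values are borrowed from the column-based output shown in the report. It also assumes a pandas version where the index-grouper call no longer raises, and that the result keeps the original duplicated index.

```python
import numpy as np
import pandas as pd


def build_frame():
    # Four rows sharing one tz-aware timestamp, so the index is deliberately non-unique.
    idx = pd.DatetimeIndex([pd.Timestamp("20170101", tz="US/Eastern")] * 4)
    return pd.DataFrame({"A": [1.0, 2.0, 3.0, np.nan], "value": 1.0}, index=idx)


def test_rank_with_non_unique_index_grouper():
    df = build_frame()

    # Group by the duplicated index together with column "A", which contains NaN.
    result = df.groupby([df.index, "A"])["value"].rank(ascending=True, pct=True)

    # Assumption: each non-NaN key forms a single-row group (pct rank 1.0),
    # and the row whose key is NaN is excluded from grouping and comes back as NaN.
    expected = pd.Series([1.0, 1.0, 1.0, np.nan], index=df.index, name="value")
    pd.testing.assert_series_equal(result, expected)
```

On a pandas version that still has the bug, the `rank` call itself should raise, so the test would fail before reaching the comparison.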

@mroeschke mroeschke added good first issue Needs Tests Unit test(s) needed to prevent regressions and removed Bug Error Reporting Incorrect or improved errors from pandas Groupby Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate labels Jun 12, 2021
@mroeschke mroeschke mentioned this issue Oct 31, 2021
9 tasks
@jreback jreback added Algos Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff Groupby labels Oct 31, 2021
@jreback jreback modified the milestones: Contributions Welcome, 1.4 Oct 31, 2021
4 participants