Skip to content

BUG: Fix DataFrame.groupby().apply() for NaN groups with dropna=False #35951

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Sep 5, 2020

Conversation

cwkwong
Copy link
Contributor

@cwkwong cwkwong commented Aug 28, 2020

…5889)

* tests should still fail.

* test dropna=True|False with no np.nan in groupings.

* fix expected outputs, declare expected MultiIndex in resulting
  dataframe after df.group().apply()
* nans at same positions in `level` and `key` compares as equal.
* this makes test pass.

* follow existing style where we create MultiIndex,
  then `set_levels` to reinsert nan for case when
  `dropna=False`, and groups has nan grouping.
* black pandas

* git diff upstream/master -u -- "*.py" | flake8 --diff
@rhshadrach rhshadrach added Apply Apply, Aggregate, Transform, Map Bug Groupby labels Aug 28, 2020
Copy link
Member

@arw2019 arw2019 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @cwkwong for a nice PR!

Some largely cosmetic comments, otherwise lgtm

@arw2019
Copy link
Member

arw2019 commented Aug 29, 2020

@cwkwong might want to have a more descriptive title BUG: Fix DataFrame.groupby().apply() for NaN groups with dropna=False

@cwkwong cwkwong changed the title Gh35889 groupby dropna apply BUG: Fix DataFrame.groupby().apply() for NaN groups with dropna=False Aug 29, 2020
@cwkwong
Copy link
Contributor Author

cwkwong commented Sep 1, 2020

Thanks @arw2019 for approving the changes. What's the next step to hopefully make this as part of the 1.1.2 release?

Copy link
Member

@charlesdong1991 charlesdong1991 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the PR @cwkwong

This should be moved to whatsnew 1.2.0 note. 1.1.2 mainly focuses on regression issue.

@jreback jreback added this to the 1.2 milestone Sep 5, 2020
@jreback jreback merged commit d0de0c6 into pandas-dev:master Sep 5, 2020
@jreback
Copy link
Contributor

jreback commented Sep 5, 2020

thanks @cwkwong very nice

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Apply Apply, Aggregate, Transform, Map Bug Groupby
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: groupby dropna=False with nan value in groupby causes ValueError when apply()
5 participants