BUG: Fix DateFrameGroupBy.mean error for Int64 dtype #32223

Merged: 19 commits, Mar 12, 2020
dsaxton (Member) commented Feb 24, 2020

It looks like this was due to the TypeError not being caught
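For readers without the linked issue open, here is a minimal sketch of the call in question, using the same data as the test parametrization in this PR. It assumes a pandas version that already includes this fix; on an affected version the `mean()` call is where the TypeError surfaced. The cast to `float64` is only there so the printed values do not depend on which float dtype the installed version returns.

```python
import pandas as pd

# Same data as the test parametrization in this PR, stored as nullable Int64.
df = pd.DataFrame(
    {"a": [1, 1, 2, 2, 3, 3], "b": [1, 2, 1, 2, 1, 2]}, dtype="Int64"
)

# On an affected version this call raised TypeError; with the fix it
# returns one float mean per group.
result = df.groupby("a").mean()

# Cast to plain floats so the output is independent of the result dtype.
means = result["b"].astype("float64").tolist()
print(means)  # [1.5, 1.5, 1.5]
```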

{"a": [1, 1, 2, 2, 3, 3], "b": [1, 2, 1, 2, 1, 2]},
],
)
@pytest.mark.parametrize("function", ["mean", "median", "var"])
Member:

Can you use reduction_func fixture instead of picking these one-off?

Member Author:

I'm having some trouble using this fixture without making the test very complex since the methods each have different output (e.g., sometimes it's empty), some require arguments, etc. Would it be okay if these tests only included the ones affected by the bug?
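To illustrate the point about differing outputs, here is a rough sketch comparing three groupby reductions on a small Int64 frame. It does not reproduce the `reduction_func` fixture itself; the three methods are picked by hand as examples, and the variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 1, 2, 2], "b": [1, 2, 1, 2]}, dtype="Int64")
groups = df.groupby("a")

# mean returns a DataFrame of per-group floats ...
mean_result = groups.mean()
# ... count returns a DataFrame of counts, so a float expectation does not fit ...
count_result = groups.count()
# ... and size returns a Series rather than a DataFrame.
size_result = groups.size()

kinds = {
    "mean": type(mean_result).__name__,
    "count": type(count_result).__name__,
    "size": type(size_result).__name__,
}
print(kinds)
print(count_result["b"].tolist())
```

A single parametrized test over every reduction would need a separate expected value and shape for each of these cases, which is the complexity being described.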


@pytest.mark.parametrize(
"values",
[
Contributor:

move this to pandas/tests/groupby/test_function.py

jreback (Contributor) commented Mar 11, 2020

also merge master

jreback added this to the 1.0.2 milestone Mar 11, 2020

groups = pd.DataFrame(values, dtype="Int64").groupby("a")
result = getattr(groups, function)()

output = 0.5 if function == "var" else 1.5
Contributor:

move the expected to the top.

def test_apply_to_nullable_integer_returns_float(values, function):
    # https://github.com/pandas-dev/pandas/issues/32219
    groups = pd.DataFrame(values, dtype="Int64").groupby("a")
    result = getattr(groups, function)()
Contributor:

move result down to the bottom & add testing for
.agg([function]) and .agg(function)
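A rough sketch of what covering all three call forms could look like, written as a standalone script rather than the pytest test that was merged. The helper `as_floats` and the `all_match` flag are illustrative names, not part of the PR; the data and expected values (0.5 for `var`, 1.5 otherwise) come from the snippets above.

```python
import pandas as pd

values = {"a": [1, 1, 2, 2, 3, 3], "b": [1, 2, 1, 2, 1, 2]}
groups = pd.DataFrame(values, dtype="Int64").groupby("a")


def as_floats(series):
    # Normalize to plain floats so the check does not depend on the
    # exact float dtype the installed pandas version returns.
    return series.astype("float64").tolist()


all_match = True
for function in ["mean", "median", "var"]:
    expected = [0.5 if function == "var" else 1.5] * 3

    direct = getattr(groups, function)()  # groups.mean(), etc.
    via_str = groups.agg(function)        # groups.agg("mean")
    via_list = groups.agg([function])     # groups.agg(["mean"])

    results = [
        as_floats(direct["b"]),
        as_floats(via_str["b"]),
        # A list argument yields two-level columns: (column, function).
        as_floats(via_list[("b", function)]),
    ]
    all_match = all_match and all(r == expected for r in results)

print(all_match)
```

Note that the list form nests the function name under the column, so its expected frame differs in column structure from the other two forms.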

jreback (Contributor) left a comment:

small change in tests, ping on green.


groups = pd.DataFrame(values, dtype="Int64").groupby("a")

if use_agg:
Contributor:

you don't need to parameterize on use_agg; just test all 3

Member Author:

@jreback Updated and green

jreback merged commit 9e7cb7c into pandas-dev:master Mar 12, 2020

lumberbot-app (bot) commented Mar 12, 2020

Owee, I'm MrMeeseeks, Look at me.

There seems to be a conflict; please backport manually. Here are approximate instructions:

  1. Check out the backport branch and update it.
$ git checkout 1.0.x
$ git pull
  2. Cherry-pick the first parent branch of this PR on top of the older branch:
$ git cherry-pick -m1 9e7cb7c102655d0ba92d2561c178da9254d5cef5
  3. You will likely have some merge/cherry-pick conflicts here; fix them and commit:
$ git commit -am 'Backport PR #32223: BUG: Fix DateFrameGroupBy.mean error for Int64 dtype'
  4. Push to a named branch:
$ git push YOURFORK 1.0.x:auto-backport-of-pr-32223-on-1.0.x
  5. Create a PR against branch 1.0.x. I would have named this PR:

"Backport PR #32223 on branch 1.0.x"

And apply the correct labels and milestones.

Congratulations, you did some good work! Hopefully your backport PR will be tested by the continuous integration and merged soon!

If these instructions are inaccurate, feel free to suggest an improvement.

jreback (Contributor) commented Mar 12, 2020

thanks @dsaxton

jreback (Contributor) commented Mar 12, 2020

@dsaxton if you can try a manual backport, that would be great :->

TomAugspurger (Contributor):

Thanks @dsaxton. FYI, in the future you can remove the "still needs manual backport" label after the backport is merged.

SeeminSyed pushed a commit to CSCD01-team01/pandas that referenced this pull request Mar 22, 2020
Successfully merging this pull request may close these issues.

calling mean on a DataFrameGroupBy with Int64 dtype results in TypeError
5 participants