-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: pivot_table strings as aggfunc #18810
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Hello @bobhaffner! Thanks for updating the PR. Cheers ! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on December 23, 2017 at 20:43 Hours UTC |
Codecov Report
@@ Coverage Diff @@
## master #18810 +/- ##
=========================================
Coverage ? 91.65%
=========================================
Files ? 154
Lines ? 51368
Branches ? 0
=========================================
Hits ? 47080
Misses ? 4288
Partials ? 0
Continue to review full report at Codecov.
|
doc/source/whatsnew/v0.22.0.txt
Outdated
@@ -321,7 +321,8 @@ Reshaping | |||
- Bug in :func:`DataFrame.stack` which fails trying to sort mixed type levels under Python 3 (:issue:`18310`) | |||
- Fixed construction of a :class:`Series` from a ``dict`` containing ``NaN`` as key (:issue:`18480`) | |||
- Bug in :func:`Series.rank` where ``Series`` containing ``NaT`` modifies the ``Series`` inplace (:issue:`18521`) | |||
- | |||
- Bug in :func:`Dataframe.pivot_table` which fails when you pass a string for aggfunc arg (:issue:`18713`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We generally don't use the second-person (or any such person-pronouns in general).
Thanks for the feedback @gfyoung. |
pandas/tests/reshape/test_pivot.py
Outdated
stds = f(np.std) | ||
expected = concat([means, stds], keys=['mean', 'std'], axis=1) | ||
tm.assert_frame_equal(result, expected) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like what you're testing here, but I would modify a couple of things:
- Add a newline after the first
tm.assert_frame_equal
. It makes it easier to read. - Construct the output explicitly using the
DataFrame
constructor (and check against that too). At this point, I know for example thatf('sum')
andf(np.sum)
should be the same, but I don't know what the actual result is. Here is how I might go about writing the first half of this test.
...
result = f('sum')
func_result = f(np.sum)
expected = DataFrame(...)
tm.assert_frame_equal(result, func_result) # consistency
tm.assert_frame_equal(result, expected) # correctness
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can parametrize this test by passing these functions in as well
something like
@pytest.mark.parametrize("f, f_numpy", [('sum', np.sum), ('mean', np.mean'), ('std', np.std])
def test_pivot_func_string(self, f, f_numpy):
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, both, I'll incorporate these changes
pandas/tests/reshape/test_pivot.py
Outdated
stds = f(np.std) | ||
expected = concat([means, stds], keys=['mean', 'std'], axis=1) | ||
tm.assert_frame_equal(result, expected) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can parametrize this test by passing these functions in as well
something like
@pytest.mark.parametrize("f, f_numpy", [('sum', np.sum), ('mean', np.mean'), ('std', np.std])
def test_pivot_func_string(self, f, f_numpy):
6a1f4f6
to
fe36303
Compare
Separating the cases into two functions seemed like the right thing to do. As opposed to one rather long one with multiple |
pandas/tests/reshape/test_pivot.py
Outdated
(['sum', 'mean'], [np.sum, np.mean]), | ||
(['sum', 'std'], [np.sum, np.std]), | ||
(['std', 'mean'], [np.std, np.mean])] | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove newline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a weird one, flake8 likes that newline
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of using funcs, simply put this in the parametrize
pandas/tests/reshape/test_pivot.py
Outdated
|
||
@pytest.mark.parametrize("f, f_numpy", funcs) | ||
def test_pivot_string_func_vs_func(self, f, f_numpy): | ||
# GH #18713 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add that this test is for consistency purposes.
@@ -1109,6 +1109,48 @@ def test_pivot_margins_name_unicode(self): | |||
expected = pd.DataFrame(index=index) | |||
tm.assert_frame_equal(table, expected) | |||
|
|||
def test_pivot_string_as_func(self): | |||
# GH #18713 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add that this test is for correctness.
Seems reasonable to me. |
Just following up on this to make sure no other changes are required. No worries if folks are busy |
pandas/tests/reshape/test_pivot.py
Outdated
(['sum', 'mean'], [np.sum, np.mean]), | ||
(['sum', 'std'], [np.sum, np.std]), | ||
(['std', 'mean'], [np.std, np.mean])] | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of using funcs, simply put this in the parametrize
c4921c1
to
1a9cfeb
Compare
doc/source/whatsnew/v0.22.0.txt
Outdated
@@ -341,7 +341,8 @@ Reshaping | |||
- Bug in :func:`DataFrame.stack` which fails trying to sort mixed type levels under Python 3 (:issue:`18310`) | |||
- Fixed construction of a :class:`Series` from a ``dict`` containing ``NaN`` as key (:issue:`18480`) | |||
- Bug in :func:`Series.rank` where ``Series`` containing ``NaT`` modifies the ``Series`` inplace (:issue:`18521`) | |||
- | |||
- Bug in :func:`Dataframe.pivot_table` which fails when the ``aggfunc`` arg is of type string. The behavior is now consistent with other methods like ``agg`` and ``apply`` (:issue:`18713`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rebase on master. can you move to 0.23 (docs were renamed), prob easiest to just check this file from master and past in new one
1a9cfeb
to
0decc05
Compare
Very interesting. Looks like the rebase handled the whatsnew change quite nicely
|
@bobhaffner : This is |
@gfyoung Haha, gotcha :-) |
lgtm. ping on green. |
@jreback is this intended for me or some kind of shorthand for a notification process? At any rate, ping :-) Looks like there's a conflict with the whatsnew piece being changed back to the old name. I can follow suit or wait till this is sorted out. |
thanks @bobhaffner yes 'ping on green' is a notification of you to ping me when this has passed and all buttons are green. |
git diff upstream/master -u -- "*.py" | flake8 --diff