-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: DataFrame.agg and apply with 'size' returns a scalar #39935
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
rhshadrach
commented
Feb 20, 2021
- closes BUG: DataFrame.agg and apply with 'size' returns a scalar #39934
- tests added / passed
- Ensure all linting tests pass, see here for how to run them
- whatsnew entry
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rebase as well.
It might be better to detect this argument 'size' and do a translation in
pandas/core/base.py: _builtin_table
|
||
obj = self.obj | ||
|
||
if f == "size" and isinstance(obj, ABCDataFrame): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doesn't this affect Series as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For a Series, size gives the number of rows and so the default path calling obj.size
produces the correct result.
@jreback - I merged master.
This isn't a builtin; |
my point is to map this to a lambda function to avoid special casing like this |
@jreback - makes sense, and I agree your suggestion is the prefered way. The one issue is that when replacing
It seems easier and less error-prone to go with 1, but I can do 2 here if you prefer. |
yeah happy to do this in the future. 2 sounds interesting actually. why don't you add a TODO for now (and issue) then can merge this one. |
@jreback TODO added, issue opened, and green(ish - failure is unrelated). |
can you merge master & ping on greenish |
…g_size � Conflicts: � pandas/tests/apply/test_frame_apply.py
thanks @rhshadrach |
if f == "size" and isinstance(obj, ABCDataFrame): | ||
# Special-cased because DataFrame.size returns a single scalar | ||
value = obj.shape[self.axis] | ||
return obj._constructor_sliced(value, index=self.agg_axis, name="size") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it intentional to set the name here? This isn't consistent with other string agg methods
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, thanks @TheNeuralBit! Opened #42046
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!