BUG: Incorrect index shape when using a user-defined function for aggregating a grouped series with object-typed index. #40835
@@ -738,8 +738,12 @@ def agg_series(self, obj: Series, func: F):
             # TODO: can we get a performant workaround for EAs backed by ndarray?
             return self._aggregate_series_pure_python(obj, func)
 
-        elif obj.index._has_complex_internals:
-            # Preempt TypeError in _aggregate_series_fast
+        elif obj.index._has_complex_internals or obj.index.dtype == "object":
+            # (complex internals): Preempt TypeError in _aggregate_series_fast
+            # (object index dtype): _aggregate_series_fast raises TypeError
+            # when applying func because the group indices become malformed:
+            # correct indices are in group.index._index_data, but not in
+            # group.index._data.
On master, replacing the lambda in the linked issue with

I see. I will keep looking for the root cause and try to provide a fix. I will mark this PR as a draft in the meantime.

@rhshadrach I have updated this PR.

I think what happens here is that

    import pandas.core.common as com

    def foo(x):
        com.require_length_match(x.values, x.index)
        return x
             return self._aggregate_series_pure_python(obj, func)
 
         try:
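The guard in the hunk above can be illustrated with a minimal sketch in plain Python. It is not the pandas implementation: `ToyIndex`, `aggregate_fast`, `aggregate_pure_python`, and `checked_sum` are hypothetical stand-ins, and the fast path here simply raises `TypeError` for an object-dtype index to mimic the failure described in the code comment. The sketch only shows the dispatch idea: route such inputs to the slow path up front so the user-defined function always sees values and labels of matching length.

```python
from dataclasses import dataclass
from typing import Any, Dict, Sequence


@dataclass
class ToyIndex:
    """Hypothetical stand-in for an index; not the pandas Index class."""

    labels: Sequence[Any]
    dtype: str = "object"
    has_complex_internals: bool = False


def aggregate_pure_python(values, index, group_ids, func):
    """Slow path: collect each group's values and labels, then call func."""
    groups: Dict[Any, tuple] = {}
    for value, label, gid in zip(values, index.labels, group_ids):
        group_values, group_labels = groups.setdefault(gid, ([], []))
        group_values.append(value)
        group_labels.append(label)
    return {gid: func(*groups[gid]) for gid in sorted(groups)}


def aggregate_fast(values, index, group_ids, func):
    """Stand-in for a fast path that cannot handle object-dtype indexes."""
    if index.dtype == "object":
        raise TypeError("fast path does not support an object-dtype index")
    # Assumption: for supported dtypes the result equals the slow path's.
    return aggregate_pure_python(values, index, group_ids, func)


def agg_series(values, index, group_ids, func):
    """Choose the slow path up front instead of waiting for a TypeError."""
    if index.has_complex_internals or index.dtype == "object":
        return aggregate_pure_python(values, index, group_ids, func)
    return aggregate_fast(values, index, group_ids, func)


def checked_sum(group_values, group_labels):
    """User-defined aggregation that also checks values and labels line up."""
    if len(group_values) != len(group_labels):
        raise ValueError("length of values does not match length of index")
    return sum(group_values)
```

Calling `agg_series([1, 2, 3, 4], ToyIndex(["a", "b", "c", "d"]), [0, 0, 1, 1], checked_sum)` goes through the slow path because the toy index reports an object dtype, while an index with `dtype="int64"` would be handed to the fast path.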
can you be more specific here? as a user i am not sure what i am to do with this statement
add a `for example df.groupby(....)......` or similar; in particular, this is only for a user-supplied function (e.g. via .apply), right?
Adjusted, hope this is more clear?