-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: Fix for GH #14848 for groupby().describe() with tuples as the Index #15110
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 3 commits
9489cb2
f3a7a21
fbd20f5
db13c3b
c18c6cb
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1490,6 +1490,19 @@ def test_frame_describe_multikey(self): | |
for name, group in groupedT: | ||
assert_frame_equal(result[name], group.describe()) | ||
|
||
def test_frame_describe_tupleindex(self): | ||
|
||
# GH 14848 - regression from 0.19.0 to 0.19.1 | ||
df1 = DataFrame({'x': [1, 2, 3, 4, 5] * 3, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. comment inside the test There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK |
||
'y': [10, 20, 30, 40, 50] * 3, | ||
'z': [100, 200, 300, 400, 500] * 3}) | ||
df1['k'] = [(0, 0, 1), (0, 1, 0), (1, 0, 0)] * 5 | ||
df2 = df1.rename(columns={'k': 'key'}) | ||
result = df1.groupby('k').describe() | ||
expected = df2.groupby('key').describe() | ||
expected.index.set_names(result.index.names, inplace=True) | ||
assert_frame_equal(result, expected) | ||
|
||
def test_frame_groupby(self): | ||
grouped = self.tsframe.groupby(lambda x: x.weekday()) | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1626,7 +1626,11 @@ def __init__(self, objs, axis=0, join='outer', join_axes=None, | |
clean_objs.append(v) | ||
objs = clean_objs | ||
name = getattr(keys, 'name', None) | ||
keys = Index(clean_keys, name=name) | ||
# GH 14848 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. why do we need the check here? the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @jreback Because in the original issue, if There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. hmm maybe this should be handled in the _get_grouper logic then where you first figure out what the name actually is There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @jreback It's not a grouper issue. Here's code that fails with 0.19.2 when passing an
The value of I know the example is a bit far-fetched, but it is equivalent to what is going on when the grouper calls There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
this is with this PR, something wrong There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @jreback Looks like I didn't push the last set of changes. Sorry about that. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. so I remove the change in this file and everything still works. Please do that and push up again, the only fix here needed was in MultiIndex. |
||
# If you already have an Index, no need | ||
# to recreate it | ||
if not isinstance(keys, Index): | ||
keys = Index(clean_keys, name=name) | ||
|
||
if len(objs) == 0: | ||
raise ValueError('All objects passed were None') | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would use
not is_list_like
here (which excludes things like TImestamps and strings), but allows other list-like things