Feature/groupby repr ellipses 1135 #24853
Changes from 14 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -38,6 +38,7 @@ | |
from pandas.core.arrays import ExtensionArray | ||
from pandas.core.base import IndexOpsMixin, PandasObject | ||
import pandas.core.common as com | ||
from pandas.core.config import get_option | ||
from pandas.core.indexes.frozen import FrozenList | ||
import pandas.core.missing as missing | ||
from pandas.core.ops import get_op_result_name, make_invalid_op | ||
|
@@ -4493,7 +4494,7 @@ def groupby(self, values): | |
# map to the label | ||
result = {k: self.take(v) for k, v in compat.iteritems(result)} | ||
|
||
- return result | ||
+ return IndexGroupbyGroups(result) | ||
|
||
def map(self, mapper, na_action=None): | ||
""" | ||
|
@@ -5290,6 +5291,27 @@ def _add_logical_methods_disabled(cls): | |
Index._add_comparison_methods() | ||
|
||
|
||
class IndexGroupbyGroups(dict): | ||
def __repr__(self): | ||
Review comment: this just needs to call pandas.io.formats.printing.pprint_thing (with the option passed). Review comment: give a class docstring here |
||
nitems = get_option('display.max_rows') or len(self) | ||
|
||
fmt = u("{{{things}}}") | ||
pfmt = u("{key}: {val}") | ||
|
||
pairs = [] | ||
for k, v in list(self.items()): | ||
pairs.append(pfmt.format(key=k, val=v)) | ||
|
||
if nitems < len(self): | ||
print("Truncating repr") | ||
start_cnt, end_cnt = nitems - int(nitems / 2), int(nitems / 2) | ||
return fmt.format(things=", ".join(pairs[:start_cnt]) + | ||
", ... , " + | ||
", ".join(pairs[-end_cnt:])) | ||
else: | ||
return fmt.format(things=", ".join(pairs)) | ||
|
||
|
||
def ensure_index_from_sequences(sequences, names=None): | ||
""" | ||
Construct an index from sequences of data. | ||
|
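The review comment on `__repr__` in the diff above suggests delegating to pandas.io.formats.printing.pprint_thing instead of formatting the pairs by hand. A minimal sketch of that idea, assuming a pandas installation where pprint_thing accepts a max_seq_items keyword and assuming display.max_rows is the option the reviewer has in mind. The class name is taken from the diff; this is not the merged implementation. Note that pprint_thing appears to elide trailing items rather than the middle, so the output differs from the ", ... ," form in the diff.

```python
import pandas as pd
from pandas.io.formats.printing import pprint_thing


class IndexGroupbyGroups(dict):
    """Dict of group label -> index whose repr respects display.max_rows."""

    def __repr__(self):
        # Assumption: display.max_rows is the option to pass through.
        limit = pd.get_option("display.max_rows")
        return pprint_thing(self, max_seq_items=limit)


groups = IndexGroupbyGroups({1: [0, 1, 2], 2: [3, 4], 3: [5]})

# With a limit of 2, only the first two groups are printed.
with pd.option_context("display.max_rows", 2):
    short = repr(groups)

# Outside the context the default limit applies and nothing is elided.
full = repr(groups)
```

Delegating keeps the truncation rule in one place, at the cost of following whatever elision style pprint_thing uses.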
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1797,6 +1797,22 @@ def test_period(self): | |
assert str(df) == exp | ||
|
||
|
||
class TestDataFrameGroupByFormatting(object): | ||
Review comment: this goes in pandas/tests/groupby/test_grouping.py near the other repr tests |
||
def test_groups_repr_truncates(self): | ||
df = pd.DataFrame({ | ||
'a': [1, 1, 1, 2, 2, 3], | ||
'b': [1, 2, 3, 4, 5, 6] | ||
}) | ||
|
||
with option_context('display.max_rows', 2): | ||
Review comment: can you also try a grouper like |
||
x = df.groupby('a').groups | ||
assert ', ... ,' in x.__repr__() | ||
|
||
with option_context('display.max_rows', 5): | ||
x = df.groupby('a').groups | ||
assert ', ... ,' not in x.__repr__() | ||
|
||
|
||
def gen_series_formatting(): | ||
s1 = pd.Series(['a'] * 100) | ||
s2 = pd.Series(['ab'] * 100) | ||
|
this should be defined in pandas/core/groupby/groupby.py and used on the dict-returning functions, mainly groups and indices.
this is not a user-facing function and kind of obscures that this is from groupby
actually, maybe best to put this in pandas.io.formats.printing and name it PrettyDict(dict); see my other notes for where to actually use this.
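To make the PrettyDict(dict) suggestion concrete, here is a minimal pure-Python sketch of a dict subclass whose repr elides the middle items, in the spirit of the `__repr__` in the diff. The max_items attribute is a hypothetical stand-in for the display option and is not a pandas setting. The sketch also guards the case where the tail count is zero, since a slice such as pairs[-0:] returns the whole list.

```python
class PrettyDict(dict):
    """Dict whose repr elides the middle items when there are too many."""

    # Hypothetical limit standing in for a display option.
    max_items = 4

    def __repr__(self):
        pairs = ["{!r}: {!r}".format(k, v) for k, v in self.items()]
        limit = self.max_items
        if not limit or len(pairs) <= limit:
            return "{" + ", ".join(pairs) + "}"
        head = limit - limit // 2  # the extra item goes to the head
        tail = limit // 2
        # pairs[-0:] would return everything, so handle tail == 0 explicitly.
        shown = pairs[:head] + ["..."] + (pairs[-tail:] if tail else [])
        return "{" + ", ".join(shown) + "}"


d = PrettyDict((i, i * 10) for i in range(6))
print(d)  # {0: 0, 1: 10, ..., 4: 40, 5: 50}
```

Because only `__repr__` is overridden, the object still behaves as a plain dict for lookups, iteration, and equality.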
Sure, I'll move this. To your first point, are you saying a groups property should do something like this:
return PrettyDict(self.grouper.groups)
? @WillAyd had noted that he'd prefer to put this in a different place, as multiple calls to the method will create a new object every time.
yes; that would be a centralized place
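The concern raised in this exchange, that wrapping inside the groups property builds a new object on every call, can be sketched as follows. All class names here other than PrettyDict are hypothetical stand-ins and not pandas classes, and the caching variant is just one possible way to address the concern, not something proposed in the thread.

```python
class PrettyDict(dict):
    """Placeholder for the dict subclass discussed in the review."""


class Grouper:
    """Hypothetical stand-in for the object that owns the mapping."""

    def __init__(self, groups):
        self.groups = groups


class WrapOnAccess:
    """Wraps on every property access, as in the snippet quoted above."""

    def __init__(self, grouper):
        self.grouper = grouper

    @property
    def groups(self):
        # A fresh PrettyDict is built each time the property is read.
        return PrettyDict(self.grouper.groups)


class WrapOnce:
    """Wraps on first access and reuses the same object afterwards."""

    def __init__(self, grouper):
        self.grouper = grouper
        self._groups = None

    @property
    def groups(self):
        if self._groups is None:
            self._groups = PrettyDict(self.grouper.groups)
        return self._groups


g = Grouper({1: [0, 1], 2: [2]})
a, b = WrapOnAccess(g), WrapOnce(g)
print(a.groups == a.groups, a.groups is a.groups)  # True False
print(b.groups is b.groups)                        # True
```

Both variants compare equal by value; they differ only in object identity and in the cost of copying the mapping on each access.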
The topic initially came up because instantiating in the groups method was failing this test:
@WillAyd Thoughts?
What @jreback says here makes a lot of sense - go ahead with the move!
Thanks, I'll remove this test as part of the next commit.