ENH: DataFrame.pivot accepts a list of values #18636
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4355,8 +4355,8 @@ def pivot(self, index=None, columns=None, values=None): | |
existing index. | ||
columns : string or object | ||
Column name to use to make new frame's columns | ||
Review comment: This is probably due to merging master, but can you undo this change? (There have been changes to this docstring on master, and you have accidentally reverted some of them.) |
||
values : string or object, optional | ||
Column name to use for populating new frame's values. If not | ||
values : string, object or (0.23.0) a list of the previous, optional | ||
Review comment: Put the version after the word "list". |
||
Column name(s) to use for populating new frame's values. If not | ||
Review comment: Add (0.23.0) after 'list of the previous'.
|
||
specified, all remaining columns will be used and the result will | ||
have hierarchically indexed columns | ||
|
||
|
@@ -4381,15 +4381,16 @@ def pivot(self, index=None, columns=None, values=None): | |
|
||
>>> df = pd.DataFrame({'foo': ['one','one','one','two','two','two'], | ||
'bar': ['A', 'B', 'C', 'A', 'B', 'C'], | ||
'baz': [1, 2, 3, 4, 5, 6]}) | ||
'baz': [1, 2, 3, 4, 5, 6], | ||
'zoo': ['x', 'y', 'z', 'q', 'w', 't']}) | ||
>>> df | ||
foo bar baz | ||
0 one A 1 | ||
1 one B 2 | ||
2 one C 3 | ||
3 two A 4 | ||
4 two B 5 | ||
5 two C 6 | ||
foo bar baz zoo | ||
0 one A 1 x | ||
1 one B 2 y | ||
2 one C 3 z | ||
3 two A 4 q | ||
4 two B 5 w | ||
5 two C 6 t | ||
|
||
>>> df.pivot(index='foo', columns='bar', values='baz') | ||
A B C | ||
|
@@ -4401,6 +4402,15 @@ def pivot(self, index=None, columns=None, values=None): | |
one 1 2 3 | ||
two 4 5 6 | ||
|
||
>>> df.pivot(index='foo', columns='bar', values=['baz']) | ||
Review comment: Oh, we already have this example, I see. OK, you can remove this one (with |
||
A B C | ||
one 1 2 3 | ||
two 4 5 6 | ||
|
||
>>> df.pivot(index='foo', columns='bar', values=['baz', 'zoo']) | ||
Review comment: Can you show an example using a single value first? |
||
A B C A B C | ||
one 1 2 3 x y z | ||
two 4 5 6 q w t | ||
Review comment: This output doesn't seem correct. Should there be multi-indexed columns?
Reply: Yes, that's right. Fixed. |
||
|
||
Review comment: Can you also undo this removal of blank lines?
Reply: All fixed now @jorisvandenbossche. |
||
""" | ||
from pandas.core.reshape.reshape import pivot | ||
|
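The reviewer's point about multi-indexed columns can be checked directly. The following is a minimal sketch, assuming a pandas version in which `DataFrame.pivot` accepts a list for `values` (the feature this PR proposes for 0.23.0); the variable names `wide` and `single` are illustrative only.

```python
import pandas as pd

# Same frame as in the docstring example above.
df = pd.DataFrame({'foo': ['one', 'one', 'one', 'two', 'two', 'two'],
                   'bar': ['A', 'B', 'C', 'A', 'B', 'C'],
                   'baz': [1, 2, 3, 4, 5, 6],
                   'zoo': ['x', 'y', 'z', 'q', 'w', 't']})

# A single column name gives flat columns taken from 'bar'.
single = df.pivot(index='foo', columns='bar', values='baz')

# A list of column names gives two column levels:
# the value column name on top, the 'bar' categories below.
wide = df.pivot(index='foo', columns='bar', values=['baz', 'zoo'])

print(single.columns.nlevels)  # number of column levels for a scalar
print(wide.columns.nlevels)    # number of column levels for a list
print(wide['zoo'])             # select one value column from the top level
```

Selecting `wide['baz']` or `wide['zoo']` returns a frame shaped like the single-value result, which is why the docstring output needs the extra header row.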
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -368,15 +368,18 @@ def pivot(self, index=None, columns=None, values=None): | |
cols = [columns] if index is None else [index, columns] | ||
append = index is None | ||
indexed = self.set_index(cols, append=append) | ||
return indexed.unstack(columns) | ||
else: | ||
if index is None: | ||
index = self.index | ||
index = self.index if index is None else self[index] | ||
index = MultiIndex.from_arrays([index, self[columns]]) | ||
if is_list_like(values): | ||
Review comment: I think you need:
Review comment: Did you need to check if it's a tuple?
Reply: Yes, I'm trying to figure out why the MultiIndex with a tuple of values test is giving me an error. It would be great if you could give me a hand! |
||
# use DF in case of Iterable values (e.g: list, Series, np.array) | ||
indexed = DataFrame(self[values].values, | ||
index=index, | ||
columns=values) | ||
else: | ||
index = self[index] | ||
indexed = Series(self[values].values, | ||
index=MultiIndex.from_arrays([index, self[columns]])) | ||
return indexed.unstack(columns) | ||
indexed = Series(self[values].values, | ||
index=index) | ||
return indexed.unstack(columns) | ||
|
||
|
||
def pivot_simple(index, columns, values): | ||
|
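The branch logic in this hunk can be sketched outside of pandas internals. This is an illustrative stand-alone version, not the code under review: `pivot_sketch` is a hypothetical name, it uses only public pandas API, and the tuple exclusion is an assumption based on the reviewer's question about tuples (a tuple can be a single column name when the frame has MultiIndex columns).

```python
import pandas as pd
from pandas.api.types import is_list_like


def pivot_sketch(frame, index, columns, values):
    """Hypothetical re-implementation of the `values is not None` branch."""
    # Row keys come from the existing index or from the named column.
    row_keys = frame.index if index is None else frame[index]
    multi = pd.MultiIndex.from_arrays([row_keys, frame[columns]])

    # Assumption: a tuple is treated as one column name, not as a list.
    if is_list_like(values) and not isinstance(values, tuple):
        values = list(values)
        # Several value columns: unstack a DataFrame.
        indexed = pd.DataFrame(frame[values].values,
                               index=multi, columns=values)
    else:
        # One value column: unstack a Series.
        indexed = pd.Series(frame[values].values, index=multi)
    return indexed.unstack(columns)


df = pd.DataFrame({'foo': ['one', 'one', 'one', 'two', 'two', 'two'],
                   'bar': ['A', 'B', 'C', 'A', 'B', 'C'],
                   'baz': [1, 2, 3, 4, 5, 6],
                   'zoo': ['x', 'y', 'z', 'q', 'w', 't']})

one = pivot_sketch(df, 'foo', 'bar', 'baz')
many = pivot_sketch(df, 'foo', 'bar', ['baz', 'zoo'])
```

Building the MultiIndex once and sharing it between both branches mirrors the refactoring in the diff, where the index construction moves above the `is_list_like` check.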
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -371,6 +371,33 @@ def test_pivot_periods(self): | |
pv = df.pivot(index='p1', columns='p2', values='data1') | ||
tm.assert_frame_equal(pv, expected) | ||
|
||
@pytest.mark.parametrize('values', [ | ||
['bar', 'baz'], np.array(['bar', 'baz']), | ||
pd.Series(['bar', 'baz']), pd.Index(['bar', 'baz']) | ||
]) | ||
def test_pivot_with_list_like_values(self, values): | ||
# issue #17160 | ||
df = pd.DataFrame({'foo': ['one', 'one', 'one', 'two', 'two', 'two'], | ||
Review comment: Add the issue number as a comment. |
||
'bar': ['A', 'B', 'C', 'A', 'B', 'C'], | ||
'baz': [1, 2, 3, 4, 5, 6], | ||
'zoo': ['x', 'y', 'z', 'q', 'w', 't']}) | ||
|
||
result = df.pivot(index='zoo', columns='foo', values=values) | ||
Review comment: Add a test with values as a tuple (it should fail). |
||
|
||
data = [[np.nan, 'A', np.nan, 4], | ||
[np.nan, 'C', np.nan, 6], | ||
[np.nan, 'B', np.nan, 5], | ||
['A', np.nan, 1, np.nan], | ||
['B', np.nan, 2, np.nan], | ||
['C', np.nan, 3, np.nan]] | ||
Review comment: Where are all the NaNs coming from? They don't seem to be there in the docstring example.
Reply: Ah, I see, because here 'zoo' is used for the index and not 'foo'. |
||
index = Index(data=['q', 't', 'w', 'x', 'y', 'z'], name='zoo') | ||
columns = MultiIndex(levels=[['bar', 'baz'], ['one', 'two']], | ||
Review comment: Add a test with a MultiIndex and pass values as a tuple.
Reply: Are you looking for something like the following? (Please correct me if I am wrong.)
then pivot it:
so the output would be:
Reply: ping @jreback
Reply: Yep. |
||
labels=[[0, 0, 1, 1], [0, 1, 0, 1]], | ||
names=[None, 'foo']) | ||
expected = DataFrame(data=data, index=index, | ||
columns=columns, dtype='object') | ||
tm.assert_frame_equal(result, expected) | ||
|
||
def test_margins(self): | ||
def _check_output(result, values_col, index=['A', 'B'], | ||
columns=['C'], | ||
|
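The NaN question raised in the review can be reproduced with the test's own frame. This is a small sketch, assuming a pandas version where `values` may be a list; `wide` and `n_missing` are illustrative names. Every `zoo` value is unique, so each output row has data under only one of the two `foo` categories, and the other half of the row is missing.

```python
import pandas as pd

df = pd.DataFrame({'foo': ['one', 'one', 'one', 'two', 'two', 'two'],
                   'bar': ['A', 'B', 'C', 'A', 'B', 'C'],
                   'baz': [1, 2, 3, 4, 5, 6],
                   'zoo': ['x', 'y', 'z', 'q', 'w', 't']})

# Pivot on the unique 'zoo' column instead of 'foo'.
wide = df.pivot(index='zoo', columns='foo', values=['bar', 'baz'])

# Each of the 6 rows fills 2 of its 4 cells; the rest are NaN.
n_missing = int(wide.isna().sum().sum())
print(wide)
print(n_missing)
```

Using `index='foo'` and `columns='bar'` instead, as in the docstring, produces a fully populated result because every (`foo`, `bar`) pair occurs exactly once.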
Review comment: Can you revert this file?
Reply: Sure.