ENH: DataFrame.pivot accepts a list of values #18636
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -77,16 +77,13 @@ Other Enhancements
 - :func:`Series.fillna` now accepts a Series or a dict as a ``value`` for a categorical dtype (:issue:`17033`)
 - :func:`pandas.read_clipboard` updated to use qtpy, falling back to PyQt5 and then PyQt4, adding compatibility with Python3 and multiple python-qt bindings (:issue:`17722`)
 - Improved wording of ``ValueError`` raised in :func:`read_csv` when the ``usecols`` argument cannot match all columns. (:issue:`17301`)
 - :func:`DataFrame.pivot` now accepts a list of values (:issue:`17160`).

 .. _whatsnew_0220.api_breaking:

 Backwards incompatible API changes
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-- :func:`Series.fillna` now raises a ``TypeError`` instead of a ``ValueError`` when passed a list, tuple or DataFrame as a ``value`` (:issue:`18293`)
-- :func:`pandas.DataFrame.merge` no longer casts a ``float`` column to ``object`` when merging on ``int`` and ``float`` columns (:issue:`16572`)
-- The default NA value for :class:`UInt64Index` has changed from 0 to ``NaN``, which impacts methods that mask with NA, such as ``UInt64Index.where()`` (:issue:`18398`)

 .. _whatsnew_0220.api_breaking.deps:

 Dependencies have increased minimum versions
@@ -104,8 +101,6 @@ If installed, we now require:
 +-----------------+-----------------+----------+


-
-
 .. _whatsnew_0220.api:

 Other API Changes
@@ -129,6 +124,10 @@ Other API Changes
 - :func:`DataFrame.from_items` provides a more informative error message when passed scalar values (:issue:`17312`)
 - When created with duplicate labels, ``MultiIndex`` now raises a ``ValueError``. (:issue:`17464`)
 - Building from source now explicitly requires ``setuptools`` in ``setup.py`` (:issue:`18113`)
+- :func:`Series.fillna` now raises a ``TypeError`` instead of a ``ValueError`` when passed a list, tuple or DataFrame as a ``value`` (:issue:`18293`)
Review comment: looks like you picked up some other commits here.
Reply: those are merged on master now, so I think merging them as is won't harm (or duplicate).
Reply: I am not sure if it will do harm or not, but can you still fix this? (It makes reviewing harder as there are unrelated changes.)
should do it
Reply: Merged
+- :func:`pandas.DataFrame.merge` no longer casts a ``float`` column to ``object`` when merging on ``int`` and ``float`` columns (:issue:`16572`)
+- The default NA value for :class:`UInt64Index` has changed from 0 to ``NaN``, which impacts methods that mask with NA, such as ``UInt64Index.where()`` (:issue:`18398`)
+- Building pandas for development now requires ``cython >= 0.24`` (:issue:`18613`)

 .. _whatsnew_0220.deprecations:

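The enhancement entry in the release notes above (``DataFrame.pivot`` accepting a list of values) might look like this in use. This is a minimal sketch, assuming a pandas release that includes the change; the frame and its column names are made up for illustration and are not taken from the PR.

```python
import pandas as pd

# Hypothetical long-format data: one row per (foo, bar) pair.
df = pd.DataFrame({
    "foo": ["one", "one", "one", "two", "two", "two"],
    "bar": ["A", "B", "C", "A", "B", "C"],
    "baz": [1, 2, 3, 4, 5, 6],
    "zoo": [10, 20, 30, 40, 50, 60],
})

# Passing a list to ``values`` pivots several value columns at once.
wide = df.pivot(index="foo", columns="bar", values=["baz", "zoo"])

# The columns form a two-level MultiIndex: (value column, "bar" label).
print(wide.columns.tolist())
print(wide.loc["two", ("zoo", "B")])  # 50
```

Each cell is then addressed by a ``(value column, bar label)`` tuple, which is why the expected frame in the test for this PR is built with ``MultiIndex`` columns.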
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -374,15 +374,17 @@ def pivot(self, index=None, columns=None, values=None):
         cols = [columns] if index is None else [index, columns]
         append = index is None
         indexed = self.set_index(cols, append=append)
-        return indexed.unstack(columns)
     else:
-        if index is None:
-            index = self.index
+        index = self.index if index is None else self[index]
+        index = MultiIndex.from_arrays([index, self[columns]])
+        if isinstance(values, list):
Review comment: use
Review comment: add a comment here on what is going on
+            indexed = DataFrame(self[values].values,
+                                index=index,
+                                columns=values)
         else:
-            index = self[index]
-        indexed = Series(self[values].values,
-                         index=MultiIndex.from_arrays([index, self[columns]]))
-        return indexed.unstack(columns)
+            indexed = Series(self[values].values,
+                             index=index)
+    return indexed.unstack(columns)


 def pivot_simple(index, columns, values):
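To make the new branch in the hunk above easier to follow, here is a stand-alone sketch of the same idea: build a two-level ``MultiIndex`` from the row and column keys, wrap the selected values in a ``DataFrame`` (list of values) or a ``Series`` (single value column), and unstack the column level. The function name ``pivot_sketch`` and the sample frame are hypothetical, and this is not the pandas implementation itself.

```python
import pandas as pd


def pivot_sketch(frame, index, columns, values):
    """Hypothetical stand-alone version of the list-of-values branch."""
    # Row labels come from the existing index or from a named column.
    row_labels = frame.index if index is None else frame[index]
    # Level names are inferred from the Series names (e.g. "foo", "bar").
    multi = pd.MultiIndex.from_arrays([row_labels, frame[columns]])
    if isinstance(values, list):
        # Several value columns: keep them as DataFrame columns.
        indexed = pd.DataFrame(frame[values].values, index=multi, columns=values)
    else:
        # A single value column: a Series is enough.
        indexed = pd.Series(frame[values].values, index=multi)
    # Move the ``columns`` level from the row index into the columns.
    return indexed.unstack(columns)


df = pd.DataFrame({
    "foo": ["one", "one", "one", "two", "two", "two"],
    "bar": ["A", "B", "C", "A", "B", "C"],
    "baz": [1, 2, 3, 4, 5, 6],
    "zoo": [10, 20, 30, 40, 50, 60],
})

multi_out = pivot_sketch(df, "foo", "bar", ["baz", "zoo"])
single_out = pivot_sketch(df, "foo", "bar", "baz")
```

Sharing the ``MultiIndex`` between both branches is what lets a single ``unstack`` call at the end serve the ``Series`` and the ``DataFrame`` case alike.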
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -353,6 +353,29 @@ def test_pivot_periods(self):
         pv = df.pivot(index='p1', columns='p2', values='data1')
         tm.assert_frame_equal(pv, expected)

+    def test_pivot_with_multi_values(self):
Review comment: say with list_like_values rather than multi_values
+        df = pd.DataFrame({'foo': ['one', 'one', 'one', 'two', 'two', 'two'],
Review comment: add the issue number as a comment
+                           'bar': ['A', 'B', 'C', 'A', 'B', 'C'],
+                           'baz': [1, 2, 3, 4, 5, 6],
+                           'zoo': ['x', 'y', 'z', 'q', 'w', 't']})
+
+        results = df.pivot(index='zoo', columns='foo', values=['bar', 'baz'])
Review comment: use result=
+
+        data = [[None, 'A', None, 4],
Review comment: use np.nan rather than None
+                [None, 'C', None, 6],
+                [None, 'B', None, 5],
+                ['A', None, 1, None],
+                ['B', None, 2, None],
Review comment: parametrize this with values as a list, tuple, np.array, pd.Series, pd.Index (should all act the same)
+                ['C', None, 3, None]]
+        index = Index(data=['q', 't', 'w', 'x', 'y', 'z'], name='zoo')
+        columns = MultiIndex(levels=[['bar', 'baz'], ['one', 'two']],
Review comment: add a test with a MultiIndex and pass values as a tuple
Reply: Are you looking for something like the following? (please correct me if I am wrong)
then pivot it:
so the output would be:
Reply: ping @jreback
Reply: yep
+                             labels=[[0, 0, 1, 1], [0, 1, 0, 1]],
+                             names=[None, 'foo'])
+        expected = DataFrame(data=data, index=index,
+                             columns=columns, dtype='object')
+
+        tm.assert_frame_equal(results, expected)
+
     def test_margins(self):
         def _check_output(result, values_col, index=['A', 'B'],
                           columns=['C'],
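The review request above to exercise several list-like types could be sketched as a simple loop. This assumes a pandas release in which ``pivot`` accepts any non-tuple list-like for ``values``; the diff as shown only checks ``isinstance(values, list)``. A tuple is left out here on the assumption that it denotes a single column label rather than a list of columns. The frame is made up for illustration, and in a test suite the loop would more likely be a ``pytest.mark.parametrize`` decorator.

```python
import numpy as np
import pandas as pd

# Hypothetical data with two numeric value columns.
df = pd.DataFrame({
    "foo": ["one", "one", "one", "two", "two", "two"],
    "bar": ["A", "B", "C", "A", "B", "C"],
    "baz": [1, 2, 3, 4, 5, 6],
    "zoo": [10, 20, 30, 40, 50, 60],
})

# The plain-list call serves as the reference result.
reference = df.pivot(index="foo", columns="bar", values=["baz", "zoo"])

# Other list-like containers holding the same two column names.
candidates = [
    np.array(["baz", "zoo"]),
    pd.Series(["baz", "zoo"]),
    pd.Index(["baz", "zoo"]),
]

# Record whether each container yields the same frame as the list.
matches = []
for values in candidates:
    result = df.pivot(index="foo", columns="bar", values=values)
    matches.append(result.equals(reference))
```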
Review comment: add for the values= kwargs
Reply: I'm not sure I get you
Reply: now accepts a list for the values= kwarg.