ENH: DataFrame.pivot accepts a list of values #18636

ibrahimsharaf · 2017-12-04T21:13:33Z

closes #17160 (Make DataFrame.pivot accepts a list of column names as values)
tests added / passed
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

jreback · 2017-12-05T00:03:39Z

pandas/tests/reshape/test_pivot.py

+                           'baz': [1, 2, 3, 4, 5, 6],
+                           'zoo': ['x', 'y', 'z', 'q', 'w', 't']})
+
+        results = df.pivot(index='zoo', columns='foo', values=['bar', 'baz'])


use result=

jreback · 2017-12-05T00:04:01Z

pandas/tests/reshape/test_pivot.py

+
+        results = df.pivot(index='zoo', columns='foo', values=['bar', 'baz'])
+
+        data = [[None, 'A', None, 4],


use np.nan rather than None

jreback · 2017-12-05T00:04:23Z

pandas/tests/reshape/test_pivot.py

@@ -353,6 +353,29 @@ def test_pivot_periods(self):
        pv = df.pivot(index='p1', columns='p2', values='data1')
        tm.assert_frame_equal(pv, expected)

+    def test_pivot_with_multi_values(self):
+        df = pd.DataFrame({'foo': ['one', 'one', 'one', 'two', 'two', 'two'],


add the issue number as a comment

jreback · 2017-12-05T00:04:46Z

pandas/tests/reshape/test_pivot.py

@@ -353,6 +353,29 @@ def test_pivot_periods(self):
        pv = df.pivot(index='p1', columns='p2', values='data1')
        tm.assert_frame_equal(pv, expected)

+    def test_pivot_with_multi_values(self):


say with list_like_values rather than multi_values

jreback · 2017-12-05T00:05:09Z

doc/source/whatsnew/v0.22.0.txt

@@ -129,6 +124,10 @@ Other API Changes
 - :func:`DataFrame.from_items` provides a more informative error message when passed scalar values (:issue:`17312`)
 - When created with duplicate labels, ``MultiIndex`` now raises a ``ValueError``. (:issue:`17464`)
 - Building from source now explicity requires ``setuptools`` in ``setup.py`` (:issue:`18113`)
+- :func:`Series.fillna` now raises a ``TypeError`` instead of a ``ValueError`` when passed a list, tuple or DataFrame as a ``value`` (:issue:`18293`)


looks like you picked up some other commits here.

those are merged on master now, so I think merging them as is won't harm (or duplicate).

I am not sure if it will do harm or not, but can you still fix this? (it makes reviewing harder as there are unrelated changes)
In principle,

git fetch upstream
git merge upstream/master

should do it.

jreback · 2017-12-05T00:05:54Z

pandas/tests/reshape/test_pivot.py

+                [None, 'C', None, 6],
+                [None, 'B', None, 5],
+                ['A', None, 1, None],
+                ['B', None, 2, None],


parametrize this with values as a list, tuple, np.array, pd.Series, pd.Index (should all act the same)

jreback · 2017-12-05T00:06:18Z

pandas/core/reshape/reshape.py

-            index = self.index
+        index = self.index if index is None else self[index]
+        index = MultiIndex.from_arrays([index, self[columns]])
+        if isinstance(values, list):


use is_list_like here

add a comment here on what is going on
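For context on the `is_list_like` suggestion: pandas exposes this check publicly as `pandas.api.types.is_list_like`. Below is a minimal sketch of how it classifies the kinds of `values` arguments discussed in this review. The dictionary name `candidates` is only illustrative.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_list_like

# Candidate `values` arguments; the names are illustrative only.
candidates = {
    "list": ["bar", "baz"],
    "tuple": ("bar", "baz"),
    "ndarray": np.array(["bar", "baz"]),
    "Series": pd.Series(["bar", "baz"]),
    "Index": pd.Index(["bar", "baz"]),
    "str": "bar",
}

# Every container is list-like; a plain string (one column name) is not.
list_like = {name: bool(is_list_like(obj)) for name, obj in candidates.items()}
print(list_like)
```

Unlike `isinstance(values, list)`, this check also returns True for a tuple, which matters for the tuple discussion later in this thread.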

pep8speaks · 2017-12-05T01:56:43Z

Hello @ibrahimsharaf! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on March 20, 2018 at 15:31 Hours UTC

codecov · 2017-12-05T01:56:59Z

Codecov Report

Merging #18636 into master will decrease coverage by 0.03%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #18636      +/-   ##
==========================================
- Coverage   91.59%   91.55%   -0.04%     
==========================================
  Files         155      155              
  Lines       51255    51255              
==========================================
- Hits        46948    46929      -19     
- Misses       4307     4326      +19

Flag	Coverage Δ
#multiple	`89.42% <100%> (-0.02%)`	⬇️
#single	`40.67% <0%> (-0.11%)`	⬇️

Impacted Files	Coverage Δ
pandas/core/reshape/reshape.py	`100% <100%> (ø)`	⬆️
pandas/io/gbq.py	`25% <0%> (-58.34%)`	⬇️
pandas/plotting/_converter.py	`63.44% <0%> (-1.82%)`	⬇️
pandas/core/frame.py	`97.81% <0%> (-0.1%)`	⬇️
pandas/util/testing.py	`82.01% <0%> (+0.19%)`	⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a764663...5f94728. Read the comment docs.

codecov · 2017-12-05T01:57:06Z

Codecov Report

Merging #18636 into master will decrease coverage by 0.02%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #18636      +/-   ##
==========================================
- Coverage    91.8%   91.77%   -0.03%     
==========================================
  Files         152      152              
  Lines       49215    49217       +2     
==========================================
- Hits        45181    45171      -10     
- Misses       4034     4046      +12

Flag	Coverage Δ
#multiple	`90.16% <100%> (-0.03%)`	⬇️
#single	`41.84% <0%> (-0.01%)`	⬇️

Impacted Files	Coverage Δ
pandas/core/frame.py	`97.18% <ø> (ø)`	⬆️
pandas/core/reshape/reshape.py	`100% <100%> (ø)`	⬆️
pandas/plotting/_converter.py	`65.07% <0%> (-1.74%)`	⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 01882ba...e293741. Read the comment docs.

ibrahimsharaf · 2017-12-05T02:04:40Z

@jreback should the documentation page be modified at this PR?

jorisvandenbossche · 2017-12-05T08:56:25Z

> @jreback should the documentation page be modified at this PR?

Yes, you need to update the docstring / tutorial docs in this PR.

Can you show an actual example of what this PR can do (e.g. the result for the df in the test)? The expected result in the test looks a bit strange at a quick look (all the NaNs).

ibrahimsharaf · 2017-12-08T03:11:59Z

@jreback can you help me find the reason the test is failing (Travis log)? It happens when I pass the values columns as a tuple in those lines (test, def).

I tried adding the tuple type to this condition (core/frame.py); it passed this test but failed many others.

jorisvandenbossche · 2017-12-08T08:50:18Z

The reason for the error is that a tuple ('bar', 'baz') is seen as a single column name (for a MultiIndex), which is the correct behaviour IMO. So you shouldn't try to interpret a tuple the same as a list.
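A small sketch of the distinction, using a made-up frame with MultiIndex columns (the data and the names `single` and `several` are assumptions for illustration, not taken from the PR):

```python
import pandas as pd

columns = pd.MultiIndex.from_tuples(
    [("bar", "first"), ("bar", "second"), ("baz", "first")]
)
df = pd.DataFrame([["one", "A", 1], ["two", "B", 2]], columns=columns)

# A tuple is one key: it addresses exactly one column of the MultiIndex.
single = df[("bar", "first")]

# A list is a collection of keys: it selects several columns at once.
several = df[[("bar", "first"), ("baz", "first")]]

print(type(single).__name__)   # Series
print(type(several).__name__)  # DataFrame
```

Treating a tuple like a list would make it impossible to name a single MultiIndex column in `values`.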

ibrahimsharaf · 2017-12-08T19:31:07Z

So what you are saying is that a tuple of values should be handled separately from the is_list_like check in the pivot function?

jreback · 2017-12-18T14:32:36Z

doc/source/whatsnew/v0.22.0.txt

@@ -139,6 +139,8 @@ Other Enhancements
 - :func:`read_excel()` has gained the ``nrows`` parameter (:issue:`16645`)
 - :func:``DataFrame.to_json`` and ``Series.to_json`` now accept an ``index`` argument which allows the user to exclude the index from the JSON output (:issue:`17394`)
 - ``IntervalIndex.to_tuples()`` has gained the ``na_tuple`` parameter to control whether NA is returned as a tuple of NA, or NA itself (:issue:`18756`)
+- :func:`DataFrame.pivot` now accepts a list of values (:issue:`17160`).


add for the values= kwargs

I'm not sure I get you

now accepts a list for the values= kwarg.

jreback · 2017-12-18T14:33:30Z

pandas/core/frame.py

@@ -4390,6 +4391,19 @@ def pivot(self, index=None, columns=None, values=None):
        one  1   2   3
        two  4   5   6

+       >>> df.pivot(index='foo', columns='bar', values=['baz', 'zoo'])


can you show an example using a single value first

jreback · 2017-12-18T14:36:10Z

pandas/core/reshape/reshape.py

-            index = self.index
+        index = self.index if index is None else self[index]
+        index = MultiIndex.from_arrays([index, self[columns]])
+        if is_list_like(values):


I think you need:

is_list_like(values) and not isinstance(values, tuple)

did you need to check if it's a tuple?

Yes, I'm trying to figure out why the MultiIndex test with a tuple of values is giving me an error; it would be great if you could give me a hand!

here @jreback: https://travis-ci.org/pandas-dev/pandas/jobs/325099021#L2255

jreback · 2017-12-18T14:37:33Z

pandas/tests/reshape/test_pivot.py

+                           'baz': [1, 2, 3, 4, 5, 6],
+                           'zoo': ['x', 'y', 'z', 'q', 'w', 't']})
+
+        result_list = df.pivot(index='zoo', columns='foo',


pls parametrize on these types, something like

@pytest.mark.parametrize("constructor", [pd.Series, pd.Index, np.array, list])
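A hedged sketch of what such a parametrized test could look like. The test name and the spot-check assertions are assumptions, the `bar` column is assumed to hold 'A', 'B', 'C' as in the docstring example, and the test that was actually merged may be structured differently:

```python
import numpy as np
import pandas as pd
import pytest


@pytest.mark.parametrize("constructor", [list, np.array, pd.Series, pd.Index])
def test_pivot_with_list_like_values(constructor):
    # GH 17160
    df = pd.DataFrame({
        "foo": ["one", "one", "one", "two", "two", "two"],
        "bar": ["A", "B", "C", "A", "B", "C"],
        "baz": [1, 2, 3, 4, 5, 6],
        "zoo": ["x", "y", "z", "q", "w", "t"],
    })
    values = constructor(["bar", "baz"])

    result = df.pivot(index="zoo", columns="foo", values=values)

    # Spot-check a few cells instead of the full frame to keep the sketch short.
    assert result.loc["x", ("bar", "one")] == "A"
    assert result.loc["q", ("baz", "two")] == 4
    assert pd.isna(result.loc["q", ("bar", "one")])
```

Tuples are deliberately left out of the parameter list, since a tuple is interpreted as a single column name.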

jreback · 2017-12-18T14:37:47Z

pandas/tests/reshape/test_pivot.py

+                ['B', np.nan, 2, np.nan],
+                ['C', np.nan, 3, np.nan]]
+        index = Index(data=['q', 't', 'w', 'x', 'y', 'z'], name='zoo')
+        columns = MultiIndex(levels=[['bar', 'baz'], ['one', 'two']],


add a test with a MultiIndex and pass values as a tuple

Are you looking for something like the following? (please correct me if I am wrong)

     bar          baz
   first second first second
0    one      A     1      x
1    one      B     2      y
2    one      C     3      z
3    two      A     4      q
4    two      B     5      w
5    two      C     6      t

then pivot it:

df.pivot(index=('bar', 'first'), columns=('bar', 'second'), values=('baz', 'first'))

so the output would be:

     A  B  C
one  1  2  3
two  4  5  6

ping @jreback

jreback

you need to merge master and push your changes.

jreback · 2018-01-02T11:12:53Z

doc/source/whatsnew/v0.22.0.txt

@@ -139,6 +139,8 @@ Other Enhancements
 - :func:`read_excel()` has gained the ``nrows`` parameter (:issue:`16645`)
 - :func:``DataFrame.to_json`` and ``Series.to_json`` now accept an ``index`` argument which allows the user to exclude the index from the JSON output (:issue:`17394`)
 - ``IntervalIndex.to_tuples()`` has gained the ``na_tuple`` parameter to control whether NA is returned as a tuple of NA, or NA itself (:issue:`18756`)
+- :func:`DataFrame.pivot` now accepts a list of values (:issue:`17160`).


now accepts a list for the values= kwarg.

jreback · 2018-01-02T11:13:17Z

pandas/core/frame.py

-        values : string or object, optional
-            Column name to use for populating new frame's values. If not
+        values : string, object or a list of the previous, optional
+            Column name(s) to use for populating new frame's values. If not


add (0.23.0) after 'list of the previous'

jreback · 2018-01-02T11:13:56Z

pandas/core/frame.py

+        one   1  2  3   x  y  z
+        two   4  5  6   q  w  t
+
+        >>> df.pivot(index='zoo', columns='foo', values=['bar', 'baz'])


don't show the same example, rather show one with values = a single value

jreback · 2018-01-04T00:30:18Z

doc/source/whatsnew/v0.22.0.txt

@@ -240,4 +240,4 @@ With conda, use

 Note that the inconsistency in the return value for all-*NA* series is still
 there for pandas 0.20.3 and earlier. Avoiding pandas 0.21 will only help with
-the empty case.


can you revert this file

jreback · 2018-01-04T00:30:45Z

pandas/core/frame.py

@@ -4355,8 +4355,8 @@ def pivot(self, index=None, columns=None, values=None):
            existing index.
        columns : string or object
            Column name to use to make new frame's columns
-        values : string or object, optional
-            Column name to use for populating new frame's values. If not
+        values : string, object or (0.23.0) a list of the previous, optional


put the version after the word list

jreback · 2018-01-04T00:31:18Z

pandas/core/frame.py

@@ -4401,6 +4402,15 @@ def pivot(self, index=None, columns=None, values=None):
        one  1   2   3
        two  4   5   6

+        >>> df.pivot(index='foo', columns='bar', values=['baz'])


Oh, we already have this example, I see. OK, you can remove this one (with ['bar']).

jreback · 2018-01-04T00:31:51Z

pandas/core/reshape/reshape.py

-            index = self.index
+        index = self.index if index is None else self[index]
+        index = MultiIndex.from_arrays([index, self[columns]])
+        if is_list_like(values):


did you need to check if it's a tuple?

jreback · 2018-01-04T00:32:18Z

pandas/tests/reshape/test_pivot.py

+                           'baz': [1, 2, 3, 4, 5, 6],
+                           'zoo': ['x', 'y', 'z', 'q', 'w', 't']})
+
+        result = df.pivot(index='zoo', columns='foo', values=values)


add a test with values as a tuple (it should fail)

jreback

minor comments. looks good otherwise. ping on green.

jreback · 2018-03-01T23:18:17Z

pandas/core/reshape/reshape.py

-        return indexed.unstack(columns)
+        index = MultiIndex.from_arrays([index, self[columns]])
+
+        if is_list_like(values) and not isinstance(values, tuple):


can you add a comment here on why excluding tuples
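A rough sketch of the branching this condition implies, written against the public API rather than pandas internals. The function name `pivot_sketch` is hypothetical, it only covers the case where `index`, `columns` and `values` are all given, and the real code in pandas/core/reshape/reshape.py may differ in detail:

```python
import pandas as pd
from pandas.api.types import is_list_like


def pivot_sketch(df, index, columns, values):
    """Illustrative stand-in for the `values is not None` path of pivot."""
    # Pair every row's index label with its column label.
    idx = pd.MultiIndex.from_arrays(
        [df[index], df[columns]], names=[index, columns]
    )
    if is_list_like(values) and not isinstance(values, tuple):
        # Several value columns: keep them as frame columns. Tuples are
        # excluded because a tuple is treated as a single column name.
        indexed = pd.DataFrame(df[values].to_numpy(), index=idx, columns=values)
    else:
        # A single column name (possibly a tuple key): a Series is enough.
        indexed = pd.Series(df[values].to_numpy(), index=idx)
    # Move the `columns` level of the index onto the column axis.
    return indexed.unstack(columns)


df = pd.DataFrame({
    "foo": ["one", "one", "one", "two", "two", "two"],
    "bar": ["A", "B", "C", "A", "B", "C"],
    "baz": [1, 2, 3, 4, 5, 6],
    "zoo": ["x", "y", "z", "q", "w", "t"],
})
wide = pivot_sketch(df, index="foo", columns="bar", values=["baz", "zoo"])
narrow = pivot_sketch(df, index="foo", columns="bar", values="baz")
```

Here `wide` gets two-level columns (value column, then `bar`), while `narrow` keeps the flat columns that a scalar `values` has always produced.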

jreback · 2018-03-01T23:19:07Z

pandas/tests/reshape/test_pivot.py

+                           'zoo': ['x', 'y', 'z', 'q', 'w', 't']})
+        with pytest.raises(KeyError):
+            # tuple is seen as a single column name
+            result = df.pivot(index='zoo', columns='foo', values=('bar', 'baz'))


don't need the result = here (it's a lint error)
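With the unused assignment dropped, the test could be sketched as follows. The test name is an assumption, and the `foo`, `bar` and `baz` columns are assumed to match the docstring example:

```python
import pandas as pd
import pytest


def test_pivot_with_tuple_of_values():
    df = pd.DataFrame({
        "foo": ["one", "one", "one", "two", "two", "two"],
        "bar": ["A", "B", "C", "A", "B", "C"],
        "baz": [1, 2, 3, 4, 5, 6],
        "zoo": ["x", "y", "z", "q", "w", "t"],
    })
    with pytest.raises(KeyError):
        # tuple is seen as a single column name, and no such column exists
        df.pivot(index="zoo", columns="foo", values=("bar", "baz"))
```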

ibrahimsharaf · 2018-03-02T00:31:59Z

Hi @jreback, thanks for reviewing. We still have a KeyError exception with the MultiIndex failing test, how do you think it should be fixed?

jreback · 2018-03-02T11:44:18Z

can you paste which case is failing?

ibrahimsharaf · 2018-03-02T11:55:29Z

Here: https://travis-ci.org/pandas-dev/pandas/jobs/348034399#L2303

TomAugspurger · 2018-03-02T14:34:59Z

#19966 for the issue causing that test to fail.

We can see how hard that is to fix. I'd say just xfail the test for now, rather than trying to work around it.

jorisvandenbossche · 2018-03-02T13:30:22Z

pandas/core/frame.py

+       >>> df.pivot(index='foo', columns='bar', values=['baz', 'zoo'])
+              A  B  C   A  B  C
+        one   1  2  3   x  y  z
+        two   4  5  6   q  w  t


This output doesn't seem correct. Shouldn't the columns be multi-indexed?

Yes, that's right. Fixed.
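One way to check the column structure without relying on how the console aligns the repr is to inspect the columns directly. A minimal sketch, assuming the docstring's example frame:

```python
import pandas as pd

df = pd.DataFrame({
    "foo": ["one", "one", "one", "two", "two", "two"],
    "bar": ["A", "B", "C", "A", "B", "C"],
    "baz": [1, 2, 3, 4, 5, 6],
    "zoo": ["x", "y", "z", "q", "w", "t"],
})
wide = df.pivot(index="foo", columns="bar", values=["baz", "zoo"])

# The outer column level holds the value column, the inner level holds `bar`.
print(wide.columns.nlevels)  # 2
print(list(wide.columns))
```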

jorisvandenbossche · 2018-03-02T13:56:12Z

pandas/tests/reshape/test_pivot.py

+                [np.nan, 'B', np.nan, 5],
+                ['A', np.nan, 1, np.nan],
+                ['B', np.nan, 2, np.nan],
+                ['C', np.nan, 3, np.nan]]


Where are all the NaNs coming from? They don't seem to be there in the docstring example

Ah, I see, because here 'zoo' is used for index and not 'foo'.
Can you add a test for that as well?
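The difference between the two calls can be reproduced directly. A short sketch, assuming the same example frame as the docstring (the variable names `by_zoo` and `by_foo` are illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "foo": ["one", "one", "one", "two", "two", "two"],
    "bar": ["A", "B", "C", "A", "B", "C"],
    "baz": [1, 2, 3, 4, 5, 6],
    "zoo": ["x", "y", "z", "q", "w", "t"],
})

# Every `zoo` label is unique, so each output row only has data under
# one of the two `foo` values; the other half of the row is NaN.
by_zoo = df.pivot(index="zoo", columns="foo", values=["bar", "baz"])

# Each (`foo`, `bar`) pair occurs exactly once, so every cell is filled.
by_foo = df.pivot(index="foo", columns="bar", values=["baz", "zoo"])

print(by_zoo.shape, int(by_zoo.isna().sum().sum()))  # (6, 4) 12
print(by_foo.shape, int(by_foo.isna().sum().sum()))  # (2, 6) 0
```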

ibrahimsharaf · 2018-03-19T15:22:44Z

Could you take a look? @jreback @jorisvandenbossche @TomAugspurger

jreback · 2018-03-20T00:29:17Z

lgtm @ibrahimsharaf

@TomAugspurger @jorisvandenbossche might have some comments

jorisvandenbossche · 2018-03-20T09:18:24Z

pandas/core/frame.py

-            Column to use to make new frame's columns.
-        values : string or object, optional
-            Column to use for populating new frame's values. If not
+            Column name to use to make new frame's columns


This is probably due to merging master, but can you undo this change? (there have been changes to this docstring on master, and you have accidentally reverted some of those changes)

jorisvandenbossche · 2018-03-20T09:19:05Z

pandas/core/frame.py

-            Column to use for populating new frame's values. If not
+            Column name to use to make new frame's columns
+        values : string, object or a list (0.23.0) of the previous, optional
+            Column name(s) to use for populating new frame's values. If not


Column name(s) -> Column(s) to be consistent with above

jorisvandenbossche · 2018-03-20T09:20:02Z

pandas/core/frame.py

-        values : string or object, optional
-            Column to use for populating new frame's values. If not
+            Column name to use to make new frame's columns
+        values : string, object or a list (0.23.0) of the previous, optional


Instead of mentioning the (0.23.0) here, can you instead add a

.. versionchanged:: 0.23.0
   Also accept list of column names.

to the parameter description?
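In a numpydoc-style docstring the directive sits indented under the parameter description. A sketch on a stand-in function (`pivot_stub` is hypothetical, and the description text is abbreviated from the diff above):

```python
def pivot_stub(index=None, columns=None, values=None):
    """
    Stand-in used only to show the docstring layout.

    Parameters
    ----------
    values : string, object or a list of the previous, optional
        Column(s) to use for populating new frame's values.

        .. versionchanged:: 0.23.0
           Also accept list of column names.
    """
```

This keeps the type line free of version notes and lets Sphinx render the change as its own callout.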

jorisvandenbossche · 2018-03-20T09:22:13Z

pandas/core/frame.py

@@ -5011,8 +5012,14 @@ def pivot(self, index=None, columns=None, values=None):
        one  1   2   3
        two  4   5   6

-        A ValueError is raised if there are any duplicates.
+        >>> df.pivot(index='foo', columns='bar', values=['baz', 'zoo'])
+                baz       zoo


The alignment seems a bit off here. I think the 'baz' should align with the 'A' below. Best to check how it is in the console output.

jorisvandenbossche · 2018-03-20T09:23:05Z

pandas/core/frame.py

        Notice that the first two rows are the same for our `index`
        and `columns` arguments.
-


Can you also undo this removal of blank lines?

All fixed now @jorisvandenbossche.

ibrahimsharaf · 2018-03-20T17:01:10Z

@jreback @jorisvandenbossche can you restart the Travis build?

ibrahimsharaf · 2018-03-25T00:26:18Z

ping @jorisvandenbossche

jreback · 2018-03-25T14:45:02Z

lgtm. @jorisvandenbossche merge when ok.

jorisvandenbossche · 2018-03-26T07:24:12Z

@ibrahimsharaf Thanks a lot!

ibrahimsharaf added 2 commits December 4, 2017 23:11

add pivot with multi-values

b74ee0f

update whatsnew

a36f9e0

ibrahimsharaf changed the title ~~add pivot with multi-values~~ DataFrame.pivot accepts a list of values Dec 4, 2017

jreback requested changes Dec 5, 2017

jreback added the Enhancement and Reshaping (Concat, Merge/Join, Stack/Unstack, Explode) labels Dec 5, 2017

fix review comments

5f94728

PEP8 fixes

3008d8e

ibrahimsharaf added 2 commits December 8, 2017 04:06

merge master

b3ea1c2

merge master

539ffdc

ibrahimsharaf added 4 commits December 16, 2017 18:58

Merge branch 'master' into pivot_multi

d176585

remove tuple from test

6646798

update pivot docstring

ea77a97

remove unused import

1d6bf58

jreback requested changes Dec 18, 2017

Merge branch 'master' of https://github.com/pandas-dev/pandas into pivot_multi

c000811

jreback requested changes Jan 2, 2018

ibrahimsharaf added 4 commits January 2, 2018 17:51

Merge branch 'master' into pivot_multi

c750807

Merge branch 'master' into pivot_multi

d3a7bec

Merge branch 'pivot_multi' of https://github.com/ibrahimsharaf/pandas into pivot_multi

c50b2dd

Push requested changes

df2f0b0

jreback requested changes Jan 4, 2018

ibrahimsharaf added 2 commits March 1, 2018 21:28

Use pytest raises instead of xfail

516690c

Remove unnecessary code

e30fd1c

jreback requested changes Mar 1, 2018

jreback added this to the 0.23.0 milestone Mar 1, 2018

Fix review comments

eb9d85f

jorisvandenbossche reviewed Mar 2, 2018

ibrahimsharaf added 3 commits March 18, 2018 14:51

Merge and resolve

786e5f7

xfail multiindex test

8ea45f8

Add additional test

3825c9a

jreback approved these changes Mar 20, 2018

jorisvandenbossche reviewed Mar 20, 2018

ibrahimsharaf added 2 commits March 20, 2018 17:15

Merge remote-tracking branch 'upstream/master' into pivot_multi

8e54fc9

Review changes

e293741

jorisvandenbossche approved these changes Mar 26, 2018

jorisvandenbossche changed the title ~~DataFrame.pivot accepts a list of values~~ ENH: DataFrame.pivot accepts a list of values Mar 26, 2018

jorisvandenbossche merged commit 6c0c277 into pandas-dev:master Mar 26, 2018

ibrahimsharaf deleted the pivot_multi branch March 26, 2018 08:14

javadnoorb pushed a commit to javadnoorb/pandas that referenced this pull request Mar 29, 2018

ENH: DataFrame.pivot accepts a list of values (pandas-dev#18636)

eecb129

dworvos pushed a commit to dworvos/pandas that referenced this pull request Apr 2, 2018

ENH: DataFrame.pivot accepts a list of values (pandas-dev#18636)

4d0ae9c

kornilova203 pushed a commit to kornilova203/pandas that referenced this pull request Apr 23, 2018

ENH: DataFrame.pivot accepts a list of values (pandas-dev#18636)

2106418

