Skip to content

Adding Multiindex support to dataframe pivot function(Fixes #21425) #21467

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 16 commits into from
Closed

Conversation

NikhilKumarM
Copy link
Contributor

@NikhilKumarM NikhilKumarM commented Jun 13, 2018

@codecov
Copy link

codecov bot commented Jun 13, 2018

Codecov Report

Merging #21467 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #21467      +/-   ##
==========================================
+ Coverage    91.9%    91.9%   +<.01%     
==========================================
  Files         154      154              
  Lines       49657    49667      +10     
==========================================
+ Hits        45638    45648      +10     
  Misses       4019     4019
Flag Coverage Δ
#multiple 90.28% <100%> (ø) ⬆️
#single 41.89% <0%> (-0.01%) ⬇️
Impacted Files Coverage Δ
pandas/core/reshape/reshape.py 99.79% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update dc45fba...faab956. Read the comment docs.

@gfyoung gfyoung added Bug Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels Jun 13, 2018
@gfyoung
Copy link
Member

gfyoung commented Jun 13, 2018

@NikhilKumarM : Off to a good start here! Going to need a whatsnew entry + test (can use the example that @jorisvandenbossche gave).

@NikhilKumarM
Copy link
Contributor Author

NikhilKumarM commented Jun 14, 2018

Thanks @gfyoung . I have added test cases and whatsnew entry

@gfyoung
Copy link
Member

gfyoung commented Jun 14, 2018

@NikhilKumarM : You actually need to add the whatsnew entry (search for the v0.23.2.txt file and test cases to your PR changes 😄 .

@jorisvandenbossche
Copy link
Member

@NikhilKumarM Did you push your new changes? Because they do not show up on github here. Normally you only need to add new commits to the same branch, push that to github, and the PR here should get updated.

@gfyoung gfyoung modified the milestones: 0.24.0, 0.23.2 Jun 14, 2018
@gfyoung
Copy link
Member

gfyoung commented Jun 14, 2018

@jorisvandenbossche : If you look at the PR description, you can see that @NikhilKumarM accidentally added them there. If this PR goes unresponsive, I can bring it to the finish line.

@jorisvandenbossche jorisvandenbossche modified the milestones: 0.23.2, 0.24.0 Jun 14, 2018
@pep8speaks
Copy link

pep8speaks commented Jun 15, 2018

Hello @NikhilKumarM! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on July 01, 2018 at 13:52 Hours UTC

Copy link
Member

@gfyoung gfyoung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! LGTM!

cc @jorisvandenbossche

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Can you also add the ability to pass multiple values for columns ?
  • Can you also update the docstring of DataFrame.pivot ?

@@ -66,7 +66,7 @@ Bug Fixes

**Reshaping**

-
- Bug in: DataFrame.pivot() function where error was raised when multiple columns are set as index
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you move this to the 0.24.0.txt file? And you can mention it in the Enhancements section

(as this is an new feature, not really a bug fix, I think, as it was just not yet implemented)

@@ -392,12 +392,22 @@ def pivot(self, index=None, columns=None, values=None):
cols = [columns] if index is None else [index, columns]
append = index is None
indexed = self.set_index(cols, append=append)
# adding the support for multi-index in pivot function
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this comment is not really needed I think

@@ -283,6 +283,23 @@ def test_pivot_multi_functions(self):
expected = concat([means, stds], keys=['mean', 'std'], axis=1)
tm.assert_frame_equal(result, expected)

# adding the test case for multiple columns as index (#21425)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please move this comment to the first line of the test function

expected = DataFrame([[0, 1], [2, 3], [4, 5], [6, 7]],
exp_index,
exp_columns)
tm.assert_frame_equal(result, expected)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add an example where the values argument is not mentioned? eg df.pivot(index=['lev1', 'lev2'], columns='lev3')
(I think this will not yet work, but should work)

@NikhilKumarM
Copy link
Contributor Author

NikhilKumarM commented Jun 15, 2018

  • Sure. I will try to add ability to pass multiple values for columns.
  • I have not worked on docstrings before. Is there any good resource to start doing that? @jorisvandenbossche

@WillAyd
Copy link
Member

WillAyd commented Jun 15, 2018

@NikhilKumarM there is a section in the contributing guide focused on docstrings:

https://pandas.pydata.org/pandas-docs/stable/contributing_docstring.html

index = MultiIndex.from_arrays([index, self[columns]])
# added this case to handle multi-index
elif isinstance(index, list):
indexes = []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you make this a list-comprehension

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure!

else:
if index is None:
index = self.index
index = MultiIndex.from_arrays([index, self[columns]])
# added this case to handle multi-index
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you make a comment that reflects what is here (not added)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -16,6 +16,8 @@ Other Enhancements
- :func:`Series.mode` and :func:`DataFrame.mode` now support the ``dropna`` parameter which can be used to specify whether NaN/NaT values should be considered (:issue:`17534`)
- :func:`to_csv` now supports ``compression`` keyword when a file handle is passed. (:issue:`21227`)
- :meth:`Index.droplevel` is now implemented also for flat indexes, for compatibility with MultiIndex (:issue:`21115`)
- :func:'Dataframe.pivot' now supports multiple columns as index. (:issue:'21425')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'columns as an index'

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. I will make those changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

corrected

@NikhilKumarM
Copy link
Contributor Author

NikhilKumarM commented Jun 21, 2018

Why is continuous-integration/travis-ci failing?

@gfyoung
Copy link
Member

gfyoung commented Jun 21, 2018

Why is continuous-integration/travis-ci failing?

@NikhilKumarM : Restarted Travis for you. Failure unrelated to PR.

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small change otherwise lgtm. does the doc string need updating?

else:
if index is None:
index = self.index
index = MultiIndex.from_arrays([index, self[columns]])
elif isinstance(index, list):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use is_list_like instead here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. I will do it.

@NikhilKumarM
Copy link
Contributor Author

@jreback @gfyoung @jorisvandenbossche
Could you please check my last commit and suggest changes. I have modified the code to handle when there is no 'values' argument for 'pivot' function and also added a test case for that. But I do not know if have done it in an optimal way.

@gfyoung
Copy link
Member

gfyoung commented Jul 1, 2018

@NikhilKumarM : Got a linting error on Travis:

https://travis-ci.org/pandas-dev/pandas/jobs/398632775#L2904

@NikhilKumarM
Copy link
Contributor Author

NikhilKumarM commented Jul 1, 2018

@gfyoung Fixed the linting error.

@NikhilKumarM
Copy link
Contributor Author

@gfyoung @jreback Could you please give me your feedback on my last commit?

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls add a whatsnew note for 0.24. I guess this is a bug fix, though was this documented before? if so its an enhancement. Can you also add an exampl ein reshape.rst with this. And in the pivot doc-string as well.

@@ -395,15 +395,29 @@ def pivot(self, index=None, columns=None, values=None):
See DataFrame.pivot
"""
if values is None:
cols = [columns] if index is None else [index, columns]
if index is None:
cols = [columns]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we have tests for each one of these paths here? e.g. branching a lot more now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is an enhancement. I have added the test cases and need to update the docs. Thanks for your time.

@jreback
Copy link
Contributor

jreback commented Aug 9, 2018

also needs to rebase

@jreback
Copy link
Contributor

jreback commented Nov 4, 2018

can you merge master

@NikhilKumarM
Copy link
Contributor Author

Sure!

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you merge master & add a whatsnew note.

cols = [columns] if index is None else [index, columns]
if index is None:
cols = [columns]
else:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a comment explaining what is going on here, ideally we would have a comment for each branch of the if/else

@jreback
Copy link
Contributor

jreback commented Nov 18, 2018

can you update

@jreback jreback removed this from the 0.24.0 milestone Nov 18, 2018
@jreback
Copy link
Contributor

jreback commented Dec 9, 2018

can you merge master

@NikhilKumarM
Copy link
Contributor Author

I think pivot function has been removed from reshape.py and put in pivot.py. I will rebase with the master and make necessary changes by this weekend. Thank you !!

@jreback
Copy link
Contributor

jreback commented Jan 5, 2019

can you merge master and update

@WillAyd
Copy link
Member

WillAyd commented Feb 6, 2019

Closing as stale - ping if you'd like to reopen and continue working this

@WillAyd WillAyd closed this Feb 6, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Reshaping Concat, Merge/Join, Stack/Unstack, Explode
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: DataFrame.pivot fails on multiple columns to set as index
6 participants