Fix pivot index bug #37771

Merged
merged 11 commits into from
Nov 14, 2020
Changes from 10 commits
1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.2.0.rst
@@ -553,6 +553,7 @@ Reshaping
 - Bug in :meth:`DataFrame.agg` with ``func={'name':<FUNC>}`` incorrectly raising ``TypeError`` when ``DataFrame.columns==['Name']`` (:issue:`36212`)
 - Bug in :meth:`Series.transform` would give incorrect results or raise when the argument ``func`` was dictionary (:issue:`35811`)
 - Bug in :meth:`DataFrame.pivot` did not preserve :class:`MultiIndex` level names for columns when rows and columns both multiindexed (:issue:`36360`)
+- Bug in :meth:`DataFrame.pivot` modified ``index`` argument when ``columns`` was passed but ``values`` was not (:issue:`37635`)
 - Bug in :func:`join` returned a non deterministic level-order for the resulting :class:`MultiIndex` (:issue:`36910`)
 - Bug in :meth:`DataFrame.combine_first()` caused wrong alignment with dtype ``string`` and one level of ``MultiIndex`` containing only ``NA`` (:issue:`37591`)
 - Fixed regression in :func:`merge` on merging DatetimeIndex with empty DataFrame (:issue:`36895`)
3 changes: 1 addition & 2 deletions pandas/core/reshape/pivot.py
@@ -450,10 +450,9 @@ def pivot(
             cols = com.convert_to_list_like(index)
         else:
             cols = []
-        cols.extend(columns)
 
         append = index is None
-        indexed = data.set_index(cols, append=append)
+        indexed = data.set_index(cols + columns, append=append)
     else:
         if index is None:
             index = [Series(data.index, name=data.index.name)]
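
The aliasing behind this change can be reproduced without pandas. The sketch below is a simplified illustration, not the pandas source: `to_list_like`, `build_keys_buggy`, and `build_keys_fixed` are hypothetical stand-ins, and it assumes that the list-conversion helper hands a list argument back unchanged rather than copying it.

```python
def to_list_like(value):
    """Hypothetical stand-in: return lists as-is, wrap anything else."""
    if isinstance(value, list):
        return value  # same object, no copy (assumed behaviour)
    return [value]


def build_keys_buggy(index, columns):
    cols = to_list_like(index)
    cols.extend(columns)  # in-place: also changes the caller's list
    return cols


def build_keys_fixed(index, columns):
    cols = to_list_like(index)
    return cols + columns  # concatenation builds a new list


index = ["lev1", "lev2"]

fixed_keys = build_keys_fixed(index, ["lev3"])
index_after_fixed = list(index)  # snapshot taken after the fixed call

buggy_keys = build_keys_buggy(index, ["lev3"])
index_after_buggy = list(index)  # snapshot taken after the buggy call
```

After the fixed call the caller's `index` still has two entries; after the buggy call it has three, because `extend` wrote into the same list object the caller passed in.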
37 changes: 37 additions & 0 deletions pandas/tests/reshape/test_pivot.py
@@ -2153,3 +2153,40 @@ def test_pivot_index_none(self):
 
         expected.columns.name = "columns"
         tm.assert_frame_equal(result, expected)
+
+    def test_pivot_index_list_values_none_modifies_args(self):
+        # GH37635

ivanovmg (Member) commented on Nov 12, 2020:

This PR is about making sure that the index list does not change.
Would it be more reasonable to test only that here, without comparing the pivoted table with the expected one?
This is related to one of your prior questions:

> Thanks for the quick reply @arw2019!
>
> Quick question: I'm essentially copying TestPivot.test_pivot(). The point of the bugfix is to prevent the pivot() method from modifying the passed argument index. Should I still include the assert statements from TestPivot.test_pivot() to verify that the pivot itself worked, or just include an assert statement focused on the bugfix?

Jacob-Stevens-Haas (Contributor Author) commented on Nov 12, 2020:

I had that originally, but a previous review asked that I

> check here that you got the correct result from the pivot (might need to hard-code)

It doesn't much matter to me; now that the code is written, it's easy to delete or leave in as desired.

Member commented:

I don't feel strongly about this, but I'd say it's useful to check that the pivot is doing what it's supposed to before checking that the args aren't changed.

Member commented:

I looked through the test module and noticed that there are no tests on pivot when index is a list and values is None.
So this test does not duplicate any existing one, and it is perfectly fine to keep it as it is.
Thank you!
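
The hard-coded expected frame in the test body that follows can be cross-checked by hand. The sketch below is a plain-Python illustration, not how pandas computes the pivot: `pivot_rows` is a hypothetical helper, the rows are transcribed from the test's `df`, and the cell layout (a `lev4` block followed by a `values` block, each with columns for `lev3` equal to 1 and 2) is hard-coded to match this one test.

```python
import math

# Rows transcribed from the test's DataFrame:
# (lev1, lev2, lev3, lev4, values)
rows = [
    (1, 1, 1, 1, 0),
    (1, 1, 2, 2, 1),
    (1, 2, 1, 3, 2),
    (2, 1, 2, 4, 3),
    (2, 1, 1, 5, 4),
    (2, 2, 2, 6, 5),
]


def pivot_rows(rows):
    """Map (lev1, lev2) to [lev4@lev3=1, lev4@lev3=2, values@lev3=1, values@lev3=2]."""
    table = {}
    for lev1, lev2, lev3, lev4, value in rows:
        # Missing combinations stay NaN, as in the expected frame.
        cells = table.setdefault((lev1, lev2), [math.nan] * 4)
        cells[lev3 - 1] = float(lev4)       # "lev4" block
        cells[2 + lev3 - 1] = float(value)  # "values" block
    return table


table = pivot_rows(rows)
```

Each key of `table` corresponds to one row of the expected `MultiIndex`, and each list to one row of the expected array.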

+        df = DataFrame(
+            {
+                "lev1": [1, 1, 1, 2, 2, 2],
+                "lev2": [1, 1, 2, 1, 1, 2],
+                "lev3": [1, 2, 1, 2, 1, 2],
+                "lev4": [1, 2, 3, 4, 5, 6],
+                "values": [0, 1, 2, 3, 4, 5],
+            }
+        )
+        index = ["lev1", "lev2"]
+        columns = ["lev3"]
+        result = df.pivot(index=index, columns=columns, values=None)
+
+        expected = DataFrame(
+            np.array(
+                [
+                    [1.0, 2.0, 0.0, 1.0],
+                    [3.0, np.nan, 2.0, np.nan],
+                    [5.0, 4.0, 4.0, 3.0],
+                    [np.nan, 6.0, np.nan, 5.0],
+                ]
+            ),
+            index=MultiIndex.from_arrays(
+                [(1, 1, 2, 2), (1, 2, 1, 2)], names=["lev1", "lev2"]
+            ),
+            columns=MultiIndex.from_arrays(
+                [("lev4", "lev4", "values", "values"), (1, 2, 1, 2)],
+                names=[None, "lev3"],
+            ),
+        )
+
+        tm.assert_frame_equal(result, expected)

Member commented:

check here that you got the correct result from the pivot (might need to hard-code)

+        assert index == ["lev1", "lev2"]

Contributor commented:

Can you also assert that columns is the same as the original? Ping on green.

Contributor Author commented:

Done but not green; new comment below.

Member commented:

I'd say it's green, LGTM

Jacob-Stevens-Haas (Contributor Author) commented on Nov 13, 2020:

Gotcha, thanks! @jreback?
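
The check requested in this thread, that neither `index` nor `columns` is changed by the call, can be generalized into a small helper. The following is a minimal sketch under stated assumptions, not part of this PR or of pandas: `call_and_check_unmodified` and `join_keys` are hypothetical names, and the comparison relies on `==`, so it suits plain lists and similar values rather than DataFrames.

```python
import copy


def call_and_check_unmodified(func, *args, **kwargs):
    """Call func and raise AssertionError if it mutated any argument."""
    args_before = copy.deepcopy(args)
    kwargs_before = copy.deepcopy(kwargs)
    result = func(*args, **kwargs)
    if args != args_before:
        raise AssertionError("a positional argument was modified")
    if kwargs != kwargs_before:
        raise AssertionError("a keyword argument was modified")
    return result


def join_keys(index, columns):
    """Toy function under test (hypothetical): combine two key lists."""
    return index + columns  # builds a new list, leaves inputs alone


keys = call_and_check_unmodified(join_keys, ["lev1", "lev2"], columns=["lev3"])
```

As suggested earlier in the review, the returned result can be checked first, and the helper then guards against argument mutation as a separate concern.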