BUG: DataFrame.pivot drops column level names when both rows and columns are multiindexed #36655

Merged
merged 14 commits into from
Oct 31, 2020
1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.2.0.rst
@@ -512,6 +512,7 @@ Reshaping
- Bug in func :meth:`crosstab` when using multiple columns with ``margins=True`` and ``normalize=True`` (:issue:`35144`)
- Bug in :meth:`DataFrame.agg` with ``func={'name':<FUNC>}`` incorrectly raising ``TypeError`` when ``DataFrame.columns==['Name']`` (:issue:`36212`)
- Bug in :meth:`Series.transform` would give incorrect results or raise when the argument ``func`` was dictionary (:issue:`35811`)
- Bug in :meth:`DataFrame.pivot` did not preserve :class:`MultiIndex` level names for columns when rows and columns are both multiindexed (:issue:`36360`)
- Bug in :func:`join` returned a non deterministic level-order for the resulting :class:`MultiIndex` (:issue:`36910`)
-

4 changes: 2 additions & 2 deletions pandas/core/groupby/generic.py
@@ -1645,8 +1645,8 @@ def _wrap_aggregated_output(
DataFrame
"""
indexed_output = {key.position: val for key, val in output.items()}
- name = self._obj_with_exclusions._get_axis(1 - self.axis).name
- columns = Index([key.label for key in output], name=name)
+ columns = Index([key.label for key in output])
+ columns._set_names(self._obj_with_exclusions._get_axis(1 - self.axis).names)

result = self.obj._constructor(indexed_output)
result.columns = columns
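
The change above swaps a single `name` for the full list of `names`. A minimal sketch of why that matters, using only public pandas API (the PR itself calls the private `_set_names`; the `.names` setter here is a stand-in chosen for illustration, and `cols`, `old_style`, and `new_style` are hypothetical names):

```python
import pandas as pd

# A two-level column index, like the one in the new test.
cols = pd.MultiIndex.from_tuples(
    [(0, 0), (0, 1)], names=["col_L0", "col_L1"]
)

# A MultiIndex stores its level names in `.names`; `.name` is None,
# so rebuilding the index from `.name` loses the level names.
old_style = pd.Index(list(cols), name=cols.name)

# Rebuilding first and then copying `.names` keeps them.
new_style = pd.Index(list(cols))
new_style.names = cols.names

print(list(old_style.names))
print(list(new_style.names))
```

The first print is expected to show two unnamed levels and the second the original `col_L0` / `col_L1` names.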
34 changes: 33 additions & 1 deletion pandas/tests/reshape/test_pivot_multilevel.py
@@ -2,7 +2,7 @@
import pytest

import pandas as pd
- from pandas import Index, MultiIndex
+ from pandas import Index, Int64Index, MultiIndex
import pandas._testing as tm


@@ -190,3 +190,35 @@ def test_pivot_list_like_columns(
expected_values, columns=expected_columns, index=expected_index
)
tm.assert_frame_equal(result, expected)


def test_pivot_multiindexed_rows_and_cols():
# GH 36360
Contributor

Do we have appropriate tests if index OR columns are a MI and the other is not?

Member Author

We do (in this file and in test_pivot.py).

What the existing tests don't cover is the case where pivot_table produces a result with multiple MultiIndexed columns; that case is added here.


df = pd.DataFrame(
data=np.arange(12).reshape(4, 3),
columns=MultiIndex.from_tuples(
[(0, 0), (0, 1), (0, 2)], names=["col_L0", "col_L1"]
),
index=MultiIndex.from_tuples(
[(0, 0, 0), (0, 0, 1), (1, 1, 1), (1, 0, 0)],
names=["idx_L0", "idx_L1", "idx_L2"],
),
)

res = df.pivot_table(
index=["idx_L0"],
columns=["idx_L1"],
values=[(0, 1)],
aggfunc=lambda col: col.values.sum(),
)

expected = pd.DataFrame(
data=[[5.0, np.nan], [10.0, 7.0]],
columns=MultiIndex.from_tuples(
[(0, 1, 0), (0, 1, 1)], names=["col_L0", "col_L1", "idx_L1"]
),
index=Int64Index([0, 1], dtype="int64", name="idx_L0"),
)

tm.assert_frame_equal(res, expected)
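
For anyone who wants to check the behaviour outside the test suite, here is a standalone sketch of the same scenario. It assumes a pandas version that includes this fix, and it replaces `col.values.sum()` with `col.to_numpy().sum()` and drops `Int64Index` so that it does not depend on names that later pandas releases removed; neither substitution is part of this PR.

```python
import numpy as np
import pandas as pd

# Same frame as in the test: 2-level columns, 3-level index.
df = pd.DataFrame(
    data=np.arange(12).reshape(4, 3),
    columns=pd.MultiIndex.from_tuples(
        [(0, 0), (0, 1), (0, 2)], names=["col_L0", "col_L1"]
    ),
    index=pd.MultiIndex.from_tuples(
        [(0, 0, 0), (0, 0, 1), (1, 1, 1), (1, 0, 0)],
        names=["idx_L0", "idx_L1", "idx_L2"],
    ),
)

# Pivot one index level into the columns while keeping another as rows.
res = df.pivot_table(
    index=["idx_L0"],
    columns=["idx_L1"],
    values=[(0, 1)],
    aggfunc=lambda col: col.to_numpy().sum(),
)

# With the fix, the original column level names should survive,
# followed by the pivoted index level.
print(list(res.columns.names))
print(res.index.name)
```

Before the fix, the first two entries of `res.columns.names` came back as `None`, which is the regression reported in GH 36360.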