Skip to content

BUG: MultiIndex dtypes incorrect if level names not unique #45175

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Jan 4, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.4.0.rst
Original file line number Diff line number Diff line change
Expand Up @@ -838,6 +838,7 @@ MultiIndex
- Bug in :meth:`MultiIndex.get_loc` raising ``TypeError`` instead of ``KeyError`` on nested tuple (:issue:`42440`)
- Bug in :meth:`MultiIndex.union` setting wrong ``sortorder`` causing errors in subsequent indexing operations with slices (:issue:`44752`)
- Bug in :meth:`MultiIndex.putmask` where the other value was also a :class:`MultiIndex` (:issue:`43212`)
- Bug in :meth:`MultiIndex.dtypes` duplicate level names returned only one dtype per name (:issue:`45174`)
-

I/O
Expand Down
4 changes: 1 addition & 3 deletions pandas/core/indexes/multi.py
Original file line number Diff line number Diff line change
Expand Up @@ -739,9 +739,7 @@ def dtypes(self) -> Series:
from pandas import Series

names = com.fill_missing_names([level.name for level in self.levels])
return Series(
{names[idx]: level.dtype for idx, level in enumerate(self.levels)}
)
return Series([level.dtype for level in self.levels], index=names)

def __len__(self) -> int:
return len(self.codes[0])
Expand Down
17 changes: 17 additions & 0 deletions pandas/tests/indexes/multi/test_get_set.py
Original file line number Diff line number Diff line change
Expand Up @@ -67,6 +67,23 @@ def test_get_dtypes_no_level_name():
tm.assert_series_equal(expected, idx_multitype.dtypes)


def test_get_dtypes_duplicate_level_names():
# Test MultiIndex.dtypes with non-unique level names (# GH45174)
result = MultiIndex.from_product(
[
[1, 2, 3],
["a", "b", "c"],
pd.date_range("20200101", periods=2, tz="UTC"),
],
names=["A", "A", "A"],
).dtypes
expected = pd.Series(
[np.dtype("int64"), np.dtype("O"), DatetimeTZDtype(tz="utc")],
index=["A", "A", "A"],
)
tm.assert_series_equal(result, expected)


def test_get_level_number_out_of_bounds(multiindex_dataframe_random_data):
frame = multiindex_dataframe_random_data

Expand Down