Commit 62ced7b
BUG: fix dtype of all-NaN MultiIndex level
closes #17929
1 parent 77b4bb3 commit 62ced7b

3 files changed, +11 -6 lines changed

doc/source/whatsnew/v0.21.0.txt

+1
@@ -964,6 +964,7 @@ Indexing
 - When called on an unsorted ``MultiIndex``, the ``loc`` indexer now will raise ``UnsortedIndexError`` only if proper slicing is used on non-sorted levels (:issue:`16734`).
 - Fixes regression in 0.20.3 when indexing with a string on a ``TimedeltaIndex`` (:issue:`16896`).
 - Fixed :func:`TimedeltaIndex.get_loc` handling of ``np.timedelta64`` inputs (:issue:`16909`).
+- Bug in ``MultiIndex`` which would assign object dtype to all-NaN levels (:issue:`17929`).
 - Fix :func:`MultiIndex.sort_index` ordering when ``ascending`` argument is a list, but not all levels are specified, or are in a different order (:issue:`16934`).
 - Fixes bug where indexing with ``np.inf`` caused an ``OverflowError`` to be raised (:issue:`16957`)
 - Bug in reindexing on an empty ``CategoricalIndex`` (:issue:`16770`)

pandas/core/categorical.py

+3
@@ -2276,6 +2276,7 @@ def _factorize_from_iterable(values):
         a CategoricalIndex keeping the categories and order of `values`.
     """
     from pandas.core.indexes.category import CategoricalIndex
+    from pandas.core.indexes.numeric import Float64Index
 
     if not is_list_like(values):
         raise TypeError("Input must be list-like")
@@ -2291,6 +2292,8 @@ def _factorize_from_iterable(values):
         cat = Categorical(values, ordered=True)
         categories = cat.categories
         codes = cat.codes
+        if len(codes) and not len(categories):
+            categories = Float64Index([])
     return codes, categories
 
 
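The guard added in this hunk can be illustrated outside pandas. The sketch below is not the pandas implementation: `factorize_sketch` and its string dtype labels are hypothetical names made up for illustration, and it keeps categories in first-seen order rather than sorting them. It only shows why an input that is non-empty but entirely NaN produces codes with no categories, which is the case the patch redirects to a float dtype.

```python
import math


def factorize_sketch(values):
    """Return (codes, categories, dtype_label) for a list of values.

    Hypothetical helper for illustration only. NaN entries get code -1
    and never become categories, similar to how missing values are
    treated when a Categorical is built.
    """
    categories = []
    codes = []
    for v in values:
        if isinstance(v, float) and math.isnan(v):
            codes.append(-1)
            continue
        if v not in categories:
            categories.append(v)
        codes.append(categories.index(v))

    # Made-up dtype labels: "float64" only if every category is a float,
    # otherwise "object" (which is also what an empty result falls back to).
    if categories and all(isinstance(c, float) for c in categories):
        dtype_label = "float64"
    else:
        dtype_label = "object"

    # Same condition as the patch: values were supplied (codes is
    # non-empty) but all of them were missing (no categories found).
    if len(codes) and not len(categories):
        dtype_label = "float64"
    return codes, categories, dtype_label


all_nan = factorize_sketch([float("nan")] * 3)
mixed = factorize_sketch(["a", float("nan"), 1])
empty = factorize_sketch([])
```

Note that a truly empty input does not trigger the guard, because `len(codes)` is zero; only the all-missing case is affected.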

pandas/tests/indexes/test_multi.py

+7 -6
@@ -970,12 +970,13 @@ def test_get_level_values_na(self):
 
         arrays = [[np.nan, np.nan, np.nan], ['a', np.nan, 1]]
         index = pd.MultiIndex.from_arrays(arrays)
-        values = index.get_level_values(0)
-        expected = np.array([np.nan, np.nan, np.nan])
-        tm.assert_numpy_array_equal(values.values.astype(float), expected)
-        values = index.get_level_values(1)
-        expected = np.array(['a', np.nan, 1], dtype=object)
-        tm.assert_numpy_array_equal(values.values, expected)
+        result = index.get_level_values(0)
+        expected = pd.Index([np.nan, np.nan, np.nan])
+        tm.assert_index_equal(result, expected)
+
+        result = index.get_level_values(1)
+        expected = pd.Index(['a', np.nan, 1], dtype=object)
+        tm.assert_index_equal(result, expected)
 
         arrays = [['a', 'b', 'b'], pd.DatetimeIndex([0, 1, pd.NaT])]
         index = pd.MultiIndex.from_arrays(arrays)
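
The updated test compares whole `Index` objects, so it checks the dtype as well as the values; the old version cast to float first and would pass even when the level was object dtype. A minimal way to observe the behaviour through the public API might look like the following. It assumes a pandas version that includes this fix. On recent pandas releases `Float64Index` no longer exists, so the sketch inspects the dtype rather than the index class.

```python
import numpy as np
import pandas as pd

# Same input as the test: level 0 is entirely NaN, level 1 is mixed.
arrays = [[np.nan, np.nan, np.nan], ["a", np.nan, 1]]
index = pd.MultiIndex.from_arrays(arrays)

level0 = index.get_level_values(0)
level1 = index.get_level_values(1)

# With the fix, the all-NaN level is expected to be float64 rather
# than object, while the mixed level stays object.
print(level0.dtype)
print(level1.dtype)
```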
