Commit 117d0b1

pganssle authored and TomAugspurger committed
BUG: Empty CategoricalIndex fails with boolean categories (#22710)
* TST: Add failing test for empty bool Categoricals
* BUG: Failure in empty boolean CategoricalIndex

Fixes GH #22702.
1 parent 1c113db commit 117d0b1

4 files changed, +19 -2 lines changed

doc/source/whatsnew/v0.24.0.txt (+1)

@@ -616,6 +616,7 @@ Categorical
 ^^^^^^^^^^^
 
 - Bug in :meth:`Categorical.from_codes` where ``NaN`` values in ``codes`` were silently converted to ``0`` (:issue:`21767`). In the future this will raise a ``ValueError``. Also changes the behavior of ``.from_codes([1.1, 2.0])``.
+- Constructing a :class:`pd.CategoricalIndex` with empty values and boolean categories was raising a ``ValueError`` after a change to dtype coercion (:issue:`22702`).
 
 Datetimelike
 ^^^^^^^^^^^^
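
The case described in the new whatsnew entry can be checked with a short snippet. This is a minimal sketch, assuming a pandas version that already includes this fix; the variable name `idx` is illustrative and not part of the commit.

```python
import pandas as pd

# Empty values combined with boolean categories: the case from gh-22702.
idx = pd.CategoricalIndex([], categories=[True, False])

# With the fix in place, construction succeeds, the index is empty,
# and both boolean categories are preserved.
print(len(idx))
print(sorted(idx.categories.tolist()))
```

On a pandas build without the fix, the constructor call is where the ``ValueError`` mentioned in the entry would be expected to surface.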

pandas/core/arrays/categorical.py (+6 -2)

@@ -2439,9 +2439,13 @@ def _get_codes_for_values(values, categories):
     """
     utility routine to turn values into codes given the specified categories
     """
-
     from pandas.core.algorithms import _get_data_algo, _hashtables
-    if not is_dtype_equal(values.dtype, categories.dtype):
+    if is_dtype_equal(values.dtype, categories.dtype):
+        # To prevent erroneous dtype coercion in _get_data_algo, retrieve
+        # the underlying numpy array. gh-22702
+        values = getattr(values, 'values', values)
+        categories = getattr(categories, 'values', categories)
+    else:
         values = ensure_object(values)
         categories = ensure_object(categories)
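
The `getattr(values, 'values', values)` idiom in the new branch unwraps an object that exposes a `.values` attribute and leaves anything else untouched. The sketch below illustrates only that idiom with the standard library; `Wrapper` and `unwrap` are hypothetical names for illustration and do not exist in pandas.

```python
class Wrapper:
    """Hypothetical container that exposes its raw data through `.values`."""

    def __init__(self, data):
        self.values = data


def unwrap(obj):
    # Return obj.values when the attribute exists, otherwise obj itself.
    return getattr(obj, "values", obj)


raw = [True, False]

# A wrapped object is reduced to its underlying data.
print(unwrap(Wrapper(raw)) is raw)  # True

# An object without a `.values` attribute passes through unchanged.
print(unwrap(raw) is raw)  # True
```

Because the fallback is the object itself, the same call works whether the caller passes a wrapper or an already-unwrapped array, which is presumably why the commit uses it for both `values` and `categories`.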

pandas/tests/arrays/categorical/test_constructors.py (+6)

@@ -42,6 +42,12 @@ def test_constructor_empty(self):
         expected = pd.Int64Index([1, 2, 3])
         tm.assert_index_equal(c.categories, expected)
 
+    def test_constructor_empty_boolean(self):
+        # see gh-22702
+        cat = pd.Categorical([], categories=[True, False])
+        categories = sorted(cat.categories.tolist())
+        assert categories == [False, True]
+
     def test_constructor_tuples(self):
         values = np.array([(1,), (1, 2), (1,), (1, 2)], dtype=object)
         result = Categorical(values)

pandas/tests/indexes/test_category.py (+6)

@@ -136,6 +136,12 @@ def test_construction_with_dtype(self):
         result = CategoricalIndex(idx, categories=idx, ordered=True)
         tm.assert_index_equal(result, expected, exact=True)
 
+    def test_construction_empty_with_bool_categories(self):
+        # see gh-22702
+        cat = pd.CategoricalIndex([], categories=[True, False])
+        categories = sorted(cat.categories.tolist())
+        assert categories == [False, True]
+
     def test_construction_with_categorical_dtype(self):
         # construction with CategoricalDtype
         # GH18109
