Commit 01582c4

REGR: Categorical with np.str_ categories (#31528)
1 parent 73ea6ca commit 01582c4

3 files changed (+12 -2 lines changed)

doc/source/whatsnew/v1.0.1.rst (+1)

@@ -22,6 +22,7 @@ Fixed regressions
 - Fixed regression in :meth:`GroupBy.apply` if called with a function which returned a non-pandas non-scalar object (e.g. a list or numpy array) (:issue:`31441`)
 - Fixed regression in :meth:`to_datetime` when parsing non-nanosecond resolution datetimes (:issue:`31491`)
 - Fixed regression in :meth:`~DataFrame.to_csv` where specifying an ``na_rep`` might truncate the values written (:issue:`31447`)
+- Fixed regression in :class:`Categorical` construction with ``numpy.str_`` categories (:issue:`31499`)
 - Fixed regression where setting :attr:`pd.options.display.max_colwidth` was not accepting negative integer. In addition, this behavior has been deprecated in favor of using ``None`` (:issue:`31532`)
 - Fixed regression in objTOJSON.c fix return-type warning (:issue:`31463`)
 - Fixed regression in :meth:`qcut` when passed a nullable integer. (:issue:`31389`)

pandas/_libs/hashtable_class_helper.pxi.in (+6 -2)

@@ -670,7 +670,9 @@ cdef class StringHashTable(HashTable):
             val = values[i]
 
             if isinstance(val, str):
-                v = get_c_string(val)
+                # GH#31499 if we have a np.str_ get_c_string wont recognize
+                # it as a str, even though isinstance does.
+                v = get_c_string(<str>val)
             else:
                 v = get_c_string(self.na_string_sentinel)
             vecs[i] = v
@@ -703,7 +705,9 @@ cdef class StringHashTable(HashTable):
             val = values[i]
 
             if isinstance(val, str):
-                v = get_c_string(val)
+                # GH#31499 if we have a np.str_ get_c_string wont recognize
+                # it as a str, even though isinstance does.
+                v = get_c_string(<str>val)
             else:
                 v = get_c_string(self.na_string_sentinel)
             vecs[i] = v
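
The comment added in both hunks describes a mismatch: `isinstance(val, str)` accepts a `np.str_`, yet `get_c_string` does not treat it as a `str`. The sketch below is a rough pure-Python analogy of that kind of mismatch, not the Cython mechanism itself. `StrSubclass` and `accepts_exact_str` are hypothetical names that stand in for `np.str_` (which is a subclass of `str`) and for a callee that insists on the exact built-in type. The sketch converts with `str(val)`, whereas the commit uses a Cython `<str>` cast, so the two are not equivalent.

```python
# Hypothetical stand-in for numpy.str_, which is likewise a subclass of str.
class StrSubclass(str):
    pass


def accepts_exact_str(value):
    """Hypothetical strict callee: only the exact built-in str type is accepted."""
    if type(value) is not str:
        raise TypeError(f"expected exact str, got {type(value).__name__}")
    return value.encode("utf-8")


val = StrSubclass("0")

# isinstance follows the inheritance chain, so the subclass passes.
is_str_instance = isinstance(val, str)

# An exact-type comparison does not, so the two checks disagree.
is_exact_str = type(val) is str

try:
    accepts_exact_str(val)
    strict_check_passed = True
except TypeError:
    strict_check_passed = False

# Handing the strict callee a plain str avoids the mismatch.
encoded = accepts_exact_str(str(val))
```

Here the `isinstance` guard succeeds while the strict callee rejects the same object, which is the shape of the failure the GH#31499 comment describes.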

pandas/tests/arrays/categorical/test_constructors.py (+5)

@@ -408,6 +408,11 @@ def test_constructor_str_unknown(self):
         with pytest.raises(ValueError, match="Unknown dtype"):
             Categorical([1, 2], dtype="foo")
 
+    def test_constructor_np_strs(self):
+        # GH#31499 Hastable.map_locations needs to work on np.str_ objects
+        cat = pd.Categorical(["1", "0", "1"], [np.str_("0"), np.str_("1")])
+        assert all(isinstance(x, np.str_) for x in cat.categories)
+
     def test_constructor_from_categorical_with_dtype(self):
         dtype = CategoricalDtype(["a", "b", "c"], ordered=True)
         values = Categorical(["a", "b", "d"])
