Commit 0e61847

conquistador1492 authored and jreback committed
BUG: Dataframe.fillna with np.nan for dtype=category(GH 14021)
closes #14021

Author: John Liekezer <[email protected]>

Closes #14051 from conquistador1492/issue_14021 and squashes the following commits:

a405777 [John Liekezer] BUG: Dataframe.fillna with np.nan for dtype=category(GH 14021)
1 parent 0db4304 commit 0e61847

3 files changed: +37 -2 lines changed

doc/source/whatsnew/v0.19.0.txt

+2 -1
@@ -1188,7 +1188,8 @@ Bug Fixes
 - Bug in ``Series`` arithmetic raises ``TypeError`` if it contains datetime-like as ``object`` dtype (:issue:`13043`)
 
 - Bug ``Series.isnull()`` and ``Series.notnull()`` ignore ``Period('NaT')`` (:issue:`13737`)
-- Bug ``Series.fillna()`` and ``Series.dropna()`` don't affect to ``Period('NaT')`` (:issue:`13737`)
+- Bug ``Series.fillna()`` and ``Series.dropna()`` don't affect to ``Period('NaT')`` (:issue:`13737`
+- Bug in ``.fillna(value=np.nan)`` incorrectly raises ``KeyError`` on a ``category`` dtyped ``Series`` (:issue:`14021`)
 
 - Bug in extension dtype creation where the created types were not is/identical (:issue:`13285`)
 - Bug in ``.resample(..)`` where incorrect warnings were triggered by IPython introspection (:issue:`13618`)
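
The new changelog entry describes the user-facing symptom. A minimal sketch of the fixed behavior follows, assuming a pandas release that includes this commit (0.19.0 or later). The variable names are illustrative and the data mirrors the `cats` column used in the test added by this commit.

```python
import numpy as np
import pandas as pd

# A category-dtyped Series with two missing entries and one category (2).
s = pd.Series(pd.Categorical([np.nan, 2, np.nan]))

# Filling with NaN should be a no-op; per the changelog entry,
# this call raised KeyError before the fix.
result = s.fillna(np.nan)

assert result.equals(s)
assert result.isnull().sum() == 2
assert list(result.cat.categories) == [2]
```

Filling with an existing category (for example `s.fillna(2)`) goes through the unchanged `get_loc` path.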

pandas/core/categorical.py

+4 -1
@@ -1464,7 +1464,10 @@ def fillna(self, value=None, method=None, limit=None):
             mask = values == -1
             if mask.any():
                 values = values.copy()
-                values[mask] = self.categories.get_loc(value)
+                if isnull(value):
+                    values[mask] = -1
+                else:
+                    values[mask] = self.categories.get_loc(value)
 
         return self._constructor(values, categories=self.categories,
                                  ordered=self.ordered, fastpath=True)
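
The hunk above works on the categorical's integer codes, where `-1` marks a missing entry. The following is a simplified sketch of that branch on plain Python lists, not the pandas implementation: `fill_codes` and `is_missing` are hypothetical names, and `list.index` stands in for `self.categories.get_loc`.

```python
import math


def is_missing(value):
    """Rough stand-in for pandas' isnull() on a scalar (None or float NaN only)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def fill_codes(codes, categories, value):
    """Return a copy of `codes` with every -1 replaced by the code for `value`."""
    if is_missing(value):
        # A missing fill value has no position among the categories,
        # so the missing slots simply keep the sentinel code.
        fill_code = -1
    else:
        # Look up the position of the fill value among the categories.
        fill_code = categories.index(value)
    return [fill_code if code == -1 else code for code in codes]


codes = [-1, 0, -1]        # codes for the values [NaN, 2, NaN]
categories = [2]

assert fill_codes(codes, categories, float("nan")) == [-1, 0, -1]
assert fill_codes(codes, categories, 2) == [0, 0, 0]
```

Before the patch, the lookup ran unconditionally, so a NaN fill value was looked up among categories that do not contain it.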

pandas/tests/test_categorical.py

+31
@@ -4118,6 +4118,37 @@ def f():
         res = df.fillna("a")
         tm.assert_frame_equal(res, df_exp)
 
+        # GH 14021
+        # np.nan should always be a is a valid filler
+        cat = Categorical([np.nan, 2, np.nan])
+        val = Categorical([np.nan, np.nan, np.nan])
+        df = DataFrame({"cats": cat, "vals": val})
+        res = df.fillna(df.median())
+        v_exp = [np.nan, np.nan, np.nan]
+        df_exp = pd.DataFrame({"cats": [2, 2, 2], "vals": v_exp},
+                              dtype='category')
+        tm.assert_frame_equal(res, df_exp)
+
+        result = df.cats.fillna(np.nan)
+        tm.assert_series_equal(result, df.cats)
+        result = df.vals.fillna(np.nan)
+        tm.assert_series_equal(result, df.vals)
+
+        idx = pd.DatetimeIndex(['2011-01-01 09:00', '2016-01-01 23:45',
+                                '2011-01-01 09:00', pd.NaT, pd.NaT])
+        df = DataFrame({'a': pd.Categorical(idx)})
+        tm.assert_frame_equal(df.fillna(value=pd.NaT), df)
+
+        idx = pd.PeriodIndex(['2011-01', '2011-01', '2011-01',
+                              pd.NaT, pd.NaT], freq='M')
+        df = DataFrame({'a': pd.Categorical(idx)})
+        tm.assert_frame_equal(df.fillna(value=pd.NaT), df)
+
+        idx = pd.TimedeltaIndex(['1 days', '2 days',
+                                 '1 days', pd.NaT, pd.NaT])
+        df = pd.DataFrame({'a': pd.Categorical(idx)})
+        tm.assert_frame_equal(df.fillna(value=pd.NaT), df)
+
     def test_astype_to_other(self):
 
         s = self.cat['value_group']
