Commit a405777

BUG: DataFrame.fillna with np.nan for dtype=category (GH 14021)
1 parent 5c955cb commit a405777

3 files changed

+30
-1
lines changed

doc/source/whatsnew/v0.19.0.txt

+1
@@ -1061,6 +1061,7 @@ Bug Fixes
 - Bug where ``pd.read_gbq()`` could throw ``ImportError: No module named discovery`` as a result of a naming conflict with another python package called apiclient (:issue:`13454`)
 - Bug in ``Index.union`` returns an incorrect result with a named empty index (:issue:`13432`)
 - Bugs in ``Index.difference`` and ``DataFrame.join`` raise in Python3 when using mixed-integer indexes (:issue:`13432`, :issue:`12814`)
+- Bug in ``DataFrame(..., dtype='category').fillna(value=np.nan)`` raise ``KeyError`` (:issue:`14021`)
 - Bug in ``.to_excel()`` when DataFrame contains a MultiIndex which contains a label with a NaN value (:issue:`13511`)
 - Bug in invalid frequency offset string like "D1", "-2-3H" may not raise ``ValueError (:issue:`13930`)
 - Bug in ``concat`` and ``groupby`` for hierarchical frames with ``RangeIndex`` levels (:issue:`13542`).
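
The changelog entry above can be exercised with a small check. This is a minimal sketch, assuming a pandas version that already includes this fix; the column name and values are illustrative and do not come from the commit.

```python
import numpy as np
import pandas as pd

# Illustrative frame (not from the commit): one categorical column
# with a single missing entry.
df = pd.DataFrame({"a": pd.Categorical(["x", np.nan, "y"])})

# Before the fix this call raised KeyError, because NaN was looked up
# among the categories. With the fix, missing slots simply stay missing.
res = df.fillna(value=np.nan)

still_missing = res["a"].isnull().tolist()
print(still_missing)
```

Filling with a value that is one of the categories (for example `df.fillna("x")`) takes the other branch and replaces the missing entry.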

pandas/core/categorical.py

+4-1
@@ -1444,7 +1444,10 @@ def fillna(self, value=None, method=None, limit=None):
             mask = values == -1
             if mask.any():
                 values = values.copy()
-                values[mask] = self.categories.get_loc(value)
+                if isnull(value):
+                    values[mask] = -1
+                else:
+                    values[mask] = self.categories.get_loc(value)
 
         return self._constructor(values, categories=self.categories,
                                  ordered=self.ordered, fastpath=True)
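
The patched branch operates on the integer codes of the categorical, where `-1` marks a missing value. The sketch below restates that logic outside of pandas internals. `fill_codes` is a hypothetical helper written for illustration; it is not part of pandas and not the code in this commit.

```python
import numpy as np
import pandas as pd


def fill_codes(codes, categories, value):
    """Hypothetical helper mirroring the patched branch of fillna."""
    codes = codes.copy()          # Categorical.codes is read-only
    mask = codes == -1            # -1 is the code for a missing value
    if mask.any():
        if pd.isnull(value):
            # A null fill value has no position in the categories,
            # so the missing code is kept instead of calling get_loc.
            codes[mask] = -1
        else:
            codes[mask] = categories.get_loc(value)
    return codes


cat = pd.Categorical(["a", np.nan, "b"])
kept = fill_codes(cat.codes, cat.categories, np.nan).tolist()
filled = fill_codes(cat.codes, cat.categories, "a").tolist()
print(kept, filled)
```

With a null fill value the codes come back unchanged; with `"a"` the missing slot receives the code of `"a"`.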

pandas/tests/test_categorical.py

+25
@@ -4100,6 +4100,31 @@ def f():
         res = df.fillna("a")
         tm.assert_frame_equal(res, df_exp)
 
+        # GH 14021
+        cat = Categorical([np.nan, 2, np.nan])
+        val = Categorical([np.nan, np.nan, np.nan])
+        df = DataFrame({"cats": cat, "vals": val})
+        res = df.fillna(df.median())
+        v_exp = [np.nan, np.nan, np.nan]
+        df_exp = pd.DataFrame({"cats": [2, 2, 2], "vals": v_exp},
+                              dtype='category')
+        tm.assert_frame_equal(res, df_exp)
+
+        idx = pd.DatetimeIndex(['2011-01-01 09:00', '2016-01-01 23:45',
+                                '2011-01-01 09:00', pd.NaT, pd.NaT])
+        df = DataFrame({'a': pd.Categorical(idx)})
+        tm.assert_frame_equal(df.fillna(value=pd.NaT), df)
+
+        idx = pd.PeriodIndex(['2011-01', '2011-01', '2011-01',
+                              pd.NaT, pd.NaT], freq='M')
+        df = DataFrame({'a': pd.Categorical(idx)})
+        tm.assert_frame_equal(df.fillna(value=pd.NaT), df)
+
+        idx = pd.TimedeltaIndex(['1 days', '2 days',
+                                 '1 days', pd.NaT, pd.NaT])
+        df = pd.DataFrame({'a': pd.Categorical(idx)})
+        tm.assert_frame_equal(df.fillna(value=pd.NaT), df)
+
     def test_astype_to_other(self):
 
         s = self.cat['value_group']
