BUG: Series.fillna() crashes on Categorical series if value is a series #17033

Closed
capelastegui opened this issue Jul 20, 2017 · 3 comments · Fixed by #18293
Labels: Categorical, Enhancement, Error Reporting, Missing-data

Comments


Code Sample

import numpy as np
import pandas as pd

s_str = pd.Series(['hello', np.nan])
print(s_str.fillna(s_str))   # This works
s_cat = s_str.astype('category')
print(s_cat.fillna(s_str))   # This raises ValueError on pandas 0.20.1

Problem description

pandas.Series.fillna() accepts a scalar, dict, Series, or DataFrame as value. The fillna() method for categoricals, however, accepts only scalars, and it does not give a clear error message when passed an unsupported input type such as a Series.

Calling Categorical.fillna() with value=series raises a cryptic error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
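
This message is what pandas raises whenever a Series is used where a single boolean is expected. A plausible cause here (an assumption, not confirmed in this thread) is that Categorical.fillna() evaluates the fill value inside a condition written for scalars. The minimal sketch below reproduces only that generic mechanism, without calling fillna(); the helper name truth_value_error is hypothetical.

```python
import numpy as np
import pandas as pd

s_str = pd.Series(['hello', np.nan])


def truth_value_error(value):
    """Return the message raised when `value` is used as a boolean, else None.

    Hypothetical helper for illustration; it is not part of pandas.
    """
    try:
        if value:  # a Series here has no single truth value
            pass
    except ValueError as exc:
        return str(exc)
    return None


# isnull() on a Series returns a boolean Series, not a single bool,
# so using the result directly in an `if` raises the cryptic ValueError.
message = truth_value_error(s_str.isnull())
print(message)
```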

Expected Output

If Series cannot be supported as input, the function should check for input type and provide a proper ValueError message (e.g. "value must be a scalar").
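
As a rough sketch of the type check suggested above: the function validate_fill_value and its message are hypothetical and only illustrate the idea; this is not the check pandas ships. It relies on pandas.api.types.is_scalar to reject Series, lists, and other non-scalar values before any comparison is attempted.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_scalar


def validate_fill_value(value):
    """Raise a clear error unless `value` is a scalar.

    Hypothetical helper; the name and message are illustrative only.
    """
    if not is_scalar(value):
        raise ValueError(
            "value must be a scalar, got %s" % type(value).__name__
        )
    return value


validate_fill_value('hello')      # passes: a string is a scalar
try:
    validate_fill_value(pd.Series(['hello', np.nan]))
except ValueError as exc:
    error_text = str(exc)         # names the offending type
    print(error_text)
```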

The ideal solution, however, would be to have Categorical.fillna() support the same value types as other fillna() methods.
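
Until such support exists, a user-side workaround is possible. The sketch below is one way to do it under stated assumptions: the helper fillna_categorical_with_series is hypothetical, it assumes a pandas version that provides Series.to_numpy(), and it is not how pandas implements fillna(). It aligns the fill Series to the categorical's index, checks that every value actually used is an existing category, and rebuilds the categorical with its original categories.

```python
import numpy as np
import pandas as pd


def fillna_categorical_with_series(s_cat, fill):
    """Fill missing entries of a categorical Series from another Series.

    Hypothetical user-side helper, not a pandas API.
    """
    aligned = fill.reindex(s_cat.index)   # match fill values by index label
    missing = s_cat.isnull()

    # Only values that land on a missing slot need to be valid categories.
    candidates = aligned[missing].dropna()
    unknown = set(candidates) - set(s_cat.cat.categories)
    if unknown:
        raise ValueError("fill value must be in categories: %r" % list(unknown))

    # Take the fill value where the original is missing, else the original.
    values = np.where(
        missing.to_numpy(),
        aligned.to_numpy(dtype=object),
        s_cat.astype(object).to_numpy(),
    )
    filled = pd.Categorical(
        values,
        categories=s_cat.cat.categories,
        ordered=s_cat.cat.ordered,
    )
    return pd.Series(filled, index=s_cat.index, name=s_cat.name)


s_cat = pd.Series(['a', np.nan, 'b', np.nan]).astype('category')
fill = pd.Series(['b', 'a', 'a', np.nan])
result = fillna_categorical_with_series(s_cat, fill)
print(result)
```

Slots where the fill Series is itself missing stay missing, and a fill value outside the existing categories raises an error rather than silently becoming NaN.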

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.11.final.0
python-bits: 64
OS: Darwin
OS-release: 14.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8

pandas: 0.20.1
pytest: None
pip: 9.0.1
setuptools: 30.2.0
Cython: 0.23.4
numpy: 1.12.1
scipy: 0.19.0
xarray: None
IPython: 5.1.0
sphinx: 1.5.3
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: 1.0.0
tables: None
numexpr: 2.5
feather: None
matplotlib: 1.5.1
openpyxl: 2.4.1
xlrd: 0.9.4
xlwt: None
xlsxwriter: None
lxml: 3.4.4
bs4: None
html5lib: None
sqlalchemy: 1.1.2
pymysql: None
psycopg2: 2.6.1 (dt dec pq3 ext lo64)
jinja2: 2.9.5
s3fs: None
pandas_gbq: None
pandas_datareader: None

gfyoung added the Categorical, Missing-data, Error Reporting, and Enhancement labels Jul 20, 2017

gfyoung (Member) commented Jul 20, 2017

@capelastegui : I agree that at the very least the error message can be improved. I also don't see why we shouldn't support Series or DataFrame as inputs.

Thus, a PR to improve the error message is welcome! However, feel free to also dive into how we could support those two classes as inputs and submit a PR to add that functionality.

jreback (Contributor) commented Jul 21, 2017

I suppose this could work.

jreback added this to the Next Major Release milestone Jul 21, 2017

gfyoung (Member) commented Jul 22, 2017

@jreback : I think at the very least we could improve the error message. We can address the actual behavior in a subsequent PR if necessary. How does that sound?

reidy-p added a commit to reidy-p/pandas that referenced this issue Nov 14, 2017
jreback modified the milestones: Next Major Release, 0.22.0 Nov 19, 2017