Series.astype(str, skipna=True) vanished in the 1.0 release #31708

languitar · 2020-02-05T16:39:32Z

Code Sample, a copy-pastable example if possible

import pandas as pd
pd.Series([None, 42]).astype(str, skipna=True)

results in TypeError: astype() got an unexpected keyword argument 'skipna'

Problem description

The provided snippet used to work before the 1.0 release. With the skipna argument, the None was preserved as a real np.nan type, instead of being converted to the string nan. The skipna argument seems to have gone in the 1.0 release and I can't find any way to restore this behavior without a custom function being passed to apply.

Expected Output

A series with [np.nan, '42']
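For reference, a minimal sketch of the custom-function route mentioned in the problem description, for pandas versions where `skipna` is not accepted. The helper name `astype_str_skipna` is hypothetical (it is not a pandas API), and the example builds the Series with `dtype=object` as an assumption, so that 42 stays an integer instead of being upcast to 42.0.

```python
import pandas as pd


def astype_str_skipna(s: pd.Series) -> pd.Series:
    """Stringify non-missing values and leave missing ones untouched.

    Hypothetical helper for illustration, not part of pandas.
    """
    return s.map(lambda v: v if pd.isna(v) else str(v))


# dtype=object is an assumption made here so that 42 is not upcast to 42.0
s = pd.Series([None, 42], dtype=object)
result = astype_str_skipna(s)
```

After the call, the first entry is still recognised as missing by `pd.isna`, and the second entry is the string `'42'`.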

Output of `pd.show_versions()`

INSTALLED VERSIONS
------------------
commit : None
python : 3.8.1.final.0
python-bits : 64
OS : Linux
OS-release : 5.5.1-arch1-1
machine : x86_64
processor :
byteorder : little
LC_ALL : None
LANG : en_US.utf8
LOCALE : en_US.UTF-8

pandas : 1.0.0
numpy : 1.18.1
pytz : 2019.3
dateutil : 2.8.1
pip : 19.3
setuptools : 44.0.0
Cython : None
pytest : 5.3.5
hypothesis : None
sphinx : 2.2.1
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.4.2
html5lib : 1.0.1
pymysql : None
psycopg2 : 2.8.4 (dt dec pq3 ext lo64)
jinja2 : 2.11.1
IPython : 7.12.0
pandas_datareader: 0.8.1
bs4 : 4.8.2
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : 4.4.2
matplotlib : 3.1.3
numexpr : 2.7.1
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
pytest : 5.3.5
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : 1.3.13
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
xlsxwriter : None
numba : None


MarcoGorelli · 2020-02-05T21:03:18Z

Can confirm this used to work in 0.25.3, but no longer works now.

DataFrame.astype used to accept kwargs but they were removed in #28646

jorisvandenbossche · 2020-02-05T21:18:40Z

Thanks for opening the issue!
As I said on gitter, I was not aware of this keyword, which is not surprising as it has never been documented AFAIK (and clearly has no tests either).

The reason it worked before (in certain cases) is because the kwargs were passed through to astype_nansafe, which has indeed a skipna keyword (pandas/pandas/core/dtypes/cast.py, line 806 in 935b6f4):

def astype_nansafe(arr, dtype, copy: bool = True, skipna: bool = False):
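To illustrate the mechanism described here, a toy sketch in plain Python of how forwarding `**kwargs` can make an undocumented keyword work, and how an explicit signature then rejects it. All three function names are hypothetical, and `astype_nansafe_sketch` is only a stand-in: it does not reproduce the real pandas helper.

```python
def astype_nansafe_sketch(values, dtype, copy=True, skipna=False):
    """Toy stand-in for the internal helper; NOT the pandas implementation.

    `copy` is accepted only to mirror the quoted signature and is unused.
    """
    out = []
    for v in values:
        missing = v is None or v != v  # None, or NaN (NaN != NaN)
        out.append(v if (skipna and missing) else dtype(v))
    return out


def astype_forwarding(values, dtype, **kwargs):
    # Pre-1.0 style: unknown keywords are forwarded untouched, so skipna
    # reaches the helper without being a documented parameter.
    return astype_nansafe_sketch(values, dtype, **kwargs)


def astype_explicit(values, dtype, copy=True):
    # 1.0 style: no **kwargs, so skipna is rejected at the call site.
    return astype_nansafe_sketch(values, dtype, copy=copy)


forwarded = astype_forwarding([None, 42], str, skipna=True)

try:
    astype_explicit([None, 42], str, skipna=True)
    rejected = False
except TypeError:
    rejected = True
```

The forwarding variant keeps the `None` and stringifies 42, while the explicit variant fails with a `TypeError` about an unexpected keyword argument, which matches the symptom reported at the top of this issue.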

jreback · 2020-02-06T01:09:23Z

this may have accidentally worked (passing skipna there)
but is not supported in any way

languitar · 2020-02-06T09:36:38Z

Actually, I found this only after reading some issues in this project. I can't exactly recall the source issue for my information on this argument, but the behavior was, for instance, also documented here: #25353

languitar · 2020-02-06T09:37:13Z

In any case, is there an alternative proposal how to handle this case that is also working in 1.0?

jreback · 2020-02-06T11:44:25Z

this is not documented in #25353
this is a straight-up bug with a PR that hasn’t been merged yet.

canonically something like

s[s.notna()] = s.astype(str)

would work
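A small sketch of how this masked assignment behaves on an object-dtype Series. The `dtype=object` choice is an assumption made for the illustration; other dtypes may behave differently depending on the pandas version.

```python
import pandas as pd

# dtype=object is an assumption for this illustration
s = pd.Series([None, 42], dtype=object)

# Only positions where the mask is True are overwritten with the
# stringified values; the missing entry is left as it was.
s[s.notna()] = s.astype(str)
```

The Series is modified in place: the first entry is still missing and the second entry is now the string `'42'`.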

jreback · 2020-02-06T11:44:46Z

this is a duplicate issue

jorisvandenbossche · 2020-02-06T13:05:49Z

@languitar And you are certainly not alone in finding it from that issue .. (according to the 👍's)

In practice, it would be easy to restore this for 1.0.2. But in principle, I would personally prefer to not do this, since it was never actually documented (apart from the comment on that issue), only worked kind of accidentally, and is relatively easy to work around until the actual issue is fixed.

However, depending on the discussion in #28176, we might still want to add it back anyhow (if we would introduce such a keyword to handle the deprecation cycle). But, that will first need some discussion in #28176 then.


TomAugspurger · 2020-03-11T14:23:35Z

I don't think there's anything being done for 1.0.2 here. Pushing.

sl224 · 2020-03-23T23:10:49Z

Any slick ways to handle this in versions > 1.0?
jreback's solution seems the most straightforward: s[s.notna()] = s.astype(str)

simonjayhawkins · 2020-03-29T11:21:12Z

However, depending on the discussion in #28176, we might still want to add it back anyhow (if we would introduce such a keyword to handle the deprecation cycle). But, that will first need some discussion in #28176 then.

Makes sense, see #28176 (comment). Can we do this separately from #28176 to resolve this regression?

t0ma-sz · 2020-06-29T09:49:22Z

Is there any update on this one? In pandas 1.0.5 it’s still complaining about unexpected keyword argument.

I think this may block the migration of some projects to 1.0.x.

simonjayhawkins · 2020-06-29T11:46:49Z

@t0ma-sz #28176 was closed as stale.

if you would like to submit a PR and address #28176 (comment), we can review again.

dbalduini · 2021-04-06T15:56:28Z

This solution s[s.notna()] = s.astype(str) will not work with mixed types.

TypeError: Cannot do inplace boolean setting on mixed-types with a non np.nan value

This is another workaround that does work with mixed types: s = s.where(s.isna(), s.astype(str))

How to reproduce:

df = pd.DataFrame({'number': pd.Series(range(3))})
df["name"] = "abc"
df.number.loc[2] = None
df.name.loc[1] = None

   number  name
0     0.0   abc
1     1.0  None
2     NaN   abc

df['number'].to_list()
> [0.0, 1.0, nan]
df['name'].to_list()
> ['abc', None, 'abc']

df[df.notna()] = df.astype(str)
> TypeError: Cannot do inplace boolean setting on mixed-types with a non np.nan value

df = df.where(df.isnull(), df.astype(str))

df['number'].to_list()
> ['0.0', '1.0', nan]
df['name'].to_list()
> ['abc', None, 'abc']

hans2520 · 2022-04-07T18:27:54Z

This solution s[s.notna()] = s.astype(str) will not work with mixed types.

TypeError: Cannot do inplace boolean setting on mixed-types with a non np.nan value

This is another workaround that does work with mixed types: s = s.where(s.isna(), s.astype(str))

This workaround does not work with Int64 columns:

*** TypeError: object cannot be converted to an IntegerDtype

Leaving both workarounds not working in such a use case.
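One possible variation for the Int64 case, offered only as a sketch: it was not proposed in this thread, and casting to object first is an assumption of the example. The idea is to give `where` an object-dtype container that can hold both the strings and the missing marker.

```python
import pandas as pd

s = pd.Series([1, None], dtype="Int64")

# Casting to object first is an assumption of this sketch: the object
# Series can hold both the strings and the missing-value marker.
result = s.astype(object).where(s.isna(), s.astype(str))
```

Here `where` keeps the original entry wherever `s.isna()` is True and takes the stringified entry everywhere else, so the missing value is still detected by `pd.isna`. The result is no longer an Int64 column.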

jbrockmendel · 2023-04-06T21:52:50Z

This hasn't been around for a while, not likely to come back. Closeable?

MarcoGorelli added the Regression Functionality that used to work in a prior pandas version label Feb 5, 2020

jorisvandenbossche added this to the 1.0.2 milestone Feb 5, 2020

jreback closed this as completed Feb 6, 2020

jorisvandenbossche reopened this Feb 6, 2020

smorken pushed a commit to cat-cfs/libcbm_py that referenced this issue Feb 7, 2020

bugfix to support pandas 1.0 release:

b8f630e

pandas-dev/pandas#31708

MarcoGorelli added the Needs Discussion Requires discussion from core team before further action label Feb 16, 2020

TomAugspurger removed this from the 1.0.2 milestone Mar 11, 2020

simonjayhawkins mentioned this issue Mar 29, 2020

Astype keeps nan when converting into string #28176

Closed

mroeschke added the Bug label May 11, 2020

t0ma-sz mentioned this issue Jun 30, 2020

Fix issue #31708 Series.astype(str, skipna=True) vanished in the 1.0 release #35060

Closed

5 tasks

jbrockmendel added the Astype label Jan 8, 2022

jbrockmendel added the Closing Candidate May be closeable, needs more eyeballs label Apr 6, 2023

mroeschke closed this as completed Nov 21, 2023

woodly0 mentioned this issue Mar 21, 2025

Cast from object to str replaces None constants with its string representation jpmml/sklearn2pmml#445

Closed
