str accessor functions returns float(NaN) instead of pd.NA #30966
Comments
Thanks for the report. The problem seems to be in the StringArray constructor when it is passed an ndarray:
In [2]: import pandas as pd
In [3]: import numpy as np
In [4]: values = np.array(['a', np.nan], dtype=object)
In [6]: pd.arrays.StringArray(values)
Out[6]:
<StringArray>
['a', nan]
Length: 2, dtype: string
I don't recall if we expect / document that the values passed to StringArray should be strings / NA (not NaN). When we go through
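A minimal sketch of the distinction discussed above, assuming a pandas version that provides the "string" extension dtype. It builds the array through pd.array with dtype="string" instead of calling the StringArray constructor directly, which converts the float NaN to pd.NA. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Object ndarray whose missing value is a float NaN, as in the session above.
values = np.array(["a", np.nan], dtype=object)

# Going through pd.array with dtype="string" normalizes missing values to pd.NA.
arr = pd.array(values, dtype="string")

missing = arr[1]
print(type(missing))      # type of the missing element
print(missing is pd.NA)   # identity check against the pd.NA singleton
```

Checking identity against pd.NA, rather than calling pd.isna, is what separates the two cases here, since pd.isna is true for both NaN and pd.NA.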
Thanks for the reply! Is there any way I can help solve the issue? On a similar issue,
Code Sample
Problem description
The str accessor, when working on a string-typed series, should return a string-typed series, which should be an array of [string, pd.NA] only, but it seems that some functions (see list below) can return a series that contains float('nan').
Affected functions
As of now, I found these str accessor functions to be affected:
upper
lower
replace
Also, extract(expand=False) on a string-typed series returns an object-typed series, which seems unintended as well.
Expected Output
<class 'pandas._libs.missing.NAType'>
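A minimal check in the spirit of the report, offered as a sketch and not as the reporter's original code sample. It applies the listed accessor methods to a string-typed series and records the type of the missing element in each result. On a pandas release where the behavior is fixed, every entry should be NAType; on an affected version some would be float. The sample data and variable names are assumptions.

```python
import pandas as pd

# A string-typed series with one missing value (illustrative data).
s = pd.Series(["a", None], dtype="string")

# The accessor methods named in the report, each applied to the same series.
results = {
    "upper": s.str.upper(),
    "lower": s.str.lower(),
    "replace": s.str.replace("a", "b", regex=False),
}

# Record the type of the missing element in each result.
missing_types = {name: type(out.iloc[1]) for name, out in results.items()}
for name, out in results.items():
    print(name, out.dtype, missing_types[name])

# With a single capture group, extract(expand=False) returns a Series;
# printing its dtype shows whether the string dtype is preserved.
extracted = s.str.extract(r"(a)", expand=False)
print("extract", extracted.dtype)
```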
Output of pd.show_versions()