API/BUG: pd.array(float_ndarray_with_nans) casts np.nan to pd.NA #39039


Closed
jbrockmendel opened this issue Jan 8, 2021 · 7 comments
Labels
API Design Closing Candidate May be closeable, needs more eyeballs NA - MaskedArrays Related to pd.NA and nullable extension arrays Needs Discussion Requires discussion from core team before further action

Comments

@jbrockmendel
Member

I was under the impression we meant to keep these as distinct concepts

@jbrockmendel jbrockmendel added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Jan 8, 2021
@jbrockmendel
Member Author

setting data[-3] = np.nan also gets cast to pd.NA
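A minimal sketch of the behaviour described above, assuming pandas 1.2 or later with default options. The array values are illustrative assumptions, not the ones from the original report.

```python
import numpy as np
import pandas as pd

# Illustrative input (assumed values): a float ndarray whose last entry is NaN.
arr = np.array([0.0, 1.0, 2.0, 3.0, np.nan])

# pd.array infers the nullable Float64 dtype and stores the NaN as pd.NA.
data = pd.array(arr)
print(data.dtype)         # Float64
print(data[-1] is pd.NA)  # True

# Assigning np.nan to an existing element is likewise stored as pd.NA.
data[-3] = np.nan
print(data[-3] is pd.NA)  # True
```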

@phofl
Member

phofl commented Jan 19, 2021

Docs sound like this is intended? https://pandas.pydata.org/docs/reference/api/pandas.array.html

Changed in version 1.2.0: Pandas now also infers nullable-floating dtype for float-like input data
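As a rough illustration of that inference (values are assumed, pandas 1.2 or later with default options): without a dtype, float-like input becomes a nullable Float64 array, while passing an explicit NumPy dtype keeps a NumPy-backed array in which NaN stays NaN.

```python
import numpy as np
import pandas as pd

values = [1.5, np.nan, 3.0]  # illustrative values (assumed)

# No dtype given: pandas infers the nullable floating dtype.
inferred = pd.array(values)

# Explicit NumPy dtype: the result wraps a plain float64 ndarray.
explicit = pd.array(values, dtype=np.dtype("float64"))

print(inferred.dtype)  # Float64
print(explicit.dtype)  # float64
print(inferred[1] is pd.NA)   # True
print(np.isnan(explicit[1]))  # True
```

Note that the second array is not a masked/nullable array, so this sidesteps the conversion rather than changing it.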

@jbrockmendel
Member Author

@jorisvandenbossche is the one to ask on this

@jorisvandenbossche
Member

Yes, this is not a bug but intentional behaviour.

See the 4th bullet point in the top post of the PR implementing it (#34307), although that only mentions it. The actual discussion is scattered through the issue on this topic: #32265

The main reason is that any code that currently deals with numpy arrays and pandas (also internally in pandas) assumes that NaNs are treated as missing data. So for compatibility with this (preserving the "missing" semantics), we convert any NaN to NA by default when converting a numpy array to a masked/nullable array.

Now, this certainly is a topic that still needs more attention (e.g. adding an option to preserve NaNs on input instead of converting them to NA; we also still need to discuss how to treat NaNs that get introduced by computations, the default conversion to numpy, ...).
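A small sketch of the compatibility argument, using assumed example values: because NaN becomes NA on conversion, `isna()` reports the same missing positions for the NumPy-backed and the nullable version of the same data. The conversion back to NumPy is written with an explicit `dtype` and `na_value` so it does not rely on the default conversion mentioned above.

```python
import numpy as np
import pandas as pd

arr = np.array([1.0, np.nan, 3.0])  # illustrative values (assumed)

numpy_backed = pd.Series(arr)        # dtype float64, NaN marks missing data
nullable = pd.Series(pd.array(arr))  # dtype Float64, NaN converted to pd.NA

# Both representations flag the same positions as missing.
same_mask = numpy_backed.isna().equals(nullable.isna())
print(same_mask)  # True

# Explicit conversion back to a float64 ndarray, filling NA with NaN.
round_trip = nullable.to_numpy(dtype="float64", na_value=np.nan)
print(round_trip.dtype)  # float64
```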

@jbrockmendel
Member Author

Thanks @jorisvandenbossche. Is there (supposed to be?) a way to actually set data[3] = np.nan and actually get np.nan instead of pd.NA?

@simonjayhawkins simonjayhawkins added NA - MaskedArrays Related to pd.NA and nullable extension arrays API Design Needs Discussion Requires discussion from core team before further action and removed Needs Triage Issue that has not been reviewed by a pandas team member Bug labels Jan 22, 2021
@simonjayhawkins
Member

duplicate of/covered by #32265

@mroeschke
Member

Does look similar to #32265, continued discussion can happen on that issue

5 participants