Implement pd.NA checking function #33490

dsaxton · 2020-04-12T03:30:26Z

tests added / passed
passes black pandas
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

This may already exist in some form, but if not I think it could be useful to have a fast function for checking specifically for the presence of pd.NA (e.g. if we're looking at an arbitrary array and don't know if we'll have access to a _mask attribute, or if we're handling some other data type that uses pd.NA for missing values but doesn't have this attribute).
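A minimal sketch of the situation described above, not code from this PR: a masked extension array carries a `_mask` attribute (a private implementation detail, so relying on it is an assumption), while a plain object ndarray holding pd.NA has none. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# A masked (nullable integer) array and a plain object array, both holding pd.NA.
masked = pd.array([1, None, 3], dtype="Int64")
obj = np.array([1, pd.NA, 3], dtype=object)

# `_mask` is private to pandas' masked arrays; an ndarray has no such attribute.
has_mask_masked = hasattr(masked, "_mask")
has_mask_obj = hasattr(obj, "_mask")

# The public route works for both containers without touching `_mask`.
masked_missing = masked.isna()
obj_missing = pd.isna(obj)
```

Both public calls return a boolean NumPy array, so code that cannot assume a `_mask` attribute can still locate the missing entries.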

jreback

isna already does this

dsaxton · 2020-04-12T04:07:39Z

> isna already does this

What I was thinking here was a way to get a mask for only pd.NA (that is, excluding np.NaN, None, and the other missing-value indicators that isna also flags). The main motivation was trying to think of a way to fix #31990 without writing the equivalent of a Python for loop to do this check.
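The distinction can be sketched in plain Python. The helper name `is_pandas_na` below is hypothetical and is not the function from this PR; it uses exactly the kind of Python-level loop the comment wants to avoid, so it only illustrates the intended semantics, not the performance.

```python
import numpy as np
import pandas as pd


def is_pandas_na(values):
    """Hypothetical helper: True only where the element is the pd.NA singleton."""
    values = np.asarray(values, dtype=object)
    # Identity check against pd.NA; np.nan and None do not match.
    return np.fromiter((v is pd.NA for v in values), dtype=bool, count=len(values))


values = np.array([pd.NA, np.nan, None, "a"], dtype=object)

na_only = is_pandas_na(values)   # flags pd.NA only
any_missing = pd.isna(values)    # flags pd.NA, np.nan and None
```

Here `na_only` marks just the first element, while `any_missing` marks the first three.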

jbrockmendel · 2020-04-12T04:10:31Z

pandas/_libs/missing.pyx

+        Py_ssize_t i, n
+        ndarray[uint8_t] result
+
+    assert arr.ndim == 1, "'arr' must be 1-D."


not a big deal, but is this restriction necessary?

dsaxton · 2020-04-12

Maybe not; the main application I had in mind was operating on single columns, though.
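One way the 1-D restriction could be relaxed, sketched in Python rather than Cython: flatten the input, run the 1-D check, and restore the original shape. Both function names are hypothetical stand-ins and do not reproduce the code in pandas/_libs/missing.pyx.

```python
import numpy as np
import pandas as pd


def is_pandas_na_1d(arr):
    """Hypothetical 1-D check, mirroring the assertion in the diff above."""
    assert arr.ndim == 1, "'arr' must be 1-D."
    return np.fromiter((v is pd.NA for v in arr), dtype=bool, count=len(arr))


def is_pandas_na_nd(arr):
    """Hypothetical wrapper: accept any dimensionality by flattening first."""
    arr = np.asarray(arr, dtype=object)
    # ravel() gives a 1-D view or copy; reshape restores the input's shape.
    return is_pandas_na_1d(arr.ravel()).reshape(arr.shape)


grid = np.array([[pd.NA, 1], [np.nan, pd.NA]], dtype=object)
nd_mask = is_pandas_na_nd(grid)
```

The result has the same shape as `grid`, with True only at the two pd.NA positions.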

jbrockmendel · 2020-04-12T04:11:38Z

A related helper function I've wanted is is_matching_na.

WillAyd · 2020-04-14T15:13:23Z

> What I was thinking here was a way to get a mask for only pd.NA (so not including np.NaN, None or other missing value indicators like isna does). The main motivation was trying to think of a way to fix #31990 without writing the equivalent of a Python for loop to do this check

Can't you use the mask that the StringDtype has to check this?

dsaxton · 2020-04-14T18:04:26Z

Closing since it looks like others aren't in favor of the idea. I do think some kind of "dtype-agnostic" way of checking for NA could be helpful later.

Implement pd.NA checking function

982ab6d

jreback requested changes Apr 12, 2020

jbrockmendel reviewed Apr 12, 2020

gfyoung added the Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate label Apr 12, 2020

dsaxton closed this Apr 14, 2020

dsaxton deleted the is-pandas-na branch April 14, 2020 18:04
