Skip to content

CLN: deduplicate __setitem__ and _reduce on masked arrays #34187

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

jorisvandenbossche
Copy link
Member

For __setitem__, the only difference was that they both call to their own "coercing" function, so I made this available as a helper method, so it the rest of the implementation of __setitem__ can be shared.

For _reduce, everything was the same, BooleanArray just has an additional check for any and all.

@jorisvandenbossche jorisvandenbossche added Clean NA - MaskedArrays Related to pd.NA and nullable extension arrays labels May 15, 2020
@jorisvandenbossche jorisvandenbossche added this to the 1.1 milestone May 15, 2020
Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@@ -77,6 +84,23 @@ def __getitem__(self, item):

return type(self)(self._data[item], self._mask[item])

def _coerce_to_array(self, values) -> Tuple[np.ndarray, np.ndarray]:
raise AbstractMethodError(self)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

orthogranal to this PR but wondering if we even need this any more or could just use abc.abstractmethod decorator

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We generally use this instead of the decorator in the EA base class, because the decorator requires it to actually be an abstract base class.

See

This class does not inherit from 'abc.ABCMeta' for performance reasons.
Methods and properties required by the interface raise
``pandas.errors.AbstractMethodError`` and no ``register`` method is
provided for registering virtual subclasses.

@jorisvandenbossche jorisvandenbossche merged commit 6ad157e into pandas-dev:master May 20, 2020
@jorisvandenbossche jorisvandenbossche deleted the EA-masked-setitem branch May 20, 2020 20:54
result = op(data, axis=0, skipna=skipna, mask=mask, **kwargs)

if np.isnan(result):
return libmissing.NA
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as a follow up, should this be self.dtype.na_value instead of hardcoded in the base class.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally like having it explicit as well (all subclasses use NA as missing value indicator, this method is not shared with ones where dtype.na_value is something different), but can go either way

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm,

we have

    @property
    def _na_value(self):
        return self.dtype.na_value

so this is not consistent.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also remove that property, from a quick search, that is only used once in the arrays code

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, if not generally on an array class, I think self.dtype.na_value is generally clearer.

This is not the only case where libmissing.NA is used, so we should use one ore the other. I accept your argument about being explicit but also prefer the the case of allowing the class to be extensible. YAGNI favours your proposal.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Clean NA - MaskedArrays Related to pd.NA and nullable extension arrays
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants