REF: implement putmask for CI/DTI/TDI/PI #36400

Merged
merged 1 commit into from
Sep 17, 2020

Conversation

jbrockmendel
Member

Avoids casting to ndarray which in some cases means an object-dtype cast.
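A minimal sketch of the difference described above, using only public pandas API. The sample dates, mask, and replacement value are made up for illustration, and the explicit `astype(object)` call is only a rough stand-in for the older object-cast route, not the actual pre-PR code path.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not taken from the PR.
idx = pd.date_range("2020-01-01", periods=3, freq="D")
mask = np.array([True, False, False])
value = pd.Timestamp("2021-06-01")

# Going through object dtype first (a stand-in for the older route)
# loses the datetime dtype.
via_object = idx.astype(object).putmask(mask, value)

# Calling putmask on the index directly keeps the result datetime-like.
direct = idx.putmask(mask, value)

print(via_object.dtype)
print(type(direct).__name__, direct.dtype)
```

The original index is not modified in either case; both calls return a new Index.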

@jreback jreback added the Refactor (Internal refactoring of code) and Datetime (Datetime data dtype) labels Sep 16, 2020
@jreback jreback added this to the 1.2 milestone Sep 16, 2020
Contributor

@jreback jreback left a comment

Looks fine; just confirm that the tests are hitting those lines (not that I trust the coverage thing...).

        try:
            code_value = self._data._validate_where_value(value)
        except (TypeError, ValueError):
            return self.astype(object).putmask(mask, value)
Contributor

Is this hit in tests?

codes = self._data._ndarray.copy()
np.putmask(codes, mask, code_value)
cat = self._data._from_backing_data(codes)
return type(self)._simple_new(cat, name=self.name)
Contributor

Is this hit in tests?

Member Author

Looks like we have coverage for all of the datetimelike cases but none of the categorical ones; will update.
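For readers following the categorical path discussed in this thread, here is a rough sketch of the same idea using only public API: replace codes under the mask, then rebuild the Categorical. The helper name `putmask_codes` and the sample data are hypothetical; the PR itself uses private helpers (`_validate_where_value`, `_from_backing_data`) that are not reproduced here, and the `ValueError` below is only a stand-in for the validation failure that triggers the object-dtype fallback.

```python
import numpy as np
import pandas as pd


def putmask_codes(cat, mask, value):
    """Hypothetical helper: masked replacement on a Categorical via its codes."""
    if value not in cat.categories:
        # Stand-in for the validation failure handled by the object fallback.
        raise ValueError("value is not an existing category")
    code_value = cat.categories.get_loc(value)
    codes = cat.codes.copy()  # cat.codes is read-only, so work on a copy
    np.putmask(codes, mask, code_value)
    return pd.Categorical.from_codes(codes, dtype=cat.dtype)


# Hypothetical sample data.
ci = pd.CategoricalIndex(["a", "b", "a"], categories=["a", "b"])
mask = np.array([True, False, False])

# Value is an existing category: the result stays categorical.
expected = pd.CategoricalIndex(putmask_codes(ci.values, mask, "b"))
result = ci.putmask(mask, "b")
print(result.equals(expected))

# Value is not a category: the index falls back to a non-categorical dtype.
fallback = ci.putmask(mask, "c")
print(type(fallback).__name__, list(fallback))
```

A test for the categorical case could check both branches in this way: one value that is a valid category and one that is not.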

Member

@simonjayhawkins simonjayhawkins left a comment

General question: the PR title implies a refactor, but IIUC the overridden putmask behaves differently from the existing base-class implementation, which casts to object. Since that is a change in behaviour, is this PR aimed at fixing any issues in particular?

@@ -422,6 +422,17 @@ def where(self, cond, other=None):
        cat = Categorical(values, dtype=self.dtype)
        return type(self)._simple_new(cat, name=self.name)

    def putmask(self, mask, value):
Member

Not necessarily for today, but is there any value in pushing this down to the array and having a putmask_compat until NEP18 can be supported?

putmask on the Index returns a copy, whereas putmask_compat on the array would be expected to operate in place. This may not be so easy for Categorical, but for other numpy-backed arrays it could be more trivial.

Also, is the goal of extension-array-backed indexes to allow 3rd-party EAs in the Index? If so, putmask on the array would need to be added to the EA interface?
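The copy-versus-in-place distinction raised in this comment can be seen with plain NumPy and a plain pandas Index. This is only an illustration with made-up values; it does not show the proposed putmask_compat, which does not exist in this PR.

```python
import numpy as np
import pandas as pd

mask = np.array([True, False, True])

# np.putmask mutates its first argument and returns None.
arr = np.array([1, 2, 3])
ret = np.putmask(arr, mask, 0)
print(ret, arr.tolist())

# Index.putmask leaves the original untouched and returns a new Index.
idx = pd.Index([1, 2, 3])
new_idx = idx.putmask(mask, 0)
print(idx.tolist(), new_idx.tolist())
```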

Member Author

[...] putmask on the array would need to be added to the EA interface?

I would be in favor of this

Contributor

Yep, I am also +1 on this, as this is a 'standard' array method. Can you create an issue (we might already have one)?

@jreback jreback merged commit 52c81a9 into pandas-dev:master Sep 17, 2020
Contributor

jreback commented Sep 17, 2020

Merging, but if possible please follow on with a) additional testing, b) any perf issues, and c) the EA API.

rhshadrach pushed a commit to rhshadrach/pandas that referenced this pull request Sep 17, 2020
@jbrockmendel jbrockmendel deleted the bug-index-putmask branch September 17, 2020 23:20
kesmit13 pushed a commit to kesmit13/pandas that referenced this pull request Nov 2, 2020
Labels
Datetime Datetime data dtype Refactor Internal refactoring of code
3 participants