-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
PERF: improve efficiency of BaseMaskedArray.__setitem__
#44186
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
e0a5472
to
015917c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we also need an asv for this
pandas/core/arrays/boolean.py
Outdated
@@ -364,6 +365,17 @@ def map_string(s): | |||
|
|||
_HANDLED_TYPES = (np.ndarray, numbers.Number, bool, np.bool_) | |||
|
|||
def __setitem__(self, key, value): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you type the args & return
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
pandas/core/arrays/boolean.py
Outdated
@@ -364,6 +365,17 @@ def map_string(s): | |||
|
|||
_HANDLED_TYPES = (np.ndarray, numbers.Number, bool, np.bool_) | |||
|
|||
def __setitem__(self, key, value): | |||
if lib.is_bool(value): | |||
key = check_array_indexer(self, key) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can move this to the base class instead of duplicating
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How so? I need to do is_bool
and is_int
and is_float
depending on the subclass. So unless we want to introduce a new method...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this would basically constitute _validate_setitem_value
. @alexreg you said you tried this and it added complexity and hurt performance. can you describe what you tried?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I tried copying bits from existing _validate_setitem_value
methods, but that may have been the wrong way to go about it (or at the least overkill). Let me try again and see...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I pushed again. How about that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
much better. still needs tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, no doubt... in honesty though, I think I'd just be flailing about helplessly trying to write tests for this. I wouldn't know what input data to use and how to set that up properly, for one. Not to mention what precisely to test. If anyone else wants to come in and add tests for this, that would be a big help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Take a look at the test_setitem_foo tests in tests.arrays.test_datetimelike. Something similar in either tests.arrays.masked_shared or tests.arrays.(integer|boolean|floating).test_indexing
d468b36
to
fd645db
Compare
@alexreg status here? pls merge master as well |
This somewhat deals with pandas-dev#44172, though that won't be fully resolved until 2D `ExtensionArray`s are supported (per the comments there).
fd645db
to
fb2b5fe
Compare
@jreback Sorry, haven't had much time lately. It would probably take me a while to properly wrap my head around the testing framework and write good tests. Ideally someone else could step in with that, but if not, maybe I'll get to it in a month or two. I rebased on current master, however, and pushed that. (Nice and easy.) |
None of us are going to write tests for you, but we will help you through getting the hang of writing them.
|
This somewhat deals with #44172, though that won't be fully resolved until 2D
ExtensionArray
s are supported (per the comments there).CC @jbrockmendel