-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
PERF: avoid copies if possible in fill_binop #31300
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not being super familiar here, is there a way to avoid copy without having to special case for the mask presence?
pandas/core/ops/__init__.py
Outdated
if fill_value is not None: | ||
left_mask = isna(left) | ||
right_mask = isna(right) | ||
left = left.copy() | ||
|
||
right = right.copy() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the right copy not be removed as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch, will update
Not sure I understand the question. Can you give an example? |
@jbrockmendel : I believe @WillAyd is trying to determine if there is a faster way to figure out if we need to
|
Well, we could pre-check for dtypes that can't hold NAs and short-circuit in those cases. Other than that, we might be able to do 1 O(n) pass instead of 2-3 if we did it all in cython, but i dont think thats worthwhile at this point |
this is fine. pls add asvs when you can. |
…ndexing-1row-df * upstream/master: (194 commits) DOC Remove Python 2 specific comments from documentation (pandas-dev#31198) Follow up PR: pandas-dev#28097 Simplify branch statement (pandas-dev#29243) BUG: DatetimeIndex.snap incorrectly setting freq (pandas-dev#31188) Move DataFrame.info() to live with similar functions (pandas-dev#31317) ENH: accept a dictionary in plot colors (pandas-dev#31071) PERF: add shortcut to Timestamp constructor (pandas-dev#30676) CLN/MAINT: Clean and annotate stata reader and writers (pandas-dev#31072) REF: define _get_slice_axis in correct classes (pandas-dev#31304) BUG: DataFrame.floordiv(ser, axis=0) not matching column-wise bheavior (pandas-dev#31271) PERF: optimize is_scalar, is_iterator (pandas-dev#31294) BUG: Series rolling count ignores min_periods (pandas-dev#30923) xfail sparse warning; closes pandas-dev#31310 (pandas-dev#31311) REF: DatetimeIndex.get_value wrap DTI.get_loc (pandas-dev#31314) CLN: internals.managers (pandas-dev#31316) PERF: avoid copies if possible in fill_binop (pandas-dev#31300) Add test for multiindex json (pandas-dev#31307) BUG: passing TDA and wrong freq to TimedeltaIndex (pandas-dev#31268) BUG: inconsistency between PeriodIndex.get_value vs get_loc (pandas-dev#31172) CLN: remove _set_subtyp (pandas-dev#31301) CI: Updated version of macos image (pandas-dev#31292) ...
This has a small perf bump, but more importantly is necessary for the blockwise frame-with-frame PR that is coming up.