Skip to content

PERF: Implement round on the block level #51498

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Mar 1, 2023

Conversation

lithomas1
Copy link
Member

@lithomas1 lithomas1 commented Feb 20, 2023

Benchmarks are showing same speedup as OP.

       before           after         ratio
     [c133327c]       [1a2c35f5]
     <main>           <faster-round>
-            115M              72M     0.63  frame_methods.Round.peakmem_round_transposed
-      1.24±0.2ms       60.9±0.9μs     0.05  frame_methods.Round.time_round
-        863±70ms       61.2±0.4μs     0.00  frame_methods.Round.time_round_transposed

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.

Whether Copy on Write is enabled right now
"""
if not self.is_numeric or self.is_bool:
return self.copy(deep=None if using_cow else True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

possibly raise?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that DataFrame.round() never raises even when there are no numeric columns, but Series.round() does.

Do you want me to add a comment that this doesn't raise?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this is just a perf PR than this should maintain behavior, but might merit an issue or TODO comment. looks like some of our EAs have round methods that are not documented as part of the interface.

@simonjayhawkins simonjayhawkins added the Performance Memory or execution speed performance label Feb 22, 2023
@lithomas1 lithomas1 marked this pull request as ready for review February 22, 2023 16:57
Whether Copy on Write is enabled right now
"""
if not self.is_numeric or self.is_bool:
return self.copy(deep=using_cow)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not using_cow I think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the catch.

@lithomas1 lithomas1 requested a review from phofl February 27, 2023 16:19
# error: Item "ExtensionArray" of "Union[ndarray[Any, Any], ExtensionArray]"
# has no attribute "round"
return new_block_2d(
self.values.round(decimals), # type: ignore[union-attr]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you want to use self.make_block_same_class here? If class could change then self.make_block. Don't have to set placement there

Copy link
Member

@phofl phofl Feb 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is dangerous, I don't think that round guarantees a copy, e.g.

arr = np.array([1, 2])
arr2 = arr.round(2)
arr2[0] = 100

You can check this with

arr is arr2

(also please add a test for this case)

Edit: You can just pass the refs of self in this case like we do in other places

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the tip.
I found the source here https://github.com/numpy/numpy/blob/486878b37fc7439a3b2b87747f50db9b62fea8eb/numpy/core/src/multiarray/calculation.c#L625-L636
and it looks like this is only triggered for integer case with decimals >= 0.

BTW, it also looks like round is an operation that can be done inplace by using the out parameter in numpy.
Maybe I should open a PR for that as a follow-up?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would have to check if it's actually faster, but if yes then should utilise it in CoW cases

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow up sounds good to me

# returns the same array, so need to set refs
# https://github.com/numpy/numpy/blob/486878b37fc7439a3b2b87747f50db9b62fea8eb/numpy/core/src/multiarray/calculation.c#L625-L636
refs = None
if not self.is_extension and self.dtype.kind == "i" and decimals >= 0:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer checking:

result = self.values.round(..)
if self.values is result:
   refs = self.refs

This would not depend on an implementation detail of numpy


if using_copy_on_write:
assert np.shares_memory(get_array(df2, "b"), get_array(df, "b"))
assert np.shares_memory(get_array(df2, "a"), get_array(df, "a"))
if decimals >= 0:
assert np.shares_memory(get_array(df2, "a"), get_array(df, "a"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you do a else: assert not and add a todo that we can optimise with out?

if using_copy_on_write:
assert not np.shares_memory(get_array(df2, "b"), get_array(df, "b"))
if decimals >= 0:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if think this if is not necessary? Both cases shouldn't share

# https://github.com/numpy/numpy/blob/486878b37fc7439a3b2b87747f50db9b62fea8eb/numpy/core/src/multiarray/calculation.c#L625-L636
values = values.copy()
return self.make_block_same_class(
values,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could fit into one line?

Copy link
Member

@phofl phofl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small comment, otherwise lgtm

@lithomas1 lithomas1 added this to the 2.0 milestone Mar 1, 2023
@lithomas1 lithomas1 merged commit f298507 into pandas-dev:main Mar 1, 2023
@lithomas1 lithomas1 deleted the faster-round branch March 1, 2023 22:54
@lumberbot-app
Copy link

lumberbot-app bot commented Mar 1, 2023

Owee, I'm MrMeeseeks, Look at me.

There seem to be a conflict, please backport manually. Here are approximate instructions:

  1. Checkout backport branch and update it.
git checkout 2.0.x
git pull
  1. Cherry pick the first parent branch of the this PR on top of the older branch:
git cherry-pick -x -m1 f298507b153a166bdd2a919274e039c45e740c5d
  1. You will likely have some merge/cherry-pick conflict here, fix them and commit:
git commit -am 'Backport PR #51498: PERF: Implement round on the block level'
  1. Push to a named branch:
git push YOURFORK 2.0.x:auto-backport-of-pr-51498-on-2.0.x
  1. Create a PR against branch 2.0.x, I would have named this PR:

"Backport PR #51498 on branch 2.0.x (PERF: Implement round on the block level)"

And apply the correct labels and milestones.

Congratulations — you did some good work! Hopefully your backport PR will be tested by the continuous integration and merged soon!

Remember to remove the Still Needs Manual Backport label once the PR gets merged.

If these instructions are inaccurate, feel free to suggest an improvement.

@jbrockmendel
Copy link
Member

Belated: LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Performance Memory or execution speed performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

PERF: DataFrame.round() unnecessarily slow copared to np.round()
4 participants