Performance regression in stat_ops.FrameMultiIndexOps.time_op #35186

simonjayhawkins · 2020-07-08T19:15:05Z

closes Performance regression in stat_ops.FrameMultiIndexOps.time_op #35050
tests added / passed
passes black pandas
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

import pandas as pd
import numpy as np

levels = [np.arange(10), np.arange(100), np.arange(100)]
codes = [
    np.arange(10).repeat(10000),
    np.tile(np.arange(100).repeat(100), 10),
    np.tile(np.tile(np.arange(100), 100), 10),
]
index = pd.MultiIndex(levels=levels, codes=codes)
df = pd.DataFrame(np.random.randn(len(index), 4), index=index)
%timeit df.std(level=1)
# 9.09 ms ± 59.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) -> master
# 7.39 ms ± 71.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) -> 0.25.3
# 7.21 ms ± 113 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) -> PR
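
For anyone rerunning this outside IPython, here is a minimal sketch of the same benchmark as a plain script. It assumes a newer pandas in which `DataFrame.std(level=...)` is no longer available (the `level` argument was deprecated in pandas 1.3 and removed in 2.0), so it uses the equivalent `df.groupby(level=1).std()`, and it swaps `%timeit` for the standard-library `timeit`. The `build_frame` helper name is hypothetical, and absolute timings will differ by machine and version.

```python
import timeit

import numpy as np
import pandas as pd


def build_frame() -> pd.DataFrame:
    """Build the 100,000-row, 4-column frame used in the benchmark above."""
    levels = [np.arange(10), np.arange(100), np.arange(100)]
    codes = [
        np.arange(10).repeat(10000),
        np.tile(np.arange(100).repeat(100), 10),
        np.tile(np.tile(np.arange(100), 100), 10),
    ]
    index = pd.MultiIndex(levels=levels, codes=codes)
    rng = np.random.default_rng(0)  # seeded so reruns use the same data
    return pd.DataFrame(rng.standard_normal((len(index), 4)), index=index)


df = build_frame()

# groupby(level=1).std() is the spelling that replaces df.std(level=1).
result = df.groupby(level=1).std()

# Best of 3 repeats of 20 calls each, reported per call in milliseconds.
timings = timeit.repeat(lambda: df.groupby(level=1).std(), number=20, repeat=3)
per_call_ms = min(timings) / 20 * 1e3
print(f"{per_call_ms:.2f} ms per call")
```

Taking the minimum over repeats is a common way to reduce noise from other processes; it is not the same statistic as the mean ± std. dev. that `%timeit` prints.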

jreback · 2020-07-08T21:49:03Z

pandas/core/groupby/groupby.py

-            post_processing=lambda vals, inference: np.sqrt(vals),
-            ddof=ddof,
-        )
+        result = self.var(ddof=ddof)


hmm you are reverting.

instead I would like to see if the generic function can be patched (e.g. something is fundamentally different).

cc @rhshadrach

rhshadrach · 2020-07-08

I don't see a way. _get_cythonized_result operates column by column, which creates overhead. On data of this size, 25% of the time is spent in grouper.group_info and 40% in _wrap_aggregated_output. Within each of these operations, I don't see any easy wins. Only 27% of the time is spent doing the actual computation.

The only way I see to improve performance is to change _get_cythonized_result to a 2d operation that works on all columns at once. In the PR that caused this regression, that's the first thing I tried, and I found that it got too hairy. In the process, I realized that perhaps a better solution would be to incorporate the features of _get_cythonized_result (namely, pre- and post-processing) into _cython_agg_general.
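
To make the column-by-column versus 2d distinction concrete, here is a rough NumPy sketch. It is not the pandas implementation: `group_var_1d`, `std_column_by_column`, and `std_2d` are hypothetical names, and the group bookkeeping is simplified to integer codes. The first version repeats the per-group bookkeeping for every column; the second does it once for the whole block. Both apply `np.sqrt` as the post-processing step.

```python
import numpy as np


def group_var_1d(values, codes, ngroups, ddof=1):
    """Per-group variance of a single column (hypothetical helper)."""
    counts = np.bincount(codes, minlength=ngroups)
    sums = np.bincount(codes, weights=values, minlength=ngroups)
    means = sums / counts
    sq_dev = np.bincount(
        codes, weights=(values - means[codes]) ** 2, minlength=ngroups
    )
    return sq_dev / (counts - ddof)


def std_column_by_column(data, codes, ngroups, ddof=1):
    """Loop over columns; counts and means are recomputed for each one."""
    cols = [
        np.sqrt(group_var_1d(data[:, j], codes, ngroups, ddof))
        for j in range(data.shape[1])
    ]
    return np.column_stack(cols)


def std_2d(data, codes, ngroups, ddof=1):
    """Handle all columns in one pass; counts are computed only once."""
    counts = np.bincount(codes, minlength=ngroups)[:, None]
    sums = np.zeros((ngroups, data.shape[1]))
    np.add.at(sums, codes, data)  # unbuffered scatter-add of rows into groups
    means = sums / counts
    sq_dev = np.zeros_like(sums)
    np.add.at(sq_dev, codes, (data - means[codes]) ** 2)
    return np.sqrt(sq_dev / (counts - ddof))


rng = np.random.default_rng(0)
codes = np.repeat(np.arange(5), 20)  # 5 groups of 20 rows each
data = rng.standard_normal((100, 4))

by_column = std_column_by_column(data, codes, 5)
all_at_once = std_2d(data, codes, 5)
```

The sketch only shows the structural difference. It makes no claim about relative speed, since `np.add.at` is itself not a fast primitive.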

simonjayhawkins · 2020-07-09

> hmm you are reverting.

yep. should have mentioned this in the OP.

> instead I would like to see if the generic function can be patched (e.g. something is fundamentally different).

OK. will close this for now. can reopen if a better solution is not forthcoming before the release.

simonjayhawkins · 2020-07-09

> OK. will close this for now. can reopen if a better solution is not forthcoming before the release.

or maybe better would be to backport this after 1.1rc if not fixed before.

simonjayhawkins · 2020-07-22T10:43:18Z

@jreback if we're not branching, should I reopen this and merge to master? see #34730 (comment)

jreback · 2020-07-22T10:47:30Z

no, reverting is not useful here

rhshadrach · 2020-07-25T15:02:03Z

If we're not reverting, the only three options I can see are:

1. Let the regression stand.
2. Make _get_cythonized_result a 2d computation rather than operating on each column (which I first tried, and backed out once I realized how difficult this would be). I expect that doing so would also involve a performance regression for single-column frames that go through this path.
3. Add pre/post-processing to _cython_agg_general.

I don't know how viable/difficult 3 would be, and I'd be happy to work on it, but I don't think we want it in 1.1.0 at this point even if it could get done.
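
For illustration only, option 3 might look roughly like the sketch below. Everything here is hypothetical: `agg_general`, `var_kernel`, and the `pre`/`post` parameters are made-up names and do not reflect the real signature of _cython_agg_general. The idea is just that a general 2d aggregation accepts optional hooks, so std can be expressed as var followed by `np.sqrt`.

```python
import numpy as np


def agg_general(data, codes, ngroups, kernel, pre=None, post=None):
    """Hypothetical generic group aggregation with optional pre/post hooks."""
    # Pre-processing runs once on the whole 2d block, e.g. a dtype cast.
    values = pre(data) if pre is not None else data
    out = np.empty((ngroups, values.shape[1]), dtype=float)
    for g in range(ngroups):
        # The kernel reduces all columns of one group at once.
        out[g] = kernel(values[codes == g])
    # Post-processing runs once on the aggregated result.
    return post(out) if post is not None else out


def var_kernel(block, ddof=1):
    """Column-wise sample variance of one group's rows."""
    return block.var(axis=0, ddof=ddof)


rng = np.random.default_rng(0)
codes = np.repeat(np.arange(3), 10)  # 3 groups of 10 rows each
data = rng.integers(0, 2, size=(30, 2)).astype(bool)

# std = sqrt(var), with a bool -> float cast as the pre-processing step.
std = agg_general(
    data, codes, 3, var_kernel, pre=lambda a: a.astype(float), post=np.sqrt
)
```

The Python loop over groups stands in for the compiled kernel; the point is only where the hooks would sit relative to the aggregation.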

Performance regression in stat_ops.FrameMultiIndexOps.time_op

79b5263

simonjayhawkins added the Performance Memory or execution speed performance label Jul 8, 2020

fixup for as_index=False

8138635

jreback added this to the 1.1 milestone Jul 8, 2020

jreback requested changes Jul 8, 2020

simonjayhawkins closed this Jul 9, 2020

simonjayhawkins mentioned this pull request Jul 9, 2020

RLS: 1.1 #34730

Closed
