Skip to content

Performance regression in stat_ops.FrameMultiIndexOps.time_op #35050

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
TomAugspurger opened this issue Jun 29, 2020 · 10 comments
Closed

Performance regression in stat_ops.FrameMultiIndexOps.time_op #35050

TomAugspurger opened this issue Jun 29, 2020 · 10 comments
Labels
Performance Memory or execution speed performance Regression Functionality that used to work in a prior pandas version

Comments

@TomAugspurger
Copy link
Contributor

TomAugspurger commented Jun 29, 2020

https://pandas.pydata.org/speed/pandas/index.html#stat_ops.FrameMultiIndexOps.time_op?p-level=1&p-op=%27std%27&commits=c9144ca54dcc924995acae3d9dcb890a5802d7c0

import pandas as pd
import numpy as np

levels = [np.arange(10), np.arange(100), np.arange(100)]
codes = [
    np.arange(10).repeat(10000),
    np.tile(np.arange(100).repeat(100), 10),
    np.tile(np.tile(np.arange(100), 100), 10),
]
index = pd.MultiIndex(levels=levels, codes=codes)
df = pd.DataFrame(np.random.randn(len(index), 4), index=index)
%timeit df.groupby(level=1).std()

Points to #34372 (cc @rhshadrach), but there was an earlier slowdown.

@TomAugspurger TomAugspurger added Bug Needs Triage Issue that has not been reviewed by a pandas team member Performance Memory or execution speed performance Regression Functionality that used to work in a prior pandas version and removed Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Jun 29, 2020
@TomAugspurger TomAugspurger added this to the 1.1 milestone Jun 29, 2020
@TomAugspurger
Copy link
Contributor Author

Added code to reproduce @rhshadrach are you able to look into why this is slower now?

# 1.0.4
%timeit df.std(level=1)
## -- End pasted text --
3.68 ms ± 51.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

# master
%timeit df.std(level=1)
## -- End pasted text --
5.47 ms ± 107 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

@rhshadrach
Copy link
Member

rhshadrach commented Jul 3, 2020

Thanks - I'll be able to look into this on Sunday.

@rhshadrach
Copy link
Member

rhshadrach commented Jul 5, 2020

Confirmed the timings on 1.0.4, master, as well as PR #34372 directly. This agrees with the slowdown that was reported in #34372 (comment). I also confirmed that single column performance on master beats 1.0.4 by a similar ratio as reported in the comment linked above:

df = pd.DataFrame(np.random.randn(len(index), 1), index=index)

# 1.0.4
%timeit df.std(level=1)
1.93 ms ± 2.55 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

# master
%timeit df.std(level=1)
1.6 ms ± 17.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

This makes this a duplicate of #34771, although previously that issue only mentioned var -- I've updated it to include std.

@CloseChoice
Copy link
Member

I profiled the call with both versions. Here is my script in case s.b. wants to reproduce:

import cProfile, pstats, io
from pstats import SortKey
import timeit
import pandas as pd
import numpy as np

levels = [np.arange(10), np.arange(100), np.arange(100)]
codes = [
    np.arange(10).repeat(10000),
    np.tile(np.arange(100).repeat(100), 10),
    np.tile(np.tile(np.arange(100), 100), 10),
]
index = pd.MultiIndex(levels=levels, codes=codes)
df = pd.DataFrame(np.random.randn(len(index), 4), index=index)
pr = cProfile.Profile(timeunit=0.000000001)
pr.enable()
df.std(level=1)
pr.disable()
s = io.StringIO()
sortby = SortKey.CUMULATIVE
ps = pstats.Stats(pr, stream=s).sort_stats(sortby)
ps.print_stats()
print(s.getvalue())

Here are the results (I changed the paths at some lines).

v1.0.4 (only first 50 lines)

         4214 function calls (3938 primitive calls) in 0.005 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.005    0.005 pandas/core/generic.py:11222(stat_func)
        1    0.000    0.000    0.005    0.005 pandas/core/generic.py:10094(_agg_by_level)
        1    0.000    0.000    0.003    0.003 pandas/core/groupby/groupby.py:1248(std)
        1    0.000    0.000    0.003    0.003 pandas/core/groupby/groupby.py:1271(var)
        1    0.000    0.000    0.003    0.003 pandas/core/groupby/generic.py:990(_cython_agg_general)
        1    0.000    0.000    0.003    0.003 pandas/core/groupby/generic.py:998(_cython_agg_blocks)
        1    0.000    0.000    0.002    0.002 pandas/core/groupby/ops.py:583(aggregate)
        1    0.000    0.000    0.002    0.002 pandas/core/groupby/ops.py:427(_cython_operation)
        1    0.000    0.000    0.002    0.002 pandas/core/frame.py:5736(groupby)
        1    0.000    0.000    0.002    0.002 pandas/core/groupby/groupby.py:362(__init__)
        1    0.000    0.000    0.002    0.002 pandas/core/groupby/grouper.py:426(get_grouper)
        1    0.000    0.000    0.001    0.001 pandas/core/groupby/ops.py:288(ngroups)
        1    0.000    0.000    0.001    0.001 pandas/core/groupby/ops.py:298(result_index)
        1    0.000    0.000    0.001    0.001 pandas/core/groupby/grouper.py:396(result_index)
        2    0.000    0.000    0.001    0.001 pandas/core/groupby/grouper.py:402(group_index)
        1    0.000    0.000    0.001    0.001 pandas/core/groupby/grouper.py:409(_make_codes)
        1    0.000    0.000    0.001    0.001 pandas/core/algorithms.py:585(factorize)
        1    0.001    0.001    0.001    0.001 pandas/core/groupby/ops.py:593(_aggregate)
        2    0.000    0.000    0.001    0.000 pandas/core/indexes/range.py:345(get_loc)
        2    0.000    0.000    0.001    0.000 pandas/core/indexes/base.py:2637(get_loc)
        1    0.000    0.000    0.001    0.001 pandas/core/indexes/multi.py:1561(get_level_values)
        1    0.000    0.000    0.001    0.001 pandas/core/indexes/multi.py:1535(_get_level_values)
        3    0.000    0.000    0.001    0.000 {method 'get_loc' of 'pandas._libs.index.IndexEngine' objects}
        1    0.000    0.000    0.001    0.001 /home/tobias/programming/python/pandas_tests/pandas_1_0_3/pandas/core/algorithms.py:456(_factorize_array)
        1    0.000    0.000    0.001    0.001 /home/tobias/programming/python/pandas_tests/pandas_1_0_3/pandas/core/groupby/grouper.py:563(is_in_axis)
        1    0.000    0.000    0.001    0.001 /home/tobias/programming/python/pandas_tests/pandas_1_0_3/pandas/core/indexes/base.py:888(__repr__)
        1    0.001    0.001    0.001    0.001 {method 'factorize' of 'pandas._libs.hashtable.Int64HashTable' objects}
        1    0.000    0.000    0.001    0.001 /home/tobias/programming/python/pandas_tests/pandas_1_0_3/pandas/core/indexes/base.py:924(_format_data)
        1    0.000    0.000    0.001    0.001 /home/tobias/programming/python/pandas_tests/pandas_1_0_3/pandas/io/formats/printing.py:280(format_object_summary)
        2    0.000    0.000    0.001    0.000 /home/tobias/programming/python/pandas_tests/pandas_1_0_3/pandas/core/algorithms.py:1565(take_nd)
        5    0.000    0.000    0.000    0.000 {pandas._libs.algos.ensure_int64}
        7    0.000    0.000    0.000    0.000 /home/tobias/programming/python/pandas_tests/pandas_1_0_3/pandas/core/indexes/base.py:276(__new__)
        7    0.000    0.000    0.000    0.000 /home/tobias/programming/python/pandas_tests/pandas_1_0_3/pandas/core/indexes/numeric.py:107(_shallow_copy)
        6    0.000    0.000    0.000    0.000 /home/tobias/programming/python/pandas_tests/pandas_1_0_3/pandas/core/indexes/base.py:544(_shallow_copy_with_infer)
        1    0.000    0.000    0.000    0.000 /home/tobias/programming/python/pandas_tests/pandas_1_0_3/pandas/core/internals/managers.py:708(get_numeric_data)
        1    0.000    0.000    0.000    0.000 /home/tobias/programming/python/pandas_tests/pandas_1_0_3/pandas/core/internals/managers.py:718(combine)
       17    0.000    0.000    0.000    0.000 /home/tobias/anaconda3/envs/pandas_1_0_3/lib/python3.8/abc.py:96(__instancecheck__)
       17    0.000    0.000    0.000    0.000 {built-in method _abc._abc_instancecheck}
        2    0.000    0.000    0.000    0.000 {pandas._libs.algos.take_1d_int64_int64}
        9    0.000    0.000    0.000    0.000 {pandas._libs.lib.is_list_like}
     95/9    0.000    0.000    0.000    0.000 /home/tobias/anaconda3/envs/pandas_1_0_3/lib/python3.8/abc.py:100(__subclasscheck__)
     95/9    0.000    0.000    0.000    0.000 {built-in method _abc._abc_subclasscheck}
       20    0.000    0.000    0.000    0.000 /home/tobias/programming/python/pandas_tests/pandas_1_0_3/pandas/io/formats/printing.py:275(<lambda>)
       20    0.000    0.000    0.000    0.000 /home/tobias/programming/python/pandas_tests/pandas_1_0_3/pandas/io/formats/printing.py:162(pprint_thing)
        4    0.000    0.000    0.000    0.000 /home/tobias/programming/python/pandas_tests/pandas_1_0_3/pandas/core/internals/managers.py:122(__init__)
        2    0.000    0.000    0.000    0.000 /home/tobias/programming/python/pandas_tests/pandas_1_0_3/pandas/core/algorithms.py:64(_ensure_data)

and master:

         3048 function calls (2862 primitive calls) in 0.006 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.006    0.006 pandas/core/generic.py:11413(stat_func)
        1    0.000    0.000    0.006    0.006 pandas/core/generic.py:10196(_agg_by_level)
        1    0.000    0.000    0.005    0.005 pandas/core/groupby/groupby.py:1419(std)
        1    0.003    0.003    0.005    0.005 pandas/core/groupby/groupby.py:2366(_get_cythonized_result)
        1    0.000    0.000    0.001    0.001 pandas/core/frame.py:6342(groupby)
        1    0.000    0.000    0.001    0.001 pandas/core/groupby/groupby.py:480(__init__)
        1    0.000    0.000    0.001    0.001 pandas/core/groupby/grouper.py:601(get_grouper)
        1    0.000    0.000    0.001    0.001 pandas/core/groupby/generic.py:1717(_wrap_aggregated_output)
        1    0.000    0.000    0.001    0.001 pandas/core/groupby/ops.py:268(group_info)
        1    0.000    0.000    0.001    0.001 pandas/core/groupby/ops.py:285(_get_compressed_codes)
        1    0.000    0.000    0.001    0.001 pandas/core/groupby/ops.py:229(codes)
        1    0.000    0.000    0.001    0.001 pandas/core/groupby/ops.py:231(<listcomp>)
        2    0.000    0.000    0.001    0.000 pandas/core/groupby/grouper.py:563(codes)
        1    0.000    0.000    0.001    0.001 pandas/core/groupby/grouper.py:582(_make_codes)
        1    0.000    0.000    0.001    0.001 pandas/core/algorithms.py:503(factorize)
        1    0.000    0.000    0.001    0.001 pandas/core/indexes/multi.py:1536(get_level_values)
        1    0.000    0.000    0.001    0.001 pandas/core/indexes/multi.py:1511(_get_level_values)
        1    0.000    0.000    0.001    0.001 pandas/core/algorithms.py:463(_factorize_array)
        1    0.000    0.000    0.001    0.001 pandas/core/algorithms.py:1586(take_nd)
        1    0.000    0.000    0.001    0.001 pandas/core/frame.py:436(__init__)
        1    0.001    0.001    0.001    0.001 {method 'factorize' of 'pandas._libs.hashtable.Int64HashTable' objects}
        1    0.000    0.000    0.001    0.001 /home/tobias/programming/python/pandas_tests/pandas/pandas/core/internals/construction.py:237(init_dict)
      7/3    0.000    0.000    0.000    0.000 /home/tobias/programming/python/pandas_tests/pandas/pandas/core/indexes/base.py:280(__new__)
        1    0.000    0.000    0.000    0.000 /home/tobias/programming/python/pandas_tests/pandas/pandas/core/internals/construction.py:60(arrays_to_mgr)
       13    0.000    0.000    0.000    0.000 {pandas._libs.lib.is_list_like}
       21    0.000    0.000    0.000    0.000 /home/tobias/anaconda3/envs/pandas-dev/lib/python3.8/abc.py:96(__instancecheck__)
        5    0.000    0.000    0.000    0.000 /home/tobias/programming/python/pandas_tests/pandas/pandas/core/groupby/generic.py:995(_iterate_slices)
        1    0.000    0.000    0.000    0.000 {pandas._libs.algos.take_1d_int64_int64}
       21    0.000    0.000    0.000    0.000 {built-in method _abc._abc_instancecheck}
        5    0.000    0.000    0.000    0.000 /home/tobias/programming/python/pandas_tests/pandas/pandas/core/frame.py:930(items)
     82/6    0.000    0.000    0.000    0.000 /home/tobias/anaconda3/envs/pandas-dev/lib/python3.8/abc.py:100(__subclasscheck__)
     82/6    0.000    0.000    0.000    0.000 {built-in method _abc._abc_subclasscheck}
        1    0.000    0.000    0.000    0.000 /home/tobias/programming/python/pandas_tests/pandas/pandas/core/internals/managers.py:1638(create_block_manager_from_arrays)
        4    0.000    0.000    0.000    0.000 /home/tobias/programming/python/pandas_tests/pandas/pandas/core/generic.py:3523(_get_item_cache)
      608    0.000    0.000    0.000    0.000 {built-in method builtins.isinstance}
        1    0.000    0.000    0.000    0.000 /home/tobias/programming/python/pandas_tests/pandas/pandas/core/internals/managers.py:1678(form_blocks)
        4    0.000    0.000    0.000    0.000 {pandas._libs.algos.ensure_int64}
        1    0.000    0.000    0.000    0.000 /home/tobias/programming/python/pandas_tests/pandas/pandas/core/groupby/grouper.py:752(is_in_obj)
        1    0.000    0.000    0.000    0.000 /home/tobias/programming/python/pandas_tests/pandas/pandas/core/frame.py:2827(__getitem__)
       49    0.000    0.000    0.000    0.000 /home/tobias/programming/python/pandas_tests/pandas/pandas/core/dtypes/base.py:256(is_dtype)
        4    0.000    0.000    0.000    0.000 /home/tobias/programming/python/pandas_tests/pandas/pandas/core/internals/managers.py:951(iget)
        5    0.000    0.000    0.000    0.000 /home/tobias/programming/python/pandas_tests/pandas/pandas/core/internals/blocks.py:2652(get_block_type)
        1    0.000    0.000    0.000    0.000 /home/tobias/programming/python/pandas_tests/pandas/pandas/core/internals/construction.py:329(_homogenize)
     14/6    0.000    0.000    0.000    0.000 {built-in method numpy.core._multiarray_umath.implement_array_function}
        4    0.000    0.000    0.000    0.000 /home/tobias/programming/python/pandas_tests/pandas/pandas/core/construction.py:389(sanitize_array)
        2    0.000    0.000    0.000    0.000 /home/tobias/programming/python/pandas_tests/pandas/pandas/core/indexes/base.py:5705(_maybe_cast_data_without_dtype)

Seems like we spend a lot more time in get_cythonized_result

@simonjayhawkins
Copy link
Member

moved off 1.1.1 milestone (scheduled for this week) as no PRs to fix in the pipeline

@simonjayhawkins simonjayhawkins modified the milestones: 1.1.2, 1.1.3 Sep 7, 2020
@simonjayhawkins
Copy link
Member

moved off 1.1.2 milestone (scheduled for this week) as no PRs to fix in the pipeline

@simonjayhawkins simonjayhawkins modified the milestones: 1.1.3, 1.1.4 Oct 5, 2020
@simonjayhawkins
Copy link
Member

moved off 1.1.3 milestone (overdue) as no PRs to fix in the pipeline

@simonjayhawkins simonjayhawkins modified the milestones: 1.1.4, 1.1.5 Oct 29, 2020
@simonjayhawkins
Copy link
Member

moved off 1.1.4 milestone (scheduled for release tomorrow) as no PRs to fix in the pipeline

@jreback jreback modified the milestones: 1.1.5, Contributions Welcome Nov 25, 2020
@mroeschke mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022
@jbrockmendel
Copy link
Member

In [3]: %timeit df.groupby(level=1).std()
2.06 ms ± 65.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)  # <- main
2.25 ms ± 56.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)  # <- 1.5.3
2.3 ms ± 124 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)  # <- 1.4.0

That's as far back as I can install easily

@rhshadrach
Copy link
Member

Thanks @jbrockmendel - closing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Performance Memory or execution speed performance Regression Functionality that used to work in a prior pandas version
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants