
Performance issues with groupby for large values of ngroups #8426


Closed
dlovell opened this issue Sep 30, 2014 · 1 comment
Labels
Groupby, Performance (memory or execution speed performance)

Comments

@dlovell
Contributor

dlovell commented Sep 30, 2014

Per @jreback: #8410 (comment)

Groupby for large values of ngroups, 10000 in this case, is very slow.

Invoked with :
--ncalls: 3
--repeats: 3


---------------------------------------------------------
Test name                                    |    #0    |
---------------------------------------------------------

...

groupby_large_ngroups_value_counts           | 23527.0356 |
groupby_large_ngroups_nunique                | 19496.5300 |
groupby_large_ngroups_describe               | 82329.8963 |
groupby_large_ngroups_mad                    | 21441.5947 |
groupby_large_ngroups_pct_change             | 18865.5750 |

See also: https://gist.github.com/dlovell/ea3400273314e7612f6e
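
A rough way to reproduce timings of this kind is sketched below. The data layout (an integer key column with 10000 distinct groups and an integer value column), the helper names `make_frame` and `time_groupby`, and the row count are assumptions for illustration; they are not taken from the gist or from the original vbench suite.

```python
import time

import numpy as np
import pandas as pd


def make_frame(ngroups=10_000, size=None, seed=0):
    """Build a frame with roughly `ngroups` distinct keys (hypothetical layout)."""
    if size is None:
        size = ngroups * 2  # assumed row count, not from the original benchmark
    rng = np.random.default_rng(seed)
    return pd.DataFrame(
        {
            "key": rng.integers(0, ngroups, size=size),
            "value": rng.integers(0, size, size=size),
        }
    )


def time_groupby(df, method, repeats=3):
    """Return the best wall-clock time in milliseconds, or None if unsupported."""
    grouped = df.groupby("key")["value"]
    if not hasattr(grouped, method):
        # e.g. `mad` is not available in recent pandas releases
        return None
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        getattr(grouped, method)()
        best = min(best, time.perf_counter() - start)
    return best * 1000.0
```

Calling `time_groupby(make_frame(), "describe")`, and likewise for `value_counts`, `nunique`, `mad`, and `pct_change`, gives one number per method. The results depend on the machine and the pandas version, so they are not directly comparable to the figures in the table above.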

@jreback jreback added the Groupby and Performance (memory or execution speed performance) labels Sep 30, 2014
@jreback jreback added this to the 0.15.1 milestone Sep 30, 2014
@jreback jreback modified the milestones: 0.16.0, Next Major Release Mar 6, 2015
@TomAugspurger
Contributor

Not immediately clear if this is still an issue.

Can reopen with a runnable example (preferably an asv benchmark).
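
For anyone picking this up, an asv benchmark for the case might look roughly like the sketch below. It follows the usual asv conventions (`params`, `param_names`, `setup`, and a `time_` method); the class name, data layout, and row count are hypothetical and are not taken from the pandas benchmark suite. `mad` is left out of the parameter list because recent pandas releases no longer provide it.

```python
import numpy as np
import pandas as pd


class GroupByLargeNgroups:
    # Hypothetical benchmark class; asv discovers methods prefixed with `time_`.
    params = ["value_counts", "nunique", "describe", "pct_change"]
    param_names = ["method"]
    ngroups = 10_000  # group count reported in this issue

    def setup(self, method):
        size = self.ngroups * 2  # assumed row count
        rng = np.random.default_rng(0)
        self.df = pd.DataFrame(
            {
                "key": rng.integers(0, self.ngroups, size=size),
                "value": rng.integers(0, size, size=size),
            }
        )
        # Build the groupby object up front so only the method call is timed.
        self.grouped = self.df.groupby("key")["value"]

    def time_method(self, method):
        getattr(self.grouped, method)()
```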

@TomAugspurger TomAugspurger modified the milestones: Contributions Welcome, No action Jul 6, 2018