PERF: groupby-cummax,cummin #15048

Closed
jreback opened this issue Jan 3, 2017 · 0 comments
Labels
Groupby, Performance (Memory or execution speed performance)

@jreback
Contributor

jreback commented Jan 3, 2017

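groupby `cummax` and `cummin` could be implemented in Cython. This should be very straightforward, as we already have a lot of templated code. In the timings below they are several hundred times slower than `cumsum` and `cumprod`: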

In [1]: np.random.seed(1234)

In [4]: G = 1000

In [5]: N = 10000

In [6]: df = pd.DataFrame({'A':np.random.randint(0,G,size=N),'B':np.random.randn(N)})

In [7]: %timeit df.groupby('A').cumsum()
1000 loops, best of 3: 1.27 ms per loop

In [8]: %timeit df.groupby('A').cummax()
1 loop, best of 3: 799 ms per loop

In [9]: %timeit df.groupby('A').cummin()
1 loop, best of 3: 796 ms per loop

In [10]: %timeit df.groupby('A').cumprod()
1000 loops, best of 3: 1.26 ms per loop
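
For illustration, here is a minimal pure-Python sketch of the kind of single-pass loop a Cython kernel for group-wise `cummax` might use. It is not the code that was eventually merged. The function name `group_cummax_sketch`, its signature, and the NaN handling (NaN inputs yield NaN and leave the running maximum untouched) are assumptions made for this example. Group keys are assumed to be already factorized into integer labels in `0..ngroups-1`, with negative labels marking rows that belong to no group.

```python
import math


def group_cummax_sketch(values, labels, ngroups):
    """Running maximum of `values` within each group, in a single pass.

    Hypothetical helper for illustration only; not a pandas API.
    """
    # One accumulator slot per group, starting below every possible value.
    running = [-math.inf] * ngroups
    # Rows that are skipped (NaN value or no group) stay NaN in the output.
    out = [math.nan] * len(values)

    for i, (val, lab) in enumerate(zip(values, labels)):
        # Assumption: a negative label means "no group"; NaN values are skipped.
        if lab < 0 or math.isnan(val):
            continue
        if val > running[lab]:
            running[lab] = val
        out[i] = running[lab]
    return out


values = [1.0, 5.0, 3.0, 2.0, math.nan, 4.0]
labels = [0, 1, 0, 1, 0, 0]
result = group_cummax_sketch(values, labels, ngroups=2)
print(result)
```

The loop touches each row once and keeps one accumulator per group, so the cost is linear in the number of rows regardless of how many groups there are. A `cummin` variant would start the accumulators at `math.inf` and flip the comparison.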
@jreback added the Difficulty Intermediate, Groupby, and Performance (Memory or execution speed performance) labels Jan 3, 2017
@jreback added this to the Next Major Release milestone Jan 3, 2017
mroeschke added a commit to mroeschke/pandas that referenced this issue Jan 4, 2017
mroeschke added a commit to mroeschke/pandas that referenced this issue Jan 5, 2017
mroeschke added a commit to mroeschke/pandas that referenced this issue Jan 6, 2017
mroeschke added a commit to mroeschke/pandas that referenced this issue Jan 7, 2017
pep8, removed args, kwargs

add whatsnew + test + changed logic

small error in algo

Use a more obvious test

Fixed algo & test passed
mroeschke added a commit to mroeschke/pandas that referenced this issue Jan 8, 2017
pep8, removed args, kwargs

add whatsnew + test + changed logic

small error in algo

Use a more obvious test

Fixed algo & test passed

Add dtypes test

Add additional tests
mroeschke added a commit to mroeschke/pandas that referenced this issue Jan 8, 2017
pep8, removed args, kwargs

add whatsnew + test + changed logic

small error in algo

Use a more obvious test

Fixed algo & test passed

Add dtypes test

Add additional tests

handle nan case
mroeschke added a commit to mroeschke/pandas that referenced this issue Jan 9, 2017
pep8, removed args, kwargs

add whatsnew + test + changed logic

small error in algo

Use a more obvious test

Fixed algo & test passed

Add dtypes test

Add additional tests

handle nan case

Adapt max/min for different dtypes + tests
mroeschke added a commit to mroeschke/pandas that referenced this issue Jan 9, 2017
pep8, removed args, kwargs

add whatsnew + test + changed logic

small error in algo

Use a more obvious test

Fixed algo & test passed

Add dtypes test

Add additional tests

handle nan case

Adapt max/min for different dtypes + tests

remove unnecessary comments
mroeschke added a commit to mroeschke/pandas that referenced this issue Jan 11, 2017
pep8, removed args, kwargs

add whatsnew + test + changed logic

small error in algo

Use a more obvious test

Fixed algo & test passed

Add dtypes test

Add additional tests

handle nan case

Adapt max/min for different dtypes + tests

remove unnecessary comments

Added test & adjust algorithm
mroeschke added a commit to mroeschke/pandas that referenced this issue Jan 11, 2017
pep8, removed args, kwargs

add whatsnew + test + changed logic

small error in algo

Use a more obvious test

Fixed algo & test passed

Add dtypes test

Add additional tests

handle nan case

Adapt max/min for different dtypes + tests

remove unnecessary comments

Added test & adjust algorithm

Fix linting errors
@jreback modified the milestones: 0.20.0, Next Major Release Jan 11, 2017
AnkurDedania pushed a commit to AnkurDedania/pandas that referenced this issue Mar 21, 2017
closes pandas-dev#15048

Author: Matt Roeschke <[email protected]>

Closes pandas-dev#15053 from mroeschke/improve_cummin_cummax and squashes the following commits:

5e8ba63 [Matt Roeschke] PERF: Cythonize Groupby.cummin/cummax (pandas-dev#15048)