PERF: improved performance of multiindex slicing #10290

Merged
merged 2 commits into from
Jun 24, 2015

@jreback jreback commented Jun 5, 2015

closes #10287

master

In [22]: %timeit mdt2.loc[idx[test_A-eps_A:test_A+eps_A],:].loc[idx[:,test_B-eps_B:test_B+eps_B],:].loc[idx[:,:,test_C-eps_C:test_C+eps_C],:].loc[idx[:,:,:,test_D-eps_D:test_D+eps_D],:]
10 loops, best of 3: 141 ms per loop

In [23]: %timeit mdt2.loc[idx[test_A-eps_A:test_A+eps_A,test_B-eps_B:test_B+eps_B,test_C-eps_C:test_C+eps_C,test_D-eps_D:test_D+eps_D],:]
1 loops, best of 3: 4.23 s per loop

PR

# The chained version is actually a bit faster on master, but repeated chained indexing is discouraged anyway.
In [22]: %timeit mdt2.loc[idx[test_A-eps_A:test_A+eps_A],:].loc[idx[:,test_B-eps_B:test_B+eps_B],:].loc[idx[:,:,test_C-eps_C:test_C+eps_C],:].loc[idx[:,:,:,test_D-eps_D:test_D+eps_D],:]
1 loops, best of 3: 210 ms per loop

# This is the preferred method (you can also set values with it), and the difference is huge: 4.23 s on master vs. 425 ms here.
In [23]: %timeit mdt2.loc[idx[test_A-eps_A:test_A+eps_A,test_B-eps_B:test_B+eps_B,test_C-eps_C:test_C+eps_C,test_D-eps_D:test_D+eps_D],:]
1 loops, best of 3: 425 ms per loop
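
The PR does not show how `mdt2`, `idx`, or the `test_*`/`eps_*` values were built. The following is a minimal sketch with a small, made-up four-level frame. It is only meant to illustrate that the chained selection and the single `.loc` call select the same rows. The data, level names, and window sizes are assumptions, not the benchmark setup.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for mdt2: a 4-level MultiIndex built from a
# product of integer ranges, so it is unique and already sorted.
levels = [np.arange(10)] * 4
index = pd.MultiIndex.from_product(levels, names=list("ABCD"))
mdt2 = pd.DataFrame({"value": np.arange(len(index))}, index=index)

idx = pd.IndexSlice
# Assumed centre points and half-widths for the label windows.
test_A = test_B = test_C = test_D = 5
eps_A = eps_B = eps_C = eps_D = 2

# Chained selection: one .loc call per level.
chained = (
    mdt2.loc[idx[test_A - eps_A:test_A + eps_A], :]
        .loc[idx[:, test_B - eps_B:test_B + eps_B], :]
        .loc[idx[:, :, test_C - eps_C:test_C + eps_C], :]
        .loc[idx[:, :, :, test_D - eps_D:test_D + eps_D], :]
)

# Single selection: all four level slices in one .loc call.
single = mdt2.loc[
    idx[test_A - eps_A:test_A + eps_A,
        test_B - eps_B:test_B + eps_B,
        test_C - eps_C:test_C + eps_C,
        test_D - eps_D:test_D + eps_D],
    :,
]

# Both approaches pick the same rows (label slices are inclusive).
print(chained.equals(single), len(single))
```

Only the single-call form can also be used on the left-hand side of an assignment, which is why it is the one worth optimizing.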

master

In [10]: %timeit s3.isin([1,2])
100 loops, best of 3: 8.83 ms per loop

PR

In [2]: s3 = Series(np.random.randint(1, 10, 100000)).astype('int64')

In [5]: %timeit s3.isin([1,2])
100 loops, best of 3: 2.47 ms per loop
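
Outside IPython, the `isin` measurement can be approximated with the standard `timeit` module. This is a rough sketch rather than the benchmark used in the PR. The equality check against an explicit comparison is added here only as a sanity test, and absolute timings will differ by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Same construction as in the PR transcript.
s3 = pd.Series(np.random.randint(1, 10, 100000)).astype("int64")

# Sanity check: isin([1, 2]) should match an explicit comparison.
mask = s3.isin([1, 2])
expected = (s3 == 1) | (s3 == 2)
print(mask.equals(expected))

# Time 100 calls and report the mean per call in milliseconds.
n_calls = 100
elapsed = timeit.timeit(lambda: s3.isin([1, 2]), number=n_calls)
print(f"{elapsed / n_calls * 1e3:.2f} ms per call")
```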

@jreback jreback added Performance Memory or execution speed performance MultiIndex labels Jun 5, 2015
@jreback jreback added this to the 0.16.2 milestone Jun 5, 2015
jreback commented Jun 5, 2015

@sinhrks IIRC we had some discussion of a specialized ismember_int64; I have added it here.

@jreback jreback modified the milestones: 0.17.0, 0.16.2 Jun 7, 2015
jreback commented Jun 7, 2015

Going to defer this to 0.17.0, as the 3.2 build fails for some reason (and I'm not going through the trouble of creating a build environment); we are going to drop 3.2 in #9118. It's pretty hard to install 3.2.

jreback added a commit that referenced this pull request Jun 24, 2015
PERF: improved performance of multiindex slicing
@jreback jreback merged commit 4a4fe0b into pandas-dev:master Jun 24, 2015
Successfully merging this pull request may close these issues.

PERF: mutli-index selection vs repeated selections