PERF: fix slow s.loc[[0]] #9127


Merged
merged 1 commit into pandas-dev:master on Dec 23, 2014

Conversation

shoyer
Member

@shoyer shoyer commented Dec 22, 2014

Fixes #9126

-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------
series_loc_list_like                         |   0.2840 | 125.0146 |   0.0023 |
series_loc_array                             |   0.9947 | 125.5813 |   0.0079 |
series_loc_scalar                            |   0.0430 |   0.0424 |   1.0150 |
series_loc_slice                             |   0.0647 |   0.0606 |   1.0668 |
-------------------------------------------------------------------------------

Whee!

I also wrote this fix for IntervalIndex (#8707); this change pulls it out separately. I believe it changes the complexity of the lookup check from O(n*m) for length n index and length m key to O(n+m).
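The complexity claim can be illustrated with a toy sketch (this is not the actual pandas internals, just the idea): the slow path effectively asked "is each of the m keys in the n-element index?" with a linear scan per key, while the fix hashes the index once and probes each key.

```python
# Hypothetical illustration of the O(n*m) -> O(n+m) change; function names
# are invented for this sketch and do not appear in pandas.

def contains_slow(index_values, keys):
    # O(n*m): one full linear scan of the index per key
    return [any(k == v for v in index_values) for k in keys]

def contains_fast(index_values, keys):
    # O(n+m): build a hash table once (O(n)), then probe each key (O(m))
    table = set(index_values)
    return [k in table for k in keys]

idx = range(100_000)
keys = [0, 5, 99_999, -1]
assert contains_slow(idx, keys) == contains_fast(idx, keys) == [True, True, True, False]
```

Both functions agree on every input; only the asymptotic cost differs, which matches the benchmark table above where the list-like lookups dropped from ~125 ms to well under 1 ms.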

@shoyer shoyer added the Performance Memory or execution speed performance label Dec 22, 2014
@shoyer shoyer added this to the 0.16.0 milestone Dec 22, 2014
@jorisvandenbossche
Member

@shoyer Maybe at once add the exact same tests for [], .ix[] and iloc[] ?
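The PR's actual test code is not shown in this thread, but the cross-indexer check being suggested might look something like the sketch below (using only indexers that still exist in modern pandas; `.ix` has since been removed):

```python
import numpy as np
import pandas as pd

# Sketch, not the PR's real tests: a list-like lookup through each indexer
# should return identical results, so a fast path added to one indexer must
# not change its behavior relative to the others.
s = pd.Series(np.arange(5.0))

by_loc = s.loc[[0]]
by_getitem = s[[0]]
by_iloc = s.iloc[[0]]

pd.testing.assert_series_equal(by_loc, by_getitem)
pd.testing.assert_series_equal(by_loc, by_iloc)
```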

@jreback
Contributor

jreback commented Dec 22, 2014

@shoyer looks good
needs a release note and it's good to go

@shoyer
Member Author

shoyer commented Dec 22, 2014

@jorisvandenbossche Good idea! Just amended in those tests.

I'll merge this once Travis gives the OK.

shoyer added a commit that referenced this pull request Dec 23, 2014
@shoyer shoyer merged commit ab20769 into pandas-dev:master Dec 23, 2014
@shoyer shoyer deleted the fix-slow-loc branch December 23, 2014 01:24
@jreback
Contributor

jreback commented Dec 23, 2014

@shoyer first merge! congrats

@jreback
Contributor

jreback commented Dec 23, 2014

@shoyer when the original post is from SO/ml then I usually post an update (in that location), something like: hey, this is now fixed and will be in the xxx release

@shoyer
Member Author

shoyer commented Dec 23, 2014

Done! Thanks for the reminder

On Mon, Dec 22, 2014 at 7:33 PM, jreback [email protected] wrote:

@shoyer when the original post is from SO/ml then I usually post an update - hey this is now fixed and will be in the xxx release


@sergeny

sergeny commented Dec 24, 2014

Thank you! I've also looked into an earlier issue #6683, and it looks like it still stands.

Is it basically true that .ix always outperforms .loc, .iloc, .at, and .iat, contrary to all of the documentation, and the only reason for not using .ix is better type or bounds checking?

a=pd.DataFrame(data=np.zeros([10000,6]))
In [80]: %timeit a.ix[2343,4]
100000 loops, best of 3: 4.32 µs per loop

In [81]: %timeit a.at[2343,4]
100000 loops, best of 3: 6.53 µs per loop
# different semantics, but it is slower than both .ix and .at. Why ?!
In [82]: %timeit a.iat[2343,4]
100000 loops, best of 3: 7.96 µs per loop

In [87]: %timeit a.ix[xrange(1000,2000),4]
1000 loops, best of 3: 306 µs per loop
In [88]: %timeit a.loc[xrange(1000,2000),4]
1000 loops, best of 3: 998 µs per loop
In [89]: %timeit a.iloc[xrange(1000,2000),4]
1000 loops, best of 3: 538 µs per loop

@sergeny

sergeny commented Dec 24, 2014

I see... .iat gets faster than .ix eventually, on a large dataset; and .iloc does get faster, but only on random sets of indices (which is what matters), not on ordered, contiguous sequences. Presumably, there is a special case in .ix to optimize that.
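For the ordered, contiguous case there is a simpler spelling anyway: a slice goes through pandas' slice fast path rather than list-like fancy indexing, and selects exactly the same rows. A quick equivalence check (assumed example, not from the timings above):

```python
import numpy as np
import pandas as pd

# Assumed example: a contiguous run of positions expressed two ways.
# The slice form avoids the list-like lookup path entirely.
df = pd.DataFrame(np.zeros([10000, 6]))

by_list = df.iloc[list(range(1000, 2000)), 4]
by_slice = df.iloc[1000:2000, 4]

pd.testing.assert_series_equal(by_list, by_slice)
```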

@jreback
Contributor

jreback commented Dec 24, 2014

@sergeny

I am completely puzzled why this analysis matters in the slightest. If you are doing a small number of indexing operations, then the microseconds of difference make no difference. If you are doing a large number of lookups, then this is completely the wrong approach. Simply look them up at the same time.

If for some reason you really want to look up individual values fast, just drop down to numpy.

In [4]: %timeit df.ix[2343,4]
100000 loops, best of 3: 4.27 us per loop

In [5]: %timeit df[4].values[2343]
100000 loops, best of 3: 3.19 us per loop

In [6]: x = df[4]

In [7]: x = df[4].values

In [8]: %timeit x[2343]
10000000 loops, best of 3: 87 ns per loop

Using an indexer to look up many values at once

In [9]: indexer = np.random.randint(0,len(df),size=1000)

In [10]: %timeit df.loc[indexer,4]
1000 loops, best of 3: 1.34 ms per loop

In [11]: %timeit df.ix[indexer,4]
1000 loops, best of 3: 1.27 ms per loop

In [12]: %timeit df.iloc[indexer,4]
1000 loops, best of 3: 484 us per loop

In [13]: %timeit x[indexer]
1000000 loops, best of 3: 1.74 us per loop

The point is that pandas wants to have a correct lookup first. .ix has some odd and surprising edge cases and is not strict enough for most folks because of its fallback integer indexing. The decision was made to supplant it with a suite of indexers that try hard to be very strictly correct in their behavior. This includes a variety of bounds and type checking.
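That strictness is easy to demonstrate (an assumed example, not from this thread; note that .ix was deprecated in pandas 0.20 and later removed, so the sketch uses only .loc/.iloc):

```python
import pandas as pd

# Illustration of the bounds and type checking mentioned above:
# .loc refuses a missing label, .iloc refuses an out-of-bounds position,
# rather than silently falling back to the other interpretation.
s = pd.Series([10, 20, 30], index=['a', 'b', 'c'])

try:
    s.loc['z']      # label does not exist -> KeyError
except KeyError:
    missing_label_raises = True

try:
    s.iloc[99]      # position out of bounds -> IndexError
except IndexError:
    out_of_bounds_raises = True

assert missing_label_raises and out_of_bounds_raises
assert s.loc['b'] == s.iloc[1] == 20
```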

Feel free to use whatever indexer you want.

@sergeny

sergeny commented Dec 24, 2014

Thank you, jreback. Now I have a very clear idea.

The advice about going down to numpy level is also very helpful; I did not think about that. My system is replying to real-time calls, and when it starts to process the call, it selects some data from a large dataframe. So yes, I will definitely consider switching to the numpy level after enough testing.

For other purposes, I'll switch back to .loc after a new release that fixes #9127.

Successfully merging this pull request may close these issues.

Why is DataFrame.loc[[1]] 1,800x slower than df.ix[[1]] and 3,500x slower than df.loc[1]?