PERF: optimize index.getitem for slice & boolean mask indexers #6440

immerrr · 2014-02-22T12:13:35Z

This patch fixes performance issues discussed in and closes #6370.

There's an API change though: no type inference will now happen upon resulting index, e.g. here's what it would look like on current master:

In [1]: pd.Index([1,2,3, 'a', 'b', 'c'])
Out[1]: Index([1, 2, 3, u'a', u'b', u'c'], dtype='object')

In [2]: _1[[0,1,2]]
Out[2]: Int64Index([1, 2, 3], dtype='int64')

And this is how it looks with the patch:

In [1]: pd.Index([1,2,3, 'a', 'b', 'c'])
Out[1]: Index([1, 2, 3, u'a', u'b', u'c'], dtype='object')

In [2]: _1[[0,1,2]]
Out[2]: Index([1, 2, 3], dtype='object')

On the bright side, I'd say using mixed-type indices seems a quite rare scenario and the performance improvement is worth it.

Performance Before:

In [1]: import pandas.util.testing as tm

In [2]: idx = tm.makeStringIndex(1000000)

In [3]: mask = np.arange(1000000) % 3 == 0

In [4]: series_mask = pd.Series(mask); mask
Out[4]: array([ True, False, False, ..., False, False,  True], dtype=bool)

In [5]: timeit idx[:-1]
100000 loops, best of 3: 1.57 µs per loop

In [6]: timeit idx[::2]
100 loops, best of 3: 14.6 ms per loop

In [7]: timeit idx[mask]
100 loops, best of 3: 15.4 ms per loop

In [8]: timeit idx[series_mask]
100 loops, best of 3: 15.4 ms per loop

In [9]: pd.__version__
Out[9]: '0.13.1-278-gaf63b99'

Performance After:

In [1]: import pandas.util.testing as tm

In [2]: idx = tm.makeStringIndex(1000000)

In [3]: mask = np.arange(1000000) % 3 == 0

In [4]: series_mask = pd.Series(mask)

In [5]: timeit idx[:-1]
1000000 loops, best of 3: 1.56 µs per loop

In [6]: timeit idx[::2]
100000 loops, best of 3: 4.89 µs per loop

In [7]: timeit idx[mask]
100 loops, best of 3: 11.4 ms per loop

In [8]: timeit idx[series_mask]
100 loops, best of 3: 11.4 ms per loop

In [9]: pd.__version__
Out[9]: '0.13.1-279-gbc810f0'

jreback · 2014-02-22T12:25:41Z

can u post any significant changes in a full vbench run

jreback · 2014-02-22T12:27:17Z

pandas/core/index.py

+                return result
+        except (IndexError):
+            if not len(key):
+                return self.__class__(np.empty(0, dtype=self.dtype),


use Index here

With or without fastpath?

I think u can just do Index([])

This will cause losing specific index type, i.e. something like that:

In [4]: pd.Index([1,2,3], name='int_index') Out[4]: Int64Index([1, 2, 3], dtype='int64') In [5]: _4[[]] Out[5]: Int64Index([], dtype='int64')

will return generic Index, not Int64Index as above. Is that OK?

IMO you're fixing the behaviour above where a sub-Series can have different class its super Series, for consistency this should also share class when empty. :)

Maybe self._constructor is better ?

Maybe class should be determined from dtype passed to the Index constructor:

In [11]: pd.Index([], dtype=int) # expect Int64Index ? Out[11]: Index([], dtype='object')

But I guess that is a sep issue.

Maybe self._constructor is better ?

I see it's implemented as self.__class__ in Index, so I assume it provides some extra functionality in other classes, but I can't grip the rationale: for Series it always will return Series, even if you subclass it, same for DataFrame. What exactly is it supposed to do?

I take it back, they are equivalent.

I think it's supposed to do just that, be used instead of Index/Series/DataFrame directly so these can be subclassed and methods will respect subclassing.

Anyway, this PR is really good (and a bug fix as well as perf).

I think it's supposed to do just that, be used instead of Index/Series/DataFrame directly so these can be subclassed and methods will respect subclassing.

This is exactly what seems strange to me: I usually use self.__class__ or type(self) if I need a base class method to recognize it can be subclassed, but OTOH looking at how _constructor is implemented in Series it appears to seal class hierarchy (unless you override it in derived class) and the whole point of this is unclear to me.

immerrr · 2014-02-22T12:28:40Z

Here's full vbench log: https://gist.github.com/immerrr/9153429

Interesting parts:

-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------
groupby_frame_apply_overhead                 |   8.4306 |  11.8597 |   0.7109 |
groupby_frame_apply                          |  45.0277 |  60.4376 |   0.7450 |
index_str_boolean_indexer                    |  10.9810 |  14.5193 |   0.7563 |
index_str_boolean_series_indexer             |  11.0783 |  14.5703 |   0.7603 |
....
frame_mask_bools                             |  14.3187 |  12.2263 |   1.1711 |
datamatrix_getitem_scalar                    |   0.0110 |   0.0083 |   1.3269 |

jreback · 2014-02-22T12:38:29Z

can u see if the get item scalar vbench repeats when u rerun?
eg is it an actual change

secondly why does the type inference fail on a selected series? that doesn't break any tests?

immerrr · 2014-02-22T13:02:17Z

can u see if the get item scalar vbench repeats when u rerun?

can you elaborate what "get item scalar vbench" means?

secondly why does the type inference fail on a selected series? that doesn't break any tests?

No, it doesn't break tests (as run by nosetests -w pandas on my Linux system).

As for why type inference fails, it's the type inference that actually causes the 1000x slowdown. I've got rid of it and the slice operation has become O(1) instead of O(n) again.

As for if this inference is necessary or not, my reasoning is as follows: one inference already happens when the initial index object is created and given that most of the time index keys come from one source and thus have same dtypes, it's very unlikely that slicing will change index type. If this assumption is true then in order to support such rare cases each index slicing operation must suffer O(n) penalty (which can easily be 1000x for 1M elements as shown).

jreback · 2014-02-22T13:12:44Z

datamatrix_getitem_scalar | 0.0110 | 0.0083 | 1.3269 |

is what I meant (this is a hard to measure test its in mu....that's why I don't think it will repeat)

ok on the type inference then.

Can you add a test that uses the example from above (and shows that the resultant is an Index type). Just (and doc to this issue), just that it is referenced and know and will break if its changed later.

immerrr · 2014-02-22T13:15:40Z

Will do tomorrow.

jreback · 2014-02-22T21:44:31Z

ahh ok

then leave it like it is

immerrr · 2014-02-23T10:37:03Z

Apparently, datamatrix_getitem_scalar result is not relevant:

$ ./test_perf.sh -H -N 5 -r datamatrix_getitem_scalar
CPU affinity set to 1


*** LOG_FILE = /home/immerrr/sources/pandas/vb_suite.log


Performing 1 benchmarks (5 runs each)
.

*** 
Invoked with :
--ncalls: 3
--repeats: 3


-----------------------------------------------------------------------------------------------------
Test name                                    |    #0    |    #1    |    #2    |    #3    |    #4    |
-----------------------------------------------------------------------------------------------------
datamatrix_getitem_scalar                    |   0.0083 |   0.0087 |   0.0087 |   0.0083 |   0.0083 |
-----------------------------------------------------------------------------------------------------
Test name                                    |    #0    |    #1    |    #2    |    #3    |    #4    |
-----------------------------------------------------------------------------------------------------

Ratio < 1.0 means the target commit is faster then the baseline.
Seed used: 1234

Target [8801089] : TST: make sure index type doesn't change

jreback · 2014-02-23T21:55:01Z

can you add a release notes mention..otherwise looks good to go

immerrr · 2014-02-25T09:00:57Z

@jreback would you please look here:

9f0dc3b#diff-dc96e62847cdfe77b570cf9c6b0e8086R629

It appears that the if not len(key): result = [] piece is part of allowing iloc array indexers to include out-of-bounds indices without raising an error. According to #6296, that's supposed to be rolled back in #6443 and I'd guess it should be removed here as well, right?

jreback · 2014-02-25T10:09:39Z

prove it with a test case

their are so many paths fife inspection often is faulty

immerrr · 2014-02-27T19:21:48Z

I've removed the special handling for empty key case and added a test for it. If that's OK, I'll squash and bump the changelog.

Btw, speaking of changelog, as @hayd pointed out, aside from perf improvements this also changes API somewhat. How do I categorize this PR?

jreback · 2014-02-27T19:32:13Z

put it in the release.rst API section (the 1-liner reffing this issue). Plus do a small example in v0.14.0.txt in the API section. Show a 'common' operation, e.g. create a frame with a mixed index, then slice it and show that the index is not inferred as before (the before should be a code-block)

hayd · 2014-02-27T19:47:29Z

IMO the old behaviour was undocumented so current categorization is all good.

Perhaps worth saying in release notes you can/must be explicit if you want to convert the sub-index (e.g in your example above using _1[[0,1,2]].astype(int)).

immerrr · 2014-02-28T08:37:31Z

Rebased & bumped the changelog.

Some commits look rather independent from the others to me, do you want them folded into one?

jreback · 2014-02-28T11:53:31Z

you can squash them all into 1
or 2 if u want

immerrr · 2014-02-28T14:06:31Z

Done

PERF: optimize index.__getitem__ for slice & boolean mask indexers

jreback · 2014-02-28T15:21:13Z

thanks!

jreback reviewed Feb 22, 2014
View reviewed changes

jreback added Performance labels Feb 22, 2014

jreback added this to the 0.14.0 milestone Feb 22, 2014

jreback mentioned this pull request Feb 26, 2014

PERF: non-info axes slicing on Panels is slow #6484

Closed

PERF: optimize index.__getitem__ for slice & boolean mask indexers

9788ad1

jreback added a commit that referenced this pull request Feb 28, 2014

Merge pull request #6440 from immerrr/index-getitem-performance

f149927

PERF: optimize index.__getitem__ for slice & boolean mask indexers

jreback merged commit f149927 into pandas-dev:master Feb 28, 2014

immerrr deleted the index-getitem-performance branch February 28, 2014 16:15

This was referenced Feb 28, 2014

BLD: add benchmarks for all single-axis indexers #6450

Closed

BUG/TST: fix several issues with slice bound checking code #6531

Merged

immerrr restored the index-getitem-performance branch March 3, 2014 17:14

immerrr deleted the index-getitem-performance branch March 3, 2014 17:15

immerrr mentioned this pull request May 5, 2014

PERF: optimize Index.delete for dtype=object #7040

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

PERF: optimize index.getitem for slice & boolean mask indexers #6440

PERF: optimize index.getitem for slice & boolean mask indexers #6440

immerrr commented Feb 22, 2014

jreback commented Feb 22, 2014

jreback Feb 22, 2014

immerrr Feb 22, 2014

jreback Feb 22, 2014

immerrr Feb 22, 2014

hayd Feb 23, 2014

immerrr Feb 23, 2014

hayd Feb 23, 2014

immerrr Feb 24, 2014

immerrr commented Feb 22, 2014

jreback commented Feb 22, 2014

immerrr commented Feb 22, 2014

jreback commented Feb 22, 2014

immerrr commented Feb 22, 2014

jreback commented Feb 22, 2014

immerrr commented Feb 23, 2014

jreback commented Feb 23, 2014

immerrr commented Feb 25, 2014

jreback commented Feb 25, 2014

immerrr commented Feb 27, 2014

jreback commented Feb 27, 2014

hayd commented Feb 27, 2014

immerrr commented Feb 28, 2014

jreback commented Feb 28, 2014

immerrr commented Feb 28, 2014

jreback commented Feb 28, 2014

PERF: optimize index.__getitem__ for slice & boolean mask indexers #6440

PERF: optimize index.__getitem__ for slice & boolean mask indexers #6440

Conversation

immerrr commented Feb 22, 2014

Performance Before:

Performance After:

jreback commented Feb 22, 2014

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

immerrr commented Feb 22, 2014

jreback commented Feb 22, 2014

immerrr commented Feb 22, 2014

jreback commented Feb 22, 2014

immerrr commented Feb 22, 2014

jreback commented Feb 22, 2014

immerrr commented Feb 23, 2014

jreback commented Feb 23, 2014

immerrr commented Feb 25, 2014

jreback commented Feb 25, 2014

immerrr commented Feb 27, 2014

jreback commented Feb 27, 2014

hayd commented Feb 27, 2014

immerrr commented Feb 28, 2014

jreback commented Feb 28, 2014

immerrr commented Feb 28, 2014

jreback commented Feb 28, 2014

PERF: optimize index.getitem for slice & boolean mask indexers #6440

PERF: optimize index.getitem for slice & boolean mask indexers #6440