PERF: 6x perf hit from using numexpr #5481

Closed
jtratner opened this issue Nov 10, 2013 · 7 comments · Fixed by #5482
Labels: Performance (Memory or execution speed performance)

Comments

@jtratner (Contributor)

On a 1,000,000 x 3 DataFrame, multiplication takes 142 ms with numexpr enabled but only 21.6 ms with it disabled. I'm going to investigate, but it would be helpful to know if anyone else can reproduce this.

In [10]: df = pd.DataFrame({"A": np.arange(1000000), "B": np.arange(1000000, 0, -1), "C": np.random.randn(1000000)})

In [11]: df * df
Out[11]:
<class 'pandas.core.frame.DataFrame'>
Int64Index: 1000000 entries, 0 to 999999
Data columns (total 3 columns):
A    1000000  non-null values
B    1000000  non-null values
C    1000000  non-null values
dtypes: float64(1), int64(2)

In [12]: %timeit df * df
10 loops, best of 3: 142 ms per loop

In [13]: pd.computation.expressions.set_use_numexpr(False)

In [14]: %timeit df * df
10 loops, best of 3: 21.6 ms per loop

In [15]: pd.computation.expressions.set_use_numexpr(True)

In [16]: %timeit df * df
10 loops, best of 3: 141 ms per loop

In [17]: len(df)
Out[17]: 1000000
@jtratner (Contributor, author)

INSTALLED VERSIONS
------------------
Python: 2.7.5.final.0
OS: Darwin 12.4.0 Darwin Kernel Version 12.4.0: Wed May  1 17:57:12 PDT 2013; root:xnu-2050.24.15~1/RELEASE_X86_64 x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.12.0-1081-ge684bdc
Cython: 0.19.1
Numpy: 1.7.1
Scipy: 0.12.0
statsmodels: Not installed
    patsy: Not installed
scikits.timeseries: Not installed
dateutil: 2.1
pytz: 2013d
bottleneck: Not installed
PyTables: 3.0.0
    numexpr: 2.2.1
matplotlib: 1.3.0
openpyxl: 1.6.2
xlrd: 0.9.2
xlwt: 0.7.5
xlsxwriter: Not installed
sqlalchemy: 0.8.2
lxml: 3.2.3
bs4: 4.3.2
html5lib: 1.0b3
bigquery: Not installed
apiclient: Not installed

@jtratner (Contributor, author)

Installing bottleneck doesn't help (not that it would really make sense if it did).

@dsm054 (Contributor) commented Nov 10, 2013

I can reproduce: python 2.7.4, pandas 0.12.0-1083-g9e2c3d6, numpy 1.9.0.dev-8a2728c, numexpr 2.2.2:

>>> pd.computation.expressions.set_use_numexpr(False)
>>> %timeit df * df
10 loops, best of 3: 28.4 ms per loop
>>> pd.computation.expressions.set_use_numexpr(True)
>>> %timeit df * df
10 loops, best of 3: 91.2 ms per loop

@jtratner (Contributor, author)

So prun suggests that infer_dtype is the main cost. I'm looking into it, but my guess is that it has to do with the different return type from numexpr.
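The profiling step described above can be reproduced outside of IPython with the standard-library profiler; a minimal sketch (frame construction copied from the report, everything else illustrative):

```python
# Profile 10 mixed-dtype multiplications with cProfile; if infer_dtype
# (or any other helper) dominates, it will show up in the top entries.
import cProfile
import io
import pstats

import numpy as np
import pandas as pd

df = pd.DataFrame({"A": np.arange(1_000_000),
                   "B": np.arange(1_000_000, 0, -1),
                   "C": np.random.randn(1_000_000)})

pr = cProfile.Profile()
pr.enable()
for _ in range(10):
    df * df
pr.disable()

s = io.StringIO()
pstats.Stats(pr, stream=s).sort_stats("cumulative").print_stats(10)
print(s.getvalue())
```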

@jtratner (Contributor, author)

Still happens in 0.12 as well

@jtratner (Contributor, author)

The issue is with mixed dtypes; float-only and int-only frames are both faster with numexpr, as expected. Maybe the columns are getting pushed together.
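The dtype layout of the three cases can be checked directly; a small sketch (variable names are illustrative, frames as in the report):

```python
# A mixed-dtype frame is stored per-dtype internally, so an arithmetic op
# runs once per dtype group rather than once for the whole frame.
import numpy as np
import pandas as pd

n = 1_000_000
mixed = pd.DataFrame({"A": np.arange(n), "B": np.arange(n, 0, -1),
                      "C": np.random.randn(n)})
ints = mixed[["A", "B"]]
floats = mixed[["C"]]

print(mixed.dtypes.nunique())   # 2 distinct dtypes (int + float)
print(ints.dtypes.nunique())    # 1
print(floats.dtypes.nunique())  # 1
```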

@jtratner (Contributor, author)

This is the result of multiplying each dataframe by itself 10 times (times are totals, not divided by 10):

Frames:

(pd.DataFrame({"A": np.arange(1000000), "B": np.arange(1000000, 0, -1),  # mixed
               "C": np.random.randn(1000000)}))
(pd.DataFrame({"A": np.arange(1000000), "B": np.arange(1000000, 0, -1)}))  # int
(pd.DataFrame({"C": np.random.randn(1000000)}))  # float
Mixed dtype
<class 'pandas.core.frame.DataFrame'>
Int64Index: 1000000 entries, 0 to 999999
Data columns (total 3 columns):
A    1000000  non-null values
B    1000000  non-null values
C    1000000  non-null values
dtypes: float64(1), int64(2)
No numexpr:   0.396526098251
With numexpr: 1.45339894295

Int only
<class 'pandas.core.frame.DataFrame'>
Int64Index: 1000000 entries, 0 to 999999
Data columns (total 2 columns):
A    1000000  non-null values
B    1000000  non-null values
dtypes: int64(2)
No numexpr:   0.120020866394
With numexpr: 0.0645599365234

Float only
<class 'pandas.core.frame.DataFrame'>
Int64Index: 1000000 entries, 0 to 999999
Data columns (total 1 columns):
C    1000000  non-null values
dtypes: float64(1)
No numexpr:   0.0512380599976
With numexpr: 0.0369510650635
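A rough reconstruction of the benchmark that would produce timings like those above (note: the internal module has since moved, so this sketch uses `pandas.core.computation.expressions` rather than the `pd.computation.expressions` path from the 0.12-era report; the helper names are illustrative):

```python
# Time 10 self-multiplications per frame, with and without numexpr,
# for the mixed, int-only, and float-only frames from the thread.
import time

import numpy as np
import pandas as pd
from pandas.core.computation import expressions

def bench(df, repeats=10):
    start = time.perf_counter()
    for _ in range(repeats):
        df * df
    return time.perf_counter() - start

n = 1_000_000
frames = {
    "Mixed dtype": pd.DataFrame({"A": np.arange(n), "B": np.arange(n, 0, -1),
                                 "C": np.random.randn(n)}),
    "Int only": pd.DataFrame({"A": np.arange(n), "B": np.arange(n, 0, -1)}),
    "Float only": pd.DataFrame({"C": np.random.randn(n)}),
}

for name, df in frames.items():
    expressions.set_use_numexpr(False)
    t_plain = bench(df)
    expressions.set_use_numexpr(True)
    t_ne = bench(df)
    print(f"{name}: no numexpr {t_plain:.3f}s, with numexpr {t_ne:.3f}s")
```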
