Commit 38a95d1

Merge pull request pandas-dev#10257 from jreback/numba
add numba example to enhancingperf.rst
2 parents e182793 + 2e20eb7 commit 38a95d1

2 files changed: +72 -16 lines changed
doc/source/enhancingperf.rst

+70 -16

@@ -7,7 +7,7 @@

    import os
    import csv
-   from pandas import DataFrame
+   from pandas import DataFrame, Series
    import pandas as pd
    pd.options.display.max_rows=15

@@ -68,9 +68,10 @@ Here's the function in pure python:

 We achieve our result by using ``apply`` (row-wise):

-.. ipython:: python
+.. code-block:: python

-   %timeit df.apply(lambda x: integrate_f(x['a'], x['b'], x['N']), axis=1)
+   In [7]: %timeit df.apply(lambda x: integrate_f(x['a'], x['b'], x['N']), axis=1)
+   10 loops, best of 3: 174 ms per loop

 But clearly this isn't fast enough for us. Let's take a look and see where the
 time is spent during this operation (limited to the most time consuming
@@ -97,7 +98,7 @@ First we're going to need to import the cython magic function to ipython:

 .. ipython:: python

-   %load_ext cythonmagic
+   %load_ext Cython


 Now, let's simply copy our functions over to cython as is (the suffix
@@ -122,9 +123,10 @@ is here to distinguish between function versions):
 to be using bleeding edge ipython for paste to play well with cell magics.


-.. ipython:: python
+.. code-block:: python

-   %timeit df.apply(lambda x: integrate_f_plain(x['a'], x['b'], x['N']), axis=1)
+   In [4]: %timeit df.apply(lambda x: integrate_f_plain(x['a'], x['b'], x['N']), axis=1)
+   10 loops, best of 3: 85.5 ms per loop

 Already this has shaved a third off, not too bad for a simple copy and paste.

@@ -150,9 +152,10 @@ We get another huge improvement simply by providing type information:
    ...:     return s * dx
    ...:

-.. ipython:: python
+.. code-block:: python

-   %timeit df.apply(lambda x: integrate_f_typed(x['a'], x['b'], x['N']), axis=1)
+   In [4]: %timeit df.apply(lambda x: integrate_f_typed(x['a'], x['b'], x['N']), axis=1)
+   10 loops, best of 3: 20.3 ms per loop

 Now, we're talking! It's now over ten times faster than the original python
 implementation, and we haven't *really* modified the code. Let's have another
@@ -229,9 +232,10 @@ the rows, applying our ``integrate_f_typed``, and putting this in the zeros arra
 Loops like this would be *extremely* slow in python, but in Cython looping
 over numpy arrays is *fast*.

-.. ipython:: python
+.. code-block:: python

-   %timeit apply_integrate_f(df['a'].values, df['b'].values, df['N'].values)
+   In [4]: %timeit apply_integrate_f(df['a'].values, df['b'].values, df['N'].values)
+   1000 loops, best of 3: 1.25 ms per loop

 We've gotten another big improvement. Let's check again where the time is spent:

@@ -278,20 +282,70 @@ advanced cython techniques:
    ...:     return res
    ...:

-.. ipython:: python
+.. code-block:: python

-   %timeit apply_integrate_f_wrap(df['a'].values, df['b'].values, df['N'].values)
+   In [4]: %timeit apply_integrate_f_wrap(df['a'].values, df['b'].values, df['N'].values)
+   1000 loops, best of 3: 987 us per loop

 Even faster, with the caveat that a bug in our cython code (an off-by-one error,
 for example) might cause a segfault because memory access isn't checked.


-Further topics
-~~~~~~~~~~~~~~
+.. _enhancingperf.numba:
+
+Using numba
+-----------
+
+A recent alternative to statically compiling cython code, is to use a *dynamic jit-compiler*, ``numba``.
+
+Numba gives you the power to speed up your applications with high performance functions written directly in Python. With a few annotations, array-oriented and math-heavy Python code can be just-in-time compiled to native machine instructions, similar in performance to C, C++ and Fortran, without having to switch languages or Python interpreters.
+
+Numba works by generating optimized machine code using the LLVM compiler infrastructure at import time, runtime, or statically (using the included pycc tool). Numba supports compilation of Python to run on either CPU or GPU hardware, and is designed to integrate with the Python scientific software stack.
+
+.. note::
+
+   You will need to install ``numba``. This is easy with ``conda``, by using: ``conda install numba``, see :ref:`installing using miniconda<install.miniconda>`.
+
+We simply take the plain python code from above and annotate with the ``@jit`` decorator.
+
+.. code-block:: python
+
+   import numba
+
+   @numba.jit
+   def f_plain(x):
+       return x * (x - 1)
+
+   @numba.jit
+   def integrate_f_numba(a, b, N):
+       s = 0
+       dx = (b - a) / N
+       for i in range(N):
+           s += f_plain(a + i * dx)
+       return s * dx
+
+   @numba.jit
+   def apply_integrate_f_numba(col_a, col_b, col_N):
+       n = len(col_N)
+       result = np.empty(n, dtype='float64')
+       assert len(col_a) == len(col_b) == n
+       for i in range(n):
+           result[i] = integrate_f_numba(col_a[i], col_b[i], col_N[i])
+       return result
+
+   def compute_numba(df):
+       result = apply_integrate_f_numba(df['a'].values, df['b'].values, df['N'].values)
+       return Series(result, index=df.index, name='result')
+
+Similar to above, we directly pass ``numpy`` arrays directly to the numba function. Further
+we are wrapping the results to provide a nice interface by passing/returning pandas objects.
+
+.. code-block:: python

-- Loading C modules into cython.
+   In [4]: %timeit compute_numba(df)
+   1000 loops, best of 3: 798 us per loop

-Read more in the `cython docs <http://docs.cython.org/>`__.
+Read more in the `numba docs <http://numba.pydata.org/>`__.

 .. _enhancingperf.eval:

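
For readers who want to try the pattern this commit documents, here is a minimal sketch that is not taken from the commit itself. It applies ``@jit`` to the plain Python integration function, as the new "Using numba" section describes. Two things are assumptions made purely for illustration: the identity-decorator fallback used when ``numba`` is not installed, and the sample ``rows`` data.

```python
# Sketch only: the fallback decorator and the sample rows are illustrative
# assumptions, not part of the pandas documentation change.
try:
    from numba import jit  # optional dependency
except ImportError:
    def jit(func):
        # Hypothetical stand-in: leaves the function as plain Python.
        return func


@jit
def f_plain(x):
    return x * (x - 1)


@jit
def integrate_f(a, b, N):
    # Left Riemann sum of f_plain over [a, b] using N steps.
    s = 0.0
    dx = (b - a) / N
    for i in range(N):
        s += f_plain(a + i * dx)
    return s * dx


# Each tuple plays the role of one DataFrame row with columns a, b, N.
rows = [(0.0, 1.0, 1000), (0.0, 2.0, 1000), (1.0, 3.0, 500)]
results = [integrate_f(a, b, N) for a, b, N in rows]

# The first entry approximates the exact integral of x * (x - 1) on [0, 1],
# which is -1/6.
print(results[0])
```

With ``numba`` installed, the first call to ``integrate_f`` includes compilation time, so any timing comparison should be made on later calls.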

doc/source/whatsnew/v0.16.2.txt

+2

@@ -9,6 +9,8 @@ We recommend that all users upgrade to this version.

 Highlights include:

+- Documentation on how to use ``numba`` with *pandas*, see :ref:`here <enhancingperf.numba>`
+
 Check the :ref:`API Changes <whatsnew_0162.api>` before updating.

 .. contents:: What's new in v0.16.2
