Commit 4494266

thoo authored and Pingviinituutti committed
Fix flake8 issues in doc/source/enhancingperf.rst (pandas-dev#24772)
1 parent 6a1cbd3 commit 4494266

File tree

2 files changed: +48 −45 lines

doc/source/enhancingperf.rst (+48 −44)
@@ -73,7 +73,7 @@ four calls) using the `prun ipython magic function <http://ipython.org/ipython-d

 .. ipython:: python

-   %prun -l 4 df.apply(lambda x: integrate_f(x['a'], x['b'], x['N']), axis=1)
+   %prun -l 4 df.apply(lambda x: integrate_f(x['a'], x['b'], x['N']), axis=1)  # noqa E999

 By far the majority of time is spent inside either ``integrate_f`` or ``f``,
 hence we'll concentrate our efforts cythonizing these two functions.
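For reference, the plain-Python pair being profiled in this hunk can be sketched as below. The bodies mirror ``f_plain`` and ``integrate_f_numba`` from later in this diff; treat the exact formulation as a sketch rather than the file's verbatim source.

```python
def f(x):
    # the integrand being profiled; same body as f_plain later in the diff
    return x * (x - 1)


def integrate_f(a, b, N):
    # left Riemann sum of f over [a, b] with N steps
    s = 0
    dx = (b - a) / N
    for i in range(N):
        s += f(a + i * dx)
    return s * dx
```

Since the integral of ``x * (x - 1)`` over [0, 1] is -1/6, the sum converges to that value as N grows.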
@@ -189,8 +189,10 @@ in Python, so maybe we could minimize these by cythonizing the apply part.
    ...:     for i in range(N):
    ...:         s += f_typed(a + i * dx)
    ...:     return s * dx
-   ...: cpdef np.ndarray[double] apply_integrate_f(np.ndarray col_a, np.ndarray col_b, np.ndarray col_N):
-   ...:     assert (col_a.dtype == np.float and col_b.dtype == np.float and col_N.dtype == np.int)
+   ...: cpdef np.ndarray[double] apply_integrate_f(np.ndarray col_a, np.ndarray col_b,
+   ...:                                            np.ndarray col_N):
+   ...:     assert (col_a.dtype == np.float
+   ...:             and col_b.dtype == np.float and col_N.dtype == np.int)
    ...:     cdef Py_ssize_t i, n = len(col_N)
    ...:     assert (len(col_a) == len(col_b) == n)
    ...:     cdef np.ndarray[double] res = np.empty(n)
@@ -271,7 +273,9 @@ advanced Cython techniques:
    ...:     return s * dx
    ...: @cython.boundscheck(False)
    ...: @cython.wraparound(False)
-   ...: cpdef np.ndarray[double] apply_integrate_f_wrap(np.ndarray[double] col_a, np.ndarray[double] col_b, np.ndarray[int] col_N):
+   ...: cpdef np.ndarray[double] apply_integrate_f_wrap(np.ndarray[double] col_a,
+   ...:                                                 np.ndarray[double] col_b,
+   ...:                                                 np.ndarray[int] col_N):
    ...:     cdef int i, n = len(col_N)
    ...:     assert len(col_a) == len(col_b) == n
    ...:     cdef np.ndarray[double] res = np.empty(n)
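The two decorators in this hunk trade safety for speed: ``@cython.boundscheck(False)`` drops the per-access bounds check, and ``@cython.wraparound(False)`` drops Python-style negative-index handling. A quick NumPy illustration (plain Python, just showing the indexing convention being disabled):

```python
import numpy as np

arr = np.array([10, 20, 30])

# Python/NumPy "wraparound": a negative index counts from the end ...
assert arr[-1] == 30
# ... which is the translation @cython.wraparound(False) removes: Cython then
# treats every index as a plain non-negative offset. @cython.boundscheck(False)
# similarly skips the check that would raise IndexError out of range.
```

With both off, an out-of-range or negative index in the Cython code is undefined behavior rather than an exception, which is why they are only applied once the code is known correct.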
@@ -317,45 +321,45 @@ take the plain Python code from above and annotate with the ``@jit`` decorator.

 .. code-block:: python

-   import numba
+    import numba


-   @numba.jit
-   def f_plain(x):
-       return x * (x - 1)
+    @numba.jit
+    def f_plain(x):
+        return x * (x - 1)


-   @numba.jit
-   def integrate_f_numba(a, b, N):
-       s = 0
-       dx = (b - a) / N
-       for i in range(N):
-           s += f_plain(a + i * dx)
-       return s * dx
+    @numba.jit
+    def integrate_f_numba(a, b, N):
+        s = 0
+        dx = (b - a) / N
+        for i in range(N):
+            s += f_plain(a + i * dx)
+        return s * dx


-   @numba.jit
-   def apply_integrate_f_numba(col_a, col_b, col_N):
-       n = len(col_N)
-       result = np.empty(n, dtype='float64')
-       assert len(col_a) == len(col_b) == n
-       for i in range(n):
-           result[i] = integrate_f_numba(col_a[i], col_b[i], col_N[i])
-       return result
+    @numba.jit
+    def apply_integrate_f_numba(col_a, col_b, col_N):
+        n = len(col_N)
+        result = np.empty(n, dtype='float64')
+        assert len(col_a) == len(col_b) == n
+        for i in range(n):
+            result[i] = integrate_f_numba(col_a[i], col_b[i], col_N[i])
+        return result


-   def compute_numba(df):
-       result = apply_integrate_f_numba(df['a'].values, df['b'].values,
-                                        df['N'].values)
-       return pd.Series(result, index=df.index, name='result')
+    def compute_numba(df):
+        result = apply_integrate_f_numba(df['a'].values, df['b'].values,
+                                         df['N'].values)
+        return pd.Series(result, index=df.index, name='result')

 Note that we directly pass NumPy arrays to the Numba function. ``compute_numba`` is just a wrapper that provides a
 nicer interface by passing/returning pandas objects.

 .. code-block:: ipython

-   In [4]: %timeit compute_numba(df)
-   1000 loops, best of 3: 798 us per loop
+    In [4]: %timeit compute_numba(df)
+    1000 loops, best of 3: 798 us per loop

 In this example, using Numba was faster than Cython.
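The ``compute_numba`` pattern in this hunk keeps pandas objects at the boundary and hands plain NumPy arrays to the jitted kernel. That shape can be sketched without Numba installed; ``apply_integrate_f_plain`` and ``compute_plain`` below are hypothetical un-jitted stand-ins for ``apply_integrate_f_numba`` and ``compute_numba``.

```python
import numpy as np
import pandas as pd


def apply_integrate_f_plain(col_a, col_b, col_N):
    # same loop as apply_integrate_f_numba, minus the @numba.jit decorator
    n = len(col_N)
    result = np.empty(n, dtype='float64')
    assert len(col_a) == len(col_b) == n
    for i in range(n):
        a, b, N = col_a[i], col_b[i], col_N[i]
        dx = (b - a) / N
        s = 0.0
        for j in range(N):
            x = a + j * dx
            s += x * (x - 1)
        result[i] = s * dx
    return result


def compute_plain(df):
    # pandas in, pandas out; the kernel above only ever sees raw arrays
    result = apply_integrate_f_plain(df['a'].values, df['b'].values,
                                     df['N'].values)
    return pd.Series(result, index=df.index, name='result')
```

Keeping the kernel free of pandas objects is what lets Numba compile it in nopython mode; the wrapper restores the index and column name afterwards.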

@@ -368,30 +372,30 @@ Consider the following toy example of doubling each observation:

 .. code-block:: python

-   import numba
+    import numba


-   def double_every_value_nonumba(x):
-       return x * 2
+    def double_every_value_nonumba(x):
+        return x * 2


-   @numba.vectorize
-   def double_every_value_withnumba(x):
-       return x * 2
+    @numba.vectorize
+    def double_every_value_withnumba(x):  # noqa E501
+        return x * 2

 .. code-block:: ipython

-   # Custom function without numba
-   In [5]: %timeit df['col1_doubled'] = df.a.apply(double_every_value_nonumba)
-   1000 loops, best of 3: 797 us per loop
+    # Custom function without numba
+    In [5]: %timeit df['col1_doubled'] = df.a.apply(double_every_value_nonumba)  # noqa E501
+    1000 loops, best of 3: 797 us per loop

-   # Standard implementation (faster than a custom function)
-   In [6]: %timeit df['col1_doubled'] = df.a*2
-   1000 loops, best of 3: 233 us per loop
+    # Standard implementation (faster than a custom function)
+    In [6]: %timeit df['col1_doubled'] = df.a * 2
+    1000 loops, best of 3: 233 us per loop

-   # Custom function with numba
-   In [7]: %timeit df['col1_doubled'] = double_every_value_withnumba(df.a.values)
-   1000 loops, best of 3: 145 us per loop
+    # Custom function with numba
+    In [7]: %timeit df['col1_doubled'] = double_every_value_withnumba(df.a.values)
+    1000 loops, best of 3: 145 us per loop

 Caveats
 ~~~~~~~
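For comparison, the ufunc-style elementwise broadcasting that ``@numba.vectorize`` provides in the hunk above can be mimicked, interface only, by NumPy's own ``np.vectorize``; a minimal sketch that runs without Numba installed:

```python
import numpy as np


# Interface-only stand-in: np.vectorize broadcasts over arrays like a ufunc,
# but still calls the Python function once per element. numba.vectorize
# instead compiles the function into a real, fast ufunc.
@np.vectorize
def double_every_value(x):
    return x * 2


print(double_every_value(np.array([1.0, 2.5, -3.0])))
```

This is why the ``In [7]`` timing above beats ``.apply`` only with Numba: the decorator interface is the same, but only Numba removes the per-element Python call.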

setup.cfg (−1)

@@ -48,7 +48,6 @@ ignore = E402, # module level import not at top of file
 exclude =
     doc/source/basics.rst
     doc/source/contributing_docstring.rst
-    doc/source/enhancingperf.rst


 [yapf]
