@@ -234,14 +234,14 @@ the rows, applying our ``integrate_f_typed``, and putting this in the zeros arra

 .. code-block:: ipython

-    In [4]: %timeit apply_integrate_f(df['a'].values, df['b'].values, df['N'].values)
+    In [4]: %timeit apply_integrate_f(df['a'].to_numpy(), df['b'].to_numpy(), df['N'].to_numpy())
     1000 loops, best of 3: 1.25 ms per loop

 We've gotten another big improvement. Let's check again where the time is spent:

 .. ipython:: python

-    %prun -l 4 apply_integrate_f(df['a'].values, df['b'].values, df['N'].values)
+    %prun -l 4 apply_integrate_f(df['a'].to_numpy(), df['b'].to_numpy(), df['N'].to_numpy())

 As one might expect, the majority of the time is now spent in ``apply_integrate_f``,
 so if we wanted to make anymore efficiencies we must continue to concentrate our
@@ -286,7 +286,7 @@ advanced Cython techniques:

 .. code-block:: ipython

-    In [4]: %timeit apply_integrate_f_wrap(df['a'].values, df['b'].values, df['N'].values)
+    In [4]: %timeit apply_integrate_f_wrap(df['a'].to_numpy(), df['b'].to_numpy(), df['N'].to_numpy())
     1000 loops, best of 3: 987 us per loop

 Even faster, with the caveat that a bug in our Cython code (an off-by-one error,
@@ -349,8 +349,8 @@ take the plain Python code from above and annotate with the ``@jit`` decorator.


     def compute_numba(df):
-        result = apply_integrate_f_numba(df['a'].values, df['b'].values,
-                                         df['N'].values)
+        result = apply_integrate_f_numba(df['a'].to_numpy(), df['b'].to_numpy(),
+                                         df['N'].to_numpy())
         return pd.Series(result, index=df.index, name='result')

 Note that we directly pass NumPy arrays to the Numba function. ``compute_numba`` is just a wrapper that provides a
@@ -394,7 +394,7 @@ Consider the following toy example of doubling each observation:
     1000 loops, best of 3: 233 us per loop

     # Custom function with numba
-    In [7]: %timeit (df['col1_doubled'] = double_every_value_withnumba(df.a.values)
+    In [7]: %timeit (df['col1_doubled'] = double_every_value_withnumba(df.a.to_numpy())
     1000 loops, best of 3: 145 us per loop

 Caveats
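For readers checking the substitution made throughout this diff, here is a minimal sketch of how ``.values`` and ``.to_numpy()`` compare on a plain numeric column. The toy ``df`` below is hypothetical and is not the frame used in the documentation; only the column names ``a``, ``b`` and ``N`` are borrowed from the diff.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame; column names mirror the ones used in the diff.
df = pd.DataFrame({"a": [0.0, 1.0, 2.0], "b": [1.0, 2.0, 3.0], "N": [10, 20, 30]})

old_style = df["a"].values      # attribute access, the spelling being removed
new_style = df["a"].to_numpy()  # method call, the spelling being introduced

# For a plain float64 column both spellings yield the same ndarray contents.
assert isinstance(new_style, np.ndarray)
assert np.array_equal(old_style, new_style)

# to_numpy() is a method, so it can also take dtype and copy arguments.
as_float = df["N"].to_numpy(dtype="float64", copy=True)
print(as_float.dtype)
```

For the numeric columns passed to the Cython and Numba functions here, the two spellings should be interchangeable, which is consistent with the diff leaving the quoted timings untouched.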