REF: de-duplicate code for pad/backfill #40306


Closed

Conversation

simonjayhawkins (Member)

       before           after         ratio
     [5ef41593]       [6a7ac9e4]
     <master>         <interpolate_2d>
+     3.44±0.04ms       5.14±0.4ms     1.50  frame_methods.Fillna.time_frame_fillna(True, 'pad', 'Float64')
+      3.48±0.1ms       5.06±0.3ms     1.45  frame_methods.Fillna.time_frame_fillna(True, 'pad', 'Int64')
+     3.32±0.05ms      4.59±0.05ms     1.38  frame_methods.Fillna.time_frame_fillna(False, 'bfill', 'Int64')
+     3.30±0.05ms       4.42±0.1ms     1.34  frame_methods.Fillna.time_frame_fillna(False, 'bfill', 'Float64')
+     3.37±0.03ms       4.43±0.1ms     1.32  frame_methods.Fillna.time_frame_fillna(False, 'pad', 'Int64')
+     3.34±0.05ms      4.30±0.08ms     1.29  frame_methods.Fillna.time_frame_fillna(False, 'pad', 'Float64')
+     2.15±0.04ms       2.48±0.2ms     1.15  frame_methods.Fillna.time_frame_fillna(True, 'bfill', 'Float64')

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE DECREASED.

if the general idea of this PR is acceptable, I could try to improve these numbers, but they were already significantly improved in #39953.

with the cython functions reduced from 4 to 1, it becomes feasible to add a single dedicated function for the no-limit case (my numba branch is quicker than cython with that change) instead of 4 extra functions.
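To illustrate the de-duplication idea: a backfill is just a pad applied to reversed views, so one implementation can serve both directions. This is a minimal Python sketch with illustrative names (`pad_or_backfill`, the `values`/`mask` pair), not pandas' actual internal API:

```python
import numpy as np

def pad_or_backfill(values, mask, direction="pad", limit=None):
    """Fill masked entries in place; backfill reuses pad on reversed views.

    Illustrative stand-in for the array + missing-value mask the pandas
    internals pass around; not the real pandas signature.
    """
    if direction == "backfill":
        # reversing both views turns a backfill into a pad
        pad_or_backfill(values[::-1], mask[::-1], direction="pad", limit=limit)
        return
    lim = len(values) if limit is None else limit
    fill_count = 0
    have_value = False
    val = 0.0
    for i in range(len(values)):
        if not mask[i]:
            val = values[i]          # remember the last valid value
            have_value = True
            fill_count = 0           # limit resets at each valid value
        elif have_value and fill_count < lim:
            values[i] = val
            mask[i] = False          # filled positions are no longer missing
            fill_count += 1
```

Because the direction is handled by a view reversal rather than duplicated loops, limit handling only has to be written (and tested) once.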

also may pave the way to add limit_area and limit_direction support to EAs

and maybe also for 2D masked arrays? cc @jbrockmendel

having a single cython function, I think, also paves the way for a cython version of the limit_area handling, which would remove the python set logic; that could be investigated.

@simonjayhawkins added the Refactor (Internal refactoring of code) and Missing-data (np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate) labels on Mar 8, 2021
@jreback (Contributor) commented Mar 8, 2021

looks good, +1 on this general idea to unify things.


def _datetimelike_compat(func: F) -> F:
Member:

can we keep datetimelike_compat and apply it to interpolate_2d? (I learned in #40294 that ATM interpolate_2d doesn't need it, because the wrapper is used in the functions that this PR removes)
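For readers outside the internals: a wrapper like this typically views datetimelike data as int64 before the fill and restores the dtype afterwards, so the fill code only ever sees numeric arrays. The sketch below is hypothetical, showing only the dtype round-trip, not pandas' actual `_datetimelike_compat`:

```python
import functools
import numpy as np

def datetimelike_compat(func):
    """Hypothetical sketch of a datetimelike-compat wrapper.

    Views datetime64/timedelta64 input as int64 before calling the
    wrapped function and restores the original dtype on the way out.
    """
    @functools.wraps(func)
    def wrapper(values, *args, **kwargs):
        if values.dtype.kind in "mM":  # timedelta64 / datetime64
            result = func(values.view("i8"), *args, **kwargs)
            return result.view(values.dtype)
        return func(values, *args, **kwargs)
    return wrapper
```

With this shape, the same fill kernels serve float, int, and datetimelike columns; NaT positions travel through as the int64 sentinel.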

Member Author:

can do. although, as stated recently, from my experience experimenting with numba, these wrappers would probably be best placed in _libs adjacent to the cython code, to give a cleaner api and to move the conversions needed for the cython code out of the python code.

Member:

I'm looking forward to seeing this; increasingly positive on the idea of doing more numba and less cython

Member Author:

slow down, we've not discussed this yet. My primary motivations are noarch releases, fully typed libraries without stubs, and utilization of more CPU cores and maybe GPU with standard pandas (AFAIK cudf/rapids are planning to rely on WSL2 in order to support Windows, rather than supporting Windows natively).

I've only just started, so don't know if this is a realistic goal or whether a more incremental approach of using numba when installed for parallelization is more feasible.

Member:

i reserve the right to be enthused about this idea

Member Author:

sure. i'm enthusiastic too at this early stage...

  • Inline typing compatible with mypy
  • Faster local development cycle
  • Simpler development environment for new contributors
  • No wheel building, maybe faster release cycles
  • Maybe can add some basic gpu support in future
  • Benefits of numba
  • Less complex code
  • No need for some prep functions, casting, asi8, ensure_int/float etc
  • Style checks apply to library code.
  • Let numba handle types and type conversions
  • Let numba handle memory allocation and deallocation
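The "use numba when installed" incremental approach mentioned in this thread can be sketched with an optional import and a no-op fallback decorator, so the same source runs with or without numba. The names here are illustrative, not a proposed pandas API:

```python
import numpy as np

try:
    from numba import njit  # optional speedup when numba is installed
except ImportError:
    def njit(func):  # no-op fallback: run as plain Python
        return func

@njit
def pad_inplace(values, mask):
    """Forward-fill values in place wherever mask is True.

    Because each filled position clears its mask entry, consecutive
    missing values are filled by chaining off the previous iteration.
    """
    for i in range(1, len(values)):
        if mask[i] and not mask[i - 1]:
            values[i] = values[i - 1]
            mask[i] = False
```

The loop body is plain, type-stable Python over numpy arrays, which is exactly the shape numba's nopython mode compiles well; no asi8/ensure_float-style prep functions are needed.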

for the basic functions I've looked at so far, numba's performance seems comparable to cython's when each is compared against a pure python implementation (but I've not yet touched anything where we have C classes)

Contributor:

oh, numba performance can often be much better than cython's, and usually not worse. The issue we have is that we cannot move much of the core functions over, as we do not have a strict dependency on numba at this point. We can certainly revisit things going forward.

@jorisvandenbossche (Member)

@simonjayhawkins does the slowdown come from the cython changes (removing the 1D version) or from other changes?

@simonjayhawkins (Member Author)

> @simonjayhawkins does the slowdown come from the cython changes (removing the 1D version) or from other changes?

not profiled yet. (the slowdown is negligible compared to the object conversion that was done up to a few days ago!)

I will profile shortly.

but if it is due to the cython code, we could consolidate around the 1d versions and use np.apply_along_axis for 2d instead.
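A sketch of that approach (the numpy function is np.apply_along_axis; note it returns a new array rather than filling in place, so an in-place variant would loop over slices explicitly). Names are illustrative:

```python
import numpy as np

def pad_1d(col):
    """Return a forward-filled copy of a 1-D float array."""
    out = col.copy()
    for i in range(1, len(out)):
        if np.isnan(out[i]):
            out[i] = out[i - 1]
    return out

def pad_2d(arr, axis=0):
    """Apply the 1-D pad along the given axis of a 2-D array.

    np.apply_along_axis calls pad_1d on each 1-D slice taken along
    `axis` and reassembles the results into a new array.
    """
    return np.apply_along_axis(pad_1d, axis, arr)
```

This keeps a single 1-D kernel as the source of truth, at the cost of the per-slice Python call overhead that apply_along_axis incurs.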

I consolidated around the 2d here since I think that can be parallelized with numba, but that shouldn't be a consideration here.

there was no performance impact up to the last commit, so I may split those changes into a separate PR to keep things manageable.

It could be that some extra checking in interpolate_2d is responsible for the slowdown. I found that calling pd.isna on the mask after the transformations, rather than before them, impacts performance.

@jorisvandenbossche (Member)

> (the slowdown is negligible compared to the object conversion that was done up to a few days ago!)

Yes, but the baseline for comparison should probably not be the slow object-conversion way from before, but numpy numeric dtypes (since the goal is to replace e.g. float64 with Float64)

@simonjayhawkins (Member Author)

> Yes, but the baseline for comparison should probably not be the slow object conversion way from before, but comparing with numpy numeric dtypes (since it's the goal to replace eg float64 with Float64)

agreed. I don't expect this PR to be merged/acceptable with the performance degradation.

when changing to avoid the object conversion, we always update the mask in the 1d version; this will also be needed in the 2d version if masked arrays are to be backed by 2d arrays, unless we make the mask update conditional or change that strategy.
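To make the mask-update point concrete: a masked array pairs a values buffer with a boolean mask, and a 2-D fill must flip mask entries as it writes values, otherwise the filled positions still read as missing. A minimal illustrative sketch (not the pandas internals):

```python
import numpy as np

def pad_2d_masked(values, mask):
    """Forward-fill a 2-D masked array along axis 1, updating the mask.

    Both arrays are modified in place; clearing mask[i, j] is what
    makes the filled value visible to consumers of the masked array.
    """
    n_rows, n_cols = values.shape
    for i in range(n_rows):
        for j in range(1, n_cols):
            if mask[i, j] and not mask[i, j - 1]:
                values[i, j] = values[i, j - 1]
                mask[i, j] = False
```

Making the mask update conditional (as suggested above) would mean only running the second write when the caller actually needs the updated mask.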

@simonjayhawkins simonjayhawkins marked this pull request as draft March 10, 2021 15:19
@simonjayhawkins (Member Author)

> Yes, but the baseline for comparison should probably not be the slow object conversion way from before, but comparing with numpy numeric dtypes (since it's the goal to replace eg float64 with Float64)

closing for now to clear queue. will revisit after #38992.

@simonjayhawkins (Member Author)

> closing for now to clear queue. will revisit after #38992.

while experimenting with numba in #40530, I'm periodically running the benchmark suite. I don't know how best to do this, as some benchmarks show a significant degradation that is likely due to compilation; testing the functions in a Jupyter notebook tends to show performance similar to cython's. However, the benchmarks that show a performance improvement likely indicate areas where improvements to the cython code could be investigated.
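One plain-Python way to keep JIT compilation out of the numbers (an illustrative helper, not how asv or pandas benchmarks work):

```python
import time

def time_after_warmup(func, *args, repeats=10):
    """Time func(*args), excluding one-off costs such as JIT compilation.

    The first call absorbs numba's compile time; the best of the
    remaining calls approximates steady-state performance.
    """
    func(*args)  # warm-up call, not timed
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best
```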

This PR is part of that process. Below are the benchmarks that improve with #40530 (as of yesterday).

As I make progress with #40530, this output will change and the priorities will undoubtedly change.

-      10.7±0.2ms       9.65±0.3ms     0.90  join_merge.Concat.time_concat_series(0)
-      21.7±0.4ms       19.4±0.1ms     0.90  algos.isin.IsInLongSeriesLookUpDominates.time_isin('float32', 5, 'monotone_hits')
-      7.50±0.7μs       6.70±0.3μs     0.89  index_cached_properties.IndexCache.time_engine('UInt64Index')
-      23.9±0.1ms         21.3±1ms     0.89  groupby.Nth.time_series_nth_all('float32')
-      24.0±0.2ms       21.3±0.9ms     0.89  groupby.Nth.time_series_nth_any('float32')
-     3.74±0.02ms      3.09±0.04ms     0.83  frame_methods.Fillna.time_frame_fillna(False, 'pad', 'Int64')
-     3.76±0.08ms      3.01±0.03ms     0.80  frame_methods.Fillna.time_frame_fillna(True, 'pad', 'Int64')
-     22.2±0.06ms       17.5±0.2ms     0.79  reshape.Crosstab.time_crosstab
-     4.10±0.03ms      3.23±0.04ms     0.79  frame_methods.Fillna.time_frame_fillna(False, 'bfill', 'Int64')
-     4.11±0.02ms      3.22±0.09ms     0.78  frame_methods.Fillna.time_frame_fillna(False, 'bfill', 'Float64')
-     3.89±0.04ms      2.99±0.05ms     0.77  frame_methods.Fillna.time_frame_fillna(True, 'pad', 'Float64')
-      3.30±0.8μs       2.30±0.2μs     0.70  index_cached_properties.IndexCache.time_values('UInt64Index')
-      5.81±0.1ms      3.70±0.03ms     0.64  frame_methods.Fillna.time_frame_fillna(True, 'pad', 'datetime64[ns, tz]')
