-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: support fused types in roll_min/max #12373 #12481
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
* Changed the rolling min/max functions in algos.pyx so that they use a cython fused type as input instead of a float64 so that the function can accept arrays of any numeric type * Merged the functionality of rolling min/max into a common function with branches based on whether or not it's running min/max * When running rolling min/max for intergral types and there are not enough minimum periods, the output values returned are zero * Added a unit test to test_moments to make sure that rolling min/max works for all integral types and float32/64
* Changed the rolling min/max functions in algos.pyx so that they use a cython fused type as input instead of a float64 so that the function can accept arrays of any numeric type * Merged the functionality of rolling min/max into a common function with branches based on whether or not it's running min/max * When running rolling min/max for intergral types and there are not enough minimum periods, the output values returned are zero * Added a unit test to test_moments to make sure that rolling min/max works for all integral types and float32/64
… obsolete file test_moments.py to test_window.py
Conflicts: pandas/stats/tests/test_moments.py
|
||
@cython.boundscheck(False) | ||
@cython.wraparound(False) | ||
def _roll_min_max(ndarray[numeric] a, int window, int minp, bint is_max): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be cdef
ideally if you can run the perf tests that cover rolling min/max and see if anything is changing (prob not). We prob need to expand the dtypes that are tested as well. |
will be in a slight conflict with #12376, IOW we don't need to upcast for min/max (but do for other algos). So need a flag for that somewhere. (until everything is converted like this) |
closes pandas-dev#8848 Author: Andrew Rosenfeld <[email protected]> Closes pandas-dev#12463 from rosnfeld/issue_8848 and squashes the following commits: 43d1fcd [Andrew Rosenfeld] ENH: add missing methods to .dt for Period, resolves pandas-dev#8848
I feel a bit embarrassed for not completely reading the issue and noticing that someone already did a PR. I think my fix is a bit more comprehensive. I’ll take a look this week and try to merge. From: Jeff Reback [mailto:[email protected]] will be in a slight conflict with #12376#12376, IOW we don't need to upcast for min/max (but do for other algos). So need a flag for that somewhere. — |
@joshuastorck no worries. I think the other fix has some comprehensive tests which is good and will move the needle. however your fix on top is ultimately better as the fused types are more general. |
@joshuastorck Your fix makes more sense than just force every roll_ to only work on float64. I was thinking about this but I am not very familiar with Cython so I just haven't touch the Cython code. |
…sql.rst Author: Mortada Mehyar <[email protected]> Closes pandas-dev#12491 from mortada/sql_docs and squashes the following commits: 5d4f520 [Mortada Mehyar] DOC: fixed references to DataFrameGroupBy methods in comparison_with_sql.rst
used * Since rolling min/max used to implicitly convert the return type to float, but now supports returning the same type as the original value, added an as_float argument to the min/max functions so that users of the API can obtain the original return type by passing as_float=False * Enhanced the unit test for the new as_float parameter * Changed the private _roll_min_max function to be declared as cdef instead of def
used * Since rolling min/max used to implicitly convert the return type to float, but now supports returning the same type as the original value, added an as_float argument to the min/max functions so that users of the API can obtain the original return type by passing as_float=False * Enhanced the unit test for the new as_float parameter * Changed the private _roll_min_max function to be declared as cdef instead of def
* Changed the rolling min/max functions in algos.pyx so that they use a cython fused type as input instead of a float64 so that the function can accept arrays of any numeric type * Merged the functionality of rolling min/max into a common function with branches based on whether or not it's running min/max * When running rolling min/max for intergral types and there are not enough minimum periods, the output values returned are zero * Added a unit test to test_moments to make sure that rolling min/max works for all integral types and float32/64
… obsolete file test_moments.py to test_window.py
used * Since rolling min/max used to implicitly convert the return type to float, but now supports returning the same type as the original value, added an as_float argument to the min/max functions so that users of the API can obtain the original return type by passing as_float=False * Enhanced the unit test for the new as_float parameter * Changed the private _roll_min_max function to be declared as cdef instead of def
…n and update the unit test accordingly
The PDF doc build was chocking on the `Parameters` section being the first item. Author: Tom Augspurger <[email protected]> Closes pandas-dev#12586 from TomAugspurger/doc-build-error and squashes the following commits: 6843139 [Tom Augspurger] DOC: Fixes LaTeX build error
* Changed the rolling min/max functions in algos.pyx so that they use a cython fused type as input instead of a float64 so that the function can accept arrays of any numeric type * Merged the functionality of rolling min/max into a common function with branches based on whether or not it's running min/max * When running rolling min/max for intergral types and there are not enough minimum periods, the output values returned are zero * Added a unit test to test_moments to make sure that rolling min/max works for all integral types and float32/64 * Updated computations and whatsnew doc * Added an as_float argument that when set to False, runs rolling min/max with the original types
@joshuastorck can you rebase/update |
I'm closing this in favor of #12595 |
@joshuastorck ok! |
git diff upstream/master | flake8 --diff
Ran the performance tests with no changes detected:
$ asv continuous -b 'stat_op.*rolling' upstream/master GH12373
· Creating environments
· Discovering benchmarks
·· Uninstalling from py2.7-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
·· Building for py2.7-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt..................................................
·· Installing into py2.7-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt..
· Running 18 total benchmarks (2 commits * 1 environments * 9 benchmarks)
[ 0.00%] · For pandas commit hash 170fb27:
[ 0.00%] ·· Building for py2.7-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt.......................................................
[ 0.00%] ·· Benchmarking py2.7-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
[ 5.56%] ··· Running stat_ops.stats_rolling_mean.time_rolling_kurt 11.39ms
[ 11.11%] ··· Running stat_ops.stats_rolling_mean.time_rolling_max 7.48ms
[ 16.67%] ··· Running stat_ops.stats_rolling_mean.time_rolling_mean 7.22ms
[ 22.22%] ··· Running stat_ops.stats_rolling_mean.time_rolling_median 114.21ms
[ 27.78%] ··· Running stat_ops.stats_rolling_mean.time_rolling_min 7.55ms
[ 33.33%] ··· Running stat_ops.stats_rolling_mean.time_rolling_skew 11.35ms
[ 38.89%] ··· Running stat_ops.stats_rolling_mean.time_rolling_std 9.13ms
[ 44.44%] ··· Running stat_ops.stats_rolling_mean.time_rolling_sum 6.93ms
[ 50.00%] ··· Running stat_ops.stats_rolling_mean.time_rolling_var 7.98ms
[ 50.00%] · For pandas commit hash 56e285a:
[ 50.00%] ·· Building for py2.7-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt.......................................................
[ 50.00%] ·· Benchmarking py2.7-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
[ 55.56%] ··· Running stat_ops.stats_rolling_mean.time_rolling_kurt 11.66ms
[ 61.11%] ··· Running stat_ops.stats_rolling_mean.time_rolling_max 8.68ms
[ 66.67%] ··· Running stat_ops.stats_rolling_mean.time_rolling_mean 7.48ms
[ 72.22%] ··· Running stat_ops.stats_rolling_mean.time_rolling_median 107.71ms
[ 77.78%] ··· Running stat_ops.stats_rolling_mean.time_rolling_min 8.74ms
[ 83.33%] ··· Running stat_ops.stats_rolling_mean.time_rolling_skew 11.53ms
[ 88.89%] ··· Running stat_ops.stats_rolling_mean.time_rolling_std 9.25ms
[ 94.44%] ··· Running stat_ops.stats_rolling_mean.time_rolling_sum 6.97ms
[100.00%] ··· Running stat_ops.stats_rolling_mean.time_rolling_var 8.24msBENCHMARKS NOT SIGNIFICANTLY CHANGED.