Skip to content

BUG: support fused types in roll_min/max #12373 #12481

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 82 commits into from

Conversation

joshuastorck
Copy link
Contributor

Ran the performance tests with no changes detected:

$ asv continuous -b 'stat_op.*rolling' upstream/master GH12373
· Creating environments
· Discovering benchmarks
·· Uninstalling from py2.7-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
·· Building for py2.7-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt..................................................
·· Installing into py2.7-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt..
· Running 18 total benchmarks (2 commits * 1 environments * 9 benchmarks)
[ 0.00%] · For pandas commit hash 170fb27:
[ 0.00%] ·· Building for py2.7-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt.......................................................
[ 0.00%] ·· Benchmarking py2.7-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
[ 5.56%] ··· Running stat_ops.stats_rolling_mean.time_rolling_kurt 11.39ms
[ 11.11%] ··· Running stat_ops.stats_rolling_mean.time_rolling_max 7.48ms
[ 16.67%] ··· Running stat_ops.stats_rolling_mean.time_rolling_mean 7.22ms
[ 22.22%] ··· Running stat_ops.stats_rolling_mean.time_rolling_median 114.21ms
[ 27.78%] ··· Running stat_ops.stats_rolling_mean.time_rolling_min 7.55ms
[ 33.33%] ··· Running stat_ops.stats_rolling_mean.time_rolling_skew 11.35ms
[ 38.89%] ··· Running stat_ops.stats_rolling_mean.time_rolling_std 9.13ms
[ 44.44%] ··· Running stat_ops.stats_rolling_mean.time_rolling_sum 6.93ms
[ 50.00%] ··· Running stat_ops.stats_rolling_mean.time_rolling_var 7.98ms
[ 50.00%] · For pandas commit hash 56e285a:
[ 50.00%] ·· Building for py2.7-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt.......................................................
[ 50.00%] ·· Benchmarking py2.7-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
[ 55.56%] ··· Running stat_ops.stats_rolling_mean.time_rolling_kurt 11.66ms
[ 61.11%] ··· Running stat_ops.stats_rolling_mean.time_rolling_max 8.68ms
[ 66.67%] ··· Running stat_ops.stats_rolling_mean.time_rolling_mean 7.48ms
[ 72.22%] ··· Running stat_ops.stats_rolling_mean.time_rolling_median 107.71ms
[ 77.78%] ··· Running stat_ops.stats_rolling_mean.time_rolling_min 8.74ms
[ 83.33%] ··· Running stat_ops.stats_rolling_mean.time_rolling_skew 11.53ms
[ 88.89%] ··· Running stat_ops.stats_rolling_mean.time_rolling_std 9.25ms
[ 94.44%] ··· Running stat_ops.stats_rolling_mean.time_rolling_sum 6.97ms
[100.00%] ··· Running stat_ops.stats_rolling_mean.time_rolling_var 8.24msBENCHMARKS NOT SIGNIFICANTLY CHANGED.

joshuastorck and others added 6 commits February 27, 2016 09:50
* Changed the rolling min/max functions in algos.pyx so
  that they use a cython fused type as input instead of a float64
  so that the function can accept arrays of any numeric type
* Merged the functionality of rolling min/max into a common
  function with branches based on whether or not it's running
  min/max
* When running rolling min/max for intergral types and there
  are not enough minimum periods, the output values returned
  are zero
* Added a unit test to test_moments to make sure that rolling
  min/max works for all integral types and float32/64
* Changed the rolling min/max functions in algos.pyx so
  that they use a cython fused type as input instead of a float64
  so that the function can accept arrays of any numeric type
* Merged the functionality of rolling min/max into a common
  function with branches based on whether or not it's running
  min/max
* When running rolling min/max for intergral types and there
  are not enough minimum periods, the output values returned
  are zero
* Added a unit test to test_moments to make sure that rolling
  min/max works for all integral types and float32/64
… obsolete file test_moments.py to test_window.py

@cython.boundscheck(False)
@cython.wraparound(False)
def _roll_min_max(ndarray[numeric] a, int window, int minp, bint is_max):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be cdef

@jreback
Copy link
Contributor

jreback commented Feb 27, 2016

ideally if you can run the perf tests that cover rolling min/max and see if anything is changing (prob not).

We prob need to expand the dtypes that are tested as well.

@jreback
Copy link
Contributor

jreback commented Feb 27, 2016

will be in a slight conflict with #12376, IOW we don't need to upcast for min/max (but do for other algos). So need a flag for that somewhere. (until everything is converted like this)

@jreback jreback added Dtype Conversions Unexpected or buggy dtype conversions Reshaping Concat, Merge/Join, Stack/Unstack, Explode Numeric Operations Arithmetic, Comparison, and Logical operations labels Feb 27, 2016
@jreback jreback added this to the 0.18.0 milestone Feb 27, 2016
@jreback jreback changed the title Bugfix for #12373 BUG: support fused types in roll_min/max #12373 Feb 27, 2016
@joshuastorck
Copy link
Contributor Author

I feel a bit embarrassed for not completely reading the issue and noticing that someone already did a PR. I think my fix is a bit more comprehensive. I’ll take a look this week and try to merge.

From: Jeff Reback [mailto:[email protected]]
Sent: Saturday, February 27, 2016 10:43 AM
To: pydata/pandas
Cc: Joshua Storck
Subject: Re: [pandas] Bugfix for #12373 (#12481)

will be in a slight conflict with #12376#12376, IOW we don't need to upcast for min/max (but do for other algos). So need a flag for that somewhere.


Reply to this email directly or view it on GitHubhttps://github.com//pull/12481#issuecomment-189669738.

@jreback
Copy link
Contributor

jreback commented Feb 28, 2016

@joshuastorck no worries. I think the other fix has some comprehensive tests which is good and will move the needle. however your fix on top is ultimately better as the fused types are more general.

@jreback jreback modified the milestones: 0.18.1, 0.18.0 Feb 28, 2016
@BranYang
Copy link
Contributor

BranYang commented Mar 1, 2016

@joshuastorck Your fix makes more sense than just force every roll_ to only work on float64. I was thinking about this but I am not very familiar with Cython so I just haven't touch the Cython code.
Glad to see that someone will do this for making pandas better

mortada and others added 4 commits February 29, 2016 20:56
…sql.rst

Author: Mortada Mehyar <[email protected]>

Closes pandas-dev#12491 from mortada/sql_docs and squashes the following commits:

5d4f520 [Mortada Mehyar] DOC: fixed references to DataFrameGroupBy methods in comparison_with_sql.rst
  used
* Since rolling min/max used to implicitly convert the
  return type to float, but now supports returning the
  same type as the original value, added an as_float argument
  to the min/max functions so that users of the API can
  obtain the original return type by passing as_float=False
* Enhanced the unit test for the new as_float parameter
* Changed the private _roll_min_max function to be declared
  as cdef instead of def
joshuastorck and others added 25 commits March 10, 2016 23:33
  used
* Since rolling min/max used to implicitly convert the
  return type to float, but now supports returning the
  same type as the original value, added an as_float argument
  to the min/max functions so that users of the API can
  obtain the original return type by passing as_float=False
* Enhanced the unit test for the new as_float parameter
* Changed the private _roll_min_max function to be declared
  as cdef instead of def
* Changed the rolling min/max functions in algos.pyx so
  that they use a cython fused type as input instead of a float64
  so that the function can accept arrays of any numeric type
* Merged the functionality of rolling min/max into a common
  function with branches based on whether or not it's running
  min/max
* When running rolling min/max for intergral types and there
  are not enough minimum periods, the output values returned
  are zero
* Added a unit test to test_moments to make sure that rolling
  min/max works for all integral types and float32/64
… obsolete file test_moments.py to test_window.py
  used
* Since rolling min/max used to implicitly convert the
  return type to float, but now supports returning the
  same type as the original value, added an as_float argument
  to the min/max functions so that users of the API can
  obtain the original return type by passing as_float=False
* Enhanced the unit test for the new as_float parameter
* Changed the private _roll_min_max function to be declared
  as cdef instead of def
The PDF doc build was chocking on the `Parameters` section being the
first item.

Author: Tom Augspurger <[email protected]>

Closes pandas-dev#12586 from TomAugspurger/doc-build-error and squashes the following commits:

6843139 [Tom Augspurger] DOC: Fixes LaTeX build error
* Changed the rolling min/max functions in algos.pyx so
  that they use a cython fused type as input instead of a float64
  so that the function can accept arrays of any numeric type
* Merged the functionality of rolling min/max into a common
  function with branches based on whether or not it's running
  min/max
* When running rolling min/max for intergral types and there
  are not enough minimum periods, the output values returned
  are zero
* Added a unit test to test_moments to make sure that rolling
  min/max works for all integral types and float32/64
* Updated computations and whatsnew doc
* Added an as_float argument that when set to False, runs rolling
  min/max with the original types
@jreback
Copy link
Contributor

jreback commented Apr 18, 2016

@joshuastorck can you rebase/update

@jreback jreback removed this from the 0.18.1 milestone Apr 18, 2016
@joshuastorck
Copy link
Contributor Author

I'm closing this in favor of #12595

@jreback
Copy link
Contributor

jreback commented Apr 18, 2016

@joshuastorck ok!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Dtype Conversions Unexpected or buggy dtype conversions Numeric Operations Arithmetic, Comparison, and Logical operations Reshaping Concat, Merge/Join, Stack/Unstack, Explode
Projects
None yet
Development

Successfully merging this pull request may close these issues.