Skip to content

ENH: Min_weight for Rolling #12750

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

max-sixty
Copy link
Contributor

Initial implementation for #11167.

Needs some additional tests - for dataframes, other axis values etc, but want to get this out there and get feedback.

Note that the implemenation is different than that for min_periods. This uses a single function to mask all rolling calculations. min_periods is implemented separately for each function. I imagine the latter is a bit faster, although also more complicated.

@jreback
Copy link
Contributor

jreback commented Mar 30, 2016

is this in concert with min_periods? (e.g. looks like this), or better as either/or?

@jreback jreback added Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate Reshaping Concat, Merge/Join, Stack/Unstack, Explode Numeric Operations Arithmetic, Comparison, and Logical operations labels Mar 30, 2016
@max-sixty
Copy link
Contributor Author

While you could use them together, most cases are going to be an either / or.

If it's just either / or, we could implement within min_periods - a value < 1 going to min_weight and otherwise going to min_periods. But I think less explicit.

@jreback
Copy link
Contributor

jreback commented Mar 30, 2016

min_periods=None is the default, just wondering if it makes any sense to do it and/or raise if BOTH min_weight is not None and min_periods. You don't need to have special values.

Furthermore does the order of application of min_periods and min_weight matter?

give this some tests, if its too complicated / not worth it, ok w/o supporting both simultaneously

@max-sixty
Copy link
Contributor Author

wondering if it makes any sense to do it and/or raise if BOTH min_weight is not None and min_periods.

I think it's fine to have both - it will put a NaN in if either test fails. They are independent, so order doesn't matter.

give this some tests, if its too complicated / not worth it, ok w/o supporting both simultaneously

My thought exactly

@max-sixty max-sixty force-pushed the rolling-min-weight branch 2 times, most recently from 579af97 to 3146b2f Compare April 1, 2016 00:25
@@ -1373,6 +1386,19 @@ def _check_func(minp, window):
return _check_func


def _min_weight_mask(rolling, min_weight):
data = rolling.obj
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a comment explaining what this is doing/why

@jreback
Copy link
Contributor

jreback commented Apr 1, 2016

  • update the doc-strings (Window/Rolling/Expanding/EWM), with versionadded tags.
  • whatsnew sub-section to show mini-example and point to updated docs
  • add to docs
  • more tests. Test for some edge cases (empty, 1 point, all-nan)

@max-sixty
Copy link
Contributor Author

Yes will do - this was WIP. Thanks for the comments

result = rolling.mean()

# check that all points between 25 and 90 are NaN
self.assertTrue(result.iloc[24:89].isnull().all())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

construct the actual result

@max-sixty max-sixty force-pushed the rolling-min-weight branch from 3146b2f to 5f9801c Compare April 6, 2016 23:11
@jreback
Copy link
Contributor

jreback commented May 7, 2016

can you rebase / update

@max-sixty max-sixty force-pushed the rolling-min-weight branch from 5f9801c to 0ca3efc Compare May 17, 2016 04:32
@codecov-io
Copy link

codecov-io commented May 17, 2016

Current coverage is 84.65% (diff: 100%)

Merging #12750 into master will increase coverage by <.01%

@@             master     #12750   diff @@
==========================================
  Files           144        144          
  Lines         51025      51038    +13   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          43191      43205    +14   
+ Misses         7834       7833     -1   
  Partials          0          0          

Powered by Codecov. Last update 4c3d4d4...3c42153

@@ -5175,30 +5175,34 @@ def _add_series_or_dataframe_operations(cls):
from pandas.core import window as rwindow

@Appender(rwindow.rolling.__doc__)
def rolling(self, window, min_periods=None, freq=None, center=False,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add to the doc-string min_weight with a versionadded tag (I don't remember exactly where these are defined)

@max-sixty
Copy link
Contributor Author

I've been remiss about the final fixes here. Will need a couple of weeks on other commitments but I will get to this

@jreback
Copy link
Contributor

jreback commented Sep 9, 2016

can you rebase / update?

@jreback
Copy link
Contributor

jreback commented Nov 16, 2016

if you can update

@max-sixty
Copy link
Contributor Author

Yes, apologies, this is another I'll attempt over / after Thanksgiving. Appreciate the patience!

@max-sixty max-sixty force-pushed the rolling-min-weight branch 4 times, most recently from 1d2a591 to 4c40a10 Compare November 28, 2016 16:05
@jreback
Copy link
Contributor

jreback commented Dec 21, 2016

looks like you rebased, great.

needs some docs (computation.rst), whatsnew (you can do a sub-section or just put a ref to the docs) & doc-string examples.

@jreback
Copy link
Contributor

jreback commented Jan 21, 2017

status?

@jreback
Copy link
Contributor

jreback commented Mar 20, 2017

can you update

@max-sixty
Copy link
Contributor Author

Yes, thanks for your patience. I'll wrap most of these up over the next month

@jreback
Copy link
Contributor

jreback commented May 6, 2017

@MaximilianR can you rebase / update

@max-sixty
Copy link
Contributor Author

Forgive the extreme delay for this + a couple others. I need a few more weeks on other stuff, and then I'll give a push to finishing these up.

@jreback
Copy link
Contributor

jreback commented Jul 26, 2017

pls rebase / update.

@jreback
Copy link
Contributor

jreback commented Aug 18, 2017

love the idea, but needs rebase and updating.

@jreback jreback closed this Aug 18, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate Numeric Operations Arithmetic, Comparison, and Logical operations Reshaping Concat, Merge/Join, Stack/Unstack, Explode
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ENH: min_weight in addition to min_periods for ewma
3 participants