Skip to content

ENH: add quantile method to resample #22304

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Aug 22, 2018

Conversation

discort
Copy link
Contributor

@discort discort commented Aug 13, 2018

@codecov
Copy link

codecov bot commented Aug 13, 2018

Codecov Report

❗ No coverage uploaded for pull request base (master@9346e79). Click here to learn what that means.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master   #22304   +/-   ##
=========================================
  Coverage          ?   92.05%           
=========================================
  Files             ?      169           
  Lines             ?    50716           
  Branches          ?        0           
=========================================
  Hits              ?    46685           
  Misses            ?     4031           
  Partials          ?        0
Flag Coverage Δ
#multiple 90.46% <100%> (?)
#single 42.25% <11.11%> (?)
Impacted Files Coverage Δ
pandas/core/resample.py 96.09% <100%> (ø)
pandas/core/groupby/groupby.py 96.18% <100%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9346e79...92b5b2d. Read the comment docs.

@@ -650,6 +650,7 @@ Groupby/Resample/Rolling
``SeriesGroupBy`` when the grouping variable only contains NaNs and numpy version < 1.13 (:issue:`21956`).
- Multiple bugs in :func:`pandas.core.Rolling.min` with ``closed='left'` and a
datetime-like index leading to incorrect results and also segfault. (:issue:`21704`)
- Added :meth:`Resampler.quantile` (:issue:`15023`).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move to other enhancements

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this needs to be added in api.rst as well

def quantile(self, q=0.5, **kwargs):
"""
Return value at the given quantile.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a versionadded tag

@@ -41,7 +41,7 @@

# The various methods we support
downsample_methods = ['min', 'max', 'first', 'last', 'sum', 'mean', 'sem',
'median', 'prod', 'var', 'ohlc']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add the test from the OP which compares 2 methods of getting the same result

@jreback jreback added Enhancement Resample resample method labels Aug 13, 2018
@discort discort force-pushed the resampler_quantile branch from c0e8601 to b64bd34 Compare August 13, 2018 11:53
@pep8speaks
Copy link

pep8speaks commented Aug 13, 2018

Hello @discort! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on August 22, 2018 at 10:44 Hours UTC

@discort discort force-pushed the resampler_quantile branch from b64bd34 to 07ed27f Compare August 13, 2018 11:55
@discort
Copy link
Contributor Author

discort commented Aug 13, 2018

@jreback

Parameters
----------
q : float or array-like, default 0.5 (50% quantile)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add some See Also and point to the Series / groupby versions

@@ -2165,6 +2165,13 @@ def test_resample_datetime_values(self):
res = df['timestamp'].resample('2D').first()
tm.assert_series_equal(res, exp)

def test_resample_quantile(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you test on timedelta / period as well. you might want to move to the Base class to make this easier.

@@ -182,6 +182,7 @@ Other Enhancements
- :func:`to_timedelta` now supports iso-formated timedelta strings (:issue:`21877`)
- :class:`Series` and :class:`DataFrame` now support :class:`Iterable` in constructor (:issue:`2193`)
- :class:`DatetimeIndex` gained :attr:`DatetimeIndex.timetz` attribute. Returns local time with timezone information. (:issue:`21358`)
- Added :meth:`Resampler.quantile` (:issue:`15023`).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you say that Series.resample and DataFrame.resample have gained the Resample.quantile() method (use references for .resample)

@@ -1053,9 +1059,15 @@ def _downsample(self, how, **kwargs):
how = self._is_cython_func(how) or how
ax = self.ax

# Empty groupby objs are accepted only callable funcs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what triggers this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback

under periodindex scenario, test_resample_empty_dataframe is failed on this line

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm this is odd. can you remove this fix, and instead: create a new issue which demonstrates this (is this only with this method?) and xfail the test. I think something bigger is going on here. We can merge this then come back with a battery of tests for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback

to more precisely, this block doesn't pass the check and raises exception.
It became to fail after adding kwargs to _groupby_and_aggregate

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, this is old logic. can you try calling

def _try_aggregate_string_function(self, arg, *args, **kwargs) (its defined in the base class) and see if this works?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it doesn't

.../pandas/pandas/core/groupby/generic.py in _cython_agg_general(self, how, alt, numeric_only, min_count)
     74                             min_count=-1):
     75         new_items, new_blocks = self._cython_agg_blocks(
---> 76             how, alt=alt, numeric_only=numeric_only, min_count=min_count)
     77         return self._wrap_agged_blocks(new_items, new_blocks)
     78

.../pandas/pandas/core/groupby/generic.py in _cython_agg_blocks(self, how, alt, numeric_only, min_count)
    147
    148         if len(new_blocks) == 0:
--> 149             raise DataError('No numeric types to aggregate')
    150
    151         # reset the locs in the blocks to correspond to our

DataError: No numeric types to aggregate

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, ok, then let's just xfail this test and pls open another issue. I think this is a bigger issues and needs to be addressed separately.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback

Here is the issue - #22339

@@ -1053,9 +1059,15 @@ def _downsample(self, how, **kwargs):
how = self._is_cython_func(how) or how
ax = self.ax

# Empty groupby objs are accepted only callable funcs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm this is odd. can you remove this fix, and instead: create a new issue which demonstrates this (is this only with this method?) and xfail the test. I think something bigger is going on here. We can merge this then come back with a battery of tests for this.

@@ -1041,9 +1059,15 @@ def _downsample(self, how, **kwargs):
how = self._is_cython_func(how) or how
ax = self.ax

# Empty groupby objs are accepted only callable funcs
# with arg or kwargs currently
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok so remove this fix here

@@ -771,6 +771,15 @@ def test_apply_to_empty_series(self):

assert_series_equal(result, expected, check_dtype=False)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then xfail this test for Period, in your followup you can unxfail it.

OR you can do everything here is ok too. (close the new PR / issue). I guess this was simpler that it looked.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The second way looks simpler. So I closed new PR/issue

@discort discort force-pushed the resampler_quantile branch 2 times, most recently from 3f831f2 to 6f4f96f Compare August 17, 2018 08:05
@discort
Copy link
Contributor Author

discort commented Aug 22, 2018

@jreback

@jreback jreback added this to the 0.24.0 milestone Aug 22, 2018
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor comment on the doc-string, ping on green.

DataFrame.quantile
DataFrameGroupBy.quantile

.. versionadded:: 0.24.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the tag goes right below the summary line (the first line of the doc-string, with a blank line before and after)

@discort discort force-pushed the resampler_quantile branch from 6f4f96f to 92b5b2d Compare August 22, 2018 10:44
@jreback jreback merged commit d83f533 into pandas-dev:master Aug 22, 2018
@jreback
Copy link
Contributor

jreback commented Aug 22, 2018

thanks @discort keep em coming!

Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this pull request Oct 1, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ENH: add quantile method to resample
3 participants