Skip to content

BUG: RollingGroupBy.quantile ignores interpolation keyword argument #29567

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Nov 18, 2019
Merged

BUG: RollingGroupBy.quantile ignores interpolation keyword argument #29567

merged 6 commits into from
Nov 18, 2019

Conversation

pwsiegel
Copy link
Contributor

@pwsiegel pwsiegel commented Nov 12, 2019

@pwsiegel pwsiegel changed the title BUG: RollingGroupBy.quantile ignores interpolation keyword argume… BUG: RollingGroupBy.quantile ignores interpolation keyword argument Nov 12, 2019
@pwsiegel
Copy link
Contributor Author

pwsiegel commented Nov 12, 2019

Hello all! My build failed and threw the following error during CI:

all_parsers = <pandas.tests.io.parser.conftest.PythonParser object at 0x7fc4298d4860>

    def test_chunks_have_consistent_numerical_type(all_parsers):
        parser = all_parsers
        integers = [str(i) for i in range(499999)]
        data = "a\n" + "\n".join(integers + ["1.0", "2.0"] + integers)
    
        # Coercions should work without warnings.
        with tm.assert_produces_warning(None):
>           result = parser.read_csv(StringIO(data))

pandas/tests/io/parser/test_common.py:1204: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <contextlib._GeneratorContextManager object at 0x7fc3ebc8a438>
type = None, value = None, traceback = None

    def __exit__(self, type, value, traceback):
        if type is None:
            try:
>               next(self.gen)
E               AssertionError: Caused unexpected warning(s): [('ResourceWarning', ResourceWarning("unclosed file <_io.BufferedReader name='test1.xlsm'>",), '/home/vsts/work/1/s/pandas/io/parsers.py', 2927), ('ResourceWarning', ResourceWarning("unclosed file <_io.BufferedReader name='test1.xlsx'>",), '/home/vsts/work/1/s/pandas/io/parsers.py', 2927), ('ResourceWarning', ResourceWarning("unclosed file <_io.BufferedReader name='test1.xlsm'>",), '/home/vsts/work/1/s/pandas/io/parsers.py', 2927), ('ResourceWarning', ResourceWarning("unclosed file <_io.BufferedReader name='test1.xlsx'>",), '/home/vsts/work/1/s/pandas/io/parsers.py', 2927), ('ResourceWarning', ResourceWarning("unclosed file <_io.BufferedReader name='testmultiindex.xlsm'>",), '/home/vsts/work/1/s/pandas/io/parsers.py', 2927), ('ResourceWarning', ResourceWarning("unclosed file <_io.BufferedReader name='testmultiindex.xlsm'>",), '/home/vsts/work/1/s/pandas/io/parsers.py', 2927), ('ResourceWarning', ResourceWarning("unclosed file <_io.BufferedReader name='test1.xlsx'>",), '/home/vsts/work/1/s/pandas/io/parsers.py', 2927), ('ResourceWarning', ResourceWarning("unclosed file <_io.BufferedReader name='test1.xlsm'>",), '/home/vsts/work/1/s/pandas/io/parsers.py', 2927)].

../../../miniconda3/envs/pandas-dev/lib/python3.6/contextlib.py:89: AssertionError

My changes didn't touch the reader / parser part of the codebase, so I'm wondering if this problem crept into the build from elsewhere? Any suggestions for what I can do?

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good. Can you try merging master and pushing again? I think should fix your CI issues

result = r.quantile(0.5)
expected = g.apply(lambda x: x.rolling(4).quantile(0.5))
tm.assert_frame_equal(result, expected)
for f in ["linear", "lower", "higher", "midpoint", "nearest"]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either here or in a follow up could you parametrize this test with these values rather than the loop within?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can u make this a separate test and parmetrize

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think I can handle that here - working on it now.

Copy link
Contributor Author

@pwsiegel pwsiegel Nov 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wrote a new test called test_rolling_quantile and parametrized the interpolation argument as requested, but unfortunately it looks as though most of the tests in this file have broken since I merged in master. To test, I checked out the upstream master branch, rebuilt my conda environment from scratch, and ran the tests there. 10 of the tests failed, including test_rolling (stack trace below). Is this a known issue?

self = <pandas.tests.window.test_grouper.TestGrouperGrouping object at 0x116432208>

    def test_rolling(self):
        g = self.frame.groupby("A")
        r = g.rolling(window=4)

        for f in ["sum", "mean", "min", "max", "count", "kurt", "skew"]:

>           result = getattr(r, f)()

test_grouper.py:64:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../../core/window/rolling.py:1870: in sum
    return super().sum(*args, **kwargs)
../../core/window/rolling.py:1245: in sum
    return self._apply("roll_sum", "sum", **kwargs)
../../core/window/common.py:81: in _apply
    return self._groupby.apply(f)
../../core/groupby/groupby.py:718: in apply
    result = self._python_apply_general(f)
../../core/groupby/groupby.py:734: in _python_apply_general
    keys, values, mutated = self.grouper.apply(f, self._selected_obj, self.axis)
../../core/groupby/ops.py:171: in apply
    result_values, mutated = splitter.fast_apply(f, group_keys)
../../core/groupby/ops.py:904: in fast_apply
    return libreduction.apply_frame_axis0(sdata, f, names, starts, ends)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

>   ???
E   AttributeError: 'Int64Index' object has no attribute '_has_complex_internals'

pandas/_libs/reduction.pyx:519: AttributeError

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, @WillAyd's suggestion to rebuild the Cython extensions was correct. The tests are now passing; I'm going to push my changes shortly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change is pushed - let me know if that's different from what you had in mind!

@WillAyd WillAyd added the Window rolling, ewma, expanding label Nov 17, 2019
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is including lots of other changes, can you merge master

@pwsiegel
Copy link
Contributor Author

Oops, I think I rebased by accident instead of merging, give me a moment to fix!

@pwsiegel
Copy link
Contributor Author

@jreback OK, that looks better I think. Should I try to squash the merge commit in with 352445c?

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to squash ; we do on merge

@@ -436,6 +436,7 @@ Groupby/resample/rolling
- Bug in :meth:`DataFrame.groupby` not offering selection by column name when ``axis=1`` (:issue:`27614`)
- Bug in :meth:`DataFrameGroupby.agg` not able to use lambda function with named aggregation (:issue:`27519`)
- Bug in :meth:`DataFrame.groupby` losing column name information when grouping by a categorical column (:issue:`28787`)
- Bug in :meth:`RollingGroupby.quantile` ignoring ``interpolation`` keyword argument (:issue:`28779`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don’t think this will render, can u see how we reference rolling in other notes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'll find another example in the notes and imitate it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the reference to DataFrameGroupBy.rolling().quantile(), similarly to the reference to DataFrameGroupBy.rolling().apply() here: https://pandas.pydata.org/pandas-docs/version/0.23.4/whatsnew.html#groupby-resample-rolling. Hopefully that will work better?

result = r.quantile(0.5)
expected = g.apply(lambda x: x.rolling(4).quantile(0.5))
tm.assert_frame_equal(result, expected)
for f in ["linear", "lower", "higher", "midpoint", "nearest"]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can u make this a separate test and parmetrize

@WillAyd
Copy link
Member

WillAyd commented Nov 17, 2019 via email

@jreback jreback added this to the 1.0 milestone Nov 18, 2019
@jreback jreback merged commit cd56708 into pandas-dev:master Nov 18, 2019
@jreback
Copy link
Contributor

jreback commented Nov 18, 2019

thanks @pwsiegel

@pwsiegel pwsiegel deleted the rolling-groupby-quantile-fix branch November 18, 2019 00:54
Reksbril pushed a commit to Reksbril/pandas that referenced this pull request Nov 18, 2019
proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019
proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Window rolling, ewma, expanding
Projects
None yet
Development

Successfully merging this pull request may close these issues.

RollingGroupBy.quantile ignores interpolation argument
3 participants