DataFrame.groupby(level=...).rolling(n).mean() #14420

Closed
691175002 opened this issue Oct 13, 2016 · 3 comments
Labels
Bug · Duplicate Report (Duplicate issue or pull request) · Groupby · Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)

Comments

@691175002

Calling .rolling() on a groupby object that is grouped by an index level seems to duplicate that level in the index of the result.

In [9]: d.groupby(level='ticker').rolling(30).mean()
Out[9]: 
ticker  ticker  date      
BMO     BMO     2006-01-02          NaN
                2006-01-03          NaN

TD      TD      2016-09-22    57.139340
                2016-09-23    57.171864


In [10]: d.groupby(level='ticker').apply(pd.rolling_mean, 30)
Out[10]: 
ticker  date      
BMO     2006-01-02          NaN
        2006-01-03          NaN

TD      2016-09-22    57.139340
        2016-09-23    57.171864


In [11]: d.groupby(level='ticker').apply(lambda x: x.rolling(30).mean())
Out[11]: 
ticker  date      
BMO     2006-01-02          NaN
        2006-01-03          NaN

TD      2016-09-22    57.139340
        2016-09-23    57.171864

I would expect the output to be the same in all three cases.
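
One way to check for the extra level without reading the printed output is to compare the index names directly. The sketch below is only an illustration, not the reporter's data: the Series `d`, its values, and the window of 3 are assumptions chosen to keep it small. Whether the first result carries an extra `ticker` level may depend on the pandas version in use.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for `d`: a Series indexed by (ticker, date).
dates = pd.date_range("2006-01-02", periods=5, freq="D")
index = pd.MultiIndex.from_product(
    [["BMO", "TD"], dates], names=["ticker", "date"]
)
d = pd.Series(np.arange(10, dtype=float), index=index)

# Rolling on the groupby object, as in In [9] (window shortened to 3).
rolled = d.groupby(level="ticker").rolling(3).mean()

# Rolling inside each group, as in In [11]. group_keys=False asks pandas
# not to prepend the group key, so the original index is kept.
per_group = d.groupby(level="ticker", group_keys=False).apply(
    lambda s: s.rolling(3).mean()
)

# Compare the index layouts rather than the printed output.
print("input levels:   ", list(d.index.names))
print("groupby.rolling:", list(rolled.index.names))
print("groupby.apply:  ", list(per_group.index.names))
```

If the first list of names is longer than the input's, the group key has been prepended to an index that already contained it.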

@jreback
Contributor

jreback commented Oct 13, 2016

Please show a starting frame as well as the output of pd.show_versions().

@691175002
Author

Versions:

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: en
LOCALE: None.None

pandas: 0.19.0
nose: None
pip: 8.1.2
setuptools: 27.2.0
Cython: None
numpy: 1.11.2
scipy: 0.18.1
statsmodels: 0.8.0rc1
xarray: None
IPython: 5.1.0
sphinx: 1.4.8
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.7
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.5.3
openpyxl: 2.4.0
xlrd: 1.0.0
xlwt: None
xlsxwriter: 0.9.3
lxml: None
bs4: 4.5.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8
boto: None
pandas_datareader: None

Easy reproduction:

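import numpy as np
import pandas as pd

# Two-level index: outer level groups the rows, inner level labels them.
arrays = [np.array(['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux']),
          np.array(['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two'])]
df = pd.DataFrame(np.random.randn(8, 2), index=arrays)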

df.groupby(level=0).rolling(2).sum()
Out[17]: 
                    0         1
bar bar one       NaN       NaN
        two -0.779420 -1.860285
baz baz one       NaN       NaN
        two -0.680321 -1.221606
foo foo one       NaN       NaN
        two -0.992899 -0.612821
qux qux one       NaN       NaN
        two -1.694832 -1.078676

df.groupby(level=0).apply(lambda x: x.rolling(2).sum())
Out[18]: 
                0         1
bar one       NaN       NaN
    two -0.779420 -1.860285
baz one       NaN       NaN
    two -0.680321 -1.221606
foo one       NaN       NaN
    two -0.992899 -0.612821
qux one       NaN       NaN
    two -1.694832 -1.078676

@jreback
Contributor

jreback commented Oct 13, 2016

duplicate of #14013

Pull requests to fix this are welcome!

jreback closed this as completed Oct 13, 2016
jreback added the Bug, Groupby, Reshaping, and Duplicate Report labels Oct 13, 2016
jreback added this to the No action milestone Oct 13, 2016