DataFrame.groupby.expanding duplicates grouped dimensions #14134

twheys · 2016-09-01T11:40:02Z

Applying an expanding function to a groupby duplicates the grouped index level in the result. I would expect this to work like `groupby.cumsum`: the expanding function is applied within each group, and the result has the same index as the original data frame.

Code Sample, a copy-pastable example if possible

My Data Frame sample

>>> dataframe
           one
cont cat1     
0    a       0
     b       1
1    a       2
     b       3
2    a       4
     b       5
3    a       6
     b       7
4    a       8
     b       9
5    a      10
     b      11
6    a      12
     b      13
7    a      14
     b      15
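The report only shows the printed frame, so here is a minimal sketch of how it could be rebuilt to make the example copy-pastable. The construction (a `MultiIndex.from_product` over `cont` 0 to 7 and `cat1` in `a`, `b`, with the integers 0 to 15 as values) is an assumption inferred from the printout, not the reporter's own code.

```python
import numpy as np
import pandas as pd

# Assumed construction: inferred from the printed frame above.
index = pd.MultiIndex.from_product(
    [range(8), ["a", "b"]], names=["cont", "cat1"]
)
dataframe = pd.DataFrame({"one": np.arange(16)}, index=index)

print(dataframe)
```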

Example of the function call and incorrect output.

>>> dataframe.groupby(level=1).expanding(min_periods=1).mean()

                one
cat1 cont cat1     
a    0    a     0.0
     1    a     1.0
     2    a     2.0
     3    a     3.0
     4    a     4.0
     5    a     5.0
     6    a     6.0
     7    a     7.0
b    0    b     1.0
     1    b     2.0
     2    b     3.0
     3    b     4.0
     4    b     5.0
     5    b     6.0
     6    b     7.0
     7    b     8.0

Expected Output

The same expanding means, indexed like the original data frame (the index shape that `dataframe.groupby(level=1).cumsum()` returns):

           one
cont cat1     
0    a       0.0
     b       1.0
1    a       1.0
     b       2.0
2    a       2.0
     b       3.0
3    a       3.0
     b       4.0
4    a       4.0
     b       5.0
5    a       5.0
     b       6.0
6    a       6.0
     b       7.0
7    a       7.0
     b       8.0
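One possible workaround for getting the expected output (expanding means on the original index) is to route the expanding call through `transform`, which aligns its result to the input index. This is a sketch, not a fix confirmed in this thread; the frame construction is an assumption inferred from the printout, and it has not been checked against the reporter's pandas 0.18.1.

```python
import numpy as np
import pandas as pd

# Assumed construction of the sample frame (inferred from the printout).
index = pd.MultiIndex.from_product(
    [range(8), ["a", "b"]], names=["cont", "cat1"]
)
dataframe = pd.DataFrame({"one": np.arange(16)}, index=index)

# transform applies the function per group and returns a result
# aligned to the original index, so no group level is prepended.
expected = (
    dataframe.groupby(level="cat1")["one"]
    .transform(lambda s: s.expanding(min_periods=1).mean())
    .to_frame()
)

print(expected)
```

Using `transform` avoids the index question altogether, at the cost of calling a Python-level function once per group.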

output of `pd.show_versions()`

>>> pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.0.final.0
python-bits: 64
OS: Darwin
OS-release: 15.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8

pandas: 0.18.1
nose: None
pip: 8.1.2
setuptools: 18.1
Cython: None
numpy: 1.11.1
scipy: None
statsmodels: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.5.3
pytz: 2016.6.1
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.5.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
boto: None
pandas_datareader: None


jreback · 2016-09-01T11:46:34Z

this is a dupe of #14013

jreback closed this as completed Sep 1, 2016

jreback added the Groupby, API Design, and Duplicate Report labels Sep 1, 2016
