pd.expanding is incorrectly calculating window size when axis=1 #13753

Closed
seanlaw opened this issue Jul 22, 2016 · 3 comments
Labels
Bug; Duplicate Report (duplicate issue or pull request); Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)

Comments

@seanlaw
Contributor

seanlaw commented Jul 22, 2016

Code Sample, a copy-pastable example if possible

import pandas as pd
import numpy as np

df = pd.DataFrame({'A': [0, 1, 2, np.nan, 4], 
                   'B': [0, 1, 2, np.nan, 4], 
                   'C': [0, 1, 2, np.nan, 4], 
                   'D': [0, 1, 2, np.nan, 4], 
                   'E': [0, 1, 2, np.nan, 4], 
                   'F': [0, 1, 2, np.nan, 4]})

# Parenthesized print works on both Python 2 and Python 3.
print(df.expanding(axis=1).sum())

Actual Output

     A    B     C     D     E     F
0  0.0  0.0   0.0   0.0   0.0   0.0
1  1.0  2.0   3.0   4.0   5.0   5.0
2  2.0  4.0   6.0   8.0  10.0  10.0
3  NaN  NaN   NaN   NaN   NaN   NaN
4  4.0  8.0  12.0  16.0  20.0  20.0

However, the correct result should be:

     A    B     C     D     E     F
0  0.0  0.0   0.0   0.0   0.0   0.0
1  1.0  2.0   3.0   4.0   5.0   6.0
2  2.0  4.0   6.0   8.0  10.0  12.0
3  NaN  NaN   NaN   NaN   NaN   NaN
4  4.0  8.0  12.0  16.0  20.0  24.0
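
As a cross-check on the table above, the running row totals can be computed without going through expanding at all. This is only a sketch: it relies on the fact that, for this particular frame, a cumulative sum across columns coincides with an expanding sum, because the NaN values fill an entire row rather than appearing mid-row. The name `reference` is made up for illustration.

```python
import numpy as np
import pandas as pd

# Same data as in the report: six identical columns, five rows.
values = [0, 1, 2, np.nan, 4]
df = pd.DataFrame({col: values for col in "ABCDEF"})

# Cumulative sum across columns. For this frame it matches the expanding
# sum along axis=1, since the only NaNs occupy a whole row.
reference = df.cumsum(axis=1)
print(reference)
```

The last column of `reference` should read 0, 6, 12, NaN, 24, in agreement with the corrected table.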

Notice that the last column, F, is different. I've tracked this down and found that the _get_window function (for expanding) fails to return the correct window size when the following conditions are met:

  1. axis=1 is used instead of axis=0 (default)
  2. The number of rows in the dataframe is less than the number of columns

This happens because the object uses len(obj) to determine the window size, and len(obj) is always the number of rows. Instead, it should use obj.shape[self.axis], the length along the axis being traversed.
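
The effect of the wrong window size can be reproduced with a toy model in plain Python. This is not pandas' implementation; `windowed_running_sum` is a hypothetical helper that sums at most `window` trailing values at each position, which is enough to show why a window taken from the row count (5) truncates the sixth column while a window taken from the column count (6) does not.

```python
def windowed_running_sum(row, window):
    """Toy model: at each position, sum at most `window` trailing values."""
    out = []
    for i in range(len(row)):
        start = max(0, i + 1 - window)
        out.append(sum(row[start:i + 1]))
    return out


n_rows, n_cols = 5, 6   # shape of the DataFrame in the report
row = [1.0] * n_cols    # row 1 of the example (every column holds 1)

# Window derived from len(obj), i.e. the number of rows.
buggy = windowed_running_sum(row, window=n_rows)
# Window derived from obj.shape[1], i.e. the number of columns.
fixed = windowed_running_sum(row, window=n_cols)

print(buggy)  # [1.0, 2.0, 3.0, 4.0, 5.0, 5.0]
print(fixed)  # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
```

The `buggy` list matches row 1 of the reported output, and `fixed` matches row 1 of the corrected table.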

output of pd.show_versions()


commit: None
python: 2.7.11.final.0
python-bits: 64
OS: Darwin
OS-release: 14.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.18.1+237.ge357ea1
nose: 1.3.7
pip: 8.1.2
setuptools: 20.1.1
Cython: 0.23.4
numpy: 1.11.1
scipy: 0.17.1
statsmodels: 0.6.1
xarray: 0.7.0
IPython: 4.0.3
sphinx: 1.3.5
patsy: 0.4.0
dateutil: 2.4.1
pytz: 2015.7
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.6.0
matplotlib: None
openpyxl: 2.3.2
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.8.4
lxml: 3.5.0
bs4: 4.4.1
html5lib: None
httplib2: 0.9
apiclient: 1.4.0
sqlalchemy: 1.0.11
pymysql: None
psycopg2: 2.6.1 (dt dec pq3 ext)
jinja2: 2.8
boto: 2.39.0
pandas_datareader: 0.2.0

@jreback
Contributor

jreback commented Jul 22, 2016

this is a symptom of #13503; essentially axis=1 is broken (expanding is just a sub-class of rolling). Would certainly appreciate a PR to address that.

@jreback jreback closed this as completed Jul 22, 2016
@jreback jreback added the Bug, Reshaping, and Duplicate Report labels Jul 22, 2016
@jreback jreback added this to the Next Major Release milestone Jul 22, 2016
@seanlaw
Contributor Author

seanlaw commented Jul 22, 2016

yes, I will submit a pull request to fix expanding


@seanlaw seanlaw mentioned this issue Jul 22, 2016
4 tasks
@seanlaw
Contributor Author

seanlaw commented Jul 22, 2016

I tried to work out how to fix rolling but couldn't figure out where or why self.axis was being overwritten or ignored. self.axis is present when the object is instantiated via _Window, but the functions within _Rolling don't seem to inherit this attribute.

@jorisvandenbossche jorisvandenbossche modified the milestones: No action, Next Major Release Jul 23, 2016
@jreback jreback reopened this Jul 24, 2016
@jreback jreback closed this as completed Jul 24, 2016