
DataFrame.where() fails with some binary comparisons #11492

Closed
JohnNapier opened this issue Oct 31, 2015 · 2 comments
Labels
Reshaping Concat, Merge/Join, Stack/Unstack, Explode

Comments

@JohnNapier

Suppose we want to cap values in a dataframe of shape (6,5).

import numpy as np
import pandas as pd

# WORKS WITH DF OF SHAPE (6,5)
ix = pd.date_range('1/1/2015', periods=6, freq='D')
df0 = pd.DataFrame(np.random.normal(size=(len(ix), 5)), index=ix)
print(df0)
df1 = df0.quantile(q=.5, axis=1)  # row medians, a Series indexed like df0
print(df1)
df2 = df0.where(df0 < df1, df1, axis=0)  # keep values below the row median, else use the median
print(df2)

The above code caps each row's values at that row's median. For example, the three printed outputs (df0, df1, df2) are:

                   0         1         2         3         4
2015-01-01  1.209706  1.362493 -0.141184 -0.262279  0.108538
2015-01-02 -0.752241  0.554342 -0.053927 -1.150906 -0.993721
2015-01-03  0.049342  1.100449  0.616066 -1.453542 -0.023043
2015-01-04 -1.148023 -0.438199 -1.378162  0.678946 -0.224544
2015-01-05  0.395681 -1.237518 -0.606024  0.834100  0.545389
2015-01-06 -0.426188 -0.283056 -0.012219  0.510134  1.363700

2015-01-01    0.108538
2015-01-02   -0.752241
2015-01-03    0.049342
2015-01-04   -0.438199
2015-01-05    0.395681
2015-01-06   -0.012219
Freq: D, dtype: float64

                   0         1         2         3         4
2015-01-01  0.108538  0.108538 -0.141184 -0.262279  0.108538
2015-01-02 -0.752241 -0.752241 -0.752241 -1.150906 -0.993721
2015-01-03  0.049342  0.049342  0.049342 -1.453542 -0.023043
2015-01-04 -1.148023 -0.438199 -1.378162 -0.438199 -0.438199
2015-01-05  0.395681 -1.237518 -0.606024  0.395681  0.395681
2015-01-06 -0.426188 -0.283056 -0.012219 -0.012219 -0.012219
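
For comparison, the same row-wise cap can be sketched with DataFrame.clip, which takes the threshold Series and the alignment axis directly. This is only a sketch: it assumes a pandas release where clip accepts a Series for upper together with axis=0 (not the version tested in this report), and the seed and the name row_median are illustrative, not from the code above.

```python
import numpy as np
import pandas as pd

np.random.seed(0)  # assumption: any fixed seed, chosen only for repeatability
ix = pd.date_range('1/1/2015', periods=5, freq='D')
df0 = pd.DataFrame(np.random.normal(size=(len(ix), 5)), index=ix)
row_median = df0.quantile(q=.5, axis=1)  # one median per row

# Cap each row at its own median; axis=0 aligns the Series with the row index.
capped = df0.clip(upper=row_median, axis=0)
print(capped)
```

Because the axis is passed to clip itself, there is no separate boolean mask whose alignment has to match the replacement values.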

Now, if we do the same with a square dataframe, the same code fails.

# FAILS WITH DF OF SHAPE (5,5)
df0 = df0.iloc[:5, :5]
df1 = df1.iloc[:5]
df2 = df0.where(df0 < df1, df1, axis=0)
print(df2)

For some reason, broadcasting is now done along axis=1, even though the call explicitly passes axis=0.

                   0         1         2         3         4
2015-01-01  0.108538 -0.752241 -0.141184 -0.262279  0.395681
2015-01-02  0.108538 -0.752241  0.049342 -1.150906 -0.993721
2015-01-03  0.108538 -0.752241  0.049342 -1.453542 -0.023043
2015-01-04 -1.148023 -0.752241 -1.378162 -0.438199  0.395681
2015-01-05  0.108538 -1.237518 -0.606024 -0.438199  0.395681
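
One possible reading of why only the square case goes wrong: in plain NumPy, a length-5 vector combined with a (5,5) array broadcasts along the last axis (across columns) without complaint, whereas a length-6 vector against a (6,5) array cannot broadcast at all. Whether this is what actually happens inside pandas 0.16.2 is an assumption. The sketch below only illustrates the NumPy rule, and all names in it are hypothetical.

```python
import numpy as np

square = np.zeros((5, 5))
tall = np.zeros((6, 5))
per_row5 = np.arange(5.0)  # intended as one value per row of `square`
per_row6 = np.arange(6.0)  # intended as one value per row of `tall`

# Shape (5,) against (5, 5): the trailing axis matches, so values spread across columns.
across_columns = square + per_row5
# Adding a trailing axis forces one value per row instead.
down_rows = square + per_row5[:, None]

print(across_columns[0])  # [0. 1. 2. 3. 4.]
print(down_rows[:, 0])    # [0. 1. 2. 3. 4.]

# Shape (6,) against (6, 5): no silent fallback is possible, NumPy raises.
try:
    tall + per_row6
    broadcasts_silently = True
except ValueError:
    broadcasts_silently = False
print(broadcasts_silently)  # False
```

In the non-square case the wrong axis is ruled out by the shapes, so only the square frame can silently pick up the column-wise interpretation.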

The same failure occurs if we use the binary comparison functions.

# IT ALSO FAILS WITH BINARY COMPARISONS
# http://pandas.pydata.org/pandas-docs/stable/basics.html#basics-compare
df2 = df0.where(df0.lt(df1, axis=0), df1, axis=0)
print(df2)

                   0         1         2         3         4
2015-01-01  0.108538 -0.752241 -0.141184 -0.262279  0.395681
2015-01-02  0.108538 -0.752241  0.049342 -1.150906 -0.993721
2015-01-03  0.108538 -0.752241  0.049342 -1.453542 -0.023043
2015-01-04 -1.148023 -0.752241 -1.378162 -0.438199  0.395681
2015-01-05  0.108538 -1.237518 -0.606024 -0.438199  0.395681

This is with pandas version 0.16.2 and numpy 1.9.2.

Thanks!

@jreback
Contributor

jreback commented Oct 31, 2015

should be closed by this: #9838

when posting an example like this, pls use np.random.seed(...) to make it reproducible by copy-pasting
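
A seeded, copy-pasteable variant of the report might look like the sketch below. The seed value and the helper names cap_at_row_median and expected_cap are hypothetical (not from the original post), and the check assumes a pandas release with the fix referenced above, where the square case should match the reference.

```python
import numpy as np
import pandas as pd

np.random.seed(11492)  # arbitrary fixed seed (assumption; not in the original post)

def cap_at_row_median(df):
    """Cap each row of df at its median using lt()/where() with axis=0."""
    med = df.quantile(q=.5, axis=1)
    return df.where(df.lt(med, axis=0), med, axis=0)

def expected_cap(df):
    """Reference result computed with explicit NumPy broadcasting."""
    vals = df.to_numpy()
    med = np.median(vals, axis=1)[:, None]  # column vector: one median per row
    return np.minimum(vals, med)

ix = pd.date_range('1/1/2015', periods=6, freq='D')
df0 = pd.DataFrame(np.random.normal(size=(len(ix), 5)), index=ix)

# Check both the (6,5) frame and the square (5,5) slice from the report.
for frame in (df0, df0.iloc[:5, :5]):
    ok = np.allclose(cap_at_row_median(frame).to_numpy(), expected_cap(frame))
    print(frame.shape, 'matches reference:', ok)
```

Comparing against a NumPy reference means the script reports a wrong broadcast axis itself instead of relying on someone reading the printed tables.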

@jreback
Contributor

jreback commented Oct 31, 2015

issue is this: #9736

@jreback jreback added the Reshaping Concat, Merge/Join, Stack/Unstack, Explode label Oct 31, 2015
@jreback jreback closed this as completed Nov 18, 2015