BUG: DataFrame.loc is not consistent with DataFrame.__setitem__ when used with 2D numpy array #46544
Comments
I think the loc case is correct
For the case in the OP (without the edit), for the swapped case ("Note: if you swap 2 lines from above, then the code will work!"), or both? Note this is a change in behavior from 1.3.5, when it worked for both cases and ordering did not matter. I'll label it as a regression for now, pending further investigation.
All of them should raise
to be clear, on main...

df = pd.DataFrame(np.zeros((256, 10)))
array_2d = np.zeros((256, 2))
df.loc[:, 0] = array_2d
df[0] = array_2d

works

df = pd.DataFrame(np.zeros((256, 10)))
array_2d = np.zeros((256, 2))
df[0] = array_2d
df.loc[:, 0] = array_2d

raises
i'm ignoring the so both the above code samples (in this comment) should raise and there is a bug on master?
i'll do a bisect shortly to get more insight.
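The two orderings quoted above can be compared side by side. This is a minimal sketch, not code from the thread: the helpers `attempt` and `run` and the labels are made up for illustration. Which steps succeed depends on the installed pandas version, so no expected output is shown.

```python
import numpy as np
import pandas as pd


def attempt(df, how, value):
    """Try one assignment to column 0; return None on success, else the exception name."""
    try:
        if how == "loc":
            df.loc[:, 0] = value
        else:
            df[0] = value
    except Exception as exc:  # broad on purpose: we only record what happened
        return type(exc).__name__
    return None


def run(order):
    """Apply the assignments in the given order to one fresh frame."""
    df = pd.DataFrame(np.zeros((256, 10)))
    array_2d = np.zeros((256, 2))
    return [(how, attempt(df, how, array_2d)) for how in order]


results = {
    "loc then setitem": run(["loc", "setitem"]),
    "setitem then loc": run(["setitem", "loc"]),
}
for label, steps in results.items():
    print(label, steps)
```

If the outcome recorded for a given setter differs between the two orderings, that is the order dependence described in this issue.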
Yes, I think so. If you use a list as the indexer, e.g. [0], they already raise. If your initial DataFrame has multiple dtypes, so that we go through the split path, they also raise. You can test this by adding df[100] = "a" before doing the 2D assignment.
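The variants mentioned in this comment can be checked with a small harness. This is a sketch under assumptions: the helper names are hypothetical, "list as indexer" is read as both `df[[0]]` and `df.loc[:, [0]]`, and the extra column `df[100] = "a"` is the one suggested above for producing a mixed-dtype frame. Results vary by pandas version, so none are asserted here.

```python
import numpy as np
import pandas as pd


def outcome(assign, mixed_dtypes=False):
    """Run assign(df, array_2d) on a fresh frame; return the exception name or None."""
    df = pd.DataFrame(np.zeros((256, 10)))
    if mixed_dtypes:
        df[100] = "a"  # object column, so the frame no longer has a single dtype
    array_2d = np.zeros((256, 2))
    try:
        assign(df, array_2d)
    except Exception as exc:  # broad on purpose: we only record what happened
        return type(exc).__name__
    return None


def loc_scalar(df, value):
    df.loc[:, 0] = value


def setitem_scalar(df, value):
    df[0] = value


def loc_list(df, value):
    df.loc[:, [0]] = value


def setitem_list(df, value):
    df[[0]] = value


report = {
    "loc, list indexer": outcome(loc_list),
    "setitem, list indexer": outcome(setitem_list),
    "loc, mixed dtypes": outcome(loc_scalar, mixed_dtypes=True),
    "setitem, mixed dtypes": outcome(setitem_scalar, mixed_dtypes=True),
}
for label, result in report.items():
    print(f"{label}: {result}")
```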
Haven't checked the behavior change, but I think the fact that this works at all should be considered a bug.
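To make the "should raise" position concrete, here is a sketch of the kind of shape check it implies. The function `check_single_column_value` is hypothetical and is not how pandas implements this; accepting an `(n, 1)` array as a single column is an assumption made for this example only.

```python
import numpy as np


def check_single_column_value(value, n_rows):
    """Hypothetical guard: reject values that cannot fill exactly one column."""
    arr = np.asarray(value)
    if arr.ndim == 0:
        return arr  # a scalar broadcasts to the whole column
    if arr.ndim == 2 and arr.shape[1] == 1:
        arr = arr[:, 0]  # assumption: a single-column 2D array is unambiguous
    if arr.ndim != 1:
        raise ValueError(f"expected a 1D value for one column, got shape {arr.shape}")
    if arr.shape[0] != n_rows:
        raise ValueError(f"expected {n_rows} rows, got {arr.shape[0]}")
    return arr
```

Under this rule the `(256, 2)` array from the examples above is rejected no matter which setter is used or in which order, while a `(256,)` array is accepted.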
removing from 1.4.x milestone.
Pandas version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
Issue Description
Inconsistent behavior of functions that should behave the same.
Expected Behavior
Either an exception should be thrown for both cases, or it should not, but also in both cases.
Installed Versions
INSTALLED VERSIONS
commit : 06d2301
python : 3.8.12.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.19044
machine : AMD64
processor : Intel64 Family 6 Model 142 Stepping 12, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : English_United States.1252
pandas : 1.4.1
numpy : 1.22.2
pytz : 2021.3
dateutil : 2.8.2
pip : 22.0.3
setuptools : 59.8.0
Cython : None
pytest : 7.0.0
hypothesis : None
sphinx : 4.4.0
blosc : None
feather : 0.4.1
xlsxwriter : None
lxml.etree : 4.7.1
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.0.3
IPython : 8.0.1
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
fsspec : 2022.01.0
gcsfs : None
matplotlib : 3.2.2
numba : None
numexpr : 2.7.3
odfpy : None
openpyxl : 3.0.9
pandas_gbq : 0.17.0
pyarrow : 6.0.1
pyreadstat : None
pyxlsb : None
s3fs : 2022.01.0
scipy : 1.8.0
sqlalchemy : 1.4.31
tables : 3.7.0
tabulate : None
xarray : 0.21.1
xlrd : 2.0.1
xlwt : None
zstandard : None