BUG: loc.setitem with expansion expanding rows #37932

Merged: 5 commits, Nov 27, 2020
1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.2.0.rst
@@ -622,6 +622,7 @@ Indexing
 - Bug in :meth:`Series.loc` and :meth:`DataFrame.loc` raises when the index was of ``object`` dtype and the given numeric label was in the index (:issue:`26491`)
 - Bug in :meth:`DataFrame.loc` returned requested key plus missing values when ``loc`` was applied to single level from a :class:`MultiIndex` (:issue:`27104`)
 - Bug in indexing on a :class:`Series` or :class:`DataFrame` with a :class:`CategoricalIndex` using a listlike indexer containing NA values (:issue:`37722`)
+- Bug in :meth:`DataFrame.loc.__setitem__` expanding an empty :class:`DataFrame` with mixed dtypes (:issue:`37932`)
 - Bug in :meth:`DataFrame.xs` ignored ``droplevel=False`` for columns (:issue:`19056`)
 - Bug in :meth:`DataFrame.reindex` raising ``IndexingError`` wrongly for empty DataFrame with ``tolerance`` not None or ``method="nearest"`` (:issue:`27315`)
 - Bug in indexing on a :class:`Series` or :class:`DataFrame` with a :class:`CategoricalIndex` using listlike indexer that contains elements that are in the index's ``categories`` but not in the index itself failing to raise ``KeyError`` (:issue:`37901`)
8 changes: 8 additions & 0 deletions pandas/core/indexing.py
@@ -1687,6 +1687,14 @@ def _setitem_with_indexer_split_path(self, indexer, value, name: str):
                 for loc, v in zip(ilocs, value):
                     self._setitem_single_column(loc, v, pi)

+            elif len(ilocs) == 1 and com.is_null_slice(pi) and len(self.obj) == 0:
+                # This is a setitem-with-expansion, see
+                # test_loc_setitem_empty_append_expands_rows_mixed_dtype
+                # e.g. df = DataFrame(columns=["x", "y"])
+                # df["x"] = df["x"].astype(np.int64)
+                # df.loc[:, "x"] = [1, 2, 3]
+                self._setitem_single_column(ilocs[0], value, pi)
+
             else:
                 raise ValueError(
                     "Must have equal len keys and value "
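The guard on the new branch combines three conditions that must all hold. The following is a minimal sketch in plain Python, not pandas' implementation: `is_null_slice` is a stand-in written for illustration (the diff refers to an internal helper as `com.is_null_slice`), and `takes_expansion_branch` and `n_rows` are hypothetical names.

```python
def is_null_slice(obj) -> bool:
    """Illustrative stand-in: True only for a bare `:` slice, i.e. slice(None)."""
    return (
        isinstance(obj, slice)
        and obj.start is None
        and obj.stop is None
        and obj.step is None
    )


def takes_expansion_branch(ilocs, pi, n_rows) -> bool:
    """Mirror the three-part condition from the diff.

    ilocs  -- positions of the target columns
    pi     -- the row ("plane") indexer
    n_rows -- number of rows currently in the frame (len(self.obj) in the diff)
    """
    return len(ilocs) == 1 and is_null_slice(pi) and n_rows == 0


# One column, full-slice row indexer, empty frame: the case in the diff comment.
print(takes_expansion_branch([0], slice(None), 0))
# A non-empty frame, two target columns, or a bounded slice do not qualify.
print(takes_expansion_branch([0], slice(None), 3))
print(takes_expansion_branch([0, 1], slice(None), 0))
print(takes_expansion_branch([0], slice(0, 2), 0))
```

Only the first call corresponds to `df.loc[:, "x"] = [1, 2, 3]` on a zero-row frame; the other three would fall through to whichever branch follows.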
14 changes: 13 additions & 1 deletion pandas/tests/indexing/test_loc.py
@@ -952,7 +952,7 @@ def test_loc_uint64(self):
         result = s.loc[[np.iinfo("uint64").max - 1, np.iinfo("uint64").max]]
         tm.assert_series_equal(result, s)

-    def test_loc_setitem_empty_append(self):
+    def test_loc_setitem_empty_append_expands_rows(self):
         # GH6173, various appends to an empty dataframe

         data = [1, 2, 3]
@@ -963,6 +963,18 @@ def test_loc_setitem_empty_append(self):
         df.loc[:, "x"] = data
         tm.assert_frame_equal(df, expected)

+    def test_loc_setitem_empty_append_expands_rows_mixed_dtype(self):
+        # GH#37932 same as test_loc_setitem_empty_append_expands_rows
+        # but with mixed dtype so we go through take_split_path
+        data = [1, 2, 3]
+        expected = DataFrame({"x": data, "y": [None] * len(data)})
+
+        df = DataFrame(columns=["x", "y"])
+        df["x"] = df["x"].astype(np.int64)
+        df.loc[:, "x"] = data
+        tm.assert_frame_equal(df, expected)
+
+    def test_loc_setitem_empty_append_single_value(self):
         # only appends one value
         expected = DataFrame({"x": [1.0], "y": [np.nan]})
         df = DataFrame(columns=["x", "y"], dtype=float)
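The scenario covered by `test_loc_setitem_empty_append_expands_rows_mixed_dtype` can also be tried outside the test suite. This is a minimal sketch that restates the steps of that test. It assumes NumPy is installed along with a pandas version that includes this change (the whatsnew entry places it in 1.2.0).

```python
import numpy as np
import pandas as pd

# Start from an empty frame, then cast one column so the two columns have
# different dtypes (int64 and object). Per the test comment, mixed dtypes
# are what send the assignment through the split path.
df = pd.DataFrame(columns=["x", "y"])
df["x"] = df["x"].astype(np.int64)

# Full-slice assignment into a single column of a zero-row frame.
# With the fix in place this expands the frame to len(data) rows.
data = [1, 2, 3]
df.loc[:, "x"] = data

print(df.shape)              # number of rows should now match len(data)
print(df["x"].tolist())      # the assigned values
print(df["y"].isna().all())  # the untouched column is filled with missing values
```

As far as the diff shows, without the new `elif` this case would reach the trailing `else` branch and raise `ValueError`.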