Skip to content

BUG: DataFrame.__setitem__ raising ValueError with string indexer and empty df and df to set #38931

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jan 6, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.3.0.rst
Original file line number Diff line number Diff line change
Expand Up @@ -244,6 +244,7 @@ Indexing
- Bug in :meth:`CategoricalIndex.get_indexer` failing to raise ``InvalidIndexError`` when non-unique (:issue:`38372`)
- Bug in inserting many new columns into a :class:`DataFrame` causing incorrect subsequent indexing behavior (:issue:`38380`)
- Bug in :meth:`DataFrame.loc`, :meth:`Series.loc`, :meth:`DataFrame.__getitem__` and :meth:`Series.__getitem__` returning incorrect elements for non-monotonic :class:`DatetimeIndex` for string slices (:issue:`33146`)
- Bug in :meth:`DataFrame.__setitem__` raising ``ValueError`` with empty :class:`DataFrame` and specified columns for string indexer and non empty :class:`DataFrame` to set (:issue:`38831`)
- Bug in :meth:`DataFrame.iloc.__setitem__` and :meth:`DataFrame.loc.__setitem__` with mixed dtypes when setting with a dictionary value (:issue:`38335`)
- Bug in :meth:`DataFrame.loc` dropping levels of :class:`MultiIndex` when :class:`DataFrame` used as input has only one row (:issue:`10521`)
-
Expand Down
15 changes: 8 additions & 7 deletions pandas/core/frame.py
Original file line number Diff line number Diff line change
Expand Up @@ -3334,13 +3334,14 @@ def _ensure_valid_index(self, value):
"""
# GH5632, make sure that we are a Series convertible
if not len(self.index) and is_list_like(value) and len(value):
try:
value = Series(value)
except (ValueError, NotImplementedError, TypeError) as err:
raise ValueError(
"Cannot set a frame with no defined index "
"and a value that cannot be converted to a Series"
) from err
if not isinstance(value, DataFrame):
try:
value = Series(value)
except (ValueError, NotImplementedError, TypeError) as err:
raise ValueError(
"Cannot set a frame with no defined index "
"and a value that cannot be converted to a Series"
) from err

# GH31368 preserve name of index
index_copy = value.index.copy()
Expand Down
10 changes: 10 additions & 0 deletions pandas/tests/frame/indexing/test_setitem.py
Original file line number Diff line number Diff line change
Expand Up @@ -338,6 +338,16 @@ def test_setitem_bool_with_numeric_index(self, dtype):

tm.assert_index_equal(df.columns, expected_cols)

@pytest.mark.parametrize("indexer", ["B", ["B"]])
def test_setitem_frame_length_0_str_key(self, indexer):
# GH#38831
df = DataFrame(columns=["A", "B"])
other = DataFrame({"B": [1, 2]})
df[indexer] = other
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need/want to check any of the other indexers?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't know if this would be helpful, the string-DataFrame case is a special case while indexer in [list, Index, Series, array] is all the same (which was working previously)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, can you check df.loc[:, indexer] ? should be the same result i think.

Copy link
Member Author

@phofl phofl Jan 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No unfortunately not, in case of indexer=["B"] we align lhs and rhs, which leads to an empty result. This is losely related to our dicussion in #30439 @jbrockmendel

When indexer="B" this works as the case above

expected = DataFrame({"A": [np.nan] * 2, "B": [1, 2]})
expected["A"] = expected["A"].astype("object")
tm.assert_frame_equal(df, expected)


class TestDataFrameSetItemWithExpansion:
def test_setitem_listlike_views(self):
Expand Down