BUG: loc.setitem raising ValueError when df has duplicate columns #39278
phofl
commented
Jan 19, 2021
- closes BUG: Setting values to slice fails with duplicated column name #38521
- tests added / passed
- Ensure all linting tests pass, see here for how to run them
- whatsnew entry
Is there a more elegant way to check if an element occurs only once in an index?
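For reference, a minimal sketch of two ways to answer that question with public `Index` operations. The helper names `occurs_once` and `occurs_once_via_indexer` are hypothetical and not part of pandas; the second one mirrors the `get_indexer_for` check used in this PR.

```python
import pandas as pd


def occurs_once(index: pd.Index, label) -> bool:
    """Hypothetical helper: True if `label` appears exactly once in `index`."""
    # Elementwise comparison gives a boolean array; count the matches.
    return int((index == label).sum()) == 1


def occurs_once_via_indexer(index: pd.Index, label) -> bool:
    """Hypothetical helper using the same check as the diff in this PR."""
    # get_indexer_for returns one position per match. A label that is
    # absent from the index yields [-1], so this variant assumes the
    # label is known to be present.
    return len(index.get_indexer_for([label])) == 1


df = pd.DataFrame([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], columns=["a", "b", "b"])
a_once = occurs_once(df.columns, "a")
b_once = occurs_once(df.columns, "b")
```

Here `"a"` is unique among the columns while `"b"` is duplicated, so the two helpers agree on both labels. The comparison-based version also behaves sensibly for a label that is not in the index at all.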
@@ -373,6 +373,15 @@ def test_setitem_string_column_numpy_dtype_raising(self):
expected = DataFrame([[1, 2, 5], [3, 4, 6]], columns=[0, 1, "0 - Name"])
tm.assert_frame_equal(df, expected)

def test_setitem_empty_df_duplicate_columns(self):
Do we need tests for Series?
I don't think so. The indexer has to be a tuple to land there, which I think is only valid for DataFrames?
It's weird, but you could do series.loc[("foo",)]
Yep, definitely weird :)
But this is already converted before reaching that point.
df = Series(index=["a", "b", "b"], dtype="float64")
df.loc[("a", )] = 1
This results in indexer=0. Tested the same with a MultiIndex Series; also 0 there.
pandas/core/indexing.py
Outdated
@@ -1850,7 +1850,8 @@ def _setitem_single_block(self, indexer, value, name: str):
 for i, idx in enumerate(indexer)
 if i != info_axis
 )
-and item_labels.is_unique
+and len(item_labels.get_indexer_for([item_labels[indexer[info_axis]]]))
+== 1
Let's separate this out into a nested condition and define item_labels[indexer[info_axis]] once, then reuse it on L1856.
Done
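A rough sketch of the nested shape requested above, written as a standalone function rather than the actual `_setitem_single_block` body. The function name, the `selected_label` variable, and the null-slice check are assumptions for illustration only; the real method has more context that is not shown in this thread.

```python
import pandas as pd


def targets_single_unique_column(
    item_labels: pd.Index, indexer: tuple, info_axis: int
) -> bool:
    """Hypothetical stand-in for the reviewed condition, not pandas code."""
    # Outer condition: every axis other than info_axis takes a full slice.
    other_axes_are_full = all(
        isinstance(idx, slice) and idx == slice(None)
        for i, idx in enumerate(indexer)
        if i != info_axis
    )
    if other_axes_are_full:
        # Look up the selected label once, then reuse it below.
        selected_label = item_labels[indexer[info_axis]]
        # Nested condition: the label must occur exactly once.
        if len(item_labels.get_indexer_for([selected_label])) == 1:
            return True
    return False


labels = pd.Index(["a", "b", "b"])
unique_hit = targets_single_unique_column(labels, (slice(None), 0), info_axis=1)
duplicate_hit = targets_single_unique_column(labels, (slice(None), 1), info_axis=1)
```

With columns `["a", "b", "b"]`, position 0 selects the unique label `"a"` and passes, while position 1 selects the duplicated `"b"` and does not. The earlier `item_labels.is_unique` check would have rejected both, since the index as a whole contains duplicates.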
The CI failure is unrelated.
Thanks @phofl