BUG: loc.setitem raising ValueError when df has duplicate columns #39278

Merged
jreback merged 4 commits into pandas-dev:master from the 38521 branch
Jan 21, 2021

Member

phofl commented Jan 19, 2021

phofl added the Indexing label (related to indexing on series/frames, not to indexes themselves) Jan 19, 2021
Member Author

phofl commented Jan 19, 2021

Is there a more elegant way to check if an element occurs only once in an index?
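
For readers weighing that question, here is a minimal sketch comparing a few ways to test whether a label occurs exactly once in a possibly non-unique `Index`. The helper names are hypothetical and are not part of pandas or of this PR; only `Index.get_indexer_for`, `Index.get_loc`, and `pandas.api.types.is_integer` are real pandas calls.

```python
import pandas as pd
from pandas.api.types import is_integer


def occurs_once_indexer(index: pd.Index, label) -> bool:
    # get_indexer_for returns one position per match, so a length of 1
    # means a single match. Caveat: a missing label yields [-1], which
    # also has length 1, so this assumes the label is present.
    return len(index.get_indexer_for([label])) == 1


def occurs_once_get_loc(index: pd.Index, label) -> bool:
    # get_loc returns an integer for a single match and a slice or
    # boolean mask for repeated labels; it raises KeyError if missing.
    return is_integer(index.get_loc(label))


def occurs_once_mask(index: pd.Index, label) -> bool:
    # Count matches directly with an elementwise comparison.
    return int((index == label).sum()) == 1


columns = pd.Index(["a", "b", "b"])
for check in (occurs_once_indexer, occurs_once_get_loc, occurs_once_mask):
    print(check.__name__, check(columns, "a"), check(columns, "b"))
```

The first variant mirrors the check used in this PR; it is safe there because the label is taken from the index itself, so it cannot be missing.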

@@ -373,6 +373,15 @@ def test_setitem_string_column_numpy_dtype_raising(self):
 expected = DataFrame([[1, 2, 5], [3, 4, 6]], columns=[0, 1, "0 - Name"])
 tm.assert_frame_equal(df, expected)

+def test_setitem_empty_df_duplicate_columns(self):
Member

Do we need tests for Series?

Member Author

I don't think so. The indexer has to be a tuple to land there, which I think is only valid for DataFrames.

Member

It's weird, but you could do series.loc[("foo",)]

Member Author

Yep, definitely weird :)

But this is already converted before reaching that point.

df = Series(index=["a", "b", "b"], dtype="float64")
df.loc[("a", )] = 1

This results in indexer=0. I tested the same with a MultiIndex Series, and the indexer is 0 there as well.

@@ -1850,7 +1850,8 @@ def _setitem_single_block(self, indexer, value, name: str):
 for i, idx in enumerate(indexer)
 if i != info_axis
 )
-and item_labels.is_unique
+and len(item_labels.get_indexer_for([item_labels[indexer[info_axis]]]))
+== 1
Member

Let's separate this out into a nested condition and define item_labels[indexer[info_axis]] once, then reuse it on L1856.
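
As a rough illustration of that suggestion, the standalone sketch below splits the check into an outer condition and a nested one, and looks the column label up a single time. The function `can_set_whole_column` and its default `info_axis=1` are hypothetical stand-ins for this example; this is not the code that was merged into pandas.

```python
import pandas as pd


def can_set_whole_column(item_labels: pd.Index, indexer: tuple, info_axis: int = 1) -> bool:
    """Hypothetical stand-in for the reviewed condition, not pandas internals."""
    # Outer condition: every indexer except the column one is a full slice.
    others_are_full_slices = all(
        isinstance(idx, slice) and idx == slice(None)
        for i, idx in enumerate(indexer)
        if i != info_axis
    )
    if others_are_full_slices:
        # Nested condition: define the label once and reuse it.
        # Assumes indexer[info_axis] is an integer position.
        col = item_labels[indexer[info_axis]]
        # The label comes from item_labels itself, so it is never missing
        # and a length of 1 means it occurs exactly once.
        if len(item_labels.get_indexer_for([col])) == 1:
            return True
    return False


cols = pd.Index(["a", "b", "b"])
print(can_set_whole_column(cols, (slice(None), 0)))  # "a" is unique
print(can_set_whole_column(cols, (slice(None), 1)))  # "b" is duplicated
```

Compared with requiring `item_labels.is_unique`, this only asks that the targeted label be unique, so a frame with duplicate columns elsewhere still qualifies.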

Member Author

Done

Member Author

phofl commented Jan 20, 2021

The failure is unrelated.

jreback added this to the 1.3 milestone Jan 21, 2021
jreback added the Bug label Jan 21, 2021
jreback merged commit 0270b23 into pandas-dev:master Jan 21, 2021
Contributor

jreback commented Jan 21, 2021

thanks @phofl

phofl deleted the 38521 branch January 21, 2021 17:28

Successfully merging this pull request may close these issues.

BUG: Setting values to slice fails with duplicated column name