-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Clean up DataFrame.setitem behavior for duplicate columns #39403
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
pandas/core/indexers.py
Outdated
@@ -376,6 +376,25 @@ def unpack_1tuple(tup): | |||
return tup | |||
|
|||
|
|||
def check_key_length(columns, key, value): | |||
"""Checks if a key used as indexer has the same length as the columns it is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
newline after """
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thx
pandas/core/indexers.py
Outdated
|
||
Parameters | ||
---------- | ||
columns: The columns of the DataFrame to index. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
columns : Index
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed
If you're interested, there might be something worth salvaging from _setitem_array here. Avoids going through iloc so that we dont have to worry about inplace vs not-inplace there (xref #38896). If/when I get #39302 in shape to be a real PR, something like that branch will have to be a part of it, but I'm optimistic you'll get around to it before I do. |
That looks good. I will try to incorporate this after this is merged. |
pandas/core/indexers.py
Outdated
@@ -376,6 +376,27 @@ def unpack_1tuple(tup): | |||
return tup | |||
|
|||
|
|||
def check_key_length(columns, key, value): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you type as much as you can here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We would have to type wiht Index and DataFrame causing circular imports. Is there a way around this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do the import inside a if TYPE_CHECKING:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, should have said that I have already tried this. If using TYPE_CHECKING and typing the signature (not the return value) this raises
NameError: name 'DataFrame' is not defined
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did not know about the future import. Thx for the help
pandas/core/indexers.py
Outdated
|
||
Parameters | ||
---------- | ||
columns: Index The columns of the DataFrame to index. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
space between "columns" and colon, same on the next two lines
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thx
thanks @phofl was this technically a bug fix? or just an incorrect test? |
Bug fix yes, but is covered by the whatsnew from #39341, follow up from that one |
cc @jbrockmendel This cleans the edgy behavior up for duplicate columns.
Previous test was wrong.