BUG: DataFrame.loc not aligning dict when setting to a column #47361

Merged
mroeschke merged 6 commits into pandas-dev:main on Jun 30, 2022

Conversation

phofl
Member

@phofl phofl commented Jun 15, 2022

@phofl phofl added the Bug and Indexing (related to indexing on series/frames, not to indexes themselves) labels Jun 15, 2022
@phofl phofl requested a review from rhshadrach June 15, 2022 08:01
Member

@rhshadrach rhshadrach left a comment

Nice find that this already works on the multiblock case.

@rhshadrach rhshadrach added this to the 1.5 milestone Jun 16, 2022
Member

@rhshadrach rhshadrach left a comment

lgtm; cc @jreback

Contributor

@jreback jreback left a comment

lgtm, minor comment

@@ -4709,6 +4709,8 @@ def _sanitize_column(self, value) -> ArrayLike:
        # We should never get here with DataFrame value
        if isinstance(value, Series):
            return _reindex_for_setitem(value, self.index)
        elif isinstance(value, dict):
Contributor

is_dict_like ?

Member Author

I aligned this with a codepath in indexing.py

        if (isinstance(value, ABCSeries) and name != "iloc") or isinstance(value, dict):
            from pandas import Series

            value = self._align_series(indexer, Series(value))

Not sure why we are checking for dict specifically here, but shouldn't we update both places if we use is_dict_like here?
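
For illustration, here is a minimal sketch of what wrapping a dict in a Series and aligning it to the frame's index means in practice. It uses only the public `Series.reindex` and is not the private `_align_series` / `_reindex_for_setitem` code path itself; the frame, labels, and values are made up for the example.

```python
import pandas as pd

# Hypothetical frame; the labels and values are assumptions for illustration.
df = pd.DataFrame({"a": [1, 2, 3]}, index=["x", "y", "z"])

# Dict keys are in a different order than the index, and "y" is missing.
value = {"z": 30, "x": 10}

# Wrap the dict in a Series (keys become the index), then align it to the
# frame's index. Labels absent from the dict become NaN.
aligned = pd.Series(value).reindex(df.index)

print(aligned)
```

The point of aligning is that each value lands on the row whose label matches its key, rather than being placed by position.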

Member

is_dict_like returns True for Series, so it depends on whether the name != "iloc" check is supposed to exclude a specific Series case?
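
A quick way to see the difference between the two checks, using the public `pandas.api.types.is_dict_like`. This is only a sketch with made-up sample objects:

```python
import pandas as pd
from pandas.api.types import is_dict_like

ser = pd.Series([1, 2])
plain_dict = {"a": 1}
plain_list = [1, 2]

# is_dict_like is duck-typed, so it accepts a Series as well as a dict.
dict_like_results = {
    "dict": is_dict_like(plain_dict),
    "Series": is_dict_like(ser),
    "list": is_dict_like(plain_list),
}

# isinstance(..., dict) is strict, so a Series does not pass.
isinstance_results = {
    "dict": isinstance(plain_dict, dict),
    "Series": isinstance(ser, dict),
    "list": isinstance(plain_list, dict),
}

print(dict_like_results)
print(isinstance_results)
```

So swapping `isinstance(value, dict)` for `is_dict_like(value)` in a combined condition would also let a Series through that branch, regardless of the `name != "iloc"` guard.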

Member Author

Ah, good point. Yes, iloc should not align, hence the condition.

Member Author

Looks like we end up there with a DataFrame, which raises. Will investigate in a follow-up how this happens.

@mroeschke mroeschke merged commit cd2b819 into pandas-dev:main Jun 30, 2022
@mroeschke
Member

Thanks @phofl

@phofl phofl deleted the 47216 branch July 1, 2022 07:22
yehoshuadimarsky pushed a commit to yehoshuadimarsky/pandas that referenced this pull request Jul 13, 2022
…-dev#47361)

* BUG: DataFrame.loc not aligning dict when setting to a column

* Add partial case

* Use is_dict_like

* Revert "Use is_dict_like"

This reverts commit d270851.
Development

Successfully merging this pull request may close these issues.

BUG: Assigning dictionary to series using .loc produces random results.
4 participants