
BUG: loc.setitem raising when expanding empty frame with array value #50065


Closed
wants to merge 2 commits into from

Conversation

phofl
Member

@phofl phofl commented Dec 5, 2022

@phofl phofl added Bug Indexing Related to indexing on series/frames, not to indexes themselves labels Dec 5, 2022
# GH#49972
result = DataFrame()
result.loc[0, 0] = np.asarray([0])
expected = DataFrame({0: [0.0]})
Member

So expected.iloc[0] will return a list and not necessarily the np.array? i.e. the ndarray is coerced to a list and then stored?

Member

i think expected.iloc[0, 0] here is a scalar, which is probably 1) what the user in 49972 wanted and 2) technically incorrect. i'd expect a 1-D ndarray containing a zero

Member Author

Hm, in principle I agree, but this is now consistent with

df = DataFrame({0: [0]})
df.loc[1, 0] = np.asarray([0])

and

df = DataFrame({0: [0]})
df.loc[0, 1] = np.asarray([0])

which are similar expanding cases. This points to the bug being somewhere else regarding scalar vs array
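To compare the three expanding cases side by side, a minimal sketch along these lines may help. The helper names `cell_kind` and `try_expand` are hypothetical, not part of pandas, and the sketch deliberately does not assert which outcome is correct, since what gets stored (or whether the assignment raises) depends on the installed pandas version.

```python
import numpy as np
import pandas as pd


def cell_kind(value):
    """Classify a stored cell as 'array', 'list', or 'scalar' (hypothetical helper)."""
    if isinstance(value, np.ndarray):
        return "array"
    if isinstance(value, list):
        return "list"
    return "scalar"


def try_expand(make_frame, row, col):
    """Set a length-1 ndarray at (row, col) via .loc and report what ended up stored."""
    df = make_frame()
    try:
        df.loc[row, col] = np.asarray([0])
        return cell_kind(df.loc[row, col])
    except Exception as exc:  # outcome is version-dependent, so report instead of failing
        return f"raised {type(exc).__name__}"


# Each case builds a fresh frame so the assignments cannot affect one another.
cases = {
    "new row": (lambda: pd.DataFrame({0: [0]}), 1, 0),
    "new column": (lambda: pd.DataFrame({0: [0]}), 0, 1),
    "empty frame": (lambda: pd.DataFrame(), 0, 0),
}
results = {name: try_expand(*args) for name, args in cases.items()}
for name, outcome in results.items():
    print(f"{name}: {outcome}")
```

If all three lines report the same kind, the cases are at least consistent with each other, which is the property being argued for here.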

Member

that's reasonable. maybe let's see what it would take to fix the two cases you mention here and whether those are isolated-ish sketchy cases before adding another one?

if moving forward with this, should add a comment in test/code about living with technically-wrong behavior

Member Author

@phofl phofl Dec 9, 2022

There is a general inconsistency that might be expected? Not sure.

If you have a column with a numeric dtype

df = DataFrame({0: [0]})

then

df.loc[1, 0] = np.asarray([0])

unpacks and sets a scalar. If the column has object dtype,

df = DataFrame({0: ["a"]})

we set the array. Is this expected?
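The numeric versus object comparison can be checked with a small sketch like the one below. `stored_after_expand` is a hypothetical helper, the dtypes are pinned explicitly so the starting point is unambiguous, and no particular result is asserted because the behavior may differ between pandas versions.

```python
import numpy as np
import pandas as pd


def stored_after_expand(column):
    """Add row 1 via .loc with a length-1 ndarray; return (dtype before, dtype after, outcome)."""
    df = pd.DataFrame({0: column})
    before = str(df[0].dtype)
    try:
        df.loc[1, 0] = np.asarray([0])
        # Report the Python type of whatever ended up in the new cell.
        outcome = type(df.loc[1, 0]).__name__
    except Exception as exc:  # version-dependent, so report rather than fail
        outcome = f"raised {type(exc).__name__}"
    return before, str(df[0].dtype), outcome


report = {
    "numeric": stored_after_expand(pd.Series([0], dtype="int64")),
    "object": stored_after_expand(pd.Series(["a"], dtype=object)),
}
for label, (before, after, outcome) in report.items():
    print(f"{label}: dtype {before} -> {after}, new cell: {outcome}")
```

A cell reported as `ndarray` means the array was stored as-is; any scalar type name means it was unpacked.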

Member

Is this expected?

uhh maybe? with object-dtype it is more plausible that the user is intentionally trying to set an array as a value.

@jbrockmendel
Member

@phofl worth talking about in today's meeting?

@github-actions
Contributor

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

@github-actions github-actions bot added the Stale label Jan 14, 2023
@simonjayhawkins
Member

Thanks for the pull request, but it appears to have gone stale. If interested in continuing, please merge in the main branch, address any review comments and/or failing tests, and we can reopen.

Successfully merging this pull request may close these issues.

BUG: Cannot use .loc to set a ndarray as the value of an empty dataframe
4 participants