BUG: fix Frame getitem when column keys are nested tuples #43939

Closed
wants to merge 2 commits into from

@Svanazar Svanazar commented Oct 9, 2021

Multi-column selection raised an error when the column labels are nested tuples, for example atomic labels of the form ((1,),) or a MultiIndex with tuples at each level, as reported in the issue. The cause is a condition in asarray_tuplesafe.

A key list of the form [((1,),)] is converted to [[[1]]] by the call to np.array, giving ndim > 2. However, only the ndim == 2 case was handled further, so a multi-dimensional array was returned and an error followed.

This extends the ndim == 2 condition to nested-tuple keys but excludes genuine multi-dimensional arrays, so that errors are still raised in accordance with test_getitem_ndarray_3d.
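
The dimensionality described above can be checked directly with NumPy. This is a minimal illustration, assuming the nested key is written ((1,),) as in the first paragraph; the variable names are illustrative and do not come from pandas.

```python
import numpy as np

# Keys that are plain 1-tuples: NumPy builds a 2-D array.
flat_keys = [(1,), (2,)]
flat = np.array(flat_keys)

# Keys that are tuples of tuples: NumPy recurses one level deeper,
# so the result has three dimensions instead of two.
nested_keys = [((1,),), ((2,),)]
nested = np.array(nested_keys)

print(flat.ndim, flat.shape)
print(nested.ndim, nested.shape)
```

Only the first case has ndim == 2, which is the case the description says was handled before this change.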

…43780

Multi-column selection when the column keys are nested tuples (for
example when each level of a MultiIndex holds tuples) raised an error
due to a condition in asarray_tuplesafe.

A key list of the form [((1,),)] converts to [[[1]]] through the call to
np.array, resulting in ndim > 2. However, only the case ndim == 2 was
further handled, causing a multi-dimensional array to be returned and
subsequently raising an error.

This commit extends the ndim == 2 condition to these types of keys.
@@ -240,7 +240,8 @@ def asarray_tuplesafe(values, dtype: NpDtype | None = None) -> np.ndarray:
     if issubclass(result.dtype.type, str):
         result = np.asarray(values, dtype=object)
 
-    if result.ndim == 2:
+    # For ndim > 2 distinguish between nested tuples and multidimensional arrays

why are these coerced to in the first place? we need to stop that from happening
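
One way to read this comment is that the keys could be kept as opaque Python objects instead of being passed through np.array. The sketch below illustrates that idea only; keys_to_object_array is a hypothetical helper, not a pandas function, and it is not the change proposed in this PR.

```python
import numpy as np


def keys_to_object_array(keys):
    """Hypothetical helper: store each key as one element of a 1-D object array."""
    out = np.empty(len(keys), dtype=object)
    for i, key in enumerate(keys):
        # Element-wise assignment stores the tuple itself, so NumPy never
        # unpacks the nesting into extra dimensions.
        out[i] = key
    return out


arr = keys_to_object_array([((1,),), ((2,),)])
print(arr.ndim, arr.shape)
```

With this approach the result stays one-dimensional regardless of how deeply each key is nested.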

@jreback added the Indexing, MultiIndex, and Nested Data labels Oct 10, 2021
@github-actions

This pull request is stale because it has been open for thirty days with no activity. Please update or respond to this comment if you're still interested in working on this.

@github-actions bot added the Stale label Nov 10, 2021

jreback commented Nov 14, 2021

@Svanazar can you merge master and address the review comments?

@Svanazar

Hey, extremely sorry for the delay, this totally slipped my mind. I'll be staying a bit busy this week, so is it okay if I turn this into a draft and finish it during the next week? (in case the issue doesn't get resolved by then).
I'm really sorry for the long gap.

@Svanazar marked this pull request as draft November 29, 2021 10:19

jreback commented Jan 16, 2022

status here?

@Svanazar

I won't be able to work on this; really sorry for not informing earlier. Should I close the PR?

Labels: Indexing, MultiIndex, Nested Data, Stale