
BUG: df.iloc not working with NamedTuple #57004


Closed
wants to merge 3 commits into from

Conversation

yuanx749
Contributor

@yuanx749 yuanx749 commented Jan 22, 2024

@rhshadrach I kept the type(key) is tuple since index can contain namedtuple, used as key in df.loc[key], as shown in the test case below:

def test_loc_getitem_index_namedtuple(self):
    IndexType = namedtuple("IndexType", ["a", "b"])
    idx1 = IndexType("foo", "bar")
    idx2 = IndexType("baz", "bof")
    index = Index([idx1, idx2], name="composite_index", tupleize_cols=False)
    df = DataFrame([(1, 2), (3, 4)], index=index, columns=["A", "B"])
    result = df.loc[IndexType("foo", "bar")]["A"]
    assert result == 1

@rhshadrach
Member

> I kept the type(key) is tuple since index can contain namedtuple, used as key in df.loc[key], as shown in the test case below:

Sorry, I don't understand. In particular, the test you highlighted passes with this patch applied and from your comment above I expected it not to.

@yuanx749
Contributor Author

I mean if we only change type(key) is tuple to isinstance(key, tuple), this particular test will fail.

Because in this test, namedtuples are elements of the index, and are used as labels in df.loc. It was handled by the else branch in the original code below.

pandas/pandas/core/indexing.py

Lines 1180 to 1192 in 7281475

if type(key) is tuple:
    key = tuple(list(x) if is_iterator(x) else x for x in key)
    key = tuple(com.apply_if_callable(x, self.obj) for x in key)
    if self._is_scalar_access(key):
        return self.obj._get_value(*key, takeable=self._takeable)
    return self._getitem_tuple(key)
else:
    # we by definition only have the 0th axis
    axis = self.axis or 0
    maybe_callable = com.apply_if_callable(key, self.obj)
    maybe_callable = self._check_deprecated_callable_usage(key, maybe_callable)
    return self._getitem_axis(maybe_callable, axis=axis)
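
To make the difference between the two checks concrete, here is a minimal pure-Python sketch with no pandas involved. The `dispatch_exact` and `dispatch_isinstance` helpers are hypothetical stand-ins for the branch selection quoted above, not pandas functions; they only report which branch a key would reach.

```python
from collections import namedtuple

IndexType = namedtuple("IndexType", ["a", "b"])


def dispatch_exact(key):
    """Mimic `type(key) is tuple`: only plain tuples reach the tuple branch."""
    return "tuple branch" if type(key) is tuple else "else branch"


def dispatch_isinstance(key):
    """Mimic `isinstance(key, tuple)`: tuple subclasses reach the tuple branch too."""
    return "tuple branch" if isinstance(key, tuple) else "else branch"


plain = ("foo", "bar")
named = IndexType("foo", "bar")

print(dispatch_exact(plain))        # plain tuple: handled as a multi-axis key
print(dispatch_exact(named))        # namedtuple: falls through to the else branch
print(dispatch_isinstance(named))   # namedtuple: now routed to the tuple branch
```

Under the exact-type check a namedtuple key is treated as a single label on axis 0, which is what the test relies on. Switching to `isinstance` would route the same key into the (row, column) path instead.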

@rhshadrach
Member

Thanks - makes sense. I think this might need further discussion in that case. I do not see a way to support both tuples (and subclasses of tuple) as labels in a flat Index (i.e. not a MultiIndex) and tuples as keys for accessing (row, column) pairs. Indeed, the following variant of that test using plain tuples fails on main:

idx1 = ("foo", "bar")
idx2 = ("baz", "bof")
index = Index([idx1, idx2], name="composite_index", tupleize_cols=False)
df = DataFrame([(1, 2), (3, 4)], index=index, columns=["A", "B"])
result = df.loc[("foo", "bar")]["A"]
assert result == 1

The only way I can see making it all work is to try one and if it fails try the other - which I think we want to avoid.
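
The ambiguity can be illustrated with a toy model in plain Python. This is only a sketch: the `rows` dict and the three `lookup_*` helpers are hypothetical and do not reflect how pandas stores or resolves labels. They show that the same tuple key can be read either as one row label or as a (row, column) pair, and what a "try one, then the other" fallback would look like.

```python
# Toy model only: a flat "index" whose row labels are themselves tuples.
rows = {
    ("foo", "bar"): {"A": 1, "B": 2},
    ("baz", "bof"): {"A": 3, "B": 4},
}


def lookup_as_label(key):
    """Treat the whole tuple as a single row label."""
    return rows[key]


def lookup_as_row_col(key):
    """Treat the tuple as a (row, column) pair."""
    row, col = key
    return rows[row][col]


def lookup_try_both(key):
    """Fallback strategy: try (row, column) first, then a single row label."""
    try:
        return lookup_as_row_col(key)
    except KeyError:
        return lookup_as_label(key)


print(lookup_try_both(("foo", "bar")))         # resolved as a row label
print(lookup_try_both((("foo", "bar"), "A")))  # resolved as (row, column)
```

In this sketch the fallback only works because `"foo"` happens not to be a row label. If it were, the first interpretation would succeed (or raise for a different reason) and the meaning of the key would depend on the data, which is the kind of behavior the comment above wants to avoid.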

While we don't run into this problem with iloc, I wonder if we want to avoid supporting subclasses with iloc but not loc.

cc @jbrockmendel @phofl

@simonjayhawkins added the Bug and Indexing (Related to indexing on series/frames, not to indexes themselves) labels Feb 7, 2024
Contributor

github-actions bot commented Mar 9, 2024

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

@github-actions github-actions bot added the Stale label Mar 9, 2024
@yuanx749 yuanx749 closed this Mar 9, 2024
@yuanx749 yuanx749 deleted the iloc-tuple branch March 9, 2024 08:40
Development

Successfully merging this pull request may close these issues.

BUG: Using NamedTuples with .iloc fails
3 participants