BUG: DataFrame.at with non-unique axes #33047

Merged
merged 6 commits into from
Apr 15, 2020
1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.1.0.rst
@@ -467,6 +467,7 @@ Indexing
 - Bug in :meth:`DatetimeIndex.get_loc` raising ``KeyError`` with converted-integer key instead of the user-passed key (:issue:`31425`)
 - Bug in :meth:`Series.xs` incorrectly returning ``Timestamp`` instead of ``datetime64`` in some object-dtype cases (:issue:`31630`)
 - Bug in :meth:`DataFrame.iat` incorrectly returning ``Timestamp`` instead of ``datetime`` in some object-dtype cases (:issue:`32809`)
+- Bug in :meth:`DataFrame.at` when either columns or index is non-unique (:issue:`33041`)
 - Bug in :meth:`Series.loc` and :meth:`DataFrame.loc` when indexing with an integer key on a object-dtype :class:`Index` that is not all-integers (:issue:`31905`)
 - Bug in :meth:`DataFrame.iloc.__setitem__` on a :class:`DataFrame` with duplicate columns incorrectly setting values for all matching columns (:issue:`15686`, :issue:`22036`)
 - Bug in :meth:`DataFrame.loc:` and :meth:`Series.loc` with a :class:`DatetimeIndex`, :class:`TimedeltaIndex`, or :class:`PeriodIndex` incorrectly allowing lookups of non-matching datetime-like dtypes (:issue:`32650`)
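
For context, here is a minimal sketch of the behavior the new changelog entry describes. It assumes pandas 1.1 or later, where this fix is included; the frames and values are illustrative and not taken from the PR. With duplicate column labels, `.at` falls back to `.loc`, so it returns a Series rather than a scalar.

```python
import numpy as np
import pandas as pd

# Illustrative data: two columns share the label "A".
df = pd.DataFrame(np.arange(6.0).reshape(3, 2), columns=["A", "A"])

# Non-unique columns: .at falls back to .loc, so one row label plus one
# duplicated column label selects every matching column.
dup_result = df.at[0, "A"]

# Unique axes: .at keeps its usual scalar path.
unique_df = pd.DataFrame(np.arange(6.0).reshape(3, 2), columns=["A", "B"])
scalar_result = unique_df.at[0, "A"]

print(type(dup_result).__name__)
print(scalar_result)
```

The result for the duplicated label carries one entry per matching column, which is what the PR's test compares against `df.iloc[0]`.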
32 changes: 25 additions & 7 deletions pandas/core/indexing.py
@@ -2045,6 +2045,7 @@ def __setitem__(self, key, value):
             key = _tuplify(self.ndim, key)
         if len(key) != self.ndim:
             raise ValueError("Not enough indexers for scalar access (setting)!")
+
         key = list(self._convert_key(key, is_setter=True))
         self.obj._set_value(*key, value=value, takeable=self._takeable)

@@ -2064,15 +2065,32 @@ def _convert_key(self, key, is_setter: bool = False):

         return key

+    @property
+    def _axes_are_unique(self) -> bool:
+        # Only relevant for self.ndim == 2
+        assert self.ndim == 2
+        return self.obj.index.is_unique and self.obj.columns.is_unique
Contributor

Can you add this assertion explicitly?

Member Author

Updated; CI is green.
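
The property under discussion combines two public `Index.is_unique` checks. A rough standalone equivalent might look like the sketch below, where `axes_are_unique` is a hypothetical helper name and not part of pandas. It has no `ndim` assertion because it only accepts a DataFrame.

```python
import pandas as pd


def axes_are_unique(frame: pd.DataFrame) -> bool:
    """Hypothetical helper mirroring the check in the diff via public API."""
    return bool(frame.index.is_unique and frame.columns.is_unique)


unique = pd.DataFrame([[1, 2], [3, 4]], columns=["A", "B"])
dup_columns = pd.DataFrame([[1, 2], [3, 4]], columns=["A", "A"])
dup_index = dup_columns.T  # transposing moves the duplicate labels to the index

for name, frame in [("unique", unique), ("dup_columns", dup_columns), ("dup_index", dup_index)]:
    print(name, axes_are_unique(frame))
```

Either axis being non-unique is enough to make the check fail, which is why the PR's test exercises both `df` and `df.T`.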


     def __getitem__(self, key):
-        if self.ndim != 1 or not is_scalar(key):
-            # FIXME: is_scalar check is a kludge
-            return super().__getitem__(key)

-        # Like Index.get_value, but we do not allow positional fallback
-        obj = self.obj
-        loc = obj.index.get_loc(key)
-        return obj.index._get_values_for_loc(obj, loc, key)
+        if self.ndim == 2 and not self._axes_are_unique:
+            # GH#33041 fall back to .loc
+            if not isinstance(key, tuple) or not all(is_scalar(x) for x in key):
+                raise ValueError("Invalid call for scalar access (getting)!")
+            return self.obj.loc[key]
Member

One problem with this, though, is that it allows any kind of indexer in this case (not only scalar ones), which relaxes the requirements for .at.

Member Author

Updated to address this issue.

+        return super().__getitem__(key)
+
+    def __setitem__(self, key, value):
+        if self.ndim == 2 and not self._axes_are_unique:
+            # GH#33041 fall back to .loc
+            if not isinstance(key, tuple) or not all(is_scalar(x) for x in key):
+                raise ValueError("Invalid call for scalar access (setting)!")
+
+            self.obj.loc[key] = value
+            return
+
+        return super().__setitem__(key, value)


 @doc(IndexingMixin.iat)
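
A short sketch of the setter path, assuming pandas 1.1 or later. The frame is illustrative; the expected values follow the PR's own test rather than any additional guarantee.

```python
import numpy as np
import pandas as pd

# Illustrative frame with a duplicated column label.
df = pd.DataFrame(np.zeros((3, 2)), columns=["A", "A"])

# Falls back to .loc: both columns labelled "A" in row 1 are set.
df.at[1, "A"] = 2

# A non-scalar key is rejected before .loc is reached.
try:
    df.at[:, "A"] = 1
    slice_rejected = False
except ValueError:
    slice_rejected = True

print(df.iloc[1].tolist())
print(slice_rejected)
```

Because the scalar check runs first, the fallback does not turn `.at` into a general-purpose setter; slices and list-likes still fail.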
40 changes: 40 additions & 0 deletions pandas/tests/indexing/test_scalar.py
@@ -128,6 +128,46 @@ def test_imethods_with_dups(self):
         result = df.iat[2, 0]
         assert result == 2

+    def test_frame_at_with_duplicate_axes(self):
+        # GH#33041
+        arr = np.random.randn(6).reshape(3, 2)
+        df = DataFrame(arr, columns=["A", "A"])
+
+        result = df.at[0, "A"]
+        expected = df.iloc[0]
+
+        tm.assert_series_equal(result, expected)
+
+        result = df.T.at["A", 0]
+        tm.assert_series_equal(result, expected)
+
+        # setter
+        df.at[1, "A"] = 2
+        expected = Series([2.0, 2.0], index=["A", "A"], name=1)
+        tm.assert_series_equal(df.iloc[1], expected)
+
+    def test_frame_at_with_duplicate_axes_requires_scalar_lookup(self):
+        # GH#33041 check that falling back to loc doesn't allow non-scalar
+        # args to slip in
+
+        arr = np.random.randn(6).reshape(3, 2)
+        df = DataFrame(arr, columns=["A", "A"])
+
+        msg = "Invalid call for scalar access"
+        with pytest.raises(ValueError, match=msg):
+            df.at[[1, 2]]
+        with pytest.raises(ValueError, match=msg):
+            df.at[1, ["A"]]
+        with pytest.raises(ValueError, match=msg):
+            df.at[:, "A"]
+
+        with pytest.raises(ValueError, match=msg):
+            df.at[[1, 2]] = 1
+        with pytest.raises(ValueError, match=msg):
+            df.at[1, ["A"]] = 1
+        with pytest.raises(ValueError, match=msg):
+            df.at[:, "A"] = 1
+
     def test_series_at_raises_type_error(self):
         # at should not fallback
         # GH 7814