Skip to content

BUG-20629 allow .at accessor with CategoricalIndex #26298

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
May 20, 2019
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v0.25.0.rst
Original file line number Diff line number Diff line change
Expand Up @@ -262,7 +262,7 @@ Bug Fixes
Categorical
^^^^^^^^^^^

-
- Bug in :func:`DataFrame.at` and :func:`Series.at` that would raise exception if the index was a :class:`CategoricalIndex` (:issue:`20629`)
-
-

Expand Down
16 changes: 11 additions & 5 deletions pandas/core/frame.py
Original file line number Diff line number Diff line change
Expand Up @@ -2710,13 +2710,19 @@ def _get_value(self, index, col, takeable=False):

try:
return engine.get_value(series._values, index)
except KeyError as e:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually don't need the e here (just do raise) is enough

if self.index.nlevels > 1:
raise e # partial indexing forbidden
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you put the comment on a separate line before the raise

else:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't this else surperflous?

pass # GH 20629
except (TypeError, ValueError):
pass

# we cannot handle direct indexing
# use positional
col = self.columns.get_loc(col)
index = self.index.get_loc(index)
return self._get_value(index, col, takeable=True)
# we cannot handle direct indexing
# use positional
col = self.columns.get_loc(col)
index = self.index.get_loc(index)
return self._get_value(index, col, takeable=True)
_get_value.__doc__ = get_value.__doc__

def set_value(self, index, col, value, takeable=False):
Expand Down
2 changes: 2 additions & 0 deletions pandas/core/indexes/category.py
Original file line number Diff line number Diff line change
Expand Up @@ -498,6 +498,8 @@ def get_value(self, series, key):
k = self._convert_scalar_indexer(k, kind='getitem')
indexer = self.get_loc(k)
return series.iloc[indexer]
except AttributeError:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, what is the attribute error you are catching?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The above line, return series.iloc[indexer], assumes the series argument has implemented iloc. The regular index version does not make this assumption. In addition, the doc for this function reads "Fast lookup of value from 1-dimensional ndarray" so this assumption should be relaxed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok,, I would then put the AttributeError catching with the Key and Type error below then.?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The key and type errors catch exceptions from self.get_loc, which usually happens when using indexing like s[s>5] leading to a key which is like a list of booleans.

The exceptions cannot use super.get_value(series, key) because its implementation does not work for CategoricalIndex when the key is a scalar value in the index (although it does work when the key is a positional index).

They also cannot all use super.get_value(series, indexer) because the creation of indexer is contingent on self.get_loc not failing.

Therefore, two cases are needed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The above line, return series.iloc[indexer], assumes the series argument has implemented iloc. The regular index version does not make this assumption.

ok so IIUC, series can actually be an Index or a Series? if so, then just change to using .take() I think should work. Can you try and type annotate things (and update the doc-strings).

I don't think we would need the AttributeError catching then.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. I've applied your recommended changes and added type information (the case where series is an ndarray was actually the case in the original issue)

return super().get_value(series, indexer)
except (KeyError, TypeError):
pass

Expand Down
8 changes: 8 additions & 0 deletions pandas/tests/indexing/test_categorical.py
Original file line number Diff line number Diff line change
Expand Up @@ -719,3 +719,11 @@ def test_map_with_dict_or_series(self):
output = cur_index.map(mapper)
# Order of categories in output can be different
tm.assert_index_equal(expected, output)

def test_at_with_categorical_index(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test both loc and at with here

# GH 20629
s = Series([1, 2, 3], index=pd.CategoricalIndex(["A", "B", "C"]))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move this after the loc tests

assert s.at['A'] == 1
df = DataFrame([[1, 2], [3, 4], [5, 6]],
index=pd.CategoricalIndex(["A", "B", "C"]))
assert df.at['B', 1] == 4