Skip to content

API/BUG: make .at raise same exceptions as .loc #31724

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Feb 6, 2020

Conversation

jbrockmendel
Copy link
Member

@jbrockmendel jbrockmendel commented Feb 5, 2020

This also (very) indirectly addresses #31683 which in turn will let us get rid of CategoricalIndex.get_value altogether.

@jbrockmendel jbrockmendel added the Indexing Related to indexing on series/frames, not to indexes themselves label Feb 5, 2020
@@ -63,7 +63,8 @@ Backwards incompatible API changes
- :meth:`DataFrameGroupby.mean` and :meth:`SeriesGroupby.mean` (and similarly for :meth:`~DataFrameGroupby.median`, :meth:`~DataFrameGroupby.std`` and :meth:`~DataFrameGroupby.var``)
now raise a ``TypeError`` if a not-accepted keyword argument is passed into it.
Previously a ``UnsupportedFunctionCall`` was raised (``AssertionError`` if ``min_count`` passed into :meth:`~DataFrameGroupby.median``) (:issue:`31485`)

- :meth:`DataFrame.at` and :meth:`Series.at` will raise a ``TypeError`` instead of a ``ValueError`` if an incompatible key is passed, matching the behavior of ``.loc[]`` (:issue:`31722`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should mention the KeyError here as well?

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you check some timings for at? (as this is supposed to be the fast one)

How does this indirectly also address #31683 ? And if it does, shouldn't there be tests involving categorical / a whatsnew?

msg = (
"At based indexing on an non-integer index can only have "
"non-integer indexers"
"cannot do label indexing on <class 'pandas.core.indexes.base.Index'> "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to say that I found the previous error message more readable ..

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I generally agree, that suggests we should improve the existing messages for "loc"

@jbrockmendel
Copy link
Member Author

Can you check some timings for at? (as this is supposed to be the fast one)

Yah we take a hit

In [2]: ser = pd.Series(range(10000))                                           
In [3]: %timeit ser.at[501]                                                     
4.54 µs ± 139 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)  # <-- master
7.8 µs ± 75.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)  # <-- PR

In [4]: df = ser.to_frame("A")                                                  
In [5]: %timeit df.at[501, "A"]                                                 
3.9 µs ± 25.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)  # <-- master
7.71 µs ± 99.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)  # <-- PR

We might get some of that back from #31709.

@jbrockmendel
Copy link
Member Author

How does this indirectly also address #31683 ? And if it does, shouldn't there be tests involving categorical / a whatsnew?

With this in place, all the places from which index.get_value are called are preceeded by maybe_cast_scalar_indexer (except one in which the key is a tuple). That makes the call inside CategoricalIndex.get_value unnecessary, which addresses the motivating issue for #31683. The "indexing on CategoricalIndex is hard" problem is still a problem.

@jreback jreback added this to the 1.1 milestone Feb 6, 2020
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consistency is a huge + here, makes the code much better.

I am not sure if @WillAyd or @jorisvandenbossche have comments, but lgtm.

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. parametrization will be a nice follow on

@jreback jreback merged commit e1ca66b into pandas-dev:master Feb 6, 2020
@jreback
Copy link
Contributor

jreback commented Feb 6, 2020

thanks @jbrockmendel

@jorisvandenbossche
Copy link
Member

@jreback maybe you didn't notice, but this time I had a "request changes" open, as you asked previous time to indicate that you shouldn't merge without me having the chance to answer back.
If there is still discussion going on, please give it some time.

@jorisvandenbossche
Copy link
Member

jorisvandenbossche commented Feb 7, 2020

@jbrockmendel can you do a follow up with:

  • fixing the performance regression
  • fixing the usability regression in the error message

@jorisvandenbossche
Copy link
Member

fixing the usability regression in the error message

Looking back at it, it's actually mainly the long fully qualified class name that is annoying, so that is an easy fix -> #31769

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Indexing Related to indexing on series/frames, not to indexes themselves
Projects
None yet
Development

Successfully merging this pull request may close these issues.

API/BUG: Inconsistent errors/msgs between loc vs at
4 participants