Skip to content

BUG: Improved error msg #32855

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 23 commits into from
Closed

BUG: Improved error msg #32855

wants to merge 23 commits into from

Conversation

a-y-khan
Copy link
Contributor

Hopefully this is a more informative message. The original bug report mentions that the index key also fails for Series but didn't report the error type, so I took my best guess at the reporter's intent in the test.

@jreback jreback added Error Reporting Incorrect or improved errors from pandas Indexing Related to indexing on series/frames, not to indexes themselves MultiIndex labels Mar 21, 2020
@a-y-khan
Copy link
Contributor Author

I made some changes. Please let me know if this is what you had mind. Thanks!

@a-y-khan a-y-khan requested a review from jreback March 23, 2020 04:30
@a-y-khan a-y-khan changed the title Improved error msg BUG: Improved error msg Mar 29, 2020
s = pd.Series(np.linspace(1, 2, 2), index=pd.MultiIndex.from_tuples(idx))

with pytest.raises(KeyError, match=r".+? indexing with .+? failing for MultiIndex"):
df.at["a_row", "A"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a similar problem for at.__setitem__?

Copy link
Contributor Author

@a-y-khan a-y-khan Apr 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multiindex with at.__setitem__ doesn't raise an exception and sets items at level 0:

In [1]: import pandas as pd 
   ...: import numpy as np 
   ...:  
   ...: cols = [("a_col", chr(i + 65)) for i in range(2)] 
   ...: idx = [("a_row", chr(i + 65)) for i in range(2)] 
   ...: df = pd.DataFrame( 
   ...:     np.linspace(1, 4, 4).reshape(2, 2), 
   ...:     index=pd.MultiIndex.from_tuples(idx), 
   ...:     columns=pd.MultiIndex.from_tuples(cols), 
   ...: ) 
   ...: display(df)
                                                                                                           
        a_col     
            A    B
a_row A   1.0  2.0
      B   3.0  4.0

In [2]: df.at["a_row", "A"] = 888.0 
   ...: display(df)
                                                                                                           
         a_col       
             A      B
a_row A  888.0  888.0
      B    3.0    4.0

In [3]: df.at["a_row", "A"] = [888.0, 999.0] 
   ...: display(df)
                                                                                                           
         a_col       
             A      B
a_row A  888.0  999.0
      B    3.0    4.0

if isinstance(self.obj.index, ABCMultiIndex):
raise KeyError(
f"Detected KeyError {err}, indexing with {key} "
"failing for MultiIndex"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doing this on master, the DataFrame case is giving a reasonable message, while the Series case is unhelpful. Maybe condition this on self.ndim == 1 and add a suggestion "pass a tuple"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It the Series case goes down a different code path where an exception is raised in pandas/utils/_validators.py. Would adding an except for TypeError with a more informative error message work?

In [3]: ser.at["a_row", "A"]                                                                                  
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-3-2aec4c8078ba> in <module>
----> 1 ser.at["a_row", "A"]

/workspaces/pandas-aykhan/pandas/core/indexing.py in __getitem__(self, key)
   2090             return self.obj.loc[key]
   2091 
-> 2092         return super().__getitem__(key)
   2093 
   2094     def __setitem__(self, key, value):

/workspaces/pandas-aykhan/pandas/core/indexing.py in __getitem__(self, key)
   2035         try:
   2036             # import pdb; pdb.set_trace()
-> 2037             return self.obj._get_value(*key, takeable=self._takeable)
   2038         except KeyError as err:
   2039             if isinstance(self.obj.index, ABCMultiIndex):

TypeError: _get_value() got multiple values for argument 'takeable'

df.at["a_row", "A"]

with pytest.raises(TypeError, match=r".+? got multiple values for argument .+?"):
ser.at["a_row", "A"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks like a bug. ser.at["a_row", "A"] should return 1.0, same as ser.loc["a_row", "A"]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this on my PR #32520, it returns 1.0.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can remove the series from the test. @rhshadrach is correct that the original context was DataFrames, and improving the error message should clarify why the KeyError is being raised.

@jreback
Copy link
Contributor

jreback commented May 25, 2020

I am not sure this is needed any longer as we merged #32520, but pls @a-y-khan if you can test and see if the original issue now works, or we still need something.

@jreback
Copy link
Contributor

jreback commented Jun 9, 2020

closing as stale, if you'd like to update as above, pls ping / open a new PR.

@jreback jreback closed this Jun 9, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Error Reporting Incorrect or improved errors from pandas Indexing Related to indexing on series/frames, not to indexes themselves MultiIndex
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bug: Pandas indexing with at/iat fails with multi-index
4 participants