Trying to access non-existent df column throws KeyError for wrong key #28799

Closed
jwhendy opened this issue Oct 5, 2019 · 3 comments · Fixed by #28809
Labels: good first issue, Needs Tests (unit test(s) needed to prevent regressions)

@jwhendy

jwhendy commented Oct 5, 2019

Code Sample, a copy-pastable example if possible

I ran into this in a more complicated groupby().apply(lambda: ...) call, and reproduced a simpler version below of what seems like a bug (or at least undesirable behavior).

import pandas as pd

# simple dataframe
test = pd.DataFrame({'var': ['a', 'a', 'b', 'b'], 'val': range(4)})

# simulated mistake of asking for a 'vau' column, not 'val'
test.groupby('var').apply(lambda rows: pd.DataFrame({'var': [rows.iloc[-1]['var']],
                                                     'val': [rows.iloc[-1]['vau']]}))

Doing this yields:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
~/.local/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_value(self, series, key)
   4735             try:
-> 4736                 return libindex.get_value_box(s, key)
   4737             except IndexError:

pandas/_libs/index.pyx in pandas._libs.index.get_value_box()

pandas/_libs/index.pyx in pandas._libs.index.get_value_at()

pandas/_libs/util.pxd in pandas._libs.util.get_value_at()

pandas/_libs/util.pxd in pandas._libs.util.validate_indexer()

TypeError: 'str' object cannot be interpreted as an integer

During handling of the above exception, another exception occurred:

----------  8<  [snip]  >8 ----------

KeyError: 'var'

Problem description

When attempting to troubleshoot this, KeyError: 'var' is misleading, as it doesn't point to the key that failed ('vau'), but to one that succeeded. In my actual code, I was doing this for more variables, so while this case is trivial to troubleshoot, real-world cases might be more confusing and challenging for the user to figure out.

Expected Output

A KeyError: 'foo' where foo is the incorrect key, not a successful one.
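
For comparison, here is a minimal sketch of the same lookup done directly on a row, outside of groupby().apply(). On recent pandas versions the KeyError names the key that was actually missing, which is the behavior expected above. The helper name lookup_error_key is hypothetical and only used for illustration.

```python
import pandas as pd

# Same frame as in the report above.
test = pd.DataFrame({"var": ["a", "a", "b", "b"], "val": range(4)})
last_row = test.iloc[-1]  # a Series indexed by column name


def lookup_error_key(row, key):
    """Return the key named by the KeyError for row[key], or None if the lookup succeeds.

    Hypothetical helper, not part of pandas.
    """
    try:
        row[key]
    except KeyError as err:
        return err.args[0]
    return None


# The misspelled key is the one reported; the valid key raises nothing.
print(lookup_error_key(last_row, "vau"))
print(lookup_error_key(last_row, "val"))
```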

Output of pd.show_versions()

pd.show_versions()

INSTALLED VERSIONS

commit : None
python : 3.7.4.final.0
python-bits : 64
OS : Linux
OS-release : 5.3.1-arch1-1-ARCH
machine : x86_64
processor :
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 0.25.1
numpy : 1.17.1
pytz : 2019.2
dateutil : 2.8.0
pip : 19.0.3
setuptools : 41.2.0
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.4.1
html5lib : 1.0.1
pymysql : None
psycopg2 : None
jinja2 : 2.10.1
IPython : 7.8.0
pandas_datareader: None
bs4 : 4.8.0
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : 4.4.1
matplotlib : 3.1.1
numexpr : None
odfpy : None
openpyxl : 2.6.3
pandas_gbq : None
pyarrow : None
pytables : None
s3fs : None
scipy : 1.3.1
sqlalchemy : None
tables : None
xarray : None
xlrd : 1.2.0
xlwt : None
xlsxwriter : None

@mroeschke
Member

This looks fixed on master:

~/pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
   1612             return self.table.vals[k]
   1613         else:
-> 1614             raise KeyError(val)
   1615
   1616     cpdef set_item(self, object key, Py_ssize_t val):

KeyError: 'vau'

In [2]: pd.__version__
Out[2]: '0.26.0.dev0+490.g9cfb8b55b'

I guess this could use a regression test.

mroeschke added the good first issue and Needs Tests labels on Oct 6, 2019

@bravech
Contributor

bravech commented Oct 6, 2019

Hi! I'd like to work on this as my first issue. You just need a regression test added to check that the proper KeyError is thrown?

@mroeschke
Member

Correct.
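
A rough sketch of what such a regression check might look like is below. It is not the test that was merged for this issue. The helper names key_reported_by_apply and last_row_with_typo are hypothetical, and the extra "other" column is an assumption added so the function only reads non-grouping columns (whether the grouping column is passed to the applied function has varied between pandas versions).

```python
import pandas as pd


def key_reported_by_apply(frame, by, func):
    """Run frame.groupby(by).apply(func) and return the key named by any KeyError.

    Hypothetical helper; returns None if no KeyError is raised.
    """
    try:
        frame.groupby(by).apply(func)
    except KeyError as err:
        return err.args[0]
    return None


def last_row_with_typo(rows):
    """Build a one-row frame from the last row, with a deliberate 'vau' typo."""
    last = rows.iloc[-1]
    # 'other' exists and is read first; 'vau' does not exist and should be
    # the key reported in the error.
    return pd.DataFrame({"other": [last["other"]], "val": [last["vau"]]})


frame = pd.DataFrame(
    {"var": ["a", "a", "b", "b"], "other": ["w", "x", "y", "z"], "val": range(4)}
)

# The error should name the missing key, not a key that was found.
assert key_reported_by_apply(frame, "var", last_row_with_typo) == "vau"
```

In a pytest-based suite the same check could presumably be written with pytest.raises(KeyError, match="vau") around the apply call.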
