df.query() does not support column name 'class' #18221

michaelaye · 2017-11-10T21:46:59Z

Code Sample, a copy-pastable example if possible

indices_to_plot = df.query('class>0')

Problem description

The code above results in this traceback:

Traceback (most recent call last):

  File "/Users/klay6683/miniconda3/envs/stable/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2910, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)

  File "<ipython-input-33-6e077c50ac68>", line 2, in <module>
    indices_to_plot = df.query('class>0')

  File "/Users/klay6683/miniconda3/envs/stable/lib/python3.6/site-packages/pandas/core/frame.py", line 2297, in query
    res = self.eval(expr, **kwargs)

  File "/Users/klay6683/miniconda3/envs/stable/lib/python3.6/site-packages/pandas/core/frame.py", line 2366, in eval
    return _eval(expr, inplace=inplace, **kwargs)

  File "/Users/klay6683/miniconda3/envs/stable/lib/python3.6/site-packages/pandas/core/computation/eval.py", line 290, in eval
    truediv=truediv)

  File "/Users/klay6683/miniconda3/envs/stable/lib/python3.6/site-packages/pandas/core/computation/expr.py", line 732, in __init__
    self.terms = self.parse()

  File "/Users/klay6683/miniconda3/envs/stable/lib/python3.6/site-packages/pandas/core/computation/expr.py", line 749, in parse
    return self._visitor.visit(self.expr)

  File "/Users/klay6683/miniconda3/envs/stable/lib/python3.6/site-packages/pandas/core/computation/expr.py", line 310, in visit
    node = ast.fix_missing_locations(ast.parse(clean))

  File "/Users/klay6683/miniconda3/envs/stable/lib/python3.6/ast.py", line 35, in parse
    return compile(source, filename, mode, PyCF_ONLY_AST)

  File "<unknown>", line 1
    class >0
          ^
SyntaxError: invalid syntax

My column names are "occ_id, class, et, radius, lon, width, type" and if I execute this query on another column, it works fine:

indices_to_plot = df.query('et>0')

Only the column named 'class' seems to fail.

Expected Output

Sub selection of the dataframe according to the query.

Output of `pd.show_versions()`

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.21.0
pytest: 3.2.3
pip: 9.0.1
setuptools: 36.6.0
Cython: 0.27.3
numpy: 1.13.3
scipy: 0.19.1
pyarrow: None
xarray: 0.9.6
IPython: 6.2.1
sphinx: 1.6.5
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.4
feather: None
matplotlib: 2.1.0
openpyxl: None
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: None
lxml: None
bs4: 4.6.0
html5lib: 0.999999999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None


chris-b1 · 2017-11-10T21:53:47Z

While the error message could be better, I'm not sure this is something we can support (easily) - pandas and numexpr use the python parser to evaluate these expressions, and class is of course a reserved word in python.
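The parser behaviour described here can be checked without pandas. The following is a minimal sketch using only the standard library; it assumes, as the traceback above suggests, that the query string ends up in `ast.parse`. The helper name `python_can_parse` is made up for illustration.

```python
import ast
import keyword

# `class` is a reserved word in Python; `et` is an ordinary identifier.
print(keyword.iskeyword("class"))
print(keyword.iskeyword("et"))


def python_can_parse(source: str) -> bool:
    """Return True if Python's own parser accepts `source`."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


# The same two strings as in the report: one parses, one does not.
print(python_can_parse("et > 0"))
print(python_can_parse("class > 0"))
```

Because the failure happens in Python's parser, before any column lookup, the contents of the DataFrame never come into play.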

michaelaye · 2017-11-10T22:03:06Z

Understood, though this paragraph from the docstring made me believe it should work:

The DataFrame.index and DataFrame.columns attributes of the DataFrame instance are placed in the query namespace by default, which allows you to treat both the index and columns of the frame as a column in the frame. The identifier index is used for the frame index; you can also use the name of the index to identify it in a query.

If it's impossible to use any reserved keywords as column names for query it should be explicitly called out in the docstring, I think.

chris-b1 · 2017-11-10T22:08:19Z

> If it's impossible to use any reserved keywords as column names for query it should be explicitly called out in the docstring, I think.

Yes agreed, we may also want to wrap the parsing in a try/catch to bubble up a more directed error. PR welcome!
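One way to picture the try/catch idea is the sketch below. It is not the code that was merged into pandas; `parse_query`, the `ALLOWED` set, and the error wording are all hypothetical, and the keyword scan is a simple regex that would also match words inside string literals.

```python
import ast
import keyword
import re

# Keywords that are legitimate inside a query expression (an assumed list).
ALLOWED = {"and", "or", "not", "in", "is", "True", "False", "None"}


def parse_query(expr: str) -> ast.AST:
    """Parse `expr`; on failure, name any reserved word used in it."""
    try:
        return ast.parse(expr)
    except SyntaxError as err:
        # Collect identifier-like words that are Python keywords.
        words = re.findall(r"[A-Za-z_]\w*", expr)
        reserved = sorted(
            {w for w in words if keyword.iskeyword(w) and w not in ALLOWED}
        )
        if reserved:
            raise SyntaxError(
                "Python keyword used as a name in query: " + ", ".join(reserved)
            ) from err
        # Some other syntax problem: re-raise the original error unchanged.
        raise


try:
    parse_query("class > 0")
except SyntaxError as exc:
    print(exc)
```

Chaining with `from err` keeps the original parser error attached, so the more directed message does not hide where the failure came from.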


Sbrjt · 2024-08-30T09:55:53Z

I could get it to work with backticks: `class`
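A small sketch of that workaround, assuming a pandas release recent enough to support backtick quoting in `query()` (the 0.21.0 version from the original report does not). The sample data is invented for illustration; plain boolean indexing is shown alongside as an alternative that does not go through the expression parser at all.

```python
import pandas as pd

# Toy frame with a column named after a Python keyword.
df = pd.DataFrame({"class": [0, 1, 2], "et": [10.0, 20.0, 30.0]})

# Backticks quote the column name inside the query string.
subset = df.query("`class` > 0")

# Boolean indexing selects the same rows without query().
same = df[df["class"] > 0]

print(subset.equals(same))  # True
```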

chris-b1 added the Docs and Error Reporting (Incorrect or improved errors from pandas) labels Nov 10, 2017

chris-b1 added this to the Next Major Release milestone Nov 10, 2017

WillAyd added a commit to WillAyd/pandas that referenced this issue Nov 12, 2017

DOC: Error msg using Python keyword in numexpr query pandas-dev#18221

0f477de

WillAyd mentioned this issue Nov 12, 2017

DOC: Error msg using Python keyword in numexpr query #18221 #18248

Merged

4 tasks

jreback modified the milestones: Next Major Release, 0.21.1, 0.22.0 Nov 12, 2017

jreback closed this as completed in #18248 Nov 13, 2017

jreback pushed a commit that referenced this issue Nov 13, 2017

DOC: Error msg using Python keyword in numexpr query #18221 (#18248)

feaa0d0

No-Stream pushed a commit to No-Stream/pandas that referenced this issue Nov 28, 2017

DOC: Error msg using Python keyword in numexpr query pandas-dev#18221 (…

c010204

…pandas-dev#18248)
