df.query() does not support column name 'class' #18221

Closed
michaelaye opened this issue Nov 10, 2017 · 4 comments · Fixed by #18248
Labels: Docs; Error Reporting (Incorrect or improved errors from pandas)


@michaelaye
Contributor

Code Sample, a copy-pastable example if possible

indices_to_plot = df.query('class>0')

Problem description

Above code results in this error traceback:

Traceback (most recent call last):

  File "/Users/klay6683/miniconda3/envs/stable/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2910, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)

  File "<ipython-input-33-6e077c50ac68>", line 2, in <module>
    indices_to_plot = df.query('class>0')

  File "/Users/klay6683/miniconda3/envs/stable/lib/python3.6/site-packages/pandas/core/frame.py", line 2297, in query
    res = self.eval(expr, **kwargs)

  File "/Users/klay6683/miniconda3/envs/stable/lib/python3.6/site-packages/pandas/core/frame.py", line 2366, in eval
    return _eval(expr, inplace=inplace, **kwargs)

  File "/Users/klay6683/miniconda3/envs/stable/lib/python3.6/site-packages/pandas/core/computation/eval.py", line 290, in eval
    truediv=truediv)

  File "/Users/klay6683/miniconda3/envs/stable/lib/python3.6/site-packages/pandas/core/computation/expr.py", line 732, in __init__
    self.terms = self.parse()

  File "/Users/klay6683/miniconda3/envs/stable/lib/python3.6/site-packages/pandas/core/computation/expr.py", line 749, in parse
    return self._visitor.visit(self.expr)

  File "/Users/klay6683/miniconda3/envs/stable/lib/python3.6/site-packages/pandas/core/computation/expr.py", line 310, in visit
    node = ast.fix_missing_locations(ast.parse(clean))

  File "/Users/klay6683/miniconda3/envs/stable/lib/python3.6/ast.py", line 35, in parse
    return compile(source, filename, mode, PyCF_ONLY_AST)

  File "<unknown>", line 1
    class >0
          ^
SyntaxError: invalid syntax

My column names are "occ_id, class, et, radius, lon, width, type" and if I execute this query on another column, it works fine:

indices_to_plot = df.query('et>0')

Only the column named 'class' seems to fail.

Expected Output

Sub selection of the dataframe according to the query.

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.21.0
pytest: 3.2.3
pip: 9.0.1
setuptools: 36.6.0
Cython: 0.27.3
numpy: 1.13.3
scipy: 0.19.1
pyarrow: None
xarray: 0.9.6
IPython: 6.2.1
sphinx: 1.6.5
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.4
feather: None
matplotlib: 2.1.0
openpyxl: None
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: None
lxml: None
bs4: 4.6.0
html5lib: 0.999999999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

@chris-b1
Contributor

chris-b1 commented Nov 10, 2017

While the error message could be better, I'm not sure this is something we can support (easily): pandas and numexpr use the Python parser to evaluate these expressions, and class is of course a reserved word in Python.
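
To make the point above concrete, here is a minimal sketch using only the standard library. It does not call pandas at all; it just hands the two expressions from the report to Python's own parser, which is where the traceback ends up. The helper name parses_as_expression is made up for this illustration.

```python
import ast
import keyword

# `class` is a reserved word, so the parser never treats it as a plain name.
print(keyword.iskeyword("class"))
print(keyword.iskeyword("et"))


def parses_as_expression(source):
    """Return True if Python's parser accepts `source` as an expression.

    Hypothetical helper for illustration; not part of pandas.
    """
    try:
        ast.parse(source, mode="eval")
    except SyntaxError:
        return False
    return True


# An ordinary column name parses; the reserved word does not.
print(parses_as_expression("et > 0"))
print(parses_as_expression("class > 0"))
```

The failure happens before any column lookup takes place, which is why the other columns in the same frame work fine.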

@michaelaye
Contributor Author

Understood, though this paragraph from the docstring made me believe it should work:

The DataFrame.index and DataFrame.columns attributes of the DataFrame instance are placed in the query namespace by default, which allows you to treat both the index and columns of the frame as a column in the frame. The identifier index is used for the frame index; you can also use the name of the index to identify it in a query.

If it's impossible to use any reserved keywords as column names for query it should be explicitly called out in the docstring, I think.

@chris-b1
Contributor

"If it's impossible to use any reserved keywords as column names for query it should be explicitly called out in the docstring, I think."

Yes, agreed. We may also want to wrap the parsing in a try/except to bubble up a more directed error. PR welcome!
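
A rough sketch of what "wrap the parsing and bubble up a more directed error" could look like, using only the standard library. This is not the change that was eventually merged and not pandas code: parse_query_expr, suspicious_keywords and EXPRESSION_KEYWORDS are hypothetical names, and the list of keywords allowed inside expressions is illustrative rather than exhaustive.

```python
import ast
import io
import keyword
import tokenize

# Keywords that are legitimate inside an expression, so finding one of them
# does not explain a syntax error. Assumption: illustrative, not exhaustive.
EXPRESSION_KEYWORDS = {"and", "or", "not", "in", "is", "if", "else",
                       "lambda", "None", "True", "False"}


def suspicious_keywords(expr):
    """List reserved words in `expr` that cannot serve as column names."""
    found = []
    try:
        for tok in tokenize.generate_tokens(io.StringIO(expr).readline):
            if (tok.type == tokenize.NAME
                    and keyword.iskeyword(tok.string)
                    and tok.string not in EXPRESSION_KEYWORDS):
                found.append(tok.string)
    except (tokenize.TokenError, SyntaxError):
        pass  # e.g. unbalanced brackets; report whatever was seen so far
    return found


def parse_query_expr(expr):
    """Parse `expr`, re-raising with a hint when a reserved word is involved."""
    try:
        return ast.parse(expr, mode="eval")
    except SyntaxError as err:
        found = suspicious_keywords(expr)
        if not found:
            raise  # nothing more specific to say; keep the original error
        raise SyntaxError(
            f"could not parse {expr!r}: {', '.join(found)} is a Python "
            "reserved word and cannot be used as a bare column name"
        ) from err
```

With this sketch, parse_query_expr("class>0") raises a SyntaxError that names class, while parse_query_expr("et>0") returns the parsed tree unchanged.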

@chris-b1 chris-b1 added Docs Error Reporting Incorrect or improved errors from pandas labels Nov 10, 2017
@chris-b1 chris-b1 added this to the Next Major Release milestone Nov 10, 2017
@jreback jreback modified the milestones: Next Major Release, 0.21.1, 0.22.0 Nov 12, 2017
@Sbrjt
Sbrjt commented Aug 30, 2024

I could get it to work with backticks: `class`
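
For anyone landing here later, a small sketch of the backtick workaround next to plain boolean indexing. The sample frame is made up for illustration. Backtick quoting does not exist in the pandas 0.21.0 from the original report; as far as I know, quoting reserved words this way needs pandas 1.0 or later, so treat the version as an assumption and check your own install.

```python
import pandas as pd

# Made-up sample data with a column named after a reserved word.
df = pd.DataFrame({"class": [0, 1, 2], "et": [10.0, 20.0, 30.0]})

# Backticks tell query() to treat the quoted text as a column name.
quoted = df.query("`class` > 0")

# Boolean indexing sidesteps the expression parser entirely,
# so it also works on older pandas versions.
masked = df[df["class"] > 0]

print(quoted.equals(masked))
```

Both selections should return the same rows; the boolean mask is the safer choice when the code has to run on older pandas versions.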
