ENH: DataFrame eval will not work with column names "True", "False", "inf", "Inf" #47859

Open
1 of 3 tasks
weikhor opened this issue Jul 26, 2022 · 1 comment
@weikhor
Contributor

weikhor commented Jul 26, 2022

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

With version 1.5.0.dev0+1213.g4fe8b84c77, DataFrame eval does not work when the column name is one of ["True", "False", "inf", "Inf"].

import pandas as pd
import numpy as np

column = "True"
df = pd.DataFrame(np.random.randint(0, 100, size=(10, 2)),
                  columns=[column, "col1"])
expected = df[df[column] > 6]
result = df.query(f"{column}>6")

It raises this error:

Traceback (most recent call last):
  File "/home/open_source/pandas_1/development/development.py", line 19, in <module>
    result = df.query(f"{column}>6")
  File "/home/open_source/pandas_1/pandas/pandas/util/_decorators.py", line 317, in wrapper
    return func(*args, **kwargs)
  File "/home/open_source/pandas_1/pandas/pandas/core/frame.py", line 4388, in query
    result = self.loc[res]
  File "/home/open_source/pandas_1/pandas/pandas/core/indexing.py", line 1071, in __getitem__
    return self._getitem_axis(maybe_callable, axis=axis)
  File "/home/open_source/pandas_1/pandas/pandas/core/indexing.py", line 1306, in _getitem_axis
    self._validate_key(key, axis)
  File "/home/open_source/pandas_1/pandas/pandas/core/indexing.py", line 1116, in _validate_key
    raise KeyError(
KeyError: 'False: boolean label can not be used without a boolean index'

Feature Description

Column names from ["True", "False", "inf", "Inf"] should be supported by DataFrame eval. For example, the following should work:

import pandas as pd
import numpy as np

column = "True"
df = pd.DataFrame(np.random.randint(0, 100, size=(10, 2)),
                  columns=[column, "col1"])
expected = df[df[column] > 6]
result = df.query(f"{column}>6")

Alternative Solutions

No alternative solution.

Additional Context

It is related to #44603

@weikhor weikhor added the Enhancement and Needs Triage labels Jul 26, 2022
@dannyi96
Contributor

dannyi96 commented Jul 31, 2022

  1. For Python Boolean Literals - True and False

As per docs - https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.query.html

expr : str - The query string to evaluate.
....
... You can refer to column names that are not valid Python variable names by surrounding them in backticks. 
...

Hence the above will work when the column name is wrapped in backticks:
result = df.query(f"`{column}`>6")

  2. For inf and Inf
    Referencing the column through the DataFrame itself with @ should do the trick:
    result = df.query(f"@df.{column}>6")

Below is a working example snippet:

import pandas as pd
import numpy as np

# "True" and "False" are Python keywords, so quote them with backticks.
literal_columns = ["True", "False"]
for column in literal_columns:
    df = pd.DataFrame(np.random.randint(0, 100, size=(10, 2)),
                      columns=[column, "col1"])
    expected = df[df[column] > 6]
    result = df.query(f"`{column}`>6")
    print(result)

# "inf" and "Inf" are reached through the DataFrame itself via @df.
inf_columns = ["inf", "Inf"]
for column in inf_columns:
    df = pd.DataFrame(np.random.randint(0, 100, size=(10, 2)),
                      columns=[column, "col1"])
    expected = df[df[column] > 6]
    result = df.query(f"@df.{column}>6")
    print(result)
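
The two workarounds could also be folded into a single helper and checked against plain boolean indexing. This is only a sketch: `query_gt` is a hypothetical name, the data is fixed rather than random so the result can be compared, and it assumes the backtick and `@df` behaviour described in this comment holds for the installed pandas version.

```python
import pandas as pd


def query_gt(df: pd.DataFrame, column: str, threshold: int) -> pd.DataFrame:
    """Filter rows where df[column] > threshold using DataFrame.query."""
    if column in ("inf", "Inf"):
        # Reach the column through the local DataFrame, as suggested above.
        expr = f"@df.{column} > {threshold}"
    else:
        # Backticks cover names such as "True" and "False".
        expr = f"`{column}` > {threshold}"
    return df.query(expr)


for column in ["True", "False", "inf", "Inf"]:
    frame = pd.DataFrame({column: [1, 7, 50], "col1": [3, 4, 5]})
    expected = frame[frame[column] > 6]
    result = query_gt(frame, column, 6)
    # The query result should match ordinary boolean-mask filtering.
    assert result.equals(expected)
```

The `@df` form relies on `df` being the name of the local variable inside the helper, so renaming the parameter would require changing the expression as well.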

@jbrockmendel jbrockmendel added the expressions (pd.eval, query) label Nov 25, 2022
@mroeschke mroeschke removed the Needs Triage label Jul 16, 2024