DOC: Confusing 'resolvers' kwarg documentation for DataFrame.query() #34966

eric-brunner · 2020-06-24T08:41:59Z

Location of the documentation

According to DataFrame.query() documentation
(https://pandas.pydata.org/docs/dev/reference/api/pandas.DataFrame.query.html#pandas.DataFrame.query)
the 'resolvers' keyword is accepted, and its behaviour is documented under DataFrame.eval() (https://pandas.pydata.org/docs/dev/reference/api/pandas.eval.html#pandas.eval).

Documentation problem

The keyword's description states:
"A list of objects implementing the __getitem__ special method that you can use to inject an additional collection of namespaces to use for variable lookup. [...]"

However, the namespaces passed via 'resolvers' are not added to the default namespaces used by DataFrame.eval(); they replace them. A basic example:

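from pandas import DataFrame, Series
import numpy as np

df = DataFrame(np.random.random((5, 3)))
# Default resolvers only: 'index' refers to the frame's index.
print(df.query('index==1'))
# Custom resolver: 'map' is looked up in the supplied dict.
print(df.query('map==1', resolvers=({'map': Series([4,3,2,1,0], [0,1,2,3,4])},)))
# Custom resolver supplied, but the expression uses the default name 'index'.
print(df.query('index==1', resolvers=({'map': Series([4,3,2,1,0], [0,1,2,3,4])},)))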

The last line raises an UndefinedVariableError, because 'index' is no longer part of the namespace collection.

This is caused by the condition 'if resolvers is None:' in the DataFrame.eval() source code (v1.0.4, frame.py, line 3335):

...
        inplace = validate_bool_kwarg(inplace, "inplace")
        resolvers = kwargs.pop("resolvers", None)
        kwargs["level"] = kwargs.pop("level", 0) + 1
        if resolvers is None:
            index_resolvers = self._get_index_resolvers()
            column_resolvers = self._get_cleaned_column_resolvers()
            resolvers = column_resolvers, index_resolvers
        if "target" not in kwargs:
            kwargs["target"] = self
        kwargs["resolvers"] = kwargs.get("resolvers", ()) + tuple(resolvers)
...
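
To make the effect of that condition concrete, here is a minimal, pandas-free sketch of the same control flow. The function name collect_resolvers and the plain dicts are hypothetical stand-ins for the excerpt above, not pandas code; ChainMap is used only as a rough model of a chained namespace lookup. The point is that caller-supplied resolvers replace the defaults instead of being appended to them.

```python
from collections import ChainMap


def collect_resolvers(default_resolvers, **kwargs):
    """Hypothetical stand-in mirroring the control flow of the excerpt above."""
    resolvers = kwargs.pop("resolvers", None)
    if resolvers is None:
        # The defaults are only used when the caller passes nothing.
        resolvers = default_resolvers
    # "resolvers" was popped above, so .get() always falls back to ().
    return kwargs.get("resolvers", ()) + tuple(resolvers)


# Plain dicts standing in for the column and index namespaces of a frame.
defaults = ({"a": [1, 2, 3]}, {"index": [0, 1, 2]})
extra = ({"map": [2, 1, 0]},)

without_extra = ChainMap(*collect_resolvers(defaults))
with_extra = ChainMap(*collect_resolvers(defaults, resolvers=extra))

print("index" in without_extra)  # True
print("index" in with_extra)     # False: the defaults were dropped
print("map" in with_extra)       # True
```

With the extra resolver supplied, 'index' can no longer be found, which matches the UndefinedVariableError seen in the example.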

Suggested fix for documentation

Therefore, according to the actual behaviour, the documentation for the 'resolvers' kwarg of DataFrame.eval() should read:
"A list of objects implementing the __getitem__ special method that you can use to replace the default collection of namespaces to use for variable lookup. [...]"

OR

Is this a bug? Should it be addressed?


bubblingoak · 2021-12-30T22:07:23Z

Just ran into this same issue. It seems like a bug to me, since overriding the default resolvers just makes pandas.DataFrame.eval equivalent to pandas.eval. The code itself is also contradictory: it first removes the resolvers key with dict.pop, then tries to read it back with dict.get, as though to combine the two tuples of resolvers.

I've opened a PR to address this: #45134
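
For illustration, combining the two tuples could look roughly like the sketch below. It uses plain dicts and ChainMap rather than pandas objects, merge_resolvers is a hypothetical name, and it is not claimed to be the code in #45134. Putting the user-supplied namespaces first is an assumption about the desired precedence when a name clashes.

```python
from collections import ChainMap


def merge_resolvers(default_resolvers, user_resolvers=None):
    """Hypothetical helper: keep the defaults and add the user's namespaces."""
    user_resolvers = tuple(user_resolvers or ())
    # User namespaces go first so they win a lookup when a name clashes
    # (an assumption about the desired precedence).
    return user_resolvers + tuple(default_resolvers)


# Plain dicts standing in for the column and index namespaces of a frame.
defaults = ({"a": [1, 2, 3]}, {"index": [0, 1, 2]})
extra = ({"map": [2, 1, 0]},)

namespace = ChainMap(*merge_resolvers(defaults, extra))
print(sorted(namespace))  # ['a', 'index', 'map']
```

Here both 'index' and 'map' resolve, which is the behaviour the original 'inject an additional collection of namespaces' wording suggests.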

eric-brunner added the Docs and Needs Triage labels Jun 24, 2020

jbrockmendel added the expressions (pd.eval, query) label and removed the Needs Triage label Sep 3, 2020

bubblingoak mentioned this issue Dec 30, 2021

BUG: allow use of both default+input resolvers in df.eval, GH34966 #45134

Merged

jreback added this to the 1.4 milestone Dec 31, 2021

jreback closed this as completed in #45134 Jan 4, 2022
