ENH: define the order of resolution for index vs columns in query/eval #6677

Merged
merged 1 commit into from
Mar 21, 2014
23 changes: 23 additions & 0 deletions doc/source/indexing.rst
@@ -784,6 +784,29 @@ If instead you don't want to or cannot name your index, you can use the name
    del old_index


+.. note::
+
+   If the name of your index overlaps with a column name, the column name is
+   given precedence. For example,
+
+   .. ipython:: python
+
+      df = DataFrame({'a': randint(5, size=5)})
+      df.index.name = 'a'
+      df.query('a > 2') # uses the column 'a', not the index
+
+   You can still use the index in a query expression by using the special
+   identifier 'index':
+
+   .. ipython:: python
+
+      df.query('index > 2')
+
+   If for some reason you have a column named ``index``, then you can refer to
+   the index as ``ilevel_0`` as well, but at this point you should consider
+   renaming your columns to something less ambiguous.
+
+
 :class:`~pandas.MultiIndex` :meth:`~pandas.DataFrame.query` Syntax
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

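The documented behavior can be tried outside the doc build with a minimal sketch like the one below. It assumes a recent pandas imported as `pd`; the frame contents and the names `df`, `df2`, `by_column`, `by_index`, `by_col_named_index`, and `by_ilevel` are illustrative and not part of this PR.

```python
import pandas as pd

# Illustrative data (not from the PR): a column 'a' and an index also named 'a'.
df = pd.DataFrame({"a": [1, 4, 2, 5, 3]},
                  index=pd.Index([10, 20, 30, 40, 50], name="a"))

# 'a' resolves to the column, so this filters on the column values.
by_column = df.query("a > 2")

# The special identifier 'index' still reaches the index.
by_index = df.query("index > 25")

# With a column literally named 'index' and an unnamed index,
# 'index' means the column and 'ilevel_0' means the index.
df2 = pd.DataFrame({"index": [20, 40, 60]})
by_col_named_index = df2.query("index > 30")
by_ilevel = df2.query("ilevel_0 < 1")

print(by_column)
print(by_index)
print(by_col_named_index)
print(by_ilevel)
```

Note that `ilevel_0` is only available here because the index of `df2` has no name; a named index is exposed under its own name instead.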
2 changes: 2 additions & 0 deletions doc/source/release.rst
@@ -143,6 +143,8 @@ API Changes
 - Following keywords are now acceptable for :meth:`DataFrame.plot(kind='bar')` and :meth:`DataFrame.plot(kind='barh')`.
   - `width`: Specify the bar width. In previous versions, static value 0.5 was passed to matplotlib and it cannot be overwritten.
   - `position`: Specify relative alignments for bar plot layout. From 0 (left/bottom-end) to 1(right/top-end). Default is 0.5 (center). (:issue:`6604`)
+- Define and document the order of column vs index names in query/eval
+  (:issue:`6676`)

 Deprecations
 ~~~~~~~~~~~~
2 changes: 1 addition & 1 deletion pandas/core/frame.py
@@ -1855,7 +1855,7 @@ def eval(self, expr, **kwargs):
         kwargs['level'] = kwargs.pop('level', 0) + 1
         if resolvers is None:
             index_resolvers = self._get_index_resolvers()
-            resolvers = index_resolvers, dict(self.iteritems())
+            resolvers = dict(self.iteritems()), index_resolvers
         kwargs['target'] = self
         kwargs['resolvers'] = kwargs.get('resolvers', ()) + resolvers
         return _eval(expr, **kwargs)
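The one-line change in `frame.py` only reorders the resolver tuple. The sketch below models why that matters, assuming names are looked up in the resolvers from left to right and the first match wins, which is what the new test expects. `collections.ChainMap` is used purely as a stand-in for that lookup; the diff does not show how `_eval` consumes the tuple, and both dictionaries are hypothetical placeholders.

```python
from collections import ChainMap

# Hypothetical stand-ins for the two mappings built in DataFrame.eval:
# one keyed by column name, one keyed by index name plus the special 'index'.
column_resolvers = {"a": "column 'a'"}
index_resolvers = {"a": "index named 'a'", "index": "the index"}

# Order before the change: index names shadow columns of the same name.
old_order = ChainMap(index_resolvers, column_resolvers)

# Order after the change: columns shadow index names of the same name.
new_order = ChainMap(column_resolvers, index_resolvers)

print(old_order["a"])
print(new_order["a"])
print(new_order["index"])
```

Under this model, a name that exists only in the index mapping (such as `index`) is still found after the change, because lookup falls through to the second mapping.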
27 changes: 26 additions & 1 deletion pandas/tests/test_frame.py
@@ -12583,7 +12583,7 @@ def setUpClass(cls):
         super(TestDataFrameQueryNumExprPandas, cls).setUpClass()
         cls.engine = 'numexpr'
         cls.parser = 'pandas'
-        tm.skip_if_no_ne()
+        tm.skip_if_no_ne(cls.engine)

     @classmethod
     def tearDownClass(cls):
@@ -12867,6 +12867,31 @@ def test_query_undefined_local(self):
                                    "local variable 'c' is not defined"):
             df.query('a == @c', engine=engine, parser=parser)

+    def test_index_resolvers_come_after_columns_with_the_same_name(self):
+        n = 1
+        a = np.r_[20:101:20]
+
+        df = DataFrame({'index': a, 'b': np.random.randn(a.size)})
+        df.index.name = 'index'
+        result = df.query('index > 5', engine=self.engine, parser=self.parser)
+        expected = df[df['index'] > 5]
+        tm.assert_frame_equal(result, expected)
+
+        df = DataFrame({'index': a, 'b': np.random.randn(a.size)})
+        result = df.query('ilevel_0 > 5', engine=self.engine, parser=self.parser)
+        expected = df.loc[df.index[df.index > 5]]
+        tm.assert_frame_equal(result, expected)
+
+        df = DataFrame({'a': a, 'b': np.random.randn(a.size)})
+        df.index.name = 'a'
+        result = df.query('a > 5', engine=self.engine, parser=self.parser)
+        expected = df[df.a > 5]
+        tm.assert_frame_equal(result, expected)
+
+        result = df.query('index > 5', engine=self.engine, parser=self.parser)
+        expected = df.loc[df.index[df.index > 5]]
+        tm.assert_frame_equal(result, expected)
+
+
 class TestDataFrameQueryNumExprPython(TestDataFrameQueryNumExprPandas):
