COMPAT: .query/.eval should work w/o numexpr being installed if possible #12864

Closed
wants to merge 1 commit into from
8 changes: 6 additions & 2 deletions .travis.yml
@@ -152,25 +152,29 @@ before_install:
   - export DISPLAY=:99.0

 install:
-  - echo "install"
+  - echo "install start"
   - ci/prep_ccache.sh
   - ci/install_travis.sh
   - ci/submit_ccache.sh
+  - echo "install done"

 before_script:
   - source activate pandas && pip install codecov
   - ci/install_db.sh

 script:
-  - echo "script"
+  - echo "script start"
   - ci/run_build_docs.sh
   - ci/script.sh
   - ci/lint.sh
+  - echo "script done"

 after_success:
   - source activate pandas && codecov

 after_script:
+  - echo "after_script start"
   - ci/install_test.sh
   - source activate pandas && ci/print_versions.py
   - ci/print_skipped.py /tmp/nosetests.xml
+  - echo "after_script done"
1 change: 1 addition & 0 deletions doc/source/whatsnew/v0.18.1.txt
@@ -131,6 +131,7 @@ API changes



+- the default for ``.query()/.eval()`` is now ``engine=None`` which will use ``numexpr`` if its installed, else will fallback to the ``python`` engine. This mimics the pre-0.18.1 behavior if ``numexpr`` is installed (and which previously would raise if ``numexpr`` was NOT installed and ``.query()/.eval()`` was used). (:issue:`12749`)
Contributor Author

@jorisvandenbossche @TomAugspurger does this wording make sense? This is really a bug fix, but I wanted to highlight it.

Contributor

"its" -> "it's". Maybe change "and which would previously..." to "Previously, if numexpr was not installed, .query()/.eval() would raise."

Does it raise a PerformanceWarning when it falls back to python? Might be good to mention that here.

Contributor Author

There is no reason to show a PerformanceWarning. I guess we could, but the issue addressed here is that it was raising an ImportError if you didn't have numexpr installed and didn't specify ``engine='python'``, which was not the default.

Contributor Author

The issue with showing a PerformanceWarning (good idea) is that we have to be careful about when we do it: string ops, for example, are by definition evaluated in Python. I will create a separate issue for that.

Contributor

Sounds good. +1 to this change.
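
The PerformanceWarning idea raised in this thread was deferred to a separate issue, so nothing like it is part of this change. Purely as an illustration, here is a minimal sketch of how such a warning could be limited to the silent-fallback case. The helper name `warn_on_fallback` and the locally defined `PerformanceWarning` class are assumptions made for this example and do not appear in the diff:

```python
import warnings


class PerformanceWarning(Warning):
    """Stand-in warning category, defined locally for this sketch."""


def warn_on_fallback(requested, resolved):
    """Warn only if the caller left `engine` unset and 'python' was chosen.

    Returns True when a warning was emitted, False otherwise.
    """
    if requested is None and resolved == 'python':
        warnings.warn("numexpr is not installed; falling back to the "
                      "'python' engine", PerformanceWarning, stacklevel=2)
        return True
    # An explicit engine='python' is a deliberate choice, so stay quiet.
    return False
```

This only covers the engine-selection fallback. It does not address the case mentioned above, where string operations run in Python regardless of the engine.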


 - ``CParserError`` is now a ``ValueError`` instead of just an ``Exception`` (:issue:`12551`)
 - ``read_csv`` no longer allows a combination of strings and integers for the ``usecols`` parameter (:issue:`12678`)
21 changes: 18 additions & 3 deletions pandas/computation/eval.py
@@ -26,7 +26,19 @@ def _check_engine(engine):
       * If an invalid engine is passed
     ImportError
       * If numexpr was requested but doesn't exist
+
+    Returns
+    -------
+    string engine
+
     """
+
+    if engine is None:
+        if _NUMEXPR_INSTALLED:
+            engine = 'numexpr'
+        else:
+            engine = 'python'
+
     if engine not in _engines:
         raise KeyError('Invalid engine {0!r} passed, valid engines are'
                        ' {1}'.format(engine, list(_engines.keys())))
@@ -41,6 +53,8 @@ def _check_engine(engine):
                               "engine='numexpr' for query/eval "
                               "if 'numexpr' is not installed")
+
+    return engine


 def _check_parser(parser):
     """Make sure a valid parser is passed.
@@ -131,7 +145,7 @@ def _check_for_locals(expr, stack_level, parser):
                 raise SyntaxError(msg)


-def eval(expr, parser='pandas', engine='numexpr', truediv=True,
+def eval(expr, parser='pandas', engine=None, truediv=True,
          local_dict=None, global_dict=None, resolvers=(), level=0,
          target=None, inplace=None):
     """Evaluate a Python expression as a string using various backends.
@@ -160,10 +174,11 @@ def eval(expr, parser='pandas', engine='numexpr', truediv=True,
         ``'python'`` parser to retain strict Python semantics. See the
         :ref:`enhancing performance <enhancingperf.eval>` documentation for
         more details.
-    engine : string, default 'numexpr', {'python', 'numexpr'}
+    engine : string or None, default 'numexpr', {'python', 'numexpr'}

         The engine used to evaluate the expression. Supported engines are

+        - None : tries to use ``numexpr``, falls back to ``python``
         - ``'numexpr'``: This default engine evaluates pandas objects using
                          numexpr for large speed ups in complex expressions
                          with large frames.
@@ -230,7 +245,7 @@ def eval(expr, parser='pandas', engine='numexpr', truediv=True,
     first_expr = True
     for expr in exprs:
         expr = _convert_expression(expr)
-        _check_engine(engine)
+        engine = _check_engine(engine)
         _check_parser(parser)
         _check_resolvers(resolvers)
         _check_for_locals(expr, level, parser)
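
Taken together, the `_check_engine` hunks above amount to: resolve ``None`` to a concrete engine, validate it, and hand the resolved name back to the caller. The standalone sketch below mirrors that flow without importing pandas. `resolve_engine`, `VALID_ENGINES`, and the `numexpr_installed` parameter are names invented for this example; the real function reads the module-level `_engines` and `_NUMEXPR_INSTALLED` instead, and its exact error text may differ:

```python
import importlib.util

# Assumed stand-in for the keys of pandas' internal `_engines` mapping.
VALID_ENGINES = ('numexpr', 'python')


def resolve_engine(engine=None, numexpr_installed=None):
    """Return a concrete engine name, mirroring the patched control flow."""
    if numexpr_installed is None:
        # Detect numexpr without importing it.
        numexpr_installed = importlib.util.find_spec('numexpr') is not None

    # engine=None means "use numexpr if available, else python".
    if engine is None:
        engine = 'numexpr' if numexpr_installed else 'python'

    if engine not in VALID_ENGINES:
        raise KeyError('Invalid engine {0!r} passed, valid engines are '
                       '{1}'.format(engine, list(VALID_ENGINES)))

    # An explicit request for numexpr still fails loudly when it is missing.
    if engine == 'numexpr' and not numexpr_installed:
        raise ImportError("engine='numexpr' was requested but numexpr "
                          "is not installed")

    return engine
```

Returning the resolved name matters because the caller previously discarded the result of `_check_engine(engine)`; with a ``None`` default it has to rebind `engine` to the concrete value before dispatching.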
53 changes: 50 additions & 3 deletions pandas/tests/frame/test_query_eval.py
@@ -19,6 +19,7 @@
                                  makeCustomDataframe as mkdf)

 import pandas.util.testing as tm
+from pandas.computation import _NUMEXPR_INSTALLED

 from pandas.tests.frame.common import TestData

@@ -34,13 +35,59 @@ def skip_if_no_pandas_parser(parser):

 def skip_if_no_ne(engine='numexpr'):
     if engine == 'numexpr':
-        try:
-            import numexpr as ne # noqa
-        except ImportError:
+        if not _NUMEXPR_INSTALLED:
             raise nose.SkipTest("cannot query engine numexpr when numexpr not "
                                 "installed")


+class TestCompat(tm.TestCase):
+
+    def setUp(self):
+        self.df = DataFrame({'A': [1, 2, 3]})
+        self.expected1 = self.df[self.df.A > 0]
+        self.expected2 = self.df.A + 1
+
+    def test_query_default(self):
+
+        # GH 12749
+        # this should always work, whether _NUMEXPR_INSTALLED or not
+        df = self.df
+        result = df.query('A>0')
+        assert_frame_equal(result, self.expected1)
+        result = df.eval('A+1')
+        assert_series_equal(result, self.expected2, check_names=False)
+
+    def test_query_None(self):
+
+        df = self.df
+        result = df.query('A>0', engine=None)
+        assert_frame_equal(result, self.expected1)
+        result = df.eval('A+1', engine=None)
+        assert_series_equal(result, self.expected2, check_names=False)
+
+    def test_query_python(self):
+
+        df = self.df
+        result = df.query('A>0', engine='python')
+        assert_frame_equal(result, self.expected1)
+        result = df.eval('A+1', engine='python')
+        assert_series_equal(result, self.expected2, check_names=False)
+
+    def test_query_numexpr(self):
+
+        df = self.df
+        if _NUMEXPR_INSTALLED:
+            result = df.query('A>0', engine='numexpr')
+            assert_frame_equal(result, self.expected1)
+            result = df.eval('A+1', engine='numexpr')
+            assert_series_equal(result, self.expected2, check_names=False)
+        else:
+            self.assertRaises(ImportError,
+                              lambda: df.query('A>0', engine='numexpr'))
+            self.assertRaises(ImportError,
+                              lambda: df.eval('A+1', engine='numexpr'))
+
+
 class TestDataFrameEval(tm.TestCase, TestData):

     _multiprocess_can_split_ = True
19 changes: 7 additions & 12 deletions pandas/util/testing.py
@@ -329,21 +329,16 @@ def _incompat_bottleneck_version(method):


 def skip_if_no_ne(engine='numexpr'):
-    import nose
-    _USE_NUMEXPR = pd.computation.expressions._USE_NUMEXPR
+    from pandas.computation.expressions import (_USE_NUMEXPR,
+                                                _NUMEXPR_INSTALLED)

     if engine == 'numexpr':
-        try:
-            import numexpr as ne
-        except ImportError:
-            raise nose.SkipTest("numexpr not installed")
-
         if not _USE_NUMEXPR:
-            raise nose.SkipTest("numexpr disabled")
-
-        if ne.__version__ < LooseVersion('2.0'):
-            raise nose.SkipTest("numexpr version too low: "
-                                "%s" % ne.__version__)
+            import nose
+            raise nose.SkipTest("numexpr enabled->{enabled}, "
+                                "installed->{installed}".format(
+                                    enabled=_USE_NUMEXPR,
+                                    installed=_NUMEXPR_INSTALLED))


 def _skip_if_has_locale():
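
The rewritten `skip_if_no_ne` reduces to a single check on the enabled flag, with both flags reported in the skip message. A rough equivalent using only the standard library might look like the following. It substitutes `unittest.SkipTest` for `nose.SkipTest`, and the `enabled` and `installed` parameters stand in for pandas' `_USE_NUMEXPR` and `_NUMEXPR_INSTALLED`; those substitutions, and the name `skip_if_no_numexpr`, are assumptions of this sketch and not part of the change:

```python
import unittest


def skip_if_no_numexpr(engine='numexpr', enabled=False, installed=False):
    """Raise SkipTest when the numexpr engine is requested but not enabled."""
    if engine == 'numexpr' and not enabled:
        # Report both flags so the log shows why the test was skipped.
        raise unittest.SkipTest(
            "numexpr enabled->{enabled}, installed->{installed}".format(
                enabled=enabled, installed=installed))
```

Reporting both values makes it possible to tell from the test log whether numexpr was missing or merely switched off.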