Commit 99c2d93

chris-b1 authored and jreback committed
API: multi-line, not inplace eval
PEP8 compliance for test_eval and eval.py
1 parent b771c37 commit 99c2d93

5 files changed: +317 -50 lines changed

doc/source/enhancingperf.rst

+56-6
@@ -570,18 +570,51 @@ prefix the name of the :class:`~pandas.DataFrame` to the column(s) you're
570570
interested in evaluating.
571571

572572
In addition, you can perform assignment of columns within an expression.
573-
This allows for *formulaic evaluation*. Only a single assignment is permitted.
574-
The assignment target can be a new column name or an existing column name, and
575-
it must be a valid Python identifier.
573+
This allows for *formulaic evaluation*. The assignment target can be a
574+
new column name or an existing column name, and it must be a valid Python
575+
identifier.
576+
577+
.. versionadded:: 0.18.0
578+
579+
The ``inplace`` keyword determines whether this assignment will be performed
580+
on the original ``DataFrame`` or return a copy with the new column.
581+
582+
.. warning::
583+
584+
For backwards compatibility, ``inplace`` defaults to ``True`` if not
585+
specified. This will change in a future version of pandas - if your
586+
code depends on an inplace assignment you should update to explicitly
587+
set ``inplace=True``.
576588

577589
.. ipython:: python
578590
579591
df = pd.DataFrame(dict(a=range(5), b=range(5, 10)))
580-
df.eval('c = a + b')
581-
df.eval('d = a + b + c')
582-
df.eval('a = 1')
592+
df.eval('c = a + b', inplace=True)
593+
df.eval('d = a + b + c', inplace=True)
594+
df.eval('a = 1', inplace=True)
583595
df
584596
597+
When ``inplace`` is set to ``False``, a copy of the ``DataFrame`` with the
598+
new or modified columns is returned and the original frame is unchanged.
599+
600+
.. ipython:: python
601+
602+
df
603+
df.eval('e = a - c', inplace=False)
604+
df
605+
606+
.. versionadded:: 0.18.0
607+
608+
As a convenience, multiple assignments can be performed by using a
609+
multi-line string.
610+
611+
.. ipython:: python
612+
613+
df.eval("""
614+
c = a + b
615+
d = a + b + c
616+
a = 1""", inplace=False)
617+
585618
The equivalent in standard Python would be
586619

587620
.. ipython:: python
@@ -592,6 +625,23 @@ The equivalent in standard Python would be
592625
df['a'] = 1
593626
df
594627
628+
.. versionadded:: 0.18.0
629+
630+
The ``query`` method gained the ``inplace`` keyword which determines
631+
whether the query modifies the original frame.
632+
633+
.. ipython:: python
634+
635+
df = pd.DataFrame(dict(a=range(5), b=range(5, 10)))
636+
df.query('a > 2')
637+
df.query('a > 2', inplace=True)
638+
df
639+
640+
.. warning::
641+
642+
Unlike with ``eval``, the default value for ``inplace`` for ``query``
643+
is ``False``. This is consistent with prior versions of pandas.
644+
595645
Local Variables
596646
~~~~~~~~~~~~~~~
597647
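
The documentation above requires the assignment target to be a valid Python identifier. As a rough illustration of what that rule means, here is a minimal sketch that uses the standard `ast` module to pull the target name and right-hand side out of a single assignment string. The helper `split_assignment` is hypothetical and is not how pandas parses expressions.

```python
import ast


def split_assignment(expr):
    """Return (target_name, rhs_source) for one 'name = expression' string.

    Hypothetical helper for illustration only; not part of pandas.
    Text that is not valid Python raises SyntaxError from ast.parse.
    """
    text = expr.strip()
    tree = ast.parse(text)
    # Exactly one statement, and it must be an assignment.
    if len(tree.body) != 1 or not isinstance(tree.body[0], ast.Assign):
        raise ValueError("expected exactly one assignment: %r" % expr)
    node = tree.body[0]
    # A single plain name on the left-hand side (no a.b, a[0], or x = y = ...).
    if len(node.targets) != 1 or not isinstance(node.targets[0], ast.Name):
        raise ValueError("assignment target must be a single plain name")
    # Everything after the first '=' is the right-hand side.
    rhs = text.split("=", 1)[1].strip()
    return node.targets[0].id, rhs


name, rhs = split_assignment("c = a + b")
print(name, "<-", rhs)
```

A bare expression such as `a + b` is rejected with `ValueError`, which mirrors the distinction the docs draw between plain expressions and assignments.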

doc/source/whatsnew/v0.18.0.txt

+46-1
@@ -295,15 +295,60 @@ date strings is no longer supported and raises a ``ValueError``. (:issue:`11818`
295295

296296
- ``.memory_usage`` now includes values in the index, as does memory_usage in ``.info`` (:issue:`11597`)
297297

298+
Changes to eval
299+
^^^^^^^^^^^^^^^
298300

301+
In prior versions, new column assignments in an ``eval`` expression resulted
302+
in an inplace change to the ``DataFrame``. (:issue:`9297`)
299303

304+
.. ipython:: python
300305

306+
df = pd.DataFrame({'a': np.linspace(0, 10, 5), 'b': range(5)})
307+
df.eval('c = a + b')
308+
df
301309

310+
In version 0.18.0, a new ``inplace`` keyword was added to choose whether the
311+
assignment should be done inplace or return a copy.
302312

313+
.. ipython:: python
303314

315+
df
316+
df.eval('d = c - b', inplace=False)
317+
df
318+
df.eval('d = c - b', inplace=True)
319+
df
304320

321+
.. warning::
322+
323+
For backwards compatibility, ``inplace`` defaults to ``True`` if not specified.
324+
This will change in a future version of pandas - if your code depends on an
325+
inplace assignment you should update to explicitly set ``inplace=True``.
305326

327+
The ``inplace`` keyword parameter was also added to the ``query`` method.
306328

329+
.. ipython:: python
330+
331+
df.query('a > 5')
332+
df.query('a > 5', inplace=True)
333+
df
334+
335+
.. warning::
336+
337+
Note that the default value for ``inplace`` in a ``query``
338+
is ``False``, which is consistent with prior versions.
339+
340+
``eval`` has also been updated to allow multi-line expressions for multiple
341+
assignments. These expressions will be evaluated one at a time in order. Only
342+
assignments are valid for multi-line expressions.
343+
344+
.. ipython:: python
345+
346+
df
347+
df.eval("""
348+
e = d + a
349+
f = e - 22
350+
g = f / 2.0""", inplace=True)
351+
df
307352

308353
.. _whatsnew_0180.deprecations:
309354

@@ -412,7 +457,7 @@ Bug Fixes
412457
- Bug in ``pd.read_clipboard`` and ``pd.to_clipboard`` functions not supporting Unicode; upgrade included ``pyperclip`` to v1.5.15 (:issue:`9263`)
413458

414459

415-
460+
- Bug in ``DataFrame.query`` containing an assignment (:issue:`8664`)
416461

417462

418463
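
The release note describes a transitional default: ``inplace`` is ``None`` when omitted, behaves like ``True`` for now, and emits a ``FutureWarning``. A minimal sketch of that sentinel-default pattern follows, assuming a plain ``dict`` as a stand-in for a ``DataFrame``. The function `assign_column` is hypothetical and exists only to illustrate the pattern.

```python
import warnings


def assign_column(frame, name, values, inplace=None):
    """Set frame[name] = values, either in place or on a shallow copy.

    `frame` is a plain dict standing in for a DataFrame (an assumption
    made for this sketch); this helper is not part of pandas.
    """
    if inplace is None:
        # The caller did not choose: warn, then keep the old behaviour.
        warnings.warn(
            "assignment currently defaults to operating inplace; "
            "pass inplace=True to keep this behaviour",
            FutureWarning, stacklevel=2)
        inplace = True
    target = frame if inplace else dict(frame)
    target[name] = values
    # Mirror the convention in the notes: None when in place, else the copy.
    return None if inplace else target


frame = {"a": [1, 2]}
copy = assign_column(frame, "b", [3, 4], inplace=False)
print(sorted(frame), sorted(copy))
```

Using ``None`` rather than ``True`` as the declared default is what lets the function tell "the caller asked for in-place" apart from "the caller said nothing", so the warning fires only in the second case.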

pandas/computation/eval.py

+83-26
@@ -3,11 +3,12 @@
33
"""Top level ``eval`` module.
44
"""
55

6+
import warnings
67
import tokenize
78
from pandas.core import common as com
89
from pandas.computation.expr import Expr, _parsers, tokenize_string
910
from pandas.computation.scope import _ensure_scope
10-
from pandas.compat import DeepChainMap, builtins
11+
from pandas.compat import string_types
1112
from pandas.computation.engines import _engines
1213
from distutils.version import LooseVersion
1314

@@ -138,7 +139,7 @@ def _check_for_locals(expr, stack_level, parser):
138139

139140
def eval(expr, parser='pandas', engine='numexpr', truediv=True,
140141
local_dict=None, global_dict=None, resolvers=(), level=0,
141-
target=None):
142+
target=None, inplace=None):
142143
"""Evaluate a Python expression as a string using various backends.
143144
144145
The following arithmetic operations are supported: ``+``, ``-``, ``*``,
@@ -196,6 +197,13 @@ def eval(expr, parser='pandas', engine='numexpr', truediv=True,
196197
scope. Most users will **not** need to change this parameter.
197198
target : a target object for assignment, optional, default is None
198199
essentially this is a passed in resolver
200+
inplace : bool, default True
201+
If expression mutates, whether to modify object inplace or return
202+
copy with mutation.
203+
204+
WARNING: inplace=None currently falls back to True, but
205+
in a future version, will default to False. Use inplace=True
206+
explicitly rather than relying on the default.
199207
200208
Returns
201209
-------
@@ -214,29 +222,78 @@ def eval(expr, parser='pandas', engine='numexpr', truediv=True,
214222
pandas.DataFrame.query
215223
pandas.DataFrame.eval
216224
"""
217-
expr = _convert_expression(expr)
218-
_check_engine(engine)
219-
_check_parser(parser)
220-
_check_resolvers(resolvers)
221-
_check_for_locals(expr, level, parser)
222-
223-
# get our (possibly passed-in) scope
224-
level += 1
225-
env = _ensure_scope(level, global_dict=global_dict,
226-
local_dict=local_dict, resolvers=resolvers,
227-
target=target)
228-
229-
parsed_expr = Expr(expr, engine=engine, parser=parser, env=env,
230-
truediv=truediv)
231-
232-
# construct the engine and evaluate the parsed expression
233-
eng = _engines[engine]
234-
eng_inst = eng(parsed_expr)
235-
ret = eng_inst.evaluate()
236-
237-
# assign if needed
238-
if env.target is not None and parsed_expr.assigner is not None:
239-
env.target[parsed_expr.assigner] = ret
240-
return None
225+
first_expr = True
226+
if isinstance(expr, string_types):
227+
exprs = [e for e in expr.splitlines() if e != '']
228+
else:
229+
exprs = [expr]
230+
multi_line = len(exprs) > 1
231+
232+
if multi_line and target is None:
233+
raise ValueError("multi-line expressions are only valid in the "
234+
"context of data, use DataFrame.eval")
235+
236+
first_expr = True
237+
for expr in exprs:
238+
expr = _convert_expression(expr)
239+
_check_engine(engine)
240+
_check_parser(parser)
241+
_check_resolvers(resolvers)
242+
_check_for_locals(expr, level, parser)
243+
244+
# get our (possibly passed-in) scope
245+
level += 1
246+
env = _ensure_scope(level, global_dict=global_dict,
247+
local_dict=local_dict, resolvers=resolvers,
248+
target=target)
249+
250+
parsed_expr = Expr(expr, engine=engine, parser=parser, env=env,
251+
truediv=truediv)
252+
253+
# construct the engine and evaluate the parsed expression
254+
eng = _engines[engine]
255+
eng_inst = eng(parsed_expr)
256+
ret = eng_inst.evaluate()
257+
258+
if parsed_expr.assigner is None and multi_line:
259+
raise ValueError("Multi-line expressions are only valid"
260+
" if all expressions contain an assignment")
261+
262+
# assign if needed
263+
if env.target is not None and parsed_expr.assigner is not None:
264+
if inplace is None:
265+
warnings.warn(
266+
"eval expressions containing an assignment currently "
267+
"default to operating inplace.\nThis will change in "
268+
"a future version of pandas, use inplace=True to "
269+
"avoid this warning.",
270+
FutureWarning, stacklevel=3)
271+
inplace = True
272+
273+
# if returning a copy, copy only on the first assignment
274+
if not inplace and first_expr:
275+
target = env.target.copy()
276+
else:
277+
target = env.target
278+
279+
target[parsed_expr.assigner] = ret
280+
281+
if not resolvers:
282+
resolvers = ({parsed_expr.assigner: ret},)
283+
else:
284+
# existing resolver needs to be updated to handle
285+
# case of mutating existing column in copy
286+
for resolver in resolvers:
287+
if parsed_expr.assigner in resolver:
288+
resolver[parsed_expr.assigner] = ret
289+
break
290+
else:
291+
resolvers += ({parsed_expr.assigner: ret},)
292+
293+
ret = None
294+
first_expr = False
295+
296+
if not inplace and inplace is not None:
297+
return target
241298

242299
return ret
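
The control flow added to `eval` above (split the string into non-empty lines, require an assignment on each line, copy the target only once when not operating in place, and let later lines see earlier results) can be sketched in plain Python. This is a simplified stand-in under stated assumptions: `eval_lines` is a hypothetical name, the target is a `dict` of scalars rather than a `DataFrame`, Python's builtin `eval` replaces the pandas engines, and single-line non-assignment expressions are rejected rather than returned.

```python
def eval_lines(expr, target, inplace=False):
    """Evaluate newline-separated 'name = expression' lines against a dict.

    Hypothetical sketch, not the pandas implementation: `target` is a dict
    of scalars and the right-hand sides are run with Python's builtin eval.
    """
    lines = [line.strip() for line in expr.splitlines() if line.strip()]
    result = target
    copied = False
    for line in lines:
        name, sep, rhs = line.partition("=")
        name = name.strip()
        # Reject comparisons such as 'a == b' or 'a >= b' and bare expressions.
        if not sep or rhs.startswith("=") or not name.isidentifier():
            raise ValueError("every line must contain an assignment: %r" % line)
        # When returning a copy, copy only before the first assignment.
        if not inplace and not copied:
            result = dict(result)
            copied = True
        # Earlier assignments are visible because `result` is the namespace.
        result[name] = eval(rhs, {"__builtins__": {}}, dict(result))
    return None if inplace else result


data = {"a": 1, "b": 2}
out = eval_lines("""
c = a + b
d = a + b + c
a = 10""", data)
print(out)
print(data)
```

Copying once, before the first assignment, keeps the original mapping untouched while still letting `d = a + b + c` read the `c` computed on the previous line.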
