@@ -464,19 +464,20 @@ evaluate an expression in the "context" of a ``DataFrame``.
 
 Any expression that is a valid :func:`~pandas.eval` expression is also a valid
 ``DataFrame.eval`` expression, with the added benefit that *you don't have to
-prefix the name of the* ``DataFrame`` *to the column you're interested in
+prefix the name of the* ``DataFrame`` *to the column(s) you're interested in
 evaluating*.
 
-In addition, you can perform in-line assignment of columns within an expression.
-This can allow for *formulaic evaluation*. Only a signle assignement is permitted.
-It can be a new column name or an existing column name. It must be a string-like.
+In addition, you can perform assignment of columns within an expression.
+This allows for *formulaic evaluation*. Only a single assignment is permitted.
+The assignment target can be a new column name or an existing column name, and
+it must be a valid Python identifier.
 
 .. ipython:: python
 
-   df = DataFrame(dict(a=range(5), b=range(5,10)))
-   df.eval('c=a+b')
-   df.eval('d=a+b+c')
-   df.eval('a=1')
+   df = DataFrame(dict(a=range(5), b=range(5, 10)))
+   df.eval('c = a + b')
+   df.eval('d = a + b + c')
+   df.eval('a = 1')
    df
 
 Local Variables
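
Note on the hunk above: a minimal sketch of the three assignment forms it documents, assuming a recent pandas release imported as ``pd``. In recent releases ``DataFrame.eval`` returns a new frame unless ``inplace=True`` is passed, whereas the ``ipython`` block in the diff appears to rely on in-place assignment, so the sketch passes ``inplace=True`` explicitly. The variable name ``preview`` is illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"a": range(5), "b": range(5, 10)})

# Without inplace=True, recent pandas leaves df untouched and returns a copy.
preview = df.eval("c = a + b")

# New column computed from existing columns.
df.eval("c = a + b", inplace=True)
# A later expression can use the column assigned just above.
df.eval("d = a + b + c", inplace=True)
# The assignment target may also be an existing column.
df.eval("a = 1", inplace=True)

print(df)
```

Only one assignment per expression string is used here, in line with the restriction stated in the hunk.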
@@ -616,3 +617,20 @@ different engines.
 
 This plot was created using a ``DataFrame`` with 3 columns each containing
 floating point values generated using ``numpy.random.randn()``.
+
+Technical Minutia
+~~~~~~~~~~~~~~~~~
+- Expressions that would result in an object dtype (including simple
+  variable evaluation) have to be evaluated in Python space. The main reason
+  for this behavior is to maintain backwards compatibility with versions of
+  numpy < 1.7. In those versions of ``numpy`` a call to ``ndarray.astype(str)``
+  will truncate any strings that are more than 60 characters in length. Second,
+  we can't pass ``object`` arrays to ``numexpr`` thus string comparisons must
+  be evaluated in Python space.
+- The upshot is that this *only* applies to object-dtype'd expressions. So,
+  if you have an expression--for example--that's a string comparison
+  ``and``-ed together with another boolean expression that's from a numeric
+  comparison, the numeric comparison will be evaluated by ``numexpr``. In fact,
+  in general, :func:`~pandas.query`/:func:`~pandas.eval` will "pick out" the
+  subexpressions that are ``eval``-able by ``numexpr`` and those that must be
+  evaluated in Python space transparently to the user.
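
Note on the hunk above: a minimal sketch of the mixed case the second bullet describes, a string comparison ``and``-ed with a numeric comparison in one query. It assumes a recent pandas release imported as ``pd``. The column names ``name`` and ``value`` are made up for illustration. Which engine handles each subexpression is internal to pandas and is not observable from this code; the sketch only checks that the combined query matches ordinary boolean indexing.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "name": ["x", "y", "x", "z"],    # string column
        "value": [1.0, -2.0, 3.0, 4.0],  # numeric column
    }
)

# One expression mixing a string comparison and a numeric comparison.
result = df.query('name == "x" and value > 0')

# The same filter written as plain boolean indexing, for comparison.
expected = df[(df["name"] == "x") & (df["value"] > 0)]

print(result)
```

The caller writes a single expression either way; any split between ``numexpr`` and Python-space evaluation happens behind the call.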