BUG: sqrt not implemented in df.eval #7677

Closed
acorbe opened this issue Jul 7, 2014 · 16 comments

Comments

@acorbe
Contributor

acorbe commented Jul 7, 2014

In the following scenario:

import pandas as pd
import numpy as np

a = np.random.rand(1000)
df = pd.DataFrame({'a' : a })

This call doesn't work (while it is supposed to, I guess)

df.eval('sqrt(a)')

     NotImplementedError: 'Call' nodes are not implemented

while this one does

df.eval('(a)**(.5)')

I guess this is a bug.

@jreback
Contributor

jreback commented Jul 7, 2014

Well, it's NotImplemented; the AST parsing does not interpret the Call node ATM. I just don't think @cpcloud had time to implement it (there may have been another reason, but I don't recall). You are welcome to take a stab.

This applies to function calls in general in eval/query. Note that this IS implemented for the pytables parsing.

@jreback jreback added this to the Someday milestone Jul 7, 2014
@acorbe
Contributor Author

acorbe commented Jul 7, 2014

Hi,
I can see whether I can implement it. I have no idea where I should start looking, though.

Do you have suggestions?

@jreback
Contributor

jreback commented Jul 7, 2014

https://github.com/pydata/pandas/blob/master/pandas/computation/expr.py

Hmm, this IS implemented, so it might be buggy somewhere. Step through this in the debugger and see if you can spot where/why this is rejected.

@jreback jreback added the Bug label Jul 7, 2014
@jreback jreback modified the milestones: 0.15.0, Someday Jul 7, 2014
@cpcloud
Member

cpcloud commented Jul 7, 2014

the parser disallows function calls by original design, mostly for time reasons but this could be opened up. there's a frozenset object somewhere (I think in computation/expr.py) that contains the valid numexpr functions. I don't think arbitrary function calls are in scope of eval, but we could certainly open it up for the numexpr functions.
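
A minimal sketch of the whitelist idea described above, using only the standard-library ast module. The names ALLOWED_FUNCTIONS and check_calls are hypothetical, and the set below is just an illustrative handful of math functions that numexpr supports; it is not the frozenset that lives in pandas.

```python
import ast

# Illustrative subset of numexpr's math functions (an assumption for this
# sketch, not the actual frozenset used by pandas).
ALLOWED_FUNCTIONS = frozenset(["sqrt", "log", "exp", "sin", "cos"])


def check_calls(expression):
    """Raise NotImplementedError for any call outside ALLOWED_FUNCTIONS."""
    tree = ast.parse(expression, mode="eval")
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        # Only bare names such as sqrt(...) are accepted; obj.method(...) is not.
        if not isinstance(node.func, ast.Name):
            raise NotImplementedError("only plain function names are supported")
        if node.func.id not in ALLOWED_FUNCTIONS:
            raise NotImplementedError(
                "{0!r} is not a supported function".format(node.func.id)
            )
    return True


check_calls("sqrt(a) + log(b)")  # passes: both names are whitelisted
```

With this approach an expression such as open('x') would be rejected before anything is evaluated, which keeps arbitrary function calls out of scope.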

@cpcloud
Member

cpcloud commented Jul 7, 2014

@jreback that implementation is only used by the pytables engine

@jreback
Contributor

jreback commented Jul 7, 2014

@cpcloud right....this was a while ago. yes I remember now, the arbitrary functions have to be done in python space, so was a bit complicated (e.g. have to validate which are numexpr capable and which are not).

@jreback
Contributor

jreback commented Jul 7, 2014

@acorbe this is a bit non-trivial, but I would love some eyes on it! This is actually a pretty interesting thing to understand: you get to learn AST parsing and numexpr internals (maybe a +1 or a -1, by your definition)!

@jreback jreback modified the milestones: 0.15.1, 0.15.0 Jul 7, 2014
@cpcloud
Member

cpcloud commented Jul 7, 2014

@acorbe just a bit of what's going on to (maybe?) whet your appetite:

An expression like:

a ** .5

which as you show parses to

(a) ** (.5)

eventually goes into numexpr (effectively) as

numexpr.evaluate('(df.a.values) ** (0.5)')

Since ** is a binary operator, you get it for "free" (as long as you implement a generic enough representation of binary operators).

With function calls, the parsing is more complicated (though most of the code necessary to do it is already there thanks to @jreback), but Python has the ast module which is fairly easy to use to parse Python code.
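
To make the difference concrete, here is a small sketch with the standard-library ast module showing that the two spellings produce different root nodes. The helper names top_level_node and called_function_names are made up for this illustration and are not part of pandas.

```python
import ast


def top_level_node(expression):
    """Return the class name of the root expression node of `expression`."""
    tree = ast.parse(expression, mode="eval")  # wraps the body in ast.Expression
    return type(tree.body).__name__


def called_function_names(expression):
    """Collect the names of plainly named functions called in `expression`."""
    tree = ast.parse(expression, mode="eval")
    return sorted(
        node.func.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
    )


print(top_level_node("a ** .5"))                  # BinOp
print(top_level_node("sqrt(a)"))                  # Call
print(called_function_names("sqrt(a) + log(b)"))  # ['log', 'sqrt']
```

The binary-operator form never produces a Call node, which is why it works while sqrt(a) is rejected.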

numexpr has a small set of functions that it implements in its special (extensible) evaluator.

There's a lot going on in eval/query, which I'm happy to discuss with you; I would love to have some more eyes on this.

@acorbe
Contributor Author

acorbe commented Jul 10, 2014

Hi people,

thanks for your explanations!

I must say that when I reported this as a bug I had looked at the numexpr reference. Since sqrt is part of numexpr, I somehow thought that the problem could be solved easily/trivially. Clearly that's not the case.

I cannot promise to have much time to look at this, sorry :(.

@Twizzledrizzle

Any progress on this? If not, I will have a look to see if I can help.

@cpcloud
Member

cpcloud commented Nov 9, 2014

@Twizzledrizzle please dive in if you get a chance! I'm not sure when I'll be able to get to this

@Twizzledrizzle

Hmmm....

_unsupported_nodes contains (expr.py, around line 316):

frozenset(['Global', 'Raise', 'For', 'Param', 'DictComp', 'IsNot', 'Suite', 'Print', 'Import', 'AugLoad', 
'TryExcept', 'Store', 'expr_context', 'excepthandler', 'Return', 'Exec', 'Repr', 'ExceptHandler', 
'ImportFrom', 'TryFinally', 'Delete', 'With', 'arguments', 'SetComp', 'AugAssign', 'ClassDef', 'stmt', 
'While', 'Continue', 'Del', 'Yield', 'Expression', 'FunctionDef', 'mod', 'Load', 'Interactive', 'Set', 'keyword', 
'Is', 'AST', 'AugStore', 'alias', 'Lambda', 'Assert', 'Break', 'Pass', 'If', 'GeneratorExp', 'IfExp'])

I cannot find where it gets stuck with the "'Call' nodes are not implemented" error.

Can you point me closer to the problem? Sorry, much of this code is new to me :)

@Twizzledrizzle

In this function, the 'Call' node is included when I print the node. Can it be included here?

# python code
def disallow(nodes):
    """Decorator to disallow certain nodes from parsing. Raises a
    NotImplementedError instead.

    Returns
    -------
    disallowed : callable
    """
    def disallowed(cls):
        cls.unsupported_nodes = ()
        for node in nodes:
            print(node)  # debug output: shows each disallowed node name
            new_method = _node_not_implemented(node, cls)
            name = 'visit_{0}'.format(node)
            cls.unsupported_nodes += (name,)
            setattr(cls, name, new_method)
        return cls
    return disallowed
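
For illustration, here is a self-contained sketch of how a decorator like this ends up producing the error from the original report. The body of _node_not_implemented is an assumption made for this sketch (the real helper lives in expr.py), and DemoVisitor and try_visit are hypothetical names, not pandas classes.

```python
import ast


def _node_not_implemented(node_name, cls):
    """Assumed stand-in: build a visit method that always raises."""
    def f(self, *args, **kwargs):
        raise NotImplementedError(
            "{0!r} nodes are not implemented".format(node_name)
        )
    return f


def disallow(nodes):
    """Attach a raising visit_<Node> method for every name in `nodes`."""
    def disallowed(cls):
        cls.unsupported_nodes = ()
        for node in nodes:
            new_method = _node_not_implemented(node, cls)
            name = "visit_{0}".format(node)
            cls.unsupported_nodes += (name,)
            setattr(cls, name, new_method)
        return cls
    return disallowed


@disallow(["Call"])
class DemoVisitor(ast.NodeVisitor):
    """Hypothetical visitor that rejects function calls."""


def try_visit(expression):
    """Walk `expression`; return "ok" or the NotImplementedError message."""
    tree = ast.parse(expression, mode="eval")
    try:
        DemoVisitor().visit(tree)
    except NotImplementedError as exc:
        return str(exc)
    return "ok"


print(try_visit("a ** .5"))  # ok
print(try_visit("sqrt(a)"))  # 'Call' nodes are not implemented
```

Under these assumptions the error is raised by the generated visit_Call method, so supporting calls would mean leaving 'Call' out of the disallowed names and writing a real visit_Call.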

@Twizzledrizzle

I am also not sure how the engine works in eval.py, line 211.

When I pass a hard-coded 'sqrt(2)', for instance, instead of parsed_expr, it still does not work.

Do you know what string is required to make the engine work?

I guess one simple way of making it work is to turn every 'sqrt(....)' into '(....)**0.5'.
Would this be too ugly?
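
A rough sketch of that rewrite, done on the syntax tree rather than on the string so that nested calls are handled. SqrtToPower and evaluate_with_rewrite are hypothetical names, and the sketch evaluates plain Python scalars with the built-in eval instead of going through numexpr, purely to show the transformation.

```python
import ast


class SqrtToPower(ast.NodeTransformer):
    """Rewrite sqrt(x) into x ** 0.5; leave every other call untouched."""

    def visit_Call(self, node):
        self.generic_visit(node)  # rewrite nested calls first
        is_sqrt = (
            isinstance(node.func, ast.Name)
            and node.func.id == "sqrt"
            and len(node.args) == 1
            and not node.keywords
        )
        if not is_sqrt:
            return node
        power = ast.BinOp(
            left=node.args[0], op=ast.Pow(), right=ast.Constant(value=0.5)
        )
        return ast.copy_location(power, node)


def evaluate_with_rewrite(expression, namespace):
    """Parse, rewrite sqrt calls, then evaluate against `namespace`."""
    tree = ast.parse(expression, mode="eval")
    tree = ast.fix_missing_locations(SqrtToPower().visit(tree))
    code = compile(tree, "<rewritten>", "eval")
    # Builtins are emptied so only names from `namespace` can be resolved.
    return eval(code, {"__builtins__": {}}, dict(namespace))


print(evaluate_with_rewrite("sqrt(a) + sqrt(sqrt(b))", {"a": 9.0, "b": 16.0}))  # 5.0
```

This only covers sqrt; other functions such as log have no equivalent operator, so a rewrite like this would not generalize.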

@jreback jreback modified the milestones: 0.16.0, Next Major Release Mar 6, 2015
@xmnlab

xmnlab commented Jul 1, 2015

The same problem occurs with the log function.

@jreback
Contributor

jreback commented Oct 19, 2015

this was closed by #10953 and is available in 0.17.0

@jreback jreback closed this as completed Oct 19, 2015
@jreback jreback modified the milestones: 0.17.0, Next Major Release Oct 19, 2015