ENH: remove restrictions to numexpr to allow where etc. #34834
Comments
Can you add a minimal example? http://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports |
Well, yeah. I've updated it. But it does not make so much sense I think to provide an example for Again, in short: Let me know if things are confusing and thanks a lot for taking a look at it! |
Thanks. And can you add the expected output? Do you know if we have other expressions that would be supported by only one engine (numexpr in this case)? |
I was about to report this issue but realized [edit: I would have created] a dupe. numexpr supports the following functions missing from the definitions in pandas.core.computation.ops:
Expectation is that

import pandas as pd
data = {'a': [1, 2, 3]}
df = pd.DataFrame({'a': [1, 2, 3]})
df.eval('where(a>2, 42, 0)')

should return the same as pd.DataFrame({'a':[0, 0, 42]}) |
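For reference, the expected values can be reproduced outside of eval. This is only a sketch using numpy.where as a stand-in for the requested where function; the variable name expected is made up for illustration and is not part of any pandas API.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3]})

# Stand-in for the requested eval("where(a>2, 42, 0)"):
# pick 42 where the condition holds, 0 elsewhere.
expected = pd.Series(np.where(df["a"] > 2, 42, 0), name="a")

print(expected.tolist())
```

Whether df.eval should hand back a Series or a DataFrame here is left open; only the values matter for the comparison.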
My apologies @mayou36 can you reopen? I meant that I would have created a dupe. This is not a dupe, I think it's a legit issue that pandas doesn't support where or tan when numexpr does. |
OK, sorry, that was by mistake and it is not yet resolved. @TomAugspurger, do you have an idea why this is not here? |
I have "solved" my problem by using

import pandas
pandas.core.computation.ops.MATHOPS = (*pandas.core.computation.ops.MATHOPS, "where")

Works out of the box with ternary operators, 🎉

df = pandas.DataFrame({"a": [2.0, 4.0, 5.0]})
pandas.eval("where(df.a > 3.0, df.a, 1)", target=df)

results in
Not exactly proud of this solution, but it shows that this feature request could probably be resolved by adding the function names as strings and writing some unit tests. NB: arguments 2 and 3 have to be numbers. I'd love to have strings (the same type should be required only for arguments 2 and 3) |
@TomAugspurger any news on this? This seems to be a limitation for no apparent reason? |
Okay, so what about if I go ahead and remove the limitation in a PR? Maybe the tests will tell us why this is there, or some maintainer, but without comments in the code or on the issue here, we can only guess. |
Another issue was opened for this same topic: #55091 @jonas-eschle I suggest you add whichever functions you think are going to be useful and write unit tests that ensure every function you're adding is working properly. |
FYI also loosely related I've filed #58329 which means that |
Is your feature request related to a problem?
The evaluation of a query is currently limited to the list _mathops, while numexpr would support more, most notably a where (which would also make other issues simple to solve). I do not see any reason for this restriction. In fact, simply adding where to the list makes it run (at least for my use case). Why is this restriction in place? Why can't we enlarge it or pass the expression directly through to numexpr?
Describe the solution you'd like
Allow the full operator set that numexpr supports in pd.eval.
API breaking implications
Nothing
Alternatives
Using .where is an option if you can access the dataframe directly (although suboptimal). However, if your selection of data is based on passing a selection string around instead of the df (several reasons for this), this is not feasible.
The following doesn't work:
whereas in numexpr it does.
We expect this to return [0, 0, 42]
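As a rough illustration of the numexpr side, the same expression can be handed to numexpr.evaluate directly. This is a sketch, not the snippet from the original report: the helper name numexpr_where is hypothetical, and the import is guarded because numexpr is an optional dependency.

```python
import numpy as np

try:
    import numexpr as ne  # optional dependency
except ImportError:
    ne = None


def numexpr_where(values):
    """Evaluate where(a > 2, 42, 0) with numexpr; None if numexpr is missing."""
    if ne is None:
        return None
    a = np.asarray(values)
    # Pass the array explicitly instead of relying on frame inspection.
    result = ne.evaluate("where(a > 2, 42, 0)", local_dict={"a": a})
    return result.tolist()


print(numexpr_where([1, 2, 3]))
```

With numexpr installed this should give the values requested above, which is what makes the restriction in pd.eval look unnecessary.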