Adding lambda support inside of __getitem__ for DataFrame, Series, .. etc. #2560
Comments
I'm surprised this doesn't work:
because and'ing two arrays together is not currently a vector operation,
does the following do what you want?
That does it, but my goal with the lambdas is specifically to avoid having to type out the dataframe's name, a dot, and the attribute name for every column involved in the selection. The lambda at least lets me reduce that to just "x". As for "and" not working where & works, this is a known limitation of Python's built-in logical operators when they are applied to arrays: it raises the classic "truth value of an array is ambiguous" error. On Dec 18, 2012, at 6:27 PM, jreback wrote:
this is addressed by #4164
To avoid the verbose syntax currently needed to select across many columns of a data frame, here's a suggestion. Inside DataFrame's __getitem__ method, add special-case logic for when a lambda function is passed in. If a lambda is passed in, apply it to the dataframe itself and then get the items based on the lambda's result.
Here's an example of what I mean. Suppose that I create a data frame named "dfrm" and it has columns A, B, C, D, and E. Then currently, the following syntax will work to sub-select across conditions on the A and B columns:
dfrm[(lambda x: (x.A < 0) & (x.B > 0))(dfrm)]
By adding the extra handling to __getitem__, you can remove the need for the last set of parentheses, where dfrm itself is passed as the argument to the lambda. __getitem__ can check for a callable and always pass the dataframe itself to that callable, so that the syntax would look like this:
dfrm[(lambda x: (x.A < 0) & (x.B > 0))]
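The callable check proposed above can be sketched without touching pandas at all. The following is a toy illustration only: Column and MiniFrame are hypothetical stand-ins invented for this example, not pandas classes, and the dispatch shown is an assumption about how such a special case might look rather than how pandas implements anything.

```python
class Column:
    """Toy stand-in for a Series (hypothetical, for illustration only)."""

    def __init__(self, values):
        self.values = list(values)

    def __lt__(self, other):
        # Element-wise comparison against a scalar.
        return Column(v < other for v in self.values)

    def __gt__(self, other):
        return Column(v > other for v in self.values)

    def __and__(self, other):
        # `&` can be overloaded element-wise; the `and` keyword cannot.
        return Column(a and b for a, b in zip(self.values, other.values))


class MiniFrame:
    """Toy stand-in for a DataFrame (hypothetical, not the pandas implementation)."""

    def __init__(self, data):
        self._data = {name: Column(vals) for name, vals in data.items()}

    def __getattr__(self, name):
        # Allow attribute-style column access such as frame.A.
        try:
            return self.__dict__["_data"][name]
        except KeyError:
            raise AttributeError(name) from None

    def __getitem__(self, key):
        # Proposed special case: apply a callable key to the frame itself,
        # then use its result as the real key.
        if callable(key):
            key = key(self)
        if isinstance(key, Column):
            # Boolean mask: keep only the rows where the mask is True.
            return MiniFrame({
                name: [v for v, keep in zip(col.values, key.values) if keep]
                for name, col in self._data.items()
            })
        # Otherwise treat the key as a column name.
        return self._data[key]

    def to_dict(self):
        return {name: col.values for name, col in self._data.items()}


dfrm = MiniFrame({"A": [-1, 2, -3], "B": [5, 6, -7]})

# Current spelling: the lambda is called explicitly with dfrm.
explicit = dfrm[(lambda x: (x.A < 0) & (x.B > 0))(dfrm)]

# Proposed spelling: __getitem__ calls the lambda itself.
implicit = dfrm[lambda x: (x.A < 0) & (x.B > 0)]
```

Both spellings select the same rows here. Checking callable(key) before any other handling keeps the special case to two lines and leaves the existing mask and column-name paths unchanged.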