ENH: Allow function in Series/DataFrame get syntax #9279

Closed
soaxelbrooke opened this issue Jan 17, 2015 · 1 comment
Labels
API Design
Indexing (Related to indexing on series/frames, not to indexes themselves)

Comments

@soaxelbrooke

Currently, pandas makes clever use of __getitem__ to filter/mask Series and DataFrame objects. At times, this abstraction is leaky: for instance, df[18 < df['age'] < 25] raises an exception, even though chained comparisons of that form are legal Python syntax.
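To make the failure concrete, here is a minimal sketch. The sample data is invented for illustration. Python expands a chained comparison into two comparisons joined by `and`, and `and` needs a single truth value, which a Series does not provide.

```python
import pandas as pd

# Hypothetical sample data, not taken from the issue.
df = pd.DataFrame({"age": [12, 19, 24, 30]})

# 18 < s < 25 is evaluated as (18 < s) and (s < 25).
# The `and` calls bool() on a Series, which pandas rejects with ValueError.
try:
    df[18 < df["age"] < 25]
    chained_error = None
except ValueError as exc:
    chained_error = exc

# The explicit element-wise form works today: two masks combined with &.
mask = (df["age"] > 18) & (df["age"] < 25)
in_range = df[mask]
```

The parentheses around each comparison matter, because `&` binds more tightly than `<` and `>`.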

It would be awesome to also be able to pass a function to __getitem__ that would act as a filter. This would allow users to use all of Python's language features for filtering while still maintaining the brevity intended for these statements.

For example:

df[lambda row: 18 < row['age'] < 25]

This would also allow lengthier filters to be defined outside of the getter square brackets:

def age_filter(row):
    return 18 < row['age'] < 25 if row['country'] == 'USA' else row['age'] > 13
df[age_filter]
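For comparison, a row-wise predicate like this can already be applied with the existing API, at the cost of one extra call. The sketch below assumes the proposed `df[age_filter]` would call the function once per row; the sample data is invented for illustration.

```python
import pandas as pd

# Hypothetical sample data, not taken from the issue.
df = pd.DataFrame({
    "age": [12, 15, 20, 30],
    "country": ["USA", "FRA", "USA", "FRA"],
})

def age_filter(row):
    return 18 < row["age"] < 25 if row["country"] == "USA" else row["age"] > 13

# axis=1 passes each row to age_filter as a Series and collects
# the results into a boolean mask aligned with df's index.
row_mask = df.apply(age_filter, axis=1)
filtered = df[row_mask]
```

This runs the Python function once per row, so it is usually much slower than a vectorized mask on large frames.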

Accepting functions as filters would also allow filter functions to be chained without creating multiple intermediate DataFrames or Series, making cascaded filters much easier to optimize.

@jreback
Contributor

jreback commented Jan 17, 2015

this is a dupe of #2560. you can do this using query/eval, see docs here

you can do chained relationships as well.
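A minimal sketch of the query approach, using invented sample data. `DataFrame.query` parses the expression string itself, so the chained comparison is handled element-wise.

```python
import pandas as pd

# Hypothetical sample data, not taken from the issue.
df = pd.DataFrame({"age": [12, 19, 24, 30]})

# Column names are referenced directly inside the expression string.
young_adults = df.query("18 < age < 25")
```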

__getitem__ is already hopelessly overloaded. I don't think this adds any clarity. And it will be prone to non-vectorized evaluation, losing performance. All of the operations you show above are vectorizable.
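As one possible vectorized rewrite of the age_filter example above (a sketch with invented sample data, not necessarily the form the maintainers had in mind), the if/else becomes a pair of boolean masks selected by a third:

```python
import pandas as pd

# Hypothetical sample data, not taken from the issue.
df = pd.DataFrame({
    "age": [12, 15, 20, 30],
    "country": ["USA", "FRA", "USA", "FRA"],
})

is_usa = df["country"] == "USA"
usa_ok = (df["age"] > 18) & (df["age"] < 25)  # rule for USA rows
other_ok = df["age"] > 13                     # rule for all other rows

# Apply the USA rule where is_usa is True, the other rule elsewhere.
mask = (is_usa & usa_ok) | (~is_usa & other_ok)
filtered = df[mask]
```

Each comparison operates on a whole column at once, so no Python function is called per row.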

jreback added the API Design and Indexing labels Jan 17, 2015