Currently, pandas makes clever use of __getitem__ to filter/mask Series and DataFrame. At times, the current abstraction is leaky - for instance, df[18 < df['age'] < 25] would raise an exception, even though cascaded compare symbols in that manner are legal Python syntax.
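To make the failure concrete: Python expands `18 < x < 25` into `(18 < x) and (x < 25)`, and `and` needs a single truth value from the first comparison, which a Series refuses to give. A minimal sketch, assuming a made-up `age` column:

```python
import pandas as pd

# Made-up sample data, for illustration only.
df = pd.DataFrame({"age": [15, 20, 24, 30]})

# `18 < df["age"] < 25` becomes `(18 < df["age"]) and (df["age"] < 25)`.
# The `and` calls bool() on a boolean Series, which raises ValueError.
try:
    df[18 < df["age"] < 25]
    chained_error = None
except ValueError as exc:
    chained_error = exc

# Element-wise equivalent: build two masks and combine them with `&`.
mask = (df["age"] > 18) & (df["age"] < 25)
filtered = df[mask]
```

The `&` form is what the chained comparison has to be rewritten to when indexing with a boolean mask.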
It would be awesome to also be able to pass in a function to the __getitem__ function that would act as a filter. This would allow users to use all of Python's language features for filtering while still maintaining the brevity intended for these statements.
For example:
df[lambda row: 18 < row['age'] < 25]
This would also allow lengthier filters to be defined outside of the getter square brackets:
Functions as viable filters would also allow for chaining of filter functions without creating multiple intermediary DataFrames or Series, making optimization of cascaded filters much easier.
this is a dupe of #2560. you can do this using query/eval, see docs here
you can do chained relationships as well.
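A short sketch of the query/eval route, assuming a made-up `age` column; the expression string is parsed by pandas, so the chained comparison is allowed there:

```python
import pandas as pd

# Made-up sample data, for illustration only.
df = pd.DataFrame({"age": [15, 20, 24, 30]})

# query() accepts the chained comparison directly and returns the matching rows.
young_adults = df.query("18 < age < 25")

# eval() evaluates the same expression and returns the boolean mask itself.
mask = df.eval("18 < age < 25")
```

`df[mask]` selects the same rows as the `query` call.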
__getitem__ is already hopelessly overloaded. I don't think this adds any clarity. And it will be prone to non-vectorized evaluation, losing performance. All of the operations you show above are vectorizable.
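To illustrate the vectorization point, here is a rough sketch contrasting a per-row filter function with the equivalent column-wise mask. The data is made up, and `DataFrame.apply(..., axis=1)` stands in for what a row-based `__getitem__` filter would presumably have to do:

```python
import pandas as pd

# Made-up sample data, for illustration only.
df = pd.DataFrame({"age": [15, 20, 24, 30]})

# Row-wise: the lambda is called once per row in Python, so the chained
# comparison works, but the cost grows with a Python call per row.
row_mask = df.apply(lambda row: 18 < row["age"] < 25, axis=1)

# Vectorized: each comparison runs over the whole column at once.
vec_mask = df["age"].gt(18) & df["age"].lt(25)

row_filtered = df[row_mask]
vec_filtered = df[vec_mask]
```

Both masks select the same rows; the difference is only in how the comparison is evaluated.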