Add filtering capability to GroupBy #919

wesm · 2012-03-15T15:13:56Z

This can be accomplished in a hackish way using apply, but a more structured approach would be nice.

7/12/2012: Not sure what I was intending with this one

sanand0 · 2012-10-13T08:04:59Z

This would be quite useful.

For example, suppose that for an address book, data.groupby('city') yields 1000 cities, and we want only those with over 100 entries. It would be useful to be able to say something like:

grouped = data.groupby('city')
grouped.filter(grouped.size() > 100)

... and then compute on just that subset.
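For readers landing here later, a minimal sketch of two ways to get that subset. The data, the column names, and the small threshold (standing in for 100) are assumptions made for illustration. Note that the filter method eventually added by #3680 takes a callable applied to each group, rather than the precomputed boolean Series proposed above.

```python
import pandas as pd

# Hypothetical address-book data; the column names are assumptions.
data = pd.DataFrame({
    "city": ["Pune", "Pune", "Pune", "Oslo", "Lima", "Lima"],
    "name": ["a", "b", "c", "d", "e", "f"],
})

min_entries = 2  # small stand-in for the threshold of 100 discussed above

# Workaround without a dedicated method: broadcast each group's size
# back onto its rows, then select rows with an ordinary boolean mask.
sizes = data.groupby("city")["city"].transform("size")
subset_mask = data[sizes > min_entries]

# GroupBy.filter: the callable receives each group as a DataFrame and
# returns True for groups whose rows should be kept.
subset_filter = data.groupby("city").filter(lambda g: len(g) > min_entries)
```

Both results keep the original row order and index, so further computation can proceed on the subset as on any other DataFrame.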

apratap · 2012-10-31T23:30:03Z

FYI: without knowing about this open issue, I stumbled upon the same cleaning requirement. It would be nice to have this in pandas, but for now I was able to move forward.

Ref: http://stackoverflow.com/questions/13167391/filtering-grouped-df-in-pandas

apratap · 2012-11-01T16:53:59Z

Wesley: Can you please help me with the apply hack? I still can't seem to filter grouped data. More details are in the Stack Overflow post linked above. Thanks! -Abhi

blounsbury-usbr · 2012-11-08T21:22:23Z

Probably bad issue etiquette, but I just wanted to add my +1 for this enhancement. I grouped my data by year (hydrologic water year, actually) and then wanted to remove years with fewer than 365 days of data. I used the Stack Overflow answer based on pandas.concat() to work around it, but that is pretty ugly.

I agree with sanand0 that grouped.filter() would be easiest. Another possibility would be to add a 'drop()' function to a DataFrameGroupBy object. This would allow a loop over len(group.groups[name]).
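A rough sketch of the two workarounds mentioned in this comment, under stated assumptions: the frame, the column names `water_year` and `flow`, and the tiny `required_days` value (standing in for 365) are all hypothetical, and the second variant only approximates the proposed drop() idea, since no such method exists on a GroupBy object.

```python
import pandas as pd

# Hypothetical daily data; years are shortened to keep the sketch small.
df = pd.DataFrame({
    "water_year": [2010] * 3 + [2011] * 2 + [2012] * 3,
    "flow": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
})
required_days = 3  # stand-in for 365 days of data

grouped = df.groupby("water_year")

# concat workaround: iterate over the groups, keep the complete ones,
# and stitch them back together into a single DataFrame.
kept = pd.concat(
    [group for _, group in grouped if len(group) >= required_days]
)

# drop-style loop: grouped.groups maps each group name to its row labels,
# so the label count identifies the incomplete years to exclude.
incomplete = [
    name for name, labels in grouped.groups.items()
    if len(labels) < required_days
]
kept_via_drop = df[~df["water_year"].isin(incomplete)]
```

Both variants leave the rows for 2010 and 2012 with their original index labels.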

wesm · 2012-11-21T04:12:33Z

Another somewhat related reference: http://stackoverflow.com/questions/13446480/python-pandas-remove-entries-based-on-the-number-of-occurrences

jalperin · 2012-11-29T22:03:53Z

Another +1. I can't quite figure out the best generic way to work around it. Which of the SO answers do you recommend, Wesley?

jreback · 2013-06-06T21:07:08Z

Closed via #3680.

danielballan mentioned this issue May 22, 2013

EHN: Add filter methods to SeriesGroupBy, DataFrameGroupBy GH919 #3680

Merged

jreback closed this as completed in #3680 Jun 6, 2013
