Skip to content

BUG: Groupby filter maintains ordering, closes #4621 #5222

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 15, 2013

Conversation

danielballan
Copy link
Contributor

Simply adds np.sort.

Includes tests for SeriesGroupBy and DataFrameGroupBy, using indexes that are sequentially increasing, sequentially decreasing, and shuffled. Also, this PR removes all ad hoc sorting involved in the original filter tests -- now no longer needed. Much better!

@jreback
Copy link
Contributor

jreback commented Oct 14, 2013

looks good
in theory u might want to have a kw
sort=True that you would accept in order to turn this off
(only would matter with. large number of groups though)

@jreback
Copy link
Contributor

jreback commented Oct 14, 2013

can u add a release notes entry as well

@danielballan
Copy link
Contributor Author

I'll add the sort keyword. Does that qualify as an API change, or is it too minor?

@jreback
Copy link
Contributor

jreback commented Oct 14, 2013

pretty minor - just make sure doc string is good

@danielballan
Copy link
Contributor Author

Ready (assuming Travis agrees!)

@hayd
Copy link
Contributor

hayd commented Oct 14, 2013

This looks good, though I dislike the use of sort kwarg to mean "maintains original order"... not sure I have alternative.

@jtratner
Copy link
Contributor

Yes, I agree - that's very confusing. Is the alternative not sorting it?

@jreback
Copy link
Contributor

jreback commented Oct 14, 2013

sort is the kw in groupby already so it's consistent (or maybe it's sorted?)
I don't remember u may be able to just use that kw

@hayd
Copy link
Contributor

hayd commented Oct 14, 2013

@jreback that's a good point, I'd not seen or used that.

sort : boolean, default True
Sort group keys. Get better performance by turning this off

Although this is somewhat different, in that case it is actually sorting the keys. Here we are putting them back into their previous order of the index (and this may - or may not - be sorted)...

@danielballan
Copy link
Contributor Author

Exactly. IMO, there is a strong case for not even allowing the option to skip the sorting. For one thing, it's difficult to test, which seems like a red flag. (My original tests sorted the result before comparing to the expected one.)

@jreback
Copy link
Contributor

jreback commented Oct 14, 2013

this is the indexer it should always be sorted
just skip a keyword on this

it should always be in the same order as the groups which is why it needs sorting, because the groups are sorted (groupsort_indexer)

@danielballan
Copy link
Contributor Author

Removed sort, including docstring and release notes. Ready.

jreback added a commit that referenced this pull request Oct 15, 2013
BUG: Groupby filter maintains ordering, closes #4621
@jreback jreback merged commit 61d7e14 into pandas-dev:master Oct 15, 2013
@danielballan danielballan deleted the filter-maintains-ordering branch October 15, 2013 01:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants