DOC: add nlargest/nsmallest to API.rst #10145

RomanPekar · 2015-05-15T10:34:53Z

At the moment there's no easy way to partially sort a DataFrame to get the top N rows. This could be a useful feature.

There's partial sorting available for numpy within the bottleneck library.

Here's a link to an SO question about numpy partial sort: http://stackoverflow.com/questions/10337533/a-fast-way-to-find-the-largest-n-elements-in-an-numpy-array
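For reference, a minimal sketch of what a partial sort looks like in plain numpy, assuming `np.argpartition` is available (the helper name `top_n` is hypothetical and is not part of numpy or bottleneck):

```python
import numpy as np


def top_n(values, n):
    """Return the n largest entries of a 1-D array, largest first."""
    values = np.asarray(values)
    if n <= 0:
        return values[:0]
    if n >= values.size:
        # Nothing to gain from partitioning; fall back to a full sort.
        return np.sort(values)[::-1]
    # argpartition moves the n largest entries into the last n slots,
    # in no particular order, without sorting the rest of the array.
    idx = np.argpartition(values, -n)[-n:]
    # Only those n entries are then sorted.
    return np.sort(values[idx])[::-1]


arr = np.array([7, 1, 9, 3, 8, 2])
print(top_n(arr, 3))
```

The point is that only the n selected values get fully ordered, which is what makes this cheaper than sorting everything when n is small.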

jreback · 2015-05-15T11:57:46Z

http://pandas.pydata.org/pandas-docs/stable/basics.html?highlight=nlargest#smallest-largest-values

(this is even faster than partsort as it's grabbing the top values)

Though these are not mentioned in API.rst. Want to do a PR to add them?

RomanPekar · 2015-05-15T12:08:44Z

OK, but this one is for Series. My issue was about getting the top N rows from a DataFrame ordered by the columns I want, similar to SQL `select top (@N) * from <Table> order by <col1> asc, <col2> desc`, without actually sorting the whole dataset.

jreback · 2015-05-15T12:25:47Z

df.apply(Series.nlargest)
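A small sketch of what this does, using a made-up frame (the data and the choice of `n=2` are assumptions for illustration). Each column is ranked on its own, so the result is not a set of whole rows:

```python
import pandas as pd

# Series.nlargest returns the n largest values, largest first,
# keeping their original index labels.
s = pd.Series([3, 1, 4, 1, 5], name="x")
top2 = s.nlargest(2)
print(top2)

# Applied column by column, every column gets its own top-n.
# Labels that are in one column's top-n but not another's show up as NaN.
df = pd.DataFrame({"A": [1, 3, 2], "B": [9, 7, 8]})
per_column = df.apply(lambda col: col.nlargest(2))
print(per_column)
```

Because the columns are handled independently, this answers "what are the largest values in each column", not "which rows come first when ordering by several columns".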

RomanPekar · 2015-05-15T12:35:26Z

Thanks again, but it still won't work if I want to sort by several columns. As an example, suppose I have this dataset:

And I want to take the top 4 records ordered by A desc, B desc (rows 5, 4, 1, 2)

jreback · 2015-05-15T12:58:06Z

In [42]: df.sort(['A','B'],ascending=[0,0])
Out[42]: 
   A  B
4  3  2
5  3  1
1  2  5
2  2  4
3  2  3
0  1  5

I don't think there is an easy way to do this without a full sort. partsort (and most other algos) operate on a 1-d array.
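For readers on newer pandas, a hedged sketch of the two options. `df.sort` as used above was later replaced by `DataFrame.sort_values`, and later releases also added `DataFrame.nlargest`, which accepts a list of columns. The frame below is rebuilt from the output printed above; neither call was available to this thread as written.

```python
import pandas as pd

# Frame reconstructed from the sorted output shown above.
df = pd.DataFrame({"A": [1, 2, 2, 2, 3, 3], "B": [5, 5, 4, 3, 2, 1]})

# Option 1: full sort on both columns (descending), then keep four rows.
top4_sorted = df.sort_values(["A", "B"], ascending=[False, False]).head(4)

# Option 2: DataFrame.nlargest orders by the first column and uses the
# later columns only to break ties. All columns are taken descending.
top4_nlargest = df.nlargest(4, ["A", "B"])

print(top4_sorted)
print(top4_nlargest)
```

Note that `nlargest` covers only the all-descending case (and `nsmallest` the all-ascending one). A mixed ordering such as `col1 asc, col2 desc` still needs `sort_values` followed by `head`.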


jreback added the Docs label May 15, 2015

jreback added this to the 0.17.0 milestone May 15, 2015

jreback changed the title ~~ENH: Partial sorting on the DataFrame, could be useful to get top N rows~~ DOC: add nlargest/nsmallest to API.rst May 15, 2015

jreback added the Difficulty Novice label May 15, 2015

robintw added a commit to robintw/pandas that referenced this issue May 25, 2015

DOC: Added nlargest/nsmallest to API docs

df2cffc

Fixes pandas-dev#10145

robintw mentioned this issue May 25, 2015

DOC: Added nlargest/nsmallest to API docs #10206

Merged

jreback closed this as completed in #10206 May 26, 2015

jorisvandenbossche modified the milestones: 0.17.0, 0.16.2 Jun 2, 2015
