Skip to content

ENH: support kind and na_position kwargs in Series.sort_index #14445

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Merged
Show file tree
Hide file tree
Changes from 5 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions doc/source/whatsnew/v0.20.0.txt
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,7 @@ New features

Other enhancements
^^^^^^^^^^^^^^^^^^
- ``Series.sort_index`` accepts parameters ``kind`` and ``na_position`` (:issue:`13589`, :issue:`14444`)



Expand Down
3 changes: 2 additions & 1 deletion pandas/core/generic.py
Original file line number Diff line number Diff line change
Expand Up @@ -2029,7 +2029,8 @@ def sort_values(self, by, axis=0, ascending=True, inplace=False,
DataFrames, this option is only applied when sorting on a single
column or label.
na_position : {'first', 'last'}, default 'last'
`first` puts NaNs at the beginning, `last` puts NaNs at the end
`first` puts NaNs at the beginning, `last` puts NaNs at the end.
Not implemented for MultiIndex.
sort_remaining : bool, default True
if true and sorting by level and index is multilevel, sort by other
levels too (in order) after sorting by specified level
Expand Down
16 changes: 10 additions & 6 deletions pandas/core/series.py
Original file line number Diff line number Diff line change
Expand Up @@ -1773,7 +1773,7 @@ def _try_kind_sort(arr):

@Appender(generic._shared_docs['sort_index'] % _shared_doc_kwargs)
def sort_index(self, axis=0, level=None, ascending=True, inplace=False,
sort_remaining=True):
kind='quicksort', na_position='last', sort_remaining=True):

axis = self._get_axis_number(axis)
index = self.index
Expand All @@ -1782,12 +1782,16 @@ def sort_index(self, axis=0, level=None, ascending=True, inplace=False,
sort_remaining=sort_remaining)
elif isinstance(index, MultiIndex):
from pandas.core.groupby import _lexsort_indexer
indexer = _lexsort_indexer(index.labels, orders=ascending)
indexer = _ensure_platform_int(indexer)
new_index = index.take(indexer)
indexer = _lexsort_indexer(index.labels, orders=ascending,
na_position=na_position)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you leave this line out? (if it's not supported, not really needed to pass it)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear, leave out the na_position argument from _lexsort_indexer?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes (as it has no effect I understood?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is true. Unfortunately, it has no effect right now. I pushed that change now.


else:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you put:

indexer = _ensure_platform_int(indexer)
new_index = index.take(indexer)

outside the if-else block

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes; moved.

new_index, indexer = index.sort_values(return_indexer=True,
ascending=ascending)
from pandas.core.groupby import _nargsort
indexer = _nargsort(index, kind=kind, ascending=ascending,
na_position=na_position)

indexer = _ensure_platform_int(indexer)
new_index = index.take(indexer)

new_values = self._values.take(indexer)
result = self._constructor(new_values, index=new_index)
Expand Down
25 changes: 25 additions & 0 deletions pandas/tests/series/test_sorting.py
Original file line number Diff line number Diff line change
Expand Up @@ -144,3 +144,28 @@ def test_sort_index_multiindex(self):
# rows share same level='A': sort has no effect without remaining lvls
res = s.sort_index(level='A', sort_remaining=False)
assert_series_equal(s, res)

def test_sort_index_kind(self):
# GH #14444 & #13589: Add support for sort algo choosing
series = Series(index=[3, 2, 1, 4, 3])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a comment with the github reference

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

expected_series = Series(index=[1, 2, 3, 3, 4])

index_sorted_series = series.sort_index(kind='mergesort')
assert_series_equal(expected_series, index_sorted_series)

index_sorted_series = series.sort_index(kind='quicksort')
assert_series_equal(expected_series, index_sorted_series)

index_sorted_series = series.sort_index(kind='heapsort')
assert_series_equal(expected_series, index_sorted_series)

def test_sort_index_na_position(self):
series = Series(index=[3, 2, 1, 4, 3, np.nan])

expected_series_first = Series(index=[np.nan, 1, 2, 3, 3, 4])
index_sorted_series = series.sort_index(na_position='first')
assert_series_equal(expected_series_first, index_sorted_series)

expected_series_last = Series(index=[1, 2, 3, 3, 4, np.nan])
index_sorted_series = series.sort_index(na_position='last')
assert_series_equal(expected_series_last, index_sorted_series)