API: change Index set ops sort=True -> sort=None #25063
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -15,10 +15,34 @@ Whats New in 0.24.1 (February XX, 2019) | |
These are the changes in pandas 0.24.1. See :ref:`release` for a full changelog | ||
including other versions of pandas. | ||
|
||
.. _whatsnew_0241.api: | ||
|
||
API Changes | ||
~~~~~~~~~~~ | ||
|
||
Changing the ``sort`` parameter for :class:`Index` set operations | ||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
||
The default ``sort`` value for :meth:`Index.union` has changed from ``True`` to ``None`` (:issue:`24959`). | ||
The default *behavior*, however, remains the same: the result is sorted, unless | ||
|
||
1. ``self`` and ``other`` are identical | ||
2. ``self`` or ``other`` is empty | ||
3. ``self`` or ``other`` contains values that cannot be compared (a ``RuntimeWarning`` is issued). | ||
|
||
This change will allow ``sort=True`` to mean "always sort" in a future release. | ||
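A minimal sketch of the `sort=None` rules listed above for `Index.union`, written with plain Python lists rather than pandas. `union_model` is a hypothetical name used only for illustration; it models the documented behaviour and is not the pandas implementation.

```python
import warnings


def union_model(left, right, sort=None):
    """Toy model of the documented ``sort=None`` rules (hypothetical helper)."""
    left, right = list(left), list(right)
    # Exceptions 1 and 2: identical inputs, or an empty input, are not sorted.
    if left == right or not right:
        return left
    if not left:
        return right
    # Order-preserving union: values of left first, then unseen values of right.
    result = list(dict.fromkeys(left + right))
    if sort is None:
        try:
            result = sorted(result)
        except TypeError:
            # Exception 3: incomparable values. Warn and keep the unsorted order.
            warnings.warn("values could not be compared; result is not sorted",
                          RuntimeWarning)
    return result


print(union_model([3, 1], [2, 1]))              # sorted by default
print(union_model([3, 1], [2, 1], sort=False))  # order of appearance
print(union_model([3, 1], [3, 1]))              # identical inputs, left as-is
```

The model assumes hashable values. Passing `sort=False` skips the sorting attempt entirely, so no warning can be issued in that case.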
|
||
The same change applies to :meth:`Index.difference` and :meth:`Index.symmetric_difference`, which | ||
do not sort the result when the values cannot be compared. | ||
|
||
|
||
For :meth:`Index.intersection` the option of ``sort=True`` is also renamed | ||
to ``sort=None`` (but for :meth:`Index.intersection` it is not the default), as | ||
There's a lot in this sentence. Oh it also isn't quite right... So maybe something like
|
||
the result is not sorted when ``self`` and ``other`` were identical. | ||
|
||
|
||
.. _whatsnew_0241.regressions: | ||
|
||
Fixed Regressions | ||
^^^^^^^^^^^^^^^^^ | ||
~~~~~~~~~~~~~~~~~ | ||
|
||
- Bug in :meth:`DataFrame.itertuples` with ``records`` orient raising an ``AttributeError`` when the ``DataFrame`` contained more than 255 columns (:issue:`24939`) | ||
- Bug in :meth:`DataFrame.itertuples` orient converting integer column names to strings prepended with an underscore (:issue:`24940`) | ||
|
@@ -28,7 +52,7 @@ Fixed Regressions | |
.. _whatsnew_0241.enhancements: | ||
|
||
Enhancements | ||
^^^^^^^^^^^^ | ||
~~~~~~~~~~~~ | ||
|
||
|
||
.. _whatsnew_0241.bug_fixes: | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2245,18 +2245,37 @@ def _get_reconciled_name_object(self, other): | |
return self._shallow_copy(name=name) | ||
return self | ||
|
||
def union(self, other, sort=True): | ||
def _validate_sort_keyword(self, sort): | ||
if sort not in [None, False]: | ||
raise ValueError("The 'sort' keyword only takes the values of " | ||
"None or False; {0} was passed.".format(sort)) | ||
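A note on the membership test in `_validate_sort_keyword`: `in` on a list compares elements with `==`, so a value such as `0` passes because `0 == False`. The sketch below lifts the check into a standalone function and contrasts it with an identity-based variant. `validate_sort_keyword_strict` and `accepts` are hypothetical names for illustration and are not part of this PR.

```python
def validate_sort_keyword(sort):
    # Same membership test as in the diff, lifted out of the class.
    if sort not in [None, False]:
        raise ValueError("The 'sort' keyword only takes the values of "
                         "None or False; {0} was passed.".format(sort))


def validate_sort_keyword_strict(sort):
    # Hypothetical variant (not in the PR): compares by identity, so values
    # that merely equal False, such as 0, are rejected.
    if sort is not None and sort is not False:
        raise ValueError("The 'sort' keyword only takes the values of "
                         "None or False; {0} was passed.".format(sort))


def accepts(validator, value):
    # Helper for the demonstration: True if the validator does not raise.
    try:
        validator(value)
    except ValueError:
        return False
    return True


for value in (None, False, True, 0):
    print(repr(value),
          accepts(validate_sort_keyword, value),
          accepts(validate_sort_keyword_strict, value))
```

Whether accepting `0` matters in practice is a judgment call; the later `if sort is None:` branches treat it like `False`.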
|
||
def union(self, other, sort=None): | ||
""" | ||
Form the union of two Index objects. | ||
|
||
Parameters | ||
---------- | ||
other : Index or array-like | ||
sort : bool, default True | ||
Sort the resulting index if possible | ||
sort : bool or None, default None | ||
Whether to sort the resulting Index. | ||
|
||
* None : Sort the result, except when | ||
|
||
1. `self` and `other` are equal. | ||
2. `self` or `other` has length 0. | ||
3. Some values in `self` or `other` cannot be compared. | ||
A RuntimeWarning is issued in this case. | ||
|
||
* False : do not sort the result. | ||
|
||
.. versionadded:: 0.24.0 | ||
|
||
.. versionchanged:: 0.24.1 | ||
|
||
Changed the default value from ``True`` to ``None`` | ||
(without change in behaviour). | ||
|
||
Returns | ||
------- | ||
union : Index | ||
|
@@ -2269,6 +2288,7 @@ def union(self, other, sort=True): | |
>>> idx1.union(idx2) | ||
Int64Index([1, 2, 3, 4, 5, 6], dtype='int64') | ||
""" | ||
self._validate_sort_keyword(sort) | ||
self._assert_can_do_setop(other) | ||
other = ensure_index(other) | ||
|
||
|
@@ -2319,7 +2339,7 @@ def union(self, other, sort=True): | |
else: | ||
result = lvals | ||
|
||
if sort: | ||
if sort is None: | ||
try: | ||
result = sorting.safe_sort(result) | ||
except TypeError as e: | ||
|
@@ -2342,14 +2362,19 @@ def intersection(self, other, sort=False): | |
Parameters | ||
---------- | ||
other : Index or array-like | ||
sort : bool, default False | ||
Sort the resulting index if possible | ||
sort : False or None, default False | ||
Whether to sort the resulting index. | ||
|
||
* False : do not sort the result. | ||
* None : sort the result, except when `self` and `other` are equal | ||
or when the values cannot be compared. | ||
"...are equal, either one is empty, or when ..." |
||
|
||
.. versionadded:: 0.24.0 | ||
|
||
.. versionchanged:: 0.24.1 | ||
|
||
Changed the default from ``True`` to ``False``. | ||
Changed the default from ``True`` to ``False``, to match | ||
the behaviour of 0.23.4 and earlier. | ||
|
||
Returns | ||
------- | ||
|
@@ -2363,6 +2388,7 @@ def intersection(self, other, sort=False): | |
>>> idx1.intersection(idx2) | ||
Int64Index([3, 4], dtype='int64') | ||
""" | ||
self._validate_sort_keyword(sort) | ||
self._assert_can_do_setop(other) | ||
other = ensure_index(other) | ||
|
||
|
@@ -2402,7 +2428,7 @@ def intersection(self, other, sort=False): | |
|
||
taken = other.take(indexer) | ||
|
||
if sort: | ||
if sort is None: | ||
taken = sorting.safe_sort(taken.values) | ||
if self.name != other.name: | ||
name = None | ||
|
@@ -2415,7 +2441,7 @@ def intersection(self, other, sort=False): | |
|
||
return taken | ||
|
||
def difference(self, other, sort=True): | ||
def difference(self, other, sort=None): | ||
""" | ||
Return a new Index with elements from the index that are not in | ||
`other`. | ||
|
@@ -2425,11 +2451,22 @@ def difference(self, other, sort=True): | |
Parameters | ||
---------- | ||
other : Index or array-like | ||
sort : bool, default True | ||
Sort the resulting index if possible | ||
sort : False or None, default None | ||
Whether to sort the resulting index. By default, pandas | ||
attempts to sort the values and catches any TypeError | ||
raised by incomparable elements. | ||
|
||
* None : Attempt to sort the result, but catch any TypeErrors | ||
from comparing incomparable elements. | ||
* False : Do not sort the result. | ||
|
||
.. versionadded:: 0.24.0 | ||
|
||
.. versionchanged:: 0.24.1 | ||
|
||
Changed the default value from ``True`` to ``None`` | ||
(without change in behaviour). | ||
|
||
Returns | ||
------- | ||
difference : Index | ||
|
@@ -2444,6 +2481,7 @@ def difference(self, other, sort=True): | |
>>> idx1.difference(idx2, sort=False) | ||
Int64Index([2, 1], dtype='int64') | ||
""" | ||
self._validate_sort_keyword(sort) | ||
self._assert_can_do_setop(other) | ||
|
||
if self.equals(other): | ||
|
@@ -2460,27 +2498,38 @@ def difference(self, other, sort=True): | |
label_diff = np.setdiff1d(np.arange(this.size), indexer, | ||
assume_unique=True) | ||
the_diff = this.values.take(label_diff) | ||
if sort: | ||
if sort is None: | ||
try: | ||
the_diff = sorting.safe_sort(the_diff) | ||
except TypeError: | ||
pass | ||
|
||
return this._shallow_copy(the_diff, name=result_name, freq=None) | ||
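The "attempt to sort, fall back on `TypeError`" pattern used in `difference` can be sketched with plain lists. `difference_model` is a hypothetical stand-in that assumes hashable values; it is not the pandas code, which works on index values and uses `sorting.safe_sort`.

```python
def difference_model(left, right, sort=None):
    """Toy model of ``Index.difference`` with ``sort=None`` or ``sort=False``."""
    exclude = set(right)
    # Keep the elements of left that are absent from right, in original order.
    diff = [x for x in left if x not in exclude]
    if sort is None:
        try:
            diff = sorted(diff)
        except TypeError:
            # Incomparable elements: silently keep the unsorted result.
            pass
    return diff


print(difference_model([2, 1, 3, 4], [3, 4, 5, 6]))              # [1, 2]
print(difference_model([2, 1, 3, 4], [3, 4, 5, 6], sort=False))  # [2, 1]
print(difference_model([1, "a", 2], [2]))                        # [1, 'a']
```

Unlike the union case described in the whatsnew entry, the fallback here is silent: the `except TypeError: pass` branch issues no warning.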
|
||
def symmetric_difference(self, other, result_name=None, sort=True): | ||
def symmetric_difference(self, other, result_name=None, sort=None): | ||
""" | ||
Compute the symmetric difference of two Index objects. | ||
|
||
Parameters | ||
---------- | ||
other : Index or array-like | ||
result_name : str | ||
sort : bool, default True | ||
Sort the resulting index if possible | ||
sort : False or None, default None | ||
Whether to sort the resulting index. By default, pandas | ||
attempts to sort the values and catches any TypeError | ||
raised by incomparable elements. | ||
|
||
* None : Attempt to sort the result, but catch any TypeErrors | ||
from comparing incomparable elements. | ||
* False : Do not sort the result. | ||
|
||
.. versionadded:: 0.24.0 | ||
|
||
.. versionchanged:: 0.24.1 | ||
|
||
Changed the default value from ``True`` to ``None`` | ||
(without change in behaviour). | ||
|
||
Returns | ||
------- | ||
symmetric_difference : Index | ||
|
@@ -2504,6 +2553,7 @@ def symmetric_difference(self, other, result_name=None, sort=True): | |
>>> idx1 ^ idx2 | ||
Int64Index([1, 5], dtype='int64') | ||
""" | ||
self._validate_sort_keyword(sort) | ||
self._assert_can_do_setop(other) | ||
other, result_name_update = self._convert_can_do_setop(other) | ||
if result_name is None: | ||
|
@@ -2524,7 +2574,7 @@ def symmetric_difference(self, other, result_name=None, sort=True): | |
right_diff = other.values.take(right_indexer) | ||
|
||
the_diff = _concat._concat_compat([left_diff, right_diff]) | ||
if sort: | ||
if sort is None: | ||
try: | ||
the_diff = sorting.safe_sort(the_diff) | ||
except TypeError: | ||
|