ENH: add na_action to Categorical.map & CategoricalIndex.map #51645
Merged
mroeschke merged 16 commits into pandas-dev:main from topper-123:categorical_map_na_Action on Mar 30, 2023
Changes from 14 commits
Commits (16):
6e62b26 ENH: add na_action to Categorical.map (topper-123)
7672271 add GH numbers (topper-123)
1c5b392 pre-commit issues (topper-123)
5b07029 map Categorical with Series (topper-123)
85b4193 REF: simplify .map (topper-123)
6dfdf87 pass test_map (topper-123)
ed6b1d7 fix whatsnew (topper-123)
2018aaf cleanups (topper-123)
84beb35 pre-commit (topper-123)
01b199d deprecate Categorical.map(na_action=ignore) (topper-123)
7e470c9 fix docstrings (topper-123)
afb843b fix rebase (topper-123)
ef5ed65 simplity implementation (topper-123)
be2d451 fix warn (topper-123)
9d0c325 fix comments (topper-123)
8cae085 Merge branch 'master' into categorical_map_na_Action (topper-123)
@@ -1205,7 +1205,11 @@ def remove_unused_categories(self) -> Categorical:

     # ------------------------------------------------------------------

-    def map(self, mapper, na_action=None):
+    def map(
+        self,
+        mapper,
+        na_action: Literal["ignore"] | None | lib.NoDefault = lib.no_default,
+    ):
         """
         Map categories using an input mapping or function.

@@ -1222,6 +1226,14 @@ def map(self, mapper, na_action=None):
         ----------
         mapper : function, dict, or Series
             Mapping correspondence.
+        na_action : {None, 'ignore'}, default 'ignore'
+            If 'ignore', propagate NaN values, without passing them to the
+            mapping correspondence.
+
+            .. deprecated:: 2.1.0
+
+               The default value of 'ignore' has been deprecated and will be changed to
+               None in the future.

         Returns
         -------
@@ -1245,10 +1257,10 @@ def map(self, mapper, na_action=None):
         >>> cat
         ['a', 'b', 'c']
         Categories (3, object): ['a', 'b', 'c']
-        >>> cat.map(lambda x: x.upper())
+        >>> cat.map(lambda x: x.upper(), na_action=None)
         ['A', 'B', 'C']
         Categories (3, object): ['A', 'B', 'C']
-        >>> cat.map({'a': 'first', 'b': 'second', 'c': 'third'})
+        >>> cat.map({'a': 'first', 'b': 'second', 'c': 'third'}, na_action=None)
         ['first', 'second', 'third']
         Categories (3, object): ['first', 'second', 'third']

@@ -1259,35 +1271,50 @@ def map(self, mapper, na_action=None):
         >>> cat
         ['a', 'b', 'c']
         Categories (3, object): ['a' < 'b' < 'c']
-        >>> cat.map({'a': 3, 'b': 2, 'c': 1})
+        >>> cat.map({'a': 3, 'b': 2, 'c': 1}, na_action=None)
         [3, 2, 1]
         Categories (3, int64): [3 < 2 < 1]

         If the mapping is not one-to-one an :class:`~pandas.Index` is returned:

-        >>> cat.map({'a': 'first', 'b': 'second', 'c': 'first'})
+        >>> cat.map({'a': 'first', 'b': 'second', 'c': 'first'}, na_action=None)
         Index(['first', 'second', 'first'], dtype='object')

         If a `dict` is used, all unmapped categories are mapped to `NaN` and
         the result is an :class:`~pandas.Index`:

-        >>> cat.map({'a': 'first', 'b': 'second'})
+        >>> cat.map({'a': 'first', 'b': 'second'}, na_action=None)
         Index(['first', 'second', nan], dtype='object')
         """
-        if na_action is not None:
-            raise NotImplementedError
+        if na_action is lib.no_default:
+            warnings.warn(
+                "The default value of 'ignore' for the `na_action` parameter in "
+                "pandas.Categorical.map is deprecated and will be "
+                "changed to 'None' in a future version. Please set na_action to the "
+                "desired value to avoid seeing this warning",
+                FutureWarning,
+                stacklevel=find_stack_level(),
+            )
+            na_action = "ignore"

+        assert callable(mapper) or is_dict_like(mapper)
+
         new_categories = self.categories.map(mapper)
-        try:
-            return self.from_codes(
-                self._codes.copy(), categories=new_categories, ordered=self.ordered
-            )
-        except ValueError:
-            # NA values are represented in self._codes with -1
-            # np.take causes NA values to take final element in new_categories
-            if np.any(self._codes == -1):
-                new_categories = new_categories.insert(len(new_categories), np.nan)
-            return np.take(new_categories, self._codes)
+
+        has_nans = np.any(self._codes == -1)
+
+        na_val = np.nan
+        if na_action is None and has_nans:
+            na_val = mapper(np.nan) if callable(mapper) else mapper.get(np.nan, np.nan)
+
+        if new_categories.is_unique and not new_categories.hasnans and na_val is np.nan:
+            new_dtype = CategoricalDtype(new_categories, ordered=self.ordered)
+            return self.from_codes(self._codes.copy(), dtype=new_dtype)
+
+        if has_nans:
+            new_categories = new_categories.insert(len(new_categories), na_val)
+
+        return np.take(new_categories, self._codes)

     __eq__ = _cat_compare_op(operator.eq)
     __ne__ = _cat_compare_op(operator.ne)
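To make the fallback path in the last hunk easier to follow, here is a rough pure-Python sketch of the same idea: map the category labels once, append the NA value at the end, and rely on code `-1` indexing the final element. The function name `map_codes` and the use of plain lists instead of `Index` and `np.take` are assumptions made for illustration; the fast path that returns a `Categorical` is left out.

```python
import math


def map_codes(categories, codes, mapper, na_action="ignore"):
    """Hypothetical helper: rebuild mapped values from integer codes (-1 = missing)."""

    def apply(value):
        # Callables are called; dict-like mappers fall back to NaN when a key is absent.
        return mapper(value) if callable(mapper) else mapper.get(value, math.nan)

    new_categories = [apply(c) for c in categories]
    has_nans = any(code == -1 for code in codes)

    # With na_action="ignore" missing values stay NaN and never reach the mapper.
    na_val = math.nan
    if na_action is None and has_nans:
        na_val = apply(math.nan)

    # Appending na_val last means the code -1 picks it up via negative indexing,
    # which mirrors how np.take treats -1 in the diff.
    if has_nans:
        new_categories = new_categories + [na_val]

    return [new_categories[code] for code in codes]


ignored = map_codes(["a", "b"], [0, 1, -1, 0], str.upper)
mapped = map_codes(
    ["a", "b"],
    [0, 1, -1, 0],
    lambda x: "missing" if x != x else x.upper(),
    na_action=None,
)
```

In this sketch `ignored` keeps NaN in the third position, while `mapped` holds whatever the mapper returns for NaN.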
Is this supposed to refer to once the deprecation is enforced?
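
For context on the deprecation being discussed, the diff uses a sentinel default so that an omitted `na_action` can be told apart from an explicit `None`. A minimal stand-alone sketch of that pattern follows; `_NoDefault`, `no_default` and `resolve_na_action` are stand-in names invented here, not pandas' `lib.no_default` machinery, and `stacklevel=2` replaces `find_stack_level()`.

```python
import warnings


class _NoDefault:
    """Stand-in sentinel type (assumption; pandas uses lib.no_default)."""

    def __repr__(self):
        return "<no_default>"


no_default = _NoDefault()


def resolve_na_action(na_action=no_default):
    """Hypothetical helper: warn only when the caller did not pass na_action."""
    if na_action is no_default:
        warnings.warn(
            "The default value of 'ignore' for the `na_action` parameter is "
            "deprecated and will be changed to 'None' in a future version.",
            FutureWarning,
            stacklevel=2,
        )
        na_action = "ignore"
    return na_action


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    implicit = resolve_na_action()        # no argument: warns, falls back to "ignore"
    explicit = resolve_na_action(None)    # explicit None: no warning
```

Passing `None` or `"ignore"` explicitly is what silences the warning, which is why the docstring examples in the diff were updated to pass `na_action=None`.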