DOC: Improve the docstrings of CategoricalIndex.map #20286
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 18 commits
4c6d29e
a2bf1c2
bf0b870
dba4d1f
d13f83c
9366e39
e42fd0c
8240278
9087677
da84d5f
848d960
2af44df
1ad38e9
1a8040d
4fdb6a8
986e1dd
0fc2c48
9e25133
1cd4c38
a76a4b5
ecbaca0
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1080,20 +1080,63 @@ def remove_unused_categories(self, inplace=False): | |
return cat | ||
|
||
def map(self, mapper): | ||
"""Apply mapper function to its categories (not codes). | ||
""" | ||
Map categories using input correspondence (dict, Series, or function). | ||
|
||
Maps the categories to new categories. If the mapping correspondence is | ||
a bijection (maps each original category to a different new category) | ||
Review comment: maybe we can use something like "one-to-one mapping" instead? |
||
the result is a :class:`~pandas.Categorical` which has the same order | ||
property as the original, otherwise a :class:`~pandas.Index` is | ||
returned. | ||
|
||
If a `dict` or :class:`~pandas.Series` is used any unmapped category is | ||
mapped to NaN. Note that if this happens an :class:`~pandas.Index` will | ||
||
be returned. | ||
|
||
Parameters | ||
---------- | ||
mapper : callable | ||
Function to be applied. When all categories are mapped | ||
to different categories, the result will be Categorical which has | ||
the same order property as the original. Otherwise, the result will | ||
be np.ndarray. | ||
mapper : function, dict, or Series | ||
Mapping correspondence. | ||
|
||
Returns | ||
------- | ||
applied : Categorical or Index. | ||
pandas.Categorical or pandas.Index | ||
Mapped categorical. | ||
|
||
See Also | ||
-------- | ||
CategoricalIndex.map : Apply a mapping correspondence on a | ||
:class:`~pandas.CategoricalIndex`. | ||
Index.map : Apply a mapping correspondence on an | ||
:class:`~pandas.Index`. | ||
Series.map : Apply a mapping correspondence on a | ||
:class:`~pandas.Series`. | ||
Series.apply : Apply more complex functions on a | ||
:class:`~pandas.Series`. | ||
|
||
Examples | ||
-------- | ||
>>> cat = pd.Categorical(['a', 'b', 'c']) | ||
>>> cat | ||
[a, b, c] | ||
Categories (3, object): [a, b, c] | ||
>>> cat.map(lambda x: x.upper()) | ||
[A, B, C] | ||
Categories (3, object): [A, B, C] | ||
>>> cat.map({'a': 'first', 'b': 'second', 'c': 'third'}) | ||
[first, second, third] | ||
Categories (3, object): [first, second, third] | ||
|
||
If the mapping is not bijective an :class:`~pandas.Index` is returned: | ||
|
||
>>> cat.map({'a': 'first', 'b': 'second', 'c': 'first'}) | ||
Index(['first', 'second', 'first'], dtype='object') | ||
|
||
If a `dict` is used, all unmapped categories are mapped to NaN and | ||
Review comment: Similar comment with 'NaN'. |
||
the result is an :class:`~pandas.Index`: | ||
|
||
>>> cat.map({'a': 'first', 'b': 'second'}) | ||
Index(['first', 'second', nan], dtype='object') | ||
""" | ||
new_categories = self.categories.map(mapper) | ||
try: | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3352,22 +3352,23 @@ def groupby(self, values): | |
return result | ||
|
||
def map(self, mapper, na_action=None): | ||
"""Map values of Series using input correspondence | ||
""" | ||
Map values using input correspondence (a dict, Series, or function). | ||
|
||
Parameters | ||
---------- | ||
mapper : function, dict, or Series | ||
Mapping correspondence. | ||
na_action : {None, 'ignore'} | ||
If 'ignore', propagate NA values, without passing them to the | ||
mapping function | ||
mapping correspondence. | ||
|
||
Returns | ||
------- | ||
applied : Union[Index, MultiIndex], inferred | ||
The output of the mapping function applied to the index. | ||
If the function returns a tuple with more than one element | ||
a MultiIndex will be returned. | ||
|
||
Review comment: It would be good to add a small example here.
Reply: This is outside the original scope of the contribution, see first PR comment. |
||
""" | ||
|
||
from .multi import MultiIndex | ||
|
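Illustrative note (not part of the PR): the Returns section of the `Index.map` docstring above says that a mapper returning tuples with more than one element produces a `MultiIndex`. A minimal sketch of that behaviour, assuming a recent pandas release; the sample values are made up for illustration and the printed repr may differ between versions.

```python
import pandas as pd

idx = pd.Index([1, 2, 3])

# A dict is accepted as the mapping correspondence, just like a function.
as_words = idx.map({1: "one", 2: "two", 3: "three"})

# A function returning tuples with more than one element gives a MultiIndex.
pairs = idx.map(lambda x: (x, x ** 2))

print(as_words)
print(type(pairs).__name__)
```

The second call is the case the docstring singles out: the result type is inferred from what the mapper returns, so the caller does not choose between `Index` and `MultiIndex` explicitly.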
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -660,20 +660,61 @@ def is_dtype_equal(self, other): | |
take_nd = take | ||
|
||
def map(self, mapper): | ||
"""Apply mapper function to its categories (not codes). | ||
""" | ||
Map values using input correspondence (a dict, Series, or function). | ||
|
||
Maps the values (their categories, not the codes) of the index to new | ||
categories. If the mapping correspondence is a bijection (maps each | ||
original category to a different new category) the result is a | ||
:class:`~pandas.CategoricalIndex` which has the same order property as | ||
the original, otherwise an :class:`~pandas.Index` is returned. | ||
|
||
If a `dict` or :class:`~pandas.Series` is used any unmapped category is | ||
mapped to NaN. Note that if this happens an :class:`~pandas.Index` will | ||
Review comment: 'NaN' |
||
be returned. | ||
|
||
Parameters | ||
---------- | ||
mapper : callable | ||
Function to be applied. When all categories are mapped | ||
to different categories, the result will be a CategoricalIndex | ||
which has the same order property as the original. Otherwise, | ||
the result will be a Index. | ||
mapper : function, dict, or Series | ||
Mapping correspondence. | ||
|
||
Returns | ||
------- | ||
applied : CategoricalIndex or Index | ||
pandas.CategoricalIndex or pandas.Index | ||
Mapped index. | ||
|
||
See Also | ||
Review comment: Since all of these methods are similar, why don't you add a See Also in all of the instances that refer to one another?
Reply: This is outside the original scope of the contribution, see first PR comment. |
||
-------- | ||
Index.map : Apply a mapping correspondence on an | ||
:class:`~pandas.Index`. | ||
Series.map : Apply a mapping correspondence on a | ||
:class:`~pandas.Series`. | ||
Series.apply : Apply more complex functions on a | ||
:class:`~pandas.Series`. | ||
|
||
Examples | ||
Review comment: Good to see Examples here - why not add to other
Reply: This is outside the original scope of the contribution, see first PR comment. |
||
-------- | ||
>>> idx = pd.CategoricalIndex(['a', 'b', 'c']) | ||
>>> idx | ||
CategoricalIndex(['a', 'b', 'c'], categories=['a', 'b', 'c'], | ||
ordered=False, dtype='category') | ||
>>> idx.map(lambda x: x.upper()) | ||
CategoricalIndex(['A', 'B', 'C'], categories=['A', 'B', 'C'], | ||
ordered=False, dtype='category') | ||
>>> idx.map({'a': 'first', 'b': 'second', 'c': 'third'}) | ||
CategoricalIndex(['first', 'second', 'third'], categories=['first', | ||
'second', 'third'], ordered=False, dtype='category') | ||
Review comment: @jorisvandenbossche @l736x the point I was trying to make was that, since we mention that the ordering property gets retained with a mapping, we should have an example for an ordered
Note that one-to-one mappings will retain the ordering of the CategoricalIndex
idx = pd.CategoricalIndex(['a', 'b', 'c'], ordered=True)
idx.map({'a': 3, 'b': 2, 'c': 1})
Just my $.02 though. @jorisvandenbossche I'm good to go whenever you want to merge.
Reply: It would indeed be good to have an example for this. But maybe just adding
Reply: Works for me. |
||
|
||
If the mapping is not bijective an :class:`~pandas.Index` is returned: | ||
|
||
>>> idx.map({'a': 'first', 'b': 'second', 'c': 'first'}) | ||
Index(['first', 'second', 'first'], dtype='object') | ||
|
||
If a `dict` is used, all unmapped categories are mapped to NaN and | ||
the result is an :class:`~pandas.Index`: | ||
|
||
>>> idx.map({'a': 'first', 'b': 'second'}) | ||
Index(['first', 'second', nan], dtype='object') | ||
""" | ||
return self._shallow_copy_with_infer(self.values.map(mapper)) | ||
|
||
|
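Illustrative note (not part of the PR): the review thread in this hunk asked for an example with an ordered index. The following is a minimal sketch of that suggestion, assuming a recent pandas release. It reuses the reviewer's snippet with the quoting fixed; the variable name `mapped` is introduced here for illustration.

```python
import pandas as pd

# Ordered index: the category order is a < b < c.
idx = pd.CategoricalIndex(["a", "b", "c"], ordered=True)

# A one-to-one mapping keeps the result categorical and keeps the
# "ordered" flag of the original.
mapped = idx.map({"a": 3, "b": 2, "c": 1})

print(mapped)
print(mapped.ordered)
print(list(mapped.categories))
```

Note that the new categories inherit the position of the categories they replace, so the category order of the result follows the original order (here 3, then 2, then 1) rather than the numeric order of the new labels.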
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2831,25 +2831,26 @@ def unstack(self, level=-1, fill_value=None): | |
|
||
def map(self, arg, na_action=None): | ||
""" | ||
Map values of Series using input correspondence (which can be | ||
a dict, Series, or function) | ||
Map values of Series using input correspondence (a dict, Series, or | ||
function). | ||
Parameters | ||
---------- | ||
arg : function, dict, or Series | ||
Mapping correspondence. | ||
Review comment: I think we should add here some details on how a dict and Series are handled.
Reply: This is outside the original scope of the contribution, see first PR comment. |
||
na_action : {None, 'ignore'} | ||
If 'ignore', propagate NA values, without passing them to the | ||
mapping function | ||
mapping correspondence. | ||
Returns | ||
------- | ||
y : Series | ||
same index as caller | ||
Same index as caller. | ||
Examples | ||
-------- | ||
Map inputs to outputs (both of type `Series`) | ||
Map inputs to outputs (both of type `Series`): | ||
>>> x = pd.Series([1,2,3], index=['one', 'two', 'three']) | ||
>>> x | ||
|
@@ -2900,9 +2901,9 @@ def map(self, arg, na_action=None): | |
See Also | ||
-------- | ||
Series.apply: For applying more complex functions on a Series | ||
DataFrame.apply: Apply a function row-/column-wise | ||
DataFrame.applymap: Apply a function elementwise on a whole DataFrame | ||
Series.apply : For applying more complex functions on a Series. | ||
DataFrame.apply : Apply a function row-/column-wise. | ||
DataFrame.applymap : Apply a function elementwise on a whole DataFrame. | ||
Review comment: Can you also add Series.replace?
Reply: This is outside the original scope of the contribution, see first PR comment. |
||
Notes | ||
----- | ||
|
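Illustrative note (not part of the PR): a reviewer above asked for more detail on how a dict and a Series are handled by `Series.map`. A minimal sketch of the behaviour, assuming a recent pandas release; the sample data and the names `from_dict`, `lookup`, `from_series`, and `formatted` are made up for illustration.

```python
import numpy as np
import pandas as pd

s = pd.Series(["cat", "dog", np.nan, "rabbit"])

# With a dict, values that are not keys of the dict become NaN.
from_dict = s.map({"cat": "kitten", "dog": "puppy"})

# A Series is used as a lookup table: its index plays the role of the keys.
lookup = pd.Series({"cat": "kitten", "dog": "puppy"})
from_series = s.map(lookup)

# With a function, na_action='ignore' propagates NaN without calling it.
formatted = s.map("I am a {}".format, na_action="ignore")

print(from_dict)
print(from_series)
print(formatted)
```

In all three cases the result keeps the index of the caller, which is what the Returns section describes.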
Review comment: This is simplified to the point that I think you can now just say "If the mapping correspondence is one-to-one the result is a ..." in the second sentence.
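Illustrative note (not part of the PR): to make the one-to-one wording concrete, here is a minimal sketch of the three cases the new docstrings describe, assuming a recent pandas release. The variable names are introduced here for illustration.

```python
import pandas as pd

idx = pd.CategoricalIndex(["a", "b", "c"])

# One-to-one: every category maps to a different new category.
one_to_one = idx.map({"a": "first", "b": "second", "c": "third"})

# Not one-to-one: two categories collapse onto the same value.
many_to_one = idx.map({"a": "first", "b": "second", "c": "first"})

# Incomplete dict: the unmapped category becomes NaN.
partial = idx.map({"a": "first", "b": "second"})

for result in (one_to_one, many_to_one, partial):
    print(type(result).__name__, list(result))
```

Only the first call stays categorical; the other two fall back to a plain `Index`, which matches the docstring examples in the diff.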