Allow for dict-like argument to Categorical.rename_categories #17586

Merged: 3 commits, Sep 21, 2017

Conversation

alanbato
Contributor

Categorical.rename_categories can now take a dict as new_categories and renames the categories according to that mapping.
It only changes the categories that appear as keys in the dict and leaves all others unchanged.
The dict can have fewer or more entries than there are categories, and its keys need not reference existing categories.
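
The behaviour described above can be modelled without pandas. What follows is a minimal sketch, assuming the categories are held in a plain list; `rename_with_mapping` is a hypothetical helper name used for illustration and is not part of the pandas API.

```python
def rename_with_mapping(categories, mapping):
    """Return new labels, replacing only those that appear as keys in `mapping`.

    Hypothetical stand-in for the dict branch of Categorical.rename_categories.
    """
    # Labels missing from the mapping fall through unchanged via the default.
    return [mapping.get(label, label) for label in categories]


old = ['a', 'b', 'c', 'd']

# Smaller dict: only 'a' and 'c' are renamed.
smaller = rename_with_mapping(old, {'a': 1, 'c': 3})
print(smaller)  # [1, 'b', 3, 'd']

# Larger dict: the keys 'e' and 'f' match no category and are ignored.
larger = rename_with_mapping(
    old, {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5, 'f': 6})
print(larger)  # [1, 2, 3, 4]

# No overlapping keys: the categories come back unchanged.
disjoint = rename_with_mapping(old, {'f': 1, 'g': 3})
print(disjoint)  # ['a', 'b', 'c', 'd']
```

The three calls mirror the cases exercised by the tests in this pull request.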

Have a good day/night! 🐼 🐍

Contributor

jreback left a comment

lgtm. some small comments. pls rebase / ping on green.

@@ -114,7 +114,7 @@ Other Enhancements
- :func:`pd.read_sas()` now recognizes much more of the most frequently used date (datetime) formats in SAS7BDAT files (:issue:`15871`).
- :func:`DataFrame.items` and :func:`Series.items` is now present in both Python 2 and 3 and is lazy in all cases (:issue:`13918`, :issue:`17213`)
- :func:`Styler.where` has been implemented. It is as a convenience for :func:`Styler.applymap` and enables simple DataFrame styling on the Jupyter notebook (:issue:`17474`).

- :func:`Categorical.rename_categories` now accepts a dict-like argument as `new_categories`, and only updates the categories found in that dict. (:issue:`17336`)
Contributor

can you update / add to the example in categorical.rst as well

@@ -824,7 +824,11 @@ def rename_categories(self, new_categories, inplace=False):
        """
        inplace = validate_bool_kwarg(inplace, 'inplace')
        cat = self if inplace else self.copy()
        cat.categories = new_categories
        if isinstance(new_categories, dict):
Contributor

use is_dict_like, no need to do this as an if

if is_dict_like(new_categories):
    new_categories = [new_categories.get(item, item).....]
cat.categories = new_categories
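
One reading of this suggestion, sketched in plain Python. The `looks_dict_like` helper is a simplified, hypothetical stand-in for pandas' `is_dict_like` (assumed here to be a duck-typing check rather than an `isinstance(..., dict)` test), and `rename` is an illustrative name, not the pandas method.

```python
from types import MappingProxyType


def looks_dict_like(obj):
    # Simplified assumption: anything exposing `keys` and `__getitem__` counts.
    return hasattr(obj, 'keys') and hasattr(obj, '__getitem__')


def rename(categories, new_categories):
    if looks_dict_like(new_categories):
        # Map each existing label; labels without an entry keep their name.
        new_categories = [new_categories.get(item, item)
                          for item in categories]
    # A single assignment point serves both the dict-like and list-like paths.
    return list(new_categories)


# A read-only mapping is not a dict instance, yet it is treated as dict-like.
proxy = MappingProxyType({'a': 'x'})
print(isinstance(proxy, dict))         # False
print(rename(['a', 'b'], proxy))       # ['x', 'b']
print(rename(['a', 'b'], ['p', 'q']))  # ['p', 'q']
```

Note that this sketch relies on the argument providing `.get`; a dict-like object without that method would need a different lookup.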

inplace=True)
assert res is None
tm.assert_index_equal(cat.categories, expected)
# Test for dicts of smaller length
Contributor

blank lines between sub-tests

@TomAugspurger
Contributor

Don't worry about the travis failures. They're unrelated and have been fixed. +1 once you change the if isinstance(...) to use is_dict_like.

@codecov

codecov bot commented Sep 19, 2017

Codecov Report

❗ No coverage uploaded for pull request base (master@21a3800).
The diff coverage is 100%.

@@            Coverage Diff            @@
##             master   #17586   +/-   ##
=========================================
  Coverage          ?    91.2%           
=========================================
  Files             ?      163           
  Lines             ?    49627           
  Branches          ?        0           
=========================================
  Hits              ?    45263           
  Misses            ?     4364           
  Partials          ?        0
Flag         Coverage Δ
#multiple    88.99% <100%> (?)
#single      40.19% <0%> (?)

Impacted Files                Coverage Δ
pandas/core/categorical.py    95.59% <100%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 21a3800...440105d.

@codecov

codecov bot commented Sep 19, 2017

Codecov Report

Merging #17586 into master will decrease coverage by 0.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #17586      +/-   ##
==========================================
- Coverage   91.19%   91.18%   -0.02%     
==========================================
  Files         163      163              
  Lines       49625    49627       +2     
==========================================
- Hits        45257    45250       -7     
- Misses       4368     4377       +9
Flag         Coverage Δ
#multiple    88.96% <100%> (ø) ⬆️
#single      40.19% <0%> (-0.07%) ⬇️

Impacted Files                Coverage Δ
pandas/core/categorical.py    95.59% <100%> (+0.01%) ⬆️
pandas/io/gbq.py              25% <0%> (-58.34%) ⬇️
pandas/core/frame.py          97.77% <0%> (-0.1%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 6930f27...8863408.

@@ -115,6 +115,7 @@ Other Enhancements
- :func:`DataFrame.items` and :func:`Series.items` is now present in both Python 2 and 3 and is lazy in all cases (:issue:`13918`, :issue:`17213`)
- :func:`Styler.where` has been implemented. It is as a convenience for :func:`Styler.applymap` and enables simple DataFrame styling on the Jupyter notebook (:issue:`17474`).
- :func:`MultiIndex.is_monotonic_decreasing` has been implemented. Previously returned ``False`` in all cases. (:issue:`16554`)
- :func:`Categorical.rename_categories` now accepts a dict-like argument as `new_categories`, and only updates the categories found in that dict. (:issue:`17336`)
Member

Nit: remove the comma before "and only updates"

@@ -824,7 +825,11 @@ def rename_categories(self, new_categories, inplace=False):
        """
        inplace = validate_bool_kwarg(inplace, 'inplace')
        cat = self if inplace else self.copy()
        cat.categories = new_categories
        if is_dict_like(new_categories):
Member

Nit: newline above this one.

res = cat.rename_categories({'a': 4, 'b': 3, 'c': 2, 'd': 1})
expected = Index([4, 3, 2, 1])
tm.assert_index_equal(res.categories, expected)
# Test for inplace
Member

Nit: newline above this one.

assert res is None

tm.assert_index_equal(cat.categories, expected)
# Test for dicts of smaller length
Member

Nit: newline above this one.

inplace=True)
assert res is None

tm.assert_index_equal(cat.categories, expected)
Member

Nit: remove newline above this one.

res = cat.rename_categories({'a': 1, 'c': 3})
expected = Index([1, 'b', 3, 'd'])
tm.assert_index_equal(res.categories, expected)
# Test for dicts with bigger length
Member

Nit: newline above this one.

'd': 4, 'e': 5, 'f': 6})
expected = Index([1, 2, 3, 4])
tm.assert_index_equal(res.categories, expected)
# Test for dicts with no items from old categories
Member

Nit: newline above this one.

# Test for dicts of smaller length
cat = pd.Categorical(['a', 'b', 'c', 'd'])
res = cat.rename_categories({'a': 1, 'c': 3})
expected = Index([1, 'b', 3, 'd'])
Member

Nit: newline above this one.

# Test for dicts with no items from old categories
cat = pd.Categorical(['a', 'b', 'c', 'd'])
res = cat.rename_categories({'f': 1, 'g': 3})
expected = Index(['a', 'b', 'c', 'd'])
Member

Nit: newline above this one.

@gfyoung
Member

gfyoung commented Sep 20, 2017

@alanbato : LGTM as well. Bunch of nit-picking to enhance readability is all I've got above.

@alanbato
Contributor Author

Done @gfyoung. Do we wait for green again?

@gfyoung
Member

gfyoung commented Sep 20, 2017

@alanbato : Yes, ping us when it's green.

@alanbato
Contributor Author

@TomAugspurger @gfyoung @jreback Ping! 🌵✔️

@gfyoung
Member

gfyoung commented Sep 20, 2017

Nice! I'll let @jreback look this over and merge just to double check.

Contributor

jreback left a comment

lgtm. merge away

        cat.categories = new_categories

        if is_dict_like(new_categories):
            cat.categories = [new_categories.get(item, item)
Contributor

actually make sure the doc string is updated here as well (maybe add a versionchanged tag)


  Parameters
  ----------
- new_categories : Index-like
+ new_categories : Index-like or Dict-like
     The renamed categories.
Contributor

dict-like (>=0.21.0)

@alanbato
Contributor Author

@jreback 🌵 ✔️

jreback added this to the 0.21.0 milestone on Sep 21, 2017
jreback merged commit b3087ef into pandas-dev:master on Sep 21, 2017
@jreback
Contributor

jreback commented Sep 21, 2017

thanks @alanbato nice patch!

Labels: Categorical, Enhancement
Development

Successfully merging this pull request may close these issues.

Allow for dict-like argument to Categorical.rename_categories
4 participants