DOC: Added examples for union_categoricals #16397
Changes from 1 commit
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -242,6 +242,72 @@ def union_categoricals(to_union, sort_categories=False, ignore_order=False): | |
- sort_categories=True and Categoricals are ordered | ||
ValueError | ||
Empty list of categoricals passed | ||
|
||
Notes | ||
----- | ||
|
||
To learn more about categories, see 'link | ||
<http://pandas.pydata.org/pandas-docs/stable/categorical.html#unioning>__` | ||
|
||
Examples | ||
-------- | ||
|
||
If you want to combine categoricals that do not necessarily have | ||
the same categories, `union_categoricals` will combine a list-like | ||
of categoricals. The new categories will be the union of the | ||
categories being combined. | ||
|
||
>>> a = pd.Categorical(["b", "c"]) | ||
Review comment: can you add a
||
>>> b = pd.Categorical(["a", "b"]) | ||
>>> union_categoricals([a, b]) | ||
[b, c, a, b] | ||
Categories (3, object): [b, c, a] | ||
|
||
By default, the resulting categories will be ordered as they appear | ||
in the data. If you want the categories to be lexsorted, use | ||
Review comment: "resulting categories will be ordered as they appear in the data" is not fully correct (or is at least open to misinterpretation). E.g.
|
||
the `sort_categories=True` argument. | ||
|
||
>>> union_categoricals([a, b], sort_categories=True) | ||
[b, c, a, b] | ||
Categories (3, object): [a, b, c] | ||
|
||
`union_categoricals` also works with the case of combining two | ||
Review comment: @TomAugspurger @jorisvandenbossche do we quote like this in a doc-string?
Review comment: I think this is OK (I don't think we consistently follow strict guidelines, but the numpydoc docstring explanation says to use single backticks to refer to keyword arguments or functions)
||
categoricals with the same categories and order information (i.e. the | ||
case where you could also use `append`). | ||
|
||
>>> a = pd.Categorical(["a", "b"], ordered=True) | ||
>>> b = pd.Categorical(["a", "b", "a"], ordered=True) | ||
>>> union_categoricals([a, b]) | ||
[a, b, a, b, a] | ||
Categories (2, object): [a < b] | ||
|
||
Raises `TypeError` because the categories are ordered and not identical. | ||
|
||
>>> a = pd.Categorical(["a", "b"], ordered=True) | ||
>>> b = pd.Categorical(["a", "b", "c"], ordered=True) | ||
>>> union_categoricals([a, b]) | ||
TypeError: to union ordered Categoricals, all categories must be the same | ||
|
||
New in version 0.20.0 | ||
|
||
Ordered categoricals with different categories or orderings can be | ||
combined by using the `ignore_order=True` argument. | ||
|
||
>>> a = pd.Categorical(["a", "b", "c"], ordered=True) | ||
>>> b = pd.Categorical(["c", "b", "a"], ordered=True) | ||
>>> union_categoricals([a, b], ignore_order=True) | ||
[a, b, c, c, b, a] | ||
Categories (3, object): [a, b, c] | ||
|
||
`union_categoricals` also works with a `CategoricalIndex`, or `Series` | ||
containing categorical data, but note that the resulting array will | ||
always be a plain `Categorical`. | ||
|
||
>>> a = pd.Series(["b", "c"], dtype='category') | ||
>>> b = pd.Series(["a", "b"], dtype='category') | ||
>>> union_categoricals([a, b]) | ||
[b, c, a, b] | ||
Categories (3, object): [b, c, a] | ||
""" | ||
from pandas import Index, Categorical, CategoricalIndex, Series | ||
|
||
|
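The unordered examples in the docstring above can be checked with a short script. This is a minimal sketch, assuming a pandas version (0.20 or later) where `union_categoricals` is importable from `pandas.api.types`. The variable names `sorted_union`, `c`, and `from_data_order` are illustrative and do not appear in the PR. The last case is one possible illustration of the ordering concern raised in review, not necessarily the example the reviewer had in mind: the result follows the order of each input's `categories`, which pandas sorts when it infers them from the values.

```python
import pandas as pd
from pandas.api.types import union_categoricals

a = pd.Categorical(["b", "c"])
b = pd.Categorical(["a", "b"])

# Default: categories of the first input, then unseen ones from later inputs.
combined = union_categoricals([a, b])
print(list(combined))             # ['b', 'c', 'a', 'b']
print(list(combined.categories))  # ['b', 'c', 'a']

# sort_categories=True lexsorts the categories; the values are unchanged.
sorted_union = union_categoricals([a, b], sort_categories=True)
print(list(sorted_union.categories))  # ['a', 'b', 'c']

# The values of `c` start with "c", but its inferred categories are
# sorted to ['b', 'c'], so the union still reports ['b', 'c', 'a'].
c = pd.Categorical(["c", "b"])
from_data_order = union_categoricals([c, b])
print(list(c.categories))                # ['b', 'c']
print(list(from_data_order.categories))  # ['b', 'c', 'a']
```

Passing explicit `categories=` to `pd.Categorical` would be one way to control the resulting order without sorting.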
Review comment: see here ..... and the `__` (at the end) should be outside the single quote
Review comment: and the starting quote should also be a backtick (`), not a single quote (like you did for the ending quote)
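For the ordered cases described in the docstring under review, the following sketch may help when verifying the examples. It assumes pandas 0.20 or later with `union_categoricals` importable from `pandas.api.types`. The names `same`, `raised`, `x`, `y`, `merged`, and `from_series` are illustrative. Unlike the docstring example, the categories of `x` and `y` are passed explicitly here so that the two orderings really differ.

```python
import pandas as pd
from pandas.api.types import union_categoricals

# Same categories and same order: the union succeeds and stays ordered.
a = pd.Categorical(["a", "b"], ordered=True)
b = pd.Categorical(["a", "b", "a"], ordered=True)
same = union_categoricals([a, b])
print(list(same), same.ordered)  # ['a', 'b', 'a', 'b', 'a'] True

# Ordered inputs with different categories raise TypeError by default.
wider = pd.Categorical(["a", "b", "c"], ordered=True)
try:
    union_categoricals([a, wider])
    raised = False
except TypeError:
    raised = True
print(raised)  # True

# ignore_order=True combines ordered inputs whose orderings differ;
# the result is an unordered Categorical.
x = pd.Categorical(["a", "b", "c"], categories=["a", "b", "c"], ordered=True)
y = pd.Categorical(["c", "b", "a"], categories=["c", "b", "a"], ordered=True)
merged = union_categoricals([x, y], ignore_order=True)
print(list(merged), merged.ordered)  # ['a', 'b', 'c', 'c', 'b', 'a'] False

# Series of category dtype are accepted; the result is a plain Categorical.
s1 = pd.Series(["b", "c"], dtype="category")
s2 = pd.Series(["a", "b"], dtype="category")
from_series = union_categoricals([s1, s2])
print(type(from_series).__name__)  # Categorical
```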