--- a/doc/source/whatsnew/v0.16.1.txt
+++ b/doc/source/whatsnew/v0.16.1.txt
@@ -7,6 +7,10 @@ This is a minor bug-fix release from 0.16.0 and includes a large number of
 bug fixes along with several new features, enhancements, and performance improvements.
 We recommend that all users upgrade to this version.
 
+Highlights include:
+
+- Support for a ``CategoricalIndex``, a category-based index, see :ref:`here <whatsnew_0161.enhancements.categoricalindex>`
+
 .. contents:: What's new in v0.16.1
     :local:
     :backlinks: none
@@ -31,6 +35,7 @@ Enhancements
 will return a `np.array` instead of a boolean `Index` (:issue:`8875`). This enables the following expression
 to work naturally:
 
+
 .. ipython:: python
 
    idx = Index(['a1', 'a2', 'b1', 'b2'])
@@ -40,6 +45,7 @@ Enhancements
    s[s.index.str.startswith('a')]
 
 - ``DataFrame.mask()`` and ``Series.mask()`` now support the same keywords as ``where`` (:issue:`8801`)
+
 - ``drop`` function can now accept ``errors`` keyword to suppress the ``ValueError`` raised when any of the labels does not exist in the target data. (:issue:`6736`)
 
 .. ipython:: python
@@ -50,6 +56,75 @@ Enhancements
 - Allow conversion of values with dtype ``datetime64`` or ``timedelta64`` to strings using ``astype(str)`` (:issue:`9757`)
 - ``get_dummies`` function now accepts ``sparse`` keyword. If set to ``True``, the returned ``DataFrame`` is sparse, e.g. ``SparseDataFrame``. (:issue:`8823`)
 
+
+.. _whatsnew_0161.enhancements.categoricalindex:
+
+CategoricalIndex
+^^^^^^^^^^^^^^^^
+
+We introduce a ``CategoricalIndex``, a new type of index object that is useful for supporting
+indexing with duplicates. This is a container around a ``Categorical`` (introduced in v0.15.0)
+and allows efficient indexing and storage of an index with a large number of duplicated elements. Prior to 0.16.1,
+setting the index of a ``DataFrame/Series`` with a ``category`` dtype would convert this to a regular object-based ``Index``.
+
+.. ipython :: python
+
+   df = DataFrame({'A' : np.arange(6),
+                   'B' : Series(list('aabbca')).astype('category',
+                                                        categories=list('cab'))
+                  })
+   df
+   df.dtypes
+   df.B.cat.categories
+
+Setting the index will create a ``CategoricalIndex``:
+
+.. ipython :: python
+
+   df2 = df.set_index('B')
+   df2.index
+
+Indexing with ``__getitem__/.iloc/.loc/.ix`` works similarly to an ``Index`` with duplicates.
+The indexers MUST be in the categories or the operation will raise.
+
+.. ipython :: python
+
+   df2.loc['a']
+
+The result preserves the ``CategoricalIndex``:
+
+.. ipython :: python
+
+   df2.loc['a'].index
+
+Sorting will order by the order of the categories:
+
+.. ipython :: python
+
+   df2.sort_index()
+
+Groupby operations on the index will preserve the index nature as well:
+
+.. ipython :: python
+
+   df2.groupby(level=0).sum()
+   df2.groupby(level=0).sum().index
+
+Reindexing operations will return a resulting index based on the type of the passed
+indexer: passing a list will return a plain-old ``Index``, while indexing with
+a ``Categorical`` will return a ``CategoricalIndex``, indexed according to the categories
+of the PASSED ``Categorical`` dtype. This allows one to arbitrarily index these even with
+values NOT in the categories, similarly to how you can reindex ANY pandas index.
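
The reindexing rule described at the end of the hunk can be sketched as follows. This is a minimal illustration, not part of the commit: the frame ``df3`` and the variable names are hypothetical, and a unique categorical index is assumed because reindexing an index with duplicate labels may be rejected in other pandas versions. It also contrasts ``reindex`` with ``.loc``, which is expected to raise for a label outside the categories.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a unique CategoricalIndex (names are illustrative only).
df3 = pd.DataFrame(
    {"A": np.arange(3)},
    index=pd.CategoricalIndex(list("abc"), categories=list("cab"), name="B"),
)

# Reindexing with a plain list: the result carries a regular Index,
# and the label 'e' (absent from df3) gets a missing value.
by_list = df3.reindex(["a", "e"])

# Reindexing with a Categorical: the result carries a CategoricalIndex
# whose categories come from the passed Categorical, not from df3.index.
target = pd.Categorical(["a", "e"], categories=list("abe"))
by_categorical = df3.reindex(target)

# Label-based lookup, by contrast, requires the label to be among the categories.
loc_raised = False
try:
    df3.loc["e"]
except KeyError:
    loc_raised = True

print(type(by_list.index).__name__)
print(type(by_categorical.index).__name__)
print(list(by_categorical.index.categories))
print(loc_raised)
```

The sketch builds the index with ``pd.CategoricalIndex`` and ``pd.Categorical`` directly rather than the ``astype('category', categories=...)`` spelling used in the diff, so that it does not depend on that call signature.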