doc/source/whatsnew/v0.16.1.txt (+75)
@@ -7,6 +7,10 @@ This is a minor bug-fix release from 0.16.0 and includes a large number of
bug fixes along with several new features, enhancements, and performance improvements.
We recommend that all users upgrade to this version.

Highlights include:

- Support for a ``CategoricalIndex``, a category-based index, see :ref:`here <whatsnew_0161.enhancements.categoricalindex>`

.. contents:: What's new in v0.16.1
    :local:
    :backlinks: none
@@ -31,6 +35,7 @@ Enhancements
  will return a `np.array` instead of a boolean `Index` (:issue:`8875`). This enables the following expression
  to work naturally:

  .. ipython:: python

     idx = Index(['a1', 'a2', 'b1', 'b2'])
@@ -40,6 +45,7 @@ Enhancements
     s[s.index.str.startswith('a')]

- ``DataFrame.mask()`` and ``Series.mask()`` now support the same keywords as ``where`` (:issue:`8801`)

- The ``drop`` function now accepts an ``errors`` keyword to suppress the ``ValueError`` raised when any of the labels does not exist in the target data (:issue:`6736`).

  .. ipython:: python
@@ -58,6 +64,75 @@ Enhancements

- ``DataFrame`` and ``Series`` now have a ``_constructor_expanddim`` property as an overridable constructor for data of one higher dimensionality. This should be used only when it is really needed, see :ref:`here <ref-subclassing-pandas>`

.. _whatsnew_0161.enhancements.categoricalindex:

CategoricalIndex
^^^^^^^^^^^^^^^^

We introduce a ``CategoricalIndex``, a new type of index object that is useful for supporting
indexing with duplicates. This is a container around a ``Categorical`` (introduced in v0.15.0)
and allows efficient indexing and storage of an index with a large number of duplicated elements. Prior to 0.16.1,
setting the index of a ``DataFrame/Series`` with a ``category`` dtype would convert this to a regular object-based ``Index``.
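
As a rough illustration of the storage claim, the sketch below compares the memory footprint of an object-based ``Index`` with that of a ``CategoricalIndex`` holding the same heavily duplicated labels. It assumes a pandas version that provides ``Index.memory_usage`` (which may be later than 0.16.1); the label values and sizes are made up for the example.

```python
import pandas as pd

# Hypothetical data: three distinct labels repeated many times.
labels = ['alpha', 'beta', 'gamma'] * 10000

# A regular Index stores one entry per element.
plain = pd.Index(labels)

# A CategoricalIndex stores small integer codes plus the three categories.
categorical = pd.CategoricalIndex(labels)

plain_bytes = plain.memory_usage(deep=True)
categorical_bytes = categorical.memory_usage(deep=True)

print(plain_bytes, categorical_bytes)
```

The exact byte counts depend on the pandas version and platform, so only the relative size is meaningful here.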

.. ipython:: python

   df = DataFrame({'A' : np.arange(6),
                   'B' : Series(list('aabbca')).astype('category',
                                                       categories=list('cab'))
                   })
   df
   df.dtypes
   df.B.cat.categories

Setting the index will create a ``CategoricalIndex``:

.. ipython:: python

   df2 = df.set_index('B')
   df2.index

Indexing with ``__getitem__/.iloc/.loc/.ix`` works similarly to an ``Index`` with duplicates.
The indexers MUST be in the categories or the operation will raise.

.. ipython:: python

   df2.loc['a']

Indexing also preserves the ``CategoricalIndex``:

.. ipython:: python

   df2.loc['a'].index
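
The requirement that indexers be among the categories can be sketched as follows. This is a minimal, self-contained illustration rather than part of the release itself: it builds the column with the ``Categorical`` constructor (an assumption made so the snippet does not depend on the ``astype`` keywords used above), and the label ``'e'`` is a hypothetical value chosen because it is not a category.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': np.arange(6),
                   'B': pd.Categorical(list('aabbca'), categories=list('cab'))})
df2 = df.set_index('B')

# 'a' is one of the categories, so all matching rows are returned.
rows_a = df2.loc['a']

# 'e' is not one of the categories, so the lookup raises KeyError.
try:
    df2.loc['e']
    missing_raised = False
except KeyError:
    missing_raised = True

print(rows_a)
print(missing_raised)
```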

Sorting will order by the order of the categories:

.. ipython:: python

   df2.sort_index()

Groupby operations on the index will preserve the index nature as well:

.. ipython:: python

   df2.groupby(level=0).sum()
   df2.groupby(level=0).sum().index

Reindexing operations will return a resulting index based on the type of the passed
indexer, meaning that passing a list will return a plain-old-``Index``; indexing with
a ``Categorical`` will return a ``CategoricalIndex``, indexed according to the categories
of the PASSED ``Categorical`` dtype. This allows one to arbitrarily index these even with
values NOT in the categories, similarly to how you can reindex ANY pandas index.
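
The reindexing behaviour described above might look like the following sketch. The frame ``df3`` and the label ``'e'`` are hypothetical, introduced only for illustration; a frame with a unique index is used because reindexing generally requires unique labels on the axis being reindexed.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a unique CategoricalIndex (categories: a, b, c).
df3 = pd.DataFrame({'A': np.arange(3),
                    'B': pd.Categorical(list('abc'))}).set_index('B')

# Reindexing with a list gives back a plain Index; 'e' has no match,
# so its row is filled with NaN.
by_list = df3.reindex(['a', 'e'])

# Reindexing with a Categorical gives back a CategoricalIndex whose
# categories are those of the passed Categorical.
by_cat = df3.reindex(pd.Categorical(['a', 'e'], categories=list('abe')))

print(type(by_list.index).__name__)
print(type(by_cat.index).__name__, list(by_cat.index.categories))
```

Note that ``'e'`` is accepted in both calls even though it is not among the categories of the original index.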