doc/source/whatsnew/v0.16.1.txt
+75
@@ -7,6 +7,10 @@ This is a minor bug-fix release from 0.16.0 and includes a large number of
7
7
bug fixes along with several new features, enhancements, and performance improvements.
8
8
We recommend that all users upgrade to this version.
9
9
10
+
Highlights include:
11
+
12
+
- Support for a ``CategoricalIndex``, a category-based index, see :ref:`here <whatsnew_0161.enhancements.categoricalindex>`
13
+
10
14
.. contents:: What's new in v0.16.1
11
15
:local:
12
16
:backlinks: none
@@ -31,6 +35,7 @@ Enhancements
31
35
will return a ``np.array`` instead of a boolean ``Index`` (:issue:`8875`). This enables the following expression
32
36
to work naturally:
33
37
38
+
34
39
.. ipython:: python
35
40
36
41
idx = Index(['a1', 'a2', 'b1', 'b2'])
@@ -40,6 +45,7 @@ Enhancements
40
45
s[s.index.str.startswith('a')]
41
46
42
47
- ``DataFrame.mask()`` and ``Series.mask()`` now support the same keywords as ``where`` (:issue:`8801`)
48
+
43
49
- The ``drop`` function now accepts an ``errors`` keyword to suppress the ``ValueError`` raised when any of the labels does not exist in the target data. (:issue:`6736`)
44
50
45
51
.. ipython:: python
@@ -54,6 +60,75 @@ Enhancements
54
60
- Allow timedelta string conversion when the leading zero is missing from the time definition, i.e. ``0:00:00`` vs ``00:00:00``. (:issue:`9570`)
55
61
- Allow ``Panel.shift`` with ``axis='items'`` (:issue:`9890`)
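
Two of the enhancements listed above, boolean string methods on an index and the ``errors`` keyword of ``drop``, can be sketched as follows. This is a minimal illustration written against a recent pandas release, not the 0.16.1 API verbatim: the data and variable names are made up, and the exception type for a missing label is assumed to be either ``ValueError`` (as stated in the note above) or ``KeyError`` (as in later releases).

```python
import pandas as pd

# Illustrative data; the labels mirror the index used in the note above.
s = pd.Series([1, 2, 3, 4], index=pd.Index(["a1", "a2", "b1", "b2"]))

# A boolean string method on the index yields a mask usable for selection.
mask = s.index.str.startswith("a")
selected = s[mask]

# Dropping a label that does not exist raises by default
# (ValueError per the note above; KeyError in later pandas releases).
try:
    s.drop(["a1", "zz"])
    raised = False
except (KeyError, ValueError):
    raised = True

# With errors="ignore", existing labels are dropped and missing ones skipped.
dropped = s.drop(["a1", "zz"], errors="ignore")

print(selected)
print(raised)
print(dropped)
```

Catching both exception types keeps the sketch independent of the pandas version in use.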
56
62
63
+
64
+
.. _whatsnew_0161.enhancements.categoricalindex:
65
+
66
+
CategoricalIndex
67
+
^^^^^^^^^^^^^^^^
68
+
69
+
We introduce a ``CategoricalIndex``, a new type of index object that is useful for supporting
70
+
indexing with duplicates. This is a container around a ``Categorical`` (introduced in v0.15.0)
71
+
and allows efficient indexing and storage of an index with a large number of duplicated elements. Prior to 0.16.1,
72
+
setting the index of a ``DataFrame/Series`` with a ``category`` dtype would convert this to a regular object-based ``Index``.
73
+
74
+
.. ipython :: python
75
+
76
+
df = DataFrame({'A' : np.arange(6),
77
+
'B' : Series(list('aabbca')).astype('category',
78
+
categories=list('cab'))
79
+
})
80
+
df
81
+
df.dtypes
82
+
df.B.cat.categories
83
+
84
+
Setting the index will create a ``CategoricalIndex``:
85
+
86
+
.. ipython :: python
87
+
88
+
df2 = df.set_index('B')
89
+
df2.index
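
The same construction can be sketched for a recent pandas release, assuming the categories are supplied through ``pd.Categorical`` rather than as a keyword to ``astype``. The frame mirrors the one above; the check at the end is only illustrative.

```python
import pandas as pd

# Build the categorical column with an explicit category order: c, a, b.
df = pd.DataFrame({
    "A": range(6),
    "B": pd.Categorical(list("aabbca"), categories=list("cab")),
})

# Setting a category-dtype column as the index keeps it categorical.
df2 = df.set_index("B")

print(type(df2.index).__name__)
print(list(df2.index.categories))
```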
90
+
91
+
Indexing with ``__getitem__/.iloc/.loc/.ix`` works similarly to an ``Index`` with duplicates.
92
+
The indexers MUST be in the categories or the operation will raise.
93
+
94
+
.. ipython :: python
95
+
96
+
df2.loc['a']
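
The requirement that indexers belong to the categories can be sketched as below. This assumes a recent pandas release, in which a label outside the categories raises ``KeyError``; the label ``'z'`` is a made-up example.

```python
import pandas as pd

df = pd.DataFrame({
    "A": range(6),
    "B": pd.Categorical(list("aabbca"), categories=list("cab")),
})
df2 = df.set_index("B")

# 'a' is one of the categories, so every matching row is returned.
rows = df2.loc["a"]

# 'z' is not one of the categories, so the lookup raises.
try:
    df2.loc["z"]
    raised = False
except KeyError:
    raised = True

print(len(rows), raised)
```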
97
+
98
+
The result preserves the ``CategoricalIndex``:
99
+
100
+
.. ipython :: python
101
+
102
+
df2.loc['a'].index
103
+
104
+
Sorting will order by the order of the categories:
105
+
106
+
.. ipython :: python
107
+
108
+
df2.sort_index()
109
+
110
+
Groupby operations on the index will preserve the ``CategoricalIndex`` as well:
111
+
112
+
.. ipython :: python
113
+
114
+
df2.groupby(level=0).sum()
115
+
df2.groupby(level=0).sum().index
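
A sketch of the same aggregation for a recent pandas release follows. The ``observed`` keyword is passed explicitly only because its default has differed between releases; every category occurs in this data, so the choice does not change the result here.

```python
import pandas as pd

df = pd.DataFrame({
    "A": range(6),
    "B": pd.Categorical(list("aabbca"), categories=list("cab")),
})
df2 = df.set_index("B")

# Group on the index level; groups come out in category order (c, a, b).
summed = df2.groupby(level=0, observed=True).sum()

print(summed)
print(type(summed.index).__name__)
```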
116
+
117
+
Reindexing operations will return a resulting index based on the type of the passed
118
+
indexer, meaning that passing a list will return a plain-old-``Index``; indexing with
119
+
a ``Categorical`` will return a ``CategoricalIndex``, indexed according to the categories
120
+
of the PASSED ``Categorical`` dtype. This allows one to arbitrarily index these even with
121
+
values NOT in the categories, similarly to how you can reindex ANY pandas index.
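
The reindexing behaviour described above can be sketched as follows, assuming a recent pandas release. The frame ``df3`` and the label ``'e'`` are illustrative; an index without duplicates is used because reindexing requires unique labels.

```python
import pandas as pd

df3 = pd.DataFrame({
    "A": range(3),
    "B": pd.Categorical(list("abc")),
}).set_index("B")

# Passing a list returns a plain Index; 'e' is not a category, so its row is NaN.
by_list = df3.reindex(["a", "e"])

# Passing a Categorical returns a CategoricalIndex that takes on the
# categories of the passed Categorical.
by_cat = df3.reindex(pd.Categorical(["a", "e"], categories=list("abe")))

print(type(by_list.index).__name__)
print(type(by_cat.index).__name__, list(by_cat.index.categories))
```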