doc/source/whatsnew/v0.16.1.txt (+75)
@@ -7,6 +7,10 @@ This is a minor bug-fix release from 0.16.0 and includes a large number of
bug fixes along with several new features, enhancements, and performance improvements.
We recommend that all users upgrade to this version.

Highlights include:

- Support for a ``CategoricalIndex``, a category-based index, see :ref:`here <whatsnew_0161.enhancements.categoricalindex>`

.. contents:: What's new in v0.16.1
    :local:
    :backlinks: none
@@ -31,6 +35,7 @@ Enhancements
will return a `np.array` instead of a boolean `Index` (:issue:`8875`). This enables the following expression
to work naturally:

.. ipython:: python

   idx = Index(['a1', 'a2', 'b1', 'b2'])
@@ -40,6 +45,7 @@ Enhancements
   s[s.index.str.startswith('a')]
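
The boolean-mask behaviour described above can be checked with a small self-contained sketch. The ``Series`` values and the names ``mask`` and ``selected`` are made up for illustration, and the ``pd.`` / ``np.`` prefixed imports are an assumption about how the reader has pandas and NumPy loaded.

```python
import numpy as np
import pandas as pd

# Same labels as in the snippet above; the Series values are illustrative only.
idx = pd.Index(['a1', 'a2', 'b1', 'b2'])
s = pd.Series(range(4), index=idx)

# A boolean-returning string method on an Index yields a NumPy array,
# not a boolean Index.
mask = s.index.str.startswith('a')

# Because it is a plain boolean array, it can be used directly as a mask.
selected = s[mask]
print(type(mask).__name__, list(selected.index))
```

Returning an array rather than an ``Index`` is what lets the result act as a row filter without any further conversion.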

- ``DataFrame.mask()`` and ``Series.mask()`` now support the same keywords as ``where`` (:issue:`8801`)

- ``drop`` can now accept an ``errors`` keyword to suppress the ``ValueError`` raised when any of the labels does not exist in the target data. (:issue:`6736`)

.. ipython:: python
@@ -53,6 +59,75 @@ Enhancements

- Allow timedelta string conversion when the leading zero is missing from the time definition, i.e. `0:00:00` vs `00:00:00`. (:issue:`9570`)
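
As a rough illustration of the three enhancements listed above, the following sketch exercises ``mask`` with the ``other`` keyword, ``drop`` with ``errors='ignore'``, and a time string without a leading zero. The data and variable names are hypothetical, and using ``pd.Timedelta`` as the conversion entry point is an assumption; the release note itself does not name the function.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4])

# mask() replaces values where the condition is True; `other` is one of the
# keywords it shares with where().
masked = s.mask(s > 2, other=0)

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})

# 'C' is not a column: with errors='ignore' the existing label is dropped
# and the missing one is skipped instead of raising.
dropped = df.drop(['B', 'C'], axis=1, errors='ignore')

# A time string without the leading zero parses to the same value
# as the zero-padded form.
same = pd.Timedelta('0:00:01') == pd.Timedelta('00:00:01')

print(list(masked), list(dropped.columns), same)
```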


.. _whatsnew_0161.enhancements.categoricalindex:

CategoricalIndex
^^^^^^^^^^^^^^^^

We introduce a ``CategoricalIndex``, a new type of index object that is useful for supporting
indexing with duplicates. This is a container around a ``Categorical`` (introduced in v0.15.0)
and allows efficient indexing and storage of an index with a large number of duplicated elements. Prior to 0.16.1,
setting the index of a ``DataFrame/Series`` with a ``category`` dtype would convert it to a regular object-based ``Index``.

.. ipython :: python

   df = DataFrame({'A' : np.arange(6),
                   'B' : Series(list('aabbca')).astype('category',
                                                        categories=list('cab'))
                  })
   df
   df.dtypes
   df.B.cat.categories

Setting the index will create a ``CategoricalIndex``:

.. ipython :: python

   df2 = df.set_index('B')
   df2.index

Indexing with ``__getitem__/.iloc/.loc/.ix`` works similarly to an ``Index`` with duplicates.
The indexers MUST be in the categories or the operation will raise.

.. ipython :: python

   df2.loc['a']

The result of such indexing preserves the ``CategoricalIndex``:

.. ipython :: python

   df2.loc['a'].index
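
The two indexing rules above (the result keeps its ``CategoricalIndex``, and a label outside the categories raises) can be sketched as follows. The frame is rebuilt with ``pd.Categorical`` so the block is self-contained. Catching ``KeyError`` is an assumption about the exception type, since the text above only says the operation "will raise", and the label ``'e'`` is just an arbitrary value outside the categories.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame; categories are deliberately ordered c, a, b.
df = pd.DataFrame({'A': np.arange(6),
                   'B': pd.Categorical(list('aabbca'), categories=list('cab'))})
df2 = df.set_index('B')

# Selecting a duplicated label returns all matching rows ...
subset = df2.loc['a']
# ... and the result still carries a CategoricalIndex.
keeps_type = isinstance(subset.index, pd.CategoricalIndex)

# 'e' is not one of the categories, so the lookup is expected to fail.
# Assumption: the failure surfaces as a KeyError.
try:
    df2.loc['e']
    raised = False
except KeyError:
    raised = True

print(keeps_type, list(subset['A']), raised)
```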

Sorting will order by the order of the categories:

.. ipython :: python

   df2.sort_index()
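
To make the sort order concrete, here is a minimal sketch under the same setup. Because the categories were declared as ``c, a, b``, that is the order in which the labels should come out, rather than alphabetical order. The frame is rebuilt with ``pd.Categorical`` for self-containment, and ``order`` is an illustrative name.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': np.arange(6),
                   'B': pd.Categorical(list('aabbca'), categories=list('cab'))})
df2 = df.set_index('B')

# Sorting follows the declared category order (c, a, b), not the
# lexical order of the labels.
order = list(df2.sort_index().index)
print(order)
```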

Groupby operations on the index will preserve the index nature as well:

.. ipython :: python

   df2.groupby(level=0).sum()
   df2.groupby(level=0).sum().index
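
A small sketch of the groupby behaviour, again with the frame rebuilt via ``pd.Categorical``. Every category occurs in the data here, so the result should not depend on how a given pandas version treats unobserved categories; some versions may emit a warning about that default, which this sketch does not try to suppress.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': np.arange(6),
                   'B': pd.Categorical(list('aabbca'), categories=list('cab'))})
df2 = df.set_index('B')

# Group on the index level and sum column A within each category.
grouped = df2.groupby(level=0).sum()

# The grouped result is still indexed by a CategoricalIndex.
is_categorical = isinstance(grouped.index, pd.CategoricalIndex)
print(is_categorical, list(grouped.index), list(grouped['A']))
```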

Reindexing operations will return a resulting index based on the type of the passed
indexer, meaning that passing a list will return a plain-old-``Index``; indexing with
a ``Categorical`` will return a ``CategoricalIndex``, indexed according to the categories
of the PASSED ``Categorical`` dtype. This allows one to arbitrarily index these even with
values NOT in the categories, similarly to how you can reindex ANY pandas index.
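
The reindexing rule can be sketched as below. This example uses a hypothetical frame ``df3`` with unique labels rather than ``df2``, on the assumption that reindexing an axis with duplicate labels may be rejected in some pandas versions. The value ``'e'`` is deliberately not among the original categories.

```python
import numpy as np
import pandas as pd

# A frame with a unique categorical index (categories a, b, c).
df3 = pd.DataFrame({'A': np.arange(3),
                    'B': pd.Categorical(list('abc'))}).set_index('B')

# Reindexing with a plain list gives back a plain Index;
# the unknown label 'e' gets a missing value.
by_list = df3.reindex(['a', 'e'])

# Reindexing with a Categorical gives back a CategoricalIndex whose
# categories come from the passed Categorical.
by_cat = df3.reindex(pd.Categorical(['a', 'e'], categories=list('abe')))

print(type(by_list.index).__name__, type(by_cat.index).__name__)
print(list(by_cat.index.categories))
```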