|
3 | 3 | v.0.7.0 (Not Yet Released)
|
4 | 4 | --------------------------
|
5 | 5 |
|
6 |
| -API Changes to integer indexing |
7 |
| -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
8 |
| - |
9 |
| -One of the potentially riskiest API changes in 0.7.0, but also one of the most |
10 |
| -important, was a complete review of how **integer indexes** are handled with |
11 |
| -regard to label-based indexing. Here is an example: |
12 |
| - |
13 |
| -.. ipython:: python |
14 |
| - |
15 |
| - s = Series(randn(10), index=range(0, 20, 2)) |
16 |
| - s |
17 |
| - s[0] |
18 |
| - s[2] |
19 |
| - s[4] |
20 |
| - |
21 |
| -This is all exactly identical to the behavior before. However, if you ask for a |
22 |
| -key **not** contained in the Series, in versions 0.6.1 and prior, Series would |
23 |
| -*fall back* on a location-based lookup. This now raises a ``KeyError``: |
24 |
| - |
25 |
| -.. code-block:: ipython |
26 |
| - |
27 |
| - In [2]: s[1] |
28 |
| - KeyError: 1 |
29 |
| - |
30 |
| -This change also has the same impact on DataFrame: |
31 |
| - |
32 |
| -.. code-block:: ipython |
33 |
| - |
34 |
| - In [3]: df = DataFrame(randn(8, 4), index=range(0, 16, 2)) |
35 |
| - |
36 |
| - In [4]: df |
37 |
| - 0 1 2 3 |
38 |
| - 0 0.88427 0.3363 -0.1787 0.03162 |
39 |
| - 2 0.14451 -0.1415 0.2504 0.58374 |
40 |
| - 4 -1.44779 -0.9186 -1.4996 0.27163 |
41 |
| - 6 -0.26598 -2.4184 -0.2658 0.11503 |
42 |
| - 8 -0.58776 0.3144 -0.8566 0.61941 |
43 |
| - 10 0.10940 -0.7175 -1.0108 0.47990 |
44 |
| - 12 -1.16919 -0.3087 -0.6049 -0.43544 |
45 |
| - 14 -0.07337 0.3410 0.0424 -0.16037 |
46 |
| - |
47 |
| - In [5]: df.ix[3] |
48 |
| - KeyError: 3 |
49 |
| - |
50 |
| -API refinements regarding label-based slicing |
51 |
| -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
52 |
| - |
53 |
| -Other relevant API Changes |
54 |
| -~~~~~~~~~~~~~~~~~~~~~~~~~~ |
55 |
| - |
56 | 6 | New features
|
57 | 7 | ~~~~~~~~~~~~
|
58 | 8 |
|
@@ -138,6 +88,136 @@ New features
|
138 | 88 | aggregate with groupby on a DataFrame, yielding an aggregated result with
|
139 | 89 | hierarchical columns (GH166_)
|
140 | 90 |
|
| 91 | +API Changes to integer indexing |
| 92 | +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 93 | + |
| 94 | +One of the potentially riskiest API changes in 0.7.0, but also one of the most |
| 95 | +important, was a complete review of how **integer indexes** are handled with |
| 96 | +regard to label-based indexing. Here is an example: |
| 97 | + |
| 98 | +.. ipython:: python |
| 99 | + |
| 100 | + s = Series(randn(10), index=range(0, 20, 2)) |
| 101 | + s |
| 102 | + s[0] |
| 103 | + s[2] |
| 104 | + s[4] |
| 105 | + |
| 106 | +This is all exactly identical to the behavior before. However, if you ask for a |
| 107 | +key **not** contained in the Series, in versions 0.6.1 and prior, Series would |
| 108 | +*fall back* on a location-based lookup. This now raises a ``KeyError``: |
| 109 | + |
| 110 | +.. code-block:: ipython |
| 111 | + |
| 112 | + In [2]: s[1] |
| 113 | + KeyError: 1 |
| 114 | + |
| 115 | +This change also has the same impact on DataFrame: |
| 116 | + |
| 117 | +.. code-block:: ipython |
| 118 | + |
| 119 | + In [3]: df = DataFrame(randn(8, 4), index=range(0, 16, 2)) |
| 120 | + |
| 121 | + In [4]: df |
| 122 | + 0 1 2 3 |
| 123 | + 0 0.88427 0.3363 -0.1787 0.03162 |
| 124 | + 2 0.14451 -0.1415 0.2504 0.58374 |
| 125 | + 4 -1.44779 -0.9186 -1.4996 0.27163 |
| 126 | + 6 -0.26598 -2.4184 -0.2658 0.11503 |
| 127 | + 8 -0.58776 0.3144 -0.8566 0.61941 |
| 128 | + 10 0.10940 -0.7175 -1.0108 0.47990 |
| 129 | + 12 -1.16919 -0.3087 -0.6049 -0.43544 |
| 130 | + 14 -0.07337 0.3410 0.0424 -0.16037 |
| 131 | + |
| 132 | + In [5]: df.ix[3] |
| 133 | + KeyError: 3 |
| 134 | + |
| 135 | +In order to support purely integer-based indexing, the following methods have |
| 136 | +been added: |
| 137 | + |
| 138 | +.. csv-table:: |
| 139 | + :header: "Method","Description" |
| 140 | + :widths: 40,60 |
| 141 | + |
| 142 | + ``Series.iget_value(i)``, Retrieve value stored at location ``i`` |
| 143 | + ``Series.iget(i)``, Alias for ``iget_value`` |
| 144 | + ``DataFrame.irow(i)``, Retrieve the ``i``-th row |
| 145 | + ``DataFrame.icol(j)``, Retrieve the ``j``-th column |
| 146 | + "``DataFrame.iget_value(i, j)``", Retrieve the value at row ``i`` and column ``j`` |
| 147 | + |
| 148 | +API tweaks regarding label-based slicing |
| 149 | +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 150 | + |
| 151 | +Label-based slicing using ``ix`` now requires that the index be sorted |
| 152 | +(monotonic) **unless** both the start and endpoint are contained in the index: |
| 153 | + |
| 154 | +.. ipython:: python |
| 155 | + |
| 156 | + s = Series(randn(6), index=list('gmkaec')) |
| 157 | + s |
| 158 | + |
| 159 | +Then this is OK: |
| 160 | + |
| 161 | +.. ipython:: python |
| 162 | + |
| 163 | + s.ix['k':'e'] |
| 164 | + |
| 165 | +But this is not: |
| 166 | + |
| 167 | +.. code-block:: ipython |
| 168 | + |
| 169 | + In [12]: s.ix['b':'h'] |
| 170 | + KeyError: 'b' |
| 171 | + |
| 172 | +If the index had been sorted, the "range selection" would have been possible: |
| 173 | + |
| 174 | +.. ipython:: python |
| 175 | + |
| 176 | + s2 = s.sort_index() |
| 177 | + s2 |
| 178 | + s2.ix['b':'h'] |
| 179 | + |
| 180 | +Changes to Series ``[]`` operator |
| 181 | +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 182 | + |
| 183 | +As a notational convenience, you can pass a sequence of labels or a label |
| 184 | +slice to a Series when getting and setting values via ``[]`` (i.e. the |
| 185 | +``__getitem__`` and ``__setitem__`` methods). The behavior will be the same as |
| 186 | +passing similar input to ``ix`` **except in the case of integer indexing**: |
| 187 | + |
| 188 | +.. ipython:: python |
| 189 | + |
| 190 | + s = Series(randn(6), index=list('acegkm')) |
| 191 | + s |
| 192 | + s[['m', 'a', 'c', 'e']] |
| 193 | + s['b':'l'] |
| 194 | + s['c':'k'] |
| 195 | + |
| 196 | +In the case of integer indexes, the behavior will be exactly as before |
| 197 | +(shadowing ``ndarray``): |
| 198 | + |
| 199 | +.. ipython:: python |
| 200 | + |
| 201 | + s = Series(randn(6), index=range(0, 12, 2)) |
| 202 | + s[[4, 0, 2]] |
| 203 | + s[1:5] |
| 204 | + |
| 205 | +If you wish to do indexing with sequences and slicing on an integer index with |
| 206 | +label semantics, use ``ix``. |
| 207 | + |
| 208 | +Other API Changes |
| 209 | +~~~~~~~~~~~~~~~~~ |
| 210 | + |
| 211 | +- The deprecated ``LongPanel`` class has been completely removed |
| 212 | + |
| 213 | +- If ``Series.sort`` is called on a column of a DataFrame, an exception will |
| 214 | + now be raised. Previously, it was possible to accidentally mutate a DataFrame's |
| 215 | + column by calling ``df[col].sort()`` instead of the side-effect-free method |
| 216 | + ``df[col].order()`` (GH316_) |
| 217 | + |
| 218 | +- Miscellaneous renames and deprecations which will (harmlessly) raise |
| 219 | + ``FutureWarning`` |
| 220 | + |
141 | 221 | Performance improvements
|
142 | 222 | ~~~~~~~~~~~~~~~~~~~~~~~~
|
143 | 223 |
|
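The integer-indexing change described in this hunk can be illustrated with a small pure-Python model. This is only a sketch of the semantics, not pandas code: ``LabelSeries`` is a hypothetical stand-in, and its ``iget`` method merely mirrors the role of the ``Series.iget`` method listed in the table above.

```python
class LabelSeries:
    """Toy stand-in for a Series (hypothetical; not the pandas implementation)."""

    def __init__(self, values, index):
        self.values = list(values)
        self.index = list(index)
        # Map each label to its position; labels are assumed to be unique.
        self._loc = {label: pos for pos, label in enumerate(self.index)}

    def __getitem__(self, label):
        # Label-based lookup only: a missing label raises KeyError
        # instead of falling back to a location-based lookup.
        return self.values[self._loc[label]]

    def iget(self, i):
        # Explicit position-based access, independent of the labels.
        return self.values[i]


s = LabelSeries([10.0, 11.0, 12.0, 13.0], index=range(0, 8, 2))

label_hit = s[2]     # label 2 is stored at position 1
pos_hit = s.iget(1)  # position 1, regardless of its label

try:
    s[1]             # 1 is not a label in the index
    missing_raises = False
except KeyError:
    missing_raises = True
```

Keeping label access and positional access in separate methods is what removes the ambiguity: with an integer index, ``s[1]`` can only ever mean "the label 1".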
@@ -189,6 +269,7 @@ similar operation to the above but using a Python function:
|
189 | 269 | .. _GH249: https://github.com/wesm/pandas/issues/249
|
190 | 270 | .. _GH267: https://github.com/wesm/pandas/issues/267
|
191 | 271 | .. _GH273: https://github.com/wesm/pandas/issues/273
|
| 272 | +.. _GH316: https://github.com/wesm/pandas/issues/316 |
192 | 273 | .. _GH338: https://github.com/wesm/pandas/issues/338
|
193 | 274 | .. _GH342: https://github.com/wesm/pandas/issues/342
|
194 | 275 | .. _GH374: https://github.com/wesm/pandas/issues/374
|
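The label-based slicing rule added in this diff (an unsorted index is acceptable only when both endpoints are present) can be sketched in plain Python. The helper name ``label_slice_bounds`` is hypothetical, unique labels are assumed, and the use of ``bisect`` is an illustrative choice rather than a description of how pandas implements ``ix``.

```python
from bisect import bisect_left, bisect_right


def label_slice_bounds(index, start, stop):
    """Return (lo, hi) positions for an inclusive label slice.

    Hypothetical helper that models the rule in the release notes;
    assumes the labels are unique.
    """
    index = list(index)
    if start in index and stop in index:
        # Both endpoints exist, so their positions are unambiguous
        # even when the index is not sorted.
        return index.index(start), index.index(stop) + 1
    if index != sorted(index):
        # A missing endpoint cannot be located in an unsorted index.
        missing = start if start not in index else stop
        raise KeyError(missing)
    # Sorted index: find the range of labels lying between the endpoints.
    return bisect_left(index, start), bisect_right(index, stop)


unsorted_index = list("gmkaec")
ok_bounds = label_slice_bounds(unsorted_index, "k", "e")

try:
    label_slice_bounds(unsorted_index, "b", "h")
    unsorted_raises = False
except KeyError:
    unsorted_raises = True

sorted_bounds = label_slice_bounds(sorted(unsorted_index), "b", "h")
```

Here ``ok_bounds`` covers the labels ``k``, ``a``, ``e`` in their stored order, while ``sorted_bounds`` covers ``c``, ``e``, ``g`` in the sorted index, matching the two cases in the notes.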