Commit 8992129

Merge pull request #9566 from jreback/loc
API: consistency with .ix and .loc for getitem operations (GH8613)
2 parents d243f3c + 9207145 commit 8992129

File tree

14 files changed

+540
-284
lines changed

doc/source/indexing.rst

+22-3
Original file line numberDiff line numberDiff line change
@@ -85,7 +85,7 @@ of multi-axis indexing.
8585

8686
- ``.iloc`` is primarily integer position based (from ``0`` to
8787
``length-1`` of the axis), but may also be used with a boolean
88-
array. ``.iloc`` will raise ``IndexError`` if a requested
88+
array. ``.iloc`` will raise ``IndexError`` if a requested
8989
indexer is out-of-bounds, except *slice* indexers which allow
9090
out-of-bounds indexing. (this conforms with python/numpy *slice*
9191
semantics). Allowed inputs are:
@@ -292,6 +292,27 @@ Selection By Label
292292
This is sometimes called ``chained assignment`` and should be avoided.
293293
See :ref:`Returning a View versus Copy <indexing.view_versus_copy>`
294294

295+
.. warning::
296+
297+
``.loc`` is strict when you present slicers that are not compatible with (or convertible to) the index type, for example
298+
when using integers with a ``DatetimeIndex``. These will raise a ``TypeError``.
299+
300+
.. ipython:: python
301+
302+
dfl = DataFrame(np.random.randn(5,4), columns=list('ABCD'), index=date_range('20130101',periods=5))
303+
dfl
304+
305+
.. code-block:: python
306+
307+
In [4]: dfl.loc[2:3]
308+
TypeError: cannot do slice indexing on <class 'pandas.tseries.index.DatetimeIndex'> with these indexers [2] of <type 'int'>
309+
310+
String-likes in slicing *can* be converted to the type of the index and lead to natural slicing.
311+
312+
.. ipython:: python
313+
314+
dfl.loc['20130102':'20130104']
315+
295316
pandas provides a suite of methods in order to have **purely label based indexing**. This is a strict inclusion based protocol.
296317
**at least 1** of the labels for which you ask, must be in the index or a ``KeyError`` will be raised! When slicing, the start bound is *included*, **AND** the stop bound is *included*. Integers are valid labels, but they refer to the label **and not the position**.
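The rules above can be sketched as follows. This is a minimal illustration written against a current pandas release rather than the version in this commit; the variable names (``by_string``, ``by_label``, and so on) are made up for the example, and the data uses ``np.arange`` instead of random values so the results are reproducible.

```python
import numpy as np
import pandas as pd

dfl = pd.DataFrame(np.arange(20).reshape(5, 4), columns=list("ABCD"),
                   index=pd.date_range("20130101", periods=5))

# Date-like strings are convertible to the index type; both bounds are included.
by_string = dfl.loc["20130102":"20130104"]

# Integers are not convertible to timestamps, so the slice is rejected.
try:
    dfl.loc[2:3]
    int_slice_error = None
except TypeError as exc:
    int_slice_error = exc

# On an integer index, integers passed to .loc are labels, not positions.
s = pd.Series(range(5), index=[-2, -1, 1, 2, 3])
by_label = s.loc[1:3]      # labels 1, 2 and 3: the stop bound is included
by_position = s.iloc[1:3]  # positions 1 and 2: the stop bound is excluded
```

Comparing ``by_label`` with ``by_position`` shows why the distinction matters: the same ``1:3`` slice selects three rows by label but only two by position.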
297318

@@ -1486,5 +1507,3 @@ This will **not** work at all, and so should be avoided
14861507
The chained assignment warnings / exceptions are aiming to inform the user of a possibly invalid
14871508
assignment. There may be false positives; situations where a chained assignment is inadvertently
14881509
reported.
1489-
1490-

doc/source/whatsnew/v0.16.0.txt

+133-64
Original file line numberDiff line numberDiff line change
@@ -20,6 +20,8 @@ users upgrade to this version.
2020
New features
2121
~~~~~~~~~~~~
2222

23+
.. _whatsnew_0160.enhancements:
24+
2325
- Reindex now supports ``method='nearest'`` for frames or series with a monotonic increasing or decreasing index (:issue:`9258`):
2426

2527
.. ipython:: python
@@ -29,7 +31,41 @@ New features
2931

3032
This method is also exposed by the lower level ``Index.get_indexer`` and ``Index.get_loc`` methods.
3133

32-
- DataFrame assign method
34+
- Paths beginning with ~ will now be expanded to begin with the user's home directory (:issue:`9066`)
35+
- Added time interval selection in ``get_data_yahoo`` (:issue:`9071`)
36+
- Added ``Series.str.slice_replace()``, which previously raised ``NotImplementedError`` (:issue:`8888`)
37+
- Added ``Timestamp.to_datetime64()`` to complement ``Timedelta.to_timedelta64()`` (:issue:`9255`)
38+
- ``tseries.frequencies.to_offset()`` now accepts ``Timedelta`` as input (:issue:`9064`)
39+
- A ``lag`` parameter was added to the autocorrelation method of ``Series``; it defaults to lag-1 autocorrelation (:issue:`9192`)
40+
- ``Timedelta`` will now accept ``nanoseconds`` keyword in constructor (:issue:`9273`)
41+
- SQL code now safely escapes table and column names (:issue:`8986`)
42+
43+
- Added auto-complete for ``Series.str.<tab>``, ``Series.dt.<tab>`` and ``Series.cat.<tab>`` (:issue:`9322`)
44+
- Added ``StringMethods.isalnum()``, ``isalpha()``, ``isdigit()``, ``isspace()``, ``islower()``,
45+
``isupper()``, ``istitle()`` which behave the same as the standard ``str`` methods (:issue:`9282`)
46+
47+
- Added ``StringMethods.find()`` and ``rfind()`` which behave the same as the standard ``str`` methods (:issue:`9386`)
48+
49+
- ``Index.get_indexer`` now supports ``method='pad'`` and ``method='backfill'`` for any target array, not just monotonic targets. These methods also work for monotonic decreasing as well as monotonic increasing indexes (:issue:`9258`).
50+
- ``Index.asof`` now works on all index types (:issue:`9258`).
51+
52+
- Added ``StringMethods.isnumeric`` and ``isdecimal`` which behave the same as the standard ``str`` methods (:issue:`9439`)
53+
- The ``read_excel()`` function's :ref:`sheetname <_io.specifying_sheets>` argument now accepts a list and ``None``, to get multiple or all sheets respectively. If more than one sheet is specified, a dictionary is returned. (:issue:`9450`)
54+
55+
.. code-block:: python
56+
57+
# Returns the 1st and 4th sheet, as a dictionary of DataFrames.
58+
pd.read_excel('path_to_file.xls',sheetname=['Sheet1',3])
59+
60+
- A ``verbose`` argument has been added to ``io.read_excel()``; it defaults to ``False``. Set it to ``True`` to print sheet names as they are parsed. (:issue:`9450`)
61+
- Added ``StringMethods.ljust()`` and ``rjust()`` which behave the same as the standard ``str`` methods (:issue:`9352`)
62+
- ``StringMethods.pad()`` and ``center()`` now accept ``fillchar`` option to specify filling character (:issue:`9352`)
63+
- Added ``StringMethods.zfill()`` which behaves the same as the standard ``str`` method (:issue:`9387`)
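A few of the string methods listed above can be exercised as in the sketch below. It assumes a current pandas release, where these methods are reached through the ``Series.str`` accessor; the sample strings and variable names are invented for illustration.

```python
import pandas as pd

s = pd.Series(["a1", "42", "B c"])

alnum = s.str.isalnum()    # element-wise str.isalnum()
digits = s.str.isdigit()   # element-wise str.isdigit()
found = s.str.find("c")    # lowest index of "c", or -1 when absent
padded = s.str.zfill(4)    # left-pad with zeros to width 4

# pad-style methods accept a fill character
left = s.str.ljust(4, fillchar="-")
centered = s.str.center(5, fillchar="*")
```

Each call returns a new ``Series`` aligned with ``s``, so the results can be assigned back as columns or used as boolean masks.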
64+
65+
DataFrame Assign
66+
~~~~~~~~~~~~~~~~
67+
68+
.. _whatsnew_0160.enhancements.assign:
3369

3470
Inspired by `dplyr's
3571
<http://cran.rstudio.com/web/packages/dplyr/vignettes/introduction.html#mutate>`__ ``mutate`` verb, DataFrame has a new
@@ -71,6 +107,55 @@ calculate the ratio, and plot
71107

72108
See the :ref:`documentation <dsintro.chained_assignment>` for more. (:issue:`9229`)
73109

110+
111+
Interaction with scipy.sparse
112+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
113+
114+
.. _whatsnew_0160.enhancements.sparse:
115+
116+
Added :meth:`SparseSeries.to_coo` and :meth:`SparseSeries.from_coo` methods (:issue:`8048`) for converting to and from ``scipy.sparse.coo_matrix`` instances (see :ref:`here <sparse.scipysparse>`). For example, given a ``SparseSeries`` with a ``MultiIndex`` we can convert to a ``scipy.sparse.coo_matrix`` by specifying the row and column labels as index levels:
117+
118+
.. ipython:: python
119+
120+
from numpy import nan
121+
s = Series([3.0, nan, 1.0, 3.0, nan, nan])
122+
s.index = MultiIndex.from_tuples([(1, 2, 'a', 0),
123+
(1, 2, 'a', 1),
124+
(1, 1, 'b', 0),
125+
(1, 1, 'b', 1),
126+
(2, 1, 'b', 0),
127+
(2, 1, 'b', 1)],
128+
names=['A', 'B', 'C', 'D'])
129+
130+
s
131+
132+
# SparseSeries
133+
ss = s.to_sparse()
134+
ss
135+
136+
A, rows, columns = ss.to_coo(row_levels=['A', 'B'],
137+
column_levels=['C', 'D'],
138+
sort_labels=False)
139+
140+
A
141+
A.todense()
142+
rows
143+
columns
144+
145+
The ``from_coo`` method is a convenience method for creating a ``SparseSeries``
146+
from a ``scipy.sparse.coo_matrix``:
147+
148+
.. ipython:: python
149+
150+
from scipy import sparse
151+
A = sparse.coo_matrix(([3.0, 1.0, 2.0], ([1, 0, 0], [0, 2, 3])),
152+
shape=(3, 4))
153+
A
154+
A.todense()
155+
156+
ss = SparseSeries.from_coo(A)
157+
ss
158+
74159
.. _whatsnew_0160.api:
75160

76161
.. _whatsnew_0160.api_breaking:
@@ -211,96 +296,80 @@ Backwards incompatible API changes
211296
p // 0
212297

213298

299+
Indexing Changes
300+
~~~~~~~~~~~~~~~~
214301

215-
Deprecations
216-
~~~~~~~~~~~~
302+
.. _whatsnew_0160.api_breaking.indexing:
217303

218-
.. _whatsnew_0160.deprecations:
304+
The behavior of a small subset of edge cases when using ``.loc`` has changed (:issue:`8613`). Furthermore, we have improved the content of the error messages that are raised:
219305

306+
- slicing with ``.loc`` where the start and/or stop bound is not found in the index is now allowed; this previously would raise a ``KeyError``. This makes the behavior the same as ``.ix`` in this case. This change is only for slicing, not when indexing with a single label.
220307

221-
Enhancements
222-
~~~~~~~~~~~~
308+
.. ipython:: python
223309

224-
.. _whatsnew_0160.enhancements:
310+
df = DataFrame(np.random.randn(5,4), columns=list('ABCD'), index=date_range('20130101',periods=5))
311+
df
312+
s = Series(range(5),[-2,-1,1,2,3])
313+
s
225314

226-
- Paths beginning with ~ will now be expanded to begin with the user's home directory (:issue:`9066`)
227-
- Added time interval selection in ``get_data_yahoo`` (:issue:`9071`)
228-
- Added ``Series.str.slice_replace()``, which previously raised ``NotImplementedError`` (:issue:`8888`)
229-
- Added ``Timestamp.to_datetime64()`` to complement ``Timedelta.to_timedelta64()`` (:issue:`9255`)
230-
- ``tseries.frequencies.to_offset()`` now accepts ``Timedelta`` as input (:issue:`9064`)
231-
- Lag parameter was added to the autocorrelation method of ``Series``, defaults to lag-1 autocorrelation (:issue:`9192`)
232-
- ``Timedelta`` will now accept ``nanoseconds`` keyword in constructor (:issue:`9273`)
233-
- SQL code now safely escapes table and column names (:issue:`8986`)
315+
Previous Behavior
234316

235-
- Added auto-complete for ``Series.str.<tab>``, ``Series.dt.<tab>`` and ``Series.cat.<tab>`` (:issue:`9322`)
236-
- Added ``StringMethods.isalnum()``, ``isalpha()``, ``isdigit()``, ``isspace()``, ``islower()``,
237-
``isupper()``, ``istitle()`` which behave as the same as standard ``str`` (:issue:`9282`)
317+
.. code-block:: python
238318

239-
- Added ``StringMethods.find()`` and ``rfind()`` which behave as the same as standard ``str`` (:issue:`9386`)
319+
In [4]: df.loc['2013-01-02':'2013-01-10']
320+
KeyError: 'stop bound [2013-01-10] is not in the [index]'
240321

241-
- ``Index.get_indexer`` now supports ``method='pad'`` and ``method='backfill'`` even for any target array, not just monotonic targets. These methods also work for monotonic decreasing as well as monotonic increasing indexes (:issue:`9258`).
242-
- ``Index.asof`` now works on all index types (:issue:`9258`).
322+
In [6]: s.loc[-10:3]
323+
KeyError: 'start bound [-10] is not the [index]'
243324

244-
- Added ``StringMethods.isnumeric`` and ``isdecimal`` which behave as the same as standard ``str`` (:issue:`9439`)
245-
- The ``read_excel()`` function's :ref:`sheetname <_io.specifying_sheets>` argument now accepts a list and ``None``, to get multiple or all sheets respectively. If more than one sheet is specified, a dictionary is returned. (:issue:`9450`)
325+
New Behavior
326+
327+
.. ipython:: python
328+
329+
df.loc['2013-01-02':'2013-01-10']
330+
s.loc[-10:3]
331+
332+
- allow slicing with float-like values on an integer index for ``.ix``. Previously this was only enabled for ``.loc``:
246333

247334
.. code-block:: python
248335

249-
# Returns the 1st and 4th sheet, as a dictionary of DataFrames.
250-
pd.read_excel('path_to_file.xls',sheetname=['Sheet1',3])
336+
Previous Behavior
251337

252-
- A ``verbose`` argument has been augmented in ``io.read_excel()``, defaults to False. Set to True to print sheet names as they are parsed. (:issue:`9450`)
253-
- Added ``StringMethods.ljust()`` and ``rjust()`` which behave as the same as standard ``str`` (:issue:`9352`)
254-
- ``StringMethods.pad()`` and ``center()`` now accept ``fillchar`` option to specify filling character (:issue:`9352`)
255-
- Added ``StringMethods.zfill()`` which behave as the same as standard ``str`` (:issue:`9387`)
338+
In [8]: s.ix[-1.0:2]
339+
TypeError: the slice start value [-1.0] is not a proper indexer for this index type (Int64Index)
256340

257-
Interaction with scipy.sparse
258-
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
341+
New Behavior
259342

260-
.. _whatsnew_0160.enhancements.sparse:
343+
.. ipython:: python
261344

262-
Added :meth:`SparseSeries.to_coo` and :meth:`SparseSeries.from_coo` methods (:issue:`8048`) for converting to and from ``scipy.sparse.coo_matrix`` instances (see :ref:`here <sparse.scipysparse>`). For example, given a SparseSeries with MultiIndex we can convert to a `scipy.sparse.coo_matrix` by specifying the row and column labels as index levels:
345+
In [8]: s.ix[-1.0:2]
346+
Out[2]:
347+
-1 1
348+
1 2
349+
2 3
350+
dtype: int64
263351

264-
.. ipython:: python
352+
- provide a useful exception for indexing with an invalid type for that index when using ``.loc``. For example, trying to use ``.loc`` with an integer (or a float) on an index of type ``DatetimeIndex``, ``PeriodIndex``, or ``TimedeltaIndex``.
265353

266-
from numpy import nan
267-
s = Series([3.0, nan, 1.0, 3.0, nan, nan])
268-
s.index = MultiIndex.from_tuples([(1, 2, 'a', 0),
269-
(1, 2, 'a', 1),
270-
(1, 1, 'b', 0),
271-
(1, 1, 'b', 1),
272-
(2, 1, 'b', 0),
273-
(2, 1, 'b', 1)],
274-
names=['A', 'B', 'C', 'D'])
354+
Previous Behavior
275355

276-
s
356+
.. code-block:: python
277357

278-
# SparseSeries
279-
ss = s.to_sparse()
280-
ss
358+
In [4]: df.loc[2:3]
359+
KeyError: 'start bound [2] is not the [index]'
281360

282-
A, rows, columns = ss.to_coo(row_levels=['A', 'B'],
283-
column_levels=['C', 'D'],
284-
sort_labels=False)
361+
New Behavior
285362

286-
A
287-
A.todense()
288-
rows
289-
columns
363+
.. code-block:: python
290364

291-
The from_coo method is a convenience method for creating a ``SparseSeries``
292-
from a ``scipy.sparse.coo_matrix``:
365+
In [4]: df.loc[2:3]
366+
TypeError: Cannot do slice indexing on <class 'pandas.tseries.index.DatetimeIndex'> with <type 'int'> keys
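The first change in this section, slice bounds that are missing from the index, can be sketched as below. This is an illustration against a current pandas release, not the code from this commit. The indexes are sorted; an unsorted index may behave differently. The variable names are hypothetical.

```python
import pandas as pd

s = pd.Series(range(5), index=[-2, -1, 1, 2, 3])

# The start bound -10 is not a label in the index; the slice still selects
# every label that falls between the two bounds.
sliced = s.loc[-10:3]

# Only slicing is relaxed: a single missing label is still an error.
try:
    s.loc[-10]
    missing_error = None
except KeyError as exc:
    missing_error = exc

# The same applies to a DatetimeIndex with a stop bound past the last entry.
df = pd.DataFrame({"A": range(5)},
                  index=pd.date_range("20130101", periods=5))
clipped = df.loc["2013-01-02":"2013-01-10"]
```

Here ``sliced`` contains the whole series and ``clipped`` stops at the last date actually present, which matches the "New Behavior" described for ``.loc`` slicing.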
293367

294-
.. ipython:: python
368+
Deprecations
369+
~~~~~~~~~~~~
295370

296-
from scipy import sparse
297-
A = sparse.coo_matrix(([3.0, 1.0, 2.0], ([1, 0, 0], [0, 2, 3])),
298-
shape=(3, 4))
299-
A
300-
A.todense()
371+
.. _whatsnew_0160.deprecations:
301372

302-
ss = SparseSeries.from_coo(A)
303-
ss
304373

305374
Performance
306375
~~~~~~~~~~~

pandas/core/generic.py

+2-2
Original file line numberDiff line numberDiff line change
@@ -1168,11 +1168,11 @@ def _clear_item_cache(self, i=None):
11681168
else:
11691169
self._item_cache.clear()
11701170

1171-
def _slice(self, slobj, axis=0, typ=None):
1171+
def _slice(self, slobj, axis=0, kind=None):
11721172
"""
11731173
Construct a slice of this container.
11741174
1175-
typ parameter is maintained for compatibility with Series slicing.
1175+
kind parameter is maintained for compatibility with Series slicing.
11761176
11771177
"""
11781178
axis = self._get_block_manager_axis(axis)
