Commit 3cf48d3

DEPR: deprecate .ix in favor of .loc/.iloc
closes pandas-dev#14218 closes pandas-dev#15116
1 parent 0fe491d commit 3cf48d3

79 files changed

+1597
-1366
lines changed

doc/source/advanced.rst

+6-17
Original file line numberDiff line numberDiff line change
@@ -230,7 +230,7 @@ of tuples:
230230
Advanced indexing with hierarchical index
231231
-----------------------------------------
232232

233-
Syntactically integrating ``MultiIndex`` in advanced indexing with ``.loc/.ix`` is a
233+
Syntactically integrating ``MultiIndex`` in advanced indexing with ``.loc`` is a
234234
bit challenging, but we've made every effort to do so. for example the
235235
following works as you would expect:
236236

@@ -258,7 +258,7 @@ Passing a list of labels or tuples works similar to reindexing:
258258

259259
.. ipython:: python
260260
261-
df.ix[[('bar', 'two'), ('qux', 'one')]]
261+
df.loc[[('bar', 'two'), ('qux', 'one')]]
262262
263263
.. _advanced.mi_slicers:
264264

@@ -604,7 +604,7 @@ intended to work on boolean indices and may return unexpected results.
604604
605605
ser = pd.Series(np.random.randn(10))
606606
ser.take([False, False, True, True])
607-
ser.ix[[0, 1]]
607+
ser.iloc[[0, 1]]
608608
609609
Finally, as a small note on performance, because the ``take`` method handles
610610
a narrower range of inputs, it can offer performance that is a good deal
@@ -620,7 +620,7 @@ faster than fancy indexing.
620620
timeit arr.take(indexer, axis=0)
621621

622622
ser = pd.Series(arr[:, 0])
623-
timeit ser.ix[indexer]
623+
timeit ser.iloc[indexer]
624624
timeit ser.take(indexer)
625625

626626
.. _indexing.index_types:
@@ -661,7 +661,7 @@ Setting the index, will create create a ``CategoricalIndex``
661661
df2 = df.set_index('B')
662662
df2.index
663663
664-
Indexing with ``__getitem__/.iloc/.loc/.ix`` works similarly to an ``Index`` with duplicates.
664+
Indexing with ``__getitem__/.iloc/.loc`` works similarly to an ``Index`` with duplicates.
665665
The indexers MUST be in the category or the operation will raise.
666666

667667
.. ipython:: python
@@ -759,14 +759,12 @@ same.
759759
sf = pd.Series(range(5), index=indexf)
760760
sf
761761
762-
Scalar selection for ``[],.ix,.loc`` will always be label based. An integer will match an equal float index (e.g. ``3`` is equivalent to ``3.0``)
762+
Scalar selection for ``[],.loc`` will always be label based. An integer will match an equal float index (e.g. ``3`` is equivalent to ``3.0``)
763763
764764
.. ipython:: python
765765
766766
sf[3]
767767
sf[3.0]
768-
sf.ix[3]
769-
sf.ix[3.0]
770768
sf.loc[3]
771769
sf.loc[3.0]
772770
@@ -783,7 +781,6 @@ Slicing is ALWAYS on the values of the index, for ``[],ix,loc`` and ALWAYS posit
783781
.. ipython:: python
784782
785783
sf[2:4]
786-
sf.ix[2:4]
787784
sf.loc[2:4]
788785
sf.iloc[2:4]
789786
@@ -813,14 +810,6 @@ In non-float indexes, slicing using floats will raise a ``TypeError``
813810
In [3]: pd.Series(range(5)).iloc[3.0]
814811
TypeError: cannot do positional indexing on <class 'pandas.indexes.range.RangeIndex'> with these indexers [3.0] of <type 'float'>
815812
816-
Further the treatment of ``.ix`` with a float indexer on a non-float index, will be label based, and thus coerce the index.
817-
818-
.. ipython:: python
819-
820-
s2 = pd.Series([1, 2, 3], index=list('abc'))
821-
s2
822-
s2.ix[1.0] = 10
823-
s2
824813
825814
Here is a typical use-case for using this type of indexing. Imagine that you have a somewhat
826815
irregular timedelta-like indexing scheme, but the data is recorded as floats. This could for

doc/source/gotchas.rst

+2-75
@@ -214,27 +214,6 @@ and traded integer ``NA`` capability for a much simpler approach of using a
214214
special value in float and object arrays to denote ``NA``, and promoting
215215
integer arrays to floating when NAs must be introduced.
216216

217-
Integer indexing
218-
----------------
219-
220-
Label-based indexing with integer axis labels is a thorny topic. It has been
221-
discussed heavily on mailing lists and among various members of the scientific
222-
Python community. In pandas, our general viewpoint is that labels matter more
223-
than integer locations. Therefore, with an integer axis index *only*
224-
label-based indexing is possible with the standard tools like ``.ix``. The
225-
following code will generate exceptions:
226-
227-
.. code-block:: python
228-
229-
s = pd.Series(range(5))
230-
s[-1]
231-
df = pd.DataFrame(np.random.randn(5, 4))
232-
df
233-
df.ix[-2:]
234-
235-
This deliberate decision was made to prevent ambiguities and subtle bugs (many
236-
users reported finding bugs when the API change was made to stop "falling back"
237-
on position-based indexing).
238217

239218
Label-based slicing conventions
240219
-------------------------------
@@ -305,15 +284,15 @@ index can be somewhat complicated. For example, the following does not work:
305284

306285
::
307286

308-
s.ix['c':'e'+1]
287+
s.loc['c':'e'+1]
309288

310289
A very common use case is to limit a time series to start and end at two
311290
specific dates. To enable this, we made the design design to make label-based
312291
slicing include both endpoints:
313292

314293
.. ipython:: python
315294
316-
s.ix['c':'e']
295+
s.loc['c':'e']
317296
318297
This is most definitely a "practicality beats purity" sort of thing, but it is
319298
something to watch out for if you expect label-based slicing to behave exactly
@@ -322,58 +301,6 @@ in the way that standard Python integer slicing works.
322301
Miscellaneous indexing gotchas
323302
------------------------------
324303

325-
Reindex versus ix gotchas
326-
~~~~~~~~~~~~~~~~~~~~~~~~~
327-
328-
Many users will find themselves using the ``ix`` indexing capabilities as a
329-
concise means of selecting data from a pandas object:
330-
331-
.. ipython:: python
332-
333-
df = pd.DataFrame(np.random.randn(6, 4), columns=['one', 'two', 'three', 'four'],
334-
index=list('abcdef'))
335-
df
336-
df.ix[['b', 'c', 'e']]
337-
338-
This is, of course, completely equivalent *in this case* to using the
339-
``reindex`` method:
340-
341-
.. ipython:: python
342-
343-
df.reindex(['b', 'c', 'e'])
344-
345-
Some might conclude that ``ix`` and ``reindex`` are 100% equivalent based on
346-
this. This is indeed true **except in the case of integer indexing**. For
347-
example, the above operation could alternately have been expressed as:
348-
349-
.. ipython:: python
350-
351-
df.ix[[1, 2, 4]]
352-
353-
If you pass ``[1, 2, 4]`` to ``reindex`` you will get another thing entirely:
354-
355-
.. ipython:: python
356-
357-
df.reindex([1, 2, 4])
358-
359-
So it's important to remember that ``reindex`` is **strict label indexing
360-
only**. This can lead to some potentially surprising results in pathological
361-
cases where an index contains, say, both integers and strings:
362-
363-
.. ipython:: python
364-
365-
s = pd.Series([1, 2, 3], index=['a', 0, 1])
366-
s
367-
s.ix[[0, 1]]
368-
s.reindex([0, 1])
369-
370-
Because the index in this case does not contain solely integers, ``ix`` falls
371-
back on integer indexing. By contrast, ``reindex`` only looks for the values
372-
passed in the index, thus finding the integers ``0`` and ``1``. While it would
373-
be possible to insert some logic to check whether a passed sequence is all
374-
contained in the index, that logic would exact a very high cost in large data
375-
sets.
376-
377304
Reindex potentially changes underlying Series dtype
378305
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
379306

doc/source/indexing.rst

+51-18
@@ -61,6 +61,8 @@ See the :ref:`MultiIndex / Advanced Indexing <advanced>` for ``MultiIndex`` and
6161

6262
See the :ref:`cookbook<cookbook.selection>` for some advanced strategies
6363

64+
.. _indexing.choice:
65+
6466
Different Choices for Indexing
6567
------------------------------
6668

@@ -104,24 +106,13 @@ of multi-axis indexing.
104106

105107
See more at :ref:`Selection by Position <indexing.integer>`
106108

107-
- ``.ix`` supports mixed integer and label based access. It is primarily label
108-
based, but will fall back to integer positional access unless the corresponding
109-
axis is of integer type. ``.ix`` is the most general and will
110-
support any of the inputs in ``.loc`` and ``.iloc``. ``.ix`` also supports floating point
111-
label schemes. ``.ix`` is exceptionally useful when dealing with mixed positional
112-
and label based hierarchical indexes.
113-
114-
However, when an axis is integer based, ONLY
115-
label based access and not positional access is supported.
116-
Thus, in such cases, it's usually better to be explicit and use ``.iloc`` or ``.loc``.
117-
118109
See more at :ref:`Advanced Indexing <advanced>` and :ref:`Advanced
119110
Hierarchical <advanced.advanced_hierarchical>`.
120111

121-
- ``.loc``, ``.iloc``, ``.ix`` and also ``[]`` indexing can accept a ``callable`` as indexer. See more at :ref:`Selection By Callable <indexing.callable>`.
112+
- ``.loc``, ``.iloc``, and also ``[]`` indexing can accept a ``callable`` as indexer. See more at :ref:`Selection By Callable <indexing.callable>`.
122113

123114
Getting values from an object with multi-axes selection uses the following
124-
notation (using ``.loc`` as an example, but applies to ``.iloc`` and ``.ix`` as
115+
notation (using ``.loc`` as an example, but applies to ``.iloc`` as
125116
well). Any of the axes accessors may be the null slice ``:``. Axes left out of
126117
the specification are assumed to be ``:``. (e.g. ``p.loc['a']`` is equiv to
127118
``p.loc['a', :, :]``)
@@ -135,6 +126,48 @@ the specification are assumed to be ``:``. (e.g. ``p.loc['a']`` is equiv to
135126
DataFrame; ``df.loc[row_indexer,column_indexer]``
136127
Panel; ``p.loc[item_indexer,major_indexer,minor_indexer]``
137128

129+
.. _indexing.deprecate_ix:
130+
131+
IX Indexer is Deprecated
132+
------------------------
133+
134+
.. warning::
135+
136+
Starting in 0.20.0, the ``.ix`` indexer is deprecated in favor of the stricter ``.iloc`` and ``.loc`` indexers. ``.ix`` offers a lot of magic on the inference of what the user wants to do. To wit, ``.ix`` can decide to index *positionally* OR via *labels*. This has caused quite a bit of user confusion over the years.
137+
138+
139+
The recommended methods of indexing are:
140+
141+
.. ipython:: python
142+
143+
dfd = pd.DataFrame({'A': [1, 2, 3],
144+
'B': [4, 5, 6]},
145+
index=list('abc'))
146+
147+
dfd
148+
149+
Previous Behavior, where you wish to get the 0th and the 2nd elements from the index in the 'A' column.
150+
151+
.. code-block:: ipython
152+
153+
In [3]: dfd.ix[[0, 2], 'A']
154+
Out[3]:
155+
a 1
156+
c 3
157+
Name: A, dtype: int64
158+
159+
Using ``.loc``. Here we will select the appropriate indexes from the index, then use *label* indexing.
160+
161+
.. ipython:: python
162+
163+
dfd.loc[dfd.index[[0, 2]], 'A']
164+
165+
Using ``.iloc``. Here we will get the location of the 'A' column, then use *positional* indexing to select things.
166+
167+
.. ipython:: python
168+
169+
dfd.iloc[[0, 2], dfd.columns.get_loc('A')]
170+
138171
.. _indexing.basics:
139172

140173
Basics
@@ -193,7 +226,7 @@ columns.
193226

194227
.. warning::
195228

196-
pandas aligns all AXES when setting ``Series`` and ``DataFrame`` from ``.loc``, ``.iloc`` and ``.ix``.
229+
pandas aligns all AXES when setting ``Series`` and ``DataFrame`` from ``.loc`` and ``.iloc``.
197230

198231
This will **not** modify ``df`` because the column alignment is before value assignment.
199232

@@ -526,7 +559,7 @@ Selection By Callable
526559

527560
.. versionadded:: 0.18.1
528561

529-
``.loc``, ``.iloc``, ``.ix`` and also ``[]`` indexing can accept a ``callable`` as indexer.
562+
``.loc``, ``.iloc``, and also ``[]`` indexing can accept a ``callable`` as indexer.
530563
The ``callable`` must be a function with one argument (the calling Series, DataFrame or Panel) and that returns valid output for indexing.
531564

532565
.. ipython:: python
@@ -641,7 +674,7 @@ Setting With Enlargement
641674

642675
.. versionadded:: 0.13
643676

644-
The ``.loc/.ix/[]`` operations can perform enlargement when setting a non-existant key for that axis.
677+
The ``.loc/[]`` operations can perform enlargement when setting a non-existent key for that axis.
645678

646679
In the ``Series`` case this is effectively an appending operation
647680

@@ -906,7 +939,7 @@ without creating a copy:
906939

907940
Furthermore, ``where`` aligns the input boolean condition (ndarray or DataFrame),
908941
such that partial selection with setting is possible. This is analogous to
909-
partial setting via ``.ix`` (but on the contents rather than the axis labels)
942+
partial setting via ``.loc`` (but on the contents rather than the axis labels)
910943

911944
.. ipython:: python
912945
@@ -1716,7 +1749,7 @@ A chained assignment can also crop up in setting in a mixed dtype frame.
17161749

17171750
.. note::
17181751

1719-
These setting rules apply to all of ``.loc/.iloc/.ix``
1752+
These setting rules apply to all of ``.loc/.iloc``
17201753

17211754
This is the correct access method
17221755

doc/source/whatsnew/v0.20.0.txt

+48
@@ -10,6 +10,7 @@ users upgrade to this version.
1010
Highlights include:
1111

1212
- Building pandas for development now requires ``cython >= 0.23`` (:issue:`14831`)
13+
- The ``.ix`` indexer has been deprecated, see :ref:`here <whatsnew.api_breaking.deprecate_ix>`
1314

1415
Check the :ref:`API Changes <whatsnew_0200.api_breaking>` and :ref:`deprecations <whatsnew_0200.deprecations>` before updating.
1516

@@ -122,6 +123,53 @@ Other enhancements
122123
Backwards incompatible API changes
123124
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
124125

126+
127+
.. _whatsnew.api_breaking.deprecate_ix:
128+
129+
Deprecate .ix
130+
^^^^^^^^^^^^^
131+
132+
The ``.ix`` indexer is deprecated in favor of the stricter ``.iloc`` and ``.loc`` indexers. ``.ix`` offers a lot of magic on the inference of what the user wants to do. To wit, ``.ix`` can decide to index *positionally* OR via *labels*. This has caused quite a bit of user confusion over the years. The full indexing documentation is :ref:`here <indexing>`. (:issue:`14218`)
133+
134+
135+
The recommended methods of indexing are:
136+
137+
- ``.loc`` if you want to *label* index
138+
- ``.iloc`` if you want to *positionally* index.
139+
140+
Using ``.ix`` will now show a deprecation warning with a mini-example of how to convert code.
141+
142+
.. ipython:: python
143+
144+
df = pd.DataFrame({'A': [1, 2, 3],
145+
'B': [4, 5, 6]},
146+
index=list('abc'))
147+
148+
df
149+
150+
Previous Behavior, where you wish to get the 0th and the 2nd elements from the index in the 'A' column.
151+
152+
.. code-block:: ipython
153+
154+
In [3]: df.ix[[0, 2], 'A']
155+
Out[3]:
156+
a 1
157+
c 3
158+
Name: A, dtype: int64
159+
160+
Using ``.loc``. Here we will select the appropriate indexes from the index, then use *label* indexing.
161+
162+
.. ipython:: python
163+
164+
df.loc[df.index[[0, 2]], 'A']
165+
166+
Using ``.iloc``. Here we will get the location of the 'A' column, then use *positional* indexing to select things.
167+
168+
.. ipython:: python
169+
170+
df.iloc[[0, 2], df.columns.get_loc('A')]
171+
172+
125173
.. _whatsnew.api_breaking.index_map:
126174

127175
Map on Index types now return other Index types
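
For readers migrating code, the ``.ix`` conversion documented in this commit can be sketched as follows. This is an illustrative sketch and not part of the commit itself: the frame mirrors the ``dfd`` example in the diff, and the names ``by_label`` and ``by_position`` are assumptions introduced here.

```python
import pandas as pd

# Same small frame as the documentation example in the diff.
dfd = pd.DataFrame({'A': [1, 2, 3],
                    'B': [4, 5, 6]},
                   index=list('abc'))

# Label-based route: turn positions 0 and 2 into row labels first,
# then select with .loc using labels on both axes.
by_label = dfd.loc[dfd.index[[0, 2]], 'A']

# Position-based route: turn the column label 'A' into its integer
# position first, then select with .iloc using positions on both axes.
by_position = dfd.iloc[[0, 2], dfd.columns.get_loc('A')]

# Both routes should select the same rows of column 'A'.
print(by_label.equals(by_position))
```

Either form makes the intent explicit: the caller decides up front whether an axis is addressed by label or by position, instead of leaving that inference to the indexer.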
