Commit 081c06b

DEPR: enforce indexing deprecations (#49511)
* DEPR: enforce deprecation of string indexing on DataFrame rows
* DEPR: set, dict indexers, DataFrame indexer in iloc
* DEPR: disallow passing list to xs
* update doc
* update docstring, typo fixup
1 parent 6dc92ad commit 081c06b

17 files changed

+92
-136
lines changed

doc/source/whatsnew/v0.11.0.rst

+1-1
@@ -368,7 +368,7 @@ Enhancements
 - You can now select with a string from a DataFrame with a datelike index, in a similar way to a Series (:issue:`3070`)
 
   .. ipython:: python
-     :okwarning:
+     :okexcept:
 
      idx = pd.date_range("2001-10-1", periods=5, freq='M')
      ts = pd.Series(np.random.rand(len(idx)), index=idx)

doc/source/whatsnew/v2.0.0.rst

+4
@@ -311,6 +311,7 @@ Removal of prior version deprecations/changes
 - Removed argument ``kind`` from :meth:`Index.get_slice_bound`, :meth:`Index.slice_indexer` and :meth:`Index.slice_locs` (:issue:`41378`)
 - Removed arguments ``prefix``, ``squeeze``, ``error_bad_lines`` and ``warn_bad_lines`` from :func:`read_csv` (:issue:`40413`, :issue:`43427`)
 - Removed argument ``datetime_is_numeric`` from :meth:`DataFrame.describe` and :meth:`Series.describe` as datetime data will always be summarized as numeric data (:issue:`34798`)
+- Disallow passing list ``key`` to :meth:`Series.xs` and :meth:`DataFrame.xs`, pass a tuple instead (:issue:`41789`)
 - Disallow subclass-specific keywords (e.g. "freq", "tz", "names", "closed") in the :class:`Index` constructor (:issue:`38597`)
 - Removed argument ``inplace`` from :meth:`Categorical.remove_unused_categories` (:issue:`37918`)
 - Disallow passing non-round floats to :class:`Timestamp` with ``unit="M"`` or ``unit="Y"`` (:issue:`47266`)
@@ -388,6 +389,8 @@ Removal of prior version deprecations/changes
 - Enforced disallowing a tuple of column labels into :meth:`.DataFrameGroupBy.__getitem__` (:issue:`30546`)
 - Enforced disallowing setting values with ``.loc`` using a positional slice. Use ``.loc`` with labels or ``.iloc`` with positions instead (:issue:`31840`)
 - Enforced disallowing positional indexing with a ``float`` key even if that key is a round number, manually cast to integer instead (:issue:`34193`)
+- Enforced disallowing using a :class:`DataFrame` indexer with ``.iloc``, use ``.loc`` instead for automatic alignment (:issue:`39022`)
+- Enforced disallowing ``set`` or ``dict`` indexers in ``__getitem__`` and ``__setitem__`` methods (:issue:`42825`)
 - Enforced disallowing indexing on a :class:`Index` or positional indexing on a :class:`Series` producing multi-dimensional objects e.g. ``obj[:, None]``, convert to numpy before indexing instead (:issue:`35141`)
 - Enforced disallowing ``dict`` or ``set`` objects in ``suffixes`` in :func:`merge` (:issue:`34810`)
 - Enforced disallowing :func:`merge` to produce duplicated columns through the ``suffixes`` keyword and already existing columns (:issue:`22818`)
@@ -399,6 +402,7 @@ Removal of prior version deprecations/changes
 - Enforced :meth:`Rolling.count` with ``min_periods=None`` to default to the size of the window (:issue:`31302`)
 - Renamed ``fname`` to ``path`` in :meth:`DataFrame.to_parquet`, :meth:`DataFrame.to_stata` and :meth:`DataFrame.to_feather` (:issue:`30338`)
 - Enforced disallowing indexing a :class:`Series` with a single item list with a slice (e.g. ``ser[[slice(0, 2)]]``). Either convert the list to tuple, or pass the slice directly instead (:issue:`31333`)
+- Changed behavior indexing on a :class:`DataFrame` with a :class:`DatetimeIndex` index using a string indexer, previously this operated as a slice on rows, now it operates like any other column key; use ``frame.loc[key]`` for the old behavior (:issue:`36179`)
 - Enforced the ``display.max_colwidth`` option to not accept negative integers (:issue:`31569`)
 - Removed the ``display.column_space`` option in favor of ``df.to_string(col_space=...)`` (:issue:`47280`)
 - Removed the deprecated method ``mad`` from pandas classes (:issue:`11787`)

pandas/core/frame.py

+10-12
@@ -187,8 +187,7 @@
 )
 from pandas.core.indexing import (
     check_bool_indexer,
-    check_deprecated_indexers,
-    convert_to_index_sliceable,
+    check_dict_or_set_indexers,
 )
 from pandas.core.internals import (
     ArrayManager,
@@ -3703,7 +3702,7 @@ def _iter_column_arrays(self) -> Iterator[ArrayLike]:
             yield self._get_column_array(i)
 
     def __getitem__(self, key):
-        check_deprecated_indexers(key)
+        check_dict_or_set_indexers(key)
         key = lib.item_from_zerodim(key)
         key = com.apply_if_callable(key, self)
 
@@ -3723,17 +3722,18 @@ def __getitem__(self, key):
             elif is_mi and self.columns.is_unique and key in self.columns:
                 return self._getitem_multilevel(key)
         # Do we have a slicer (on rows)?
-        indexer = convert_to_index_sliceable(self, key)
-        if indexer is not None:
+        if isinstance(key, slice):
+            indexer = self.index._convert_slice_indexer(
+                key, kind="getitem", is_frame=True
+            )
             if isinstance(indexer, np.ndarray):
+                # reachable with DatetimeIndex
                 indexer = lib.maybe_indices_to_slice(
                     indexer.astype(np.intp, copy=False), len(self)
                 )
                 if isinstance(indexer, np.ndarray):
                     # GH#43223 If we can not convert, use take
                     return self.take(indexer, axis=0)
-            # either we have a slice or we have a string that can be converted
-            # to a slice for partial-string date indexing
             return self._slice(indexer, axis=0)
 
         # Do we have a (boolean) DataFrame?
@@ -3903,11 +3903,9 @@ def __setitem__(self, key, value):
         key = com.apply_if_callable(key, self)
 
         # see if we can slice the rows
-        indexer = convert_to_index_sliceable(self, key)
-        if indexer is not None:
-            # either we have a slice or we have a string that can be converted
-            # to a slice for partial-string date indexing
-            return self._setitem_slice(indexer, value)
+        if isinstance(key, slice):
+            slc = self.index._convert_slice_indexer(key, kind="getitem", is_frame=True)
+            return self._setitem_slice(slc, value)
 
         if isinstance(key, DataFrame) or getattr(key, "ndim", None) == 2:
             self._setitem_frame(key, value)

pandas/core/generic.py

+1-6
@@ -3908,12 +3908,7 @@ class animal locomotion
         labels = self._get_axis(axis)
 
         if isinstance(key, list):
-            warnings.warn(
-                "Passing lists as key for xs is deprecated and will be removed in a "
-                "future version. Pass key as a tuple instead.",
-                FutureWarning,
-                stacklevel=find_stack_level(),
-            )
+            raise TypeError("list keys are not supported in xs, pass a tuple instead")
 
         if level is not None:
             if not isinstance(labels, MultiIndex):
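
For callers migrating off list keys, the `xs` change can be illustrated with a small sketch. It assumes pandas 2.0 or later is installed; the frame, its level names, and the `list_key_error` variable are made up for illustration and are not part of this commit.

```python
import pandas as pd

# Hypothetical two-level frame, loosely modelled on the xs tests in this commit.
mi = pd.MultiIndex.from_tuples(
    [(2008, "sat"), (2008, "sun"), (2009, "sat")], names=["year", "day"]
)
df = pd.DataFrame({"value": [1, 2, 3]}, index=mi)

# A tuple key is the supported way to select on several levels at once.
result = df.xs((2008, "sat"), level=["year", "day"], drop_level=False)

# A list key is rejected up front (assuming pandas >= 2.0).
try:
    df.xs([2008, "sat"], level=["year", "day"])
    list_key_error = None
except TypeError as err:
    list_key_error = str(err)

print(result)
print(list_key_error)
```

Code that builds keys dynamically can wrap them with `tuple(key)` before calling `xs`, which is what the updated tests in this commit do.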

pandas/core/indexing.py

+16-56
@@ -687,7 +687,7 @@ def _get_setitem_indexer(self, key):
 
         if isinstance(key, tuple):
             for x in key:
-                check_deprecated_indexers(x)
+                check_dict_or_set_indexers(x)
 
         if self.axis is not None:
             key = _tupleize_axis_indexer(self.ndim, self.axis, key)
@@ -813,7 +813,7 @@ def _ensure_listlike_indexer(self, key, axis=None, value=None) -> None:
 
     @final
     def __setitem__(self, key, value) -> None:
-        check_deprecated_indexers(key)
+        check_dict_or_set_indexers(key)
         if isinstance(key, tuple):
             key = tuple(list(x) if is_iterator(x) else x for x in key)
             key = tuple(com.apply_if_callable(x, self.obj) for x in key)
@@ -1004,7 +1004,7 @@ def _getitem_nested_tuple(self, tup: tuple):
         # we should be able to match up the dimensionality here
 
         for key in tup:
-            check_deprecated_indexers(key)
+            check_dict_or_set_indexers(key)
 
         # we have too many indexers for our dim, but have at least 1
         # multi-index dimension, try to see if we have something like
@@ -1062,7 +1062,7 @@ def _convert_to_indexer(self, key, axis: AxisInt):
 
     @final
     def __getitem__(self, key):
-        check_deprecated_indexers(key)
+        check_dict_or_set_indexers(key)
         if type(key) is tuple:
             key = tuple(list(x) if is_iterator(x) else x for x in key)
             key = tuple(com.apply_if_callable(x, self.obj) for x in key)
@@ -1499,12 +1499,9 @@ def _has_valid_setitem_indexer(self, indexer) -> bool:
             raise IndexError("iloc cannot enlarge its target object")
 
         if isinstance(indexer, ABCDataFrame):
-            warnings.warn(
-                "DataFrame indexer for .iloc is deprecated and will be removed in "
-                "a future version.\n"
-                "consider using .loc with a DataFrame indexer for automatic alignment.",
-                FutureWarning,
-                stacklevel=find_stack_level(),
+            raise TypeError(
+                "DataFrame indexer for .iloc is not supported. "
+                "Consider using .loc with a DataFrame indexer for automatic alignment.",
             )
 
         if not isinstance(indexer, tuple):
@@ -2493,40 +2490,6 @@ def _tupleize_axis_indexer(ndim: int, axis: AxisInt, key) -> tuple:
     return tuple(new_key)
 
 
-def convert_to_index_sliceable(obj: DataFrame, key):
-    """
-    If we are index sliceable, then return my slicer, otherwise return None.
-    """
-    idx = obj.index
-    if isinstance(key, slice):
-        return idx._convert_slice_indexer(key, kind="getitem", is_frame=True)
-
-    elif isinstance(key, str):
-
-        # we are an actual column
-        if key in obj.columns:
-            return None
-
-        # We might have a datetimelike string that we can translate to a
-        # slice here via partial string indexing
-        if idx._supports_partial_string_indexing:
-            try:
-                res = idx._get_string_slice(str(key))
-                warnings.warn(
-                    "Indexing a DataFrame with a datetimelike index using a single "
-                    "string to slice the rows, like `frame[string]`, is deprecated "
-                    "and will be removed in a future version. Use `frame.loc[string]` "
-                    "instead.",
-                    FutureWarning,
-                    stacklevel=find_stack_level(),
-                )
-                return res
-            except (KeyError, ValueError, NotImplementedError):
-                return None
-
-    return None
-
-
 def check_bool_indexer(index: Index, key) -> np.ndarray:
     """
     Check if key is a valid boolean indexer for an object with such index and
@@ -2661,27 +2624,24 @@ def need_slice(obj: slice) -> bool:
     )
 
 
-def check_deprecated_indexers(key) -> None:
-    """Checks if the key is a deprecated indexer."""
+def check_dict_or_set_indexers(key) -> None:
+    """
+    Check if the indexer is or contains a dict or set, which is no longer allowed.
+    """
     if (
         isinstance(key, set)
         or isinstance(key, tuple)
         and any(isinstance(x, set) for x in key)
     ):
-        warnings.warn(
-            "Passing a set as an indexer is deprecated and will raise in "
-            "a future version. Use a list instead.",
-            FutureWarning,
-            stacklevel=find_stack_level(),
+        raise TypeError(
+            "Passing a set as an indexer is not supported. Use a list instead."
         )
+
     if (
         isinstance(key, dict)
         or isinstance(key, tuple)
         and any(isinstance(x, dict) for x in key)
     ):
-        warnings.warn(
-            "Passing a dict as an indexer is deprecated and will raise in "
-            "a future version. Use a list instead.",
-            FutureWarning,
-            stacklevel=find_stack_level(),
+        raise TypeError(
+            "Passing a dict as an indexer is not supported. Use a list instead."
         )
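
The set/dict check above can be exercised without pandas. The following is a minimal standalone sketch of the same logic, not the pandas implementation itself: the names `reject_dict_or_set` and `is_rejected` are hypothetical, and the two near-identical branches are folded into one loop for brevity.

```python
def reject_dict_or_set(key) -> None:
    """Raise TypeError if key is, or is a tuple directly containing, a set or dict."""
    for kind in (set, dict):
        # Only one level of tuple nesting is inspected, as in the diff above.
        nested = isinstance(key, tuple) and any(isinstance(x, kind) for x in key)
        if isinstance(key, kind) or nested:
            raise TypeError(
                f"Passing a {kind.__name__} as an indexer is not supported. "
                "Use a list instead."
            )


def is_rejected(key) -> bool:
    """Demo helper (hypothetical): True when reject_dict_or_set raises."""
    try:
        reject_dict_or_set(key)
    except TypeError:
        return True
    return False


# The same key shapes the updated tests parametrize over, plus two allowed ones.
for key in [{1}, {1: 1}, ({1}, "a"), (1, {"a": "a"}), [1], (1, "a")]:
    print(repr(key), is_rejected(key))
```

Because the check runs at the top of `__getitem__` and `__setitem__`, converting the key with `list(key)` before indexing is the usual migration when the set's ordering does not matter.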

pandas/core/series.py

+3-3
@@ -139,7 +139,7 @@
 from pandas.core.indexes.multi import maybe_droplevels
 from pandas.core.indexing import (
     check_bool_indexer,
-    check_deprecated_indexers,
+    check_dict_or_set_indexers,
 )
 from pandas.core.internals import (
     SingleArrayManager,
@@ -913,7 +913,7 @@ def _slice(self, slobj: slice, axis: Axis = 0) -> Series:
         return self._get_values(slobj)
 
     def __getitem__(self, key):
-        check_deprecated_indexers(key)
+        check_dict_or_set_indexers(key)
         key = com.apply_if_callable(key, self)
 
         if key is Ellipsis:
@@ -1056,7 +1056,7 @@ def _get_value(self, label, takeable: bool = False):
         return self.iloc[loc]
 
     def __setitem__(self, key, value) -> None:
-        check_deprecated_indexers(key)
+        check_dict_or_set_indexers(key)
         key = com.apply_if_callable(key, self)
         cacher_needs_updating = self._check_is_chained_assignment_possible()
 

pandas/tests/frame/indexing/test_getitem.py

+6-4
@@ -142,8 +142,10 @@ def test_getitem_listlike(self, idx_type, levels, float_frame):
         idx_check = list(idx_type(keys))
 
         if isinstance(idx, (set, dict)):
-            with tm.assert_produces_warning(FutureWarning):
-                result = frame[idx]
+            with pytest.raises(TypeError, match="as an indexer is not supported"):
+                frame[idx]
+
+            return
         else:
             result = frame[idx]
 
@@ -467,9 +469,9 @@ def test_getitem_datetime_slice(self):
 class TestGetitemDeprecatedIndexers:
     @pytest.mark.parametrize("key", [{"a", "b"}, {"a": "a"}])
     def test_getitem_dict_and_set_deprecated(self, key):
-        # GH#42825
+        # GH#42825 enforced in 2.0
         df = DataFrame(
             [[1, 2], [3, 4]], columns=MultiIndex.from_tuples([("a", 1), ("b", 2)])
         )
-        with tm.assert_produces_warning(FutureWarning):
+        with pytest.raises(TypeError, match="as an indexer is not supported"):
             df[key]

pandas/tests/frame/indexing/test_indexing.py

+11-11
@@ -1711,14 +1711,14 @@ def test_loc_on_multiindex_one_level(self):
         tm.assert_frame_equal(result, expected)
 
 
-class TestDepreactedIndexers:
+class TestDeprecatedIndexers:
     @pytest.mark.parametrize(
         "key", [{1}, {1: 1}, ({1}, "a"), ({1: 1}, "a"), (1, {"a"}), (1, {"a": "a"})]
     )
     def test_getitem_dict_and_set_deprecated(self, key):
-        # GH#42825
+        # GH#42825 enforced in 2.0
         df = DataFrame([[1, 2], [3, 4]], columns=["a", "b"])
-        with tm.assert_produces_warning(FutureWarning):
+        with pytest.raises(TypeError, match="as an indexer is not supported"):
             df.loc[key]
 
     @pytest.mark.parametrize(
@@ -1733,22 +1733,22 @@ def test_getitem_dict_and_set_deprecated(self, key):
         ],
     )
     def test_getitem_dict_and_set_deprecated_multiindex(self, key):
-        # GH#42825
+        # GH#42825 enforced in 2.0
         df = DataFrame(
             [[1, 2], [3, 4]],
             columns=["a", "b"],
             index=MultiIndex.from_tuples([(1, 2), (3, 4)]),
         )
-        with tm.assert_produces_warning(FutureWarning):
+        with pytest.raises(TypeError, match="as an indexer is not supported"):
             df.loc[key]
 
     @pytest.mark.parametrize(
         "key", [{1}, {1: 1}, ({1}, "a"), ({1: 1}, "a"), (1, {"a"}), (1, {"a": "a"})]
     )
-    def test_setitem_dict_and_set_deprecated(self, key):
-        # GH#42825
+    def test_setitem_dict_and_set_disallowed(self, key):
+        # GH#42825 enforced in 2.0
         df = DataFrame([[1, 2], [3, 4]], columns=["a", "b"])
-        with tm.assert_produces_warning(FutureWarning):
+        with pytest.raises(TypeError, match="as an indexer is not supported"):
             df.loc[key] = 1
 
     @pytest.mark.parametrize(
@@ -1762,12 +1762,12 @@ def test_setitem_dict_and_set_deprecated(self, key):
             ((1, 2), {"a": "a"}),
         ],
     )
-    def test_setitem_dict_and_set_deprecated_multiindex(self, key):
-        # GH#42825
+    def test_setitem_dict_and_set_disallowed_multiindex(self, key):
+        # GH#42825 enforced in 2.0
         df = DataFrame(
             [[1, 2], [3, 4]],
             columns=["a", "b"],
             index=MultiIndex.from_tuples([(1, 2), (3, 4)]),
         )
-        with tm.assert_produces_warning(FutureWarning):
+        with pytest.raises(TypeError, match="as an indexer is not supported"):
             df.loc[key] = 1

pandas/tests/frame/indexing/test_xs.py

+4-7
@@ -107,8 +107,7 @@ def test_xs_keep_level(self):
         expected = df[:1]
         tm.assert_frame_equal(result, expected)
 
-        with tm.assert_produces_warning(FutureWarning):
-            result = df.xs([2008, "sat"], level=["year", "day"], drop_level=False)
+        result = df.xs((2008, "sat"), level=["year", "day"], drop_level=False)
         tm.assert_frame_equal(result, expected)
 
     def test_xs_view(self, using_array_manager, using_copy_on_write):
@@ -225,8 +224,7 @@ def test_xs_with_duplicates(self, key, level, multiindex_dataframe_random_data):
         expected = concat([frame.xs("one", level="second")] * 2)
 
         if isinstance(key, list):
-            with tm.assert_produces_warning(FutureWarning):
-                result = df.xs(key, level=level)
+            result = df.xs(tuple(key), level=level)
         else:
             result = df.xs(key, level=level)
         tm.assert_frame_equal(result, expected)
@@ -412,6 +410,5 @@ def test_xs_list_indexer_droplevel_false(self):
         # GH#41760
         mi = MultiIndex.from_tuples([("x", "m", "a"), ("x", "n", "b"), ("y", "o", "c")])
         df = DataFrame([[1, 2, 3], [4, 5, 6]], columns=mi)
-        with tm.assert_produces_warning(FutureWarning):
-            with pytest.raises(KeyError, match="y"):
-                df.xs(["x", "y"], drop_level=False, axis=1)
+        with pytest.raises(KeyError, match="y"):
+            df.xs(("x", "y"), drop_level=False, axis=1)

pandas/tests/indexes/datetimes/test_partial_slicing.py

+4-6
@@ -295,12 +295,10 @@ def test_partial_slicing_dataframe(self):
                     expected = df["a"][theslice]
                     tm.assert_series_equal(result, expected)
 
-                    # Frame should return slice as well
-                    with tm.assert_produces_warning(FutureWarning):
-                        # GH#36179 deprecated this indexing
-                        result = df[ts_string]
-                    expected = df[theslice]
-                    tm.assert_frame_equal(result, expected)
+                    # pre-2.0 df[ts_string] was overloaded to interpret this
+                    # as slicing along index
+                    with pytest.raises(KeyError, match=ts_string):
+                        df[ts_string]
 
             # Timestamp with resolution more precise than index
             # Compatible with existing key
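
The behavior change covered by this test can be reproduced in a few lines. This is a sketch assuming pandas 2.0 or later; the frame, the date range, and the variable names are illustrative and not taken from the test suite.

```python
import pandas as pd

# Hypothetical daily index straddling a month boundary: Sep 29 through Oct 3.
idx = pd.date_range("2001-09-29", periods=5, freq="D")
df = pd.DataFrame({"a": range(5)}, index=idx)

# Partial-string row selection still works through .loc.
rows = df.loc["2001-10"]

# Plain frame[string] is now treated as a column lookup, so a date-like
# string that is not a column label raises KeyError (assuming pandas >= 2.0).
try:
    df["2001-10"]
    raised_key_error = False
except KeyError:
    raised_key_error = True

print(rows)
print(raised_key_error)
```

Slices such as `df["2001-10-01":"2001-10-03"]` continue to select rows, since the new `__getitem__` path still handles `slice` keys explicitly.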
