DEPR: indexing #49412

Merged
merged 9 commits into from
Nov 2, 2022
2 changes: 1 addition & 1 deletion ci/deps/actions-38-minimum_versions.yaml
@@ -32,7 +32,7 @@ dependencies:
   - gcsfs=2021.07.0
   - jinja2=3.0.0
   - lxml=4.6.3
-  - matplotlib=3.3.2
+  - matplotlib=3.6.0
Member

Why did matplotlib need to be bumped? 3.6 was only released in September 2022, so it's fairly new.

Member Author

3.6.0 is the first version that doesn't do `series[:, None]`.

Member

So our existing plotting tests were failing before this bump?

Could you also update this version in install.rst?

Member Author

yes and yes

Member

Cool. I think some matplotlib code can be cleaned up with this bump, if you're interested in following up.

Member Author

Sure. Is there something to grep for to find the relevant code?

Member

Originally in `pandas/plotting/_matplotlib/compat.py`; I think anything that uses `mpl_ge_3_4_0` or `mpl_ge_3_5_0`.

Member

I think the min version build is failing with this bump, so the cleanup would need to be done sooner rather than later: https://github.com/pandas-dev/pandas/actions/runs/3380325519/jobs/5612984854

Member Author

I've put it into a CLN branch; the tests are about to finish locally, and I'll push shortly.

   - numba=0.53.1
   - numexpr=2.7.3
   - odfpy=1.4.1
2 changes: 1 addition & 1 deletion doc/source/getting_started/install.rst
@@ -310,7 +310,7 @@ Can be managed as optional_extra with ``pandas[plot, output_formatting]``, depen
 ========================= ================== ================== =============================================================
 Dependency                Minimum Version    optional_extra     Notes
 ========================= ================== ================== =============================================================
-matplotlib                3.3.2              plot               Plotting library
+matplotlib                3.6.0              plot               Plotting library
 Jinja2                    3.0.0              output_formatting  Conditional formatting with DataFrame.style
 tabulate                  0.8.9              output_formatting  Printing in Markdown-friendly format (see `tabulate`_)
 ========================= ================== ================== =============================================================
4 changes: 4 additions & 0 deletions doc/source/whatsnew/v2.0.0.rst
@@ -124,6 +124,8 @@ Optional libraries below the lowest tested version may still work, but are not c
 +=================+=================+=========+
 | pyarrow         | 6.0.0           |    X    |
 +-----------------+-----------------+---------+
+| matplotlib      | 3.6.0           |    X    |
++-----------------+-----------------+---------+
 | fastparquet     | 0.6.3           |    X    |
 +-----------------+-----------------+---------+

@@ -276,6 +278,8 @@ Removal of prior version deprecations/changes
 - Enforced disallowing a string column label into ``times`` in :meth:`DataFrame.ewm` (:issue:`43265`)
 - Enforced disallowing a tuple of column labels into :meth:`.DataFrameGroupBy.__getitem__` (:issue:`30546`)
 - Enforced disallowing setting values with ``.loc`` using a positional slice. Use ``.loc`` with labels or ``.iloc`` with positions instead (:issue:`31840`)
+- Enforced disallowing positional indexing with a ``float`` key even if that key is a round number, manually cast to integer instead (:issue:`34193`)
+- Enforced disallowing indexing on a :class:`Index` or positional indexing on a :class:`Series` producing multi-dimensional objects e.g. ``obj[:, None]``, convert to numpy before indexing instead (:issue:`35141`)
 - Enforced disallowing ``dict`` or ``set`` objects in ``suffixes`` in :func:`merge` (:issue:`34810`)
 - Enforced disallowing :func:`merge` to produce duplicated columns through the ``suffixes`` keyword and already existing columns (:issue:`22818`)
 - Enforced disallowing using :func:`merge` or :func:`join` on a different number of levels (:issue:`34862`)
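For callers that relied on either removed behavior, the two new whatsnew entries point at explicit replacements. The snippet below is a minimal sketch of those replacements; the variable names are illustrative and do not come from the PR.

```python
import pandas as pd

ser = pd.Series([10, 20, 30])
idx = pd.Index([10, 20, 30])

# Instead of ser[:, None]: convert to numpy first, then add the new axis.
col = ser.to_numpy()[:, None]
print(col.shape)  # (3, 1)

# Same idea for an Index: idx[:, None] becomes idx.to_numpy()[:, None].
idx_col = idx.to_numpy()[:, None]

# Instead of idx[1.0]: cast the round float to an integer explicitly.
key = 1.0
value = idx[int(key)]
print(value)  # 20
```

Both replacements should also work on versions that still emitted the deprecation warning, so they can be adopted before upgrading.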
2 changes: 1 addition & 1 deletion pandas/compat/_optional.py
@@ -23,7 +23,7 @@
     "gcsfs": "2021.07.0",
     "jinja2": "3.0.0",
     "lxml.etree": "4.6.3",
-    "matplotlib": "3.3.2",
+    "matplotlib": "3.6.0",
     "numba": "0.53.1",
     "numexpr": "2.7.3",
     "odfpy": "1.4.1",
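Minimum versions such as `"3.6.0"` have to be compared numerically rather than as strings. The sketch below illustrates that point with two hypothetical helpers, `parse_version` and `meets_minimum`; it is not how pandas performs its own check, and it assumes purely numeric dotted versions with the same number of parts.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '3.6.0' into (3, 6, 0); assumes purely numeric dotted parts."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True when the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


MINIMUM_MATPLOTLIB = "3.6.0"  # value taken from the diff above

print(meets_minimum("3.3.2", MINIMUM_MATPLOTLIB))   # False
print(meets_minimum("3.6.0", MINIMUM_MATPLOTLIB))   # True
print(meets_minimum("3.10.1", MINIMUM_MATPLOTLIB))  # True; a string comparison would say otherwise
```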
19 changes: 7 additions & 12 deletions pandas/core/common.py
@@ -148,30 +148,25 @@ def is_bool_indexer(key: Any) -> bool:
     return False


-def cast_scalar_indexer(val, warn_float: bool = False):
+def cast_scalar_indexer(val):
     """
-    To avoid numpy DeprecationWarnings, cast float to integer where valid.
+    Disallow indexing with a float key, even if that key is a round number.

     Parameters
     ----------
     val : scalar
-    warn_float : bool, default False
-        If True, issue deprecation warning for a float indexer.

     Returns
     -------
     outval : scalar
     """
     # assumes lib.is_scalar(val)
     if lib.is_float(val) and val.is_integer():
-        if warn_float:
-            warnings.warn(
-                "Indexing with a float is deprecated, and will raise an IndexError "
-                "in pandas 2.0. You can manually convert to an integer key instead.",
-                FutureWarning,
-                stacklevel=find_stack_level(),
-            )
-        return int(val)
+        raise IndexError(
+            # GH#34193
+            "Indexing with a float is no longer supported. Manually convert "
+            "to an integer key instead."
+        )
     return val
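One detail of the new `cast_scalar_indexer` is easy to miss: only round floats raise here, while every other value is returned unchanged for the caller's later indexing to handle. The standalone sketch below mirrors that control flow. It assumes `isinstance(val, float)` as a stand-in for pandas' internal `lib.is_float`, and the name `cast_scalar_indexer_sketch` is hypothetical.

```python
def cast_scalar_indexer_sketch(val):
    """Standalone stand-in for the new cast_scalar_indexer.

    Assumption: isinstance(val, float) replaces pandas' lib.is_float check.
    """
    if isinstance(val, float) and val.is_integer():
        raise IndexError(
            "Indexing with a float is no longer supported. Manually convert "
            "to an integer key instead."
        )
    return val


print(cast_scalar_indexer_sketch(1))    # 1, integers pass through unchanged
print(cast_scalar_indexer_sketch(1.5))  # 1.5, non-round floats are not caught here

try:
    cast_scalar_indexer_sketch(1.0)
except IndexError as err:
    print(f"IndexError: {err}")
```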
4 changes: 2 additions & 2 deletions pandas/core/indexers/__init__.py
@@ -2,7 +2,7 @@
     check_array_indexer,
     check_key_length,
     check_setitem_lengths,
-    deprecate_ndim_indexing,
+    disallow_ndim_indexing,
     is_empty_indexer,
     is_list_like_indexer,
     is_scalar_indexer,
@@ -23,7 +23,7 @@
     "validate_indices",
     "maybe_convert_indices",
     "length_of_indexer",
-    "deprecate_ndim_indexing",
+    "disallow_ndim_indexing",
     "unpack_1tuple",
     "check_key_length",
     "check_array_indexer",
20 changes: 7 additions & 13 deletions pandas/core/indexers/utils.py
@@ -7,12 +7,10 @@
     TYPE_CHECKING,
     Any,
 )
-import warnings

 import numpy as np

 from pandas._typing import AnyArrayLike
-from pandas.util._exceptions import find_stack_level

 from pandas.core.dtypes.common import (
     is_array_like,
@@ -333,22 +331,18 @@ def length_of_indexer(indexer, target=None) -> int:
     raise AssertionError("cannot find the length of the indexer")


-def deprecate_ndim_indexing(result, stacklevel: int = 3) -> None:
+def disallow_ndim_indexing(result) -> None:
     """
-    Helper function to raise the deprecation warning for multi-dimensional
-    indexing on 1D Series/Index.
+    Helper function to disallow multi-dimensional indexing on 1D Series/Index.

     GH#27125 indexer like idx[:, None] expands dim, but we cannot do that
-    and keep an index, so we currently return ndarray, which is deprecated
-    (Deprecation GH#30588).
+    and keep an index, so we used to return ndarray, which was deprecated
+    in GH#30588.
     """
     if np.ndim(result) > 1:
-        warnings.warn(
-            "Support for multi-dimensional indexing (e.g. `obj[:, None]`) "
-            "is deprecated and will be removed in a future "
-            "version. Convert to a numpy array before indexing instead.",
-            FutureWarning,
-            stacklevel=find_stack_level(),
+        raise ValueError(
+            "Multi-dimensional indexing (e.g. `obj[:, None]`) is no longer "
+            "supported. Convert to a numpy array before indexing instead."
         )
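The helper is applied to the result of the indexing operation rather than to the key, so it only needs `np.ndim`. The sketch below copies the new function body from the diff and exercises it on plain numpy arrays; the `values` array is illustrative.

```python
import numpy as np


def disallow_ndim_indexing(result) -> None:
    # Body copied from the diff above.
    if np.ndim(result) > 1:
        raise ValueError(
            "Multi-dimensional indexing (e.g. `obj[:, None]`) is no longer "
            "supported. Convert to a numpy array before indexing instead."
        )


values = np.arange(4)

disallow_ndim_indexing(values[1:3])  # 1-D result: returns None, no error

try:
    disallow_ndim_indexing(values[:, None])  # 2-D result with shape (4, 1)
except ValueError as err:
    print(err)
```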
14 changes: 3 additions & 11 deletions pandas/core/indexes/base.py
@@ -162,7 +162,7 @@
     extract_array,
     sanitize_array,
 )
-from pandas.core.indexers import deprecate_ndim_indexing
+from pandas.core.indexers import disallow_ndim_indexing
 from pandas.core.indexes.frozen import FrozenList
 from pandas.core.ops import get_op_result_name
 from pandas.core.ops.invalid import make_invalid_op
@@ -5221,7 +5221,7 @@ def __getitem__(self, key):

         if is_integer(key) or is_float(key):
             # GH#44051 exclude bool, which would return a 2d ndarray
-            key = com.cast_scalar_indexer(key, warn_float=True)
+            key = com.cast_scalar_indexer(key)
             return getitem(key)

         if isinstance(key, slice):
@@ -5244,15 +5244,7 @@ def __getitem__(self, key):
         result = getitem(key)
         # Because we ruled out integer above, we always get an arraylike here
         if result.ndim > 1:
-            deprecate_ndim_indexing(result)
-            if hasattr(result, "_ndarray"):
-                # i.e. NDArrayBackedExtensionArray
-                # Unpack to ndarray for MPL compat
-                # error: Item "ndarray[Any, Any]" of
-                # "Union[ExtensionArray, ndarray[Any, Any]]"
-                # has no attribute "_ndarray"
-                return result._ndarray  # type: ignore[union-attr]
-            return result
+            disallow_ndim_indexing(result)

         # NB: Using _constructor._simple_new would break if MultiIndex
         # didn't override __getitem__
2 changes: 1 addition & 1 deletion pandas/core/indexes/multi.py
@@ -2031,7 +2031,7 @@ def __reduce__(self):

     def __getitem__(self, key):
         if is_scalar(key):
-            key = com.cast_scalar_indexer(key, warn_float=True)
+            key = com.cast_scalar_indexer(key)

             retval = []
             for lev, level_codes in zip(self.levels, self.codes):
4 changes: 2 additions & 2 deletions pandas/core/series.py
@@ -123,7 +123,7 @@
 )
 from pandas.core.generic import NDFrame
 from pandas.core.indexers import (
-    deprecate_ndim_indexing,
+    disallow_ndim_indexing,
     unpack_1tuple,
 )
 from pandas.core.indexes.accessors import CombinedDatetimelikeProperties
@@ -1003,7 +1003,7 @@ def _get_values_tuple(self, key: tuple):
             # see tests.series.timeseries.test_mpl_compat_hack
             # the asarray is needed to avoid returning a 2D DatetimeArray
             result = np.asarray(self._values[key])
-            deprecate_ndim_indexing(result, stacklevel=find_stack_level())
+            disallow_ndim_indexing(result)
             return result

         if not isinstance(self.index, MultiIndex):
20 changes: 8 additions & 12 deletions pandas/tests/indexes/common.py
@@ -699,20 +699,16 @@ def test_engine_reference_cycle(self, simple_index):
     def test_getitem_2d_deprecated(self, simple_index):
         # GH#30588, GH#31479
         idx = simple_index
-        msg = "Support for multi-dimensional indexing"
-        with tm.assert_produces_warning(FutureWarning, match=msg):
-            res = idx[:, None]
-
-        assert isinstance(res, np.ndarray), type(res)
+        msg = "Multi-dimensional indexing"
+        with pytest.raises(ValueError, match=msg):
+            idx[:, None]

         if not isinstance(idx, RangeIndex):
-            # GH#44051 RangeIndex already raises
-            with tm.assert_produces_warning(FutureWarning, match=msg):
-                res = idx[True]
-            assert isinstance(res, np.ndarray), type(res)
-            with tm.assert_produces_warning(FutureWarning, match=msg):
-                res = idx[False]
-            assert isinstance(res, np.ndarray), type(res)
+            # GH#44051 RangeIndex already raised pre-2.0 with a different message
+            with pytest.raises(ValueError, match=msg):
+                idx[True]
+            with pytest.raises(ValueError, match=msg):
+                idx[False]
         else:
             msg = "only integers, slices"
             with pytest.raises(IndexError, match=msg):
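The test patterns change from `"Support for multi-dimensional indexing"` to `"Multi-dimensional indexing"` because `pytest.raises(match=...)` checks the pattern against the exception text with `re.search`, and the new error message no longer contains the old prefix. The sketch below shows this with the standard library only, using the message text from the `disallow_ndim_indexing` diff.

```python
import re

new_message = (
    "Multi-dimensional indexing (e.g. `obj[:, None]`) is no longer "
    "supported. Convert to a numpy array before indexing instead."
)

old_pattern = "Support for multi-dimensional indexing"
new_pattern = "Multi-dimensional indexing"

print(re.search(old_pattern, new_message) is None)      # True: the old pattern no longer matches
print(re.search(new_pattern, new_message) is not None)  # True
```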
6 changes: 2 additions & 4 deletions pandas/tests/indexes/datetimes/test_indexing.py
@@ -98,11 +98,9 @@ def test_dti_business_getitem(self, freq):
     @pytest.mark.parametrize("freq", ["B", "C"])
     def test_dti_business_getitem_matplotlib_hackaround(self, freq):
         rng = bdate_range(START, END, freq=freq)
-        with tm.assert_produces_warning(FutureWarning):
+        with pytest.raises(ValueError, match="Multi-dimensional indexing"):
             # GH#30588 multi-dimensional indexing deprecated
-            values = rng[:, None]
-        expected = rng.values[:, None]
-        tm.assert_numpy_array_equal(values, expected)
+            rng[:, None]

     def test_getitem_int_list(self):
         dti = date_range(start="1/1/2005", end="12/1/2005", freq="M")
14 changes: 7 additions & 7 deletions pandas/tests/indexes/test_base.py
@@ -62,11 +62,11 @@ def test_can_hold_identifiers(self, simple_index):

     @pytest.mark.parametrize("index", ["datetime"], indirect=True)
     def test_new_axis(self, index):
-        with tm.assert_produces_warning(FutureWarning):
+        # TODO: a bunch of scattered tests check this deprecation is enforced.
+        # de-duplicate/centralize them.
+        with pytest.raises(ValueError, match="Multi-dimensional indexing"):
             # GH#30588 multi-dimensional indexing deprecated
-            new_index = index[None, :]
-        assert new_index.ndim == 2
-        assert isinstance(new_index, np.ndarray)
+            index[None, :]

     def test_argsort(self, index):
         with tm.maybe_produces_warning(
@@ -1532,15 +1532,15 @@ def test_deprecated_fastpath():


 def test_shape_of_invalid_index():
-    # Currently, it is possible to create "invalid" index objects backed by
+    # Pre-2.0, it was possible to create "invalid" index objects backed by
     # a multi-dimensional array (see https://github.com/pandas-dev/pandas/issues/27125
     # about this). However, as long as this is not solved in general,this test ensures
     # that the returned shape is consistent with this underlying array for
     # compat with matplotlib (see https://github.com/pandas-dev/pandas/issues/27775)
     idx = Index([0, 1, 2, 3])
-    with tm.assert_produces_warning(FutureWarning):
+    with pytest.raises(ValueError, match="Multi-dimensional indexing"):
         # GH#30588 multi-dimensional indexing deprecated
-        assert idx[:, None].shape == (4, 1)
+        idx[:, None]


 def test_validate_1d_input():
8 changes: 3 additions & 5 deletions pandas/tests/indexes/test_indexing.py
@@ -290,11 +290,9 @@ def test_putmask_with_wrong_mask(self, index):
 def test_getitem_deprecated_float(idx):
     # https://github.com/pandas-dev/pandas/issues/34191

-    with tm.assert_produces_warning(FutureWarning):
-        result = idx[1.0]
-
-    expected = idx[1]
-    assert result == expected
+    msg = "Indexing with a float is no longer supported"
+    with pytest.raises(IndexError, match=msg):
+        idx[1.0]


 @pytest.mark.parametrize(
27 changes: 8 additions & 19 deletions pandas/tests/series/indexing/test_getitem.py
@@ -269,19 +269,10 @@ def test_getitem_partial_str_slice_high_reso_with_timedeltaindex(self):

     def test_getitem_slice_2d(self, datetime_series):
         # GH#30588 multi-dimensional indexing deprecated

-        with tm.assert_produces_warning(
-            FutureWarning, match="Support for multi-dimensional indexing"
-        ):
-            # GH#30867 Don't want to support this long-term, but
-            # for now ensure that the warning from Index
-            # doesn't comes through via Series.__getitem__.
-            result = datetime_series[:, np.newaxis]
-        expected = datetime_series.values[:, np.newaxis]
-        tm.assert_almost_equal(result, expected)
+        with pytest.raises(ValueError, match="Multi-dimensional indexing"):
+            datetime_series[:, np.newaxis]

-    # FutureWarning from NumPy.
-    @pytest.mark.filterwarnings("ignore:Using a non-tuple:FutureWarning")
     def test_getitem_median_slice_bug(self):
         index = date_range("20090415", "20090519", freq="2B")
         ser = Series(np.random.randn(13), index=index)
@@ -291,6 +282,10 @@ def test_getitem_median_slice_bug(self):
         with pytest.raises(ValueError, match=msg):
             # GH#31299
             ser[indexer]
+        # but we're OK with a single-element tuple
+        result = ser[(indexer[0],)]
+        expected = ser[indexer[0]]
+        tm.assert_series_equal(result, expected)

     @pytest.mark.parametrize(
         "slc, positions",
@@ -554,14 +549,8 @@ def test_getitem_generator(string_series):
     ],
 )
 def test_getitem_ndim_deprecated(series):
-    with tm.assert_produces_warning(
-        FutureWarning,
-        match="Support for multi-dimensional indexing",
-    ):
-        result = series[:, None]
-
-    expected = np.asarray(series)[:, None]
-    tm.assert_numpy_array_equal(result, expected)
+    with pytest.raises(ValueError, match="Multi-dimensional indexing"):
+        series[:, None]


 def test_getitem_multilevel_scalar_slice_not_implemented(
7 changes: 5 additions & 2 deletions pandas/tests/series/indexing/test_indexing.py
@@ -184,8 +184,6 @@ def test_setslice(datetime_series):
     assert sl.index.is_unique is True


-# FutureWarning from NumPy about [slice(None, 5).
-@pytest.mark.filterwarnings("ignore:Using a non-tuple:FutureWarning")
 def test_basic_getitem_setitem_corner(datetime_series):
     # invalid tuples, e.g. td.ts[:, None] vs. td.ts[:, 2]
     msg = "key of type tuple not found and not a MultiIndex"
@@ -200,6 +198,11 @@ def test_basic_getitem_setitem_corner(datetime_series):
         # GH#31299
         datetime_series[[slice(None, 5)]]

+    # but we're OK with a single-element tuple
+    result = datetime_series[(slice(None, 5),)]
+    expected = datetime_series[:5]
+    tm.assert_series_equal(result, expected)
+
     # OK
     msg = r"unhashable type(: 'slice')?"
     with pytest.raises(TypeError, match=msg):
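The assertions added in these tests distinguish a one-element tuple wrapping a slice, which is accepted, from a one-element list wrapping a slice, which is rejected. The sketch below reproduces that contrast outside the test suite. The series is illustrative, and the behavior of the list case is assumed to match a pandas version where GH#31299 is enforced.

```python
import numpy as np
import pandas as pd

ser = pd.Series(np.arange(10.0), index=pd.date_range("2000-01-01", periods=10))

# A one-element tuple wrapping a slice is treated like the bare slice.
result = ser[(slice(None, 5),)]
expected = ser[:5]
pd.testing.assert_series_equal(result, expected)

# A one-element list wrapping a slice is rejected (GH#31299).
try:
    ser[[slice(None, 5)]]
except ValueError as err:
    print(err)
```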