Skip to content

BUG: Fix is_unique regression for slices of Indexes #57958

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Aug 6, 2024
1 change: 1 addition & 0 deletions doc/source/whatsnew/v2.2.2.rst
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@ Fixed regressions
~~~~~~~~~~~~~~~~~
- :meth:`DataFrame.__dataframe__` was producing incorrect data buffers when the a column's type was a pandas nullable on with missing values (:issue:`56702`)
- :meth:`DataFrame.__dataframe__` was producing incorrect data buffers when the a column's type was a pyarrow nullable on with missing values (:issue:`57664`)
- :meth:`Index.is_monotonic_decreasing`, :meth:`Index.is_monotonic_increasing`, and :meth:`Index.is_unique` could incorrectly be ``False`` for an ``Index`` created from a slice of another ``Index``. (:issue:`57911`)
- Avoid issuing a spurious ``DeprecationWarning`` when a custom :class:`DataFrame` or :class:`Series` subclass method is called (:issue:`57553`)
- Fixed regression in precision of :func:`to_datetime` with string and ``unit`` input (:issue:`57051`)

Expand Down
44 changes: 32 additions & 12 deletions pandas/_libs/index.pyx
Original file line number Diff line number Diff line change
Expand Up @@ -253,14 +253,24 @@ cdef class IndexEngine:
return self.sizeof()

cpdef _update_from_sliced(self, IndexEngine other, reverse: bool):
self.unique = other.unique
self.need_unique_check = other.need_unique_check
if other.unique:
self.unique = 1
self.need_unique_check = 0

if not other.need_monotonic_check and (
other.is_monotonic_increasing or other.is_monotonic_decreasing):
self.need_monotonic_check = other.need_monotonic_check
# reverse=True means the index has been reversed
self.monotonic_inc = other.monotonic_dec if reverse else other.monotonic_inc
self.monotonic_dec = other.monotonic_inc if reverse else other.monotonic_dec
self.need_monotonic_check = 0
if len(self.values) > 0 and self.values[0] != self.values[-1]:
# reverse=True means the index has been reversed
if reverse:
self.monotonic_inc = other.monotonic_dec
self.monotonic_dec = other.monotonic_inc
else:
self.monotonic_inc = other.monotonic_inc
self.monotonic_dec = other.monotonic_dec
else:
self.monotonic_inc = 1
self.monotonic_dec = 1

@property
def is_unique(self) -> bool:
Expand Down Expand Up @@ -854,14 +864,24 @@ cdef class SharedEngine:
pass

cpdef _update_from_sliced(self, ExtensionEngine other, reverse: bool):
self.unique = other.unique
self.need_unique_check = other.need_unique_check
if other.unique:
self.unique = 1
self.need_unique_check = 0

if not other.need_monotonic_check and (
other.is_monotonic_increasing or other.is_monotonic_decreasing):
self.need_monotonic_check = other.need_monotonic_check
# reverse=True means the index has been reversed
self.monotonic_inc = other.monotonic_dec if reverse else other.monotonic_inc
self.monotonic_dec = other.monotonic_inc if reverse else other.monotonic_dec
self.need_monotonic_check = 0
if len(self.values) > 0 and self.values[0] != self.values[-1]:
# reverse=True means the index has been reversed
if reverse:
self.monotonic_inc = other.monotonic_dec
self.monotonic_dec = other.monotonic_inc
else:
self.monotonic_inc = other.monotonic_inc
self.monotonic_dec = other.monotonic_dec
else:
self.monotonic_inc = 1
self.monotonic_dec = 1

@property
def is_unique(self) -> bool:
Expand Down
20 changes: 20 additions & 0 deletions pandas/tests/indexes/test_base.py
Original file line number Diff line number Diff line change
Expand Up @@ -961,6 +961,26 @@ def test_slice_keep_name(self):
index = Index(["a", "b"], name="asdf")
assert index.name == index[1:].name

def test_slice_is_unique(self):
# GH 57911
index = Index([1, 1, 2, 3, 4])
assert not index.is_unique
filtered_index = index[2:].copy()
assert filtered_index.is_unique

def test_slice_is_montonic(self):
"""Test that is_monotonic resets on slices."""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not exactly "resets on slices" after your last commit

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is a docstring and a comment referencing the GitHub issue too much?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the function name explains it all, but it's up to you

index = Index([1, 2, 3, 3])
assert not index.is_monotonic_decreasing

filtered_index = index[2:].copy()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also test a null slice i.e. index[:]

assert filtered_index.is_monotonic_decreasing
assert filtered_index.is_monotonic_increasing

filtered_index = index[1:].copy()
assert not filtered_index.is_monotonic_decreasing
assert filtered_index.is_monotonic_increasing

@pytest.mark.parametrize(
"index",
[
Expand Down