Backport PR #31183: BUG: Series/Frame invert dtypes' #31493

Merged
2 changes: 2 additions & 0 deletions doc/source/whatsnew/v1.0.0.rst
@@ -1108,6 +1108,7 @@ Numeric
 - Bug in :meth:`DataFrame.round` where a :class:`DataFrame` with a :class:`CategoricalIndex` of :class:`IntervalIndex` columns would incorrectly raise a ``TypeError`` (:issue:`30063`)
 - Bug in :meth:`Series.pct_change` and :meth:`DataFrame.pct_change` when there are duplicated indices (:issue:`30463`)
 - Bug in :class:`DataFrame` cumulative operations (e.g. cumsum, cummax) incorrect casting to object-dtype (:issue:`19296`)
+- Bug in dtypes being lost in ``DataFrame.__invert__`` (``~`` operator) with mixed dtypes (:issue:`31183`)
Member

Should these notes not be in 1.0.1 to match the other PR?

Member Author

My plan was to let the other PR be backported as well (it is tagged with the milestone), so I don't need to move the notes here too.

Member

Makes sense

 - Bug in :class:`~DataFrame.diff` losing the dtype for extension types (:issue:`30889`)
 - Bug in :class:`DataFrame.diff` raising an ``IndexError`` when one of the columns was a nullable integer dtype (:issue:`30967`)

@@ -1260,6 +1261,7 @@ ExtensionArray
 - Bug in :class:`arrays.PandasArray` when setting a scalar string (:issue:`28118`, :issue:`28150`).
 - Bug where nullable integers could not be compared to strings (:issue:`28930`)
 - Bug where :class:`DataFrame` constructor raised ``ValueError`` with list-like data and ``dtype`` specified (:issue:`30280`)
+- Bug in dtype being lost in ``__invert__`` (``~`` operator) for extension-array backed ``Series`` and ``DataFrame`` (:issue:`23087`)


 Other
3 changes: 3 additions & 0 deletions pandas/core/arrays/masked.py
@@ -48,6 +48,9 @@ def __iter__(self):
     def __len__(self) -> int:
         return len(self._data)

+    def __invert__(self):
+        return type(self)(~self._data, self._mask)
+
     def to_numpy(
         self, dtype=None, copy=False, na_value: "Scalar" = lib.no_default,
     ):
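The `__invert__` added to `masked.py` inverts the stored values and reuses the mask unchanged, so missing entries stay missing and the array type is preserved. A minimal pure-Python sketch of that pattern follows; `ToyMaskedArray` and `to_list` are hypothetical names used only for illustration, not the pandas implementation.

```python
class ToyMaskedArray:
    """Hypothetical stand-in for a masked array: values plus a 'missing' mask."""

    def __init__(self, data, mask):
        if len(data) != len(mask):
            raise ValueError("data and mask must have the same length")
        self._data = list(data)
        self._mask = list(mask)

    def __invert__(self):
        # Invert only the stored values; the mask is passed through untouched.
        # Using type(self) means subclasses get back their own type.
        return type(self)([not value for value in self._data], self._mask)

    def to_list(self):
        # Masked positions are reported as None regardless of the stored value.
        return [
            None if missing else value
            for value, missing in zip(self._data, self._mask)
        ]


arr = ToyMaskedArray([True, False, False], mask=[False, False, True])
inverted = ~arr
print(inverted.to_list())
```

The value hidden behind a masked position is inverted too, but it is never observable, which is why the mask can simply be carried over.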
5 changes: 3 additions & 2 deletions pandas/core/generic.py
@@ -1470,8 +1470,9 @@ def __invert__(self):
             # inv fails with 0 len
             return self

-        arr = operator.inv(com.values_from_object(self))
-        return self.__array_wrap__(arr)
+        new_data = self._data.apply(operator.invert)
+        result = self._constructor(new_data).__finalize__(self)
+        return result

     def __nonzero__(self):
         raise ValueError(
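The change in `generic.py` applies the inversion to the internal data block by block rather than to a single array of all values, which is what lets each column keep its own dtype. A short usage sketch of the resulting behaviour, assuming a pandas version that includes this fix (the column names are made up for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "flag": np.array([True, False, True]),       # NumPy bool column
        "bits": np.array([0, 1, 2], dtype="int64"),  # integer column
    }
)

result = ~df

# Each column is inverted with its own dtype: logical NOT for bool,
# bitwise NOT (~x == -x - 1) for integers.
print(result.dtypes)
print(result)
```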
8 changes: 8 additions & 0 deletions pandas/tests/arrays/sparse/test_arithmetics.py
@@ -476,6 +476,14 @@ def test_invert(fill_value):
     expected = SparseArray(~arr, fill_value=not fill_value)
     tm.assert_sp_array_equal(result, expected)

+    result = ~pd.Series(sparray)
+    expected = pd.Series(expected)
+    tm.assert_series_equal(result, expected)
+
+    result = ~pd.DataFrame({"A": sparray})
+    expected = pd.DataFrame({"A": expected})
+    tm.assert_frame_equal(result, expected)
+


 @pytest.mark.parametrize("fill_value", [0, np.nan])
 @pytest.mark.parametrize("op", [operator.pos, operator.neg])
18 changes: 18 additions & 0 deletions pandas/tests/arrays/test_boolean.py
@@ -471,6 +471,24 @@ def test_ufunc_reduce_raises(values):
         np.add.reduce(a)


+class TestUnaryOps:
+    def test_invert(self):
+        a = pd.array([True, False, None], dtype="boolean")
+        expected = pd.array([False, True, None], dtype="boolean")
+        tm.assert_extension_array_equal(~a, expected)
+
+        expected = pd.Series(expected, index=["a", "b", "c"], name="name")
+        result = ~pd.Series(a, index=["a", "b", "c"], name="name")
+        tm.assert_series_equal(result, expected)
+
+        df = pd.DataFrame({"A": a, "B": [True, False, False]}, index=["a", "b", "c"])
+        result = ~df
+        expected = pd.DataFrame(
+            {"A": expected, "B": [False, True, True]}, index=["a", "b", "c"]
+        )
+        tm.assert_frame_equal(result, expected)
+
+
 class TestLogicalOps(BaseOpsUtil):
     def test_numpy_scalars_ok(self, all_logical_operators):
         a = pd.array([True, False, None], dtype="boolean")
7 changes: 6 additions & 1 deletion pandas/tests/extension/base/__init__.py
@@ -49,7 +49,12 @@ class TestMyDtype(BaseDtypeTests):
 from .io import BaseParsingTests  # noqa
 from .methods import BaseMethodsTests  # noqa
 from .missing import BaseMissingTests  # noqa
-from .ops import BaseArithmeticOpsTests, BaseComparisonOpsTests, BaseOpsUtil  # noqa
+from .ops import (  # noqa
+    BaseArithmeticOpsTests,
+    BaseComparisonOpsTests,
+    BaseOpsUtil,
+    BaseUnaryOpsTests,
+)
 from .printing import BasePrintingTests  # noqa
 from .reduce import (  # noqa
     BaseBooleanReduceTests,
8 changes: 8 additions & 0 deletions pandas/tests/extension/base/ops.py
@@ -168,3 +168,11 @@ def test_direct_arith_with_series_returns_not_implemented(self, data):
             assert result is NotImplemented
         else:
             raise pytest.skip(f"{type(data).__name__} does not implement __eq__")
+
+
+class BaseUnaryOpsTests(BaseOpsUtil):
+    def test_invert(self, data):
+        s = pd.Series(data, name="name")
+        result = ~s
+        expected = pd.Series(~data, name="name")
+        self.assert_series_equal(result, expected)
4 changes: 4 additions & 0 deletions pandas/tests/extension/test_boolean.py
@@ -342,6 +342,10 @@ class TestPrinting(base.BasePrintingTests):
     pass


+class TestUnaryOps(base.BaseUnaryOpsTests):
+    pass
+
+
 # TODO parsing not yet supported
 # class TestParsing(base.BaseParsingTests):
 #     pass
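The `BaseUnaryOpsTests.test_invert` base test checks a single property: inverting a `Series` must give the same result, dtype included, as inverting the underlying array and wrapping that in a `Series`. A standalone sketch of the same check using only public pandas API is shown here; the helper name `check_invert_matches_array` is hypothetical, and the example assumes a pandas version with the nullable `boolean` dtype and this fix.

```python
import pandas as pd


def check_invert_matches_array(data):
    """Assert that ~Series(data) equals Series(~data), dtype included."""
    result = ~pd.Series(data, name="name")
    expected = pd.Series(~data, name="name")
    pd.testing.assert_series_equal(result, expected)
    return result


values = pd.array([True, False, None], dtype="boolean")
inverted = check_invert_matches_array(values)
print(inverted.dtype)
```

An extension array that does not define `__invert__` would fail on `~data` in this helper, which is one reason a test suite might opt in to the unary-ops tests only for dtypes that support the operator.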
21 changes: 21 additions & 0 deletions pandas/tests/frame/test_operators.py
@@ -61,6 +61,27 @@ def test_invert(self, float_frame):

         tm.assert_frame_equal(-(df < 0), ~(df < 0))

+    def test_invert_mixed(self):
+        shape = (10, 5)
+        df = pd.concat(
+            [
+                pd.DataFrame(np.zeros(shape, dtype="bool")),
+                pd.DataFrame(np.zeros(shape, dtype=int)),
+            ],
+            axis=1,
+            ignore_index=True,
+        )
+        result = ~df
+        expected = pd.concat(
+            [
+                pd.DataFrame(np.ones(shape, dtype="bool")),
+                pd.DataFrame(-np.ones(shape, dtype=int)),
+            ],
+            axis=1,
+            ignore_index=True,
+        )
+        tm.assert_frame_equal(result, expected)
+
     @pytest.mark.parametrize(
         "df",
         [