Commit fb754d7

jbrockmendel, datapythonista, and mroeschke authored
ENH: __pandas_priority__ (pandas-dev#48347)
* ENH: __pandas_priority__
* doc, test
* Update doc/source/development/extending.rst
  Co-authored-by: Marc Garcia <[email protected]>
* Whatsnew
* GH ref
* suggested doc edit
* Update doc/source/development/extending.rst
  Co-authored-by: Matthew Roeschke <[email protected]>
* Update doc/source/development/extending.rst
  Co-authored-by: Matthew Roeschke <[email protected]>
* Update doc/source/development/extending.rst
  Co-authored-by: Matthew Roeschke <[email protected]>
* lint fixup
* suggested edits

---------

Co-authored-by: Marc Garcia <[email protected]>
Co-authored-by: Matthew Roeschke <[email protected]>
1 parent 3161e3f commit fb754d7

8 files changed

+85
-5
lines changed

doc/source/development/extending.rst

+46
@@ -488,3 +488,49 @@ registers the default "matplotlib" backend as follows.

 More information on how to implement a third-party plotting backend can be found at
 https://github.com/pandas-dev/pandas/blob/main/pandas/plotting/__init__.py#L1.
+
+.. _extending.pandas_priority:
+
+Arithmetic with 3rd party types
+-------------------------------
+
+In order to control how arithmetic works between a custom type and a pandas type,
+implement ``__pandas_priority__``. Similar to numpy's ``__array_priority__``
+semantics, arithmetic methods on :class:`DataFrame`, :class:`Series`, and :class:`Index`
+objects will delegate to ``other`` if it has a ``__pandas_priority__`` attribute with a higher value.
+
+By default, pandas objects try to operate with other objects, even if they are not types known to pandas:
+
+.. code-block:: python
+
+    >>> pd.Series([1, 2]) + [10, 20]
+    0    11
+    1    22
+    dtype: int64
+
+In the example above, if ``[10, 20]`` were a custom type that can be understood as a list, pandas objects would still operate with it in the same way.
+
+In some cases, it is useful to delegate the operation to the other type. For example, suppose I implement a
+custom list object, and I want the result of adding my custom list to a pandas :class:`Series` to be an instance of my list
+and not a :class:`Series` as seen in the previous example. This is now possible by defining the ``__pandas_priority__`` attribute
+of my custom list and setting it to a value higher than the priority of the pandas objects I want to operate with.
+
+The ``__pandas_priority__`` values of :class:`DataFrame`, :class:`Series`, and :class:`Index` are ``4000``, ``3000``, and ``2000`` respectively. The base ``ExtensionArray.__pandas_priority__`` is ``1000``.
+
+.. code-block:: python
+
+    class CustomList(list):
+        __pandas_priority__ = 5000
+
+        def __radd__(self, other):
+            # return `self` and not the addition for simplicity
+            return self
+
+    custom = CustomList()
+    series = pd.Series([1, 2, 3])
+
+    # Series refuses to add custom, since it's an unknown type with higher priority
+    assert series.__add__(custom) is NotImplemented
+
+    # This causes the custom class's `__radd__` to be used instead
+    assert series + custom is custom

doc/source/whatsnew/v2.1.0.rst

+1
@@ -28,6 +28,7 @@ enhancement2

 Other enhancements
 ^^^^^^^^^^^^^^^^^^
+- Implemented ``__pandas_priority__`` to allow custom types to take precedence over :class:`DataFrame`, :class:`Series`, :class:`Index`, or :class:`ExtensionArray` for arithmetic operations, :ref:`see the developer guide <extending.pandas_priority>` (:issue:`48347`)
 - :meth:`MultiIndex.sort_values` now supports ``na_position`` (:issue:`51612`)
 - :meth:`MultiIndex.sortlevel` and :meth:`Index.sortlevel` gained a new keyword ``na_position`` (:issue:`51612`)
 - Improve error message when setting :class:`DataFrame` with wrong number of columns through :meth:`DataFrame.isetitem` (:issue:`51701`)

pandas/core/arrays/base.py

+6
@@ -235,6 +235,12 @@ class ExtensionArray:
     # Don't override this.
     _typ = "extension"

+    # similar to __array_priority__, positions ExtensionArray after Index,
+    # Series, and DataFrame. EA subclasses may override to choose which EA
+    # subclass takes priority. If overriding, the value should always be
+    # strictly less than 2000 to be below Index.__pandas_priority__.
+    __pandas_priority__ = 1000
+
     # ------------------------------------------------------------------------
     # Constructors
     # ------------------------------------------------------------------------

pandas/core/frame.py

+4
@@ -634,6 +634,10 @@ class DataFrame(NDFrame, OpsMixin):
     _hidden_attrs: frozenset[str] = NDFrame._hidden_attrs | frozenset([])
     _mgr: BlockManager | ArrayManager

+    # similar to __array_priority__, positions DataFrame before Series, Index,
+    # and ExtensionArray. Should NOT be overridden by subclasses.
+    __pandas_priority__ = 4000
+
     @property
     def _constructor(self) -> Callable[..., DataFrame]:
         return DataFrame

pandas/core/indexes/base.py

+4
@@ -351,6 +351,10 @@ class Index(IndexOpsMixin, PandasObject):
     # To hand over control to subclasses
     _join_precedence = 1

+    # similar to __array_priority__, positions Index after Series and DataFrame
+    # but before ExtensionArray. Should NOT be overridden by subclasses.
+    __pandas_priority__ = 2000
+
     # Cython methods; see github.com/cython/cython/issues/2647
     # for why we need to wrap these instead of making them class attributes
     # Moreover, cython will choose the appropriate-dtyped sub-function

pandas/core/ops/common.py

+4 -5
@@ -14,7 +14,6 @@
 from pandas._libs.missing import is_matching_na

 from pandas.core.dtypes.generic import (
-    ABCDataFrame,
     ABCIndex,
     ABCSeries,
 )
@@ -75,10 +74,10 @@ def new_method(self, other):
             # For comparison ops, Index does *not* defer to Series
             pass
         else:
-            for cls in [ABCDataFrame, ABCSeries, ABCIndex]:
-                if isinstance(self, cls):
-                    break
-                if isinstance(other, cls):
+            prio = getattr(other, "__pandas_priority__", None)
+            if prio is not None:
+                if prio > self.__pandas_priority__:
+                    # e.g. other is DataFrame while self is Index/Series/EA
                     return NotImplemented

         other = item_from_zerodim(other)
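
To see the new check in isolation, here is a minimal pure-Python sketch of the same deferral pattern, with no pandas involved. The decorator name `defer_to_higher_priority` and the classes `Low` and `High` are hypothetical stand-ins invented for illustration. Only the `getattr` / `NotImplemented` logic mirrors the hunk above.

```python
import functools


def defer_to_higher_priority(method):
    """Hypothetical stand-in for the priority check inside ``new_method``."""

    @functools.wraps(method)
    def new_method(self, other):
        prio = getattr(other, "__pandas_priority__", None)
        if prio is not None and prio > self.__pandas_priority__:
            # Returning NotImplemented makes Python try other's reflected method.
            return NotImplemented
        return method(self, other)

    return new_method


class Low:
    __pandas_priority__ = 2000

    def __init__(self, value):
        self.value = value

    @defer_to_higher_priority
    def __add__(self, other):
        # Objects without a `value` attribute are treated as plain numbers.
        return Low(self.value + getattr(other, "value", other))


class High:
    __pandas_priority__ = 4000

    def __init__(self, value):
        self.value = value

    def __radd__(self, other):
        return High(other.value + self.value)


low, high = Low(1), High(10)
assert low.__add__(high) is NotImplemented  # Low defers to High
assert type(low + high) is High             # High.__radd__ builds the result
assert type(low + 5) is Low                 # no priority attribute: Low handles it
```

Operands without the attribute yield `prio is None`, so they follow the normal code path. This matches the documented default, where pandas still tries to operate with types it does not know.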

pandas/core/series.py

+4
@@ -352,6 +352,10 @@ class Series(base.IndexOpsMixin, NDFrame): # type: ignore[misc]
         base.IndexOpsMixin._hidden_attrs | NDFrame._hidden_attrs | frozenset([])
     )

+    # similar to __array_priority__, positions Series after DataFrame
+    # but before Index and ExtensionArray. Should NOT be overridden by subclasses.
+    __pandas_priority__ = 3000
+
     # Override cache_readonly bc Series is mutable
     # error: Incompatible types in assignment (expression has type "property",
     # base class "IndexOpsMixin" defined the type as "Callable[[IndexOpsMixin], bool]")

pandas/tests/test_downstream.py

+16
@@ -267,3 +267,19 @@ def test_frame_setitem_dask_array_into_new_col():
         tm.assert_frame_equal(result, expected)
     finally:
         pd.set_option("compute.use_numexpr", olduse)
+
+
+def test_pandas_priority():
+    # GH#48347
+
+    class MyClass:
+        __pandas_priority__ = 5000
+
+        def __radd__(self, other):
+            return self
+
+    left = MyClass()
+    right = Series(range(3))
+
+    assert right.__add__(left) is NotImplemented
+    assert right + left is left
