Skip to content

API: EA._can_hold_na -> EDtype.can_hold_na #41654

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Jun 8, 2021

Conversation

jbrockmendel
Copy link
Member

@simonjayhawkins simonjayhawkins added API Design ExtensionArray Extending pandas with custom dtypes or arrays. labels May 25, 2021
@@ -158,6 +158,9 @@ class DatetimeLikeArrayMixin(OpsMixin, NDArrayBackedExtensionArray):
_is_recognized_dtype: Callable[[DtypeObj], bool]
_recognized_scalars: tuple[type, ...]
_ndarray: np.ndarray
# Incompatible types in assignment (expression has type "bool", base class
# "ExtensionArray" defined the type as "Callable[[ExtensionArray], bool]")
_can_hold_na = True # type: ignore[assignment]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if u make this a property (like u do below) then can avoid the typing issue

# (previously declared in base class "Block")
_can_hold_na = False # type: ignore[misc]
# Incompatible types in assignment (expression has type "bool", base class
# "Block" defined the type as "Callable[[Block], bool]")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment here

@jreback jreback added this to the 1.3 milestone May 27, 2021
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. @jorisvandenbossche if any comments

@jreback
Copy link
Contributor

jreback commented May 27, 2021

pandas/tests/extension/test_external_block.py:18: error: Cannot override final attribute "_can_hold_na" (previously declared in base class "Block") [misc]

actually this is failing precommit

@jbrockmendel
Copy link
Member Author

pre-commit fixed

@@ -367,6 +367,13 @@ def _get_common_dtype(self, dtypes: list[DtypeObj]) -> DtypeObj | None:
else:
return None

@property
def can_hold_na(self) -> bool:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't need to be public?

@jorisvandenbossche
Copy link
Member

I know this is an existing property on the ExtensionArray, but do we actually support (or want to support) an extension array that cannot hold missing values? (I don't think we currently have one, so this is also not really tested?)

@jorisvandenbossche
Copy link
Member

I don't think we currently have one

Except for PandasArray, I assume (but since that's not actually stored in practice, that's a bit a special case anyhow)

@jbrockmendel
Copy link
Member Author

I think Sparse[inty]?

@simonjayhawkins simonjayhawkins added the Blocker Blocking issue or pull request for an upcoming release label Jun 2, 2021
@simonjayhawkins
Copy link
Member

i've added blocker since I don't think we should change this during the rc. so we either get it in for 1.3.0rc or move to 1.4/2.0

@jbrockmendel
Copy link
Member Author

is the public/private question (and a rebase) the only outstanding issue? im happy to make this private

@jorisvandenbossche
Copy link
Member

Also the question whether we should have this at all, IMO. And if we have that property, what does it mean? (what's the behaviour for operations like setitem or reindexing that can introduce missing values?)

@jbrockmendel
Copy link
Member Author

Also the question whether we should have this at all, IMO. And if we have that property, what does it mean? (what's the behaviour for operations like setitem or reindexing that can introduce missing values?)

I'm open to having this discussion, but I think it can be done separately from the "where should this attribute live" discussion given that it already exists on EA.

@jorisvandenbossche
Copy link
Member

But if we are not yet sure if and how such property would be used, we can also leave it where it is for now? (or what problem is this PR fixing?)

If moving it, let's at least make it private and document it that way (as an internal property that no EA author should change, as that is undefined behaviour)

@jbrockmendel
Copy link
Member Author

(or what problem is this PR fixing?)

The corner cases I want to rule out are where can_hold_na is values-dependent and potentially non-constant, e.g. (xref #40574)

class WeirdEA(ExtensionArray):
      @property
      def _can_hold_na(self) -> bool:
           return some_func(self[0])

@jreback
Copy link
Contributor

jreback commented Jun 3, 2021

this lgtm, i agree this should be on the dtype.

@simonjayhawkins simonjayhawkins mentioned this pull request Jun 7, 2021
@jreback
Copy link
Contributor

jreback commented Jun 7, 2021

@jorisvandenbossche ok here?

@@ -644,6 +644,7 @@ Other API changes
- Accessing ``_constructor_expanddim`` on a :class:`DataFrame` and ``_constructor_sliced`` on a :class:`Series` now raise an ``AttributeError``. Previously a ``NotImplementedError`` was raised (:issue:`38782`)
- Added new ``engine`` and ``**engine_kwargs`` parameters to :meth:`DataFrame.to_sql` to support other future "SQL engines". Currently we still only use ``SQLAlchemy`` under the hood, but more engines are planned to be supported such as ``turbodbc`` (:issue:`36893`)
- Removed redundant ``freq`` from :class:`PeriodIndex` string representation (:issue:`41653`)
- :attr:`ExtensionArray._can_hold_na` should now be implemented on the :class:`ExtensionDtype` subclass as ``_can_hold_na`` (:issue:`40574`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would remove this mention, if we are not yet clear on the intended semantics of it.

@jorisvandenbossche
Copy link
Member

Yeah, I won't object moving it. Just a tiny comment on the whatsnew.

@jbrockmendel
Copy link
Member Author

whatsnew removed + greenish

@simonjayhawkins simonjayhawkins merged commit e4dedca into pandas-dev:master Jun 8, 2021
@simonjayhawkins
Copy link
Member

Thanks @jbrockmendel

@jbrockmendel jbrockmendel deleted the api-can_hold_na branch June 8, 2021 14:43
JulianWgs pushed a commit to JulianWgs/pandas that referenced this pull request Jul 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API Design Blocker Blocking issue or pull request for an upcoming release ExtensionArray Extending pandas with custom dtypes or arrays.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

API: EA._can_hold_na should be a dtype attribute, not EA
4 participants