-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
REF: move Block.astype implementation to dtypes/cast.py #40141
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
REF: move Block.astype implementation to dtypes/cast.py #40141
Conversation
pandas/core/array_algos/cast.py
Outdated
return values | ||
|
||
|
||
def astype_array_safe(values, dtype, copy: bool = False, errors: str = "raise"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
values: ArrayLike; dtype: DtypeObj?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mypy isn't smart enough for that (we also didn't annotate dtype
in the code from where I copied this). But will add values: ArrayLike
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make sense. if we move the pandas_dtype call up into the caller we can do it once instead of (block|array)-wise
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we move the pandas_dtype call up into the caller we can do it once instead of (block|array)-wise
Then also the dtype
validation needs to be moved up (requiring some more changes to avoid duplication of that part). At the moment, it's a rather straightfoward cut and paste from blocks.py to cast.py, so I would maybe prefer to keep it that way for this PR.
dtype = pandas_dtype(dtype) | ||
values = self.values | ||
if values.dtype.kind in ["m", "M"]: | ||
values = self.array_values() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could move this into astype_array_safe and use ensure_wrapped_if_datetimelike; would make it robust to AM/BM (though i think both AM and BM now have PRs to make the arrays EAs to begin with)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since ArrayManager already stores it as EAs (after this array), I would prefer to leave it here (then your PR changing to store EAs in BlockManager as well can remove those two lines)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
small comments otherwise lgtm
pandas/core/dtypes/cast.py
Outdated
@@ -1225,6 +1227,107 @@ def astype_nansafe( | |||
return arr.astype(dtype, copy=copy) | |||
|
|||
|
|||
def astype_array(values: ArrayLike, dtype: DtypeObj, copy: bool = False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a return annotation
pandas/core/dtypes/cast.py
Outdated
|
||
def astype_array_safe( | ||
values: ArrayLike, dtype, copy: bool = False, errors: str = "raise" | ||
): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
This moves the
astype
implementation (so the current logic of callingEA.astype
orastype_nansafe
depending on the array type, handling datetime and string special cases, handling error keyword, etc) from being defined onBlock
to thearray_algos/
submodule.That's a useful clean-up anyway I think (it's not block specific), and this way it can then be reused in ArrayManager.
I currently put it in
core/array_algos/cast.py
since the other similar functions are also put inarray_algos
(shift
,putmask
,replace
,quantile
etc). But since this is casting specific and we already have most of the underlying pieces incore/dtypes/cast.py
, could also just put it there (either is fine for me).