CLN refactor maybe-castable #39257

Merged
merged 4 commits into from
Feb 7, 2021
10 changes: 7 additions & 3 deletions pandas/core/construction.py
@@ -584,9 +584,13 @@ def _try_cast(arr, dtype: Optional[DtypeObj], copy: bool, raise_cast_failure: bo
     Otherwise an object array is returned.
     """
     # perf shortcut as this is the most common case
-    if isinstance(arr, np.ndarray):
-        if maybe_castable(arr) and not copy and dtype is None:
-            return arr
+    if (
+        isinstance(arr, np.ndarray)
+        and maybe_castable(arr.dtype)
+        and not copy
+        and dtype is None
+    ):
+        return arr

     if isinstance(dtype, ExtensionDtype) and (dtype.kind != "M" or is_sparse(dtype)):
         # create an extension array from its dtype
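
The flattened shortcut in this hunk can be sketched in isolation. This is only an illustration under stated assumptions, not the pandas implementation: `fast_path` and `is_castable` are hypothetical names, and the predicate passed in is a stand-in for `maybe_castable`.

```python
from typing import Callable, Optional

import numpy as np


def fast_path(
    arr: object,
    dtype: Optional[np.dtype],
    copy: bool,
    is_castable: Callable[[np.dtype], bool],  # hypothetical stand-in predicate
) -> Optional[np.ndarray]:
    """Return ``arr`` unchanged when the shortcut applies, else ``None``."""
    if (
        isinstance(arr, np.ndarray)
        and is_castable(arr.dtype)  # the predicate receives the dtype, not the array
        and not copy
        and dtype is None
    ):
        return arr
    return None  # the caller would fall through to the slower casting logic


values = np.array([1.0, 2.0])


def always(dt: np.dtype) -> bool:
    # Hypothetical predicate used only for this demonstration.
    return True


same = fast_path(values, None, False, always)
```

Because `and` short-circuits, the single combined condition evaluates the predicate only for ndarrays, just as the nested `if` did.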
12 changes: 5 additions & 7 deletions pandas/core/dtypes/cast.py
@@ -1301,20 +1301,18 @@ def convert_dtypes(
     return inferred_dtype


-def maybe_castable(arr: np.ndarray) -> bool:
+def maybe_castable(dtype: DtypeObj) -> bool:

Member

I think this should just be np.dtype?

Member Author

In light of #39104 (comment), maybe it's worth mothballing this too until we have numpy types.

Contributor

OK, let's revert this part.

Member

I don't understand why "just use np.dtype" isn't the right thing to do here.

Member Author
@MarcoGorelli, Jan 31, 2021

I thought the objection was that it resolves to Any:

$ cat t.py 
import numpy as np

a: np.dtype
reveal_type(a)
$ mypy t.py 
t.py:4: note: Revealed type is 'Any'
Is that OK?

Member

Won't that get fixed on numpy's end at some point? I mean, np.dtype is the correct type here.

Member

The problem (in general) is that we are adding a type that resolves to Any, so if there are mypy issues 'lurking', they will not be reported.

When we do have numpy types, we may then have more errors to resolve.

That train left the station long ago, so potentially having a few more errors is no big deal.

This is a different scenario from #39104 (comment). That PR is focused on removing type ignores. In that case, the correct type is np.ufunc and not Callable, so again changing that is the right thing to do, just maybe not at the right time.
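
The concern about 'lurking' issues can be illustrated with a small sketch. It uses `typing.Any` directly as a stand-in for an annotation that a type checker resolves to `Any`; the function names are hypothetical and not part of pandas.

```python
from typing import Any


def kind_of(dtype: Any) -> str:
    # With an ``Any`` annotation a type checker accepts any attribute access,
    # so passing the wrong type is only discovered at runtime.
    return dtype.kind


def call_with_wrong_type() -> str:
    try:
        # A str, not a dtype: a checker has nothing to report for an Any parameter.
        return kind_of("int64")
    except AttributeError as err:
        return f"runtime failure: {err}"


result = call_with_wrong_type()
```

Once the annotation resolves to a real type, a call like `kind_of("int64")` would be flagged statically, which is the "more errors to resolve" scenario mentioned in the thread.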

     # return False to force a non-fastpath

-    assert isinstance(arr, np.ndarray)  # GH 37024
-
     # check datetime64[ns]/timedelta64[ns] are valid
     # otherwise try to coerce
-    kind = arr.dtype.kind
+    kind = dtype.kind
     if kind == "M":
-        return is_datetime64_ns_dtype(arr.dtype)
+        return is_datetime64_ns_dtype(dtype)
     elif kind == "m":
-        return is_timedelta64_ns_dtype(arr.dtype)
+        return is_timedelta64_ns_dtype(dtype)

-    return arr.dtype.name not in POSSIBLY_CAST_DTYPES
+    return dtype.name not in POSSIBLY_CAST_DTYPES


 def maybe_infer_to_datetimelike(
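
A dtype-based check of the kind shown in this hunk can be approximated with numpy alone. The sketch below is a simplification, not the pandas function: `maybe_castable_sketch` is a hypothetical name, `POSSIBLY_CAST` is an assumed stand-in for `POSSIBLY_CAST_DTYPES`, and plain dtype equality replaces `is_datetime64_ns_dtype` / `is_timedelta64_ns_dtype`.

```python
import numpy as np

# Assumption: an illustrative stand-in for pandas' POSSIBLY_CAST_DTYPES.
POSSIBLY_CAST = {
    np.dtype(t).name
    for t in ["O", "int8", "uint8", "int16", "uint16", "int32", "uint32", "int64", "uint64"]
}


def maybe_castable_sketch(dtype: np.dtype) -> bool:
    """Return False to force the non-fastpath for this dtype."""
    kind = dtype.kind
    if kind == "M":
        # Only nanosecond-resolution datetimes take the fast path.
        return dtype == np.dtype("datetime64[ns]")
    elif kind == "m":
        # Likewise for timedeltas.
        return dtype == np.dtype("timedelta64[ns]")
    return dtype.name not in POSSIBLY_CAST


checks = {
    name: maybe_castable_sketch(np.dtype(name))
    for name in ["float64", "int64", "datetime64[ns]", "datetime64[s]", "timedelta64[ns]"]
}
```

Taking the dtype rather than the array means the function never touches the data, so the `isinstance` assertion on the array is no longer needed inside it.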