TYP: remove #type: ignore for pd.array constructor #33706

Merged

@simonjayhawkins simonjayhawkins commented Apr 21, 2020

pandas\core\arrays\categorical.py:481: error: Argument 1 to "array" has incompatible type "Categorical"; expected "Sequence[object]"

@simonjayhawkins simonjayhawkins added the Typing type annotations, mypy/pyright type checking label Apr 21, 2020
@@ -275,28 +276,28 @@ def array(
):
dtype = data.dtype

-data = extract_array(data, extract_numpy=True)
+_data = extract_array(data, extract_numpy=True)
Member

Why is this needed? What is returned here should still be typed the same as the original data

Member Author

> What is returned here should still be typed the same as the original data

IIUC the return type is a numpy array, so data should be narrowed to just np.ndarray. That is not the same type as the original data, which is annotated as Union[Sequence[object], AnyArrayLike].

The crux of the issue is that np.ndarray resolves to Any.

data is typed as Union[Sequence[object], AnyArrayLike], which is equivalent to Union[Sequence[object], Any, Index, Series, ExtensionArray].

Because np.ndarray resolves to Any, mypy cannot narrow the type returned by extract_array, for a few reasons.

  1. If we overload extract_array, mypy reports error: Overloaded function signature 2 will never be matched: signature 1's parameter type(s) are the same or broader.
@overload
def extract_array(
    obj: Union[Sequence[object], AnyArrayLike], extract_numpy: bool = False
) -> ArrayLike:
    ...


@overload
def extract_array(obj: Any, extract_numpy: bool = False) -> Any:
    ...


def extract_array(obj: Any, extract_numpy: bool = False) -> Union[Any, ArrayLike]:
    ...  # implementation body omitted in this comment

AnyArrayLike includes np.ndarray, which resolves to Any, so the first signature already accepts everything and the second signature is unreachable.

  2. Even if np.ndarray did not resolve to Any, without Literal (py 3.8) we can't overload on extract_numpy to return a numpy array, i.e. to implicitly cast the return type of extract_array so that extract_array(Union[Sequence[object], AnyArrayLike], extract_numpy=True) -> np.ndarray (this requires additional overloads to those above).

  3. Without the overloads, we have just a union return type from extract_array. We can't cast from Union[Sequence[object], AnyArrayLike] to Union[Any, ArrayLike] without a mypy error, pandas\core\construction.py:279: error: Redundant cast to "Any" (unless we remove warn_redundant_casts = True from setup.cfg).
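
For reference, a minimal sketch of what overloading on the extract_numpy flag could look like once typing.Literal (Python 3.8+) is available. NumpyLike, PandasArrayLike and extract_array_sketch are hypothetical stand-ins invented for this illustration, not pandas or numpy names, and this is not what the PR does.

```python
from typing import Literal, overload


class NumpyLike:
    """Hypothetical stand-in for np.ndarray."""

    def __init__(self, values):
        self.values = list(values)


class PandasArrayLike:
    """Hypothetical stand-in for PandasArray, wrapping a NumpyLike."""

    def __init__(self, values):
        self._ndarray = NumpyLike(values)

    def to_numpy(self) -> NumpyLike:
        return self._ndarray


@overload
def extract_array_sketch(
    obj: PandasArrayLike, extract_numpy: Literal[True]
) -> NumpyLike:
    ...


@overload
def extract_array_sketch(
    obj: PandasArrayLike, extract_numpy: Literal[False] = ...
) -> PandasArrayLike:
    ...


def extract_array_sketch(obj, extract_numpy=False):
    # The flag's literal value selects the overload, and so the return type.
    if extract_numpy and isinstance(obj, PandasArrayLike):
        return obj.to_numpy()
    return obj


wrapped = PandasArrayLike([1, 2, 3])
as_numpy = extract_array_sketch(wrapped, True)
unchanged = extract_array_sketch(wrapped)
```

A call that passes a plain bool variable for the flag would match neither overload, so a real version would probably need a third, broader overload.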

So in summary, _data has a separate type from data: it resolves to Any, since it is a numpy array after the extract_array call, and we revert to dynamic typing for the rest of the method.
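
A small sketch of the rebinding pattern described above, assuming a helper that returns Any. extract_sketch and array_sketch are hypothetical names used only for this illustration; they are not the pandas functions.

```python
from typing import Any, Sequence


def extract_sketch(obj: Any) -> Any:
    """Hypothetical stand-in for extract_array: the result is typed Any."""
    return list(obj) if isinstance(obj, tuple) else obj


def array_sketch(data: Sequence[object]) -> int:
    # Assigning the result back to `data` would keep the declared
    # Sequence[object] type, so later array-only operations would be
    # checked against that declaration.
    # Binding to a new name gives the result its own inferred type,
    # which is Any here, so the rest of the body is dynamically typed.
    _data = extract_sketch(data)
    return len(_data)


length = array_sketch((1, 2, 3))
```

The trade-off is the one noted in the summary: the checker stops complaining, but it also stops checking anything done with _data.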

return result


-def extract_array(obj, extract_numpy: bool = False):
+def extract_array(obj: Any, extract_numpy: bool = False) -> Union[Any, ArrayLike]:
Member

does it work to make the Any a Sequence[object] ?

Member Author

extract_array accepts any object and returns it unchanged if it can't extract an array.

    if isinstance(obj, (ABCIndexClass, ABCSeries)):
        obj = obj.array
    if extract_numpy and isinstance(obj, ABCPandasArray):
        obj = obj.to_numpy()
    return obj

We could look into changing this (raise instead of returning the object?), but that would not change the issue of numpy arrays resolving to Any.
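
To make the pass-through behaviour concrete, here is a self-contained sketch that mirrors the three paths in the snippet above. WrappedArray, SeriesLike and extract_array_sketch are hypothetical stand-ins for PandasArray, Series/Index and extract_array; a plain list stands in for the numpy array.

```python
from typing import Any


class WrappedArray:
    """Hypothetical stand-in for PandasArray."""

    def __init__(self, values):
        self._values = list(values)

    def to_numpy(self):
        # A real PandasArray would return an np.ndarray; a list stands in.
        return self._values


class SeriesLike:
    """Hypothetical stand-in for a Series or Index with an `.array`."""

    def __init__(self, values):
        self.array = WrappedArray(values)


def extract_array_sketch(obj: Any, extract_numpy: bool = False) -> Any:
    if isinstance(obj, SeriesLike):
        obj = obj.array  # unwrap the container
    if extract_numpy and isinstance(obj, WrappedArray):
        obj = obj.to_numpy()  # unwrap again, down to the raw values
    return obj  # anything else is returned unchanged


series = SeriesLike([1, 2, 3])
unwrapped = extract_array_sketch(series)
raw = extract_array_sketch(series, extract_numpy=True)
passthrough = extract_array_sketch("not an array")
```

Because the last path returns whatever was passed in, the parameter has to accept any object, which is why narrowing it to Sequence[object] would not describe the current behaviour.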

Member Author

thinking some more, will revert this for now.

@jreback jreback added this to the 1.1 milestone Apr 22, 2020
Contributor

jreback commented Apr 25, 2020

can u rebase

@jreback jreback merged commit 64d544c into pandas-dev:master Apr 26, 2020
Contributor

jreback commented Apr 26, 2020

thanks @simonjayhawkins

@simonjayhawkins simonjayhawkins deleted the remove-ignore-for-array-constructor branch April 27, 2020 07:29
rhshadrach pushed a commit to rhshadrach/pandas that referenced this pull request May 10, 2020