#25790 Updating type hints to Python3 syntax in pandas/core/array #25829
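For readers unfamiliar with the two hint styles this change converts between, the following is a minimal sketch. The function names are hypothetical and not taken from pandas; it only illustrates that comment-style hints are invisible at runtime while Python 3 annotations are part of the signature.

```python
from typing import Optional, Sequence, get_type_hints


def scale_comment_style(values, factor=None):
    # type: (Sequence[float], Optional[float]) -> list
    # Python 2 compatible style: the hint lives in a comment only.
    multiplier = 1.0 if factor is None else factor
    return [v * multiplier for v in values]


def scale_annotation_style(values: Sequence[float],
                           factor: Optional[float] = None) -> list:
    # Python 3 style: the hint is part of the function signature.
    multiplier = 1.0 if factor is None else factor
    return [v * multiplier for v in values]


# Only the annotated version exposes its hints to runtime introspection.
comment_hints = get_type_hints(scale_comment_style)
annotation_hints = get_type_hints(scale_annotation_style)
```

Both functions behave identically when called; the difference is only in where the type information is stored.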

Merged
merged 16 commits into from
Mar 30, 2019
15 changes: 9 additions & 6 deletions pandas/core/arrays/array_.py
@@ -1,4 +1,4 @@
from typing import Optional, Sequence, Union
from typing import Optional, Sequence, TYPE_CHECKING, Union

import numpy as np

@@ -11,11 +11,14 @@
from pandas import compat


def array(data, # type: Sequence[object]
dtype=None, # type: Optional[Union[str, np.dtype, ExtensionDtype]]
copy=True, # type: bool
):
# type: (...) -> ExtensionArray
if TYPE_CHECKING:
Contributor

Why is this needed? Conditional imports for type checking are not supportable.

Contributor Author

As I understand it, we need ExtensionArray to be defined to check this function's return type. Since ExtensionArray is defined elsewhere, we need to import it during type checking. I based this code on the approach described in PEP 484 — Runtime or type checking.

Is there a better, supportable way to do this?
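The pattern the author describes can be sketched as follows. This is an illustration only: `fractions.Fraction` is a stand-in for `ExtensionArray`, and `first_value` is a hypothetical function. The import inside the `if TYPE_CHECKING:` block is seen by static checkers but never executed at runtime, so the annotation has to be quoted.

```python
from typing import TYPE_CHECKING, Sequence

if TYPE_CHECKING:
    # Seen by static type checkers only; skipped when the module runs.
    # Fraction is a stand-in for a class defined in another module.
    from fractions import Fraction


def first_value(values: 'Sequence[Fraction]') -> 'Fraction':
    # The quoted annotations are stored as plain strings, so the
    # missing runtime import does not cause a NameError.
    return values[0]


return_hint = first_value.__annotations__["return"]
```

Because `TYPE_CHECKING` is `False` at runtime, the name `Fraction` is never bound in this module; only the type checker resolves it.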

Member

Yeah, this is the same conversation as #25802 (comment)

@jreback our support of 3.5 includes 3.5.0 and 3.5.1, right? TYPE_CHECKING wasn't introduced until 3.5.2
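One common way to cope with interpreters whose `typing` module lacks `TYPE_CHECKING` is a guarded import. This is a sketch of that general workaround, not something the PR is known to have adopted.

```python
try:
    from typing import TYPE_CHECKING
except ImportError:
    # Assumption: older typing modules (before 3.5.2) lack this constant.
    # Falling back to False matches its runtime value on newer versions.
    TYPE_CHECKING = False

if TYPE_CHECKING:
    # Imports needed only for annotations would go here.
    pass
```

On any interpreter where the import succeeds, the fallback branch is simply never taken.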

Contributor

I would instead simply use the ABCExtensionArray classes, which would work for this (and lots of other classes). We should maybe just do this generally. These are by definition available with no dependencies, for this exact purpose.
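The idea behind dependency-free ABC classes can be sketched as below. All names here (`_TypTagMeta`, `ABCExampleArray`, `ExampleArray`, `make_array`) are hypothetical, and this is a simplified illustration of the technique rather than pandas' actual implementation: the ABC recognises instances by a tag attribute, so annotating with it does not require importing the concrete class.

```python
class _TypTagMeta(type):
    """Metaclass whose isinstance() check looks at a tag on the instance."""

    def __instancecheck__(cls, inst):
        # No import of the concrete class is needed, only its tag.
        return getattr(inst, "_typ", None) in cls._accepted


class ABCExampleArray(metaclass=_TypTagMeta):
    # Tags of the concrete classes this ABC should recognise.
    _accepted = ("examplearray",)


class ExampleArray:
    # In a real layout this class would live in a different module.
    _typ = "examplearray"

    def __init__(self, values):
        self.values = list(values)


def make_array(data) -> ABCExampleArray:
    # The annotation refers to the ABC, not to the concrete class.
    return ExampleArray(data)
```

Here `isinstance(make_array([1, 2]), ABCExampleArray)` is true even though `ExampleArray` does not inherit from the ABC.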

from pandas.core.arrays.base import ExtensionArray


def array(data: Sequence[object],
dtype: Optional[Union[str, np.dtype, ExtensionDtype]] = None,
copy: bool = True,
) -> 'ExtensionArray':
"""
Create an array.

69 changes: 35 additions & 34 deletions pandas/core/arrays/base.py
@@ -214,8 +214,7 @@ def __getitem__(self, item):
"""
raise AbstractMethodError(self)

def __setitem__(self, key, value):
# type: (Union[int, np.ndarray], Any) -> None
def __setitem__(self, key: Union[int, np.ndarray], value: Any) -> None:
"""
Set one or more values inplace.

@@ -262,8 +261,7 @@ def __setitem__(self, key, value):
type(self), '__setitem__')
)

def __len__(self):
# type: () -> int
def __len__(self) -> int:
"""
Length of this array

@@ -287,32 +285,28 @@ def __iter__(self):
# Required attributes
# ------------------------------------------------------------------------
@property
def dtype(self):
# type: () -> ExtensionDtype
def dtype(self) -> ExtensionDtype:
"""
An instance of 'ExtensionDtype'.
"""
raise AbstractMethodError(self)

@property
def shape(self):
# type: () -> Tuple[int, ...]
def shape(self) -> Tuple[int, ...]:
"""
Return a tuple of the array dimensions.
"""
return (len(self),)

@property
def ndim(self):
# type: () -> int
def ndim(self) -> int:
"""
Extension Arrays are only allowed to be 1-dimensional.
"""
return 1

@property
def nbytes(self):
# type: () -> int
def nbytes(self) -> int:
"""
The number of bytes needed to store this object in memory.
"""
@@ -343,8 +337,7 @@ def astype(self, dtype, copy=True):
"""
return np.array(self, dtype=dtype, copy=copy)

def isna(self):
# type: () -> Union[ExtensionArray, np.ndarray]
def isna(self) -> Union['ExtensionArray', np.ndarray]:
Member

Might as well use the ABC here and throughout the module
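The quoted `'ExtensionArray'` in the annotation under review is a forward reference: inside its own class body the class name is not yet bound, so interpreters that evaluate annotations eagerly need the name as a string. A minimal sketch, using a hypothetical `Box` class as a stand-in:

```python
from typing import get_type_hints


class Box:
    """Hypothetical container, standing in for ExtensionArray."""

    def __init__(self, values):
        self.values = list(values)

    def copy(self) -> 'Box':
        # 'Box' is quoted because the name is not bound until the
        # class statement has finished executing.
        return Box(self.values)


# The raw annotation is stored as a string...
raw_hint = Box.copy.__annotations__["return"]
# ...and can be resolved to the class afterwards. Passing localns keeps
# the lookup independent of where this snippet is executed.
resolved_hint = get_type_hints(Box.copy, localns={"Box": Box})["return"]
```

Using an ABC instead, as suggested, trades the quoted self-reference for a name that is already importable when the class body runs.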

"""
A 1-D array indicating if each value is missing.

@@ -366,8 +359,7 @@ def isna(self):
"""
raise AbstractMethodError(self)

def _values_for_argsort(self):
# type: () -> np.ndarray
def _values_for_argsort(self) -> np.ndarray:
"""
Return values for sorting.

@@ -482,8 +474,11 @@ def dropna(self):
"""
return self[~self.isna()]

def shift(self, periods=1, fill_value=None):
# type: (int, object) -> ExtensionArray
def shift(
self,
periods: int = 1,
fill_value: object = None,
) -> 'ExtensionArray':
"""
Shift values by desired number.

@@ -598,8 +593,7 @@ def searchsorted(self, value, side="left", sorter=None):
arr = self.astype(object)
return arr.searchsorted(value, side=side, sorter=sorter)

def _values_for_factorize(self):
# type: () -> Tuple[np.ndarray, Any]
def _values_for_factorize(self) -> Tuple[np.ndarray, Any]:
"""
Return an array and missing value suitable for factorization.

@@ -623,8 +617,10 @@ def _values_for_factorize(self):
"""
return self.astype(object), np.nan

def factorize(self, na_sentinel=-1):
# type: (int) -> Tuple[np.ndarray, ExtensionArray]
def factorize(
self,
na_sentinel: int = -1,
) -> Tuple[np.ndarray, 'ExtensionArray']:
"""
Encode the extension array as an enumerated type.

@@ -726,8 +722,12 @@ def repeat(self, repeats, axis=None):
# Indexing methods
# ------------------------------------------------------------------------

def take(self, indices, allow_fill=False, fill_value=None):
# type: (Sequence[int], bool, Optional[Any]) -> ExtensionArray
def take(
self,
indices: Sequence[int],
allow_fill: bool = False,
fill_value: Any = None
) -> 'ExtensionArray':
"""
Take elements from an array.

@@ -816,8 +816,7 @@ def take(self, indices, allow_fill=False, fill_value=None):
# pandas.api.extensions.take
raise AbstractMethodError(self)

def copy(self, deep=False):
# type: (bool) -> ExtensionArray
def copy(self, deep: bool = False) -> 'ExtensionArray':
"""
Return a copy of the array.

@@ -853,8 +852,10 @@ def __repr__(self):
length=len(self),
dtype=self.dtype)

def _formatter(self, boxed=False):
# type: (bool) -> Callable[[Any], Optional[str]]
def _formatter(
self,
boxed: bool = False,
) -> Callable[[Any], Optional[str]]:
"""Formatting function for scalar values.

This is used in the default '__repr__'. The returned formatting
@@ -881,8 +882,7 @@ def _formatter(self, boxed=False):
return str
return repr

def _formatting_values(self):
# type: () -> np.ndarray
def _formatting_values(self) -> np.ndarray:
# At the moment, this has to be an array since we use result.dtype
"""
An array of values to be printed in, e.g. the Series repr
@@ -898,8 +898,10 @@ def _formatting_values(self):
# ------------------------------------------------------------------------

@classmethod
def _concat_same_type(cls, to_concat):
# type: (Sequence[ExtensionArray]) -> ExtensionArray
def _concat_same_type(
cls,
to_concat: Sequence['ExtensionArray']
) -> 'ExtensionArray':
"""
Concatenate multiple array

@@ -921,8 +923,7 @@ def _concat_same_type(cls, to_concat):
_can_hold_na = True

@property
def _ndarray_values(self):
# type: () -> np.ndarray
def _ndarray_values(self) -> np.ndarray:
"""
Internal pandas method for lossy conversion to a NumPy ndarray.

34 changes: 18 additions & 16 deletions pandas/core/arrays/datetimelike.py
@@ -58,8 +58,7 @@ def _get_attributes_dict(self):
return {k: getattr(self, k, None) for k in self._attributes}

@property
def _scalar_type(self):
# type: () -> Union[type, Tuple[type]]
def _scalar_type(self) -> Union[type, Tuple[type]]:
Member

Let's use Type consistently
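One reading of this suggestion is sketched below, with hypothetical stand-in classes and a hypothetical `ScalarType` alias. It spells the hint with `typing.Type`, and also uses `Tuple[Type, ...]` for a tuple of any length, since `Tuple[type]` describes a tuple of exactly one element.

```python
from typing import Tuple, Type, Union


class Period:
    """Hypothetical stand-in for the pandas scalar of the same name."""


class Timestamp:
    """Hypothetical stand-in for the pandas scalar of the same name."""


# A single class, or a tuple of classes of any length.
ScalarType = Union[Type, Tuple[Type, ...]]


def matches(value: object, scalar_type: ScalarType) -> bool:
    # isinstance accepts both shapes covered by ScalarType.
    return isinstance(value, scalar_type)
```

Both `matches(Period(), Period)` and `matches(Timestamp(), (Period, Timestamp))` type-check and return `True` under this alias.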

"""The scalar associated with this datelike

* PeriodArray : Period
@@ -68,8 +67,10 @@ def _scalar_type(self):
"""
raise AbstractMethodError(self)

def _scalar_from_string(self, value):
# type: (str) -> Union[Period, Timestamp, Timedelta, NaTType]
def _scalar_from_string(
self,
value: str,
) -> Union[Period, Timestamp, Timedelta, NaTType]:
"""
Construct a scalar type from a string.

@@ -89,8 +90,10 @@ def _scalar_from_string(self, value):
"""
raise AbstractMethodError(self)

def _unbox_scalar(self, value):
# type: (Union[Period, Timestamp, Timedelta, NaTType]) -> int
def _unbox_scalar(
self,
value: Union[Period, Timestamp, Timedelta, NaTType],
) -> int:
"""
Unbox the integer value of a scalar `value`.

@@ -109,8 +112,10 @@ def _unbox_scalar(self, value):
"""
raise AbstractMethodError(self)

def _check_compatible_with(self, other):
# type: (Union[Period, Timestamp, Timedelta, NaTType]) -> None
def _check_compatible_with(
self,
other: Union[Period, Timestamp, Timedelta, NaTType],
) -> None:
"""
Verify that `self` and `other` are compatible.

@@ -350,8 +355,7 @@ def __iter__(self):
return (self._box_func(v) for v in self.asi8)

@property
def asi8(self):
# type: () -> np.ndarray
def asi8(self) -> np.ndarray:
"""
Integer representation of the values.

@@ -402,8 +406,7 @@ def shape(self):
return (len(self),)

@property
def size(self):
# type: () -> int
def size(self) -> int:
"""The number of elements in this array."""
return np.prod(self.shape)

@@ -461,10 +464,9 @@ def __getitem__(self, key):

def __setitem__(
self,
key, # type: Union[int, Sequence[int], Sequence[bool], slice]
value, # type: Union[NaTType, Any, Sequence[Any]]
):
# type: (...) -> None
key: Union[int, Sequence[int], Sequence[bool], slice],
value: Union[NaTType, Any, Sequence[Any]]
) -> None:
# I'm fudging the types a bit here. "Any" above really depends
# on type(self). For PeriodArray, it's Period (or stuff coercible
# to a period in from_sequence). For DatetimeArray, it's Timestamp...
3 changes: 1 addition & 2 deletions pandas/core/arrays/datetimes.py
@@ -514,8 +514,7 @@ def _box_func(self):
return lambda x: Timestamp(x, freq=self.freq, tz=self.tz)

@property
def dtype(self):
# type: () -> Union[np.dtype, DatetimeTZDtype]
def dtype(self) -> Union[np.dtype, DatetimeTZDtype]:
"""
The dtype for the DatetimeArray.

6 changes: 2 additions & 4 deletions pandas/core/arrays/integer.py
@@ -452,8 +452,7 @@ def astype(self, dtype, copy=True):
return astype_nansafe(data, dtype, copy=None)

@property
def _ndarray_values(self):
# type: () -> np.ndarray
def _ndarray_values(self) -> np.ndarray:
"""Internal pandas method for lossy conversion to a NumPy ndarray.

This method is not part of the pandas interface.
@@ -509,8 +508,7 @@ def value_counts(self, dropna=True):

return Series(array, index=index)

def _values_for_argsort(self):
# type: () -> np.ndarray
def _values_for_argsort(self) -> np.ndarray:
"""Return values for sorting.

Returns
32 changes: 16 additions & 16 deletions pandas/core/arrays/period.py
@@ -183,8 +183,12 @@ def _simple_new(cls, values, freq=None, **kwargs):
return cls(values, freq=freq, **kwargs)

@classmethod
def _from_sequence(cls, scalars, dtype=None, copy=False):
# type: (Sequence[Optional[Period]], PeriodDtype, bool) -> PeriodArray
def _from_sequence(
cls,
scalars: Sequence[Optional[Period]],
dtype: PeriodDtype = None,
copy: bool = False,
) -> 'PeriodArray':
Member

ABCPeriodArray

if dtype:
freq = dtype.freq
else:
@@ -246,8 +250,7 @@ def _generate_range(cls, start, end, periods, freq, fields):
# -----------------------------------------------------------------
# DatetimeLike Interface

def _unbox_scalar(self, value):
# type: (Union[Period, NaTType]) -> int
def _unbox_scalar(self, value: Union[Period, NaTType]) -> int:
if value is NaT:
return value.value
elif isinstance(value, self._scalar_type):
@@ -258,8 +261,7 @@ def _unbox_scalar(self, value):
raise ValueError("'value' should be a Period. Got '{val}' instead."
.format(val=value))

def _scalar_from_string(self, value):
# type: (str) -> Period
def _scalar_from_string(self, value: str) -> Period:
return Period(value, freq=self.freq)

def _check_compatible_with(self, other):
@@ -540,14 +542,9 @@ def _sub_period(self, other):
@Appender(dtl.DatetimeLikeArrayMixin._addsub_int_array.__doc__)
def _addsub_int_array(
self,
other, # type: Union[ExtensionArray, np.ndarray[int]]
op # type: Callable[Any, Any]
):
# type: (...) -> PeriodArray

# TODO: ABCIndexClass is a valid type for other but had to be excluded
# due to length of Py2 compatability comment; add back in once migrated
# to Py3 syntax
other: Union[ExtensionArray, np.ndarray, ABCIndexClass],
Member

Any reason for removing the type for ndarray?

Contributor Author

During my testing at the time (before you added mypy.ini and the rest of the type checking framework to the project), mypy was throwing an error that suggested that ndarrays couldn't take a subscript, so I dropped it. I probably had something in my testing environment set up incorrectly (or maybe more strictly, not sure), because it doesn't throw that error with the current mypy.ini minus the blacklist. Will add it back.

Contributor Author

Wait, it wasn't mypy. The code_checks doctests throw:

...pandas/core/arrays/period.py", line 545, in PeriodArray
    other: Union[ExtensionArray, np.ndarray[int], ABCIndexClass],
TypeError: 'type' object is not subscriptable
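The error above is a runtime failure rather than a type-checker complaint: a subscripted annotation is an ordinary expression, and a class with no subscript support raises `TypeError` when that expression is evaluated. A minimal sketch with a hypothetical `PlainArray` class (no NumPy involved):

```python
class PlainArray:
    """Hypothetical stand-in for a class without subscript support."""


subscript_error = None
try:
    # Evaluated immediately, just as an eagerly evaluated annotation is.
    PlainArray[int]
except TypeError as exc:
    subscript_error = exc


def negate(other: PlainArray) -> PlainArray:
    # The unsubscripted class name is always a valid annotation.
    return other
```

Dropping the subscript, as the diff does by writing `np.ndarray` instead of `np.ndarray[int]`, avoids the runtime evaluation problem at the cost of a less precise hint.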

Member

Gotcha. Kind of strange this throws an error now but not before. Something to revisit later

op: Callable[[Any], Any]
) -> 'PeriodArray':
assert op in [operator.add, operator.sub]
if op is operator.sub:
other = -other
@@ -716,8 +713,11 @@ def _raise_on_incompatible(left, right):
# -------------------------------------------------------------------
# Constructor Helpers

def period_array(data, freq=None, copy=False):
# type: (Sequence[Optional[Period]], Optional[Tick], bool) -> PeriodArray
def period_array(
data: Sequence[Optional[Period]],
freq: Optional[Tick] = None,
copy: bool = False,
) -> PeriodArray:
"""
Construct a new PeriodArray from a sequence of Period scalars.
