Skip to content

TYP: annotations for internals, set_axis #32376

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Mar 2, 2020
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion pandas/core/frame.py
Original file line number Diff line number Diff line change
Expand Up @@ -3599,7 +3599,7 @@ def align(
see_also_sub=" or columns",
)
@Appender(NDFrame.set_axis.__doc__)
def set_axis(self, labels, axis=0, inplace=False):
def set_axis(self, labels, axis: Axis = 0, inplace: bool = False):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return type and labels?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return type is easy. for labels I'm ambivalent since it could be listlike, but should be Index

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i take it back, return type depends on inplace

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i take it back, return type depends on inplace

yeah, that is causing us problems without Literal.

return super().set_axis(labels, axis=axis, inplace=inplace)

@Substitution(**_shared_doc_kwargs)
Expand Down
5 changes: 3 additions & 2 deletions pandas/core/generic.py
Original file line number Diff line number Diff line change
Expand Up @@ -520,7 +520,7 @@ def _obj_with_exclusions(self: FrameOrSeries) -> FrameOrSeries:
""" internal compat with SelectionMixin """
return self

def set_axis(self, labels, axis=0, inplace=False):
def set_axis(self, labels, axis: Axis = 0, inplace: bool = False):
"""
Assign desired index to given axis.

Expand Down Expand Up @@ -561,7 +561,8 @@ def set_axis(self, labels, axis=0, inplace=False):
obj.set_axis(labels, axis=axis, inplace=True)
return obj

def _set_axis(self, axis, labels) -> None:
def _set_axis(self, axis: int, labels: Index) -> None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

labels should be the same as index_like parameter of ensure_index

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thoughts on Best Practices here? I'd like to push up the stack where we do ensure_index. so in this case the annotation is kind of "you should pass Index, but we're not that strict about it"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it makes sense to limit the types of internal functions to reduce the number of duplicated checks. maybe add a TODO comment here that the ensure_index added is to be eventually removed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"you should pass Index, but we're not that strict about it"

mypy will be

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mypy will be

im skeptical in this case because it goes through cython. This gets called when you do frame.columns = ["foo"]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

im skeptical in this case because it goes through cython

fair enough, cython is another pain point for us with type annotations.

labels = ensure_index(labels)
self._data.set_axis(axis, labels)
self._clear_item_cache()

Expand Down
16 changes: 7 additions & 9 deletions pandas/core/internals/managers.py
Original file line number Diff line number Diff line change
Expand Up @@ -164,15 +164,15 @@ def __nonzero__(self):
__bool__ = __nonzero__

@property
def shape(self):
def shape(self) -> Tuple[int, ...]:
return tuple(len(ax) for ax in self.axes)

@property
def ndim(self) -> int:
return len(self.axes)

def set_axis(self, axis, new_labels):
new_labels = ensure_index(new_labels)
def set_axis(self, axis: int, new_labels: Index):
# Caller is responsible for ensuring we have an Index object.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this isn't public, the comment isn't needed with type annotations

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is mypy smart enough to enforce this? I'm hoping to ween internals off of ensure_index completely, the comment is to keep track of where it has been done

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my only concern is keeping comments in-sync with code changes over time.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is mypy smart enough to enforce this?

as we add more type hints throughout the code. (the pace should be quicker but too much bikeshedding IMO) if mypy is green, job done! you can only get so far if the type hints are wrong.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the pace should be quicker but too much bikeshedding IMO

Fair enough, im happy to defer to you on these

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the pace should be quicker but too much bikeshedding IMO

Fair enough, im happy to defer to you on these

The point is, mypy is the checker, so the reviewer's don't need to be. (there does need to be some consideration of style, readability, maintenance etc)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(there does need to be some consideration of style, readability, maintenance etc)

and we could start to address those concerns with tools like https://pypi.org/project/flake8-annotations-complexity/

old_len = len(self.axes[axis])
new_len = len(new_labels)

Expand All @@ -184,7 +184,7 @@ def set_axis(self, axis, new_labels):

self.axes[axis] = new_labels

def rename_axis(self, mapper, axis, copy: bool = True, level=None):
def rename_axis(self, mapper, axis: int, copy: bool = True, level=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return type, mapper and level

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like level may be inconsistent with the caller's signature, so punting for now. xref #32349.

"""
Rename one of axes.

Expand Down Expand Up @@ -233,7 +233,7 @@ def _rebuild_blknos_and_blklocs(self):
self._blklocs = new_blklocs

@property
def items(self):
def items(self) -> Index:
return self.axes[0]

def _get_counts(self, f):
Expand Down Expand Up @@ -623,7 +623,7 @@ def comp(s, regex=False):
bm._consolidate_inplace()
return bm

def is_consolidated(self):
def is_consolidated(self) -> bool:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as an aside, could this be a property? and remove/inline _consolidate_check

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably could, would need to do some profiling. xref #32261.

"""
Return True if more than one block with the same dtype
"""
Expand Down Expand Up @@ -688,7 +688,7 @@ def get_numeric_data(self, copy: bool = False):
self._consolidate_inplace()
return self.combine([b for b in self.blocks if b.is_numeric], copy)

def combine(self, blocks, copy=True):
def combine(self, blocks, copy: bool = True):
""" return a new manager with the blocks """
if len(blocks) == 0:
return self.make_empty()
Expand Down Expand Up @@ -992,7 +992,6 @@ def delete(self, item):
self.blocks = tuple(
b for blkno, b in enumerate(self.blocks) if not is_blk_deleted[blkno]
)
self._shape = None
self._rebuild_blknos_and_blklocs()

def set(self, item, value):
Expand Down Expand Up @@ -1160,7 +1159,6 @@ def insert(self, loc: int, item, value, allow_duplicates: bool = False):

self.axes[0] = new_axis
self.blocks += (block,)
self._shape = None

self._known_consolidated = False

Expand Down
23 changes: 13 additions & 10 deletions pandas/core/series.py
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,6 @@
TYPE_CHECKING,
Any,
Callable,
Hashable,
Iterable,
List,
Optional,
Expand All @@ -23,7 +22,7 @@
from pandas._config import get_option

from pandas._libs import lib, properties, reshape, tslibs
from pandas._typing import Label
from pandas._typing import Axis, DtypeObj, Label
from pandas.compat.numpy import function as nv
from pandas.util._decorators import Appender, Substitution, doc
from pandas.util._validators import validate_bool_kwarg, validate_percentile
Expand Down Expand Up @@ -177,7 +176,7 @@ class Series(base.IndexOpsMixin, generic.NDFrame):

_typ = "series"

_name: Optional[Hashable]
_name: Label
_metadata: List[str] = ["name"]
_internal_names_set = {"index"} | generic.NDFrame._internal_names_set
_accessors = {"dt", "cat", "str", "sparse"}
Expand Down Expand Up @@ -391,9 +390,12 @@ def _can_hold_na(self):

_index = None

def _set_axis(self, axis, labels, fastpath: bool = False) -> None:
def _set_axis(self, axis: int, labels, fastpath: bool = False) -> None:
"""
Override generic, we want to set the _typ here.

This is called from the cython code when we set the `index` attribute
directly, e.g. `series.index = [1, 2, 3]`.
"""
if not fastpath:
labels = ensure_index(labels)
Expand All @@ -413,6 +415,7 @@ def _set_axis(self, axis, labels, fastpath: bool = False) -> None:

object.__setattr__(self, "_index", labels)
if not fastpath:
# The ensure_index call aabove ensures we have an Index object
self._data.set_axis(axis, labels)

def _update_inplace(self, result, **kwargs):
Expand All @@ -421,25 +424,25 @@ def _update_inplace(self, result, **kwargs):

# ndarray compatibility
@property
def dtype(self):
def dtype(self) -> DtypeObj:
"""
Return the dtype object of the underlying data.
"""
return self._data.dtype

@property
def dtypes(self):
def dtypes(self) -> DtypeObj:
"""
Return the dtype object of the underlying data.
"""
return self._data.dtype

@property
def name(self) -> Optional[Hashable]:
def name(self) -> Label:
return self._name

@name.setter
def name(self, value: Optional[Hashable]) -> None:
def name(self, value: Label) -> None:
if not is_hashable(value):
raise TypeError("Series.name must be a hashable type")
object.__setattr__(self, "_name", value)
Expand Down Expand Up @@ -689,7 +692,7 @@ def __array_ufunc__(
inputs = tuple(extract_array(x, extract_numpy=True) for x in inputs)
result = getattr(ufunc, method)(*inputs, **kwargs)

name: Optional[Hashable]
name: Label
if len(set(names)) == 1:
name = names[0]
else:
Expand Down Expand Up @@ -3983,7 +3986,7 @@ def rename(
see_also_sub="",
)
@Appender(generic.NDFrame.set_axis.__doc__)
def set_axis(self, labels, axis=0, inplace=False):
def set_axis(self, labels, axis: Axis = 0, inplace: bool = False):
return super().set_axis(labels, axis=axis, inplace=inplace)

@Substitution(**_shared_doc_kwargs)
Expand Down