Skip to content

TYP: Add TypeVars to NDFrame #30613

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 3, 2020
Merged

Conversation

topper-123
Copy link
Contributor

@topper-123 topper-123 commented Jan 2, 2020

Adding TypeVars to NDFrame methods that don't return optional values. This ensures that e.g. (DataFrame|Series).astype and (DataFrame|Series).copy have a known return type, which is nice.

Also adds PandasObject._ensure_type, which is a method used to solve the problem with optional return values, but where we actually know the return type. This is a helper method to help with the tediousness of optional return values experienced in #30565.

@jreback jreback added the Typing type annotations, mypy/pyright type checking label Jan 2, 2020
# TODO: Not sure if above is correct - need someone to confirm.
axis = self._get_axis_number(kwargs.pop("axis", self._stat_axis_name))
if fill_method is None:
data = self
else:
data = self.fillna(method=fill_method, limit=limit, axis=axis)
data = self._ensure_type(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why wouldn't you instead type .fillna?

Copy link
Contributor Author

@topper-123 topper-123 Jan 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.fillna is aready typed as Optional[FrameOrSeries]. The problem here is that data is typed to FrameORSeries and it can't accept reassigning type, per mypy's rules. the alternative to the above would be:

filled = self.fillna(method=fill_method, axis=axis, limit=limit)
assert isinstance(filled, type(self))
data = filled

which gets repetive with a code base the size of Pandas', IMO.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you saying the Optional is throwing it off or something else?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but by design from the mypy folks: Mypy complains if a parameter has its type redefined to a different/broader type.

So if ‘data’ has been defined as a type DataFrame, mypy will complain if we try to redefine ‘data’ to be Optional[DataFrame].

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh I c, since we have inplace args the return type must be Optional. this is super annoying. Let's deprecate inplace!

yeah ok i understand why you need this.

@jreback jreback added this to the 1.0 milestone Jan 3, 2020
@jreback jreback merged commit d2b4e78 into pandas-dev:master Jan 3, 2020
@jreback
Copy link
Contributor

jreback commented Jan 3, 2020

thanks @topper-123

Copy link

@ericchansen ericchansen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This commit added _ensure_type, which utilizes isinstance. However, it's reasonable that someone may want to create a DataFrame subclass. Therefore, _ensure_type should use issubclass.


Used by type checkers.
"""
assert isinstance(obj, type(self)), type(obj)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line breaks any code that creates a subclass of the pandas.core.frame.DataFrame. issubclass should be used instead.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, see also #31925 (maybe repeat your comment there). We should fix this for 1.0.2

ericchansen added a commit to ericchansen/pandas that referenced this pull request Mar 3, 2020
…1925)

Commit pandas-dev/pandas@6fd326d in
pull request pandas-dev#30613 added `_ensure_type`, which
utilizes `isinstance`. However, it is reasonable to assume that someone
may want to create a DataFrame subclass. Therefore, `_ensure_type`
should use `issubclass`.
ericchansen added a commit to ericchansen/pandas that referenced this pull request Mar 3, 2020
…1925)

Commit pandas-dev/pandas@6fd326d in
pull request pandas-dev#30613 added `_ensure_type`, which
utilizes `isinstance`. However, it is reasonable to assume that someone
may want to create a DataFrame subclass. Therefore, `_ensure_type`
should use `issubclass`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Typing type annotations, mypy/pyright type checking
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants