-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
TYP: pandas/core/frame.py #38416
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TYP: pandas/core/frame.py #38416
Conversation
suggest that you do reasonably sized pieces so this doesn't sit in review for too long - better multiple but smaller prs as this is a huge file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would prefer you don't mix easy and hard ones. e.g. a pass on bool / strings ones. and Axis / Level. These are easy to review. by mixing lots of things its very hard to give good scrutiny here.
would prefer that.
pandas/core/frame.py
Outdated
data, | ||
orient: str = "columns", | ||
dtype: Optional[Dtype] = None, | ||
columns: Optional[List] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i believe this should be Arraylike here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you mean columns
? The doctring says list (columns
is actually column labels so it seems to make sense)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should be Optional[List[Label]]
pandas/core/frame.py
Outdated
@@ -1440,7 +1456,7 @@ def to_numpy( | |||
|
|||
return result | |||
|
|||
def to_dict(self, orient="dict", into=dict): | |||
def to_dict(self, orient: str = "dict", into=dict) -> Union[Dict, List, Mapping]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mapping should be enough here, why did you add List?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with orient="records"
we return a list:
Lines 1589 to 1599 in 36c4d5c
elif orient == "records": | |
columns = self.columns.tolist() | |
rows = ( | |
dict(zip(columns, row)) | |
for row in self.itertuples(index=False, name=None) | |
) | |
return [ | |
into_c((k, maybe_box_datetimelike(v)) for k, v in row.items()) | |
for row in rows | |
] | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can at least get rid of Dict then - no reason to include Dict and Mapping unless I am overlooking something
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's right - done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with
orient="records"
we return a list:
we should probably overload using Literal to avoid Union return types. but not in this pass.
pandas/core/frame.py
Outdated
@@ -1161,7 +1168,7 @@ def __len__(self) -> int: | |||
""" | |||
return len(self.index) | |||
|
|||
def dot(self, other): | |||
def dot(self, other: Union[AnyArrayLike, DataFrame]) -> FrameOrSeriesUnion: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
be explict and add Series to these
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
pandas/core/frame.py
Outdated
@@ -3311,7 +3329,7 @@ def _box_col_values(self, values, loc: int) -> Series: | |||
# ---------------------------------------------------------------------- | |||
# Unsorted | |||
|
|||
def query(self, expr, inplace=False, **kwargs): | |||
def query(self, expr: str, inplace: bool = False, **kwargs) -> Optional[DataFrame]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't think this is correct for the output type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is right. From docstring:
DataFrame resulting from the provided query expression or None if ``inplace=True``.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no my point is that the doc-string is wrong. this can be a Series or Scalar. basically this can be anything. I would remove it for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok - removed (and will fix the docstring in follow-on, too)
pandas/core/frame.py
Outdated
def eval(self, expr, inplace=False, **kwargs): | ||
def eval( | ||
self, expr: str, inplace: bool = False, **kwargs | ||
) -> Optional[Union[AnyArrayLike, Scalar]]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the output type here is suspect
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From docstring:
Returns
-------
ndarray, scalar, pandas object, or None
The result of the evaluation or None if ``inplace=True``.
so does Optional[Union[AnyArrayLike, DataFrame, Scalar]]
work?
pandas/core/frame.py
Outdated
def apply(self, func, axis=0, raw=False, result_type=None, args=(), **kwds): | ||
def apply( | ||
self, | ||
func: Callable, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we are more specific on this, though it maybe difficult
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking - maybe we could do something like what was done for AggFuncType
pandas/core/reshape/pivot.py
Outdated
@@ -612,7 +612,7 @@ def crosstab( | |||
margins=margins, | |||
margins_name=margins_name, | |||
dropna=dropna, | |||
**kwargs, | |||
**kwargs, # type: ignore[arg-type] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't add type ignores
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay I'll skip pivot in this PR (this is a weird issue with typing **kwargs
)
For sure. I'll keep the hard ones that I've already attempted here so all the comments are in one place. I'll resubmit easy and follow-ons in separate PRs |
needs merge master and will look |
mypy green + all comments addressed (or moved out to orthogonal PRs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty good - just a few things
pandas/core/frame.py
Outdated
@@ -1440,7 +1456,7 @@ def to_numpy( | |||
|
|||
return result | |||
|
|||
def to_dict(self, orient="dict", into=dict): | |||
def to_dict(self, orient: str = "dict", into=dict) -> Union[Dict, List, Mapping]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can at least get rid of Dict then - no reason to include Dict and Mapping unless I am overlooking something
pandas/core/frame.py
Outdated
@@ -7802,7 +7828,7 @@ def apply( | |||
) | |||
return op.get_result() | |||
|
|||
def applymap(self, func, na_action: Optional[str] = None) -> DataFrame: | |||
def applymap(self, func: Callable, na_action: Optional[str] = None) -> DataFrame: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be typed more strictly than Callable? Subscripting Callable is much more useful to a reader if possible
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes - @jreback asked for this too. I'll put up a follow-on for this (to avoid expanding the scope of this PR)
This pull request is stale because it has been open for thirty days with no activity. Please update or respond to this comment if you're still interested in working on this. |
@arw2019 if you want to merge master and update (or close and open a new one ok too) |
@jreback @simonjayhawkins updated, CI green, comments addressed |
ok. maybe we should merge this. there has been plenty of discussion here. |
@jreback @WillAyd i'm merging later today if no objections. mypy is green and typing PRs easily become stale or get abandoned. There are no ignores, casts or asserts added here and no existing annotations changed/deleted. if any of the typing additions added here are problematic they will be picked up by mypy as more types are added. |
sure let's make sure this is rebased first |
yep did that locally and ran mypy before I commented. will push and merge on green |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Thanks @arw2019 |
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff
I plan to do a pass through the whole file - putting it up in case anybody has feedback on what I have