TYP: Add annotation to assert_frame_equal and assert_series_equal (GH26302) #39504

Closed
84 changes: 44 additions & 40 deletions pandas/_testing/asserters.py
@@ -822,33 +822,33 @@ def assert_extension_array_equal(

# This could be refactored to use the NDFrame.equals method
def assert_series_equal(
left,
right,
check_dtype=True,
check_index_type="equiv",
check_series_type=True,
check_less_precise=no_default,
check_names=True,
check_exact=False,
check_datetimelike_compat=False,
check_categorical=True,
check_category_order=True,
check_freq=True,
check_flags=True,
rtol=1.0e-5,
atol=1.0e-8,
obj="Series",
left: Series,
right: Series,
Comment on lines +825 to +826
Member

See #29364 for a prior discussion on this. IIRC, these asserters are intended to raise AssertionError, not TypeError, if left or right is not of the correct type.

assert_series_equal is part of the public API, so types added here will be used when checking user code. This may lead to false positives, depending on how users are using this function.

Probably best to leave left and right untyped for now.
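
A minimal sketch of the concern raised above, using a hypothetical stand-in `Series` class and a hypothetical `assert_series_equal_sketch` helper rather than the real pandas code. It assumes the asserter validates its inputs at runtime and reports a wrong type as an AssertionError; annotating the parameters loosely (here as `object`) keeps a static checker from rejecting such calls in user code.

```python
from typing import Iterable


class Series:
    """Hypothetical stand-in for pandas.Series, for illustration only."""

    def __init__(self, values: Iterable[int]) -> None:
        self.values = list(values)


def _require_series(name: str, obj: object) -> Series:
    """Fail with AssertionError (not TypeError) when obj is not a Series."""
    if not isinstance(obj, Series):
        raise AssertionError(
            f"{name}: expected Series, found {type(obj).__name__} instead"
        )
    return obj


def assert_series_equal_sketch(left: object, right: object) -> None:
    """Hypothetical asserter: inputs are typed loosely and checked at runtime."""
    lhs = _require_series("left", left)
    rhs = _require_series("right", right)
    assert lhs.values == rhs.values, "Series values are different"
```

With `left: Series` instead of `left: object`, a type checker would flag a call such as `assert_series_equal_sketch([1, 2], Series([1, 2]))` in user code, even though the runtime behaviour (an AssertionError) may be exactly what a test expects.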

check_dtype: Union[bool, str] = True,
check_index_type: Union[bool, str] = "equiv",
check_series_type: bool = True,
check_less_precise: Union[bool, int] = no_default,
Contributor

@simonjayhawkins how shall we type this?

Contributor Author

Member

The revealed type is Any, so this is not yet an issue. Let's defer handling this until it becomes an issue, i.e. when we merge the type stub from pyright for lib.pyx.

check_names: bool = True,
check_exact: bool = False,
check_datetimelike_compat: bool = False,
check_categorical: bool = True,
check_category_order: bool = True,
check_freq: bool = True,
check_flags: bool = True,
rtol: float = 1.0e-5,
atol: float = 1.0e-8,
obj: str = "Series",
*,
check_index=True,
):
check_index: bool = True,
) -> None:
"""
Check that left and right Series are equal.

Parameters
----------
left : Series
right : Series
check_dtype : bool, default True
check_dtype : bool or {'equiv'}, default True
Whether to check the Series dtype is identical.
check_index_type : bool or {'equiv'}, default 'equiv'
Whether to check the Index class, dtype and inferred_type
@@ -954,7 +954,11 @@ def assert_series_equal(
obj=f"{obj}.index",
)

if check_freq and isinstance(left.index, (DatetimeIndex, TimedeltaIndex)):
if (
check_freq
and isinstance(left.index, (DatetimeIndex, TimedeltaIndex))
Contributor

hmm, what hits this?

Contributor Author

@avinashpancham Feb 5, 2021

On L959 we use ridx.freq. Initially we didn't check whether right.index was also a DatetimeIndex or TimedeltaIndex, so mypy couldn't be sure that the index had the freq attribute. This raised the following error:

error: "Index" has no attribute "freq"  [attr-defined]
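
The narrowing issue described above can be sketched with hypothetical stand-in classes (these are not the pandas implementations). The assumption is that only the datetime-like subclass defines `freq`, so a type checker accepts `right.freq` only after `right` has been narrowed with `isinstance` as well.

```python
from typing import Optional


class Index:
    """Hypothetical stand-in for pandas.Index: no ``freq`` attribute."""


class DatetimeIndex(Index):
    """Hypothetical stand-in for a datetime-like index that carries ``freq``."""

    def __init__(self, freq: Optional[str] = None) -> None:
        self.freq = freq


def check_freq_sketch(left: Index, right: Index) -> bool:
    """Compare ``freq`` only when both indexes are known to have one."""
    # Narrowing only ``left`` would leave ``right`` typed as Index, and
    # mypy would report: "Index" has no attribute "freq"  [attr-defined]
    if isinstance(left, DatetimeIndex) and isinstance(right, DatetimeIndex):
        return left.freq == right.freq
    # The comparison is skipped when either side is not datetime-like.
    return True
```

Note that the extra `isinstance` check also changes runtime behaviour slightly: when `right` is not datetime-like, the comparison is skipped instead of failing on the attribute access.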

and isinstance(right.index, (DatetimeIndex, TimedeltaIndex))
):
lidx = left.index
ridx = right.index
assert lidx.freq == ridx.freq, (lidx.freq, ridx.freq)
@@ -1071,25 +1075,25 @@ def assert_series_equal(

# This could be refactored to use the NDFrame.equals method
def assert_frame_equal(
left,
right,
check_dtype=True,
check_index_type="equiv",
check_column_type="equiv",
check_frame_type=True,
check_less_precise=no_default,
check_names=True,
by_blocks=False,
check_exact=False,
check_datetimelike_compat=False,
check_categorical=True,
check_like=False,
check_freq=True,
check_flags=True,
rtol=1.0e-5,
atol=1.0e-8,
obj="DataFrame",
):
left: DataFrame,
right: DataFrame,
check_dtype: Union[bool, str] = True,
Contributor

just bool

Contributor Author

On L128 we call assert_frame_equal with check_dtype=check_dtype, and the default value of check_dtype in that context is "equiv". If we don't type it as Union[bool, str], we get the following error:

error: Argument "check_dtype" to "assert_frame_equal" has incompatible type "Union[bool, str]"; expected "bool"  [arg-type]

Member

Could use Literal["equiv"] here instead of str for clarity.
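
A short sketch of the `Literal["equiv"]` suggestion. The `CheckDtype` alias and the `describe_check_dtype` helper are hypothetical names used only for illustration; they are not part of pandas. It assumes Python 3.8+, where `Literal` is available from `typing`.

```python
from typing import Literal, Union

# Hypothetical alias: a bool, or exactly the string "equiv".
CheckDtype = Union[bool, Literal["equiv"]]


def describe_check_dtype(check_dtype: CheckDtype = True) -> str:
    """Hypothetical helper showing the three values the annotation accepts."""
    if check_dtype == "equiv":
        return "equivalent dtypes accepted"
    return "strict dtype check" if check_dtype else "dtype check skipped"
```

With this annotation a type checker rejects a misspelled value such as `describe_check_dtype("equal")`, which `Union[bool, str]` would let through.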

check_index_type: Union[bool, str] = "equiv",
check_column_type: Union[bool, str] = "equiv",
check_frame_type: bool = True,
check_less_precise: Union[bool, int] = no_default,
check_names: bool = True,
by_blocks: bool = False,
check_exact: bool = False,
check_datetimelike_compat: bool = False,
check_categorical: bool = True,
check_like: bool = False,
check_freq: bool = True,
check_flags: bool = True,
rtol: float = 1.0e-5,
atol: float = 1.0e-8,
obj: str = "DataFrame",
) -> None:
"""
Check that left and right DataFrame are equal.

@@ -1104,7 +1108,7 @@ def assert_frame_equal(
First DataFrame to compare.
right : DataFrame
Second DataFrame to compare.
check_dtype : bool, default True
check_dtype : bool or {'equiv'}, default True
Whether to check the DataFrame dtype is identical.
check_index_type : bool or {'equiv'}, default 'equiv'
Whether to check the Index class, dtype and inferred_type