Skip to content

TYP: Made ffill_indexer arg-type ndarray #44893

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Dec 28, 2021
Merged
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 2 additions & 10 deletions pandas/core/reshape/merge.py
Original file line number Diff line number Diff line change
Expand Up @@ -1648,16 +1648,8 @@ def get_result(self) -> DataFrame:
right_join_indexer: np.ndarray | None

if self.fill_method == "ffill":
# error: Argument 1 to "ffill_indexer" has incompatible type
# "Optional[ndarray]"; expected "ndarray"
left_join_indexer = libjoin.ffill_indexer(
left_indexer # type: ignore[arg-type]
)
# error: Argument 1 to "ffill_indexer" has incompatible type
# "Optional[ndarray]"; expected "ndarray"
right_join_indexer = libjoin.ffill_indexer(
right_indexer # type: ignore[arg-type]
)
left_join_indexer = libjoin.ffill_indexer(np.asarray(left_indexer))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please avoid calling asarray. This adds unnecessary overhead and does not guarantee that we get an array back if None is passed into it.

You can use assert if it is guaranteed, that left_indexer is not None here. Adjusting the return type of _get_join_info would be even better

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@phofl Hi Patrick. I see what you mean about the overhead of np.asarray(). None seems to come from L957 and L962 in _get_join_info which comes from return value of _left_join_on_index L2071. Not sure what a good alternative np.ndarray alternative to None would be?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we reason about the code s.t. we know we can never get here with left_indexer being None? if so, then can do a left_indexer = cast(np.ndarray, left_indexer). otherwise

if left_indexer is None:
     raise AppropriateException("Useful message")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jbrockmendel thank you for the suggestion. I was unsure if it was okay to use typing.cast() Do these changes look okay?

right_join_indexer = libjoin.ffill_indexer(np.asarray(right_indexer))
else:
left_join_indexer = left_indexer
right_join_indexer = right_indexer
Expand Down