-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
REGR: replace with multivalued regex raising #40604
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
REGR: replace with multivalued regex raising #40604
Conversation
pandas/core/internals/blocks.py
Outdated
# is not indexable | ||
m = masks[i][ | ||
mask_pos : mask_pos + blk.shape[0] | ||
] # type: ignore[index] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think if you define mib = masks[i]
or something that might make this type:ignore unnecessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good call!
pandas/core/internals/blocks.py
Outdated
if blk.ndim == 1: | ||
m = masks[i] | ||
else: | ||
# GH-39338: _replace_coerce can split a block, so we |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if it gets split, will it be all-single-column?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good question will look into that, would help simplify logic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
adjusted to account for this - split is happening in split_and_operate
, which looks to always give single column result
|
||
# GH-39338: _replace_coerce can split a block into | ||
# single-column blocks, so track the index so we know | ||
# where to index into the mask |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
may be hard to backport as this code has changed a lot (though possible this particular section has not).
thanks @mzeitlin11 |
@meeseeksdev backport 1.2.x |
This comment has been minimized.
This comment has been minimized.
) Co-authored-by: Matthew Zeitlin <[email protected]>
Issue comes from the first pair in the
regex
arg hitting the simple replace path, which causes theObjectBlock
to be split. From here, the mask indexing will be wrong, but that will not cause an error unless a later pair hits_replace_regex
, which actually uses the mask.