BUG: Regression in astype not casting to bytes #39484
def test_astype_bytes(self):
    # GH#39474
    result = DataFrame(["foo", "bar", "baz"]).astype(bytes)
    assert result.dtypes[0] == np.dtype("S3")
this is not correct, it should be object
This was np.dtype("S3") on 1.1.5 too, so wrong there too?
Could remove the test to avoid testing wrong behavior?
hmm yeah this is definitely wrong, we should never have a numpy string type in output.
We have unfortunately been doing this for many releases ..
grr really?
ok @phofl let's fix this for 1.3 (the S dtypes) and backport this change.
Ok all right. Will open issue as follow up
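For readers following the thread above, here is a minimal NumPy-only sketch of the distinction being discussed: a fixed-width bytes dtype (`S3`) versus an `object` array holding Python `bytes`. It does not call pandas, so it makes no claim about what any particular pandas release returns from `astype(bytes)`; the variable names are illustrative only.

```python
import numpy as np

# Three ASCII strings stored as Python objects, similar to how a
# DataFrame column of strings is typically held.
values = np.array(["foo", "bar", "baz"], dtype=object)

# NumPy's own astype(bytes) picks a fixed-width bytes dtype sized to
# the longest element, which is where "S3" comes from.
as_fixed_width = values.astype(bytes)

# Casting back to object keeps the bytes values but drops the
# fixed-width NumPy string dtype, which is what the reviewers
# say the output should be.
as_object = as_fixed_width.astype(object)

print(as_fixed_width.dtype, as_object.dtype)
```

The follow-up issue mentioned in the thread concerns which of these two dtypes the pandas result should carry.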
@@ -2247,7 +2247,7 @@ class ObjectBlock(Block):
     _can_hold_na = True

     def _maybe_coerce_values(self, values):
-        if issubclass(values.dtype.type, (str, bytes)):
+        if issubclass(values.dtype.type, str):
why are you changing this?
I added this in #39065, which caused the regression. The replace issue will have to be handled somewhere else.
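As a rough illustration of why the one-line change matters, the sketch below evaluates the same `issubclass` test against NumPy's unicode and bytes dtypes. The helper `matches_check` is hypothetical and exists only for this example; it reproduces the condition from the diff, not the rest of `_maybe_coerce_values`.

```python
import numpy as np

def matches_check(dtype, types):
    """Hypothetical helper: the issubclass condition from the diff."""
    return issubclass(np.dtype(dtype).type, types)

# Condition with (str, bytes), as it stood after #39065.
old_unicode = matches_check("U3", (str, bytes))
old_bytes = matches_check("S3", (str, bytes))

# Condition with only str, as proposed in this PR.
new_unicode = matches_check("U3", str)
new_bytes = matches_check("S3", str)

print(old_unicode, old_bytes, new_unicode, new_bytes)
```

With the narrower condition, `S`-dtype values no longer enter the branch, while unicode values still do.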
oh i
@phofl if you'd merge master
merged master
Opened #39566
thanks @phofl
@meeseeksdev backport 1.2.x
Owee, I'm MrMeeseeks, Look at me. There seem to be a conflict, please backport manually. Here are approximate instructions:
And apply the correct labels and milestones. Congratulation you did some good work ! Hopefully your backport PR will be tested by the continuous integration and merged soon! If these instruction are inaccurate, feel free to suggest an improvement.
(cherry picked from commit 300d1fc)
Not sure if this is the right place