BUG: Never end up with numpy string dtypes #39566

Open
3 tasks done
phofl opened this issue Feb 2, 2021 · 6 comments
Labels
Deprecate (Functionality to remove in pandas), Dtype Conversions (Unexpected or buggy dtype conversions), Strings (String extension data type and string data)

Comments

@phofl
Member

phofl commented Feb 2, 2021

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.


Note: Please read this guide detailing how to provide the necessary information for us to reproduce your bug.

Code Sample, a copy-pastable example

from pandas import DataFrame

DataFrame(["foo", "bar", "baz"]).astype(bytes)

As discussed in #39484, we do not want to end up with numpy string dtypes here. The result should have dtype object.
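For reference, a minimal sketch of how the resulting dtype can be inspected. The explicit `dtype=object` on the constructor is an assumption added here so the sketch does not depend on the default string dtype of the installed pandas version; the variable names are illustrative only, and the printed result may differ between pandas versions.

```python
import numpy as np
import pandas as pd

# Assumption: start from object-dtype strings, independent of the
# default string dtype of the installed pandas version.
df = pd.DataFrame(["foo", "bar", "baz"], dtype=object)

result = df.astype(bytes)
dtype = result.dtypes.iloc[0]

# A numpy bytes-string dtype has kind "S" (for example S3).
is_numpy_bytes = isinstance(dtype, np.dtype) and dtype.kind == "S"
print(dtype, is_numpy_bytes)
```

If `is_numpy_bytes` prints as `True`, the frame holds the numpy string dtype that this issue proposes to avoid.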

@phofl added the Bug, Needs Triage, and Strings labels and removed the Needs Triage label Feb 2, 2021
@jreback added this to the 1.3 milestone Feb 2, 2021
@jreback added the Dtype Conversions label Feb 2, 2021
@jreback
Contributor

jreback commented Feb 2, 2021

This is a breaking change, but the current behavior was never intended, and we have zero support for S (or B) types as native pandas types.

@jorisvandenbossche
Member

Can we deprecate this in some way first?

And what exactly do you want to change? Only the resulting dtype to be object (but still do the conversion to bytes), or actually disallow casting to bytes altogether?

@jorisvandenbossche
Member

One option would be to deprecate astype(bytes) or astype(np.dtype("S")), and let users use another method to actually convert values to bytes (e.g. s.map(bytes) might also do that?).
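A hedged sketch of what such an alternative could look like. Note that in Python 3 `bytes("foo")` without an encoding raises a `TypeError`, so this sketch uses `Series.str.encode` and an explicit `str.encode` call instead of `s.map(bytes)`. The UTF-8 encoding and the `dtype=object` input are assumptions made for illustration.

```python
import pandas as pd

# Assumption: object-dtype input strings and UTF-8 as the target encoding.
s = pd.Series(["foo", "bar", "baz"], dtype=object)

# Element-wise encoding through the .str accessor; the bytes values
# are stored as Python objects rather than in a numpy "S" array.
encoded = s.str.encode("utf-8")

# Equivalent element-wise mapping with an explicit encode call.
mapped = s.map(lambda value: value.encode("utf-8"))

print(encoded.dtype)
print(encoded.tolist())
```

Both variants keep the values as Python `bytes` objects, which is the outcome the issue asks for from `astype(bytes)`.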

@simonjayhawkins
Member

removing the milestone

@simonjayhawkins removed this from the 1.3 milestone Jun 11, 2021
@mroeschke added the Deprecate label and removed the Bug label Aug 15, 2021
@jbrockmendel
Member

Related: when we pass a np.dtype("S") ndarray to Series or DataFrame, do we cast to object? At the moment we do this for only a few variants of the DataFrame constructor, and not for Series.
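A small sketch for checking the constructor behavior on a given pandas version. The three construction variants below are chosen for illustration and are not an exhaustive list of the constructor paths; no particular output is claimed, since the result depends on the installed version.

```python
import numpy as np
import pandas as pd

# A numpy bytes-string array; numpy infers dtype S3 for these values.
arr = np.array([b"foo", b"bar", b"baz"])

# Illustrative construction variants (not exhaustive).
ser = pd.Series(arr)
df_2d = pd.DataFrame(arr.reshape(-1, 1))
df_dict = pd.DataFrame({"a": arr})

print("input:", arr.dtype)
print("Series:", ser.dtype)
print("DataFrame (2D ndarray):", df_2d.dtypes.iloc[0])
print("DataFrame (dict of ndarray):", df_dict.dtypes.iloc[0])
```

Any variant that reports an `S` dtype rather than `object` is a path where the numpy string dtype is carried into pandas unchanged.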

@jbrockmendel
Member

Resolving this would fix #52373.

6 participants