Regression: "invalid type <class 'pandas.core.dtypes.dtypes.CategoricalDtype'> for astype" #17780
Comments
It needs to be an instance of
And you can use |
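A minimal sketch of two spellings that do work, assuming a pandas version that exposes `CategoricalDtype` under `pandas.api.types` (0.21 or later). The sample Series is made up for illustration and is not taken from the original report:

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

# Illustrative data only; not the data from the original report.
s = pd.Series(["a", "b", "a"])

# Pass an *instance* of CategoricalDtype (note the parentheses) ...
by_instance = s.astype(CategoricalDtype())

# ... or use the string alias.
by_string = s.astype("category")

print(by_instance.dtype)
print(by_string.dtype)
```

Both calls produce a categorical Series whose categories are inferred from the data.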
this should just raise as passing an instance of a type is not valid (if it worked before that was accidental). Though you could show a deprecation warning (and have it work) if you want. |
If it was never supposed to work, then my opinion is that we can avoid the DeprecationWarning. I just used to find it cleaner to do |
I think this should work, it's similar to:
where |
@toobaz would do a PR for this? |
Aren't strings an example? In [2]: np.array([1, 2, 3]).astype(np.str_)
Out[2]:
array(['1', '2', '3'],
dtype='<U21')
I can try (but not very soon) |
I first thought this as well (so putting me in favor of allowing the case from the initial post), but it is not fully equivalent. That said, I don't really think that equivalence (or non-equivalence) matters that much in deciding whether we want to allow Whatever we do, it would be nice to deal with |
I didn't realize that the previous behavior was to convert to object, and not Categorical. I don't feel strongly about this at all, but I would prefer to raise with a nice error message.
Pro of error message: Catch cases where the user meant to instantiate the CDT with categories, but forgot
Con: Have to type an extra
I'll make a PR. |
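The "raise with a nice error message" option could look roughly like the sketch below. `require_dtype_instance` is a hypothetical helper written for illustration, and the wording of the message is an assumption; this is not the code that pandas itself uses.

```python
import inspect

import pandas as pd
from pandas.api.types import CategoricalDtype


def require_dtype_instance(dtype):
    """Hypothetical helper: reject the CategoricalDtype class itself."""
    if inspect.isclass(dtype) and issubclass(dtype, CategoricalDtype):
        # Assumed wording; points the user at the missing parentheses.
        raise TypeError(
            "Expected an instance of {name}, but got the class instead. "
            "Did you mean {name}()?".format(name=dtype.__name__)
        )
    return dtype


s = pd.Series(["a", "b", "a"])

# An instance passes through unchanged and can be handed to astype.
dtype = require_dtype_instance(CategoricalDtype(categories=["a", "b"]))
converted = s.astype(dtype)

# The bare class is caught before it ever reaches astype.
try:
    require_dtype_instance(CategoricalDtype)
except TypeError as exc:
    message = str(exc)
    print(message)
```

The check runs before any conversion, so a user who forgot to instantiate the dtype gets an explicit hint instead of a silent conversion to object.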
Should we catch all the other extension types as well? I can't imagine a case where people want |
Aha, I had missed that too. |
Sorry in advance if this is actually desired behaviour. The `whatsnew/v0.21.0.txt` only states "passing `categories` or `ordered` kwargs to :func:`Series.astype` is deprecated, in favor of passing a :ref:`CategoricalDtype <whatsnew_0210.enhancements.categorical_dtype>` (:issue:#17636)"... so I assume it is not.
Code Sample, a copy-pastable example if possible
Until recently (way more recently than this particular git commit):
Now:
Problem description
Again, not really sure this is undesired... maybe it would be enough to clarify a bit the whatsnew note.
Expected Output
Previous behaviour.
Output of `pd.show_versions()`
INSTALLED VERSIONS
commit: None
python: 3.5.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.0-3-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8
pandas: 0.21.0.dev+572.g8e89cb3e1
pytest: 3.0.6
pip: 9.0.1
setuptools: 33.1.1
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.18.1
pyarrow: None
xarray: None
IPython: 5.2.2
sphinx: None
patsy: 0.4.1+dev
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: 3.3.0
numexpr: 2.6.1
feather: 0.3.1
matplotlib: 2.0.0
openpyxl: 2.3.0
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: None
lxml: 3.7.1
bs4: 4.5.3
html5lib: 0.999999999
sqlalchemy: 1.0.15
pymysql: None
psycopg2: None
jinja2: 2.8
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None