This code raises an error:
    >>> s = pd.Series([0, pd.NA], dtype='Int8')
    >>> s.astype('string')
    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-12-e5922a9c01ed> in <module>
    ----> 1 s.astype('string')

    /opt/tljh/user/envs/py37-sf/lib/python3.7/site-packages/pandas/core/generic.py in astype(self, dtype, copy, errors)
       5696         else:
       5697             # else, only a single dtype is given
    -> 5698             new_data = self._data.astype(dtype=dtype, copy=copy, errors=errors)
       5699         return self._constructor(new_data).__finalize__(self)
       5700

    /opt/tljh/user/envs/py37-sf/lib/python3.7/site-packages/pandas/core/internals/managers.py in astype(self, dtype, copy, errors)
        580
        581     def astype(self, dtype, copy: bool = False, errors: str = "raise"):
    --> 582         return self.apply("astype", dtype=dtype, copy=copy, errors=errors)
        583
        584     def convert(self, **kwargs):

    /opt/tljh/user/envs/py37-sf/lib/python3.7/site-packages/pandas/core/internals/managers.py in apply(self, f, filter, **kwargs)
        440                 applied = b.apply(f, **kwargs)
        441             else:
    --> 442                 applied = getattr(b, f)(**kwargs)
        443             result_blocks = _extend_blocks(applied, result_blocks)
        444

    /opt/tljh/user/envs/py37-sf/lib/python3.7/site-packages/pandas/core/internals/blocks.py in astype(self, dtype, copy, errors)
        605         if self.is_extension:
        606             # TODO: Should we try/except this astype?
    --> 607             values = self.values.astype(dtype)
        608         else:
        609             if issubclass(dtype.type, str):

    /opt/tljh/user/envs/py37-sf/lib/python3.7/site-packages/pandas/core/arrays/integer.py in astype(self, dtype, copy)
        463             kwargs = {}
        464
    --> 465         data = self.to_numpy(dtype=dtype, **kwargs)
        466         return astype_nansafe(data, dtype, copy=False)
        467

    /opt/tljh/user/envs/py37-sf/lib/python3.7/site-packages/pandas/core/arrays/masked.py in to_numpy(self, dtype, copy, na_value)
        130                 )
        131             # don't pass copy to astype -> always need a copy since we are mutating
    --> 132             data = self._data.astype(dtype)
        133             data[self._mask] = na_value
        134         else:

    TypeError: data type not understood
Converting a nullable int to nullable string requires a double conversion:
    s = pd.Series([0, 1], dtype='Int8')
    s.astype(str).astype('string')
Likewise, converting a nullable string to nullable int requires two steps:
    s = pd.Series(['0', '1'], dtype='string')
    s.astype('int8').astype('Int8')
Moreover, if the nullable string series has NAs, converting to a nullable int becomes much harder (I don't know if there is a simpler way):
    s = pd.Series(['0', pd.NA], dtype='string')
    s.astype('object').replace(pd.NA, np.nan).astype('float64').astype('Int8')
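As a possible stopgap for the multi-step chains above, here is a minimal sketch that converts element by element and rebuilds the array with `pd.array`. The helper names `string_to_nullable_int` and `nullable_int_to_string` are hypothetical (they are not part of pandas), and the approach assumes every non-missing string parses with `int()`. It is likely slower than a native `astype` on large data.

```python
import pandas as pd


def string_to_nullable_int(s, dtype="Int8"):
    """Hypothetical helper: 'string' Series -> nullable integer Series."""
    # Missing entries become None, which pd.array stores as <NA>.
    values = [None if pd.isna(x) else int(x) for x in s]
    return pd.Series(pd.array(values, dtype=dtype), index=s.index)


def nullable_int_to_string(s):
    """Hypothetical helper: nullable integer Series -> 'string' Series."""
    # Keep missing entries as pd.NA instead of the literal text '<NA>'.
    values = [pd.NA if pd.isna(x) else str(x) for x in s]
    return pd.Series(pd.array(values, dtype="string"), index=s.index)


s = pd.Series(["0", pd.NA], dtype="string")
as_int = string_to_nullable_int(s)          # nullable Int8, NA preserved
round_trip = nullable_int_to_string(as_int)  # back to 'string', NA preserved
print(as_int)
print(round_trip)
```

Both helpers keep the original index and preserve missing values, so the Int -> string -> Int round trip does not go through `float64` or `object` dtype.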
It would be nice to convert directly from nullable int to nullable string, and vice versa, in one step.
I'd like for all the following conversions to work:
    s = pd.Series([0, 1], dtype='Int8')
    s.astype('string')  # currently raises

    s = pd.Series(['0', '1'], dtype='string')
    s.astype('Int8')  # currently raises

    s = pd.Series(['0', pd.NA], dtype='string')
    s.astype('Int8')  # currently raises
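To check which of these conversions work on a given installation, a small probe like the following could be used. This is only a sketch: `can_astype` is a hypothetical helper, and it assumes a failing conversion raises `TypeError` or `ValueError`, as in the traceback above. The results depend on the installed pandas version, so no expected output is shown.

```python
import pandas as pd


def can_astype(series, target):
    """Hypothetical helper: True if series.astype(target) does not raise."""
    try:
        series.astype(target)
    except (TypeError, ValueError):
        return False
    return True


# The three conversions requested in this issue.
cases = {
    "Int8 -> string": (pd.Series([0, 1], dtype="Int8"), "string"),
    "string -> Int8": (pd.Series(["0", "1"], dtype="string"), "Int8"),
    "string with NA -> Int8": (pd.Series(["0", pd.NA], dtype="string"), "Int8"),
}

results = {label: can_astype(s, target) for label, (s, target) in cases.items()}

print("pandas", pd.__version__)
for label, ok in results.items():
    print(f"{label}: {'works' if ok else 'raises'}")
```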
pd.show_versions()
INSTALLED VERSIONS
commit : None
python : 3.7.4.final.0
python-bits : 64
OS : Linux
OS-release : 4.15.0-1058-aws
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : C.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.0.1
numpy : 1.18.1
pytz : 2019.3
dateutil : 2.8.1
pip : 20.0.2
setuptools : 45.1.0.post20200119
Cython : 0.29.13
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.11.1
IPython : 7.8.0
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : None
matplotlib : 3.1.0
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 0.15.1
pytables : None
pytest : None
pyxlsb : None
s3fs : None
scipy : 1.3.1
sqlalchemy : None
tables : None
tabulate : 0.8.3
xarray : None
xlrd : 1.2.0
xlwt : None
xlsxwriter : None
numba : 0.45.1
Thanks. Duplicate of #31204 I think.
@TomAugspurger, sorry I couldn't find the previous issue #31204. I'll comment there to add the roundtrip argument (Int -> string -> Int).
pd.StringDtype()