Int64Dtype in read_csv leads to unexpected values #26259
Labels: Bug; Dtype Conversions (unexpected or buggy dtype conversions); ExtensionArray (extending pandas with custom dtypes or arrays); IO CSV (read_csv, to_csv); NA - MaskedArrays (related to pd.NA and nullable extension arrays)
Code Sample, a copy-pastable example if possible
Problem description
I would like to read CSV files containing nullable (large) integers into a DataFrame. The integers represent nanoseconds since the UNIX epoch (1970-01-01). Using the Int64Dtype introduced in 0.24.0 seems like the way to go. I quote from the FAQ:
https://pandas.pydata.org/pandas-docs/stable/user_guide/gotchas.html#nan-integer-na-values-and-na-type-promotions
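For context, here is a minimal sketch of why integers of this size are fragile. It assumes the unexpected values come from an intermediate float64 conversion somewhere in the parsing path, which the text of this report does not confirm. The timestamp and the helper name `float64_round_trip` are made up for illustration and are not taken from the original code sample. The sketch uses only plain Python, so it shows the arithmetic and not pandas behaviour.

```python
# Hypothetical timestamp in nanoseconds since the UNIX epoch
# (illustrative value, not taken from the original report).
ts_ns = 1_556_559_573_141_592_653

# A float64 has a 53-bit significand, so every integer up to 2**53
# is representable exactly; above that, gaps appear between values.
exact_limit = 2 ** 53


def float64_round_trip(value: int) -> int:
    """Convert an int to a Python float (IEEE 754 double) and back."""
    return int(float(value))


round_tripped = float64_round_trip(ts_ns)

print("above exact float64 range:", ts_ns > exact_limit)
print("survives round trip:      ", round_tripped == ts_ns)
print("difference in ns:         ", round_tripped - ts_ns)
```

If the round trip changes the value here, any code path that stores such timestamps as float64, even temporarily, would produce integers that look plausible but are off by some number of nanoseconds.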
Expected Output
Actual Output
Output of pd.show_versions()
pandas: 0.24.2
pytest: None
pip: 19.1
setuptools: 41.0.1
Cython: None
numpy: 1.16.3
scipy: None
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.8.0
pytz: 2019.1
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml.etree: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None