Column dtype change on write of improper value #26049
Comments
That is strange. Option 2 is what we would want here - investigation and PRs to fix would certainly be welcome.
Behavior is consistent with numpy.
import numpy as np
value = np.array([300], dtype=np.uint8)[0]
print(f'value = {value}, value should = 300')
print(f'{300 - 2**8} = 300 - 2**8')
Output
My apologies, I failed to see the dtype change in the original example. NumPy's behavior is not exactly consistent with pandas', as it doesn't cast the dtype to int64.
array = np.array([300], dtype=np.uint8)
print(f'value = {array[0]}, value should = 300')
print(f'array.dtype = {array.dtype}')
Output
Will continue investigating on the pandas side.
@WillAyd The current behavior on master is
Are int64 and 300 still the expected behavior?
Hmm OK. This is a pretty ambiguous set of actions, so I think matching numpy is the best we can offer. Even there, this probably just falls back to the C standard for wraparound of unsigned integers.
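For reference, the unsigned wraparound mentioned above amounts to reducing the value modulo 2**8 for an 8-bit type. The following is only a rough sketch using the standard library, not what numpy or pandas do internally; the helper name wrap_uint8 is made up for illustration.

```python
import ctypes


def wrap_uint8(value: int) -> int:
    """Reduce an integer modulo 2**8, mirroring C unsigned 8-bit wraparound.

    Hypothetical helper for illustration only.
    """
    return value % 2**8


value = 300
wrapped = wrap_uint8(value)

# ctypes integer types do no overflow checking, so c_uint8 truncates
# the value the same way a C uint8_t conversion would.
via_ctypes = ctypes.c_uint8(value).value

print(f"{value} -> {wrapped} (modulo 2**8), {via_ctypes} (ctypes.c_uint8)")
```

Both routes give 44 for 300, which matches the 300 - 2**8 arithmetic in the earlier comment.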
Code Sample
output:
Problem description
When writing, e.g., an integer that is too large to an 8-bit unsigned integer column, the value of the written integer is cast to uint8 and the data type of the column is changed to int64.
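To make the two self-consistent outcomes concrete, here is a small sketch in plain numpy rather than pandas. It assumes a toy three-element array standing in for the column, and it uses an explicit astype cast for the wrapping case because assigning an out-of-range Python integer directly to a uint8 array behaves differently across NumPy versions.

```python
import numpy as np

# Toy stand-in for the uint8 column (values are illustrative only).
column = np.array([1, 2, 3], dtype=np.uint8)
new_value = 300  # does not fit in uint8, whose maximum is 255

# Outcome A: keep the dtype and wrap the value via an explicit cast.
wrapped_value = np.array([new_value], dtype=np.int64).astype(np.uint8)[0]
keep_dtype = column.copy()
keep_dtype[0] = wrapped_value

# Outcome B: widen the dtype first, then store the value unchanged.
keep_value = column.astype(np.int64)
keep_value[0] = new_value

print(keep_dtype.dtype, keep_dtype[0])
print(keep_value.dtype, keep_value[0])
```

The behavior reported in this issue mixes the two: the wrapped value from outcome A ends up in a column with the widened dtype from outcome B.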
Expected Output
I would expect that either the value is cast and the data type is retained, or the data type is changed and the value is retained.
or
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.8.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.125-linuxkit
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.24.2
pytest: 4.3.1
pip: 19.0.3
setuptools: 40.8.0
Cython: 0.29.6
numpy: 1.16.2
scipy: 1.2.1
pyarrow: 0.11.1
xarray: 0.11.3
IPython: 7.1.1
sphinx: 2.0.0
patsy: 0.5.1
dateutil: 2.8.0
pytz: 2018.9
blosc: None
bottleneck: None
tables: 3.5.1
numexpr: 2.6.9
feather: None
matplotlib: 3.0.3
openpyxl: None
xlrd: 1.2.0
xlwt: None
xlsxwriter: 1.1.5
lxml.etree: 4.3.0
bs4: 4.7.1
html5lib: None
sqlalchemy: 1.3.1
pymysql: None
psycopg2: 2.7.6.1 (dt dec pq3 ext lo64)
jinja2: 2.10
s3fs: None
fastparquet: 0.2.1
pandas_gbq: None
pandas_datareader: None
gcsfs: None