Bug: loc assignment results in wrong dtype #31340

Closed
krasnikov138 opened this issue Jan 27, 2020 · 5 comments · Fixed by #31483
Labels: Dtype Conversions (unexpected or buggy dtype conversions), Indexing (related to indexing on series/frames, not to indexes themselves)
Milestone: 1.1

Comments

@krasnikov138

Consider the output of this code snippet:

import pandas as pd

df = pd.DataFrame({'id': ['A'], 'a': [1.2], 'b': [0.0], 'c': [-2.5]})
cols = ['a', 'b', 'c']
df.loc[:, cols] = df.loc[:, cols].astype('float32')
print(df.dtypes)

Problem description

We expect columns 'a', 'b' and 'c' to all have dtype float32 after the assignment. In fact, the result is:

>>> print(df.dtypes)
id     object
a     float32
b       int64
c     float32
dtype: object

The bug occurs when the value in a column is a whole number. For example, df = pd.DataFrame({'id': ['A'], 'a': [1.2], 'b': [1.0], 'c': [-2.5]}) behaves the same way. However, if df has two or more rows, the result is correct:

>>> df = pd.DataFrame({'id': ['A', 'B'], 'a': [1.2, 0.55], 'b': [0.0, 0.0], 'c': [-2.5, 12.56]})
>>> df.loc[:, cols] = df.loc[:, cols].astype('float32')
>>> df.dtypes
id     object
a     float32
b     float32
c     float32
dtype: object
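
If the goal is only to end up with float32 columns, one way to sidestep the .loc assignment path is to cast through DataFrame.astype with a column-to-dtype mapping. This is a minimal sketch of a workaround, not the fix that went into pandas, and it assumes that rebinding df to a new frame is acceptable:

```python
import pandas as pd

# Same single-row frame as in the report.
df = pd.DataFrame({'id': ['A'], 'a': [1.2], 'b': [0.0], 'c': [-2.5]})
cols = ['a', 'b', 'c']

# Build a new frame instead of assigning through .loc;
# astype accepts a mapping of column name -> dtype.
df = df.astype({col: 'float32' for col in cols})
print(df.dtypes)
```

Columns not named in the mapping (here 'id') keep their original dtype.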

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit : None
python : 3.6.5.final.0
python-bits : 64
OS : Linux
OS-release : 4.19.6-200.fc28.x86_64
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 0.25.3
numpy : 1.17.4
pytz : 2018.4
dateutil : 2.7.5
pip : 19.2.3
setuptools : 41.6.0
Cython : 0.28.2
pytest : 3.5.1
hypothesis : None
sphinx : 1.7.4
blosc : None
feather : None
xlsxwriter : 1.0.4
lxml.etree : 4.2.1
html5lib : 1.0.1
pymysql : None
psycopg2 : 2.7.5 (dt dec pq3 ext lo64)
jinja2 : 2.10
IPython : 6.4.0
pandas_datareader: None
bs4 : 4.6.3
bottleneck : 1.2.1
fastparquet : None
gcsfs : None
lxml.etree : 4.2.1
matplotlib : 3.0.1
numexpr : 2.6.5
odfpy : None
openpyxl : 2.5.3
pandas_gbq : None
pyarrow : None
pytables : None
s3fs : None
scipy : 1.1.0
sqlalchemy : 1.2.14
tables : 3.4.3
xarray : None
xlrd : 1.1.0
xlwt : 1.3.0
xlsxwriter : 1.0.4

@MarcoGorelli
Member

Thanks for the report. I couldn't reproduce this on master:

df = pd.DataFrame({'id': ['A'], 'a': [1.2], 'b': [0.0], 'c': [-2.5]})
cols = ['a', 'b', 'c']
df.loc[:, cols] = df.loc[:, cols].astype('float32')
print(df.dtypes)
id     object
a     float32
b     float32
c     float32
dtype: object

@MarcoGorelli
Member

Will close this then, as it looks like it's been fixed already

@jreback
Contributor

jreback commented Jan 28, 2020

@MarcoGorelli I think this is worth a validation test (unless we already have something that is pretty close in the tests)

MarcoGorelli reopened this Jan 28, 2020

@MarcoGorelli
Member

> unless we already have something that is pretty close in the tests

Admittedly I hadn't checked for that, sorry - have reopened for now, will take a look later

@prakhar987
Contributor

Reproducible on 0.25.0 (fixed on master). I will try to come up with a test if one doesn't exist.
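
A rough sketch of what such a validation test could check is below. The function name is hypothetical and nothing here is taken from the pandas test suite. As far as I know, whether the columns end up as float32 or stay float64 depends on the pandas version, since newer releases try to write values in place when assigning through .loc. So the sketch only asserts what the report is about: no column may turn into an integer dtype, and all three columns must agree.

```python
import pandas as pd


def check_loc_assignment_keeps_float_dtype():
    # Hypothetical helper name; not taken from the pandas test suite.
    df = pd.DataFrame({'id': ['A'], 'a': [1.2], 'b': [0.0], 'c': [-2.5]})
    cols = ['a', 'b', 'c']
    df.loc[:, cols] = df.loc[:, cols].astype('float32')

    kinds = {col: df[col].dtype.kind for col in cols}
    # The reported bug turned the whole-number column 'b' into int64,
    # so every column must still be a floating-point dtype (kind 'f').
    assert kinds == {'a': 'f', 'b': 'f', 'c': 'f'}, kinds
    # All three columns should end up with one common dtype.
    assert len({df[col].dtype for col in cols}) == 1
    return df.dtypes


result = check_loc_assignment_keeps_float_dtype()
```

On a version with the bug (0.25.x according to this thread), the first assertion would be expected to fail because 'b' comes back as int64.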

prakhar987 added a commit to prakhar987/pandas that referenced this issue Jan 31, 2020
jreback added the Dtype Conversions and Indexing labels Feb 1, 2020
jreback added this to the 1.1 milestone Feb 1, 2020