DOC: Strange behavior when combining astype and loc #13260

Closed
rladeira opened this issue May 23, 2016 · 4 comments
Labels
Docs Dtype Conversions Unexpected or buggy dtype conversions Indexing Related to indexing on series/frames, not to indexes themselves

Comments

@rladeira
rladeira commented May 23, 2016

I am trying to use astype together with loc to convert just a subset of columns to a specified dtype, but I can't understand what is actually happening. Is this the expected behavior when combining astype and loc?

Code Sample, a copy-pastable example if possible

import pandas as pd
import numpy as np

df = pd.DataFrame({'a': [1,2,3], 'b': [4,5,6], 'c': [7, 8, 9]})

print(df.loc[:, ['a', 'b']].astype(np.uint8).dtypes)

df.loc[:, ['a', 'b']] = df.loc[:, ['a', 'b']].astype(np.uint8)
print(df.dtypes)

Output:

>>> df.loc[:, ['a', 'b']].astype(np.uint8).dtypes
a    uint8
b    uint8
dtype: object
>>> df.loc[:, ['a', 'b']] = df.loc[:, ['a', 'b']].astype(np.uint8)
>>> df.dtypes
a    int64
b    int64
c    int64
dtype: object

Expected Output

>>> df.loc[:, ['a', 'b']].astype(np.uint8).dtypes
a    uint8
b    uint8
dtype: object
>>> df.loc[:, ['a', 'b']] = df.loc[:, ['a', 'b']].astype(np.uint8)
>>> df.dtypes
a    uint8
b    uint8
c    int64
dtype: object

output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 2.7.6.final.0
python-bits: 64
OS: Linux
OS-release: 4.2.0-36-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.18.0
nose: None
pip: 8.1.1
setuptools: 20.6.7
Cython: None
numpy: 1.11.0
scipy: 0.17.0
statsmodels: None
xarray: None
IPython: 4.1.2
sphinx: None
patsy: None
dateutil: 2.5.2
pytz: 2016.3
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.5.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8
boto: None

@jreback
Contributor

jreback commented May 23, 2016

This is as expected; use getitem (i.e. []) if you want to re-assign without masking. In this case your mask is ':' (meaning the entire columns), but .loc casts the assigned values back to the existing column dtypes when used in this way, and that's why the result is upcast to int64.

In other words, .loc will try to fit what you are assigning into the current dtypes, while [] will overwrite them, taking the dtype from the right-hand side. It's a subtle point.

In [1]: df = pd.DataFrame({'a': [1,2,3], 'b': [4,5,6], 'c': [7, 8, 9]})

In [2]: df
Out[2]: 
   a  b  c
0  1  4  7
1  2  5  8
2  3  6  9

In [3]: df[['a','b']] = df[['a','b']].astype(np.uint8)

In [4]: df
Out[4]: 
   a  b  c
0  1  4  7
1  2  5  8
2  3  6  9

In [5]: df.dtypes
Out[5]: 
a    uint8
b    uint8
c    int64
dtype: object
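
The distinction above can be sketched as a standalone script. This is a minimal illustration, assuming a reasonably recent pandas release; the helper `make_frame` and the variable names are hypothetical and do not come from the thread. The dict form of `astype` is an alternative that nobody in the thread mentions, included only as a possible way to convert a subset of columns without any assignment. The dtypes left behind by the `.loc` assignment have differed between pandas versions, so that result is printed rather than asserted.

```python
import numpy as np
import pandas as pd


def make_frame():
    # Hypothetical helper: rebuilds the frame from the issue.
    # int64 is requested explicitly so the starting dtypes are the same on every platform.
    return pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6], 'c': [7, 8, 9]}, dtype='int64')


# 1) Assignment through [] replaces the columns, so the dtype comes from the right-hand side.
df_getitem = make_frame()
df_getitem[['a', 'b']] = df_getitem[['a', 'b']].astype(np.uint8)
print(df_getitem.dtypes)

# 2) Alternative (assumption, not from the thread): pass a column-to-dtype mapping to astype.
#    This returns a new frame and leaves the unlisted column 'c' untouched.
df_astype = make_frame().astype({'a': np.uint8, 'b': np.uint8})
print(df_astype.dtypes)

# 3) Assignment through .loc with a full-column slice, as in the original report.
#    The resulting dtypes depend on the pandas version, so inspect them instead of relying on them.
df_loc = make_frame()
df_loc.loc[:, ['a', 'b']] = df_loc.loc[:, ['a', 'b']].astype(np.uint8)
print(df_loc.dtypes)
```

In all three cases the values are unchanged; only the dtypes of 'a' and 'b' can differ.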

@jreback jreback closed this as completed May 23, 2016
@jreback jreback added Indexing Related to indexing on series/frames, not to indexes themselves Dtype Conversions Unexpected or buggy dtype conversions labels May 23, 2016
@jreback
Contributor

jreback commented May 23, 2016

This is a bit related to the docs we added here: #13070

If you'd like to propose an example for this section, we would take it: http://pandas-docs.github.io/pandas-docs-travis/basics.html#astype (for example, one showing pitfalls such as this case)

@rladeira
Author

Subtle point, indeed. Thanks for the clarification.

@jreback
Contributor

jreback commented May 23, 2016

Let me reopen this as a doc issue.

@jreback jreback reopened this May 23, 2016
@jreback jreback added the Docs label May 23, 2016
@jreback jreback added this to the Next Major Release milestone May 23, 2016
@jreback jreback changed the title Strange behavior when combining astype and loc DOC: Strange behavior when combining astype and loc May 23, 2016