read_excel: Wrong dtypes for numeric text fields #11927

Closed
suokunlong opened this issue Dec 30, 2015 · 1 comment
Labels
Dtype Conversions (unexpected or buggy dtype conversions), Duplicate Report (duplicate issue or pull request), IO Excel (read_excel, to_excel)

Comments

@suokunlong

In the following example, column a of df is exported as text (string) to the xlsx file, but it is imported back into df2 as int.

In [161]: df = pd.DataFrame({'a':['01','02','03','04'], 'b':[5,6,7,8]})

In [162]: df
Out[162]:
    a  b
0  01  5
1  02  6
2  03  7
3  04  8

In [163]: df.dtypes
Out[163]:
a    object
b     int64
dtype: object

In [164]: df.to_excel('tmp.xlsx')

In [165]: df2 = pd.read_excel('tmp.xlsx')

In [166]: df2
Out[166]:
   a  b
0  1  5
1  2  6
2  3  7
3  4  8

In [167]: df2.dtypes
Out[167]:
a    int64
b    int64
dtype: object

Users often format columns that contain numeric-looking data as "text" in Excel on purpose (for example, to preserve leading zeros, as in column a above). pandas should keep these columns as str and should not convert them back to numeric. At a minimum, pd.read_excel() should provide an option to turn this dtype conversion off.
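For readers who hit the same behavior, here is a minimal sketch of a possible workaround. It assumes a pandas version whose pd.read_excel() accepts a dtype argument (this was not available when the issue was filed) and an installed Excel engine such as openpyxl. The helper name read_excel_keep_text is hypothetical and not part of pandas.

```python
import os
import tempfile

import pandas as pd


def read_excel_keep_text(path, text_columns):
    """Read an Excel file, forcing the named columns to be read as str.

    Hypothetical helper, not part of pandas. It relies on the `dtype`
    argument of pd.read_excel, which only newer pandas versions accept.
    """
    return pd.read_excel(path, dtype={col: str for col in text_columns})


if __name__ == "__main__":
    # Same data as in the report: column "a" holds zero-padded strings.
    df = pd.DataFrame({"a": ["01", "02", "03", "04"], "b": [5, 6, 7, 8]})

    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "tmp.xlsx")
        # index=False keeps the row index out of the sheet.
        df.to_excel(path, index=False)
        df2 = read_excel_keep_text(path, ["a"])

    print(df2["a"].tolist())
    print(df2.dtypes)
```

Passing converters={'a': str} to pd.read_excel() is a similar option. Note that either approach only controls how the cell values are cast on read; a cell that Excel itself stores as a number has already lost its leading zeros.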

jreback added the Dtype Conversions, Duplicate Report, and IO Excel labels on Dec 30, 2015
@jreback
Contributor

jreback commented Dec 30, 2015

This is a duplicate of #8212.

jreback closed this as completed on Dec 30, 2015