
API: different default for header kwarg in read_csv vs read_excel #11889


Open
jorisvandenbossche opened this issue Dec 23, 2015 · 1 comment
Labels
API - Consistency Internal Consistency of API/Behavior Enhancement IO Excel read_excel, to_excel Needs Discussion Requires discussion from core team before further action

Comments

@jorisvandenbossche
Member

From #11874 (comment)

The default for header in read_csv is 'infer', while for read_excel it is 0. In most cases this does not matter, I think (as 'infer' falls back to 0 when no names are given). But when the names are specified explicitly, it makes a difference:

For read_csv, you need to explicitly pass header=0 to replace the header row that exists in the file:

In [35]: s = """a,b,c
   ....: 1,2,3
   ....: 4,5,6"""

In [48]: pd.read_csv(StringIO(s), names=list('ABC'))
Out[48]:
   A  B  C
0  a  b  c
1  1  2  3
2  4  5  6

In [49]: pd.read_csv(StringIO(s), header=None, names=list('ABC'))
Out[49]:
   A  B  C
0  a  b  c
1  1  2  3
2  4  5  6

In [51]: pd.read_csv(StringIO(s), header=0, names=list('ABC'))
Out[51]:
   A  B  C
0  1  2  3
1  4  5  6

while for read_excel, you need to explicitly pass header=None if the file contains no header row:

In [31]: pd.read_excel('test_names.xlsx')
Out[31]:
   a  b  c
0  1  2  3
1  4  5  6

In [46]: pd.read_excel('test_names.xlsx', names=['A', 'B', 'C'])     <--- ignoring the data of the first row
Out[46]:
   A  B  C
0  1  2  3
1  4  5  6

In [45]: pd.read_excel('test_names.xlsx', header=False, names=['A', 'B', 'C'])
Out[45]:
   A  B  C
0  1  2  3
1  4  5  6

In [47]: pd.read_excel('test_names.xlsx', header=None, names=['A', 'B', 'C'])
Out[47]:
   A  B  C
0  a  b  c
1  1  2  3
2  4  5  6
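
The two sessions above come down to a single rule. As a rough sketch of it (the helper name `effective_header` is made up for illustration and is not a pandas function), the documented 'infer' behaviour of read_csv can be written in plain Python and compared with the fixed read_excel default of 0:

```python
def effective_header(header="infer", names=None):
    """Return the header value that the 'infer' rule resolves to.

    Hypothetical helper for illustration only; not part of pandas.
    """
    if header == "infer":
        # No names given: the first line is the header (same as header=0).
        # Names given: the file is treated as having no header row (header=None).
        return 0 if names is None else None
    # Any explicit value (0, None, a list of rows, ...) is used unchanged.
    return header


# read_csv default ('infer') versus read_excel default (0), both with names passed
csv_default = effective_header("infer", names=list("ABC"))
excel_default = effective_header(0, names=list("ABC"))
print(csv_default, excel_default)
```

With names passed, the read_csv default resolves to None (the first row is kept as data), while the read_excel default stays at 0 (the first row is consumed as a header and then renamed).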

Would it make sense to change the read_excel behaviour to make it more consistent with the read_csv one?
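
To make the question concrete, here is a minimal sketch of what read_csv-style defaults could look like from the user side. The wrapper name `read_excel_like_csv` is hypothetical and only illustrates the idea; it is not an existing pandas API or a proposed implementation, and it simply forwards to the real `pd.read_excel` with its existing `header` and `names` parameters.

```python
import pandas as pd


def read_excel_like_csv(io, header="infer", names=None, **kwargs):
    """Call pd.read_excel, but resolve header='infer' the way read_csv does.

    Hypothetical wrapper for illustration; 'infer' is not a value that
    read_excel itself accepts, so it is translated before the call.
    """
    if header == "infer":
        # Same rule as read_csv: header row only if no names are passed.
        header = 0 if names is None else None
    return pd.read_excel(io, header=header, names=names, **kwargs)
```

Under this sketch, `read_excel_like_csv('test_names.xlsx', names=['A', 'B', 'C'])` would keep the first row as data, matching the read_csv result in `Out[48]`, and passing `header=0` explicitly would still replace the existing header row.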

@jorisvandenbossche jorisvandenbossche added API Design IO Excel read_excel, to_excel Needs Discussion Requires discussion from core team before further action labels Dec 23, 2015
@jbrockmendel jbrockmendel added API - Consistency Internal Consistency of API/Behavior and removed API Design labels Dec 19, 2019