read.csv index_col argument accepts string, list-of-string #22276

smcinerney · 2018-08-10T22:47:51Z

The read_csv index_col argument has accepted either a string or a list of strings for years, but the doc (as of 0.24 dev) has never been updated to reflect this. Current and suggested text are at the bottom.
All columns used in index_col get dropped as regular columns. The doc never explicitly says this and it causes user confusion.
The doc does, however, discuss a "malformed file with delimiters at the end of each line... you might consider index_col=False". This is overly prominent for a rare defective case and should be moved somewhere less prominent, or at minimum relegated to a parenthesized footnote.
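For context, the trailing-delimiter case can be reproduced with a small sketch like the one below. The sample data is made up for illustration, and the behaviour in the comments is what the pandas IO documentation describes; it is not checked against every release.

```python
import io

import pandas as pd

# Hypothetical malformed CSV: every data row ends with an extra delimiter,
# so each row has one more field than the header.
data = "a,b,c\n4,apple,bat,\n8,orange,cow,\n"

# Default behaviour: the first field of each row is taken as the index,
# so the remaining values shift left under the header names.
shifted = pd.read_csv(io.StringIO(data))

# index_col=False: the first field stays a regular column and the
# trailing empty field is discarded.
aligned = pd.read_csv(io.StringIO(data), index_col=False)

print(shifted)
print(aligned)
```

In `shifted`, the values 4 and 8 end up in the index, while in `aligned` they remain in column `a`.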

Current read_csv doc:

index_col : int or sequence or False, default None
Column to use as the row labels of the DataFrame. If a sequence is given, a MultiIndex is used. If you have a malformed file with delimiters at the end of each line, you might consider index_col=False to force pandas to not use the first column as the index (row names).

Suggested read_csv doc:

index_col : int/string or sequence of int/string or False, default None
Column(s) to use as the row labels of the DataFrame, either given as string name or column index.
If a sequence of int/string is given, a MultiIndex is used.
Columns used for the index (row names) are dropped from the actual columns of the input dataframe. (They are accessible via .index).
(Note: index_col=False can be used to force pandas to not use the first column as the index, e.g. when you have a malformed file with delimiters at the end of each line).
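A minimal sketch of the behaviour the suggested text describes. The column names and values here are invented for illustration only.

```python
import io

import pandas as pd

# Hypothetical sample data, used only to illustrate index_col.
data = "id,region,value\n1,north,10\n2,south,20\n"

# A single column name: "id" becomes the index and is no longer a column.
by_name = pd.read_csv(io.StringIO(data), index_col="id")

# The same column selected by position.
by_position = pd.read_csv(io.StringIO(data), index_col=0)

# A list of names: the result has a MultiIndex built from both columns.
by_names = pd.read_csv(io.StringIO(data), index_col=["id", "region"])

print(by_name.columns.tolist())   # index column is absent from the columns
print(by_names.index.names)       # both names appear as index levels
```

The values that were moved out of the regular columns remain reachable through `.index`, for example `by_name.index` or `by_names.index.get_level_values("region")`.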


WillAyd · 2018-08-11T00:29:46Z

PRs to improve documentation are always welcome. If you have something you feel is better, feel free to submit it and the team will provide feedback!

NikhilKumarM · 2018-08-11T02:16:47Z

I will take this up.

smcinerney · 2018-08-14T00:10:29Z

Can anyone pinpoint the version in which this was introduced (it is at least as old as 0.20.4), and whether it was intentional or accidental?

(I couldn't figure it out from reading through the history and blame on https://github.com/pandas-dev/pandas/blob/master/pandas/io/parsers.py.)

WillAyd added the Docs label Aug 11, 2018

kimsey0 mentioned this issue Mar 1, 2019

Update documentation of read_csv to explain that index_col can be a string containg a column name #25502

Merged

gfyoung added the IO CSV read_csv, to_csv label Mar 2, 2019

jreback added this to the 0.25.0 milestone Mar 3, 2019

TomAugspurger closed this as completed in #25502 Mar 15, 2019
