DataFrame.from_csv undocumented behavior of index_col #9556

Closed
EntilZha opened this issue Feb 26, 2015 · 6 comments · Fixed by #10163

Comments

@EntilZha

It took me a while to find this (in the end via Stack Overflow), but I ran into a problem while trying to parse a CSV file. pandas kept parsing the first column as the index column instead of generating a new index. I could not find a way to disable this after reading the docs here:

http://pandas.pydata.org/pandas-docs/dev/generated/pandas.DataFrame.from_csv.html

As it turns out, due to the line of code below, if index_col is set to False, then it has the behavior I desired. This is completely undocumented and would be very helpful to have documented. I haven't contributed to pandas before, so if it is more than a quick change a maintainer can make, I could put the work in for a PR, even if it's just documentation changes.
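For readers landing here, a minimal sketch of the behavior described above, written against read_csv rather than from_csv (from_csv is not available in recent pandas releases). The sample data and variable names are made up for illustration; `index_col=0` is used only to mimic the from_csv default.

```python
import io

import pandas as pd

# Hypothetical sample data, not taken from the issue.
CSV_TEXT = "a,b,c\n1,2,3\n4,5,6\n"

# index_col=0 mimics the from_csv default: the first column becomes the index.
with_index = pd.read_csv(io.StringIO(CSV_TEXT), index_col=0)

# index_col=False tells the parser not to use the first column as the index,
# so all three columns are kept and a fresh integer index is generated.
without_index = pd.read_csv(io.StringIO(CSV_TEXT), index_col=False)

print(with_index)
print(without_index)
```

With `index_col=False`, column `a` stays a regular column, which is the result the original report was after.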

https://github.com/pydata/pandas/blob/master/pandas/io/parsers.py#L1064

@jorisvandenbossche
Member

Please have a look at the docstring of read_csv: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html

I think this feature is explained there. from_csv should actually be deprecated and the docs should just refer to read_csv (see #4191).

@EntilZha
Author

I would agree with #4191, which points to #4916, that it should be deprecated. For the short term, could the docs be updated to reflect that from_csv will be deprecated or that users should use read_csv instead (perhaps additionally documenting the False usage)?

@jorisvandenbossche
Member

Yes, I will do a PR. At least the 'deprecation' can be documented by removing the current docstring and just pointing to read_csv (and noting the one difference in the default value of parse_dates).
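A rough illustration of the defaults mentioned above, assuming from_csv defaults of `index_col=0` and `parse_dates=True`, while read_csv uses no index column and does no date parsing by default. Only read_csv is called here; the data and variable names are hypothetical.

```python
import io

import pandas as pd

# Hypothetical sample data with a date-like first column.
CSV_TEXT = "date,value\n2015-02-26,1\n2015-02-27,2\n"

# Plain read_csv defaults: no index column, dates are left unparsed.
plain = pd.read_csv(io.StringIO(CSV_TEXT))

# Passing the assumed from_csv defaults explicitly: the first column
# becomes the index and is parsed as dates.
like_from_csv = pd.read_csv(io.StringIO(CSV_TEXT), index_col=0, parse_dates=True)

print(type(plain.index).__name__)
print(type(like_from_csv.index).__name__)
```

Spelling out both arguments in the read_csv call is one way to keep existing from_csv behavior when migrating.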

@jorisvandenbossche
Member

Or if you would like to do a PR for this, you are certainly welcome!

@EntilZha
Author

I would be good with either. It should be a quick change; most of the time, tbh, would go into forking and generating the docs. I could probably work on it tonight/tomorrow.

@jorisvandenbossche
Member

Give it a go! (I could only work on it this weekend at the earliest.) Normally, for such a small change, building the full docs should not be necessary (though of course it can be a good exercise to try to build them). Just have a look at the existing docstrings for the format (a description of the format is here: https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt#docstring-standard).


3 participants