ENH: Allow comment lines to be written to CSV outputs. Complements read_csv's comment param #53569

Closed
wants to merge 12 commits into from

canthonyscott

There are many use cases (especially in the scientific community) where the best or only course of action is to embed configuration parameters and/or other metadata at the beginning of the CSV file itself. These lines are typically prefixed with a comment marker such as #. This keeps the file human-readable while attaching the metadata to the generated file itself.

Pandas' read_csv method already implements a feature to read such files and ignore these lines when parsing the data into a dataframe. This PR implements the complement of this feature. It allows users to write these metadata and/or comment lines in their CSV outputs as well.
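For reference, the existing reading side can be sketched as follows. The file contents here are made up for illustration; only read_csv and its comment parameter come from pandas.

```python
import io

import pandas as pd

# Made-up file contents: two metadata lines followed by ordinary CSV data.
raw = "# instrument: example\n# run: 1\na,b\n1,2\n3,4\n"

# comment="#" makes the parser ignore everything from "#" to the end of a line;
# lines that consist only of a comment are skipped before the header is read.
df = pd.read_csv(io.StringIO(raw), comment="#")
print(df)
```

The metadata lines never reach the dataframe, which is the behavior this PR aims to complement on the writing side.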

The to_csv method was extended with a comment param matching the one in read_csv, along with a comment_lines param that takes an iterator of strings. Each string is written to the beginning of the file as a single line, before any CSV header or body data.
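Since this PR was not merged, to_csv does not accept these parameters. The behavior described above can be approximated with a small standalone helper. This is only a sketch: the function name to_csv_with_comments is hypothetical, and prepending the comment prefix to each line automatically is an assumption about the intended behavior rather than something confirmed by the PR.

```python
import io
from typing import Iterable

import pandas as pd


def to_csv_with_comments(
    df: pd.DataFrame,
    comment_lines: Iterable[str],
    comment: str = "#",
    **to_csv_kwargs,
) -> str:
    """Hypothetical helper (not part of pandas) emulating the proposed params."""
    buffer = io.StringIO()
    # Write each comment line first, prefixed with the comment marker.
    for line in comment_lines:
        buffer.write(f"{comment}{line}\n")
    # Then let pandas write the header and body as usual.
    df.to_csv(buffer, **to_csv_kwargs)
    return buffer.getvalue()


df = pd.DataFrame({"a": [1, 3], "b": [2, 4]})
text = to_csv_with_comments(df, ["threshold=0.5", "units=mm"], index=False)
print(text)
```

Keeping the prefix as an argument mirrors read_csv, so the same character can be passed to both sides of a round trip.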

@canthonyscott changed the title from "EHN: Allow comment lines to be written to CSV outputs. Complements read_csv's comment param" to "ENH: Allow comment lines to be written to CSV outputs. Complements read_csv's comment param" on Jun 8, 2023
@twoertwein
Member

Thank you @canthonyscott for your PR. When proposing a new feature, it is typically better to first open an issue before implementing it to get feedback on the idea.

to_csv supports file handles, so a user could "easily" (though probably less conveniently) write their own wrapper:

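# `comments` is a str of prefixed lines ending in "\n"; `dataframe` is a pandas DataFrame.
# newline="" follows the pandas recommendation for text-mode file objects.
with open("test.csv", mode="wt", newline="") as handle:
    handle.write(comments)
    dataframe.to_csv(handle)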
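A self-contained round trip of this wrapper approach might look like the sketch below. The sample data, the comment text, and the use of a temporary directory are all assumptions made for illustration.

```python
import os
import tempfile

import pandas as pd

# Made-up data and metadata for the example.
dataframe = pd.DataFrame({"a": [1, 3], "b": [2, 4]})
comments = "# threshold=0.5\n# units=mm\n"

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "test.csv")
    # Write the comment block first, then hand the same handle to to_csv.
    with open(path, mode="wt", newline="") as handle:
        handle.write(comments)
        dataframe.to_csv(handle, index=False)
    # Reading back with comment="#" skips the metadata lines again.
    restored = pd.read_csv(path, comment="#")

print(restored.equals(dataframe))
```

Passing index=False keeps the written columns identical to the original, so the restored frame can be compared directly.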

@canthonyscott
Author

That makes total sense. I can go open one. I may have been a little over-eager to try and see if what I wanted to do was an actual possibility first. Thank you!

@canthonyscott
Author

Closing this PR, as discussion in the issue has led to a better solution.
