DOC: Add to the io documentation of on_bad_lines to alert users of silently skipped lines. #50311

Merged
merged 8 commits into from
Jan 3, 2023

kostyafarber
Contributor

This PR adds a note to the documentation of on_bad_lines for reading CSV files. Specifically, when a user-defined callable is passed as on_bad_lines, any bad line other than a 'line with too many fields' is silently skipped without the callable being invoked. Users should be aware of this behaviour.

The issue has been left open for further discussion of what should be considered a 'bad line' and whether any code change should be made to accommodate a new definition of a bad line.
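
For illustration, here is a minimal sketch of the case the callable does handle, a line with too many fields. The sample data and the names `csv_text`, `seen`, and `handle_bad_line` are made up for this example and are not taken from the PR. It assumes the python engine, which the callable form of on_bad_lines requires. The silently skipped cases described above are not reproduced here.

```python
import io

import pandas as pd

# Hypothetical sample data: the second data row has one field too many.
csv_text = "a,b,c\n1,2,3\n4,5,6,7\n8,9,10\n"

seen = []  # records every line the callable is invoked with


def handle_bad_line(fields):
    # The parser passes the offending line as a list of strings.
    seen.append(fields)
    # Keep the first three fields; returning None would drop the line instead.
    return fields[:3]


df = pd.read_csv(
    io.StringIO(csv_text),
    engine="python",  # callables for on_bad_lines need the python engine here
    on_bad_lines=handle_bad_line,
)

print(seen)
print(df)
```

Recording the lines in `seen` makes it easy to check which lines actually reached the callable. A line that is bad for some other reason would not show up there.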

@kostyafarber
Contributor Author

friendly ping @mroeschke

@kostyafarber
Contributor Author

@mroeschke

Member

@datapythonista datapythonista left a comment

lgtm, thanks @kostyafarber

@mroeschke mroeschke added this to the 2.0 milestone Jan 3, 2023
@mroeschke mroeschke merged commit 5097302 into pandas-dev:main Jan 3, 2023
@mroeschke
Member

Thanks @kostyafarber

Labels
Docs IO CSV read_csv, to_csv
Development

Successfully merging this pull request may close these issues.

BUG: on_bad_lines=callable does not invoke callable for all bad lines
3 participants