Skip to content

DOC: include defaults into docs for true_values and false_values #48597

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Sep 23, 2022

Conversation

AlexKirko
Copy link
Member

@AlexKirko AlexKirko commented Sep 17, 2022

Motivation

This PR touches only one docstring, so no other checks are necessary. As outlined in #48487, when we call pd.read_csv, we eventually go to _libs/ops.pyx, where there are defaults for true_values and false_values:

    # the defaults
    true_vals = {'True', 'TRUE', 'true'}
    false_vals = {'False', 'FALSE', 'false'}

    if true_values is not None:
        true_vals = true_vals | set(true_values)

    if false_values is not None:
        false_vals = false_vals | set(false_values)

In addition, there appears to be a str.lower() or str.upper() call somewhere, although I can't find it, because trUe is treated as true-like too. We should outline this in the docstring, so that it's not a surprise to our users.

Definition of done

  • mention the default values for true_values and false_values in the docstring for pd.read_csv

@AlexKirko
Copy link
Member Author

I don't think a whatsnew entry or anything else is needed for a doc-only PR, but please tell me if I've missed anything.

@AlexKirko
Copy link
Member Author

There are a bunch of errors, but it doesn't look like they have anything to do with the docstring change, and they seem to be cropping up over all the current PRs.

@mroeschke mroeschke added Docs IO CSV read_csv, to_csv labels Sep 19, 2022
@@ -195,9 +195,9 @@
Dict of functions for converting values in certain columns. Keys can either
be integers or column labels.
true_values : list, optional
Values to consider as True.
Values to consider as True in addition to case-insensitive variants of "True".
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know if there are tests for case-insensitive variants of True/False?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a great question, it will tell us whether the case insensitivity is intended or unnoticed behavior. I'll take a look today or tomorrow.

Copy link
Member Author

@AlexKirko AlexKirko Sep 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mroeschke I've taken a look, and the closest that I can find is test_boolean_dtype here. There are also some conversion tests in test_na_values.py. But those have a different purpose.

What would your suggestion be?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for checking. I think test_boolean_dtype should cover the additional docstring note you added here.

@mroeschke mroeschke added this to the 1.6 milestone Sep 23, 2022
@mroeschke mroeschke merged commit 9ad36f2 into pandas-dev:main Sep 23, 2022
@mroeschke
Copy link
Member

Thanks @AlexKirko

@mroeschke mroeschke modified the milestones: 1.6, 2.0 Oct 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Docs IO CSV read_csv, to_csv
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DOC: pandas.read_csv
2 participants