fixed issue#59670. DOC #59714

Closed
wants to merge 2 commits into from

@@ -2126,7 +2126,8 @@ def from_records(
 associated with them, this argument provides names for the
 columns. Otherwise this argument indicates the order of the columns
 in the result (any names not found in the data will become all-NA
-columns).
+columns).Additionally,specifying `columns` will limit the DataFrame to only
+include the specified columns, similar to an "include" or "usecols" functionality.

I would suggest simplifying this language as follows:

Otherwise this argument indicates the order of the columns in the result (any names not found in the data will become all-NA columns) and limits the data to these columns if not all column names are provided.

I don't think it's worth mentioning "include" or "usecols" because it's better to keep the description brief.
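For readers following along, here is a minimal sketch of the "limits the data to these columns" behavior under discussion. It assumes the input is a list of dicts (other input types may behave differently, as noted later in this thread), and the sample data and variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical sample records; the keys act as the existing column names.
records = [
    {"a": 1, "b": 2, "c": 3},
    {"a": 4, "b": 5, "c": 6},
]

# Without `columns`, every key found in the data becomes a column.
all_cols = pd.DataFrame.from_records(records)

# With `columns`, the result is reordered and restricted to the listed names.
subset = pd.DataFrame.from_records(records, columns=["c", "a"])

print(list(all_cols.columns))
print(list(subset.columns))
```

In this sketch, `"b"` is dropped from `subset` because it is not listed in `columns`, which is the "include"-like effect the issue asks to document.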

Author

Current:
Column names to use. If the passed data do not have names associated with them, this argument provides names for the columns. Otherwise, this argument indicates the order of the columns in the result (any names not found in the data will become all-NA columns).

Proposed 1:
Column names to use. If the passed data do not have names associated with them, this argument provides names for the columns. Otherwise, this argument indicates the order of the columns in the result (any names not found in the data will become all-NA columns) and limits the data to these columns if not all column names are provided.

Proposed 2:
The columns argument specifies the column names for the DataFrame. If the data does not have column names, this argument assigns them. If the data already includes column names, this argument determines the order of the columns and limits the DataFrame to include only the columns listed. Any columns not specified will be excluded.

Would this revision work, or do you think there's a better way to phrase it? I’d love to hear your thoughts. Thanks so much for your time!
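As a point of reference for the first sentence of the docstring, which all three wordings keep, the following sketch shows `columns` supplying names when the data has none. It assumes a list of plain tuples as input; the data and variable names are illustrative only.

```python
import pandas as pd

# Hypothetical rows with no names attached to the fields.
rows = [(1, 2), (3, 4)]

# Without `columns`, pandas falls back to default integer labels.
unnamed = pd.DataFrame.from_records(rows)

# With `columns`, the argument provides the names for the columns.
named = pd.DataFrame.from_records(rows, columns=["x", "y"])

print(list(unnamed.columns))
print(list(named.columns))
```

Here the number of names passed matches the number of fields in each tuple; the subsetting behavior discussed above applies only when the data already carries names.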

I recommend leaving the first sentences alone, since they're not part of this issue, and not limiting the scope of the argument to DataFrames. If it's not working for other types, that's something that can be fixed rather than documented as a bug or limitation (see e.g. this issue that was just filed: #59717).

Author

Here's a polite reply to that message:


Thank you for your feedback! I agree, keeping the first sentences unchanged makes sense, and addressing the broader scope beyond just DataFrames is the right approach. If this affects other types, fixing the issue rather than documenting a limitation would indeed be the best course of action. Thanks for pointing that out!

@mroeschke
Member

Looks like this issue has already been addressed. Thanks for the PR but closing

@mroeschke mroeschke closed this Sep 30, 2024
Successfully merging this pull request may close these issues.

DOC: Document that DataFrame.from_records()'s columns argument also acts as "include"