DOC: Inconsistencies in pandas.DataFrame.pivot_table parameter descriptions #53351

Closed
1 task done
tpaxman opened this issue May 23, 2023 · 2 comments · Fixed by #53605

@tpaxman
Contributor

tpaxman commented May 23, 2023

Pandas version checks

  • I have checked that the issue still exists on the latest versions of the docs on main here

Location of the documentation

https://pandas.pydata.org/docs/dev/reference/api/pandas.DataFrame.pivot_table.html

Documentation problem

In the pandas.DataFrame.pivot_table function, there are issues with grammar and sentence structure in the descriptions for the index, columns, and aggfunc parameters, which may cause some confusion for users. The issues are described below:

  • index:
    • The grammar in the phrase "it is being used as the same manner as column values" is not correct.
    • It also uses a different tense than the other descriptions (i.e., "is being used" rather than "will be", which is what the other parameter descriptions use; see margins, dropna, and margins_name).
    • The sentence "Keys to group by on the pivot table index" seems like it should be the first sentence in the paragraph (i.e., before the 'if' condition descriptions).
    • There are two conditions for 'If an array is passed...'; however, they are split up in the paragraph as the first and last sentence. These could be combined for clarity.
    • The description for list input is somewhat unclear about what the list is, as it says "The list can contain..." in contrast with the specific phrasing used to refer to array input ("If an array is passed...").
  • columns:
    • Same issues as index.
  • aggfunc:
    • Indefinite articles are not used before data types, which is inconsistent with the descriptions for other parameters.
    • For example, it uses "If list of functions passed" and "If dict passed", whereas the descriptions for index and columns use phrasing such as "If an array is passed".
    • A period is missing after "(inferred from the function objects themselves)"

Suggested fix for documentation

The following fixes are proposed for the descriptions of the index, columns, and aggfunc parameters:

index

Correct grammar and tense to be consistent with other parameter descriptions; rearrange sentence order to be more consistent and clear.

OLD:

If an array is passed, it must be the same length as the data. The list can contain any of the other types (except list). Keys to group by on the pivot table index. If an array is passed, it is being used as the same manner as column values.

NEW:

Keys to group by on the pivot table index. If a list is passed, it can contain any of the other types (except list). If an array is passed, it must be the same length as the data and will be used in the same manner as column values.
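
For reference, the three input forms that the index description covers can be sketched as follows. This is only an illustration: the DataFrame, its column names, and the `half` array are made up for this example and are not taken from the pandas docs.

```python
import numpy as np
import pandas as pd

# Hypothetical data, invented for this illustration.
df = pd.DataFrame({
    "region": ["east", "east", "west", "west"],
    "product": ["a", "b", "a", "b"],
    "sales": [10, 20, 30, 40],
})

# A single column label as the grouping key.
by_label = df.pivot_table(values="sales", index="region", aggfunc="sum")

# A list of keys: each element is a label (or an array), not another list.
by_list = df.pivot_table(values="sales", index=["region", "product"], aggfunc="sum")

# An array with the same length as the data, used like a column of group keys.
half = np.array(["h1", "h1", "h2", "h2"])
by_array = df.pivot_table(values="sales", index=half, aggfunc="sum")
```

Here `by_list` ends up with a two-level row index, while `by_array` is grouped by the values of `half` even though `half` is not a column of `df`.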

columns

Correct grammar and tense to be consistent with other parameter descriptions; rearrange sentence order to be more consistent and clear.

OLD:

If an array is passed, it must be the same length as the data. The list can contain any of the other types (except list). Keys to group by on the pivot table column. If an array is passed, it is being used as the same manner as column values.

NEW:

Keys to group by on the pivot table column. If a list is passed, it can contain any of the other types (except list). If an array is passed, it must be the same length as the data and will be used in the same manner as column values.
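
The same idea applied to columns, again as a rough sketch with made-up data (the `channel` array and all column names are assumptions for the example only):

```python
import numpy as np
import pandas as pd

# Hypothetical data, invented for this illustration.
df = pd.DataFrame({
    "region": ["east", "east", "west", "west"],
    "product": ["a", "b", "a", "b"],
    "sales": [10, 20, 30, 40],
})

# A column label as the key for the pivot table columns.
wide = df.pivot_table(values="sales", index="region", columns="product", aggfunc="sum")

# An array of the same length as the data, used in the same manner as column values.
channel = np.array(["web", "store", "web", "store"])
wide_by_array = df.pivot_table(values="sales", index="region", columns=channel, aggfunc="sum")
```

In `wide_by_array` the distinct values of `channel` become the column labels, just as the values of `product` do in `wide`.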

aggfunc

Add indefinite articles to be consistent with other parameter descriptions.

OLD:

If list of functions passed, the resulting pivot table will have hierarchical columns whose top level are the function names (inferred from the function objects themselves) If dict is passed, the key is column to aggregate and value is function or list of functions. If margin=True, aggfunc will be used to calculate the partial aggregates.

NEW:

If a list of functions is passed, the resulting pivot table will have hierarchical columns whose top level are the function names (inferred from the function objects themselves). If a dict is passed, the key is column to aggregate and the value is function or list of functions. If margin=True, aggfunc will be used to calculate the partial aggregates.
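
A small sketch of the three aggfunc behaviours described above, using made-up data (the DataFrame and column names are assumptions for illustration). Note that the keyword argument itself is spelled `margins`, even though the description text writes "margin=True".

```python
import pandas as pd

# Hypothetical data, invented for this illustration.
df = pd.DataFrame({
    "region": ["east", "east", "west", "west"],
    "sales": [10, 20, 30, 40],
    "units": [1, 2, 3, 4],
})

# A list of functions: the top column level holds the function names.
multi = df.pivot_table(values="sales", index="region", aggfunc=["sum", "mean"])

# A dict: each key is a column to aggregate, each value a function (or list of functions).
by_dict = df.pivot_table(index="region", aggfunc={"sales": "sum", "units": "mean"})

# margins=True: aggfunc is also used to compute the "All" partial aggregate.
with_margins = df.pivot_table(values="sales", index="region", aggfunc="sum", margins=True)
```

`multi` has two column levels (function name, then value column), `by_dict` has one column per dict key, and `with_margins` gains an extra "All" row computed with the same aggfunc.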

tpaxman added the Docs and Needs Triage labels May 23, 2023
tpaxman added a commit to tpaxman/pandas that referenced this issue May 24, 2023
lithomas1 removed the Needs Triage label May 30, 2023
@lithomas1
Member

PR(s) are welcome for this.

@tpaxman
Contributor Author

tpaxman commented May 30, 2023

I will take this and submit a PR

mroeschke pushed a commit that referenced this issue Jun 12, 2023
…parameter descriptions (#53605)

DOC: Inconsistencies in pandas.DataFrame.pivot_table parameter descriptions (#53351)
Daquisu pushed a commit to Daquisu/pandas that referenced this issue Jul 8, 2023
…parameter descriptions (pandas-dev#53605)

DOC: Inconsistencies in pandas.DataFrame.pivot_table parameter descriptions (pandas-dev#53351)
2 participants