
DOC: per docs, "margins" parameter returns row and column sums if True, but I think it returns the mean #48916


Closed
1 task done
bhagerty opened this issue Oct 2, 2022 · 1 comment
Labels
Docs Needs Triage Issue that has not been reviewed by a pandas team member

Comments

@bhagerty

bhagerty commented Oct 2, 2022

Pandas version checks

  • I have checked that the issue still exists on the latest versions of the docs on main here

Location of the documentation

https://pandas.pydata.org/docs/reference/api/pandas.pivot_table.html

Documentation problem

The documentation for the margins parameter says:

  • Add all row / columns (e.g. for subtotal / grand totals).

From what I can see, margins=True will return row and column means, not sums, at least if no aggfunc is specified. If there's a way to get sums, not means, I don't know what it is. Here's an example showing margins=True returning row and column means, not sums:

print(sales.pivot_table(values="weekly_sales", index="department", columns="type", fill_value=0, margins=True))
type                A           B        All
department                                  
1           30961.725   44050.627  32052.467
2           67600.159  112958.527  71380.023
...               ...         ...        ...
98          12875.423     217.428  11820.590
99            379.124       0.000    379.124
All         23674.667   25696.678  23843.950

Suggested fix for documentation

I don't know the fix for sure, because I don't know if margins=True behaves differently when an aggfunc is specified. But if it returns the mean rather than the sum, I suggest:

  • Provide the mean value for each row and each column.
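For anyone who wants to reproduce the behaviour described above, here is a minimal sketch. The `sales` frame from the report is not shown in the issue, so the data below is made up; only the column names are taken from the example. It suggests that with the default aggfunc the "All" margin is the mean of the underlying rows, not a sum and not a mean of the cell means.

```python
import pandas as pd

# Hypothetical stand-in for the `sales` frame (the real data is not in the issue).
sales = pd.DataFrame(
    {
        "department": [1, 1, 1, 2, 2],
        "type": ["A", "A", "B", "A", "B"],
        "weekly_sales": [10.0, 20.0, 60.0, 40.0, 80.0],
    }
)

# Same call shape as in the report, with no aggfunc given.
table = sales.pivot_table(
    values="weekly_sales", index="department", columns="type", margins=True
)
print(table)

# Row margin for department 1 versus the mean of its underlying rows.
dept1_margin = table.loc[1, "All"]
dept1_mean = sales.loc[sales["department"] == 1, "weekly_sales"].mean()

# Grand margin versus the mean of every row in the frame.
grand_margin = table.loc["All", "All"]
grand_mean = sales["weekly_sales"].mean()
```

On this toy data the department 1 cells are 15 (A) and 60 (B), while its "All" value matches the mean over its three rows rather than 90 (the sum) or 37.5 (the mean of the two cells).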
@bhagerty bhagerty added Docs Needs Triage Issue that has not been reviewed by a pandas team member labels Oct 2, 2022
@MatteoRaso
Contributor

The default aggfunc is "mean", which is what's causing this behaviour. You wouldn't think that from reading the docs, though.
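As a rough illustration of the point about aggfunc, the sketch below passes `aggfunc="sum"` explicitly, which should make the margins come out as totals. The frame is invented for the example and only reuses the column names from the report.

```python
import pandas as pd

# Hypothetical data; only the column names come from the issue.
sales = pd.DataFrame(
    {
        "department": [1, 1, 1, 2, 2],
        "type": ["A", "A", "B", "A", "B"],
        "weekly_sales": [10.0, 20.0, 60.0, 40.0, 80.0],
    }
)

# Default aggfunc ("mean"): the margins hold means.
means = sales.pivot_table(
    values="weekly_sales", index="department", columns="type", margins=True
)

# Explicit aggfunc="sum": the same margins now hold row, column, and grand totals.
totals = sales.pivot_table(
    values="weekly_sales",
    index="department",
    columns="type",
    aggfunc="sum",
    margins=True,
)

print(means)
print(totals)
```

The two tables differ only in the aggfunc argument, so whatever function fills the cells also fills the "All" row and column.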

MatteoRaso added a commit to MatteoRaso/pandas that referenced this issue Oct 6, 2022
The documentation for the margins parameter of pivot_table
was incorrect. It said that the parameter added rows and columns,
but it actually passed them to aggfunc. I used the documentation
from the user guide to replace the old documentation, as well as
added a sentence to the documentation for aggfunc that explained
its role in calculating margins.
mroeschke added a commit that referenced this issue Oct 6, 2022
* DOC: Fixed documentation for pivot_table margins (#48916)


* Update pandas/core/frame.py

Co-authored-by: Matthew Roeschke <[email protected]>

@bhagerty bhagerty closed this as completed Oct 7, 2022
noatamir pushed a commit to noatamir/pandas that referenced this issue Nov 9, 2022
…andas-dev#48965)
