BUG: aggfunc use different default arguments in pivot_table #36508
Comments
there have been several issues about this -
@dcsaba89 Apparently it is using the sample (unbiased) standard deviation, see #34437 for info. This example demonstrates why this is an issue:
edit: Added lambda method showing how aggregating with
I agree with you @denck007, this behavior is unexpected and furthermore undocumented. However, I think this behavior should be fixed to return the biased std_dev as this is what np.std returns by default.
Closed by #57444
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
(optional) I have confirmed this bug exists on the master branch of pandas.
Note: Please read this guide detailing how to provide the necessary information for us to reproduce your bug.
Code Sample, a copy-pastable example
Problem description
The default value of ddof for np.std is 0:
numpy.std(a, axis=None, dtype=None, out=None, ddof=0, keepdims=<no value>)
When np.std is passed as aggfunc to calculate the standard deviation of X, the result is the sample (unbiased) standard deviation: pivot_table applies ddof=1 instead of picking up np.std's default of ddof=0.
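To make the size of the discrepancy concrete, here is a minimal standard-library sketch of what the ddof parameter changes. The helper name `std_with_ddof` and the sample values are hypothetical and are not taken from the original report.

```python
import math
import statistics

# Hypothetical sample values, chosen only for illustration.
values = [1.0, 2.0, 4.0]


def std_with_ddof(data, ddof):
    """Standard deviation with divisor (n - ddof), mirroring the ddof idea."""
    n = len(data)
    mean = sum(data) / n
    squared_deviations = sum((v - mean) ** 2 for v in data)
    return math.sqrt(squared_deviations / (n - ddof))


# ddof=0 divides by n (population standard deviation, np.std's default).
population = std_with_ddof(values, ddof=0)
# ddof=1 divides by n - 1 (sample standard deviation).
sample = std_with_ddof(values, ddof=1)

print(population, sample)
```

For any group with at least two distinct values, the ddof=1 result is strictly larger than the ddof=0 result, so the two conventions cannot be swapped silently.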
The expected behavior is that, for any function f passed to aggfunc, aggfunc=f and aggfunc=lambda x: f(x) return exactly the same result.
Expected Output
To summarize, the expected behavior is to use the function's own default arguments when it is passed to aggregate values in pd.pivot_table.
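Until the behavior is consistent across versions, one possible workaround is to avoid relying on any default and pass ddof explicitly through a wrapper function. The sketch below assumes a small made-up DataFrame; the column names and the helpers `population_std` and `sample_std` are hypothetical and do not come from the original report.

```python
import numpy as np
import pandas as pd

# Hypothetical data, not the data from the original report.
df = pd.DataFrame(
    {
        "group": ["a", "a", "a", "b", "b", "b"],
        "X": [1.0, 2.0, 4.0, 3.0, 5.0, 10.0],
    }
)


def population_std(values):
    # Convert to a plain ndarray and state ddof explicitly (divide by n).
    return np.std(values.to_numpy(), ddof=0)


def sample_std(values):
    # Same computation with ddof=1 (divide by n - 1).
    return np.std(values.to_numpy(), ddof=1)


pop = pd.pivot_table(df, index="group", values="X", aggfunc=population_std)
samp = pd.pivot_table(df, index="group", values="X", aggfunc=sample_std)

print(pop)
print(samp)
```

Because ddof is spelled out inside the wrapper, the result should not depend on how a given pandas version treats a bare np.std passed as aggfunc.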
Output of pd.show_versions()
INSTALLED VERSIONS
commit : 2a7d332
python : 3.8.5.final.0
python-bits : 32
OS : Windows
OS-release : 10
Version : 10.0.19041
machine : AMD64
processor : Intel64 Family 6 Model 142 Stepping 12, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : English_United States.1252
pandas : 1.1.2
numpy : 1.19.2