DOC: sample variance -> population variance #46482
Comments
Could you elaborate on why you would make this change?
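See https://en.wikipedia.org/wiki/Bias_of_an_estimator. The sample variance can be either biased or unbiased, depending on the normalization. Quoting Wikipedia as written: "Note that the usual definition of sample variance is $S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \overline{X})^2$, and this is an unbiased estimator of the population variance." So normalizing by N-1 gives an unbiased estimate of the population variance, not of the sample variance.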
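A quick simulation may make the point concrete. This is only a sketch using the standard library: the normal population with variance 4, the sample size of 5, the trial count, and the helper name `sum_sq_dev` are all arbitrary choices for illustration, not anything taken from pandas or the docs.

```python
import random
import statistics

random.seed(0)           # fixed seed so the run is repeatable
POP_VARIANCE = 4.0       # assumed population: normal with sigma = 2
N = 5                    # size of each sample
TRIALS = 20000           # number of samples drawn


def sum_sq_dev(sample):
    """Sum of squared deviations from the sample mean (hypothetical helper)."""
    m = statistics.fmean(sample)
    return sum((x - m) ** 2 for x in sample)


ss = [sum_sq_dev([random.gauss(0, 2) for _ in range(N)]) for _ in range(TRIALS)]

# Average of the N-1 normalized estimates: close to the population variance.
mean_unbiased = statistics.fmean(v / (N - 1) for v in ss)
# Average of the N normalized estimates: systematically smaller, by (N-1)/N.
mean_biased = statistics.fmean(v / N for v in ss)

print(mean_unbiased, mean_biased)
```

Averaged over many samples, the N-1 version lands near the population variance, while the N version underestimates it by a factor of (N-1)/N.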
Thanks, would you be interested in submitting a PR?
Sure.
Pandas version checks
main
here
Location of the documentation
https://pandas.pydata.org/docs/dev/user_guide/gotchas.html
Documentation problem
Differences with NumPy
For Series and DataFrame objects, var() normalizes by N-1 to produce unbiased estimates of the sample variance, while NumPy’s numpy.var() normalizes by N, which measures the variance of the sample.
The phrase "unbiased estimates of the sample variance" is incorrect: normalizing by N-1 produces an unbiased estimate of the population variance.
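For reference, the difference in defaults that the quoted passage describes can be checked with a small example. This is a minimal sketch; the four input values are made up for illustration.

```python
import numpy as np
import pandas as pd

values = [1.0, 2.0, 3.0, 4.0]   # illustrative data only
s = pd.Series(values)

# pandas defaults to ddof=1, so it divides by N-1.
pandas_var = s.var()

# NumPy defaults to ddof=0, so it divides by N.
numpy_var = np.var(values)

# Passing ddof=1 makes NumPy match the pandas default.
numpy_var_ddof1 = np.var(values, ddof=1)

print(pandas_var, numpy_var, numpy_var_ddof1)
```

Here the squared deviations from the mean sum to 5, so the pandas default gives 5/3 and the NumPy default gives 5/4.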
Suggested fix for documentation
Change "unbiased estimates of the sample variance" to "unbiased estimates of the population variance".