BUG: Wrong result of Kurtosis #59572

Closed
3 tasks done
xitongsys opened this issue Aug 21, 2024 · 2 comments
Labels
Bug, Needs Triage (issue that has not been reviewed by a pandas team member)

Comments

@xitongsys

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
import scipy.stats

a = pd.Series([0, 0, 0, 0, 0.00005])

a.kurtosis()
Out[205]: np.float64(0.0)

scipy.stats.kurtosis(a, bias=False)
Out[206]: np.float64(4.999999999999997)

Issue Description

pandas returns an incorrect kurtosis (0.0) for this Series, while scipy.stats.kurtosis with bias=False returns the correct value (approximately 5).

Expected Behavior

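See the code above: a.kurtosis() should agree with scipy.stats.kurtosis(a, bias=False), i.e. return approximately 5.0.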

Installed Versions

Replace this line with the output of pd.show_versions()

@xitongsys added the Bug and Needs Triage labels on Aug 21, 2024
@Liam3851
Contributor

This appears to be a duplicate of #57972. It looks like a floating-point precision problem introduced by the fix for #18044: very small intermediate values are treated as zero. E.g. in pd.core.nanops.nankurt:

https://github.com/pandas-dev/pandas/blob/7945e563d36bcf4694ccc44698829a6221905839/pandas/core/nanops.py#L1354C2-L1359C47

    # floating point error
    #
    # #18044 in _libs/windows.pyx calc_kurt follow this behavior
    # to fix the fperr to treat denom <1e-14 as zero
    numerator = _zero_out_fperr(numerator)
    denominator = _zero_out_fperr(denominator)

Here the denominator comes out as 2.4e-17, which is below the 1e-14 threshold and so gets set to 0, simply because the numbers involved are so small.
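To make the arithmetic concrete, here is a minimal plain-Python sketch of the computation. It assumes the usual bias-corrected sample excess kurtosis formula, and `zero_out_fperr`, `ABS_TOL`, `kurtosis_terms` and `excess_kurtosis` are hypothetical stand-ins written for illustration; this is not the pandas implementation. The `0.0` fallback for a zero denominator is likewise an assumption, chosen to match the output reported in the issue.

```python
# Sketch only: `zero_out_fperr` and ABS_TOL stand in for the clamping quoted
# above, and the formula is an assumed form of the bias-corrected estimator.
ABS_TOL = 1e-14


def zero_out_fperr(x, tol=ABS_TOL):
    """Treat any magnitude below an absolute tolerance as exactly zero."""
    return 0.0 if abs(x) < tol else x


def kurtosis_terms(values):
    """Return (numerator, denominator, adj) of the bias-corrected estimator."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values)  # sum of squared deviations
    m4 = sum((v - mean) ** 4 for v in values)  # sum of fourth-power deviations
    numerator = n * (n + 1) * (n - 1) * m4
    denominator = (n - 2) * (n - 3) * m2 ** 2
    adj = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return numerator, denominator, adj


def excess_kurtosis(values, clamp):
    """Compute excess kurtosis, optionally clamping tiny terms to zero."""
    numerator, denominator, adj = kurtosis_terms(values)
    if clamp:
        numerator = zero_out_fperr(numerator)
        denominator = zero_out_fperr(denominator)
    if denominator == 0:
        return 0.0  # assumed fallback when the denominator is treated as zero
    return numerator / denominator - adj


data = [0, 0, 0, 0, 0.00005]
numerator, denominator, adj = kurtosis_terms(data)
print(denominator)                         # roughly 2.4e-17, far below 1e-14
print(excess_kurtosis(data, clamp=False))  # close to 5, in line with scipy
print(excess_kurtosis(data, clamp=True))   # 0.0, the value reported above
```

Without the clamp, the ratio of the two tiny terms is perfectly well behaved; it is only the absolute tolerance that discards them.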

I'm not sure how scipy handles this to stay numerically stable for extremely small values.
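Whatever scipy does internally (this thread does not establish it), kurtosis is scale-invariant, so one possible workaround is to rescale the data to order one before computing it. The sketch below illustrates that idea in plain Python; `kurtosis_and_denominator` and `rescale` are hypothetical helpers, and the formula is the same assumed bias-corrected estimator, not code taken from pandas or scipy.

```python
def kurtosis_and_denominator(values):
    """Bias-corrected excess kurtosis (no clamping) plus its denominator term."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values)  # sum of squared deviations
    m4 = sum((v - mean) ** 4 for v in values)  # sum of fourth-power deviations
    numerator = n * (n + 1) * (n - 1) * m4
    denominator = (n - 2) * (n - 3) * m2 ** 2
    adj = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return numerator / denominator - adj, denominator


def rescale(values):
    """Divide by the largest magnitude so the data are of order one."""
    scale = max(abs(v) for v in values)
    return [v / scale for v in values] if scale else list(values)


data = [0, 0, 0, 0, 0.00005]
raw_kurt, raw_den = kurtosis_and_denominator(data)
scaled_kurt, scaled_den = kurtosis_and_denominator(rescale(data))

print(raw_den, scaled_den)    # raw is roughly 2.4e-17; scaled is of order one
print(raw_kurt, scaled_kurt)  # both close to 5
```

After rescaling, the denominator sits far above the 1e-14 cutoff while the kurtosis itself is unchanged. In pandas terms the analogous call would presumably be `(a / a.abs().max()).kurtosis()`, though that is untested here.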

@fbourgey removed their assignment on Aug 21, 2024
@mroeschke
Member

Closing as a duplicate of #57972


4 participants