-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
The indexes of DataFrame.describe(percentiles=[0.29, 0.57, 0.58]) are incorrect #48298
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
The indexes of DataFrame.describe(percentiles=[0.29, 0.57, 0.58]) are incorrect #48298
Conversation
pandas/io/formats/format.py
Outdated
@@ -1721,10 +1721,10 @@ def format_percentiles( | |||
|
|||
percentiles = 100 * percentiles | |||
|
|||
int_idx = np.isclose(percentiles.astype(int), percentiles) | |||
int_idx = np.isclose(percentiles.round().astype(int), percentiles) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you make percentiles.round().astype(int)
a dedicated variable so it can be used here and below?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fairly good. Could use a whatsnew note in 1.6.rst
Co-authored-by: Matthew Roeschke <[email protected]>
|
||
if np.all(int_idx): | ||
out = percentiles.astype(int).astype(str) | ||
out = percentiles_round_type.astype(str) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the example in the original issue hit this path or the one below? Just want to make sure we have 2 tests that didn't work before and hit this change and the one below to work correctly now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the example in the original issue hit this path or the one below? Just want to make sure we have 2 tests that didn't work before and hit this change and the one below to work correctly now
a) The example from issue will hit this path.
pd.DataFrame(data=[1, 2, 3], columns=['value']).describe(percentiles=[0.29, 0.57, 0.58])
if np.all(int_idx):
out = percentiles_round_type.astype(str)
return [i + "%" for i in out]
b) The example below I own create will hit below path.
pd.DataFrame(data=[1, 2, 3], columns=['value']).describe(percentiles=[0.291, 0.57, 0.58])
out[int_idx] = percentiles[int_idx].round().astype(int).astype(str)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add the pd.DataFrame(data=[1, 2, 3], columns=['value']).describe(percentiles=[0.29, 0.57, 0.58])
case as a unit test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have add test
Thanks @weikhor! |
… incorrect (pandas-dev#48298) * add test * add test * add * Update doc/source/whatsnew/v1.6.0.rst Co-authored-by: Matthew Roeschke <[email protected]> * add Co-authored-by: Matthew Roeschke <[email protected]>
DataFrame.describe(percentiles=[0.29, 0.57, 0.58])
are incorrect #46362doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.